Abstract
JavaScript has become a widely used technology in interactive and dynamic web pages, and malicious JavaScript code has become correspondingly active as a means of web-based attack. Based on a study of a large body of malicious JavaScript code, this paper extracts and categorizes features of obfuscated malicious JavaScript. A total of 82 features are extracted across four categories (attribute features, redirection features, suspicious-keyword features, and obfuscation features), of which 47 are new. A total of 5,525 samples of benign and obfuscated malicious JavaScript code were collected from real-world environments for training and testing, and the data set was evaluated with several supervised machine learning algorithms operating in an anomaly detection mode. The experimental results show that introducing the new features raises the detection rate of every classifier relative to the same classifier without them, and lowers the False Negative Rate.
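As a rough illustration of the kind of pipeline the abstract describes, the sketch below computes a handful of static features from a script and a False Negative Rate from predicted labels. The abstract does not list the 82 features, so every feature name, keyword list, and pattern here is a hypothetical stand-in chosen for illustration, not the authors' feature set.

```python
import math
import re
from collections import Counter

# Hypothetical pattern lists: the abstract does not enumerate the paper's features.
REDIRECT_PATTERNS = ("window.location", "location.href", "location.replace")
SUSPICIOUS_KEYWORDS = ("eval", "unescape", "document.write", "fromCharCode")
# Matches URL-style (%61), hex (\x61) and unicode (\u0061) escape sequences.
ESCAPE_RE = re.compile(r"%[0-9a-fA-F]{2}|\\x[0-9a-fA-F]{2}|\\u[0-9a-fA-F]{4}")


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits; 0.0 for empty input."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_features(script: str) -> dict:
    """Return one illustrative feature or two per category named in the abstract."""
    longest_line = max((len(line) for line in script.splitlines()), default=0)
    return {
        # attribute features (assumed examples)
        "length": len(script),
        "longest_line": longest_line,
        # redirection features (assumed examples)
        "redirect_count": sum(script.count(p) for p in REDIRECT_PATTERNS),
        # suspicious-keyword features (assumed examples)
        "keyword_count": sum(script.count(k) for k in SUSPICIOUS_KEYWORDS),
        # obfuscation features (assumed examples)
        "escape_count": len(ESCAPE_RE.findall(script)),
        "entropy": shannon_entropy(script),
    }


def false_negative_rate(y_true, y_pred) -> float:
    """FN / (FN + TP), where label 1 means malicious and 0 means benign."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return fn / (fn + tp) if (fn + tp) else 0.0
```

Feature dictionaries of this shape could be turned into vectors and passed to any supervised classifier; the False Negative Rate would then be computed on held-out samples with and without the additional features to reproduce the kind of comparison the abstract reports.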
Authors
QU Wenpeng (曲文鹏)
ZHAO Lianjun (赵连军)
DENG Xu (邓旭)
College of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255049, China
Source
Intelligent Computer and Applications (《智能计算机与应用》)
2018, No. 4, pp. 42-47 (6 pages)