

Malware Classification Method Based on Feature Weights
Abstract As computers become ever more closely integrated with people's work and daily life, malware classification has grown steadily in importance for information security. Most existing malware classification methods, however, suffer from complex models, long running times, and unremarkable results. To improve classification efficiency, this paper proposes a malware classification framework that combines feature extraction with a convolutional neural network. To address the low accuracy and slow processing of current malware classification algorithms, a feature-weighting algorithm from the NLP field is introduced and improved. The feature weight of each opcode is computed, the opcodes with larger information gain are selected as feature words, a feature map is then extracted from each malware sample, and the feature maps are finally passed to the convolutional neural network for training and classification. Experimental results show that the method reaches an accuracy of 99.26% on the big2015 dataset, slightly better than the method based on TFIDF feature extraction and close to the champion method on this dataset. On imbalanced classes it outperforms extraction algorithms that select feature words by frequency, and its preprocessing time is shorter than that of the other methods.
Authors YE Biao (叶彪); LI Lin (李琳); DING Ying (丁应); SONG Jing-han (宋荆汉); WAN Zhen-hua (万振华) (School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China; Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan 430065, China; Shanghai Aerospace Precision Machinery Research Institute, Shanghai 201600, China; Shenzhen Open Source Internet Security Technology Co., Ltd., Shenzhen 518000, China)
Source Computer Technology and Development (《计算机技术与发展》), 2022, No. 11, pp. 115-120 (6 pages)
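Funding National Natural Science Foundation of China (61572381); Hubei Provincial Department of Education Project (2020354); Hubei Provincial College Students' Innovation and Entrepreneurship Training Program (S202110488047).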
Keywords feature weight; feature extraction; opcode; convolutional neural network; malware classification
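
The abstract says that opcodes with larger information gain are chosen as feature words, but it does not give the improved weighting formula. As a rough illustration only, the sketch below uses the textbook information gain of a binary "opcode occurs in the sample" feature, not the authors' improved algorithm. The function names, the toy opcode sequences, and the class labels are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(samples, labels, opcode):
    """Textbook information gain of the binary feature 'opcode occurs in sample'.

    Assumption: this is the standard definition, used here as a stand-in
    for the paper's improved feature weight, which the abstract does not specify.
    """
    present = [y for ops, y in zip(samples, labels) if opcode in ops]
    absent = [y for ops, y in zip(samples, labels) if opcode not in ops]
    total = len(labels)
    conditional = ((len(present) / total) * entropy(present)
                   + (len(absent) / total) * entropy(absent))
    return entropy(labels) - conditional


def select_feature_opcodes(samples, labels, k):
    """Return the k opcodes with the highest information gain (ties broken by name)."""
    vocabulary = sorted({op for ops in samples for op in ops})
    scored = [(information_gain(samples, labels, op), op) for op in vocabulary]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [op for _, op in scored[:k]]


# Hypothetical toy data: four opcode sequences from two malware families.
samples = [
    ["mov", "push", "call"],
    ["mov", "push", "ret"],
    ["mov", "xor", "jmp"],
    ["mov", "xor", "ret"],
]
labels = ["A", "A", "B", "B"]

print(select_feature_opcodes(samples, labels, 2))
```

In this toy data `mov` appears in every sample and so carries no information about the family, while `push` and `xor` each separate the two families perfectly. This is the kind of distinction a purely frequency-based selection would miss.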








