Abstract
This paper studies the accuracy of spam detection, with the aim of improving network security. E-mail features are high-dimensional and highly redundant. Traditional detection models cannot reduce the feature dimension or eliminate the redundant information, which leads to long computation time, high space complexity, and low detection accuracy. To improve detection accuracy, the paper proposes a two-layer spam detection model that combines a white list with a support vector machine (SVM). Feature clustering is used to reduce the feature dimension and eliminate redundant information among features. White list detection serves as the first line of defense of the detection system, detecting spam from known addresses, and the SVM serves as the second line of defense, detecting new spam. The model was tested on spam data. The experimental results show that the two-layer spam detection model effectively improves both the efficiency and the accuracy of spam detection and provides an effective means of managing e-mail.
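The two-layer flow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and several things are assumptions made for illustration: the white list is taken to hold trusted sender addresses whose mail is accepted without classification, the feature clusters are fixed by hand rather than learned, the SVM is a plain linear one trained by cyclic subgradient descent on the hinge loss, and the names `reduce_features`, `train_linear_svm`, `classify`, and the toy word counts are all hypothetical.

```python
def reduce_features(x, clusters):
    """Merge each cluster of raw feature indices into one feature by summing."""
    return [sum(x[i] for i in idx) for idx in clusters]


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200):
    """Cyclic subgradient descent on the L2-regularised hinge loss.

    Labels are +1 (spam) and -1 (ham); returns weights w and bias b.
    """
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1 - eta * lam) for wj in w]  # regularisation shrink
            if margin < 1:  # hinge loss is active: step towards the sample
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def classify(sender, x, whitelist, clusters, w, b):
    """Layer 1: white list lookup. Layer 2: linear SVM on reduced features."""
    if sender in whitelist:  # assumption: listed senders are trusted
        return "ham"
    z = reduce_features(x, clusters)
    score = sum(wj * zj for wj, zj in zip(w, z)) + b
    return "spam" if score > 0 else "ham"


# Toy data (made up): counts of the words "free", "winner", "meeting", "report".
raw = [
    ([3, 2, 0, 0], +1), ([0, 0, 2, 3], -1),
    ([2, 1, 0, 1], +1), ([1, 0, 2, 1], -1),
    ([4, 0, 1, 0], +1), ([0, 0, 1, 3], -1),
    ([1, 3, 0, 0], +1), ([0, 1, 3, 0], -1),
]
# Hand-picked clusters: promotional words vs. work-related words.
clusters = [[0, 1], [2, 3]]
whitelist = {"alice@example.com"}

X = [reduce_features(x, clusters) for x, _ in raw]
y = [label for _, label in raw]
w, b = train_linear_svm(X, y)

from_known = classify("alice@example.com", [5, 4, 0, 0], whitelist, clusters, w, b)
from_unknown = classify("unknown@example.net", [5, 4, 0, 0], whitelist, clusters, w, b)
```

Placing the white list lookup first means the classifier only runs on mail from senders that are not already known, which is where the efficiency gain claimed for a two-layer design would come from. Summing the features inside each cluster halves the dimension here (four raw counts become two), a stand-in for the feature clustering step.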
Source
《计算机仿真》
CSCD
PKU Core Journal (北大核心)
2012, No. 2, pp. 120-123 (4 pages)
Computer Simulation
Keywords
Spam
Classification
Support vector machine (SVM)
Feature selection