
Research on Improved Bayesian Algorithm for Anti-Spam Filtering (cited by: 4)
Abstract: To improve the precision and recall of Bayesian spam filters, an improved weighted Bayes model (IWB) is proposed, which improves spam filtering performance by increasing the accuracy of the Bayes model. Unlike the naive Bayes (NB) model, which assumes that all features of a mail sample are independent and equally important, IWB assigns a weight to each feature in order to reduce the mismatch error between the Bayes model and reality. An objective function based on the least squares method is built from Bayes' formula and used to optimize the weight vector of IWB. Because the objective function is nonlinear and high-dimensional, a new particle swarm optimization (PSO) algorithm is proposed to obtain an approximately globally optimal weight vector, and hence the optimal Bayes model. IWB is compared by simulation with NB and the conventional weighted Bayes model (WB); the results show that IWB significantly improves spam filtering performance and increases both precision and recall.
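Author: Ji Hong (计宏)
Source: Computer Measurement & Control (《计算机测量与控制》), PKU Core Journal, 2013, No. 8, pp. 2181-2184 (4 pages)
Funding: Ministry of Education Doctoral Program Fund (20106121110003)
Keywords: spam; Bayesian; precision; weighted; particle swarm optimization (PSO)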
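
The pipeline summarized in the abstract (per-feature weights on a Bayes model, a least-squares objective, and a particle swarm search over the weight vector) can be illustrated with a minimal sketch. The abstract does not give the exact weighted form, the objective, or the details of the new PSO variant, so everything below is an assumption: the weights are applied as exponents on the conditional probabilities (a commonly used weighted naive Bayes form), the objective is the sum of squared differences between the predicted spam posterior and the 0/1 label, and the optimizer is a standard global-best PSO standing in for the paper's algorithm. All function names and the toy data are hypothetical.

```python
import math
import random

def train_counts(X, y, n_features):
    """Estimate class priors and Laplace-smoothed P(x_i = 1 | class) from binary features."""
    n = {0: 0, 1: 0}
    ones = {0: [0] * n_features, 1: [0] * n_features}
    for xs, label in zip(X, y):
        n[label] += 1
        for i, v in enumerate(xs):
            ones[label][i] += v
    prior = {c: n[c] / len(y) for c in (0, 1)}
    cond = {c: [(ones[c][i] + 1) / (n[c] + 2) for i in range(n_features)] for c in (0, 1)}
    return prior, cond

def posterior_spam(xs, w, prior, cond):
    """Assumed weighted form: P(c | x) proportional to P(c) * prod_i P(x_i | c) ** w_i."""
    log_score = {}
    for c in (0, 1):
        s = math.log(prior[c])
        for i, v in enumerate(xs):
            p = cond[c][i] if v else 1.0 - cond[c][i]
            s += w[i] * math.log(p)
        log_score[c] = s
    m = max(log_score.values())  # subtract the max for numerical stability
    e0, e1 = math.exp(log_score[0] - m), math.exp(log_score[1] - m)
    return e1 / (e0 + e1)

def squared_error(w, X, y, prior, cond):
    """Assumed least-squares objective: sum over samples of (P(spam | x) - label)^2."""
    return sum((posterior_spam(xs, w, prior, cond) - t) ** 2 for xs, t in zip(X, y))

def pso(objective, dim, n_particles=20, iters=60, lo=0.0, hi=3.0, seed=0):
    """Standard global-best PSO (a stand-in, not the paper's variant)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    pos[0] = [1.0] * dim  # start one particle at the unweighted (NB) solution
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda k: best_val[k])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(iters):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (0.7 * vel[k][d]
                             + 1.5 * r1 * (best_pos[k][d] - pos[k][d])
                             + 1.5 * r2 * (g_pos[d] - pos[k][d]))
                pos[k][d] = min(hi, max(lo, pos[k][d] + vel[k][d]))
            val = objective(pos[k])
            if val < best_val[k]:
                best_pos[k], best_val[k] = pos[k][:], val
                if val < g_val:
                    g_pos, g_val = pos[k][:], val
    return g_pos, g_val

def precision_recall(X, y, w, prior, cond):
    """Precision and recall for the spam class at a 0.5 posterior threshold."""
    pred = [1 if posterior_spam(xs, w, prior, cond) >= 0.5 else 0 for xs in X]
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, y))
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, y))
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, y))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical toy data: features = [has "free", has "winner", has "meeting"]; label 1 = spam.
X = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1], [1, 0, 1], [0, 0, 1], [0, 0, 0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
prior, cond = train_counts(X, y, 3)
nb_err = squared_error([1.0, 1.0, 1.0], X, y, prior, cond)   # all weights 1 = plain NB
w_opt, iwb_err = pso(lambda w: squared_error(w, X, y, prior, cond), dim=3)
precision, recall = precision_recall(X, y, w_opt, prior, cond)
```

Setting every weight to 1 recovers plain naive Bayes, and one particle is started at that point, so in this sketch the optimized training error can never exceed the NB training error. This says nothing about held-out performance, which is what the paper's simulations evaluate.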

