
E-mail filtering algorithm with boundary decision rules based on rough set

Cited by: 2
Abstract: Spam filters suffer from limited accuracy and stability, and existing e-mail filtering algorithms produce false negatives and false positives when classifying a corpus. To address these problems, this paper proposes an e-mail filtering algorithm with boundary decision rules based on rough sets (RARM). The algorithm applies rough set theory to analyze the corpus directly and uses a heuristic method to derive an execution plan for three different rough-set decision rules, so that a certain level of classification accuracy is maintained even when the lexical semantics of a message are ambiguous. In simulation experiments, the algorithm is compared with e-mail filtering algorithms based on support vector machines (SVM), AdaBoost, and Bayesian classification, and it achieves higher spam-filtering accuracy than these baselines.
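
The abstract does not spell out the three decision rules, so the following is only a minimal sketch of the standard rough-set partition into positive, boundary, and negative regions, which is one plausible reading of "decision rules with a boundary". The feature encoding, the function names `three_regions` and `classify`, and the label strings are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def three_regions(samples):
    """Partition feature vectors into rough-set regions with respect to 'spam'.

    samples: iterable of (features, is_spam), where features is a hashable
    tuple of discrete attribute values (hypothetical encoding).
    """
    # Messages with identical attribute values are indiscernible and
    # therefore fall into the same equivalence class.
    classes = defaultdict(list)
    for features, is_spam in samples:
        classes[features].append(is_spam)

    positive, boundary, negative = set(), set(), set()
    for features, labels in classes.items():
        if all(labels):          # class lies entirely inside the spam set
            positive.add(features)
        elif not any(labels):    # class does not intersect the spam set
            negative.add(features)
        else:                    # class is only partly spam
            boundary.add(features)
    return positive, boundary, negative


def classify(features, regions):
    """Apply one rule per region; unseen feature vectors are reported as such."""
    positive, boundary, negative = regions
    if features in positive:
        return "spam"
    if features in negative:
        return "legitimate"
    if features in boundary:
        return "boundary"
    return "unknown"


# Toy corpus: features are (contains_link, contains_offer_word).
corpus = [
    ((1, 1), True),
    ((1, 1), True),
    ((0, 0), False),
    ((1, 0), True),
    ((1, 0), False),
]
regions = three_regions(corpus)
```

In this sketch a message in the boundary region is neither accepted nor rejected outright. A real filter would need a further rule for such messages, for example a heuristic score, and the abstract suggests the paper addresses exactly that case.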
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2015, No. 1, pp. 258-261 (4 pages)
Funding: Henan Province Science and Technology Research Project (122102210563, 132102210215)
Keywords: spam filtering; rough set; heuristic methods; decision rules boundary

