Abstract
E-mail is an indispensable means of communication in daily life, but spam causes serious harm to its users. This paper focuses on Chinese spam and presents the design of a prototype content-based anti-spam engine built on the Winnow algorithm, which distinguishes previously unseen mail with good accuracy. First, the content of each message is decoded and segmented into words, and feature terms are selected by information gain. Next, a classifier is constructed with the Winnow algorithm. Finally, the engine is tested on a subset of the mail samples, and the test results can be fed back into the classifier for further learning. Analysis of the test data shows that the system performs well.
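The classification and feedback steps described above can be illustrated with a minimal sketch of the basic Winnow algorithm. This is not the paper's engine: the class name `Winnow`, the helper `train`, the promotion factor `alpha = 2.0`, and the threshold of half the feature count are all assumptions chosen for illustration, and the integer feature indices stand in for terms that would, in the paper's pipeline, be selected by information gain after decoding and word segmentation.

```python
from typing import Iterable, List, Set, Tuple


class Winnow:
    """Basic Winnow over binary features (illustrative sketch only)."""

    def __init__(self, n_features: int, alpha: float = 2.0) -> None:
        self.alpha = alpha                      # assumed promotion/demotion factor
        self.theta = n_features / 2.0           # assumed threshold
        self.weights: List[float] = [1.0] * n_features

    def score(self, active: Set[int]) -> float:
        # Sum the weights of the features present in the message.
        return sum(self.weights[i] for i in active)

    def predict(self, active: Set[int]) -> bool:
        # True means "spam".
        return self.score(active) > self.theta

    def update(self, active: Set[int], is_spam: bool) -> bool:
        """Mistake-driven update; returns True if the weights changed."""
        if self.predict(active) == is_spam:
            return False
        # Promote active features on a missed spam, demote on a false alarm.
        factor = self.alpha if is_spam else 1.0 / self.alpha
        for i in active:
            self.weights[i] *= factor
        return True


def train(model: Winnow,
          samples: Iterable[Tuple[Set[int], bool]],
          epochs: int = 5) -> int:
    """Run several passes over labelled samples; return the mistake count."""
    samples = list(samples)
    mistakes = 0
    for _ in range(epochs):
        for active, is_spam in samples:
            if model.update(active, is_spam):
                mistakes += 1
    return mistakes
```

Because Winnow changes its weights only when it misclassifies a message, the same `update` call can serve for feedback learning: a message the engine got wrong is passed back with its correct label.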
Source
《计算机技术与发展》
2006, Issue 4, pp. 170-172, 175 (4 pages in total)
Computer Technology and Development