Abstract
Spam is increasingly rampant; it causes users considerable inconvenience and harm and threatens network security. Traditional mail filters rely on a single filtering method and are not accurate enough to meet current needs. Building on rule-based filtering, this paper analyzes the key techniques and methods for implementing a content-based Bayesian text classifier, and describes in detail how the core filtering algorithm is applied to mail classification to decide whether a message is spam. To reduce both the harm that misclassified legitimate mail causes users and the impact of missed spam, it proposes two improvements: minimum-risk Bayesian decision to lower the misclassification rate, and adaptive adjustment of the classification system through its training component. Finally, it presents a mail filtering model with a dual defense mechanism based on rules and content, together with the mail classification workflow built on that framework.
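The dual rule-and-content idea with a minimum-risk decision can be illustrated with a minimal sketch. This is not the paper's implementation: the training messages, the blocked-domain rule, the loss values, and all function names (`train`, `posteriors`, `classify`) are hypothetical, and a plain multinomial naive Bayes model with Laplace smoothing is assumed for the content stage.

```python
import math
from collections import Counter

# Hypothetical toy training data; not taken from the paper.
TRAIN = [
    ("win free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting schedule for monday", "ham"),
    ("project report attached for review", "ham"),
]

# Hypothetical rule layer: sender domains rejected before content analysis.
BLOCKED_DOMAINS = {"spam.example"}

# Assumed loss table, keyed by (decision, true class). Deleting a legitimate
# message is taken to be ten times as costly as letting one spam through.
LOSS = {
    ("spam", "spam"): 0.0,
    ("spam", "ham"): 10.0,
    ("ham", "spam"): 1.0,
    ("ham", "ham"): 0.0,
}


def train(samples):
    """Count words per class and documents per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in samples:
        doc_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def posteriors(text, model):
    """Return P(class | text) under naive Bayes with Laplace smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in text.split():
            if word in vocab:  # words never seen in training are ignored
                count = word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalize in log space to avoid underflow on long messages.
    peak = max(log_scores.values())
    weights = {k: math.exp(v - peak) for k, v in log_scores.items()}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}


def classify(sender, text, model):
    """Rule check first, then the decision with the lowest expected loss."""
    if sender.rsplit("@", 1)[-1] in BLOCKED_DOMAINS:
        return "spam"
    post = posteriors(text, model)
    risk = {
        decision: sum(LOSS[(decision, true)] * post[true] for true in post)
        for decision in ("spam", "ham")
    }
    return min(risk, key=risk.get)


MODEL = train(TRAIN)
```

With this assumed loss table, a message is labelled spam only when its spam posterior exceeds 10/11, so a weakly spam-like message such as the single word "free" is still delivered, whereas a plain maximum-posterior rule would reject it. Choosing the loss ratio is a policy decision; the value 10 here is purely illustrative.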
Source
Journal of Nanjing Normal University (Engineering and Technology Edition) (《南京师范大学学报(工程技术版)》)
CAS
2006, No. 2, pp. 86-89 (4 pages)
Funding
Supported by a scientific research fund project of Nanjing Institute of Technology (Project No. 04-37)
Keywords
spam filter, Bayesian theory, text categorization, vector space model