Email Classification Based on the Glue of Content (cited by: 1)
Abstract: Email classification usually represents emails with the Vector Space Model (VSM). However, this model is built only on the frequencies of independent words that appear in the email content and ignores the structural features of the email, so the feature vector cannot represent the email's content accurately. To address this shortcoming, this paper applies the idea of extracting n-grams with a glue measure to text representation, uses it to assign weights to words, and designs an email classification system based on the resulting model. Because the glue measure takes the structural features of the email into account, the experiment shows that the new method can improve the classification precision of the system.
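The abstract does not give the weighting formula, so the following is only a minimal sketch of how a glue measure might feed into VSM term weights. It assumes the SCP glue with fair dispersion normalization described in reference 8 (Silva and Lopes), estimates probabilities as n-gram counts divided by the number of tokens, and weights each word as its term frequency times one plus the highest glue of any n-gram containing it. The function names `ngram_counts`, `scp_glue`, and `glue_weighted_vector`, and the weighting rule itself, are illustrative assumptions and not the authors' method.

```python
from collections import Counter


def ngram_counts(tokens, max_n):
    """Count every contiguous n-gram of length 1..max_n (keys are tuples)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def scp_glue(gram, counts, total):
    """SCP glue with fair dispersion normalization (assumed form):
    p(w1..wn)^2 divided by the average of p(w1..wi) * p(wi+1..wn)
    over all n-1 split points. Probabilities are counts / total tokens.
    """
    n = len(gram)
    p_whole = counts[gram] / total
    avg_split = sum(
        (counts[gram[:i]] / total) * (counts[gram[i:]] / total)
        for i in range(1, n)
    ) / (n - 1)
    return p_whole ** 2 / avg_split if avg_split > 0 else 0.0


def glue_weighted_vector(tokens, max_n=2):
    """Hypothetical weighting: tf(word) * (1 + best glue of any n-gram
    of length 2..max_n that contains the word)."""
    counts = ngram_counts(tokens, max_n)
    total = len(tokens)
    best = Counter()  # highest glue seen so far for each word
    for gram in counts:
        if len(gram) < 2:
            continue
        glue = scp_glue(gram, counts, total)
        for word in gram:
            best[word] = max(best[word], glue)
    return {w: counts[(w,)] * (1.0 + best[w]) for w in set(tokens)}


if __name__ == "__main__":
    email = "please confirm the meeting room for the meeting room booking".split()
    for word, weight in sorted(glue_weighted_vector(email).items()):
        print(f"{word}\t{weight:.3f}")
```

Under this sketch, a word that repeatedly occurs inside the same word sequence receives a larger weight than a word with the same raw frequency that occurs in unrelated contexts, which is one way sequence structure could enter an otherwise bag-of-words vector.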
Authors: Liao Ling (廖玲), Wen Dunwei (文敦伟)
Source: Computer Simulation (《计算机仿真》), CSCD, 2008, No. 2, pp. 121-123 (3 pages)
Keywords: glue measure; email classification; vector space model

References (9)

  • [1] W Cohen. Learning rules that classify email[C]. Proceedings of the AAAI Spring Symposium on Machine Learning in Information Access, 1996. 18-25. Cited by: 1
  • [2] J Takkinen, N Shahmehri. CAFE: A conceptual model for managing information in electronic mail[C]. Proceedings of the 31st Hawaii International Conference on System Sciences, Los Alamitos, CA: IEEE Computer Society, 1998. 44-53. Cited by: 1
  • [3] J Provost. Naive-Bayes vs. rule-learning in classification of email[R]. Texas: The University of Texas at Austin, Artificial Intelligence Lab, 1999. Cited by: 1
  • [4] K Taghva, et al. Ontology-based classification of email[C]. International Conference on Information Technology: Computers and Communications, 2003. 194-198. Cited by: 1
  • [5] Zhu Bin, Xiong Ying, Zhu Haiyun. Research on the application of artificial intelligence in email classification[J]. Journal of South China University of Technology (Natural Science Edition), 2001, 29(12): 53-56. Cited by: 5
  • [6] Xu Haitao, Yang Sen, Chai Qiaolin. An intelligent Chinese email classification system based on statistical word segmentation[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2003, 31(S1): 325-328. Cited by: 1
  • [7] Lei Jingsheng, Lin Dongxue, Fu Qianqian. Research on Web information retrieval based on an improved vector space model[J]. Computer Engineering, 2005, 31(1): 14-16. Cited by: 21
  • [8] J F Silva, G P Lopes. A Local Maximum Method and a Fair Dispersion Normalization for Extracting Multiword Units[C]. Proceedings of the 6th Meeting on the Mathematics of Language, 1999. 369-381. Cited by: 1
  • [9] Frank Smadja, K R McKeown, V Hatzivassiloglou. Translating Collocations for Bilingual Lexicons: A Statistical Approach[J]. Association for Computational Linguistics, 1996, 22(1): 3-38. Cited by: 1

Second-level references: 8

Co-citing documents: 24

Co-cited documents: 5

Citing documents: 1

Second-level citing documents: 1
