
Research of kNN in Text Categorization (original title: kNN在文本分类中的应用研究)

Cited by: 3
Abstract: With the rapid development of network technology and digital libraries, the number of online documents is growing quickly, and automatic text classification has become a key technology for processing and organizing large volumes of document data. As a simple, effective, non-parametric classification method, kNN is widely used in text classification. This paper introduces the idea behind the kNN classification algorithm and two different decision rules, and uses an implemented text classification system to compare experimentally the kNN method based on the discrete-valued rule with the kNN method based on similarity weighting. The experimental results show that the similarity-weighted kNN method achieves better classification performance than the kNN method based on the discrete-valued rule.
Source: Computer and Modernization (《计算机与现代化》), 2008, No. 11, pp. 69-72 (4 pages)
Funding: Tangshan Key Laboratory funded project (06360301A-6)
Keywords: text categorization; kNN; feature selection
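
The abstract names two kNN decision rules but this record does not give their formulas. The following is a minimal sketch, assuming the common formulation in which the discrete-valued rule gives each of the k nearest neighbors one vote, while the similarity-weighted rule adds each neighbor's similarity to its class score. Cosine similarity over raw term counts, the function names `cosine` and `knn_classify`, and the toy data are illustrative assumptions, not the paper's implementation.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query, training, k, weighted):
    """Classify a tokenized document from its k most similar training documents.

    training: list of (tokens, label) pairs (hypothetical toy format).
    weighted=False: discrete rule, each neighbor contributes one vote.
    weighted=True: similarity-weighted rule, each neighbor contributes its similarity.
    """
    q = Counter(query)
    sims = [(cosine(q, Counter(tokens)), label) for tokens, label in training]
    neighbors = sorted(sims, key=lambda s: s[0], reverse=True)[:k]
    scores = defaultdict(float)
    for sim, label in neighbors:
        scores[label] += sim if weighted else 1.0
    return max(scores, key=scores.get)


# Toy data (assumed): one near-identical "ml" document and two weakly related "sport" documents.
training = [
    (["kernel", "svm", "margin"], "ml"),
    (["kernel", "football", "goal", "match"], "sport"),
    (["svm", "football", "goal", "referee"], "sport"),
]
query = ["kernel", "svm", "margin"]

discrete_label = knn_classify(query, training, k=3, weighted=False)
weighted_label = knn_classify(query, training, k=3, weighted=True)
print(discrete_label, weighted_label)
```

In this toy case the two rules disagree: the discrete rule follows the two weakly similar "sport" neighbors, while the weighted rule follows the single highly similar "ml" neighbor. This only illustrates how the rules can differ; it says nothing about the paper's reported results.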
