Research on a Text Classification Method Combining SVM and K-means (cited by: 5)
Abstract: Supervised classification is commonly used for text classification, but it requires manually labeled samples for training, and manual labeling is a tedious process. Unsupervised classification needs no such step, but its results are often unsatisfactory. To balance the strengths and weaknesses of the two, this paper uses a text classification method that combines SVM and K-means. The texts are first clustered with K-means; within each cluster, some of the texts closest to the cluster center are then selected as training samples for that class and used to train an SVM classifier; finally, the trained SVM classifies the texts. The method avoids the poor classification quality of unsupervised methods and also eliminates the tedious manual labeling of samples that SVM otherwise requires. Experimental results on disaster texts also demonstrate the feasibility of the new method.
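The three steps in the abstract (cluster, pick the samples nearest each center, train an SVM on them) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the text features, the distance measure, how many samples are kept per cluster, or the SVM kernel, so the TF-IDF features, Euclidean distance, the `keep_ratio` parameter, the linear SVM from scikit-learn, and the toy documents are all assumptions made here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC


def cluster_then_train_svm(texts, n_clusters, keep_ratio=0.5, seed=0):
    """Cluster texts with K-means, then train an SVM on the samples
    closest to each cluster center, using cluster ids as labels.

    keep_ratio (hypothetical parameter): fraction of each cluster kept
    as training samples; the abstract only says "some" nearby texts.
    """
    # Assumption: TF-IDF features (the abstract does not name the features).
    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(texts)

    # Step 1: unsupervised clustering.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    cluster_ids = kmeans.fit_predict(X)

    # Euclidean distance from every sample to its own cluster center.
    dist_to_own_center = kmeans.transform(X)[np.arange(X.shape[0]), cluster_ids]

    # Step 2: keep the samples nearest to each center as pseudo-labeled data.
    train_idx = []
    for c in range(n_clusters):
        members = np.where(cluster_ids == c)[0]
        n_keep = max(1, int(np.ceil(keep_ratio * len(members))))
        nearest = members[np.argsort(dist_to_own_center[members])[:n_keep]]
        train_idx.extend(nearest.tolist())
    train_idx = sorted(train_idx)

    # Step 3: train the SVM on the selected samples only.
    svm = LinearSVC()
    svm.fit(X[train_idx], cluster_ids[train_idx])
    return vectorizer, svm, cluster_ids, train_idx


# Made-up toy documents, for illustration only (not the paper's data).
toy_docs = [
    "flood river water level rising flood warning",
    "heavy rain flood water river overflow",
    "flood water submerged village river",
    "river flood rain water damage",
    "earthquake magnitude tremor aftershock",
    "earthquake tremor building collapse magnitude",
    "aftershock earthquake magnitude epicenter",
    "epicenter tremor earthquake aftershock",
]

vectorizer, svm, cluster_ids, train_idx = cluster_then_train_svm(toy_docs, n_clusters=2)

# Final step: classify texts with the trained SVM.
new_docs = ["flood water river", "earthquake tremor aftershock"]
print(svm.predict(vectorizer.transform(new_docs)))
```

The predicted labels are K-means cluster ids, so they are arbitrary integers; mapping them to meaningful class names still requires inspecting each cluster once, which is far cheaper than labeling every training sample by hand.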
Source: Computer Technology and Development (《计算机技术与发展》), 2009, No. 11, pp. 35-37, 44 (4 pages)
Funding: National Key Technology R&D Program of China (2006BAD20B02)
Keywords: text classification; K-means; support vector machines