期刊文献+

A High-Performance Extraction Method for Public Opinion on Internet 被引量:3

A High-Performance Extraction Method for Public Opinion on Internet
下载PDF
导出
摘要 Aiming at the importance of the analysis for public opinion on Internet, the authors propose a high-performance extraction method for public opinion. In this method, the space model for classification is adopted to describe the relationship between words and categories. The combined feature selection method is used to remove noisy words from the original feature space effectively. Then the category weight of words is calculated by the improved formula combining the frequency of words and distribution of words. Finally, the class weights of the not-categorized documents based on the category weight of words are obtained for realizing opinion extraction. Experiment results show that the method has comparatively high classification and good stability. Aiming at the importance of the analysis for public opinion on Internet, the authors propose a high-performance extraction method for public opinion. In this method, the space model for classification is adopted to describe the relationship between words and categories. The combined feature selection method is used to remove noisy words from the original feature space effectively. Then the category weight of words is calculated by the improved formula combining the frequency of words and distribution of words. Finally, the class weights of the not-categorized documents based on the category weight of words are obtained for realizing opinion extraction. Experiment results show that the method has comparatively high classification and good stability.
出处 《Wuhan University Journal of Natural Sciences》 CAS 2007年第5期902-906,共5页 武汉大学学报(自然科学英文版)
基金 Supported by the National High Technology Research and Development Program of China (2005AA147030)
关键词 opinion extraction text categorization class spacemodel feature selection opinion extraction text categorization class spacemodel feature selection
  • 相关文献

参考文献20

二级参考文献70

  • 1寇莎莎,魏振军.自动文本分类中权值公式的改进[J].计算机工程与设计,2005,26(6):1616-1618. 被引量:25
  • 2陈涛,谢阳群.文本分类中的特征降维方法综述[J].情报学报,2005,24(6):690-695. 被引量:79
  • 3Lewis D. D.. An evaluation of phrasal and clustered representalions on a text categorization task. In: Proceedings of SIGIR'92,the 15st ACM International Conference on Research and Development in Information Retrieval, Copenhagen, Denmark,1992, 37-50. 被引量:1
  • 4Sebastiani F,. Machine learning in automated text categorization. ACM Computing Surveys, 2002, 34(1): 1-47. 被引量:1
  • 5Lewis D.. Naive bayes at forty: The independence assumption in information retrieval. In: Proceedings of the 10th European Conference on Machine Learning, Chemnitz, Germany, 1998,4-15. 被引量:1
  • 6Salton G.. Automatic Text Processing: The Transformation,Analysis, and Retrieval of Information by Computer. Reading,MA: Addison Wesley, 1989. 被引量:1
  • 7Mitchell T. M.. Machine Learning. New York: McCraw Hill,1996. 被引量:1
  • 8Joachims T.. Text categorization with support vector machines: Learning with many relevant features. In: Proceedings of the 10th European Conference on Machine Learning,Chemnitz, Germany, 1998, 137-142. 被引量:1
  • 9Yang Y. , Liu X.. A Re-examination of text categorization methods. In: Proceedings of SIGIR'99, the 22nd ACM International Conference on Research and Development in Information Retrieval, Berkeley, CA, 1999, 42-49. 被引量:1
  • 10樊兴华.因果推理和文本分类.清华大学博士后出站报告,2004. 被引量:1

共引文献262

同被引文献34

引证文献3

二级引证文献29

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部