Sentiment analysis of Chinese micro-blog based on multi-feature and combined classification
Abstract: To improve the accuracy of micro-blog sentiment analysis, verbs and adjectives in micro-blog texts are selected as features, and a feature dimension reduction method based on a hierarchical structure is proposed. Feature polarity values are computed with an emoticon-based method. On this basis, a position weight calculation method based on feature polarity values is proposed, and a support vector machine (SVM) is used as the machine learning model to classify micro-blog texts into three classes: positive, negative, and neutral. In other words, multiple features are extracted, and a lexicon-based method is combined with a machine learning method to improve the accuracy of sentiment analysis. Experimental results show that the method achieves an average accuracy of 72.16%. The proposed sentiment analysis method, based on multiple features and a combined classifier, classifies the sentiment of Chinese micro-blog texts fairly effectively.
Authors: 吴维 (Wu Wei), 肖诗斌 (Xiao Shibin)
Source: Journal of Beijing Information Science and Technology University (Natural Science Edition) 《北京信息科技大学学报(自然科学版)》, 2013, No. 4, pp. 39-45 (7 pages)
Funding: National Natural Science Foundation of China (61171159, 61271304)
Keywords: micro-blog; emoticon; combined classification; position weight; sentiment classification
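
The abstract names two lexicon-side steps, emoticon-based feature polarity and position weighting, without giving their formulas. The sketch below is one possible reading, not the authors' method: the emoticon sets, the polarity ratio `(p - n) / (p + n)`, and the linear position weight `(i + 1) / n` are all assumptions made for illustration, and the English tokens stand in for segmented Chinese verbs and adjectives.

```python
from collections import Counter

# Hypothetical emoticon lexicon; the paper's actual emoticon list is not given.
POSITIVE_EMOTICONS = {"[laugh]", "[good]"}
NEGATIVE_EMOTICONS = {"[cry]", "[angry]"}


def emoticon_label(emoticons):
    """Return +1, -1, or 0 for a post from the emoticons it contains."""
    pos = sum(e in POSITIVE_EMOTICONS for e in emoticons)
    neg = sum(e in NEGATIVE_EMOTICONS for e in emoticons)
    if pos > neg:
        return 1
    if neg > pos:
        return -1
    return 0


def feature_polarity(posts):
    """Estimate a polarity in [-1, 1] for each feature.

    posts is a list of (features, emoticons) pairs. A feature's polarity is
    (p - n) / (p + n), where p and n count the positive-labelled and
    negative-labelled posts containing it (an assumed formula).
    """
    pos_counts, neg_counts = Counter(), Counter()
    for features, emoticons in posts:
        label = emoticon_label(emoticons)
        if label > 0:
            pos_counts.update(set(features))
        elif label < 0:
            neg_counts.update(set(features))
    polarity = {}
    for feature in set(pos_counts) | set(neg_counts):
        p, n = pos_counts[feature], neg_counts[feature]
        polarity[feature] = (p - n) / (p + n)
    return polarity


def position_weighted_score(features, polarity):
    """Sum feature polarities, weighting later positions more heavily.

    The linear weight (i + 1) / n is an assumption; unknown features
    contribute 0.
    """
    n = len(features)
    if n == 0:
        return 0.0
    return sum(
        ((i + 1) / n) * polarity.get(feature, 0.0)
        for i, feature in enumerate(features)
    )


# Toy corpus: each post is (verb/adjective features, emoticons).
posts = [
    (["love", "happy"], ["[laugh]"]),
    (["hate"], ["[angry]"]),
    (["love"], ["[good]"]),
    (["love", "hate"], ["[cry]"]),
]
polarity = feature_polarity(posts)
score = position_weighted_score(["hate", "happy"], polarity)
```

In a combined scheme like the one the abstract describes, scores of this kind would serve as input features to a three-class SVM (positive, negative, neutral); that stage is omitted here because the abstract does not specify the kernel or feature layout.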