
An improved method based on the LDA topic model for emotion classification of text corpora (cited by 3)
Abstract To address the inability of the traditional LDA topic model to capture word order and the associations between words, an improved weighted W-LDA emotion classification method is proposed. First, an average weighting value is introduced into the model's topic sampling and into the computation of its distribution expectations, so that words closely related to a topic are not drowned out by high-frequency words, which improves the discrimination between topics. Second, taking the extracted high-quality document-topic distributions and topic-word vectors as a basis, the support vector machine (SVM) algorithm is introduced to build an emotion analysis method for text corpora that integrates emotion word analysis and extraction, topic distribution computation, and emotion classification. Finally, the method is validated on real teaching evaluation data and a public review dataset. The results show that the proposed method is clearly superior to the SVM algorithm and to the algorithm in reference [15] in topic discrimination, classification accuracy, and F1-Measure.
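The weighting idea in the abstract can be illustrated with a small sketch. The record does not give the W-LDA weighting formula, so the IDF-style weight below is only a stand-in assumption, and the names `word_weights` and `topic_word_distribution`, the toy corpus, and the fixed topic assignments are all hypothetical. The sketch shows only how scaling topic-word counts by a per-word weight keeps a high-frequency word from dominating a topic's word distribution.

```python
import math
from collections import Counter

def word_weights(docs):
    """Assumed IDF-style weight (not the paper's formula): words that
    occur in every document get a weight close to the 0.1 floor."""
    n_docs = len(docs)
    df = Counter(w for doc in docs for w in set(doc))
    return {w: math.log(n_docs / df[w]) + 0.1 for w in df}

def topic_word_distribution(counts, weights, beta=0.01):
    """counts[k][w] is how often word w is assigned to topic k.
    Returns phi[k][w]: weight-scaled counts, smoothed by beta and normalized."""
    vocab = sorted(weights)
    phi = []
    for topic_counts in counts:
        scaled = {w: weights[w] * topic_counts.get(w, 0) + beta for w in vocab}
        total = sum(scaled.values())
        phi.append({w: v / total for w, v in scaled.items()})
    return phi

# Toy review corpus; "course" is the high-frequency word.
docs = [
    ["course", "course", "teacher", "clear", "helpful"],
    ["course", "course", "teacher", "boring", "slow"],
    ["course", "course", "homework", "heavy", "slow"],
]
# Fixed per-document topic assignments stand in for Gibbs sampling here.
assignments = [0, 1, 1]
counts = [Counter(), Counter()]
for doc, k in zip(docs, assignments):
    counts[k].update(doc)

weights = word_weights(docs)
unweighted = topic_word_distribution(counts, {w: 1.0 for w in weights})
weighted = topic_word_distribution(counts, weights)

# Compare the most probable word of topic 0 with and without weighting.
print(max(unweighted[0], key=unweighted[0].get))
print(max(weighted[0], key=weighted[0].get))
```

In a full pipeline along the lines the abstract describes, the per-document topic distributions produced by such a model would then serve as feature vectors for an SVM classifier; that step is omitted here.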
Author GUO Xiaohui (郭晓慧) (Institute of Information Engineering, Yango University, Fuzhou 350015, China)
Source Journal of Yanbian University (Natural Science Edition), CAS, 2018, No. 3, pp. 266-273 (8 pages)
Funding Scientific research project of the Fujian Provincial Department of Education (JA15631)
Keywords commentary corpus; LDA topic model; support vector machine; emotion classification
