Abstract
As the volume of product review text grows, sentiment analysis is needed to classify the sentiment polarity of products automatically, particularly at a fine-grained level. The LDA topic model can extract topic words from large-scale text data and, through topic clustering, uncover latent relationships between aspect words and opinion words. However, LDA tends to extract coarse-grained sentiment knowledge and struggles to meet the semantic requirements of fine-grained sentiment analysis. This paper proposes a semantic weakly-supervised topic model that embeds word association, global aspect words, and topic-sentiment membership into LDA as semantic prior knowledge, improving LDA's ability to recognize aspect words, opinion words, and the relationships between them. The main research contents are: extracting word-association semantic constraints from syntactic parsing, part-of-speech relations, and context relevance; acquiring two further kinds of semantic constraints, global aspect word recognition and topic-sentiment membership; and designing the mechanism by which semantic constraints influence LDA topic assignment, yielding SWS-LDA, a semantic weakly-supervised topic model for fine-grained sentiment analysis. Experiments show that SWS-LDA improves the semantic understanding of LDA, raises the extraction rate of local aspect words and local opinion words, and improves the accuracy of fine-grained sentiment polarity classification.
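The abstract does not state how the semantic constraints enter LDA's topic assignment, so the following is only an illustrative sketch and not the SWS-LDA algorithm. It assumes one common weak-supervision device: seed words receive extra mass in the topic-word Dirichlet prior of a collapsed Gibbs sampler. The function name `seeded_lda`, the `boost` parameter, and the toy reviews are all hypothetical.

```python
import random

def seeded_lda(docs, n_topics, seeds, alpha=0.1, beta=0.01, boost=1.0,
               n_iter=200, rng_seed=0):
    """Collapsed Gibbs sampling for LDA with an asymmetric topic-word prior.

    `seeds` maps a topic index to words whose prior in that topic is
    beta + boost instead of beta (an assumed form of weak supervision).
    """
    rng = random.Random(rng_seed)
    vocab = sorted({w for doc in docs for w in doc})
    wid = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Prior: beta everywhere, beta + boost for seeded (topic, word) pairs.
    prior = [[beta] * n_words for _ in range(n_topics)]
    for k, words in seeds.items():
        for w in words:
            if w in wid:
                prior[k][wid[w]] += boost
    prior_sum = [sum(row) for row in prior]

    n_dk = [[0] * n_topics for _ in docs]            # document-topic counts
    n_kw = [[0] * n_words for _ in range(n_topics)]  # topic-word counts
    n_k = [0] * n_topics                             # tokens per topic

    def bump(d, k, v, delta):
        n_dk[d][k] += delta
        n_kw[k][v] += delta
        n_k[k] += delta

    # Random initial topic for every token.
    z = []
    for d, doc in enumerate(docs):
        z.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, z[d]):
            bump(d, k, wid[w], +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = wid[w]
                bump(d, z[d][i], v, -1)  # remove the token's current assignment
                weights = [
                    (n_dk[d][t] + alpha)
                    * (n_kw[t][v] + prior[t][v]) / (n_k[t] + prior_sum[t])
                    for t in range(n_topics)
                ]
                z[d][i] = rng.choices(range(n_topics), weights=weights)[0]
                bump(d, z[d][i], v, +1)

    # Rank words per topic by count plus prior (posterior-mean numerator).
    top = []
    for k in range(n_topics):
        ranked = sorted(range(n_words),
                        key=lambda v: n_kw[k][v] + prior[k][v], reverse=True)
        top.append([vocab[v] for v in ranked[:4]])
    return top, n_kw

# Toy product reviews, already tokenized (hypothetical data).
reviews = [
    ["battery", "lasts", "long", "charge", "fast"],
    ["screen", "bright", "clear", "display", "sharp"],
    ["battery", "charge", "slow", "drains"],
    ["display", "screen", "dim", "blurry"],
]
seeds = {0: ["battery", "charge"], 1: ["screen", "display"]}
top_words, topic_word_counts = seeded_lda(reviews, 2, seeds, boost=5.0)
```

In this sketch the seeds play the role of prior knowledge about aspect words. The word-association and topic-sentiment membership constraints described in the abstract would need their own terms in the sampling weights, which the abstract does not specify.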
Authors
彭云
万红新
钟林辉
PENG Yun; WAN Hong-xin; ZHONG Lin-hui (School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China; School of Mathematics & Computer Science, Jiangxi Science & Technology Normal University, Nanchang 330038, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals (北大核心)
2018, Issue 5, pp. 978-985 (8 pages)
Journal of Chinese Computer Systems
Funding
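Supported by the National Natural Science Foundation of China (61662032, 61462040)
Supported by the Humanities and Social Sciences Project of Jiangxi Provincial Universities (JC1544, TQ1505)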
Keywords
product reviews
topic model
latent Dirichlet allocation (LDA)
sentiment analysis
weak supervision