Stance detection in Chinese microblogs via fusing multiple text features (cited by: 23)
Abstract: Stance detection in microblogs determines whether the author of a post supports, opposes, or is neutral toward a given topic. Building on a supervised classification framework, this paper proposes a stance detection method for Chinese microblogs that fuses multiple text features. It first explores five feature representations: three based on word-frequency statistics (bag-of-words (BoW), synonym-dictionary-based BoW, and a feature capturing co-occurrence between words and stance labels (s Variance)) and two deep text features (word vectors and character vectors). Support vector machine (SVM), random forest, and gradient boosting decision tree (GBDT) classifiers are then applied to these features for stance classification. Finally, late fusion combines the classifiers of all features. Experiments show that the proposed features improve stance detection across different topics, with significant gains over the traditional BoW feature. Moreover, the deep text features and the statistics-based features capture different information in the text and are complementary for stance detection. The system built on this method achieved the best result in the Chinese microblog stance detection evaluation task at the 2016 Conference on Natural Language Processing and Chinese Computing (NLPCC 2016).
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2017, No. 21, pp. 77-84 (8 pages)
Funding: National Key Research and Development Program of China (No. 2016YFB1001202)
Keywords: stance detection; sentiment analysis; text feature representations; Chinese microblogs; text classification
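
The late fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the classifier outputs are combined, so this example assumes a weighted average of per-classifier class probabilities, which is one common form of late fusion and not necessarily the authors' scheme. The label names (`FAVOR`, `AGAINST`, `NONE`), the function names, and the probability values are all hypothetical.

```python
from typing import Dict, List, Optional

# Assumed label set: support / oppose / neutral, named here for illustration.
LABELS = ("FAVOR", "AGAINST", "NONE")


def late_fusion(
    predictions: List[Dict[str, float]],
    weights: Optional[List[float]] = None,
) -> Dict[str, float]:
    """Return the weighted average of per-classifier class probabilities."""
    if not predictions:
        raise ValueError("need at least one classifier output")
    if weights is None:
        # Default assumption: every classifier gets the same weight.
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("one weight per classifier output is required")
    total = sum(weights)
    fused = {label: 0.0 for label in LABELS}
    for probs, weight in zip(predictions, weights):
        for label in LABELS:
            fused[label] += weight * probs.get(label, 0.0) / total
    return fused


def predict_stance(
    predictions: List[Dict[str, float]],
    weights: Optional[List[float]] = None,
) -> str:
    """Pick the label with the highest fused probability."""
    fused = late_fusion(predictions, weights)
    return max(LABELS, key=lambda label: fused[label])


# Hypothetical outputs of three classifiers, each trained on a different feature.
svm_bow = {"FAVOR": 0.5, "AGAINST": 0.3, "NONE": 0.2}
rf_synonym_bow = {"FAVOR": 0.2, "AGAINST": 0.6, "NONE": 0.2}
gbdt_word_vectors = {"FAVOR": 0.3, "AGAINST": 0.5, "NONE": 0.2}

outputs = [svm_bow, rf_synonym_bow, gbdt_word_vectors]
print(predict_stance(outputs))
```

Averaging probabilities lets a classifier that is confident on one feature outvote two uncertain ones, which is one way complementary features can help. Passing unequal `weights` would favor the stronger feature classifiers.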