Recognition of Opinion Bearing Sentences in Microblogs Based on New Words Extension and Features Selection (基于新词扩充和特征选择的微博观点句识别方法)

Cited by: 8
Abstract: Microblog sentiment analysis has become an active research topic because of its value for enterprise marketing planning, product feedback analysis, public opinion monitoring, and competitive intelligence mining. It typically comprises opinionated sentence recognition, sentiment element extraction, and opinion classification. Because sentiment is expressed mainly through opinionated sentences, recognizing them is the decisive factor in the overall performance of microblog sentiment analysis. This paper proposes a new method for recognizing opinionated sentences in microblogs, based on new-word extension and feature selection. The method first extends the sentiment lexicon using microblog emoticons and real data from Sina Weibo. It then uses a term-merging method to add new Internet words to the segmentation vocabulary, which improves word segmentation accuracy. Finally, it fuses microblog-specific features with traditional features such as sentiment words, n-grams, syntax, and topic, and uses an SVM classifier to recognize opinionated sentences. Experiments on 45,566 real microblog posts covering 20 topics from Tencent Weibo show that the method achieves good precision and F-measure.
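Authors: 赵洁 (Zhao Jie), 温润 (Wen Run)
Source: 《情报学报》 (Journal of the China Society for Scientific and Technical Information), indexed in CSSCI and the Peking University Core Journals list, 2013, No. 9, pp. 945-951 (7 pages)
Funding: National Natural Science Foundation of China, General Program, "Research on Microblog Emergency Event Detection and Short-Term Prediction Based on Spatio-Temporal Semantics" (No. 71273010); Anhui Provincial Natural Science Foundation, General Program (No. 1208085MG117)
Keywords: microblog, sentiment analysis, opinionated sentence recognition, feature fusion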
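
The pipeline outlined in the abstract (merge new words into the segmentation output, then fuse microblog-specific and lexicon-based features for a classifier) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy word lists, the greedy merging rule, the feature names, and the choice of emoticons, mentions, and hashtags as "microblog-specific" features are hypothetical and are not taken from the paper. The SVM training step is left out; the resulting feature dictionaries would be vectorized and passed to any SVM implementation.

```python
from typing import Dict, List, Set

# Hypothetical toy resources; the paper's actual lexicon and new-word list
# are not reproduced in this record.
NEW_WORDS: Set[str] = {"给力", "坑爹"}
SENTIMENT_WORDS: Set[str] = {"喜欢", "讨厌", "给力", "坑爹"}
EMOTICONS: Set[str] = {"[哈哈]", "[怒]"}


def merge_new_words(tokens: List[str], new_words: Set[str], max_len: int = 3) -> List[str]:
    """Greedily merge adjacent tokens whose concatenation is a known new word."""
    merged: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest span first so that longer new words take priority.
        for span in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = "".join(tokens[i:i + span])
            if candidate in new_words:
                merged.append(candidate)
                i += span
                break
        else:
            # No merge applies at this position; keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


def extract_features(tokens: List[str]) -> Dict[str, float]:
    """Fuse assumed microblog-specific features with lexicon-based counts."""
    return {
        "sentiment_word_count": float(sum(t in SENTIMENT_WORDS for t in tokens)),
        "emoticon_count": float(sum(t in EMOTICONS for t in tokens)),
        "has_mention": float(any(t.startswith("@") for t in tokens)),
        "has_hashtag": float(any(t.startswith("#") for t in tokens)),
    }


# A segmenter unaware of the new word "给力" splits it into two tokens.
raw_tokens = ["这", "手机", "给", "力", "[哈哈]"]
tokens = merge_new_words(raw_tokens, NEW_WORDS)
features = extract_features(tokens)
```

In this sketch the merge step restores "给力" as a single token, so the lexicon lookup in `extract_features` can count it as a sentiment word; without the merge, neither "给" nor "力" would match.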

