Corpus Construction on Opinion Information Extraction in Chinese (面向中文文本的情感信息抽取语料库构建)

Cited by: 8
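Abstract: Sentiment information extraction is an important subtask of sentiment analysis. Although the task itself has been studied for some time, research on sentiment information extraction from Chinese text has only just begun. The primary difficulty it currently faces is that relevant Chinese corpora remain very limited. To better support this research, this paper focuses on an annotation scheme for Chinese text and builds a relatively large Chinese corpus for sentiment information extraction with rich annotation types. Beyond the information commonly annotated in such corpora (sentiment polarity, opinion targets, and sentiment words), the corpus specifically annotates opinion target ellipsis, sentiment-bearing sentences that contain no sentiment word, and polarity shifting. Corpus statistics show that the special phenomena identified in the paper (for example, opinion target ellipsis) are very common in Chinese sentiment expression, so research on them is well warranted. The corpus constructed in this paper provides a data foundation for the Chinese sentiment information extraction task.
English abstract: Opinion information extraction (OIE) is an important sub-task in sentiment analysis research. Currently, one pressing issue in Chinese OIE is that Chinese corpora are not readily available. This paper focuses on the annotation framework for Chinese OIE and constructs a Chinese corpus containing rich information. Specifically, in addition to the common elements, including sentiment orientation, opinion target, and opinion keyword, our corpus contains information on opinion target ellipsis, opinion expressions without sentiment words, and sentiment polarity shifting. The statistics show the prevalence of these special phenomena (e.g., opinion target ellipsis) in Chinese texts and the necessity of studying them.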
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list; 2015, No. 4, pp. 67-73 (7 pages)
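Funding: National Natural Science Foundation of China (61003155, 60873150); project fund of the State Key Laboratory of Pattern Recognition (模式识别国家重点实验室)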
Keywords: sentiment analysis; opinion information extraction; Chinese corpus