Research on a Semi-supervised Sentiment Classification Method Based on a Multi-classifier Voting Ensemble (cited by: 9)

Semi-supervised Sentiment Classification Based on Ensemble Learning with Voting
Abstract: Sentiment classification is a challenging and active research topic in natural language processing. This paper studies semi-supervised text sentiment classification. Traditional semi-supervised methods based on co-training require the text to provide a large set of useful attributes, have a training process of linear time complexity, and are not suitable for imbalanced corpora. This paper proposes a semi-supervised sentiment classification method based on a voting ensemble of multiple classifiers. A set of diverse sub-classifiers is built by varying the training sets, feature parameters, and classification methods. In each round, simple voting selects the samples with the highest confidence, doubling the size of the training set, and the model is then retrained. The method lets the sub-classifiers share useful attribute sets, has logarithmic time complexity, and can be applied to imbalanced corpora. Experiments show that it performs well on sentiment classification across different languages, domains, and corpus sizes, on both balanced and imbalanced corpora.
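The voting-and-doubling loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two toy classifiers (`nearest_centroid`, `one_nn`), the feature subsets, the synthetic two-cluster data, and the use of vote share as the confidence score are all assumptions made for illustration, and the sketch varies only feature subsets and classification methods, not the training sets.

```python
import math
import random
from collections import Counter

def nearest_centroid(train, feats):
    """Hypothetical sub-classifier A: label of the closest class centroid."""
    groups = {}
    for x, y in train:
        groups.setdefault(y, []).append([x[i] for i in feats])
    centroids = {y: [sum(col) / len(rows) for col in zip(*rows)]
                 for y, rows in groups.items()}
    return lambda x: min(
        centroids, key=lambda y: math.dist([x[i] for i in feats], centroids[y]))

def one_nn(train, feats):
    """Hypothetical sub-classifier B: label of the nearest training sample."""
    pts = [([x[i] for i in feats], y) for x, y in train]
    return lambda x: min(
        pts, key=lambda p: math.dist([x[i] for i in feats], p[0]))[1]

METHODS = [nearest_centroid, one_nn]      # different classification methods
FEATURE_SETS = [(0,), (1,), (0, 1)]       # different feature parameters

def self_train(labeled, unlabeled):
    """Grow the labeled set by voting until no unlabeled samples remain."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    rounds = 0
    while unlabeled:
        # Retrain every sub-classifier on the current labeled set.
        committee = [m(labeled, f) for m in METHODS for f in FEATURE_SETS]
        scored = []
        for x in unlabeled:
            votes = Counter(clf(x) for clf in committee)
            label, count = votes.most_common(1)[0]
            # Assumption: confidence is the share of votes for the winner.
            scored.append((count / len(committee), x, label))
        scored.sort(key=lambda t: t[0], reverse=True)
        # Add as many samples as are already labeled, doubling the set.
        k = min(len(labeled), len(scored))
        labeled += [(x, label) for _, x, label in scored[:k]]
        unlabeled = [x for _, x, _ in scored[k:]]
        rounds += 1
    return labeled, rounds

# Synthetic data (assumption): two Gaussian clusters standing in for
# positive and negative documents in a 2-dimensional feature space.
rng = random.Random(0)

def sample(center, n):
    return [(rng.gauss(center, 1.0), rng.gauss(center, 1.0)) for _ in range(n)]

pos, neg = sample(3.0, 32), sample(-3.0, 32)
truth = {x: "pos" for x in pos}
truth.update({x: "neg" for x in neg})
seed = [(x, "pos") for x in pos[:2]] + [(x, "neg") for x in neg[:2]]
pool = pos[2:] + neg[2:]
rng.shuffle(pool)

final, rounds = self_train(seed, pool)
accuracy = sum(truth[x] == y for x, y in final) / len(final)
print(rounds, len(final), round(accuracy, 2))
```

With 4 seed samples and 60 unlabeled ones, the labeled set grows 4, 8, 16, 32, 64, so the loop finishes in 4 rounds, which equals log2(64/4). A scheme that adds a fixed number of samples per round would instead need a number of rounds proportional to the size of the unlabeled pool.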
Authors: 黄伟 (Huang Wei), 范磊 (Fan Lei)
Source: 《中文信息学报》 (Journal of Chinese Information Processing), indexed in CSCD and the Peking University Core Journals list, 2016, No. 2, pp. 41-49, 106 (10 pages in total)
Keywords: sentiment classification; ensemble learning; semi-supervised learning
