Product feature extraction based on unsupervised learning (cited by: 2)
Abstract: Product feature extraction is an important research topic in opinion extraction and sentiment analysis. This paper proposes an unsupervised learning method for extracting product features automatically. Text patterns are extracted from product review sentences and used as features to represent every noun and noun phrase in the reviews (except product names) as a vector. A clustering algorithm then groups these vectors into two clusters. Using product names as external knowledge, text patterns that express the "whole-part" (part-of) relation identify which of the two clusters is the product feature set. Experimental results show that the method performs well on a corpus of electronic product reviews.
Author: 熊壮 (Xiong Zhuang)
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2012, No. 10, pp. 160-163 (4 pages)
Funding: National Science and Technology Major Project of China (No. 2008ZX06315-001)
Keywords: product review; text pattern; part-of relation
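
The pipeline described in the abstract can be illustrated with a small sketch. This is not the author's implementation: the record does not say which text patterns, clustering algorithm, or part-of patterns the paper uses. The sketch assumes neighbouring-word patterns as features, a simple two-cluster k-means with cosine similarity, and two illustrative part-of patterns. The toy English reviews, the candidate noun list, the product name `camera`, and all function names are hypothetical, and only single-word candidates are handled.

```python
import math
from collections import Counter

# Hypothetical toy data. A real system would obtain candidate nouns and
# noun phrases from a POS tagger or chunker; here they are given by hand.
PRODUCT = "camera"
CANDIDATES = ["screen", "battery", "lens", "week", "month", "friend"]
REVIEWS = [
    "the screen of the camera is bright",
    "the battery of the camera is weak",
    "the lens of the camera is sharp",
    "the screen is great",
    "the battery is small",
    "i bought it last week",
    "it arrived last month",
    "i told my friend last week",
    "i met my friend",
]
# Assumed part-of patterns; the paper's actual patterns are not given here.
PART_OF_PATTERNS = ("{np} of the {product}", "{product} has a {np}")


def pattern_vector(noun, sentences):
    """Count left/right neighbour-word patterns around each occurrence of noun."""
    vec = Counter()
    for sent in sentences:
        toks = ["<s>"] + sent.split() + ["</s>"]
        for i, tok in enumerate(toks):
            if tok == noun:
                vec["L:" + toks[i - 1]] += 1
                vec["R:" + toks[i + 1]] += 1
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def two_means(vectors, iterations=10):
    """Split the nouns into two clusters with a deterministic cosine k-means."""
    names = list(vectors)
    first = vectors[names[0]]
    # Seed the second centroid with the noun least similar to the first one.
    second = vectors[min(names, key=lambda n: cosine(vectors[n], first))]
    centroids = [first, second]
    labels = {}
    for _ in range(iterations):
        new = {n: 0 if cosine(vectors[n], centroids[0]) >=
               cosine(vectors[n], centroids[1]) else 1 for n in names}
        if new == labels:
            break
        labels = new
        for idx in (0, 1):
            members = [vectors[n] for n in names if labels[n] == idx]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[idx] = sum(members, Counter())
    return [{n for n in names if labels[n] == idx} for idx in (0, 1)]


def part_of_hits(noun, product, sentences):
    """Count sentences fragments matching a part-of pattern for (noun, product)."""
    hits = 0
    for sent in sentences:
        padded = f" {sent} "
        for pat in PART_OF_PATTERNS:
            hits += padded.count(" " + pat.format(np=noun, product=product) + " ")
    return hits


def extract_product_features(sentences, candidates, product):
    """Cluster candidates, then pick the cluster with more part-of evidence."""
    vectors = {n: pattern_vector(n, sentences) for n in candidates if n != product}
    clusters = two_means(vectors)
    return max(clusters, key=lambda c: sum(
        part_of_hits(n, product, sentences) for n in c))


print(sorted(extract_product_features(REVIEWS, CANDIDATES, PRODUCT)))
```

On this toy corpus the nouns that share contexts such as "the _ of" fall into one cluster, and the part-of patterns anchored on the product name select that cluster. The product name only decides which cluster to keep; it plays no part in the clustering itself, which matches the role of external knowledge described in the abstract.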

