Product Attribute Extraction Based on Feature Selection and Pointwise Mutual Information Pruning (cited by: 3)
Abstract: Automatic extraction of product attributes is an important research topic in sentiment analysis. This paper proposes a product attribute extraction method based on feature selection and on pruning by term frequency and pointwise mutual information (PMI). First, l1-norm regularization (Lasso), which is commonly used in classification tasks, is introduced to recast attribute extraction as a feature selection problem in classification: because Lasso yields a sparse model, the small number of features retained in the model serve as the candidate set of product attributes. Second, the candidates are ranked by their frequency in the text and pruned. Finally, further merging and PMI pruning produce the final set of product attributes. Experiments on a set of Chinese product reviews confirm the effectiveness of the proposed method.
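Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list; 2015, No. 2, pp. 187-192 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (Nos. 61003112, 61170181), the Key Program of the National Social Science Foundation of China (No. 11AZD121), and the Natural Science Foundation of Jiangsu Province (No. BK2011192)
Keywords: Sentiment Analysis; Product Attribute Extraction; l1-norm Regularization; Pointwise Mutual Information Pruning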
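
The two pruning stages named in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not say what the PMI is computed against, how frequency is counted, or which thresholds are used, so the sketch assumes document-level counts, a hypothetical seed term (here the product name "phone"), and made-up thresholds. It also assumes the Lasso step has already produced the candidate list. All function names and the toy data are illustrative.

```python
import math
from collections import Counter


def frequency_prune(candidates, docs, min_count):
    """Rank candidates by document frequency and drop those below min_count.

    `docs` is a list of token sets, one per review (an assumed representation).
    """
    counts = Counter()
    for doc in docs:
        for cand in candidates:
            if cand in doc:
                counts[cand] += 1
    kept = [c for c in candidates if counts[c] >= min_count]
    # Most frequent first; ties keep their original order (sorted is stable).
    return sorted(kept, key=lambda c: -counts[c])


def pmi(term, seed, docs):
    """Document-level PMI: log2(P(term, seed) / (P(term) * P(seed)))."""
    n = len(docs)
    n_term = sum(1 for d in docs if term in d)
    n_seed = sum(1 for d in docs if seed in d)
    n_both = sum(1 for d in docs if term in d and seed in d)
    if n_both == 0:
        # The pair never co-occurs, so it is treated as maximally unrelated.
        return float("-inf")
    return math.log2(n_both * n / (n_term * n_seed))


def pmi_prune(candidates, seed, docs, threshold):
    """Keep candidates whose PMI with the seed term reaches the threshold."""
    return [c for c in candidates if pmi(c, seed, docs) >= threshold]


# Toy tokenized reviews (hypothetical data, already segmented into words).
docs = [
    {"phone", "screen", "clear"},
    {"phone", "battery", "durable"},
    {"phone", "screen", "battery"},
    {"delivery", "fast"},
    {"delivery", "slow", "courier"},
    {"screen", "phone"},
]
# Stand-in for the candidate set that the sparse Lasso model would supply.
candidates = ["screen", "battery", "delivery", "courier"]

frequent = frequency_prune(candidates, docs, min_count=2)
print(frequent)   # ['screen', 'battery', 'delivery']
attributes = pmi_prune(frequent, "phone", docs, threshold=0.0)
print(attributes)  # ['screen', 'battery']
```

In this toy run, "courier" is dropped for appearing in only one review, and "delivery" is dropped because it never co-occurs with the seed term. The merging step mentioned in the abstract is omitted because the abstract gives no detail about it.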