
Research on Product Review Spam Detection Based on Review Behavior

Cited by: 9
Abstract: To detect product review spam, this paper starts from the basic idea that most reviews written by review spammers are spam, and proposes a method that judges whether a reviewer is a spammer from his or her reviewing behavior. Three behaviors common among review spammers are defined and modeled: posting spam reviews that target products of the same type, products of the same brand, and products of the same seller. Alongside these models, a scoring method is proposed that computes each reviewer's spam score from the number of reviews carrying repeated excessively high or excessively low ratings. The experimental data consist of all reviews by reviewers who posted in the photography and video product section of the DANGDANG website. The results are checked by manual evaluation and by computing NDCG (normalized discounted cumulative gain) values, and they indicate that the method is accurate and effective at detecting review spammers.
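Source: Computer Engineering and Design (《计算机工程与设计》), indexed in CSCD and the Peking University Core Journals list, 2012, No. 11, pp. 4314-4319 (6 pages)
Funding: Beijing Forestry University Research Start-up Fund for New Faculty (BLX2w8019); Fundamental Research Funds for the Central Universities (YX2011-30)
Keywords: product review; review spam; review spam detection; review spammer; review behavior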
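
The abstract describes the scoring and evaluation only in outline, so the following Python is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes a 1-5 rating scale, treats ratings at or above `high` or at or below `low` as extreme, counts only extreme ratings that a reviewer repeats within the same group (product type, brand, or seller, chosen via `group_key`), and normalizes by the largest count. The field names, thresholds, normalization, and the log2-discounted form of NDCG are all illustrative choices that the record does not confirm.

```python
import math
from collections import defaultdict


def spam_scores(reviews, group_key, high=5, low=1):
    """Score each reviewer by repeated extreme ratings within one group.

    reviews: iterable of dicts with 'reviewer', 'rating' and `group_key`
    (hypothetical field names). Returns {reviewer: score in [0, 1]}.
    """
    # counts[reviewer][(group, direction)] = number of extreme ratings
    counts = defaultdict(lambda: defaultdict(int))
    reviewers = set()
    for r in reviews:
        reviewers.add(r["reviewer"])
        if r["rating"] >= high:
            counts[r["reviewer"]][(r[group_key], "high")] += 1
        elif r["rating"] <= low:
            counts[r["reviewer"]][(r[group_key], "low")] += 1

    # Assumption: a single extreme rating is not suspicious; only repeats count.
    raw = {
        rev: sum(n for n in counts[rev].values() if n > 1)
        for rev in reviewers
    }
    top = max(raw.values(), default=0)
    return {rev: (s / top if top else 0.0) for rev, s in raw.items()}


def dcg(gains):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg(gains):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Toy data, invented for illustration only.
reviews = [
    {"reviewer": "u1", "rating": 5, "brand": "A"},
    {"reviewer": "u1", "rating": 5, "brand": "A"},
    {"reviewer": "u1", "rating": 5, "brand": "A"},
    {"reviewer": "u2", "rating": 4, "brand": "A"},
    {"reviewer": "u2", "rating": 1, "brand": "B"},
    {"reviewer": "u3", "rating": 1, "brand": "B"},
    {"reviewer": "u3", "rating": 1, "brand": "B"},
]
scores = spam_scores(reviews, "brand")
ranking = sorted(scores, key=scores.get, reverse=True)

# Hypothetical human judgments: 2 = spammer, 1 = suspicious, 0 = normal.
judged = {"u1": 2, "u2": 0, "u3": 1}
quality = ndcg([judged[rev] for rev in ranking])
print(ranking, quality)
```

Running the same function with a type or seller field as `group_key` would give the other two behavior scores; how the paper combines the three is not stated in this record, so the sketch leaves them separate.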

