Abstract
To reduce the impact of deceptive reviews in the technology (IT) products domain and to improve the accuracy with which they are recognized, we propose a recognition method based on reviewer behavior and review content, informed by how Chinese deceptive reviews in this domain are produced and what they contain. First, we build a model of the degree to which a reviewer exhibits paid-poster ("Internet water army") characteristics, based on the number of reviews the reviewer posts, the posting frequency, and review length. Second, we propose methods for computing content features: length degree, professional degree, emotional density, format standardization degree, and emotional imbalance degree. Finally, we propose an unsupervised clustering method for identifying deceptive technology product reviews that uses the content features as the feature vector and the behavioral feature as a regulating parameter. Experiments on review datasets from this domain show that the proposed method achieves high accuracy and adapts well to different topics within the same domain.
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
PKU Core Journal (北大核心)
2015, No. 11, pp. 2498-2503 (6 pages)
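Funding
Supported by the National Natural Science Foundation of China (61462037, 61173146, 61262033)
Supported by the Jiangxi Provincial Natural Science Foundation (20142BAB217014, 20142BAB207009)
Supported by the Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ13303)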
Keywords
behavior
content
deceptive reviews
technology products
unsupervised clustering