Recognition Method of Microblogging Spam Comment Based on Co-Training

Cited by: 3
Abstract: The large volume of spam comments on microblogs can adversely affect individuals, society, and even the nation. To identify spam comments in microblogs, a recognition method based on co-training is proposed. A rule-based recognition method is first defined to filter out explicit spam comments, and the remaining comments are categorized as related comments. An AdaBoost classifier and a Support Vector Machine (SVM) classifier are then constructed and trained collaboratively with the Co-Training algorithm to determine whether each of these comments is spam, which improves classification accuracy and reduces sample labeling work. Experimental results show that the method achieves a better recognition effect than the spam comment recognition method based on similarity calculation and the spam comment recognition method based on multiple comment features.
Authors: LI Zhixin; LAN Danmei; ZHANG Canlong; TANG Suqin (Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin, Guangxi 541004, China; Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, Guilin, Guangxi 541004, China)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2018, Issue 7, pp. 212-218 (7 pages)
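Funding: National Natural Science Foundation of China (61663004, 61363035, 61365009); Guangxi Natural Science Foundation (2016GXNSFAA380146, 2017GXNSFAA198365); Director's Fund of the Guangxi Key Lab of Multi-source Information Mining and Security (16-A-03-02); Guangxi Degree and Postgraduate Education Reform Project (JGY2015031)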
Keywords: microblogging spam comment; collaborative training (co-training); synonym word forest (Tongyici Cilin); Support Vector Machine (SVM); similarity computation
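
The two-stage pipeline described in the abstract (a rule-based filter for explicit spam, followed by co-training of two classifiers on the remaining comments) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the regular expression, the two toy features, the confidence threshold, and the feature "views" are all hypothetical, and a dependency-free nearest-centroid learner stands in for both the AdaBoost and SVM classifiers named in the abstract. The exchange scheme shown, where each learner passes its most confident pseudo-label to the other, is one common co-training variant; the record does not say which variant the paper uses.

```python
import math
import re

# Hypothetical rule for "explicit" spam; the paper's actual rules are not in the abstract.
EXPLICIT_SPAM = re.compile(r"https?://|add me on wechat|buy followers", re.IGNORECASE)


def is_explicit_spam(comment):
    """Stage 1: rule-based filter for explicit spam comments."""
    return bool(EXPLICIT_SPAM.search(comment))


class CentroidClassifier:
    """Stand-in learner (nearest class centroid). NOT AdaBoost and NOT an SVM."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_with_confidence(self, x):
        dists = {c: math.dist(x, m) for c, m in self.centroids.items()}
        ranked = sorted(dists, key=dists.get)
        if len(ranked) == 1:
            return ranked[0], 1.0
        total = dists[ranked[0]] + dists[ranked[1]]
        # Confidence in [0.5, 1.0]: how much closer the best centroid is than the runner-up.
        conf = 1.0 if total == 0 else dists[ranked[1]] / total
        return ranked[0], conf


def co_train(learners, views, X_lab, y_lab, X_unlab, rounds=5, threshold=0.6):
    """Stage 2: each round, every learner pseudo-labels its single most confident
    unlabeled example and adds it to the OTHER learner's training set."""
    train = [(list(X_lab), list(y_lab)) for _ in learners]
    pool = list(X_unlab)
    for _ in range(rounds):
        for i, (clf, view) in enumerate(zip(learners, views)):
            clf.fit([view(x) for x in train[i][0]], train[i][1])
        moved = False
        for i, (clf, view) in enumerate(zip(learners, views)):
            if not pool:
                break
            scored = [(clf.predict_with_confidence(view(x)), x) for x in pool]
            (label, conf), x = max(scored, key=lambda s: s[0][1])
            if conf >= threshold:
                train[1 - i][0].append(x)
                train[1 - i][1].append(label)
                pool.remove(x)
                moved = True
        if not moved:
            break
    # Final refit so both learners see every pseudo-label they received.
    for i, (clf, view) in enumerate(zip(learners, views)):
        clf.fit([view(x) for x in train[i][0]], train[i][1])
    return learners


# Toy feature vectors (illustrative values only):
# [similarity to the original post, fraction of ad-like tokens]
X_lab = [[0.9, 0.0], [0.8, 0.1], [0.1, 0.9], [0.2, 0.8]]
y_lab = ["normal", "normal", "spam", "spam"]
X_unlab = [[0.85, 0.05], [0.15, 0.95], [0.7, 0.2]]


def view_a(x):
    return x[:1]  # learner A sees only the similarity feature


def view_b(x):
    return x[1:]  # learner B sees only the ad-token feature


clf_a, clf_b = co_train(
    [CentroidClassifier(), CentroidClassifier()], (view_a, view_b), X_lab, y_lab, X_unlab
)
label_a, _ = clf_a.predict_with_confidence(view_a([0.1, 0.9]))
label_b, _ = clf_b.predict_with_confidence(view_b([0.1, 0.9]))
```

In a fuller version, the two stand-in learners would presumably be replaced by an AdaBoost classifier and an SVM, with features derived from the comment text; only four labeled examples are needed here because the unlabeled pool supplies the rest, which is the labeling saving the abstract refers to.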
