A Parallel k-Nearest Neighbor Network Public Opinion Classification Algorithm Based on the Hadoop Platform (cited by: 3)
Abstract: Network public opinion data is large in volume, highly dispersed, and unstructured, so commonly used text classification algorithms struggle to classify it quickly and accurately. To address this, the paper proposes a parallel kNN network public opinion classification algorithm based on the Hadoop platform. It uses Hadoop's distributed storage and a MapReduce implementation of parallel kNN to overcome the problems that arise when processing large batches of data. The classification ability and classification efficiency of the parallel kNN algorithm are tested experimentally. The results show that, when processing large volumes of network public opinion data, the algorithm classifies the data quickly, efficiently, and accurately.
Author: DU Shaobo (杜少波), School of Computer & Information Engineering, Guizhou University of Commerce, Guiyang 550014, China
Source: Video Engineering (《电视技术》), 2018, No. 3, pp. 58-62 (5 pages)
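Funding: Guizhou Provincial Department of Education Youth Science and Technology Talent Growth Project (黔教合KY字[2016]235, 黔教合KY字[2016]240); Guizhou Provincial Department of Education Teaching Content and Curriculum System Reform Project (SJ-JXGC-KC-002); Engineering Research Center of Guizhou Provincial Regular Higher Education Institutions (黔教合KY字[2016]016, 黔教合KY字[2017]022); Guizhou Provincial Regular Higher Education Institutions Science and Technology Top Talent Support Program (黔教合KY字[2016]086)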
Keywords: kNN (k-nearest neighbor); classification; Hadoop; public opinion; MapReduce
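
The abstract describes a parallel kNN expressed as a MapReduce program but does not show its design. The following is a minimal single-process sketch of one common way to structure such a job, not the paper's implementation. It assumes documents are already converted to sparse term-weight vectors, that cosine distance is the similarity measure, and that the training set is partitioned into splits; the function names (`map_split`, `reduce_votes`, `parallel_knn`) are hypothetical and no Hadoop API is used.

```python
import heapq
import math
from collections import Counter, defaultdict


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def map_split(train_split, test_docs, k):
    """Map step (assumed design): for one training split, emit the
    k locally nearest (distance, label) pairs for every test document."""
    for test_id, test_vec in test_docs.items():
        scored = ((cosine_distance(test_vec, vec), label)
                  for vec, label in train_split)
        for pair in heapq.nsmallest(k, scored):
            yield test_id, pair


def reduce_votes(test_id, candidates, k):
    """Reduce step (assumed design): merge the local candidates, keep the
    k globally nearest, and predict the majority label among them."""
    nearest = heapq.nsmallest(k, candidates)
    votes = Counter(label for _, label in nearest)
    return test_id, votes.most_common(1)[0][0]


def parallel_knn(train_splits, test_docs, k=3):
    """Simulate map -> shuffle -> reduce in one process."""
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for split in train_splits:   # each split would run on its own mapper
        for test_id, pair in map_split(split, test_docs, k):
            grouped[test_id].append(pair)
    return dict(reduce_votes(tid, cands, k) for tid, cands in grouped.items())
```

Because each mapper emits only its k nearest candidates per test document, the k globally nearest neighbors are always among the merged candidates, so the reducer's result matches a sequential kNN over the full training set up to tie-breaking.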
