Naive Bayesian text classification algorithm in cloud computing environment
Cited by: 21
Abstract: The main computational steps of Naive Bayesian text classification, namely preprocessing text into a uniform format, training, testing, and classification, were parallelized using the MapReduce distributed programming model, and experiments were carried out on a Hadoop cloud computing platform. The experimental results indicate that the MapReduce-parallelized algorithm, deployed on Hadoop, achieves good, roughly linear speedup as the number of nodes increases, and reaches a recognition (recall) rate of 86% when classifying Chinese Web pages.
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 9, pp. 2551-2554, 2566 (5 pages in total)
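Funding: Fundamental Research Funds for the Central Universities (CZY11002); Wuhan Science and Technology Key Project (201110821229); National Science and Technology Major Project of the Ministry of Industry and Information Technology (2011ZX03002-001-01)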
Keywords: cloud computing; parallel computing; MapReduce programming model; text classification; Naive Bayes algorithm
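The abstract gives no implementation details, so the following is only a minimal single-process sketch of how Naive Bayes training can be phrased as a map step and a reduce step. It is not the authors' Hadoop code. The function names (`map_doc`, `reduce_counts`, `train`, `classify`), the whitespace tokenizer, the multinomial model, and the add-one (Laplace) smoothing are all assumptions made for illustration; Chinese text would first need a word segmenter.

```python
import math
from collections import Counter, defaultdict
from itertools import chain


def map_doc(label, text):
    """Map step (hypothetical): emit one count per document and per token."""
    yield (("doc", label, None), 1)
    for word in text.split():  # assumption: tokens are whitespace-separated
        yield (("word", label, word), 1)


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def train(docs):
    """Build per-class document counts and word counts from (label, text) pairs."""
    totals = reduce_counts(chain.from_iterable(map_doc(l, t) for l, t in docs))
    doc_counts = {}
    word_counts = defaultdict(Counter)
    for (kind, label, word), n in totals.items():
        if kind == "doc":
            doc_counts[label] = n
        else:
            word_counts[label][word] = n
    vocab = {w for counts in word_counts.values() for w in counts}
    return doc_counts, word_counts, vocab


def classify(text, model):
    """Return the class with the highest log posterior (add-one smoothing assumed)."""
    doc_counts, word_counts, vocab = model
    n_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in doc_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n / n_docs)  # log prior of the class
        for word in text.split():
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Toy training data, invented for this example.
    docs = [
        ("sports", "ball team win"),
        ("sports", "team score goal"),
        ("tech", "cloud hadoop node"),
        ("tech", "hadoop mapreduce cluster"),
    ]
    model = train(docs)
    print(classify("hadoop cluster node", model))
```

In a real Hadoop job the map and reduce functions would run on separate nodes and the framework would group the emitted keys between them; here `reduce_counts` simply sums everything in one process to keep the sketch self-contained.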
