Abstract
With the advent of cloud computing, a growing number of small and medium-sized enterprises can readily find low-cost solutions for analyzing massive data sets. Because MapReduce is a key programming model in Hadoop, this paper first introduces the cloud-based Hadoop platform and the SPRINT classification algorithm from data mining. It then describes in detail the execution flow of the parallel SPRINT algorithm under the MapReduce programming model, and uses the resulting decision tree model to classify the input data.
Source
Software (《软件》)
2010, No. 11, pp. 57-61 (5 pages)