Abstract
Multi-server, multi-partition systems in cloud computing environments hold massive volumes of data. Traditional serial data mining methods cannot process such data in parallel, so mining efficiency is low. To address this problem, a multi-server multi-partition data mining system for cloud computing environments is designed. It comprises an Infrastructure as a Service (IaaS) layer, a Platform as a Service (PaaS) layer, and a Software as a Service (SaaS) layer, and enables efficient mining of large-scale data. The system performs distributed computation over massive data through the multi-server multi-partition data processing model in the PaaS layer, and parallelizes the K-means clustering data mining algorithm on the MapReduce mechanism, using Map and Reduce functions to mine multi-server multi-partition data in parallel. Experimental results show that the designed system greatly reduces the mining time for multi-server multi-partition data in cloud computing environments and improves the efficiency and stability of data mining.
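The abstract names the technique (K-means parallelized through Map and Reduce functions over data partitions) but gives no listing. The following is a minimal single-process sketch of that general pattern, not the authors' implementation: the function names `map_partition`, `reduce_partials`, and `kmeans_mapreduce` are hypothetical, the partitions are plain Python lists standing in for data held on different servers, and the map calls run sequentially here where a real MapReduce framework would distribute them.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]
# Per-cluster partial result: (coordinate sums, number of points)
Partial = Dict[int, Tuple[List[float], int]]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Return the index of the centroid closest to `point` (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_partition(partition: Sequence[Point], centroids: Sequence[Point]) -> Partial:
    """Map step (hypothetical name): assign each point in one partition to its
    nearest centroid and accumulate per-cluster coordinate sums and counts."""
    partial: Partial = {}
    for point in partition:
        k = nearest(point, centroids)
        sums, count = partial.get(k, ([0.0] * len(point), 0))
        partial[k] = ([s + p for s, p in zip(sums, point)], count + 1)
    return partial


def reduce_partials(partials: Sequence[Partial], centroids: Sequence[Point]) -> List[Point]:
    """Reduce step (hypothetical name): merge the partial sums from all
    partitions and compute the new centroid of each cluster."""
    totals: Partial = {}
    for partial in partials:
        for k, (sums, count) in partial.items():
            t_sums, t_count = totals.get(k, ([0.0] * len(sums), 0))
            totals[k] = ([a + b for a, b in zip(t_sums, sums)], t_count + count)
    updated: List[Point] = []
    for k, old in enumerate(centroids):
        if k in totals:
            sums, count = totals[k]
            updated.append(tuple(s / count for s in sums))
        else:
            updated.append(tuple(old))  # empty cluster: keep the old centroid
    return updated


def kmeans_mapreduce(
    partitions: Sequence[Sequence[Point]],
    centroids: Sequence[Point],
    max_iter: int = 20,
    tol: float = 1e-9,
) -> List[Point]:
    """Iterate map and reduce rounds until the centroids stop moving."""
    current = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # In a real deployment each map call would run on a separate server.
        partials = [map_partition(p, current) for p in partitions]
        updated = reduce_partials(partials, current)
        shift = max(
            sum((a - b) ** 2 for a, b in zip(u, c)) for u, c in zip(updated, current)
        )
        current = updated
        if shift <= tol:
            break
    return current
```

Because each map call returns only per-cluster sums and counts, the data sent to the reduce step grows with the number of clusters rather than the number of points, which is the usual reason this decomposition suits partitioned data.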
Authors
LI Na (李娜)
YU Shengwei (余省威)
LI Na; YU Shengwei (Hohhot Vocational College, Hohhot 010051, China; Zhengzhou University, Zhengzhou 450000, China)
Source
Modern Electronics Technique (《现代电子技术》)
Peking University Core Journal (北大核心)
2017, Issue 10, pp. 43-45, 49 (4 pages in total)
Funding
National Natural Science Foundation of China (81100029)
Keywords
cloud computing
multi-server
multi-partition data
data mining