A novel distributed parallel FP-Growth algorithm based on Hadoop framework (cited by: 7)
Abstract: The traditional FP-Growth algorithm suffers from low mining efficiency and memory overflow on large-scale data. To address these problems, a new parallel FP-Growth algorithm is proposed on the basis of the traditional algorithm and parallelized under the MapReduce programming model of the Hadoop distributed computing framework. Experimental data show that, compared with the traditional FP-Growth algorithm, the parallel algorithm needs less computation time for the same amount of data and processes more data in the same amount of time. Under certain conditions it also resolves the memory overflow problem in large-scale data mining.
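The abstract does not say how the algorithm divides work between MapReduce tasks. As a rough illustration only, the sketch below simulates in a single Python process one commonly used way to parallelize FP-Growth: a first pass counts items, a second pass sends each transaction's relevant prefix to an item group, and each group is then mined independently. It is not the authors' implementation. All function names and the round-robin grouping are assumptions made for this example. The local miner uses projected transaction lists instead of a real FP-tree, and it returns all frequent itemsets rather than only closed ones.

```python
from collections import Counter, defaultdict


def count_items(transactions):
    """Pass 1 (map + reduce): global support count of every item."""
    counts = Counter()
    for t in transactions:
        counts.update(set(t))
    return counts


def shard_transactions(transactions, rank, num_groups):
    """Pass 2 (map): emit one group-dependent prefix per group and transaction."""
    shards = defaultdict(list)
    for t in transactions:
        # Keep frequent items only, ordered from most to least frequent.
        items = sorted((i for i in set(t) if i in rank), key=rank.get)
        seen = set()
        # Scan from the least frequent item; each group receives the
        # longest prefix that ends in one of its own items.
        for pos in range(len(items) - 1, -1, -1):
            gid = rank[items[pos]] % num_groups  # assumed round-robin grouping
            if gid not in seen:
                seen.add(gid)
                shards[gid].append(items[:pos + 1])
    return shards


def mine_shard(db, gid, rank, num_groups, min_support):
    """Pass 2 (reduce): pattern growth restricted to itemsets owned by gid."""
    results = {}

    def grow(cond_db, suffix, top):
        counts = Counter(i for t in cond_db for i in t)
        for item, sup in counts.items():
            # At the top level, start only from items that belong to this group.
            if sup < min_support or (top and rank[item] % num_groups != gid):
                continue
            pattern = suffix | {item}
            results[pattern] = sup
            # Conditional database: the items that precede `item`.
            proj = [t[:t.index(item)] for t in cond_db if item in t]
            grow([p for p in proj if p], pattern, False)

    grow(db, frozenset(), True)
    return results


def parallel_fp_growth(transactions, min_support, num_groups=2):
    counts = count_items(transactions)
    frequent = sorted((i for i, c in counts.items() if c >= min_support),
                      key=lambda i: (-counts[i], i))
    rank = {item: r for r, item in enumerate(frequent)}
    shards = shard_transactions(transactions, rank, num_groups)
    merged = {}
    # Each loop iteration stands in for an independent reduce task.
    for gid, db in shards.items():
        merged.update(mine_shard(db, gid, rank, num_groups, min_support))
    return merged


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c", "d"]]
    for itemset, support in parallel_fp_growth(data, min_support=3).items():
        print(sorted(itemset), support)
```

Because every shard holds only the prefixes relevant to its own items, no single task needs the full dataset in memory, which is the kind of property the abstract's memory overflow claim depends on. The price is that transactions are partly duplicated across shards.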
Source: Hebei Journal of Industrial Science and Technology (《河北工业科技》), CAS, 2016, No. 2, pp. 169-177 (9 pages)
Keywords: parallel processing; distributed; data mining; closed frequent itemsets; Hadoop; FP-Growth