
Mining research on FIUT algorithm based on parallel entropy in Hadoop environment

Cited by: 6
Abstract: To address the low efficiency of traditional frequent-itemset mining algorithms, a parallel algorithm for the Hadoop platform, Balanced_MapReduce_FIUT (BMR-FIUT), is proposed. Frequent itemsets are mined with the frequent items ultrametric tree (FIU-Tree) structure, which avoids the drawbacks of the traditional algorithms. The decomposition step of the FIUT algorithm is modified so that it suits parallel computation under the MapReduce framework. Parallel entropy is used as the load-balance metric for the cluster, so that the system distributes data across nodes as evenly as is reasonable. Experimental results show that BMR-FIUT effectively reduces load skew across nodes during parallelization, outperforms the existing PFP-Growth algorithm, and is suitable for mining massive data sets.
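The abstract does not give the paper's formula for parallel entropy. As a rough illustration only, the sketch below assumes a common formulation: the Shannon entropy of each node's share of the total load, normalized by log(n) so that 1.0 means perfectly even load and values near 0 mean heavy skew. The function name `parallel_entropy` and the sample load figures are hypothetical, not taken from the paper.

```python
import math
from typing import Sequence


def parallel_entropy(loads: Sequence[float]) -> float:
    """Normalized Shannon entropy of per-node load shares, in [0, 1].

    Assumed definition (not confirmed by the source):
        H = -sum(p_i * ln p_i) / ln(n),  where p_i = load_i / total load.
    """
    n = len(loads)
    total = sum(loads)
    if n < 2 or total <= 0:
        # A single node, or no load at all, is treated as trivially balanced.
        return 1.0
    entropy = 0.0
    for load in loads:
        if load > 0:  # a zero share contributes nothing (limit of p * ln p)
            share = load / total
            entropy -= share * math.log(share)
    return entropy / math.log(n)


# Hypothetical per-node record counts for a four-node cluster.
balanced = parallel_entropy([100, 100, 100, 100])
skewed = parallel_entropy([370, 10, 10, 10])
print(f"balanced={balanced:.3f} skewed={skewed:.3f}")
```

Under this assumed definition, a scheduler could compare the entropy of candidate data distributions and prefer the one closest to 1.0; whether BMR-FIUT uses the metric this way is not stated in the abstract.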
Authors: YAN Yi; XU Su (School of Information Engineering, Nanchang University, Nanchang 330031, China)
Source: Computer Engineering and Design (《计算机工程与设计》), PKU Core journal, 2019, No. 3, pp. 685-690, 787 (7 pages)
Keywords: data mining; frequent itemsets; MapReduce programming model; FIUT algorithm; parallel entropy; load balancing