摘要
针对现有的最大频繁项集挖掘算法挖掘时间过长、内存消耗较大的问题,提出了一种基于构造链表B-list的最大频繁项集挖掘算法BMFI。该算法利用B-list数据结构来挖掘频繁项集,并采用全序搜索树作为搜索空间,然后采用父等价剪枝技术来缩小搜索空间;最后再结合基于MFI-tree的投影策略实现超集检测来提高算法的效率。实验结果表明,BMFI算法在时间效率与空间效率方面均优于FPMAX与MFIN算法。该算法在稠密数据集与稀疏数据集中进行最大频繁项集挖掘时均有良好的效果。
In order to solve the problems that existing in the maximal frequent itemset mining algorithms,such as the mining time is too long and the memory consumption is too large,this paper presented a maximal frequent itemset mining algorithm BMFI which employed B-list to mining frequent itemsets and employed the whole sequence search tree as the search space.Then,it used the parent equivalence pruning technique to reduce the search space. Finally,which combined with the MFItree-based projection strategy to achieve superset detection to improve the efficiency of the algorithm. The experimental results show that the performance of BMFI algorithm is superior to FPMAX algorithm and MFIN algorithm in terms of time efficiency and spatial efficiency. The proposed algorithm has good performance when mining the maximal frequent itemset in dense data set and sparse data set.
作者
张昌
文凯
郑云俊
Zhang Chang;Wen Kai;Zheng Yunjun(Institute of Applied Communication Technology,School of Communication and Information Engineering,Chongqing University of Posts and Telecommunications,Chongqing 400065,China;Chongqing Information Technology Designing Co. Ltd. ,Chongqing 401121,China)
出处
《计算机应用研究》
CSCD
北大核心
2019年第2期351-354,共4页
Application Research of Computers
关键词
最大频繁项集挖掘
深度优先搜索
剪枝技术
超集检测
maximal frequent itemsets mining
depth-first search
pruning techniques
superset detection