A Maximal Frequent Itemsets Mining Algorithm Based on Dynamic Reordering (一种基于动态排序的最大频繁项集挖掘算法)


Abstract: The set enumeration tree (SET) is a data structure commonly used in algorithms for mining maximal frequent itemsets (MFI). In such algorithms, mining the maximal frequent itemsets can be viewed as a search over the set enumeration tree. To shrink that search space, this paper proposes a novel and efficient pruning method, the MPDR algorithm: the order of the enumeration-tree nodes is rearranged dynamically according to the maximal frequent patterns already mined, so that as much of the tree as possible is pruned. The algorithm uses a bitmap data format and a depth-first search strategy. Experimental results show that the algorithm effectively improves the efficiency of maximal frequent itemset mining and, on the same test data sets, outperforms FPMax.
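The abstract names the ingredients of the method (a set enumeration tree, depth-first search, bitmaps, and reordering guided by already-mined maximal patterns) but not the exact MPDR rules. The following Python sketch is therefore only an illustration of how these pieces can fit together, not the authors' algorithm. The function name `mine_maximal_itemsets`, the use of Python integers as bitmaps, and in particular the reordering heuristic (items contained in the best-matching known maximal itemset are visited last, so their subtrees fail the superset check quickly) are assumptions made for this example.

```python
from typing import Dict, FrozenSet, List, Sequence, Set, Tuple


def popcount(bits: int) -> int:
    """Number of set bits, i.e. the support encoded by a transaction bitmap."""
    return bin(bits).count("1")


def mine_maximal_itemsets(
    transactions: Sequence[Set[str]], min_support: int
) -> List[FrozenSet[str]]:
    """Depth-first search over a set enumeration tree (illustrative sketch)."""
    n = len(transactions)
    if n < min_support:
        return []  # not even the empty itemset reaches the threshold

    # Vertical bitmap layout: bit t of bitmaps[item] is set when
    # transaction t contains the item.
    bitmaps: Dict[str, int] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)

    found: List[FrozenSet[str]] = []  # maximal frequent itemsets so far

    def search(head: FrozenSet[str], head_bits: int, tail: List[str]) -> None:
        # Keep only the tail items that extend head to a frequent itemset.
        exts: List[Tuple[str, int]] = []
        for item in tail:
            bits = head_bits & bitmaps[item]
            if popcount(bits) >= min_support:
                exts.append((item, bits))

        # Superset pruning: if head plus every frequent extension already
        # lies inside a known maximal itemset, this subtree adds nothing.
        candidate = head | {item for item, _ in exts}
        if any(candidate <= m for m in found):
            return
        if not exts:
            found.append(head)  # uncovered and not extensible: maximal
            return

        # Assumed reordering heuristic (not taken from the paper): visit
        # items outside the best-matching known maximal itemset first.
        supersets = [m for m in found if head <= m]
        if supersets:
            best = max(supersets, key=lambda m: len(m & candidate))
            exts.sort(key=lambda pair: pair[0] in best)

        for idx, (item, bits) in enumerate(exts):
            rest = [other for other, _ in exts[idx + 1:]]
            search(head | {item}, bits, rest)

    search(frozenset(), (1 << n) - 1, sorted(bitmaps))
    return found


demo = [set("abcd"), set("abc"), set("abd"), set("bcd"), set("ac")]
result = mine_maximal_itemsets(demo, min_support=2)
print(sorted(sorted(m) for m in result))
# prints [['a', 'b', 'c'], ['a', 'b', 'd'], ['b', 'c', 'd']]
```

The order in which extensions are visited does not change the result, because the superset check keeps the search complete under any ordering; the ordering only changes how many subtrees are cut off early, which is the quantity the abstract says the method aims to reduce.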

Authors: 汪成亮 (Wang Chengliang), 罗昌银 (Luo Changyin)

Affiliations: College of Computer Science, Chongqing University; School of Electrical Engineering, Chongqing University

Source: World Sci-Tech R&D (《世界科技研究与发展》), CSCD, 2010, No. 4, pp. 440-444 (5 pages)

Keywords: data mining; maximal frequent itemsets; set enumeration tree; dynamic reordering

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]


