期刊文献+

一种基于有效修剪的最大频繁项集挖掘算法 被引量:2

Maximal frequent itemsets mining algorithm based on effective pruning mechanisms
下载PDF
导出
摘要 对关联挖掘中的最大频繁项集挖掘问题进行了研究,提出了一种基于项集格修剪机制的最大频繁项集挖掘算法.采用项集格生成树的数据结构,将最大频繁项集挖掘过程转化为对项集格生成树进行深度优先搜索获取所有最大频繁节点的过程.其中提高算法效率的一个重要措施是在遍历项集格生成树的过程中对生成树进行修剪.给出了项集格生成树的三个性质,并在此基础上提出了直接超集修剪、间接超集修剪与事务集等价修剪三种修剪机制,尽可能忽略非频繁节点及其所生成的扩展节点以减少遍历的节点数目.试验结果表明,三种修剪机制都能够有效地减少搜索空间,其中事务集等价修剪机制的效果最好,算法的性能与输入数据集的稠密程度相关. The maximal frequent itemsets mining problem was studied and an algorithm based on pruning itemset lattice effectively was proposed. The itemset lattice tree data structure was adopted to translate maximal frequent itemsets mining into the process of depth-first searching the itemset lattice tree. One of the key measures to promote performance of the algorithm is to prune the itemset lattice tree while traversing it. Three properties of itemset lattice tree were given and three pruning mechanisms, direct superset pruning, indirect superset pruning and transaction sets equivalence pruning, were proposed based on them respectively to prune the infrequent nodes and their extension nodes to reduce the number of nodes while traversing the itemset lattice tree. Test results indicate that all the three pruning mechanisms can reduce the search space effectively and the transaction sets equivalence pruning has the best effect on performance of the algorithm. Test results also indicate that performance of the algorithm is related to denseness of the datasets.
作者 陈鹏 吕卫锋
出处 《北京航空航天大学学报》 EI CAS CSCD 北大核心 2006年第2期218-223,243,共7页 Journal of Beijing University of Aeronautics and Astronautics
基金 国家重点基础研究发展规划资助项目(G1999032709) 国家自然科学基金资助项目(90104008)
关键词 数据挖掘 关联规则 关联挖掘 data mining association rule association mining lattice
  • 相关文献

参考文献14

  • 1Agrawal R,Imielinski,Swami A,et al.Mining association rules between sets of items in large databases[A].In:Peter Buneman,Sushil Jajodia,eds.Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data[C].Washington,1993.207~216 被引量:1
  • 2Agrawal R,Srikant R.Fast algorithms for mining association rules in large database[R].FJ9839,1994 被引量:1
  • 3Houtsma M,Swami A.Set-oriented mining of association rules[A].In:Philip S Yu,Arbee L P Chen.eds.Proceedings of the 11th International Conference on Data Engineering[C].Taipei,1995.25~33 被引量:1
  • 4Zaki M,Ogihara M.Theoretical foundations of association rules[A].In:3rd ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery[C].Washington,1998.7.1~7:8 被引量:1
  • 5Wille R.Restructuring lattice theory:an approach based on hierarchies of concepts[A].In:Ivan Rival,ed.Ordered Sets[C].Reidel,Dordrecht-Boston,1982.445~470 被引量:1
  • 6Agrawal R,Srikant R.Fast algorithms for mining association rules in large database[A].In:Jorge B Bocca,Matthias Jarke Carlo Zaniolo,eds.Proceedings of the 20th International Conference on Very Large Data Bases[C].Santiago,1994.487~499 被引量:1
  • 7Han J,Fu Y.Discovery of multiple-level association rules from large databases[A].In:Jorge B Bocca,Matthias Jarke,Carlo Zaniolo,eds.Proceedings of 21th International Conference on Very Large Data Bases[C].Zurich,rland,1995.39~46 被引量:1
  • 8Bayardo R J.Efficiently mining long patterns from databases[A].In:Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data[C].Seattle,Washington,1998.85~93 被引量:1
  • 9Agawal R,Aggarwal C,Prased V,et al.Depth first generation of long patterns[A].In:Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining[C].Boston,2000.108~118 被引量:1
  • 10Giannella C,Han J,Pei J,et al.Mining frequent patterns in data streams at multiple time granularities[A].In:Kargupta H,Joshi A,K Sivakumar,et al,eds.Proceedings of the NSF Workshop on Next Generation Data Mining[C],AAAI/MIT,2003 被引量:1

二级参考文献13

  • 1宋余庆 朱玉全 孙志辉 陈耿.基于FP—Tree的最大频繁项集挖掘及其更新算法.软件学报,2003,14(9):1586—1592[J].http://wwwjos.org.cn/1000-9825/14/1586.htm,:. 被引量:1
  • 2Agrawal R, Srikant R. Fast algorithms for mining association rules. In: Proc. of the 20th Int'l Conf. on VLDB. 1994. 487-499.http://www.almaden.ibm.conVcs/people/srikant/papers/vldb94.pdf. 被引量:1
  • 3Bayardo R. Efficiently mining long patterns from databases. In: Haas LM, ed. Proc. of the ACM SIGMOD Int'l Conf. on Management of Data. New York: ACM Press, 1998. 85-93. 被引量:1
  • 4Burdick D, Calimlim M, Gehrke J. Mafia: A maximal frequent itemset algorithm for transactional databases. In: Proc. of the 17th Int'l Conf. on Data Engineering. 2001. 443-452. http://www.cs.cornell.edu/boom/2001 sp/yiu/mafia-camera.pdf. 被引量:1
  • 5Gouda K, Zaki MJ. Efficiently mining maximal frequent itemsets. In: Proc. of the 1st IEEE Int'l Conf. on Data Mining. 2001.163-170. http ://www.cs .tau. ac .il/-fiat/dmsem03/E fficient%20Mining%20Maxmal%20Frequent%20Itemsets%20-%202001 .pdf. 被引量:1
  • 6Wang H, Li QH. An improved maximal frequent itemset algorithm. In: Wang GY, eds. Proc of the Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, the 9th Int'l Conf (RSFDGrC 2003). LNCS 2639, Heidelberg: Springer-Verlag, 2003. 484-490. 被引量:1
  • 7Zhou QH, Wesley C, Lu BJ. SmartMiner: A depth 1st algorithm guided by tail information for mining maximal frequent itemsets.In: Proc of the IEEE Int'l Conf on Data Mining (ICDM2002). 2002. 570-577. http://www.serviceware.com/pdffiles/datasheets/ServiceWare-Smartminer-Datasheet.pdf. 被引量:1
  • 8Grahne G, Zhu JF. High performance mining of maximal frequent itemsets. In: Proc of the 6th SIAM Int'l Workshop on High Performance Data Mining (HPDM 2003). 2003. 135-143. http://www.cs.concordia.ca/db/dbdm/hpdm03.pdf. 被引量:1
  • 9Agarwal RC, Aggarwal CC, Prasad VVV. Depth 1 st generation of long patterns. In: Proc. of the 6th ACM SIGKDD Int'l Conf on Knowledge Discovery and Data Mining. 2000. 108-118. http://www.cs.tau.ac.il/-fiat/dmsem03/Depth%20First%20Generation%20of%20Long%20Patterns%20-%202000.pdf. 被引量:1
  • 10Wang H, Xiao ZJ, Zhang H J, Jiang SY. Parallel algorithm for mining maximal frequent patterns. In: Zhou XM, ed. Advanced Parallel Processing Technologies (APPT 2003). LNCS 2834, Heidelberg: Springer-Verlag, 2003. 241-248. 被引量:1

共引文献67

同被引文献21

引证文献2

二级引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部