Research on Incremental Association Mining Algorithm Based on AP-CAN
Abstract: With the advent of the big data era, incremental association rule mining has become a popular topic in data mining. CAN-tree is an important algorithm in this area, but its ordering of items by frequency makes the tree too large and lowers the algorithm's efficiency. To address this problem, an incremental association mining algorithm based on AP-CAN is proposed. The algorithm uses the idea of AP clustering to divide the original data set into multiple clusters according to the support of the items, prunes the clusters that do not meet the minimum support, replaces the item header table with a hash header table, and sorts each transaction according to the data volume. Experimental results show that the method can significantly reduce the size of the CAN-tree, shorten item lookup time, and improve mining efficiency, and that it outperforms the existing CAN-tree algorithm in both efficiency and stability.
Authors: HONG Yan; ZHANG Lei; YAN Jiaqi (College of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan 232001, China)
Source: Journal of Anqing Normal University (Natural Science Edition), 2021, No. 2, pp. 20-25 (6 pages)
Funding: National Natural Science Foundation of China, Young Scientists Fund (61501006); Anhui Provincial Natural Science Foundation, General Program (1808085MF169); Natural Science Research Project of Anhui Universities (KJ2018A0086).
Keywords: association rules; data mining; AP clustering; CAN-tree algorithm
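
The abstract names three ingredients: pruning by minimum support, a hash header table in place of the item header table, and a fixed per-transaction ordering. The following is a minimal sketch of how those pieces might fit together, not the authors' implementation. It makes several assumptions that are not in the record: the class and function names (`Node`, `HashHeaderTree`, `build_tree`) are hypothetical, items are ordered lexicographically, the hash header table is modeled as a Python dictionary, and pruning is done per item rather than per AP cluster (the AP clustering step itself is omitted).

```python
from collections import defaultdict


class Node:
    """One node of the prefix tree: an item, a count, and child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> Node


class HashHeaderTree:
    """Prefix tree whose header is a hash map (hypothetical stand-in)."""

    def __init__(self):
        self.root = Node(None)
        # Hash header table: item -> list of tree nodes carrying that item.
        self.header = defaultdict(list)

    def insert(self, transaction):
        """Insert one transaction using a fixed (lexicographic) item order."""
        node = self.root
        for item in sorted(set(transaction)):
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                self.header[item].append(child)
            child.count += 1
            node = child

    def support(self, item):
        """Sum the counts of all nodes for an item via one hash lookup."""
        return sum(n.count for n in self.header.get(item, ()))

    def size(self):
        """Number of nodes in the tree, excluding the root."""
        return sum(len(nodes) for nodes in self.header.values())


def build_tree(transactions, min_support):
    """Count item supports, drop infrequent items, then build the tree.

    Assumption: item-level pruning stands in for the cluster-level
    pruning described in the abstract.
    """
    counts = defaultdict(int)
    for t in transactions:
        for item in set(t):
            counts[item] += 1
    keep = {i for i, c in counts.items() if c >= min_support}

    tree = HashHeaderTree()
    for t in transactions:
        kept = [i for i in t if i in keep]
        if kept:
            tree.insert(kept)
    return tree
```

Because the item order is fixed rather than frequency-dependent, a later batch can be added with further `insert` calls without rebuilding the tree, for example `tree.insert(["c", "a"])` after `tree = build_tree(transactions, 2)`.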