Closed Frequent Itemset Mining Algorithm Based on ESCS Pruning Strategy
Abstract Existing closed frequent itemset mining algorithms rely on a narrow range of pruning strategies: most prune only 1-itemsets, and strategies for 2-itemsets and n-itemsets (n≥3) are scarce. Because an effective pruning strategy can detect and discard a large number of unpromising itemsets early, improving pruning can substantially improve the efficiency of these algorithms. Building on the ESCS (Estimated Support Co-occurrence Structure), this paper proposes an ESCS pruning strategy for 2-itemsets and applies it to the classical closed frequent itemset mining algorithm DCI_Closed (Direct Count Intersect Closed), yielding the DCI_ESCS (Direct Count Intersect Estimated Support Co-occurrence Structure) algorithm, which is also used to verify the effect of the pruning strategy. The running time of the algorithm before and after the improvement is compared on multiple public datasets under different minimum support thresholds. The experimental results show that DCI_ESCS performs well on denser datasets with longer transactions and itemsets, where its time efficiency improves to some degree.
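The abstract does not spell out how the ESCS is built or queried, so the following is only a minimal sketch of the general idea behind pruning on 2-itemsets, not the paper's DCI_ESCS implementation. It assumes the structure stores, for each unordered pair of items, the number of transactions containing both. Under that assumption, an itemset cannot be frequent if any pair inside it falls below the minimum support, so a candidate extension can be rejected before its full support is counted. The function names `build_pair_supports` and `can_extend` and the toy transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_pair_supports(transactions):
    """Count, for every unordered item pair, how many transactions contain both items.

    Assumption: this pair-count table stands in for the co-occurrence structure.
    """
    pair_support = Counter()
    for transaction in transactions:
        for a, b in combinations(sorted(set(transaction)), 2):
            pair_support[(a, b)] += 1
    return pair_support


def can_extend(itemset, new_item, pair_support, min_support):
    """Return False when some pair (x, new_item) is infrequent.

    Support is anti-monotone, so an infrequent pair rules out every
    superset that contains it; the extension can be skipped.
    """
    for x in itemset:
        key = (x, new_item) if x < new_item else (new_item, x)
        if pair_support.get(key, 0) < min_support:
            return False
    return True


# Hypothetical toy data for illustration only.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "d"},
]
pairs = build_pair_supports(transactions)

print(can_extend({"a"}, "b", pairs, min_support=2))       # pair (a, b) occurs twice
print(can_extend({"a", "b"}, "c", pairs, min_support=2))  # pair (b, c) occurs once
```

With a minimum support of 2, extending {a} with b passes the check, while extending {a, b} with c is rejected because b and c co-occur in only one transaction. A passing check is a necessary condition only; the extended itemset's real support still has to be counted.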
Authors LIU Wenjie; YANG Haijun (School of Information Engineering, Lanzhou University of Finance and Economics, Lanzhou 730020, China)
Source Journal of Jilin University (Information Science Edition), CAS, 2023, No. 2, pp. 329-337 (9 pages)
Funding Natural Science Foundation of Gansu Province (18JR3RA216, 21JR1RA283); Open Fund of the Gansu Provincial Key Laboratory of E-commerce Technology and Application (Lanzhou University of Finance and Economics) (2018GSDZSW63A14).
Keywords closed frequent itemsets; pruning strategy; data mining