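Abstract
The constrained association rule mining algorithm (ACARMT) repeatedly compares transaction identifiers when generating frequent itemsets, which is time-consuming. To further improve mining efficiency, this paper proposes an item-constrained frequent itemset mining algorithm (CFMABTB). The algorithm first filters the original database according to the constraint C, then builds a transaction binary for each item, and then computes the count of each itemset through repeated AND operations, thereby mining the frequent k-itemsets. Finally, CFMABTB and ACARMT are compared experimentally on the mushroom and chess datasets. The results show that when the data size and the number of items are not very large, the time performance of CFMABTB is far better than that of ACARMT.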
Generating frequent itemsets is the most time-consuming step in constrained association rule mining. To improve mining efficiency, this paper proposes CFMABTB, a frequent itemset mining algorithm that improves on ACARMT. CFMABTB stores the transaction binary corresponding to each item in a bit set container and calculates the support counts of itemsets through & (bitwise AND) operations. Furthermore, CFMABTB mines constrained frequent k-itemsets with a recursive method. We compared the run time of the CFMABTB, Separate, and ACARMT algorithms on the mushroom dataset. The experimental results indicate that CFMABTB can quickly generate all frequent itemsets and has better time performance than the other two algorithms under the same conditions.
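The pipeline described in the abstract (filter by the constraint, build one transaction bit vector per item, count support with AND, extend itemsets recursively) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `filter_db`, `build_bitmaps`, and `mine` are hypothetical, Python integers stand in for the bit set container, and the constraint C is assumed to mean "the transaction contains at least one item from a given set", since the abstract does not define it. The sketch only counts support within the filtered database; whether each mined itemset must itself satisfy C is not stated in the abstract and is not enforced here.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple


def filter_db(db: List[Set[str]], required: Set[str]) -> List[Set[str]]:
    """Keep transactions that share at least one item with `required`.

    Assumption: this is only one possible reading of the constraint C.
    """
    return [t for t in db if t & required]


def build_bitmaps(db: List[Set[str]]) -> Dict[str, int]:
    """Map each item to an integer whose bit i is 1 when transaction i contains it."""
    bitmaps: Dict[str, int] = {}
    for i, transaction in enumerate(db):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << i)
    return bitmaps


def popcount(bits: int) -> int:
    """Number of set bits, i.e. the support count encoded by a bit vector."""
    return bin(bits).count("1")


def mine(bitmaps: Dict[str, int], min_count: int) -> Dict[FrozenSet[str], int]:
    """Recursively extend itemsets, ANDing bit vectors to obtain support counts."""
    result: Dict[FrozenSet[str], int] = {}
    # Only frequent single items can appear in a frequent itemset.
    items = sorted(i for i, b in bitmaps.items() if popcount(b) >= min_count)

    def extend(prefix: Tuple[str, ...], prefix_bits: Optional[int], start: int) -> None:
        for idx in range(start, len(items)):
            item_bits = bitmaps[items[idx]]
            # AND with the prefix vector keeps transactions containing every item so far.
            bits = item_bits if prefix_bits is None else prefix_bits & item_bits
            count = popcount(bits)
            if count >= min_count:
                itemset = prefix + (items[idx],)
                result[frozenset(itemset)] = count
                extend(itemset, bits, idx + 1)

    extend((), None, 0)
    return result
```

For example, with transactions `{a,b,c}`, `{a,b}`, `{b,c}`, `{a,c}`, `{d}`, the assumed constraint set `{a}`, and a minimum count of 2, three transactions survive filtering, and `{a, b}` and `{a, c}` are each found with count 2 while `{a, b, c}` is pruned. The AND replaces the repeated transaction-identifier comparisons that the abstract attributes to ACARMT, which is the source of the claimed speedup.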
Authors
Chen Ping (陈平)
Wang Ligang (王利钢)
Chen Ping; Wang Ligang (School of Digital Commerce, Nanjing College of Information Technology, Nanjing 210023, China; School of Artificial Intelligence, Nanjing College of Information Technology, Nanjing 210023, China)
Source
《信息化研究》
2021, No. 5, pp. 18-22 (5 pages)
INFORMATIZATION RESEARCH
Funding
Jiangsu Province 2017 University Philosophy and Social Science Research Fund Project (No. 2017SJB0674).