Abstract
This paper presents an improved algorithm that addresses a shortcoming of the AprioriTid algorithm for association rule mining. When constructing the k-th Tid table, the algorithm considers the set formed by all elements of the candidate k-itemsets contained in the current transaction; this set is necessarily a subset of that transaction. If its cardinality (norm) is greater than k, the set itself is written to the k-th Tid table, rather than every candidate k-itemset contained in the transaction. This reduces the amount of transaction data that must be scanned when searching for the large (k+1)-itemsets in the next pass, and so further optimizes AprioriTid. Experiments on the Northwind dataset confirm that the algorithm effectively reduces both space complexity and time complexity.
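The reduction step described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`generate_candidates`, `count_and_reduce`, `improved_apriori_tid`) are hypothetical, `min_support` is assumed to be an absolute transaction count, and candidate generation is written as a filter over item combinations that yields the same sets as the usual Apriori join-and-prune step.

```python
from itertools import combinations


def generate_candidates(frequent, k):
    """Candidate k-itemsets: every k-set whose (k-1)-subsets are all frequent."""
    items = sorted({i for s in frequent for i in s})
    return {
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    }


def count_and_reduce(table, candidates, k):
    """Count support for the candidates and build the next, reduced Tid table.

    Each table entry is a set of items. For every entry, the candidates it
    contains are merged into one set; that merged set is kept only if its
    size exceeds k, since a smaller set cannot contain any (k+1)-itemset.
    """
    counts = dict.fromkeys(candidates, 0)
    next_table = []
    for entry in table:
        contained = [c for c in candidates if c <= entry]
        for c in contained:
            counts[c] += 1
        merged = frozenset().union(*contained)
        if len(merged) > k:
            next_table.append(merged)
    return counts, next_table


def improved_apriori_tid(transactions, min_support):
    """Return {itemset: support count} for all large itemsets (illustrative)."""
    table = [frozenset(t) for t in transactions]
    candidates = {frozenset([i]) for t in table for i in t}
    result = {}
    k = 1
    while candidates and table:
        counts, table = count_and_reduce(table, candidates, k)
        frequent = {c for c, n in counts.items() if n >= min_support}
        result.update({c: counts[c] for c in frequent})
        k += 1
        candidates = generate_candidates(frequent, k)
    return result


# Small made-up example (not the Northwind data used in the paper).
txns = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
found = improved_apriori_tid(txns, min_support=2)
print(found[frozenset({2, 3, 5})])  # prints 2
```

In this toy run the table shrinks from four entries to two after the second pass, because the entries whose merged set has only two items cannot support any 3-itemset; that shrinkage is the saving the abstract refers to.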
Source
《计算机工程与设计》
CSCD
PKU Core Journal (北大核心)
2009, No. 15, pp. 3581-3583 (3 pages)
Computer Engineering and Design
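Funding
Guizhou Province 2008 Provincial Informatization Special Fund Project (0830)
Guizhou Province Science and Technology Plan, Industrial Key Research Fund Project (黔科合GY字[2008]3035)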