Abstract
Mining maximum frequent itemsets is a key problem in many data mining applications. To overcome the drawbacks of early Apriori-based algorithms for mining maximum frequent itemsets, a variety of alternative approaches have been proposed. Many of these are based on the FP-tree, but few pay attention to the frequency counts stored in FP-tree nodes. After a careful analysis of the FP-tree structure, this paper proposes a new algorithm, USDMFIA (using set to discover maximum frequent itemsets algorithm), which builds on the node frequency counts in the FP-tree and on set theory. Analysis and comparison with previous methods indicate that the algorithm is effective.
Source
《云南大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals (北大核心)
2006, Issue S2, pp. 97-101 (5 pages)
Journal of Yunnan University (Natural Sciences Edition)
Funding
Yunnan Provincial Fund for Teaching and Research Leaders in Higher Education Institutions
Yunnan Province-Academy and Province-University Cooperation Project (2004YX42)