Abstract
Distributed association rule mining plays an important role in knowledge discovery. Building on earlier distributed algorithms, this paper proposes PDDM, an algorithm that uses priority weights, and combines the modified algorithm with a sampling algorithm and the idea of the knowledge grid to form the GDS algorithm. GDS alleviates the communication overload and poor scalability of earlier distributed algorithms, and it scans the database only once, which eases the I/O problem in mining large data sets. Theoretical analysis and experimental results show that the proposed algorithm is effective and feasible.
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
PKU Core (北大核心)
2006, No. 8, pp. 1544-1548 (5 pages)
Keywords
data mining
distributed
association rule
sampling algorithm
knowledge grid