Abstract
To provide a flexible, extensible computing platform for efficient data mining, this paper proposes a data mining computing framework, with corresponding algorithms, for distributed and parallel environments. The framework and its algorithms are derived from an analysis of association rule mining theory and of the strengths and weaknesses of earlier algorithms. A worked example shows that the framework reduces communication overhead between nodes while remaining scalable. The mining algorithm generates a dynamic ordered set-enumeration tree on each local node and uses it in place of the database, which saves local storage and greatly improves lookup efficiency.
Source
Microelectronics & Computer (《微电子学与计算机》)
Indexed in: CSCD, PKU Core (北大核心)
2006, No. 9, pp. 223-225 (3 pages)
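Funding
National Natural Science Foundation of China (60473083)
National High Technology Research and Development Program of China ("863" Program) (2005AA103110-2)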
Keywords
Data mining, Association rule, Itemset, Distributed and parallel architecture