Abstract
This paper presents a Pre-Clustered Sequential Pattern Mining strategy for parallel sequential pattern mining. The data sequences are first clustered, and each cluster is then distributed to a different parallel computing node. This reduces, or even eliminates, unnecessary communication overhead and thereby improves the execution efficiency of parallel sequential pattern mining on cluster-based high-performance computers.
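The strategy described in the abstract (cluster first, then place whole clusters on nodes, then mine locally) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the abstract does not name the similarity measure, the clustering method, or the mining algorithm, so Jaccard similarity, greedy "leader" clustering, and simple ordered-pair counting are used here as stand-ins. All function names are hypothetical, and the parallel nodes are simulated as plain lists rather than real cluster processes.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of the item sets of two sequences (assumed measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def leader_cluster(sequences, threshold=0.5):
    """Greedy leader clustering (assumed method): a sequence joins the first
    cluster whose leader (first member) is similar enough, else starts a new one."""
    clusters = []
    for seq in sequences:
        for cluster in clusters:
            if jaccard(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


def assign_to_nodes(clusters, n_nodes):
    """Place whole clusters on simulated nodes, largest cluster first,
    always choosing the currently least-loaded node."""
    nodes = [[] for _ in range(n_nodes)]
    for cluster in sorted(clusters, key=len, reverse=True):
        target = min(nodes, key=len)
        target.extend(cluster)
    return nodes


def mine_local(partition, min_support):
    """Stand-in for the local mining step: count length-2 subsequences
    (a occurs before b) using only the sequences held by one node."""
    counts = defaultdict(int)
    for seq in partition:
        # combinations() keeps positional order; set() counts each pair once per sequence
        for pair in set(combinations(seq, 2)):
            counts[pair] += 1
    return {p: c for p, c in counts.items() if c >= min_support}


sequences = [["a", "b", "c"], ["a", "c", "b"], ["x", "y"], ["x", "y", "z"], ["a", "b"]]
clusters = leader_cluster(sequences, threshold=0.5)
nodes = assign_to_nodes(clusters, n_nodes=2)
local_patterns = [mine_local(part, min_support=2) for part in nodes]
```

In this toy run, similar sequences end up on the same simulated node, so each node can count its candidate patterns without consulting the other. Whether the original algorithm applies a local or a global support threshold is not stated in the abstract; the per-node threshold here is an assumption.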
Source
《计算机工程与科学》
CSCD
2004, No. 10, pp. 66-68, 90 (4 pages in total)
Computer Engineering & Science
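Funding
Science and Technology Commission of Shanghai Municipality project "Data Mining and Knowledge Discovery Based on High-Performance Computing" (01JC14002)
Shanghai Municipal Education Commission "Fourth-Phase Key Discipline" project (205153)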