
Research on Sequential Pattern Discovery in a Distributed Environment (cited by: 1)
Abstract: This paper proposes DMSP (Distributed Mining of Sequential Patterns), an algorithm for mining sequential patterns in a distributed environment. It rests on three ideas: each site uses prefix projection to partition the pattern search space and shrink the database while generating local sequential patterns; each pattern prefix designates a polling site, which reduces communication overhead; and multiple threads run asynchronously at each site to increase parallelism. Experiments show that, over a LAN with massive amounts of data, DMSP is scalable and outperforms the baseline of centralizing the data and then applying the GSP algorithm by more than 65%.
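The ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`project`, `mine_local`, `polling_site`, `mine_distributed`) are hypothetical, each sequence is simplified to a list of single items, the polling site is chosen by hashing the first item of a pattern (an assumed rule), the "network" is replaced by direct access to every site's list, and the sites run one after another rather than in asynchronous threads.

```python
import math
from collections import Counter
from zlib import crc32


def project(db, item):
    """Prefix projection: keep the suffix after the first `item` in each sequence."""
    return [seq[seq.index(item) + 1:] for seq in db if item in seq]


def mine_local(db, min_count, prefix=()):
    """Grow locally frequent patterns from `prefix` by recursive projection."""
    # Count each item at most once per sequence.
    counts = Counter(item for seq in db for item in set(seq))
    found = set()
    for item, count in counts.items():
        if count >= min_count:
            pattern = prefix + (item,)
            found.add(pattern)
            found |= mine_local(project(db, item), min_count, pattern)
    return found


def contains(seq, pattern):
    """True if `pattern` is a subsequence of `seq` (order kept, gaps allowed)."""
    it = iter(seq)
    return all(item in it for item in pattern)


def polling_site(pattern, num_sites):
    """Assumed rule: hash the first item so one site owns each prefix."""
    return crc32(repr(pattern[0]).encode("utf-8")) % num_sites


def mine_distributed(sites, min_fraction):
    """Return {pattern: global support} for patterns frequent over all sites."""
    total = sum(len(db) for db in sites)
    # Step 1: every site mines its own data and routes candidates by prefix.
    inbox = [set() for _ in sites]
    for db in sites:
        local_min = max(1, math.ceil(min_fraction * len(db)))
        for pattern in mine_local(db, local_min):
            inbox[polling_site(pattern, len(sites))].add(pattern)
    # Step 2: each polling site gathers exact counts from all sites.
    # (Direct list access here stands in for a request over the network.)
    result = {}
    for candidates in inbox:
        for pattern in candidates:
            support = sum(contains(seq, pattern) for db in sites for seq in db)
            if support >= min_fraction * total:
                result[pattern] = support
    return result


sites = [[list("abc"), list("acb")], [list("ab"), list("bc")]]
frequent = mine_distributed(sites, 0.5)
```

Mining locally with the same fractional threshold is safe because a pattern that reaches the threshold globally must reach it at one site or more, so it is always proposed as a candidate by some site. Because all patterns sharing a first item map to the same polling site, each candidate is counted globally only once, which is the kind of communication saving the abstract attributes to prefix-designated polling sites.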
Source: Journal of Fudan University (Natural Science), 2004, No. 5, pp. 737-741 (5 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (grants 70171052 and 60075015)
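Keywords: sequential pattern mining; distributed environment; algorithm; multithreading; massive data; local area network; parallelism; communication overhead; projection technique; data mining; sequential pattern; distributed algorithm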
