
Mining algorithm of distributed sequential pattern (cited by: 2)
Abstract: To mine sequential patterns in a distributed environment, a Distributed Sequential Pattern Mining (DSPM) algorithm is proposed. DSPM is based on the PrefixSpan algorithm. It uses sampling-based detection to balance the task load, then decomposes the mining task and distributes the subtasks to multiple computers, where they run in parallel as multiple processes and multiple threads. Pseudo-projection is also used to reduce the cost of generating projected databases. Experimental results show that DSPM mines global sequential patterns in a distributed environment quickly and effectively.
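Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 11, pp. 2964-2966, 2974 (4 pages)
Funding: National Natural Science Foundation of China (60572112); Jiangsu Province High-Tech Major Project (BG2007028); Jiangsu Province "Six Talent Peaks" Project (07-E-025); Jiangsu Provincial Department of Education Project (06KJB120051)
Keywords: data mining; sequential pattern; distributed; pattern growth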
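The abstract names two building blocks, PrefixSpan and pseudo-projection, without giving details. The following is a minimal single-machine sketch of that combination, not the DSPM algorithm itself. It assumes each sequence is a list of single items (no itemsets), and the function name `prefixspan` and its parameters are illustrative choices, not taken from the paper. Instead of copying suffixes into a new projected database, each projection is stored as a list of `(sequence index, start offset)` pointers into the original data.

```python
from collections import defaultdict


def prefixspan(db, min_support):
    """Return (pattern, support) pairs for all frequent sequential patterns.

    Assumption: `db` is a list of sequences, each a list of single items.
    Projected databases are never materialized; they are lists of
    (seq_id, start) pointers into `db` (pseudo-projection).
    """
    results = []

    def mine(prefix, pointers):
        # For every item, collect one pointer per sequence: the position
        # just after the item's first occurrence at or beyond `start`.
        occurrences = defaultdict(list)
        for seq_id, start in pointers:
            seq = db[seq_id]
            seen = set()
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    seen.add(item)
                    occurrences[item].append((seq_id, pos + 1))

        # Each frequent item extends the prefix; recurse on its projection.
        for item in sorted(occurrences):
            projected = occurrences[item]
            if len(projected) >= min_support:
                pattern = prefix + [item]
                results.append((pattern, len(projected)))
                mine(pattern, projected)

    # The initial "projection" is every whole sequence.
    mine([], [(i, 0) for i in range(len(db))])
    return results


if __name__ == "__main__":
    sequences = [["a", "b", "c"], ["a", "c"], ["b", "c"]]
    for pattern, support in prefixspan(sequences, min_support=2):
        print(pattern, support)
```

In this sketch the recursion under each frequent first item is independent of the others, so those subtrees could in principle be mined as separate subtasks. How DSPM actually partitions the work and uses sampling to balance it is not specified in the abstract.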











