Sequential Pattern Mining Algorithms Based on Data Streams

Mining Frequent Sequence from Data Stream
Abstract: Sequential pattern mining has been studied in depth, but extending it to data streams remains a challenge. This paper presents two algorithms, MFSDS-1 and MFSDS-2, for mining sequential patterns from data streams. Both store elected sequences and tune the granularity of the stored information by adjusting the elected rate. MFSDS-2 introduces a level-based (hierarchical) storage structure that not only preserves sequence information better but also makes it possible to detect abnormal sequence patterns in current activity by comparing them with the global sequence patterns. Experimental results show that MFSDS-2, with its hierarchical storage, is more efficient than MFSDS-1.
Source: Journal of Jiangnan University (Natural Science Edition), CAS, 2007, No. 6, pp. 763-768 (6 pages)
Funding: Natural Science Foundation of Jiangsu Province (BK2005135)
Keywords: sequence pattern mining; frequent sequence; data stream