An Efficient Incremental Updating Algorithm for Discovering Sequential Patterns in Large Database

Cited by: 10
Abstract  An incremental updating algorithm for sequential patterns, called FIMS (fast incremental mining of sequential patterns), is proposed to maintain previously discovered sequential patterns when the database is updated. The main idea is to reuse the results of the earlier mining run to cut the cost of finding new sequential patterns in the updated database. First, the whole database, which consists of the original database and the incremental database, is scanned twice and a projected database is constructed from it. Next, the projected database is mined to obtain all new candidate sequential patterns. Finally, the whole database is scanned once more to determine all new sequential patterns. Because FIMS scans the whole database only three times in total and the projected database is much smaller than the whole database, both the number of database scans and the number of candidate sequences generated are greatly reduced, which improves mining efficiency. Experiments show that FIMS outperforms the GSP algorithm by a factor of 4 to 7 when the amount of updated data is only a small portion of the whole database.
Source  Journal of Nanjing University (Natural Science), 2003, No. 2, pp. 165-171 (7 pages). Indexed in: CAS, CSCD, PKU Core.
Funding  National Natural Science Foundation of China (70171052, 60075015)
Keywords  database, incremental updating algorithm, data mining, sequential pattern, number of scans, candidate sequence
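
The abstract mentions a projected database, candidate sequences, and a final verification scan, but this record does not give the details of FIMS. The following is therefore only a minimal sketch of the general ideas, not the authors' algorithm. It assumes a simplified model in which each sequence is a list of single items (no itemsets), and it uses a PrefixSpan-style prefix projection as a stand-in. All function names and the toy data are hypothetical.

```python
from collections import Counter


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)


def support(pattern, database):
    """Number of sequences in `database` that contain `pattern` (one full scan)."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_items(database, min_support):
    """Items that occur in at least `min_support` sequences."""
    counts = Counter()
    for seq in database:
        counts.update(set(seq))  # count each item once per sequence
    return {item: n for item, n in counts.items() if n >= min_support}


def project(database, item):
    """Prefix projection (assumed here): the non-empty suffix that follows the
    first occurrence of `item` in every sequence containing it."""
    projected = []
    for seq in database:
        if item in seq:
            suffix = seq[seq.index(item) + 1:]
            if suffix:
                projected.append(suffix)
    return projected


# Toy data (hypothetical): the updated database is the original plus an increment.
original_db = [["a", "b", "c"], ["a", "c"], ["b", "c"]]
increment_db = [["a", "b"], ["a", "b", "c"]]
whole_db = original_db + increment_db
min_support = 3

# Mine the smaller projected database to propose candidate 2-sequences ...
projected_a = project(whole_db, "a")
candidates = [["a", item] for item in sorted(frequent_items(projected_a, min_support))]

# ... then confirm the candidates with a scan of the whole database.
confirmed = [c for c in candidates if support(c, whole_db) >= min_support]
print(confirmed)
```

The point of the sketch is the division of labor: candidates are proposed from the projected database, which is smaller than the whole database, and the whole database is scanned only to confirm them.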

