
Self-Adaptive Fractal Technique on Detecting Cluster Evolution
Abstract: Data streams evolve over time in bursty, random ways, and their trends can change significantly. In such cases it is useful to understand, visualize and diagnose how these trends evolve. Tracking the changes adaptively and in real time is an important problem in data stream mining, and when streams are fast and continuous the trends must be analyzed and predicted quickly in an online fashion. Leaving users to detect such changes purely by trial and error is infeasible in practice and defeats the purpose of tracking cluster evolution in a data stream. To track cluster changes automatically under realistic limits on computing resources and requirements on processing speed, this paper combines fractal clustering, self-adaptive sampling, and the Chernoff inequality into an algorithm that tracks cluster evolution in data streams in real time. Experiments on synthetic and real data sets illustrate the effectiveness and efficiency of the approach.
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, 2010, No. 7, pp. 662-672 (11 pages)
Funding: Program for New Century Excellent Talents in University (No. NCET-10-0017); Lanzhou Science and Technology Plan Project (No. 2008-1-28)
Keywords: data mining; cluster evolution; fractal; self-adaptive sampling
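The abstract names three ingredients (fractal clustering, adaptive sampling, and the Chernoff inequality) without giving details. The sketch below is not the paper's algorithm; it only illustrates, under stated assumptions, how such pieces are commonly used. The function names, the box-counting estimator, the additive Chernoff-Hoeffding form of the bound, and the `threshold` value are all hypothetical choices made for illustration.

```python
import math


def chernoff_sample_size(eps, delta):
    """Sample size n such that an empirical frequency is within eps of the
    true probability with confidence 1 - delta.

    Uses the additive Chernoff-Hoeffding bound
    P(|p_hat - p| >= eps) <= 2 * exp(-2 * n * eps**2),
    an assumed form; the abstract does not say which variant the paper uses.
    """
    if not (0 < eps < 1 and 0 < delta < 1):
        raise ValueError("eps and delta must lie in (0, 1)")
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))


def box_counting_dimension(points, levels=(1, 2, 3, 4, 5)):
    """Estimate the box-counting (fractal) dimension of points in [0, 1)^d.

    For each level k the unit cube is split into 2**k cells per axis, the
    occupied cells are counted, and the dimension is the least-squares slope
    of log(count) against log(2**k).
    """
    points = list(points)
    if not points:
        raise ValueError("need at least one point")
    xs, ys = [], []
    for k in levels:
        cells_per_axis = 2 ** k
        occupied = {tuple(int(c * cells_per_axis) for c in p) for p in points}
        xs.append(k * math.log(2))
        ys.append(math.log(len(occupied)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def dimension_shift(old_window, new_window, threshold=0.2):
    """Flag a possible cluster change when the estimated fractal dimension
    of two stream windows differs by more than threshold (hypothetical)."""
    old_dim = box_counting_dimension(old_window)
    new_dim = box_counting_dimension(new_window)
    return abs(new_dim - old_dim) > threshold
```

As a usage example, `chernoff_sample_size(0.05, 0.01)` returns 1060, so about a thousand sampled points would suffice for a 5% error at 99% confidence under this bound. Points lying on a diagonal line give an estimated dimension close to 1 and a filled square gives close to 2, so `dimension_shift` would flag a stream whose window changes from the former to the latter.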

