Incremental clustering based on Gaussian mixture model (cited by: 2)
Abstract To handle dynamically growing data sets effectively and obtain clustering results for incremental data, this paper uses a Gaussian mixture model to fit the data distribution. The mean and prior probability of each Gaussian component fitted to the original samples are treated as representative-point information for those samples, and their posterior probabilities are relaxed to two states, 0 and 1. Based on the resulting new iteration formulas for the density parameters, an incremental EM algorithm is designed from the standard EM algorithm. This avoids recomputing the posterior probabilities of the original samples while estimating the mixture density parameters, so the clustering results for the incremental data are obtained efficiently and the density parameters of the Gaussian mixture model are updated at the same time. Experimental results show that the incremental EM algorithm handles large-scale incremental data sets efficiently and achieves high clustering accuracy.
Source Journal of Jiangsu University of Science and Technology: Natural Science Edition (indexed in CAS and the Peking University Core Journals list), 2011, No. 6, pp. 597-601 (5 pages)
Funding Soft Science Project of the General Administration of Civil Aviation of China (MHRD201007)
Keywords incremental clustering; EM algorithm; incremental EM algorithm
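
The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's algorithm: the paper's iteration formulas are not reproduced on this page, so everything below is an assumption made for illustration. In particular, the function names `log_gaussian` and `incremental_em`, the `reg` regularization term, and the choice to keep each component's covariance alongside its mean and prior are all hypothetical. The sketch treats the old samples as hard-assigned (posterior 0 or 1) to their components, so their contribution collapses into fixed per-component statistics, and only the new samples go through the E-step.

```python
import numpy as np


def log_gaussian(X, mean, cov):
    """Log density of N(mean, cov) evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    chol = np.linalg.cholesky(cov)
    sol = np.linalg.solve(chol, diff.T)            # shape (d, n)
    maha = np.sum(sol ** 2, axis=0)                # squared Mahalanobis distance
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)


def incremental_em(X_new, n_old, weights, means, covs, n_iter=50, reg=1e-6):
    """Update GMM parameters with new data only (illustrative sketch).

    Assumption: the n_old original samples are summarized by the current
    priors, means and covariances, with each old sample counted fully in
    one component (posterior relaxed to 0 or 1). n_iter must be >= 1.
    """
    K, d = means.shape
    # Fixed statistics contributed by the old data; never recomputed.
    N_old_k = n_old * weights
    S1_old = N_old_k[:, None] * means
    S2_old = N_old_k[:, None, None] * (
        covs + means[:, :, None] * means[:, None, :]
    )
    weights, means, covs = weights.copy(), means.copy(), covs.copy()
    n_total = n_old + X_new.shape[0]

    for _ in range(n_iter):
        # E-step: posteriors are computed for the new samples only.
        log_r = np.stack(
            [np.log(weights[k]) + log_gaussian(X_new, means[k], covs[k])
             for k in range(K)],
            axis=1,
        )
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: combine the fixed old statistics with the new ones.
        N_k = N_old_k + resp.sum(axis=0)
        S1 = S1_old + resp.T @ X_new
        S2 = S2_old + np.einsum("nk,ni,nj->kij", resp, X_new, X_new)
        weights = N_k / n_total
        means = S1 / N_k[:, None]
        covs = (S2 / N_k[:, None, None]
                - means[:, :, None] * means[:, None, :]
                + reg * np.eye(d))

    labels = resp.argmax(axis=1)  # cluster labels from the last E-step
    return weights, means, covs, labels
```

Because the old-data statistics are computed once outside the loop, the cost of each iteration depends on the number of new samples rather than on the size of the full data set, which is the saving the abstract describes.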