
Clustering Ensemble Based on Density Peaks (基于密度峰值的聚类集成)
Cited by: 15
Abstract: Clustering ensemble aims to improve the accuracy, stability and robustness of clustering results; a better result can be obtained by integrating multiple base clustering results. This paper proposes a clustering ensemble model based on density peaks, with three main contributions: 1) after studying the existing clustering ensemble algorithms and models, it is observed that base clustering results can be expressed in terms of density; 2) rapid computation of the maximal information coefficient (RapidMic) is introduced to represent the correlation between base clustering results, and this correlation is used to measure the density relationships among the original data points after base clustering; 3) the density peaks (DP) algorithm is improved for clustering ensemble. Furthermore, some standard datasets are used to evaluate the proposed model. Experimental results show that the proposed model is effective and greatly outperforms some classical clustering ensemble models.
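The three-step pipeline described in the abstract (base clusterings → pairwise correlation → density-peaks clustering) can be illustrated with a short sketch. The code below is not the authors' implementation: it substitutes a plain co-association matrix (the fraction of base clusterings that place a pair of samples in the same cluster) for the paper's RapidMic-based correlation, and applies a minimal density-peaks step with a Gaussian-kernel local density. The function names `co_association` and `density_peaks`, the cutoff `dc`, and the toy data are illustrative assumptions, not the published algorithm.

```python
import numpy as np

def co_association(base_labels):
    # Pairwise similarity between samples: the fraction of base clusterings
    # that place samples i and j in the same cluster.  This co-association
    # matrix is a simple stand-in for the paper's RapidMic-based correlation.
    base_labels = np.asarray(base_labels)              # shape (M, n)
    m, n = base_labels.shape
    s = np.zeros((n, n))
    for labels in base_labels:
        s += (labels[:, None] == labels[None, :])
    return s / m

def density_peaks(dist, dc, n_clusters):
    # Minimal density-peaks clustering on a precomputed distance matrix.
    n = dist.shape[0]
    # Local density rho_i: Gaussian kernel with cutoff dc (self term removed).
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)                           # indices by decreasing density
    # delta_i: distance to the nearest point with higher density.
    delta = np.zeros(n)
    nearest_higher = np.zeros(n, dtype=int)
    delta[order[0]] = dist[order[0]].max()
    nearest_higher[order[0]] = order[0]
    for k in range(1, n):
        i = order[k]
        higher = order[:k]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j
    # Cluster centers: points with the largest gamma = rho * delta.
    centers = np.argsort(-(rho * delta))[:n_clusters]
    if order[0] not in centers:                        # guard against exact ties
        centers[-1] = order[0]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Remaining points inherit the label of their nearest higher-density neighbour.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels

# Toy usage: three hypothetical base clusterings of six samples.
base = [[0, 0, 1, 1, 2, 2],
        [0, 0, 0, 1, 1, 1],
        [1, 1, 2, 2, 0, 0]]
dist = 1.0 - co_association(base)                      # co-association distance
print(density_peaks(dist, dc=0.5, n_clusters=3))       # groups samples as {0,1}, {2,3}, {4,5}
```

A faithful reproduction would replace `co_association` with the RapidMic correlation and incorporate the authors' modifications to the DP algorithm; the sketch only shows the overall structure of the ensemble step.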
Source: Acta Automatica Sinica (自动化学报), 2016, No. 9, pp. 1401-1412 (12 pages). Indexed in EI, CSCD, and the Peking University Core Journal list.
Funding: National Key Technology R&D Program of China (2015BAH19F02); National Natural Science Foundation of China (61262058, 61572407); Online Education Research Fund of the MOE Online Education Research Center (Quantong Education) (2016YB158); Fundamental Research Funds for the Central Universities, Southwest Jiaotong University (A0920502051515-12).
Keywords: clustering ensemble, affinity propagation, density peaks, similarity matrix