Sample pairwise weighting co-association matrix based ensemble clustering algorithm (cited by: 3)
Abstract: Clustering ensemble aims to improve the stability, robustness, and accuracy of clustering by combining multiple clustering results, and it has attracted growing attention in recent years. Most existing ensemble methods treat all base clusterings equally, regardless of their importance. Some work has addressed this, for example by using weighting strategies to improve the co-association matrix, but both the weighting of base clusterings and the evaluation of cluster importance ignore the fact that samples contribute differently to the cluster they belong to. This paper therefore proposes an ensemble clustering algorithm based on a sample-pair weighted co-association matrix. The algorithm first uses k-means to generate multiple base clusterings, then applies k-means again within each cluster to split it into several small clusters. The importance of a sample pair is evaluated by how much the uncertainty of the cluster changes when the small cluster containing the pair is removed. The final clustering is obtained with a hierarchical clustering algorithm. Experiments on six UCI data sets show that the proposed algorithm outperforms three classical co-association-matrix-based clustering ensemble algorithms.
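For orientation, the following is a minimal sketch of the classical, unweighted co-association pipeline that the abstract builds on, not the authors' sample-pair weighting, whose exact formula is not given in this record. It assumes the base clusterings are already available as label lists (the paper generates them with k-means). The function names `co_association` and `average_linkage`, the per-clustering `weights` argument, and the toy labels are all illustrative assumptions.

```python
from itertools import combinations


def co_association(partitions, weights=None):
    """Return the n x n co-association matrix for a list of label lists.

    Entry [i][j] is the (optionally weighted) fraction of base clusterings
    that place samples i and j in the same cluster. `weights` holds one
    weight per base clustering; it is a hypothetical hook and NOT the
    per-sample-pair weighting proposed in the paper.
    """
    n = len(partitions[0])
    if weights is None:
        weights = [1.0] * len(partitions)
    total = sum(weights)
    matrix = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += w / total
    return matrix


def average_linkage(similarity, n_clusters):
    """Agglomerative clustering on a similarity matrix (average linkage)."""
    clusters = [[i] for i in range(len(similarity))]
    while len(clusters) > n_clusters:
        best, best_pair = -1.0, None
        # Find the pair of clusters with the highest mean pairwise similarity.
        for a, b in combinations(range(len(clusters)), 2):
            sims = [similarity[i][j] for i in clusters[a] for j in clusters[b]]
            score = sum(sims) / len(sims)
            if score > best:
                best, best_pair = score, (a, b)
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(similarity)
    for cluster_id, members in enumerate(clusters):
        for i in members:
            labels[i] = cluster_id
    return labels


# Three toy base clusterings of six samples (label values are arbitrary).
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
ca = co_association(base)
consensus = average_linkage(ca, n_clusters=2)
```

In a weighted variant such as the one the abstract describes, the increment added for each co-occurring pair would depend on the pair itself rather than being the same for every pair in a base clustering; the hierarchical step would stay unchanged.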
Authors: Wang Tong (王彤); Wei Wei (魏巍); Wang Feng (王锋) (School of Computer and Information Technology, Shanxi University, Taiyuan, 030006, China; Key Laboratory of Computation Intelligence and Chinese Information Processing, Ministry of Education, Shanxi University, Taiyuan, 030006, China)
Source: Journal of Nanjing University (Natural Science) (《南京大学学报(自然科学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2019, No. 4, pp. 592-600 (9 pages)
Funding: National Natural Science Foundation of China (61772323, 61303008, 61603229, 61502288); Science and Technology Innovation Project of Higher Education Institutions in Shanxi Province (2016111)
Keywords: clustering; clustering ensemble; co-association matrix; weighted strategy