
Method for determining the optimal number of clusters based on affinity propagation clustering (cited by: 23)
Abstract: In cluster analysis, determining the optimal number of clusters is key to clustering quality. Working from the geometric structure of the samples, the paper defines two quantities, the sample clustering distance and the sample clustering deviation distance, and uses them to design a new clustering validity index. On this basis, it proposes a method for determining the optimal number of clusters using the affinity propagation algorithm. Theoretical analysis and experimental results show that the proposed index and method evaluate clustering results effectively and are suitable for determining the optimal number of clusters.
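The general workflow the abstract describes (run affinity propagation, score each resulting partition with a validity index, keep the cluster count with the best score) can be sketched as follows. This is only an illustrative sketch, not the paper's method: this record does not define the proposed index, so the silhouette width (Rousseeuw, 1987) is used as a stand-in. The helper names `make_blobs` and `choose_k`, the preference sweep as multiples of the median similarity, and the damping and iteration settings are all assumptions made for the example.

```python
import math
import random
from statistics import median


def affinity_propagation(S, preference, damping=0.8, iters=150):
    """Plain affinity propagation (Frey & Dueck, 2007); returns an exemplar index per point."""
    n = len(S)
    S = [row[:] for row in S]
    for i in range(n):
        S[i][i] = preference               # shared self-similarity ("preference")
    R = [[0.0] * n for _ in range(n)]      # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]      # availabilities a(i, k)
    for _ in range(iters):
        for i in range(n):                 # responsibility update
            vals = [A[i][k] + S[i][k] for k in range(n)]
            top = max(range(n), key=vals.__getitem__)
            best = vals[top]
            second = max(v for k, v in enumerate(vals) if k != top)
            for k in range(n):
                new = S[i][k] - (second if k == top else best)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        for k in range(n):                 # availability update
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]      # sum over i != k
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return None                        # no exemplar emerged for this preference
    labels = [max(exemplars, key=lambda k: S[i][k]) for i in range(n)]
    for k in exemplars:
        labels[k] = k                      # an exemplar always represents itself
    return labels


def silhouette(points, labels):
    """Mean silhouette width; a stand-in index, NOT the index proposed in the paper."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue                       # singleton clusters contribute 0
        a = sum(math.dist(points[i], points[j]) for j in own) / (len(own) - 1)
        b = min(sum(math.dist(points[i], points[j]) for j in other) / len(other)
                for key, other in clusters.items() if key != lab)
        total += (b - a) / max(a, b)
    return total / len(labels)


def choose_k(points, factors=(0.1, 0.5, 1.0, 2.0)):
    """Hypothetical helper: sweep the preference, score each cluster count, pick the best."""
    n = len(points)
    S = [[-math.dist(p, q) ** 2 for q in points] for p in points]
    med = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    scores = {}
    for f in factors:                      # assumed sweep: multiples of the median similarity
        labels = affinity_propagation(S, f * med)
        if labels is None:
            continue
        k = len(set(labels))
        if 2 <= k < n:                     # the index needs at least two clusters
            scores[k] = max(scores.get(k, -1.0), silhouette(points, labels))
    best_k = max(scores, key=scores.get) if scores else None
    return best_k, scores


def make_blobs(seed=0, per_blob=8):
    """Hypothetical toy data: three well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in centers for _ in range(per_blob)]


if __name__ == "__main__":
    best_k, scores = choose_k(make_blobs())
    print("index value per cluster count:", scores)
    print("selected number of clusters:", best_k)
```

Affinity propagation does not take the number of clusters as an input, so the sketch varies the preference to obtain partitions with different cluster counts and then compares them by index value. Substituting a different validity index only requires replacing `silhouette`.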
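Source: Control and Decision (《控制与决策》), indexed in EI, CSCD, and the Peking University Core Journals list; 2011, No. 8, pp. 1147-1152, 1157 (7 pages)
Funding: National Natural Science Foundation of China (60703106); Fundamental Research Funds for the Central Universities (JUSRP21012)
Keywords: affinity propagation; number of clusters; clustering validity index; cluster analysis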


