Abstract
Determining the optimal number of clusters is crucial to clustering quality in cluster analysis. From the standpoint of the geometric structure of the samples, two concepts are defined, the sample clustering distance and the sample clustering deviation distance, and a new clustering validity index is designed from them. Building on this index, a method for determining the optimal number of clusters based on the affinity propagation clustering algorithm is proposed. Theoretical analysis and experimental results show that the proposed index and method evaluate clustering results effectively and are suitable for determining the optimal number of clusters.
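The general workflow described in the abstract (run affinity propagation under several settings, score each result with a validity index, keep the best-scoring number of clusters) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the formulas for the sample clustering distance or the sample clustering deviation distance, so the well-known silhouette coefficient is swapped in as a stand-in index. The function names, the toy data, and the list of preference values are all hypothetical.

```python
import math

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def affinity_propagation(points, preference, damping=0.7, iters=200):
    """Basic affinity propagation message passing; returns one label per point."""
    n = len(points)
    # Similarity = negative squared Euclidean distance; diagonal = preference.
    S = [[-sq_dist(points[i], points[k]) for k in range(n)] for i in range(n)]
    for i in range(n):
        S[i][i] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            first = max(range(n), key=vals.__getitem__)
            best = vals[first]
            second = max(v for k, v in enumerate(vals) if k != first)
            for k in range(n):
                new = S[i][k] - (second if k == first else best)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive responsibilities from i != k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [0] * n  # no exemplar emerged: treat everything as one cluster
    labels = []
    for i in range(n):
        if i in exemplars:
            labels.append(exemplars.index(i))
        else:
            labels.append(max(range(len(exemplars)),
                              key=lambda c: S[i][exemplars[c]]))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient (stand-in index, NOT the paper's index)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; rank it last
    total = 0.0
    for i, p in enumerate(points):
        mean_to = {}
        for c in clusters:
            members = [q for j, q in enumerate(points)
                       if labels[j] == c and j != i]
            mean_to[c] = (sum(math.dist(p, q) for q in members) / len(members)
                          if members else None)
        a = mean_to[labels[i]]
        if a is None:
            continue  # singleton cluster contributes 0
        b = min(v for c, v in mean_to.items() if c != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def best_clustering(points, preferences):
    """Sweep the preference, score each clustering, keep the best one."""
    best = None
    for pref in preferences:
        labels = affinity_propagation(points, pref)
        score = silhouette(points, labels)
        if best is None or score > best[0]:
            best = (score, len(set(labels)), labels)
    return best

# Hypothetical toy data: three loose groups of 2-D points.
POINTS = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (0.7, 0.8),
          (10.0, 0.1), (10.4, 0.5), (9.7, 0.9), (10.2, 1.3),
          (5.0, 9.0), (5.3, 9.6), (4.6, 9.2), (5.1, 8.5)]
PREFERENCES = [-200.0, -100.0, -50.0, -20.0, -10.0, -5.0, -2.0]

score, k, labels = best_clustering(POINTS, PREFERENCES)
print("estimated number of clusters:", k, "index value:", round(score, 3))
```

In affinity propagation the number of clusters is not set directly; it is steered by the preference value, which is why the sketch sweeps the preference rather than a cluster count. Replacing `silhouette` with an implementation of the index defined in the paper would bring the sketch closer to the proposed method.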
Source
Control and Decision
EI
CSCD
Peking University Core Journals
2011, No. 8, pp. 1147-1152, 1157 (7 pages)
Funding
National Natural Science Foundation of China (60703106)
Fundamental Research Funds for the Central Universities (JUSRP21012)
Keywords
affinity propagation
number of clusters
clustering validity index
cluster analysis