Abstract
To address the shortcomings of the affinity propagation (AP) algorithm in clustering time and accuracy on large-scale, complex datasets, an adjusted density-sensitive distance is used as the similarity measure, and a multilevel affinity propagation clustering algorithm based on density-sensitive distance (MAP-DSD) is proposed. First, a k-nearest-neighbor sparse graph is constructed from the original dataset, and AP is applied with the local length as the similarity measure to obtain a preliminary clustering. Then AP is applied repeatedly to the preliminary clustering, with the global distance as the similarity measure, until an appropriate number of clusters is obtained. Experimental results show that, on larger and structurally more complex datasets, the algorithm clearly outperforms the traditional AP algorithm in both clustering time and clustering quality.
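The two distance notions named in the abstract can be illustrated with a minimal sketch. The abstract does not give the adjusted formula, so the code assumes a commonly cited form of density-sensitive distance: the local length of a graph edge is rho**d - 1 for Euclidean edge length d and a scaling factor rho > 1, and the global distance is the shortest-path sum of local lengths over the k-nearest-neighbor graph. The function names, the value of rho, and the symmetrization of the graph are illustrative assumptions, not the paper's definitions.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph as {i: {j: euclidean_distance}}."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Sort all other points by Euclidean distance and keep the k closest.
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        for d, j in nearest:
            graph[i][j] = d
            graph[j][i] = d  # assumption: treat the graph as undirected
    return graph


def local_length(d, rho=2.0):
    """Assumed local length of an edge: rho**d - 1, which penalizes long edges."""
    return rho ** d - 1.0


def global_distances(graph, source, rho=2.0):
    """Dijkstra over local lengths: assumed global distance from source to each reachable node."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if cost > best.get(u, math.inf):
            continue  # stale heap entry
        for v, d in graph[u].items():
            new_cost = cost + local_length(d, rho)
            if new_cost < best.get(v, math.inf):
                best[v] = new_cost
                heapq.heappush(heap, (new_cost, v))
    return best


if __name__ == "__main__":
    # Four evenly spaced points on a line form the chain 0-1-2-3 when k = 1.
    points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    graph = knn_graph(points, k=1)
    print(global_distances(graph, source=0)[3])  # path through the chain: 3.0
    print(local_length(3.0))                     # single direct hop: 7.0
```

In this toy case the path through intermediate points costs 3.0, whereas a single hop of the same Euclidean length would cost 7.0, so points connected through a dense region end up closer than points separated by a gap. Under these assumptions, the negated distances could serve as the similarities passed to AP at each level.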
Source
《兰州理工大学学报》
CAS
Peking University Core Journals (北大核心)
2013, No. 6, pp. 85-89 (5 pages)
Journal of Lanzhou University of Technology
Funding
National Natural Science Foundation of China (11361033)
Natural Science Foundation of Gansu Province (1212RJZA029)
Keywords
affinity propagation
density-sensitive distance
multilevel clustering
unsupervised clustering