Abstract
Traditional spectral clustering algorithms are poorly suited to multi-scale problems. To address this, the paper introduces a new similarity measure, the density-sensitive similarity measure, which enlarges the distance between data points in different high-density regions and shortens the distance between data points in the same high-density region, so that it effectively describes the actual cluster distribution of the data. The paper also introduces the concept of the eigengap and uses it to give a method for determining the number of clusters automatically. Numerical experiments verify the feasibility and effectiveness of the proposed algorithm.
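The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names, not the paper's actual algorithm. It assumes a commonly used density-sensitive form: each Euclidean edge length d is rescaled to rho**d - 1, distances are taken as shortest paths over the complete graph, and similarity is 1 / (1 + distance). The number of clusters is then estimated from the largest gap between consecutive eigenvalues of the normalized affinity matrix. The function names, the parameter `rho`, and the cap `max_k` are hypothetical.

```python
import numpy as np


def density_sensitive_distance(X, rho=2.0):
    """All-pairs shortest-path distance under the assumed edge length rho**d - 1.

    Long direct jumps are penalized exponentially, so paths that hop through
    dense regions become comparatively short.
    """
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))   # Euclidean distances
    D = rho ** d - 1.0                      # assumed density-adjusted edge lengths
    # Floyd-Warshall: allow each point k in turn as an intermediate stop.
    for k in range(len(X)):
        D = np.minimum(D, D[:, k:k + 1] + D[k:k + 1, :])
    return D


def estimate_k_by_eigengap(S, max_k=10):
    """Pick k at the largest gap between consecutive eigenvalues of D^-1/2 S D^-1/2."""
    deg = S.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    M = S * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eigvals = np.sort(np.linalg.eigvalsh(M))[::-1]   # descending order
    gaps = eigvals[:max_k] - eigvals[1:max_k + 1]
    return int(np.argmax(gaps)) + 1


def make_blobs(seed=0, per_cluster=15):
    """Three tight, well-separated 2-D clusters for illustration."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0]])
    return np.vstack([c + 0.1 * rng.standard_normal((per_cluster, 2)) for c in centers])


if __name__ == "__main__":
    X = make_blobs()
    D = density_sensitive_distance(X, rho=2.0)
    S = 1.0 / (1.0 + D)                      # assumed similarity form
    print("estimated number of clusters:", estimate_k_by_eigengap(S))
```

In this sketch `rho` controls how strongly long jumps are penalized; values closer to 1 bring the measure back toward a rescaled Euclidean distance. The all-pairs shortest-path step is cubic in the number of points, so it is only practical for small data sets.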
Source
《数学的实践与认识》
CSCD
Peking University Core Journal (北大核心)
2013, No. 20, pp. 150-156 (7 pages)
Mathematics in Practice and Theory
Funding
Natural Science Foundation of Shanxi Province (2010011002-1)
National Natural Science Foundation of China (61171179)
Keywords
spectral clustering
density-sensitive measure
eigengap