Abstract
To address the high time complexity of clustering multi-density data sets and the strong dependence of clustering results on parameter settings, a multi-density grid clustering algorithm with automatic parameter calculation (MGCP) is proposed. The method constructs a discriminant function from the density of grid units and the centroid distance between units, and uses the statistics of that discriminant function to determine the parameters, in particular the similarity threshold, automatically. Experimental results show that MGCP effectively handles clusters of arbitrary shape and differing density, achieving relatively high clustering accuracy at a relatively low time cost.
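The abstract does not give the discriminant function or the threshold statistic, so the following is only a minimal sketch of the general idea, under stated assumptions. Everything in it is hypothetical rather than taken from the paper: the names `grid_cells`, `discriminant`, `adjacent_pairs`, and `cluster`, the dissimilarity formula (relative density gap plus centroid distance normalised by the unit size), and the choice of mean plus one standard deviation as the threshold.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def grid_cells(points, cell_size):
    """Bin 2-D points into square units; return {index: (density, centroid)}."""
    buckets = defaultdict(list)
    for x, y in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        buckets[key].append((x, y))
    cells = {}
    for key, pts in buckets.items():
        cx = sum(p[0] for p in pts) / len(pts)
        cy = sum(p[1] for p in pts) / len(pts)
        cells[key] = (len(pts), (cx, cy))
    return cells


def discriminant(cell_a, cell_b, cell_size):
    """ASSUMED dissimilarity: relative density gap + normalised centroid distance."""
    (da, ca), (db, cb) = cell_a, cell_b
    density_gap = abs(da - db) / max(da, db)
    centroid_gap = math.dist(ca, cb) / cell_size
    return density_gap + centroid_gap


def adjacent_pairs(cells):
    """Yield each pair of non-empty units sharing an edge or corner, once."""
    for (i, j) in cells:
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                nb = (i + di, j + dj)
                if nb > (i, j) and nb in cells:
                    yield (i, j), nb


def cluster(points, cell_size):
    """Merge adjacent units whose discriminant is at or below a data-derived threshold."""
    cells = grid_cells(points, cell_size)
    scores = {
        pair: discriminant(cells[pair[0]], cells[pair[1]], cell_size)
        for pair in adjacent_pairs(cells)
    }
    # ASSUMED statistic: mean + one (population) standard deviation of all scores.
    values = list(scores.values())
    threshold = mean(values) + pstdev(values) if values else 0.0

    parent = {c: c for c in cells}  # union-find over grid units

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), score in scores.items():
        if score <= threshold:
            parent[find(a)] = find(b)
    return {c: find(c) for c in cells}, threshold


if __name__ == "__main__":
    pts = [(x + 0.5, y + 0.5) for x in range(3) for y in range(3)]
    pts += [(x + 10.5, y + 10.5) for x in range(3) for y in range(3)]
    labels, threshold = cluster(pts, 1.0)
    print(len(set(labels.values())), "clusters; threshold =", round(threshold, 3))
```

The only user-supplied value here is the unit size; the merge threshold is derived from the data. A single global threshold is a simplification, and the paper's treatment of differing densities may well be more refined.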
Source
Computer & Digital Engineering (《计算机与数字工程》)
2014, No. 7, pp. 1141-1145 (5 pages)
Keywords
grid clustering
adjacent unit
discriminant function
similarity threshold
parameter calculation