
Density peaks clustering algorithm based on second-order natural nearest neighbors and multi-cluster merging
Abstract: The density peaks clustering (DPC) algorithm identifies cluster centers from local density and relative distance, but it ignores how a sample's surroundings affect its density, so it struggles to find cluster centers in low-density regions. DPC's single-step allocation strategy also has poor fault tolerance: once one sample point is misassigned, a chain of subsequent sample points is misassigned with it. To address these problems, this paper proposes a density peaks clustering algorithm based on second-order natural nearest neighbors and multi-cluster merging (TNMM-DPC). First, it introduces the concept of second-order natural neighbors and redefines the local density of a sample point to account for both the point's own density and its surroundings, which reduces the influence of cluster sparsity on the selection of cluster centers. Second, it defines a core point set to select the initial micro-clusters and allocates sample points according to their degree of association with the micro-clusters. Finally, it introduces the concept of a neighbor boundary point set to merge adjacent sub-clusters into the final clustering result, avoiding the cascade effect of allocation errors. TNMM-DPC is compared with DPC and its improved variants on artificial datasets and UCI datasets. The experimental results show that TNMM-DPC resolves the problems of DPC described above and clusters both the artificial and the UCI datasets effectively.
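For orientation, the baseline DPC procedure that the abstract criticizes can be sketched as follows. This is a minimal illustration of standard DPC only, not of TNMM-DPC: it computes a local density, the relative distance to the nearest denser point, picks centers by the product of the two, and then applies the single-step allocation. The function name `dpc`, the Gaussian density kernel, the cutoff distance `d_c`, and the fixed `n_centers` are assumptions made for this sketch and do not come from the paper.

```python
import math

def dpc(points, d_c, n_centers):
    """Baseline DPC sketch (hypothetical helper): returns (labels, centers)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density with a Gaussian kernel (assumed choice), self term excluded.
    rho = [sum(math.exp(-(dist[i][j] / d_c) ** 2) for j in range(n) if j != i)
           for i in range(n)]

    # Point indices from highest to lowest density.
    order = sorted(range(n), key=lambda i: -rho[i])

    # Relative distance: distance to the nearest point earlier in the
    # density order. The densest point gets its largest distance instead.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers are the points with the largest gamma = rho * delta.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:n_centers]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c

    # Single-step allocation: each remaining point copies the label of its
    # nearest denser neighbor, so one wrong label propagates downstream.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers

# Two well-separated toy blobs (made-up data for illustration).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels, centers = dpc(points, d_c=1.0, n_centers=2)
```

The last loop makes the fault-tolerance issue concrete: every label is inherited through a chain of `parent` links, so a single wrong link mislabels everything that hangs off it.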
Authors: Zhang Zidan (张紫丹); Xu Hua (徐华); Yang Chongyang (杨重阳) (School of Artificial Intelligence & Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China)
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2023, No. 12, pp. 3559-3565 (7 pages)
Funding: National Natural Science Foundation of China, Young Scientists Fund (62106088).
Keywords: density peak; natural neighbors; local density; core point set; micro-cluster merging