Abstract
Clustering is a rapidly developing data-processing technique that is widely used across many fields. Clustering by fast search and find of density peaks (DPC) is a density-based clustering method whose efficient sample assignment makes it suitable for image segmentation. However, the clustering result of DPC depends on the choice of the cutoff distance d_c. To address this, a DPC algorithm based on information entropy is proposed that selects d_c adaptively. The amount of information conveyed by a random event is negatively correlated with its probability: the more probable the event, the less information it provides. Information entropy therefore reflects the uncertainty of events, and the d_c that minimizes the information entropy is taken as the optimal parameter of DPC. In addition, the number of clusters K is generally difficult to determine for clustering algorithms. Since cluster centers in DPC are usually local density maxima, the improved algorithm adaptively determines the selection threshold for K according to the degree of cohesion within each region of the digital image. To apply DPC efficiently to image segmentation, the improved algorithm uses block processing and cluster merging to address the high time complexity of DPC. Experimental comparisons show that the improved algorithm clusters more accurately; in image segmentation, it extracts segmentation edges more precisely and agrees more closely with the ground truth.
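The entropy-based choice of d_c can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the density kernel or the exact entropy formula, so the sketch assumes a Gaussian-kernel local density and the Shannon entropy of the normalized densities. All function names (`pairwise_distances`, `local_density`, `density_entropy`, `select_cutoff`, `delta_distance`) are hypothetical, and the adaptive threshold for K and the block-and-merge step are not shown.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for an (n, d) array of samples."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def local_density(dist, d_c):
    """Gaussian-kernel local density (an assumed kernel choice).

    The self term exp(0) = 1 is subtracted so a point does not count itself.
    """
    return np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0


def density_entropy(rho):
    """Shannon entropy of the densities normalized to a distribution (assumed form)."""
    p = rho / rho.sum()
    p = p[p > 0]  # 0 * log(0) is treated as 0
    return float(-(p * np.log(p)).sum())


def select_cutoff(dist, candidates):
    """Return the candidate d_c with the smallest entropy, plus all entropies."""
    entropies = [density_entropy(local_density(dist, d)) for d in candidates]
    best = int(np.argmin(entropies))
    return float(candidates[best]), entropies


def delta_distance(dist, rho):
    """Distance from each point to its nearest neighbor of higher density.

    Points with no denser neighbor (the global peak) get their largest distance,
    following the usual DPC convention.
    """
    delta = np.empty(len(rho))
    for i in range(len(rho)):
        higher = np.where(rho > rho[i])[0]
        delta[i] = dist[i, higher].min() if higher.size else dist[i].max()
    return delta
```

In such a sketch the candidate values of d_c might be taken as low percentiles of the nonzero pairwise distances; points with both large density and large delta are then the natural candidates for cluster centers.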
Authors
ZHANG Li-dan
WANG Jun-feng
School of Science, Xi’an University of Technology, Xi’an 710054, China
Source
Computer Technology and Development (《计算机技术与发展》)
2022, Issue 5, pp. 47-52 (6 pages)
Funding
National Natural Science Foundation of China, General Program (61976176).
Keywords
density peak clustering
image segmentation
cluster merging
block processing
adaptive cutoff distance