Abstract
Spectral clustering (SC) has received increasing attention because of its effectiveness in unsupervised learning. However, its high computational complexity makes it unsuitable for large-scale data. In recent years, many anchor-graph-based methods have been proposed to accelerate large-scale spectral clustering, but the anchor points they select often fail to reflect the information in the original data, which degrades clustering performance. To overcome these shortcomings, a fast spectral clustering algorithm based on anchor point extraction with bisecting k-means (FCAPBK) is proposed. The method uses bisecting k-means to select representative anchor points from the original data and constructs a multi-layer kernel-free similarity graph based on these anchors; it then constructs a hierarchical bipartite graph from the similarity relationships between the anchor points and the samples. Finally, experiments on five benchmark datasets show that FCAPBK achieves good clustering performance in a relatively short time.
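The two building blocks named in the abstract, anchor selection by bisecting k-means and a kernel-free sample-to-anchor similarity, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the split rule (always bisect the cluster with the largest within-cluster squared error), the neighbor-based weighting in `kernel_free_weights`, and all function and parameter names (`m`, `k`, `seed`) are assumptions made for illustration. Only a single sample-to-anchor layer is shown; the multi-layer and spectral steps are omitted.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def sse(points):
    """Within-cluster sum of squared distances to the cluster mean."""
    mu = mean(points)
    return sum(sq_dist(p, mu) for p in points)

def two_means(points, rng, iters=20):
    """Split one cluster (at least 2 points) in two with Lloyd iterations."""
    c0, c1 = rng.sample(points, 2)
    left, right = points, []
    for _ in range(iters):
        new_left = [p for p in points if sq_dist(p, c0) <= sq_dist(p, c1)]
        new_right = [p for p in points if sq_dist(p, c0) > sq_dist(p, c1)]
        if not new_left or not new_right:
            break
        left, right = new_left, new_right
        c0, c1 = mean(left), mean(right)
    if not right:  # degenerate start (e.g. duplicate points): split in half
        half = len(points) // 2
        left, right = points[:half], points[half:]
    return left, right

def bisecting_kmeans_anchors(points, m, seed=0):
    """Return up to m anchors: the means of clusters found by repeated bisection."""
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < m:
        splittable = [i for i, c in enumerate(clusters) if len(c) >= 2]
        if not splittable:
            break
        # Assumed split rule: bisect the cluster with the largest SSE.
        target = clusters.pop(max(splittable, key=lambda i: sse(clusters[i])))
        clusters.extend(two_means(target, rng))
    return [mean(c) for c in clusters]

def kernel_free_weights(x, anchors, k):
    """Similarity of sample x to its k nearest anchors (requires k < len(anchors)).

    Assumed weighting: w_j = (d_{k+1} - d_j) / (k * d_{k+1} - sum of the k
    smallest d), where d are squared distances sorted in ascending order.
    The weights are non-negative, sum to 1, and need no kernel bandwidth.
    """
    d = sorted((sq_dist(x, a), j) for j, a in enumerate(anchors))
    d_k1 = d[k][0]  # squared distance to the (k+1)-th nearest anchor
    denom = k * d_k1 - sum(dist for dist, _ in d[:k])
    row = [0.0] * len(anchors)
    for dist, j in d[:k]:
        row[j] = (d_k1 - dist) / denom if denom > 0 else 1.0 / k
    return row

if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
    anchors = bisecting_kmeans_anchors(data, m=4)
    # One row per sample: the sample-to-anchor side of a bipartite graph.
    Z = [kernel_free_weights(x, anchors, k=2) for x in data]
    print(anchors)
    print(Z[0])
```

Stacking one such row per sample gives a sparse sample-by-anchor matrix, which is the kind of bipartite structure that anchor-based spectral methods factorize instead of a full sample-by-sample similarity matrix.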
Authors
罗兴隆
贺兴时
杨新社
LUO Xinglong; HE Xingshi; YANG Xinshe (College of Science, Xi’an Polytechnic University, Xi’an 710600, China; College of Science and Technology, Middlesex University, London NW4 4BT, UK)
Source
《计算机工程与应用》
CSCD
Peking University Core Journal
2023, Issue 16, pp. 74-81 (8 pages)
Computer Engineering and Applications
Funding
National Natural Science Foundation of China (12101477)
Natural Science Basic Research Program of Shaanxi Province (2020JQ-831).
Keywords
bisecting k-means
bipartite graphs
anchor points
spectral clustering