Abstract
Spectral clustering can identify nonlinearly structured data and outperforms traditional clustering methods. Its results depend strongly on two parameters that normally have to be chosen by hand: the scale parameter σ of the Gaussian kernel used to measure similarity, and the number of clusters k. This paper replaces σ with the cosine of the angle between vectors and determines k from the jump in the eigenvalues. For nonlinear high-dimensional data, it proposes an adaptive spectral clustering algorithm that maps the data into a random feature space through an explicit construction and performs the clustering in that space. Experiments on UCI datasets show that the algorithm achieves better clustering results than the traditional algorithm.
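The two parameter-free ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the clipping of negative cosines to zero, the use of the symmetric normalized Laplacian, and the `max_k` cap are all assumptions made for illustration, and the random-feature mapping step is left out because the abstract does not specify its construction.

```python
import numpy as np

def cosine_affinity(X):
    """Affinity matrix from cosines of angles between rows of X (no sigma needed)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)      # unit-length rows, guarding zero vectors
    A = np.clip(Xn @ Xn.T, 0.0, 1.0)       # assumption: negative cosines are clipped to 0
    np.fill_diagonal(A, 0.0)               # no self-loops
    return A

def normalized_laplacian(A):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    d = np.maximum(A.sum(axis=1), 1e-12)   # degrees, guarding isolated points
    d_inv_sqrt = 1.0 / np.sqrt(d)
    return np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def estimate_k_by_eigengap(A, max_k=10):
    """Pick k at the largest jump between consecutive Laplacian eigenvalues."""
    vals = np.linalg.eigvalsh(normalized_laplacian(A))   # ascending order
    m = min(max_k, len(vals) - 1)          # hypothetical cap on candidate k
    gaps = np.diff(vals[: m + 1])          # gaps[i] = vals[i+1] - vals[i]
    return int(np.argmax(gaps)) + 1, vals
```

Given an affinity matrix from `cosine_affinity(X)`, `estimate_k_by_eigengap` returns the estimated number of clusters together with the sorted eigenvalues. A standard next step, not shown here, would be to run k-means on the row-normalized eigenvectors belonging to the k smallest eigenvalues.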
Authors
王鸿菲
杜洪波
林凯迪
姚云飞
朱立军
Wang Hongfei; Du Hongbo; Lin Kaidi; Yao Yunfei; Zhu Lijun (School of Science, Shenyang University of Technology, Shenyang 110870, Liaoning, China; School of Computer Science and Technology, Tianjin University, Tianjin 300050, China; School of Information and Computing Science, Northern University for Nationalities, Yinchuan 750021, Ningxia, China)
Source
《计算机应用与软件》
Peking University (PKU) Core Journal
2021, Issue 9, pp. 268-272, 292 (6 pages)
Computer Applications and Software
Funding
National Natural Science Foundation of China (61362033).
Keywords
Spectral clustering
Nonlinear high-dimensional
Adaptive
Random feature space