Abstract
To speed up K-means and find the optimal clustering subspace, the data are projected with a specific transformation matrix that divides the feature space into a clustering space and a noise space. The former contains all of the spatial structure information; the latter contains none. The noise space is discarded, and every K-means iteration is carried out in the clustering space. Unlike PCA K-means, which reduces the dimension first and then clusters, the algorithm selects dimensions during the iterations and feeds the retained dimensions back to the next iteration. The dimensionality of the clustering space is discovered automatically, so no additional parameter is introduced. Experiments show that the AC K-means algorithm greatly improves both accuracy and computation time over existing algorithms of the same type.
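The iteration described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the transformation matrix, so the sketch below assumes one possible choice: the eigenvectors of the within-cluster scatter minus the total scatter, with the clustering space taken to be the directions whose eigenvalues are clearly negative. The function name `subspace_kmeans`, its parameters, and the tolerance are hypothetical and are not taken from the paper; this is not claimed to be the AC K-means algorithm itself.

```python
import numpy as np

def subspace_kmeans(X, k, n_iter=20, seed=0):
    """Hypothetical sketch: K-means that re-estimates its clustering subspace each iteration."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)                          # centre the data
    n, d = X.shape
    V = np.linalg.qr(rng.normal(size=(d, d)))[0]    # random orthonormal starting basis
    m = max(1, d // 2)                              # initial guess, re-estimated below
    centers = X[rng.choice(n, size=k, replace=False)]
    S_total = X.T @ X                               # total scatter matrix
    for _ in range(n_iter):
        P = V[:, :m]                                # basis of the current clustering space
        # Assignment step: distances use only the m retained dimensions.
        diff = (X @ P)[:, None, :] - (centers @ P)[None, :, :]
        labels = (diff ** 2).sum(axis=2).argmin(axis=1)
        # Update step: centres and within-cluster scatter in the full space.
        S_within = np.zeros((d, d))
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue                            # keep the old centre for an empty cluster
            centers[j] = members.mean(axis=0)
            dev = members - centers[j]
            S_within += dev.T @ dev
        # Assumed transformation: eigenvectors of (within - total) scatter,
        # returned by eigh in ascending eigenvalue order.
        eigvals, V = np.linalg.eigh(S_within - S_total)
        tol = 1e-8 * np.abs(eigvals).max()          # assumed numerical tolerance
        # Directions with clearly negative eigenvalues form the clustering space;
        # the rest are treated as noise and dropped in the next iteration.
        m = max(1, int((eigvals < -tol).sum()))
    return labels, V[:, :m], m
```

Under this assumption the retained dimension `m` is re-derived in every iteration and passed to the next assignment step, so no target dimension has to be supplied by the user, in contrast to a PCA step performed once before clustering.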
Authors
王义武
杨余旺
WANG Yiwu; YANG Yuwang (College of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China)
Source
《计算机工程与应用》
CSCD
Peking University Core Journals (北大核心)
2020, No. 7, pp. 200-204 (5 pages)
Computer Engineering and Applications
Funding
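National Natural Science Foundation of China (No. 61640020)
Jiangsu Province Agricultural Independent Innovation Project (No. CX(13)3054, No. CX(16)1006)
Jiangsu Province Key Research and Development Program (No. BE2016368-1)
Jiangsu Province Science and Technology Key and General Project (No. SBE2018310371)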
Keywords
K-means algorithm
spatial projection
optimal subspace
acceleration
dimensionality reduction