Abstract
Traditional clustering algorithms cannot handle the multi-view, high-dimensional data found in big data. To address this, we propose an intelligent weighted K-means clustering algorithm based on chaos particle swarm optimization. The degree of coupling between clusters is introduced into the clustering model to expand the similarity of clusters. To remove the sensitivity to initial cluster centers, chaos particle swarm optimization performs a global search for the optimal initial cluster centers, view weights, and feature weights. A precise perturbation strategy is introduced to improve the search performance of chaos particle swarm optimization. Experiments on two platforms, Apache Spark and Single Node, show that the proposed method achieves good clustering performance on complex data sets with many views and high dimensionality.
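The abstract gives no formulas, so the following is only a minimal sketch of the general idea of using a chaos-seeded particle swarm to search for K-means initial centers. It is not the author's algorithm: the logistic map as the chaotic generator, the plain sum-of-squared-errors objective, the inertia and acceleration constants, and the names `logistic_map`, `sse`, and `chaos_pso_centers` are all assumptions made for illustration. The view weights, feature weights, coupling term, and perturbation strategy described above are omitted.

```python
import random


def logistic_map(x0, n):
    """Return n iterates of the logistic map x <- 4x(1-x) (assumed chaotic generator)."""
    xs, x = [], x0
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
        xs.append(x)
    return xs


def sse(centers, data):
    """Sum of squared distances from each point to its nearest center."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        for p in data
    )


def chaos_pso_centers(data, k, n_particles=10, n_iter=50, seed=0):
    """Search for k initial centers with a PSO whose swarm is seeded chaotically.

    Hypothetical helper: each particle is a flat list of k*dim coordinates.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    lo = [min(p[d] for p in data) for d in range(dim)]
    hi = [max(p[d] for p in data) for d in range(dim)]
    size = k * dim

    def decode(flat):
        # Split the flat particle into k centers of length dim.
        return [flat[c * dim:(c + 1) * dim] for c in range(k)]

    # Chaotic initialization: map logistic-map values into the data bounds.
    chaos = logistic_map(0.37, n_particles * size)
    pos = [
        [lo[j % dim] + chaos[i * size + j] * (hi[j % dim] - lo[j % dim])
         for j in range(size)]
        for i in range(n_particles)
    ]
    vel = [[0.0] * size for _ in range(n_particles)]

    pbest = [p[:] for p in pos]
    pbest_f = [sse(decode(p), data) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]

    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration constants
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(size):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                # Move, then clamp the coordinate back into the data bounds.
                moved = pos[i][j] + vel[i][j]
                pos[i][j] = min(max(moved, lo[j % dim]), hi[j % dim])
            f = sse(decode(pos[i]), data)
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return decode(gbest), gbest_f
```

In a sketch like this, the returned centers would be passed to an ordinary K-means run as its starting point, for example `centers, cost = chaos_pso_centers(points, k=2)`.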
Author
Liu Hongji (刘洪基), School of Economics and Management, Chuxiong Normal University, Chuxiong 675000, Yunnan, China
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2022, Issue 4, pp. 311-319 (9 pages)
Funding
Yunnan Provincial Science and Technology Program Project (2017FH001-124).
Keywords
Big data
K-means clustering
High-dimensional multi-view data
Particle swarm optimization