Abstract
Most existing clustering validity indices are based on Euclidean distance. They work well on hyper-spherical datasets but perform poorly on non-hyper-spherical (irregular) datasets. To address this, this paper proposes a multiple-distance clustering validity index based on a multiobjective evolutionary algorithm (MoMDVI). First, two clustering objectives are designed using two different distances, and cluster representative points are used in place of cluster centers. Second, each chromosome is encoded as a set of real numbers that can be decoded into the indexes of the representative points. The two objectives are then optimized with a regularity-based estimation of distribution algorithm (RMMEDA). Within the evolutionary operator, a differential evolution operator is added to RMMEDA to speed up convergence. Comparisons with existing algorithms on test datasets with different structures show that MoMDVI can automatically detect the number of clusters not only on hyper-spherical datasets but also on non-hyper-spherical ones.
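The encoding and the two objectives described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the two distances, the decoding rule, or the objective formulas, so everything here is an assumption made for illustration: Euclidean and Manhattan distance stand in for the two distances, genes are taken to lie in [0, 1] and are scaled to point indexes, each objective is the sum of distances from every point to its nearest representative point, and the differential evolution step is the common DE/rand/1 mutation. All function names (`decode`, `objective`, `evaluate`, `de_mutation`) are hypothetical.

```python
import math


def decode(chromosome, n_points):
    """Decode real-valued genes in [0, 1] into representative-point indexes.

    Assumed rule: scale each gene to an index and drop repeated indexes.
    """
    reps = []
    for gene in chromosome:
        idx = min(int(gene * n_points), n_points - 1)
        if idx not in reps:
            reps.append(idx)
    return reps


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Stand-in for the second distance, which the abstract does not specify.
    return sum(abs(x - y) for x, y in zip(a, b))


def objective(points, reps, dist):
    """Sum of distances from each point to its nearest representative point."""
    return sum(min(dist(p, points[r]) for r in reps) for p in points)


def evaluate(chromosome, points):
    """Return the two objective values for one chromosome."""
    reps = decode(chromosome, len(points))
    return (objective(points, reps, euclidean),
            objective(points, reps, manhattan))


def de_mutation(x1, x2, x3, f=0.5):
    """DE/rand/1 mutation, with each gene clipped back into [0, 1]."""
    return [min(1.0, max(0.0, a + f * (b - c)))
            for a, b, c in zip(x1, x2, x3)]


points = [(0, 0), (1, 1), (5, 5), (6, 6)]
chromosome = [0.1, 0.6]          # decodes to representative points 0 and 2
print(decode(chromosome, len(points)))
print(evaluate(chromosome, points))
```

Using data points themselves as representatives, rather than computed centroids, means either objective can be evaluated with any pairwise distance, including distances for which a mean point is not well defined.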
Authors
刘丛
陈倩倩
陈应霞
LIU Cong; CHEN Qian-qian; CHEN Ying-xia (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; School of Computer Science and Software Engineering, East China Normal University, Shanghai 200062, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal
2019, Issue 10, pp. 2209-2214 (6 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China (61703278, 61772342)
Keywords
validity index
multiple distance clustering
multiobjective evolutionary algorithm
cluster number