Abstract
In recent years, deep clustering methods based on joint training, such as DEC (Deep Embedding Clustering) and DDC (Deep Denoising Clustering), have advanced feature-extraction-based image clustering and brought breakthroughs in clustering performance; the quality of feature extraction directly affects the subsequent clustering task. However, these methods generalize poorly: they use different network structures for different datasets, and their clustering performance still leaves considerable room for improvement compared with classification performance. To address this, the paper proposes a self-attention based self-supervised deep clustering method (SADC). First, a deep convolutional autoencoder is designed to extract features, and it is trained on noisy inputs to make the model more robust. Second, a self-attention mechanism is introduced to help the network capture information that is useful for clustering. Finally, the trained encoder is combined with the K-means algorithm to form a deep clustering model for feature representation and cluster assignment, and the network parameters are updated iteratively to improve clustering accuracy and generalization ability. The method is evaluated on 6 image datasets and compared with deep clustering algorithms including DEC and DDC. The experimental results show that SADC gives satisfactory clustering results, with performance comparable to DEC and DDC. Overall, the unified network structure maintains clustering accuracy while reducing the complexity of deep clustering algorithms.
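The three stages named in the abstract (noisy inputs, self-attention, K-means on the encoded features) can be illustrated with a small, dependency-free sketch. This is not the authors' SADC implementation: there is no convolutional autoencoder and no training loop here, the attention uses identity projections instead of learned query/key/value weights, and the function names (`add_gaussian_noise`, `self_attention`, `kmeans`) and the farthest-first initialization are assumptions chosen for illustration only.

```python
import math
import random


def add_gaussian_noise(x, sigma, rng):
    """Corrupt a vector with Gaussian noise, as a denoising setup would."""
    return [v + rng.gauss(0.0, sigma) for v in x]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with identity projections.

    `features` holds n vectors of dimension d (for example, the spatial
    positions of a feature map). Each output vector is a softmax-weighted
    average of all input vectors.
    """
    n, d = len(features), len(features[0])
    out = []
    for q in features:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in features]
        w = softmax(scores)
        out.append([sum(w[j] * features[j][i] for j in range(n))
                    for i in range(d)])
    return out


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Lloyd's K-means with deterministic farthest-first initialization
    (an assumption made here so the sketch is reproducible)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members)
                              for col in zip(*members)]
    return labels, centers


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "feature maps": 3 positions x 2 channels per sample, two groups.
    clean = [[[0.0, 0.0]] * 3] * 4 + [[[5.0, 5.0]] * 3] * 4
    embeddings = []
    for sample in clean:
        noisy = [add_gaussian_noise(pos, 0.1, rng) for pos in sample]
        attended = self_attention(noisy)
        # Mean-pool the attended positions into one embedding per sample.
        embeddings.append([sum(col) / len(attended) for col in zip(*attended)])
    labels, _ = kmeans(embeddings, k=2)
    print(labels)
```

In a full pipeline of the kind the abstract describes, the toy feature maps would instead come from a trained encoder, and the cluster assignments would feed back into further parameter updates; the sketch only shows how the pieces connect.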
Authors
韩洁
陈俊芬
李艳
湛泽聪
HAN Jie; CHEN Jun-fen; LI Yan; ZHAN Ze-cong (Hebei Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Sciences, Hebei University, Baoding, Hebei 071002, China; School of Applied Mathematics, Beijing Normal University Zhuhai, Zhuhai, Guangdong 519087, China)
Source
《计算机科学》
CSCD
PKU Core Journal (北大核心)
2022, No. 3, pp. 134-143 (10 pages)
Computer Science
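Funding
Hebei Province Funding Project for Introduced Overseas Students (C20200302)
Natural Science Foundation of Hebei Province (F2018201096)
Natural Science Foundation of Guangdong Province (2018A0303130026)
Social Science Foundation of Hebei Province (HB20TQ005).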
Keywords
Deep convolutional autoencoder
Image clustering
Feature representation
Self-attention
Computational complexity