Abstract
To handle semi-supervised clustering scenarios in which only partial pairwise constraint information is available for the target dataset, this paper proposes a clustering method built on the nonnegative matrix factorization (NMF) framework that jointly regularizes the factorization with manifold learning and pairwise constraints (NMF-JRMLPC). On the one hand, the method introduces the graph Laplacian to capture the manifold structure carried by the large number of unlabeled samples. On the other hand, it incorporates the known must-link and cannot-link pairwise constraints between samples into the objective function, which considerably improves clustering performance. In addition, a loss function based on the l2,1 norm improves the robustness of NMF-JRMLPC. Experimental results on eight real-world datasets confirm the effectiveness of the proposed method.
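The abstract names three ingredients (an l2,1-norm loss, a graph Laplacian term, and must-link/cannot-link constraints) but does not state the objective function. The sketch below shows one plausible way such terms could be combined and evaluated. It is not the paper's formulation: the function names, the k-nearest-neighbor graph, the specific penalty forms, and the weights `alpha` and `beta` are all assumptions made for illustration.

```python
import numpy as np

def l21_norm(E):
    # l2,1 norm: sum of the Euclidean norms of the columns (one column per sample).
    return float(np.sum(np.linalg.norm(E, axis=0)))

def knn_laplacian(X, k=2):
    # X is (features x samples). Build a symmetric 0/1 k-NN graph S
    # and return the unnormalized graph Laplacian L = D - S.
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-neighbors
    nearest = np.argsort(d2, axis=1)[:, :k]
    S = np.zeros((n, n))
    for i in range(n):
        S[i, nearest[i]] = 1.0
    S = np.maximum(S, S.T)                            # symmetrize
    return np.diag(S.sum(axis=1)) - S

def constraint_penalty(H, must_link, cannot_link):
    # H is (clusters x samples). Assumed penalty forms:
    # must-link pairs should have close representations,
    # cannot-link pairs should have (near-)orthogonal representations.
    ml = sum(float(np.sum((H[:, i] - H[:, j]) ** 2)) for i, j in must_link)
    cl = sum(float(H[:, i] @ H[:, j]) for i, j in cannot_link)
    return ml + cl

def joint_objective(X, W, H, L, must_link, cannot_link, alpha=1.0, beta=1.0):
    # Assumed combined objective: robust fit + manifold term + constraint term.
    fit = l21_norm(X - W @ H)
    manifold = float(np.trace(H @ L @ H.T))
    return fit + alpha * manifold + beta * constraint_penalty(H, must_link, cannot_link)
```

In a full method, `W` and `H` would be updated iteratively under nonnegativity constraints to reduce such an objective. The sketch only evaluates the objective for given factors and does not implement any update rule.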
Authors
CAO Jiawei (曹佳伟)
QIAN Pengjiang (钱鹏江)
School of Digital Media, Jiangnan University, Wuxi, Jiangsu 214122, China
Source
Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》)
CSCD
Peking University Core Journals (北大核心)
2020, Issue 7, pp. 1211-1220 (10 pages)
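Funding
National Natural Science Foundation of China, General Program, Nos. 61772241, 61702225
Natural Science Foundation of Jiangsu Province, No. BK20160187
Fundamental Research Funds for the Central Universities, No. JUSRP51614A
Qing Lan Project of Jiangsu Province
Six Talent Peaks Project of Jiangsu Province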
Keywords
clustering
nonnegative matrix factorization (NMF)
manifold regularization
pairwise constraints
semi-supervised learning