Abstract
Semi-supervised clustering ensemble methods exploit prior information to improve the accuracy, robustness and stability of the ensemble. However, when pairwise constraints are introduced in the ensemble step, most existing methods consider only the given constraints and ignore the relationship between the constrained points and their neighbors. To address this problem, a semi-supervised fuzzy clustering ensemble method based on data correlation (SFCEDC) is proposed. First, an ensemble information matrix is built from the results of semi-supervised fuzzy clustering and converted into a similarity matrix. Next, the similarity matrix is modified using the given constraints together with the relationships between the constrained points and their neighbors. Finally, a graph partitioning algorithm is applied to obtain the final clustering. Experimental results on real UCI datasets show that the proposed method effectively improves clustering quality.
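The abstract does not give the formulas behind the similarity matrix or its modification, so the following is only a minimal sketch of those two steps under stated assumptions, not the authors' SFCEDC procedure. It assumes the similarity matrix is the average fuzzy co-membership over the base clusterings, and that a constraint is propagated to the nearest neighbors of each constrained point by moving the corresponding similarity a fraction `alpha` toward the constraint's target value. The function names, `k`, `alpha` and the toy data are all hypothetical.

```python
from math import dist


def co_association(memberships):
    """Average fuzzy co-membership over all base clusterings.

    memberships: list of n-by-c membership matrices (each row sums to 1).
    Assumption: similarity of i and j is the mean of the dot products of
    their membership rows; the paper may aggregate differently.
    """
    n = len(memberships[0])
    sim = [[0.0] * n for _ in range(n)]
    for u in memberships:
        for i in range(n):
            for j in range(n):
                sim[i][j] += sum(a * b for a, b in zip(u[i], u[j]))
    m = len(memberships)
    return [[v / m for v in row] for row in sim]


def nearest_neighbors(points, i, k):
    """Indices of the k points closest to points[i] (Euclidean), excluding i."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: dist(points[i], points[j]))[:k]


def apply_constraints(sim, points, must_link, cannot_link, k=1, alpha=0.5):
    """Return a copy of sim adjusted by constraints and their neighborhoods.

    Hypothetical rule: a constrained pair is set to 1 (must-link) or 0
    (cannot-link); the similarity between a neighbor of one endpoint and
    the other endpoint is moved a fraction alpha toward the same target.
    """
    out = [row[:] for row in sim]
    groups = ((must_link, 1.0), (cannot_link, 0.0))

    def set_pair(a, b, value):
        out[a][b] = out[b][a] = value  # keep the matrix symmetric

    # Soft propagation to the neighbors of each constrained point.
    for pairs, target in groups:
        for i, j in pairs:
            for a, b in ((i, j), (j, i)):
                for nb in nearest_neighbors(points, a, k):
                    if nb != b:
                        set_pair(nb, b, out[nb][b] + alpha * (target - out[nb][b]))

    # Hard constraints are written last so propagation cannot overwrite them.
    for pairs, target in groups:
        for i, j in pairs:
            set_pair(i, j, target)
    return out


# Toy data: two tight groups of two points and two base fuzzy clusterings.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)]
u1 = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
u2 = [[0.7, 0.3], [0.6, 0.4], [0.4, 0.6], [0.3, 0.7]]

sim = co_association([u1, u2])
adjusted = apply_constraints(sim, points, must_link=[(0, 1)], cannot_link=[(1, 2)])
```

In this toy run the cannot-link pair (1, 2) also lowers the similarity between point 0 (the nearest neighbor of 1) and point 2, and between point 3 (the nearest neighbor of 2) and point 1. The adjusted matrix would then be handed to a graph partitioning routine. The abstract does not name the one the authors use; spectral clustering on a precomputed affinity matrix would be one possible stand-in.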
Source
《计算机科学》
CSCD
Peking University Core Journals (北大核心)
2015, No. 6, pp. 41-45 (5 pages)
Computer Science
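Funding
Supported by the National Natural Science Foundation of China (61170111, 61134002) and the Independent Research Project of the State Key Laboratory of Traction Power, Southwest Jiaotong University (2012TPL_T15).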
Keywords
Semi-supervised clustering ensemble
Fuzzy clustering
Pairwise constraints
Neighbor points