Abstract
The ordinary k-means algorithm has difficulty detecting the cluster structure of data sets whose local distribution is sparse and uneven and whose clusters are irregular in shape and size. To overcome this shortcoming, and guided by the idea of semi-supervised learning, a k-means clustering framework based on semi-supervised learning is proposed for a small labeled sample and a large unlabeled sample that share the same distribution in a hybrid-attribute space. Simulations show that the framework often achieves better clustering accuracy than the k-means algorithm, which indicates that the semi-supervised learning framework is effective to some extent.
Source
《广西大学学报(自然科学版)》
CAS
PKU Core Journal (北大核心)
2014, No. 5, pp. 1074-1082 (9 pages)
Journal of Guangxi University(Natural Science Edition)
Funding
National Natural Science Foundation of China (Grant No. 61103038)
Keywords
semi-supervised learning
hybrid attributes
k-means clustering
attributive measure