Abstract
Most existing clustering methods are two-way: an object either belongs to a cluster or does not, so every cluster must have a crisp boundary. However, forcing uncertain objects into a cluster degrades both the structure and the accuracy of the clustering result. Three-way clustering is an overlapping clustering that represents each cluster by a core region and a fringe region, which handles objects with uncertain membership better. This paper presents a method for converting a two-way clustering into a three-way clustering using the neighborhoods of the samples. Starting from the two-way result, each cluster is shrunk by keeping only the elements whose neighborhoods are entirely contained in the cluster, and it is stretched by adding the elements outside the cluster whose neighborhoods intersect it. The shrunk set is called the core region, and the difference between the stretched set and the core region is called the fringe region. Experiments on UCI data sets show that the method improves the structure and F1 values of the clustering results.
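The shrink-and-stretch step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a neighborhood is defined, so this sketch assumes an epsilon-ball under Euclidean distance. The function names `neighborhood` and `three_way_from_two_way`, the parameter `eps`, and the toy data are hypothetical and are not taken from the paper.

```python
import math


def neighborhood(points, i, eps):
    """Indices of all points within distance eps of point i (including i).

    Assumption: an epsilon-ball under Euclidean distance; the paper's own
    neighborhood definition may differ.
    """
    return {j for j, p in enumerate(points) if math.dist(points[i], p) <= eps}


def three_way_from_two_way(points, labels, eps):
    """Turn a two-way clustering into core and fringe regions per cluster.

    Returns {label: (core, fringe)}, where both regions are sets of indices.
    """
    n = len(points)
    nbrs = [neighborhood(points, i, eps) for i in range(n)]
    regions = {}
    for label in sorted(set(labels)):
        members = {i for i in range(n) if labels[i] == label}
        # Shrink: keep members whose neighborhood lies entirely in the cluster.
        core = {i for i in members if nbrs[i] <= members}
        # Stretch: add outside points whose neighborhood intersects the cluster.
        added = {i for i in range(n) if i not in members and nbrs[i] & members}
        expanded = members | added
        # Fringe region: the stretched set minus the core region.
        regions[label] = (core, expanded - core)
    return regions


# Hypothetical toy data: two groups on a line, with a hard two-way labelling.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (2.0, 0.0), (2.5, 0.0), (3.0, 0.0)]
labels = [0, 0, 0, 1, 1, 1]
regions = three_way_from_two_way(points, labels, eps=1.1)
for label, (core, fringe) in regions.items():
    print(label, sorted(core), sorted(fringe))
```

In this toy example, the two points nearest the gap between the groups (indices 2 and 3) end up in the fringe region of both clusters, while the remaining points form the two core regions. The two-way labels here are written by hand; in practice they would come from a method such as K-means or spectral clustering.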
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals (北大核心)
2018, Issue 1, pp. 62-66, 89 (6 pages)
Funding
National Natural Science Foundation of China (61503160, 61572242)
Natural Science Foundation of the Jiangsu Higher Education Institutions (15KJB110004)
Keywords
Three-way clustering
Neighborhood
K-means clustering
Spectral clustering