Abstract
Clustering has been an active research topic in machine learning and data mining in recent years. Because obtaining large amounts of supervised information is time-consuming and costly, current research focuses on acquiring a small amount of supervised information that nonetheless improves clustering performance significantly. In addition, many real-world problems exhibit dynamic fuzziness. This paper therefore proposes DF-DBSCAN, a dynamic fuzzy clustering algorithm combined with active learning. It introduces three kinds of constraint information (the dynamic fuzzy equivalence relation, the dynamic fuzzy trust measure, and the dynamic fuzzy likelihood measure) to guide the clustering process of DBSCAN and thereby improve clustering performance. Experimental results show that DF-DBSCAN not only addresses the description and representation of dynamic fuzzy data arising in real problems, but also clusters data efficiently. With the three constraints, the clustering performance of active DF-DBSCAN improves markedly and exceeds that of three representative methods.
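The idea of letting constraint information guide DBSCAN can be illustrated with a minimal sketch. This is not the DF-DBSCAN algorithm: the dynamic fuzzy equivalence relation and the trust and likelihood measures are not defined in this abstract, so the sketch substitutes ordinary cannot-link pairs as a stand-in constraint, and it does not model how active learning would choose which pairs to query. The function names `region_query` and `constrained_dbscan` and the `cannot_link` parameter are hypothetical.

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def constrained_dbscan(points, eps, min_pts, cannot_link=()):
    """Plain DBSCAN with cannot-link pairs as a stand-in constraint.

    Assumption: a point is not added to a cluster that already holds one
    of its cannot-link partners. This is NOT the paper's dynamic fuzzy
    constraint mechanism, only an illustration of constraint guidance.
    """
    blocked = {}
    for a, b in cannot_link:
        blocked.setdefault(a, set()).add(b)
        blocked.setdefault(b, set()).add(a)

    labels = [None] * len(points)
    members = []  # members[c] is the set of point indices in cluster c

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        c = len(members)
        members.append({i})
        labels[i] = c
        stack = list(neighbors)
        while stack:
            j = stack.pop()
            if labels[j] not in (None, NOISE):
                continue  # already assigned to a cluster
            if blocked.get(j, set()) & members[c]:
                continue  # would violate a cannot-link pair; skip for now
            labels[j] = c
            members[c].add(j)
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # only core points expand
                stack.extend(j_neighbors)
    return labels


if __name__ == "__main__":
    chain = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
    print(constrained_dbscan(chain, eps=1.1, min_pts=2))
    print(constrained_dbscan(chain, eps=1.1, min_pts=2, cannot_link=[(0, 5)]))
```

On this six-point chain, the unconstrained run places every point in one cluster, while the single cannot-link pair (0, 5) forces point 5 into a cluster of its own. This shows how even one piece of supervised information can change the density-based result.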
Source
Computer and Modernization (《计算机与现代化》)
2014, No. 5, pp. 24-27, 32 (5 pages)
Funding
Supported by the Changzhou Science and Education Town Institutions Scientific Research Fund (K2012311)
Keywords
active learning
clustering algorithm
dynamic fuzzy sets
dynamic fuzzy relation
dynamic fuzzy measure