Abstract
Classical clustering algorithms perform poorly, and may even fail, when the data are scarce or contaminated by interfering information. To address this problem, a transfer fuzzy c-means clustering algorithm with center distance maximization (CMT-FCM), which draws on historical knowledge, is proposed. The algorithm uses historical class centers to carry out transfer learning, which protects the privacy of the raw data while keeping the clustering effective. When the data contain interfering information (noise points or outliers), that information attracts the class centers to some degree, causing the centers to shift or to coincide; introducing a center distance maximization term avoids this problem. Experiments on both synthetic and real-world datasets demonstrate the effectiveness of the algorithm.
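The abstract does not state the objective function, so the following is only a minimal sketch of how the three ingredients it names could be combined: a fuzzy c-means term, a term that pulls the current centers toward the historical centers, and a term that rewards distance between centers. The function name, the parameters `lam_hist` and `lam_sep`, and the additive form are all assumptions made for illustration; they are not taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def illustrative_objective(X, V, U, V_hist, m=2.0, lam_hist=1.0, lam_sep=0.1):
    """Hypothetical CMT-FCM-style objective (an assumption, not the paper's formula).

    X      : list of n samples
    V      : list of c current class centers
    U      : c x n membership matrix, U[i][j] = membership of sample j in class i
    V_hist : list of c historical class centers (the transferred knowledge)
    """
    c, n = len(V), len(X)
    # Standard fuzzy c-means term: membership-weighted squared distances.
    fcm_term = sum(U[i][j] ** m * sq_dist(V[i], X[j])
                   for i in range(c) for j in range(n))
    # Transfer term: penalize drift away from the historical centers.
    hist_term = sum(sq_dist(V[i], V_hist[i]) for i in range(c))
    # Separation term: squared distances over distinct center pairs;
    # it is subtracted, so well-separated centers lower the objective.
    sep_term = sum(sq_dist(V[i], V[k])
                   for i in range(c) for k in range(i + 1, c))
    return fcm_term + lam_hist * hist_term - lam_sep * sep_term


X = [[0.0, 0.0], [1.0, 0.0]]
V_hist = [[0.0, 0.0], [1.0, 0.0]]

# Centers that stay apart and match the historical centers.
separated = illustrative_objective(
    X, [[0.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], V_hist)

# Centers that have collapsed onto the same point.
collapsed = illustrative_objective(
    X, [[0.5, 0.0], [0.5, 0.0]], [[0.5, 0.5], [0.5, 0.5]], V_hist)

print(separated < collapsed)
```

Under these assumptions, coincident centers receive no credit from the separation term and are also penalized by the historical term, which mirrors the abstract's claim that center distance maximization counteracts centers shifting or coinciding. Only class centers, not raw historical samples, enter the computation.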
Source
Computer Engineering and Design (《计算机工程与设计》)
PKU Core Journal
2016, No. 8, pp. 2206-2212 (7 pages)
Funding
Key Research Special Fund Project of the Natural Science Foundation of Jiangsu Province (BK2011003)
Keywords
clustering algorithm
historical knowledge
transfer learning
privacy protection
center distance maximization