Abstract
As data sets grow, improving the running efficiency of the K-modes and fuzzy K-modes clustering algorithms has become an important problem. To improve efficiency, a divide-and-conquer clustering method for high-dimensional categorical data is proposed. Rather than clustering all of the data in a single pass, the method partitions the categorical data set into several subsets, clusters the subsets concurrently, and then fuses the per-subset results to form the final clustering. Experimental results show that, in most cases, the method is significantly faster than the traditional approach.
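The partition, cluster, and fuse pipeline described in the abstract can be sketched roughly as follows. The abstract does not say how the subsets are formed or how the per-subset results are fused, so several choices here are assumptions made only for illustration: random partitioning, plain (hard) K-modes with simple matching dissimilarity, and fusion by a second K-modes pass over the collected sub-cluster modes. The function names (`k_modes`, `divide_and_conquer_k_modes`) and parameters (`n_parts`, `iters`, `seed`) are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category in each attribute column."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def nearest(row, modes):
    """Index of the mode closest to `row`."""
    return min(range(len(modes)), key=lambda i: hamming(row, modes[i]))


def k_modes(rows, k, iters=20, seed=0):
    """Plain (hard) K-modes; returns a list of at most k modes."""
    rng = random.Random(seed)
    distinct = list(dict.fromkeys(rows))  # drop duplicates, keep order
    modes = rng.sample(distinct, min(k, len(distinct)))
    for _ in range(iters):
        groups = [[] for _ in modes]
        for r in rows:
            groups[nearest(r, modes)].append(r)
        # An empty cluster keeps its previous mode.
        new_modes = [column_modes(g) if g else m for g, m in zip(groups, modes)]
        if new_modes == modes:
            break
        modes = new_modes
    return modes


def divide_and_conquer_k_modes(rows, k, n_parts=4, seed=0):
    """Partition the data, cluster each part, then fuse the local modes."""
    rows = [tuple(r) for r in rows]
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    # Assumption: random, roughly equal-sized subsets.
    parts = [p for p in (shuffled[i::n_parts] for i in range(n_parts)) if p]
    # Threads keep the sketch simple; a process pool would be needed
    # for real CPU parallelism in CPython.
    with ThreadPoolExecutor() as pool:
        local = list(pool.map(lambda p: k_modes(p, k, seed=seed), parts))
    # Assumption: fuse by clustering the local modes (unweighted).
    candidates = [m for modes in local for m in modes]
    final_modes = k_modes(candidates, k, seed=seed)
    labels = [nearest(r, final_modes) for r in rows]
    return final_modes, labels
```

A call such as `divide_and_conquer_k_modes(data, k=2, n_parts=4)` returns the fused modes and one cluster label per input row. The fusion step in this sketch treats every local mode equally; weighting each mode by the size of its sub-cluster would be one plausible refinement, but whether the paper does so is not stated in the abstract.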
Source
《微电子学与计算机》
CSCD
Peking University Core Journal (北大核心)
2011, No. 6, pp. 88-91 (4 pages)
Microelectronics & Computer
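Funding
National Natural Science Foundation of China (60970014)
Doctoral Program Foundation of Institutions of Higher Education, Ministry of Education of China (200801080006)
Key Project of Science and Technology Research, Ministry of Education of China (207018)
Open Fund of Shanxi Provincial Key Laboratory (2007031017)
Taiyuan Science and Technology Star Special Fund (09121001)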
Keywords
cluster analysis
fuzzy clustering
divide-and-conquer method
large categorical data sets
evaluation index