A novel rough K-means clustering algorithm based on the weight of density (cited by: 11)
Abstract: Rough K-means clustering algorithms select their initial cluster centers at random, and existing definitions of the sample density function have shortcomings. To address both problems, this paper defines a new sample density function based on how densely samples are packed in the region around each data object, and selects as initial cluster centers the K high-density samples that lie farthest from one another. This removes the random choice of initial centers in existing rough K-means algorithms and brings the clustering result closer to the global optimum. In addition, when cluster means are computed, each sample is weighted according to its density, which yields more reasonable centroids that are not distorted by noise points. Tests on six data sets from the UCI machine learning repository and on randomly generated synthetic data sets containing noise points show that the proposed algorithm achieves better clustering results and is highly robust to noisy data.
Source: Journal of Shandong University (Natural Science) (《山东大学学报(理学版)》), indexed in CAS, CSCD, and the PKU Core Journals list, 2010, No. 7, pp. 1-6 (6 pages)
Funding: Fundamental Research Funds for the Central Universities, Key Project (GK200901006); Natural Science Basic Research Plan of Shaanxi Province (2010JM3004)
Keywords: clustering algorithm; rough K-means; clustering center; weight; density
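
The two ideas in the abstract, density-based seeding and density-weighted means, can be illustrated with a minimal sketch. The abstract does not give the paper's density function, so everything concrete here is an assumption made for illustration: the neighbor-count density, the `radius` parameter, the mean-density cutoff for "high density", the greedy farthest-point rule, and the function names `density`, `pick_initial_centers`, and `weighted_centroid`. The lower and upper approximation steps of rough K-means are omitted.

```python
import math

def density(points, radius):
    """Assumed density: how many samples (itself included) lie within `radius`."""
    return [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]

def pick_initial_centers(points, dens, k):
    """Greedy stand-in for 'the K high-density samples farthest from each other'."""
    mean_density = sum(dens) / len(dens)
    # Hypothetical cutoff: a sample counts as high density if it reaches the mean.
    candidates = [i for i, d in enumerate(dens) if d >= mean_density]
    if k > len(candidates):
        raise ValueError("not enough high-density samples for k centers")
    chosen = [max(candidates, key=lambda i: dens[i])]  # start at the densest sample
    while len(chosen) < k:
        remaining = [i for i in candidates if i not in chosen]
        # Take the candidate whose nearest already-chosen center is farthest away.
        chosen.append(max(remaining, key=lambda i: min(
            math.dist(points[i], points[c]) for c in chosen)))
    return [points[i] for i in chosen]

def weighted_centroid(members, weights):
    """Mean of `members` in which each sample counts in proportion to its weight."""
    total = sum(weights)
    return tuple(sum(w * m[d] for m, w in zip(members, weights)) / total
                 for d in range(len(members[0])))

# Two tight groups of four samples plus one isolated noise point (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 30)]
dens = density(points, radius=1.5)
centers = pick_initial_centers(points, dens, k=2)

# Suppose the noise point was assigned to the first group's cluster.
cluster = [0, 1, 2, 3, 8]
centroid = weighted_centroid([points[i] for i in cluster],
                             [dens[i] for i in cluster])
print(centers, centroid)
```

In this toy data the noise point has the lowest density, so it is never picked as a seed and it pulls the weighted centroid far less than it would pull the plain mean of the same five samples, which is (1.4, 6.4).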