
A clustering evaluation index based on the nearest and furthest score

Cited by: 8
Abstract: Clustering is one of the most widely used methods in data analysis, and the number of clusters is often the key factor that determines a clustering algorithm's performance. At present, most clustering algorithms require the number of clusters to be specified in advance, yet in many cases it is difficult to obtain a valid number of clusters from prior knowledge of the dataset. To determine the number of clusters, this paper proposes a Nearest and Furthest Score (NFS) evaluation index based on the principles of nearest neighbor consistency and furthest neighbor difference, and on this basis proposes an Automatic Clustering NFS (ACNFS) algorithm that determines the number of clusters automatically. Experimental results demonstrate that the proposed index is effective and feasible for determining the number of clusters.
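Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), CSCD, Peking University Core Journal, 2017, No. 1, pp. 67-74 (8 pages)
Funding: National Natural Science Foundation of China, Key Program (61532005)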
Keywords: nearest neighbor consistency; furthest neighbor difference; K-means clustering algorithm; scoring mechanism; evaluation index; hierarchical clustering
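
The abstract names the two principles behind the index but does not give its formula. The following is only a rough sketch of how such a score could be computed. The scoring rule is an assumption made here for illustration, not the paper's NFS definition: a point earns one unit if its nearest neighbor shares its cluster label (nearest neighbor consistency) and one unit if its furthest neighbor carries a different label (furthest neighbor difference). The names `nearest_furthest_score` and `pick_best` are hypothetical.

```python
import math


def nearest_furthest_score(points, labels):
    """Mean per-point score in [0, 2] under an ASSUMED rule (not the paper's formula):
    +1 if the nearest neighbor has the same label,
    +1 if the furthest neighbor has a different label."""
    n = len(points)
    total = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Nearest and furthest neighbors of point i by Euclidean distance.
        nearest = min(others, key=lambda j: math.dist(points[i], points[j]))
        furthest = max(others, key=lambda j: math.dist(points[i], points[j]))
        total += (labels[nearest] == labels[i]) + (labels[furthest] != labels[i])
    return total / n


def pick_best(points, candidates):
    """Return the name of the candidate labeling with the highest score."""
    return max(candidates, key=lambda name: nearest_furthest_score(points, candidates[name]))


# Two well-separated groups of three points each (toy data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
candidates = {
    "k=1": [0, 0, 0, 0, 0, 0],  # everything merged into one cluster
    "k=2": [0, 0, 0, 1, 1, 1],  # one cluster per group
}

for name, labels in candidates.items():
    print(name, nearest_furthest_score(points, labels))
# k=1 1.0
# k=2 2.0
print(pick_best(points, candidates))  # k=2
```

In this toy case the merged labeling loses the furthest-neighbor unit for every point, so the two-cluster labeling scores higher. How the published index weights or normalizes the two criteria, and how ACNFS generates candidate partitions, is not stated in the abstract.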
