
Improving the clustering performance of the K-means method by weighting the Euclidean distance (cited by: 1)

A developed K-means method based on weighted Euclidean distance
Abstract Objective: K-means tends to produce spherical clusters of roughly equal size. This paper proposes a modified K-means method that clusters more accurately when the clusters have unequal variances. Methods: The squared Euclidean distance is weighted by the reciprocal of the adjusted cluster variance, so that a "relative distance" replaces the "absolute distance" when measuring how close an observation is to a cluster. Equivalently, the relative distance is the ratio of the squared Euclidean distance to the adjusted variance of the cluster. Results: When clustering two classes with unequal variances, the modified K-means method achieved a higher accuracy, measured against the true class labels, than the traditional K-means method. Conclusion: When the variances of the two classes differ greatly, the modified K-means method outperforms the traditional K-means method.
Source: Chinese Journal of Hospital Statistics (《中国医院统计》), 2008, No. 1, pp. 9-12 (4 pages)
Funding: Guangdong Provincial Science and Technology Program project (2004B33701010)
Keywords: Cluster analysis; Euclidean distance; Weighting
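The relative distance described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the abstract does not say how the variance is "adjusted", so the code below assumes the cluster variance is the mean squared distance of members to their center, floored at a small constant to avoid division by zero. The function names, the `floor` parameter, and the choice to start with unit variances (so the first pass behaves like ordinary K-means) are all hypothetical.

```python
import numpy as np

def relative_distances(X, centers, variances):
    """Squared Euclidean distance to each center divided by that cluster's variance."""
    # sq[i, j] = ||X[i] - centers[j]||^2, computed by broadcasting
    sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return sq / variances[None, :]

def cluster_variances(X, labels, centers, floor=1e-6):
    """Mean squared distance of members to their center.

    Assumption: the floor stands in for the paper's unspecified adjustment.
    """
    var = np.full(len(centers), floor)
    for j in range(len(centers)):
        members = X[labels == j]
        if len(members) > 0:
            msd = ((members - centers[j]) ** 2).sum(axis=1).mean()
            var[j] = max(msd, floor)
    return var

def weighted_kmeans(X, init_centers, n_iter=20):
    """K-means loop that assigns points by relative rather than absolute distance."""
    centers = np.asarray(init_centers, dtype=float).copy()
    variances = np.ones(len(centers))  # unit weights: first pass is plain K-means
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        labels = relative_distances(X, centers, variances).argmin(axis=1)
        for j in range(len(centers)):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
        variances = cluster_variances(X, labels, centers)
    return labels, centers, variances

# A point 3 units from a tight cluster (variance 1) and 4 units from a
# wide cluster (variance 16): the relative distances are 9 and 1, so the
# point goes to the wide cluster even though the tight one is nearer.
point = np.array([[3.0, 0.0]])
centers = np.array([[0.0, 0.0], [7.0, 0.0]])
variances = np.array([1.0, 16.0])
print(relative_distances(point, centers, variances))
```

With plain Euclidean distance the same point would be assigned to the tight cluster, which is the equal-size bias the abstract refers to. A cluster that shrinks to a single point gets a variance equal to the floor and effectively stops attracting new members, so a practical version would need a more careful adjustment than this sketch uses.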
