
Research on a heuristic initialization-independent k-means algorithm (cited by: 4)
Abstract To address the sensitivity of the traditional k-means algorithm to its initial cluster centers, a heuristic initialization-independent k-means algorithm is proposed. The algorithm uses Prim's algorithm to select the k initial cluster centers and sets a threshold parameter θ to prevent several data objects from the same class from being chosen as initial cluster centers at the same time; such a choice would increase the number of clustering iterations and produce incorrect clustering results. Experiments comparing the improved algorithm with the traditional k-means algorithm and with a k-means clustering algorithm based on a genetic algorithm show that it reduces the impact of randomly selected initial cluster centers on clustering performance, effectively reduces the number of clustering iterations, and reduces the influence of outliers on clustering performance, which supports the feasibility and effectiveness of the algorithm.
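The abstract does not spell out how Prim's algorithm and θ interact, so the following Python sketch is only one possible reading and should not be taken as the authors' method. It assumes a Prim-like greedy growth: the set of centers is grown one point at a time, always adding the point farthest from the centers chosen so far, and a candidate is rejected when it lies closer than θ to an existing center. The function name `select_initial_centers`, the choice of starting point, and the use of Euclidean distance are all assumptions made for illustration.

```python
import math


def select_initial_centers(points, k, theta):
    """Pick k initial centers by a Prim-like greedy growth (illustrative only).

    Assumptions (not from the source): the first center is the point closest
    to the data mean, distance is Euclidean, and theta is the minimum allowed
    distance between any two chosen centers.
    """
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and len(points)")

    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    first = min(range(len(points)), key=lambda i: math.dist(points[i], mean))
    chosen = [first]

    # nearest[i] holds the distance from point i to its closest chosen center.
    # It is updated incrementally, like the key array in Prim's algorithm.
    nearest = [math.dist(p, points[first]) for p in points]

    while len(chosen) < k:
        # Take the point that is farthest from every chosen center.
        cand = max(range(len(points)), key=lambda i: nearest[i])
        if nearest[cand] < theta:
            # Every remaining point is within theta of some chosen center.
            raise ValueError("theta is too large for the requested k")
        chosen.append(cand)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[cand]))

    return [points[i] for i in chosen]
```

The returned points could then seed an ordinary k-means loop in place of randomly drawn centers. Note that a farthest-point rule like this one tends to favor outliers, so it does not by itself reproduce the robustness to outliers reported in the abstract.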
Source Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2012, No. 11, pp. 129-132, 160 (5 pages in total)
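Funding National Natural Science Foundation of China (No. 60970059); National Key Technology R&D Program of China (No. 2009BAH42B02); Natural Science Foundation of Shanxi Province (No. 2008011040); Youth Foundation of Shanxi Province (No. 2011021013-3)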
Keywords clustering analysis; k-means clustering; Prim algorithm; initialization sensitivity; clustering center