
A k-means Web information clustering algorithm with optimized initial centers (cited by: 2)

An Optimization Initial Points K-Means Algorithm for Web Clustering
Abstract: K-means is an important clustering algorithm that is widely used in Web information processing. Because k-means terminates at a local optimum, the choice of initial cluster centers strongly influences the clustering result. To address this problem, the authors construct a similarity matrix for the text collection and, based on the set of mean similarities, select the initial center points by sorting and iterating. Experiments show that the method effectively reduces the number of iterations and improves clustering accuracy, ultimately yielding better clustering results.
Source: Journal of Beijing Institute of Petrochemical Technology (《北京石油化工学院学报》), 2011, No. 4, pp. 55-58 (4 pages)
Keywords: k-means; clustering; initial center point; optimization
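
The abstract names the ingredients of the method (a similarity matrix over the text collection, a set of mean similarities, and a sort-and-iterate selection of initial centers) but not the exact rule. The following is a minimal sketch of one way those ingredients could fit together, not the authors' algorithm. The use of cosine similarity, the rule of discarding documents whose similarity to a chosen center is at or above that center's mean similarity, and the names `cosine_similarity_matrix` and `select_initial_centers` are all assumptions made for illustration.

```python
import math


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity for a list of equal-length numeric vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    n = len(vectors)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dot = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            denom = norms[i] * norms[j]
            s = dot / denom if denom > 0 else 0.0
            sim[i][j] = sim[j][i] = s
    return sim


def select_initial_centers(vectors, k):
    """Pick up to k document indices to use as initial k-means centers.

    Assumed rule (hypothetical, not taken from the paper):
      1. Compute each remaining document's mean similarity to the others.
      2. Sort by mean similarity and take the top document as a center.
      3. Drop that document and every document whose similarity to it is
         at or above the center's mean similarity, then repeat.
    May return fewer than k indices if the candidates run out.
    """
    sim = cosine_similarity_matrix(vectors)
    candidates = list(range(len(vectors)))
    centers = []

    while candidates and len(centers) < k:
        def mean_sim(i):
            others = [sim[i][j] for j in candidates if j != i]
            return sum(others) / len(others) if others else 0.0

        ranked = sorted(candidates, key=mean_sim, reverse=True)
        best = ranked[0]
        threshold = mean_sim(best)
        centers.append(best)
        # Keep only documents that are clearly dissimilar to the new center.
        candidates = [
            j for j in candidates if j != best and sim[best][j] < threshold
        ]

    return centers
```

The returned indices would then seed an ordinary k-means run in place of random initial centers. For document vectors such as TF-IDF weights (also an assumption here), well-separated groups should each contribute one center under this rule, because every pick removes its own close neighbours from the candidate set.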
