Abstract
Multidimensional data are commonplace in data mining. This paper analyzes the shortcomings of traditional k-means and proposes an improved algorithm. Dimension reduction and outlier removal before clustering lower the complexity of the data sample and the influence of outliers on the clustering result, and the cluster centers of each dimension's component of the data space are used as the initial cluster centers. Experimental results show that the improved k-means algorithm substantially improves the efficiency and quality of multidimensional clustering.
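The abstract gives no implementation details, so the following is only a minimal sketch of the three steps it names, under stated assumptions. The function names (`reduce_dims`, `drop_outliers`, `per_dimension_init`), the variance filter used for dimension reduction, the distance-based outlier rule, and the way sorted per-dimension centers are paired into initial centers are all hypothetical choices made for illustration, not the paper's method.

```python
import math

def reduce_dims(points, min_var=1e-9):
    """Assumed dimension reduction: drop dimensions with (near-)zero variance."""
    n, dims = len(points), len(points[0])
    keep = []
    for d in range(dims):
        mean = sum(p[d] for p in points) / n
        var = sum((p[d] - mean) ** 2 for p in points) / n
        if var > min_var:
            keep.append(d)
    return [tuple(p[d] for d in keep) for p in points]

def drop_outliers(points, factor=2.0):
    """Assumed outlier rule: drop points farther from the global centroid
    than `factor` times the mean distance to that centroid."""
    n, dims = len(points), len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dims)]
    dists = [math.sqrt(sum((p[d] - centroid[d]) ** 2 for d in range(dims)))
             for p in points]
    limit = factor * sum(dists) / n
    return [p for p, dist in zip(points, dists) if dist <= limit]

def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means seeded at evenly spaced positions in the sorted
    values; returns the k centers in ascending order."""
    vs = sorted(values)
    centers = [vs[(2 * i + 1) * len(vs) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vs:
            j = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[j].append(v)
        new = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    return sorted(centers)

def per_dimension_init(points, k):
    """Cluster each dimension separately, then pair the j-th smallest center
    of every dimension into the j-th initial center (an assumed pairing)."""
    dims = len(points[0])
    per_dim = [kmeans_1d([p[d] for p in points], k) for d in range(dims)]
    return [tuple(per_dim[d][j] for d in range(dims)) for j in range(k)]

def kmeans(points, k, centers, iters=100):
    """Standard Lloyd iterations starting from the supplied centers."""
    dims = len(points[0])
    labels = []
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: sum((p[d] - centers[c][d]) ** 2
                                        for d in range(dims)))
                  for p in points]
        new = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new.append(tuple(sum(m[d] for m in members) / len(members)
                                 for d in range(dims)))
            else:
                new.append(centers[c])  # keep an empty cluster's old center
        if new == centers:
            break
        centers = new
    return centers, labels

# Toy data: two compact groups plus one far-away point.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11), (100, 100)]
clean = drop_outliers(reduce_dims(data))
init = per_dimension_init(clean, 2)
final_centers, labels = kmeans(clean, 2, init)
```

Pairing sorted per-dimension centers only works well when the clusters are ordered consistently across dimensions, as in this toy data. A real implementation would need a more careful way to combine per-dimension centers.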
Source
《曲阜师范大学学报(自然科学版)》
CAS
2012, No. 4, pp. 65-69 (5 pages)
Journal of Qufu Normal University(Natural Science)
Funding
Wuyi University Special Research Fund for Young Teachers (XQ201110)