Abstract
Iterative clustering of massive data provides an accurate basis for data mining and has significant practical value. Traditional algorithms are overly sensitive to the choice of initial parameters, which lowers the accuracy of iterative clustering. This paper proposes an iterative clustering method for massive data based on merge clustering (并归聚类). The cluster centers of the data are computed using the deviation membership degree (离差隶属度). The inter-class distance serves as the merging criterion and is used to judge the distance between data features and cluster centers. The data features are then assigned to classes according to these computed results, and the process repeats until every cluster center has finished merging, which yields an accurate iterative clustering result. Simulation experiments show that the improved algorithm improves iterative clustering results on massive data, with satisfactory effect.
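The procedure outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the deviation membership degree, so the code below substitutes a standard fuzzy c-means membership as a stand-in; it is not the paper's method. The function name `merge_clustering` and the parameters `merge_threshold`, `m`, `max_iter`, and `tol` are hypothetical and introduced only for illustration.

```python
import numpy as np

def merge_clustering(X, init_centers, merge_threshold, m=2.0, max_iter=100, tol=1e-6):
    """Sketch: membership-weighted center updates plus distance-based merging.

    Assumption: fuzzy c-means memberships stand in for the paper's
    deviation membership degree, whose formula the abstract does not give.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float)
    for _ in range(max_iter):
        # Distance from every point to every center, shape (n_points, n_centers).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero
        # Fuzzy membership of each point in each cluster (rows sum to 1).
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        w = u ** m
        mass = w.sum(axis=0)
        new_centers = (w.T @ X) / mass[:, None]

        merged = False
        if len(new_centers) > 1:
            # Inter-class distance between centers is the merging criterion.
            cd = np.linalg.norm(
                new_centers[:, None, :] - new_centers[None, :, :], axis=2
            )
            np.fill_diagonal(cd, np.inf)
            i, j = np.unravel_index(np.argmin(cd), cd.shape)
            if cd[i, j] < merge_threshold:
                # Merge the closest pair into one mass-weighted center.
                combined = (mass[i] * new_centers[i] + mass[j] * new_centers[j]) / (
                    mass[i] + mass[j]
                )
                rest = np.delete(new_centers, [i, j], axis=0)
                new_centers = np.vstack([rest, combined])
                merged = True

        # Stop once no merge happened and the centers no longer move.
        if not merged and np.allclose(new_centers, centers, atol=tol):
            centers = new_centers
            break
        centers = new_centers

    # Final hard assignment of each point to its nearest center.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, d.argmin(axis=1)
```

In this sketch, starting with more centers than necessary and letting nearby centers merge is one way to reduce dependence on the initial parameters, which is the weakness of traditional algorithms that the abstract points to.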
Source
《科技通报》
Peking University Core Journal (北大核心)
2016, Issue 4, pp. 152-155 (4 pages)
Bulletin of Science and Technology
Keywords
massive data
iterative clustering
merging (并归)