Abstract
Typical partition-based clustering algorithms are sensitive to noise and outliers. To address this, a clustering algorithm based on data dispersion is proposed. The algorithm defines a data dispersion index and introduces it into a non-Euclidean distance function to build the similarity measure used for clustering. The optimal number of clusters is obtained from a validity function based on an improved partition coefficient. The algorithm is then applied to quality index modeling in the textile slashing process: a radial basis function (RBF) neural network models the size add-on quality index, and the proposed clustering algorithm determines the number of hidden-layer nodes and the centers of the radial basis functions. Experimental results show that the proposed algorithm has low sensitivity to noise and outliers, and that the resulting size add-on quality index model achieves high accuracy.
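The pipeline described in the abstract (cluster the inputs, use the cluster centers as RBF centers, then fit the output layer) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the abstract does not give the formulas for the data dispersion index, the non-Euclidean distance, or the improved partition coefficient, so the code substitutes standard fuzzy c-means with Euclidean distance and the classic partition coefficient. All function names, the synthetic data, the cluster count, and the kernel width are hypothetical choices, not values from the paper.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=100, seed=0):
    """Standard fuzzy c-means with Euclidean distance.
    Stand-in only: NOT the paper's dispersion-based non-Euclidean metric."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # memberships of each sample sum to 1
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

def partition_coefficient(U):
    """Classic partition coefficient, in [1/c, 1].
    The paper's 'improved' variant is not specified in the abstract."""
    return float(np.sum(U ** 2) / len(U))

def rbf_design(X, centers, width):
    """Gaussian RBF activations, one column per hidden node."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbf(X, y, centers, width):
    """Least-squares fit of the linear output weights."""
    w, *_ = np.linalg.lstsq(rbf_design(X, centers, width), y, rcond=None)
    return w

# Synthetic 1-D regression data (assumption; not slashing-process data).
X = np.linspace(0.0, 1.0, 200)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])

centers, U = fuzzy_c_means(X, c=8)               # c = number of hidden nodes
pc = partition_coefficient(U)
w = fit_rbf(X, y, centers, width=0.15)
pred = rbf_design(X, centers, width=0.15) @ w
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
print(f"partition coefficient = {pc:.3f}, training RMSE = {rmse:.4f}")
```

In this sketch the number of clusters is fixed by hand. In the approach the abstract describes, it would instead be chosen by evaluating the validity function over a range of candidate cluster counts.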
Source
Journal of System Simulation (《系统仿真学报》)
CAS
CSCD
PKU Core (北大核心, Peking University core journal list)
2016, Issue 8, pp. 1707-1714 (8 pages)
Funding
National Natural Science Foundation of China (61102124)
Natural Science Foundation of Liaoning Province (2015020064)
Keywords
quality index model
clustering
data dispersion
non-Euclidean distance
textile slashing process