Abstract
Traditional principal component clustering is sensitive to outliers, which can make the clustering results inconsistent with the actual structure of the data. To address this, the paper uses robust statistics to modify the traditional principal component clustering method and constructs a robust principal component clustering analysis algorithm that overcomes the influence of outliers on the computed results. Simulation and empirical results show that when the data contain no outliers, the robust method gives results consistent with those of the traditional method; when the data contain outliers, the robust principal component clustering method effectively resists their influence and shows better resistance to interference and higher robustness than the traditional method.
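The abstract does not say which robust statistic the authors use, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes the sample mean and covariance matrix are replaced by a simple one-step trimmed estimate (rows far from the coordinatewise median, in MAD units, are dropped), and that k-means is the clustering step applied to the principal component scores. The function names `trimmed_location_scatter`, `principal_scores`, `kmeans`, and `compare`, as well as the synthetic data, are hypothetical.

```python
import numpy as np


def trimmed_location_scatter(X, keep_quantile=0.9):
    """Assumed robust estimate: mean and covariance of the rows closest to the median."""
    med = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    mad = np.where(mad > 0, mad, 1.0)  # guard against zero spread
    # Distance of each row from the coordinatewise median, in MAD units.
    dist = np.sqrt((((X - med) / mad) ** 2).sum(axis=1))
    keep = dist <= np.quantile(dist, keep_quantile)
    return X[keep].mean(axis=0), np.cov(X[keep], rowvar=False)


def principal_scores(X, center, scatter, n_components):
    """Project centered data onto the leading eigenvectors of the scatter matrix."""
    eigvals, eigvecs = np.linalg.eigh(scatter)
    order = np.argsort(eigvals)[::-1][:n_components]
    return (X - center) @ eigvecs[:, order]


def kmeans(Z, k, n_iter=100):
    """Plain k-means with a deterministic farthest-point initialization."""
    Z = np.asarray(Z, dtype=float)
    centers = [Z[np.argmin(Z[:, 0])]]
    while len(centers) < k:
        d = np.min([np.linalg.norm(Z - c, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return labels


def compare(seed=0):
    """Cluster synthetic data with classical and trimmed PCA scores.

    Returns, for each variant, the share of non-outlying points whose
    cluster matches the true group (up to a swap of the two labels).
    """
    rng = np.random.default_rng(seed)
    group_a = rng.normal([-3.0, 0.0], 0.5, size=(50, 2))
    group_b = rng.normal([3.0, 0.0], 0.5, size=(50, 2))
    outliers = rng.normal([0.0, 40.0], 0.5, size=(5, 2))  # gross outliers
    X = np.vstack([group_a, group_b, outliers])
    truth = np.repeat([0, 1], 50)

    estimates = {
        "classical": (X.mean(axis=0), np.cov(X, rowvar=False)),
        "robust": trimmed_location_scatter(X),
    }
    results = {}
    for name, (center, scatter) in estimates.items():
        scores = principal_scores(X, center, scatter, n_components=1)
        labels = kmeans(scores, k=2)[:100]  # evaluate on the 100 regular points
        match = np.mean(labels == truth)
        results[name] = float(max(match, 1.0 - match))
    return results


if __name__ == "__main__":
    print(compare())
```

In this toy setup the two groups differ only along the first coordinate, while the five outliers lie far away along the second. The classical covariance matrix is dominated by the outliers, so its first component tends to separate outliers from everything else; the trimmed estimate ignores them and its first component follows the direction that separates the two groups. A higher-breakdown estimator could be substituted for the trimming step without changing the rest of the sketch.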
Authors
李雄英
颜斌
LI Xiong-ying; YAN Bin (School of Statistics and Mathematics, Guangdong University of Finance and Economics, Guangzhou, Guangdong 510320, China; School of Economics, Jinan University, Guangzhou, Guangdong 510632, China)
Source
《数理统计与管理》
CSSCI
Peking University Core Journal
2019, No. 5, pp. 849-857 (9 pages)
Journal of Applied Statistics and Management
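Funding
National Statistical Science Research Project (2018LY04)
Young Innovative Talents Project of the Guangdong Provincial Department of Education (2016WQNCX046)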
Keywords
principal component cluster analysis
robust statistics
covariance matrix
outlier