Abstract
To address the excessively high dimensionality of the feature space in complex pattern classification, a dimensionality reduction method based on manifold learning, PCA-SLPP, is proposed. First, principal component analysis (PCA) is applied to the sample data in the initial feature set so that the features become mutually uncorrelated in the global space. Next, an improved supervised locality preserving projections (SLPP) method maps the PCA-transformed data so that the local manifold structure of the data in the feature space is preserved while the separability of the samples is increased. Finally, the dimension of the feature space is reduced according to the cumulative contribution rate in PCA and the magnitude of the eigenvalues in SLPP. A support vector machine classifier is trained on the reduced data set and used to classify the macerals of the inertinite group of coal, which have complex structures. Experimental results show that: PCA effectively removes the information redundancy among the feature data, which helps improve classification accuracy; when the PCA dimension is fixed and the total dimension is relatively high, further reduction with SLPP has little effect on classification accuracy, but once the total dimension falls to 2 or below, accuracy drops rapidly; when the total dimension of the feature space is reduced to half of the initial dimension or below, the proposed method achieves markedly higher classification accuracy than the other algorithms compared; and the running time of the proposed method is close to that of SLPP.
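The two-stage reduction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify how their SLPP is "improved", so the code uses a basic supervised LPP in which only same-class samples are neighbours, weighted by a heat kernel. The function names `pca_by_contribution` and `slpp`, the 0.95 contribution threshold, the kernel width, the ridge term, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def pca_by_contribution(X, threshold=0.95):
    """Keep the leading principal components whose cumulative
    contribution (explained-variance ratio) reaches `threshold`."""
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))  # ascending order
    order = np.argsort(vals)[::-1]                         # largest first
    vals, vecs = vals[order], vecs[:, order]
    ratio = np.cumsum(vals) / vals.sum()
    k = min(int(np.searchsorted(ratio, threshold)) + 1, X.shape[1])
    return Xc @ vecs[:, :k], vecs[:, :k]

def slpp(Z, y, n_components, t=None, ridge=1e-8):
    """Basic supervised LPP (assumed variant): same-class pairs are
    neighbours with heat-kernel weights; keep the directions with the
    smallest generalized eigenvalues of (Z^T L Z) a = lambda (Z^T D Z) a."""
    y = np.asarray(y)
    sq = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    if t is None:
        t = sq.mean()                      # assumed kernel width
    W = np.exp(-sq / t) * (y[:, None] == y[None, :])
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W                              # graph Laplacian
    A = Z.T @ L @ Z
    B = Z.T @ D @ Z + ridge * np.eye(Z.shape[1])
    # Reduce the generalized problem to a standard symmetric one via Cholesky.
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    vals, vecs = np.linalg.eigh((M + M.T) / 2)             # ascending order
    P = C_inv.T @ vecs[:, :n_components]
    return Z @ P, P, vals[:n_components]

# Synthetic stand-in data: 3 classes, 10 redundant features built from 4 latent ones.
rng = np.random.default_rng(0)
n_per_class, n_latent, n_features = 40, 4, 10
means = rng.normal(scale=3.0, size=(3, n_latent))
latent = np.vstack([m + rng.normal(size=(n_per_class, n_latent)) for m in means])
y = np.repeat(np.arange(3), n_per_class)
mix = rng.normal(size=(n_latent, n_features))
X = latent @ mix + 0.05 * rng.normal(size=(3 * n_per_class, n_features))

Z, pca_axes = pca_by_contribution(X, threshold=0.95)   # stage 1: PCA
Y, slpp_axes, slpp_vals = slpp(Z, y, n_components=2)   # stage 2: SLPP
print(X.shape, "->", Z.shape, "->", Y.shape)
```

The reduced matrix `Y` would then be passed to a classifier; the abstract uses a support vector machine, for which something like scikit-learn's `SVC` could serve.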
Authors
王培珍
王慧
刘曼
王高
张代林
WANG Peizhen; WANG Hui; LIU Man; WANG Gao; ZHANG Dailin (School of Electrical Engineering & Information, Anhui University of Technology, Ma’anshan 243002, China; Anhui Key Laboratory of Clean Conversion and Utilization, Anhui University of Technology, Ma’anshan 243002, China; Key Laboratory of Metallurgical Emission Reduction & Resources Recycling, Ministry of Education, Anhui University of Technology, Ma’anshan 243002, China)
Source
《安徽工业大学学报(自然科学版)》
CAS
2018, No. 4, pp. 352-359 (8 pages)
Journal of Anhui University of Technology (Natural Science)
Funding
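National Natural Science Foundation of China (51574004)
Key Academic Funding Project for Top Discipline Talents in Anhui Provincial Universities (2016041)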
Keywords
dimensionality reduction
principal component analysis (PCA)
manifold learning
supervised locality preserving projections (SLPP)
coal
classification