
An Evaluation of Several Dimensionality Reduction Techniques on Classification Problems (Cited by: 5)

Abstract: High-dimensional data make analysis considerably harder: the resulting sparsity of the data distribution and the weakened organization of the data can significantly degrade model performance. Dimensionality reduction is one way to address this "curse of dimensionality". This paper considers three common dimensionality reduction methods, PCA, LLE, and Isomap. It first introduces the principles behind each, then combines them with KNN, SVM, Random Forest, Naive Bayes, and Logistic Regression classifiers to build a combined cross-model for evaluating the three methods. The results show that, on the data set used in the paper, data reduced by PCA and by Isomap are distributed fairly uniformly in the visualizable 2-dimensional space, whereas data reduced by LLE are more concentrated. Classifiers trained on PCA-reduced and Isomap-reduced data reach average accuracies of 96.44% and 96.90%, respectively, higher than the 90.74% obtained after LLE, so PCA and Isomap give the better dimensionality reduction on this data set. The methods and results of this study offer a useful reference for choosing a dimensionality reduction method.
Source: Technology Innovation and Application (《科技创新与应用》), 2018, No. 21, pp. 22-23, 26 (3 pages in total)
Keywords: dimensionality reduction; PCA; LLE; Isomap; effect evaluation
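
The evaluation scheme described in the abstract (each of three reducers paired with each of five classifiers, then averaged per reducer) can be sketched roughly as follows. This is an illustrative sketch with scikit-learn, not the authors' code. The abstract does not name the data set, so the Wine data set is an assumption here, as are `n_neighbors=10`, 5-fold cross-validation, and the default classifier settings. For brevity the embedding is computed once on all samples (without labels) before cross-validation, which is a simplification.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import Isomap, LocallyLinearEmbedding
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumed stand-in data set: 178 samples, 13 features, 3 classes.
X, y = load_wine(return_X_y=True)
X = StandardScaler().fit_transform(X)

# The three reducers named in the abstract, each mapping to 2 dimensions.
# Hyperparameters are illustrative assumptions.
reducers = {
    "PCA": PCA(n_components=2),
    "LLE": LocallyLinearEmbedding(n_components=2, n_neighbors=10, random_state=0),
    "Isomap": Isomap(n_components=2, n_neighbors=10),
}

# The five classifiers named in the abstract, with near-default settings.
classifiers = {
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

mean_accuracy = {}
for reducer_name, reducer in reducers.items():
    # Unsupervised embedding of all samples (labels are not used here).
    X_2d = reducer.fit_transform(X)
    # 5-fold cross-validated accuracy of every classifier on the embedding.
    scores = [
        cross_val_score(clf, X_2d, y, cv=5).mean()
        for clf in classifiers.values()
    ]
    # Average over the five classifiers, giving one score per reducer.
    mean_accuracy[reducer_name] = sum(scores) / len(scores)

for reducer_name, acc in mean_accuracy.items():
    print(f"{reducer_name}: mean accuracy over 5 classifiers = {acc:.4f}")
```

The numbers this prints depend on the data set and settings chosen, so they should not be expected to match the 96.44%, 96.90%, and 90.74% reported in the paper.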
