Abstract
Dimension reduction is a classical problem in machine learning. It is used to mitigate the curse of dimensionality, speed up computation, improve interpretability, and visualize high-dimensional data. Traditional dimension reduction algorithms such as principal component analysis (PCA) and linear discriminant analysis are mainly suited to unlabeled data or classification data. When the response variables are univariate or multivariate continuous real-valued variables, however, these methods cannot guarantee effective predictive performance in the reduced subspace. Over the past two decades, researchers have studied this problem from several viewpoints and obtained promising, systematic results. Against this background, this paper surveys the development of dimension reduction methods for regression problems, namely real-valued multivariate dimension reduction. It introduces the closely related basic concepts, algorithms, and theory, and discusses some potential research directions that deserve investigation.
Source
Acta Automatica Sinica (自动化学报)
EI
CSCD
Peking University Core Journals (北大核心)
2018, Issue 2, pp. 193-215 (23 pages)
Funding
National Natural Science Foundation of China (61673118)
Shanghai Pujiang Program (16PJD009)
Keywords
Dimension reduction, curse of dimensionality, regression analysis, conditional independence, mutual information