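Abstract
To address the nonlinear relationship between the near infrared (NIR) spectra of samples and their physicochemical properties, a new nonlinear modeling method combining isometric mapping (Isomap) and partial least squares (PLS) is proposed. Isomap is a new nonlinear dimension reduction method from the manifold learning family that can effectively discover the intrinsic low-dimensional structure in high-dimensional data. The Isomap-PLS method first applies Isomap to the high-dimensional NIR spectral data for nonlinear dimension reduction, then uses PLS for further dimension reduction and to build the calibration model. Isomap-PLS was applied to two public benchmark NIR spectral datasets and compared with modeling by PLS alone. The results show that, on both datasets, the calibration models built with Isomap-PLS had a smaller root mean square error of cross-validation (RMSECV) than those built with PLS alone; for some properties, the RMSECV of the Isomap-PLS model was smaller than that of the PLS model by a factor of 2-5. Isomap can therefore effectively reflect the nonlinear structure present in NIR spectra, and Isomap-PLS has better modeling and prediction ability than PLS.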
To model the nonlinear relationship between samples' near infrared (NIR) spectra and their chemical or physical properties, this paper puts forward a novel modeling method that builds a model by combining Isomap and partial least squares (PLS). Isomap is a recently proposed nonlinear dimension reduction algorithm that belongs to the manifold learning family, a new branch of machine learning. Isomap is based on the multidimensional scaling (MDS) algorithm; however, it replaces the Euclidean distance in MDS with an approximated geodesic distance, so it can effectively find the intrinsic low-dimensional structure in high-dimensional data. The combination of Isomap and PLS, referred to as Isomap-PLS, is proposed as a novel nonlinear modeling method for NIR spectral analysis. In this method, Isomap is used to extract nonlinear information from high-dimensional NIR spectra while preserving their geometric properties, and PLS is then adopted to remove linear information redundancy and build a calibration model. The parameters of Isomap, i.e. the number of nearest neighbors k and the output dimension d, can affect the performance of the method. In this paper, a grid search approach was used for parameter optimization. The Isomap-PLS modeling method was applied to two public benchmark NIR datasets, and the modeling results were compared with those of PLS. The results demonstrated that on both datasets, each model built with Isomap-PLS had a smaller root mean square error of cross-validation (RMSECV) than the corresponding model built with PLS. Moreover, for some properties, the RMSECV of Isomap-PLS was reduced by a factor of 2-5 compared with that of PLS.
It can be concluded that, because Isomap can reflect the intrinsic nonlinear structure of NIR spectra, Isomap-PLS can effectively model the nonlinear correlations between the spectra and the physicochemical properties of the samples, and so it gains more power in calibration and prediction.
Source
《光谱学与光谱分析》
SCIE
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2009, Issue 2, pp. 322-326 (5 pages)
Spectroscopy and Spectral Analysis
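Funding
Guangxi Science Foundation (Grant No. 桂科青0542037)
National Natural Science Foundation of China (Grant No. 30860381)
National "973" Program, Special Project on Traditional Chinese Medicine Theory (Grant No. 2005CB523503)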