Abstract
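This paper combines the Monte Carlo uninformative variable elimination (MC-UVE) algorithm with the variable importance in projection (VIP) algorithm to select important, informative wavelength variables, yielding a two-step wavelength selection method, MC-UVE-VIP. The method first uses MC-UVE to select the set U_(UVE) of informative wavelengths whose stability parameter exceeds a threshold (Mthreshold). It then uses the VIP algorithm to select, from U_(UVE), the wavelengths whose VIP value exceeds the mean VIP of all wavelengths in U_(UVE); these form the set of important, informative wavelengths, U_(UVE)-VIP. A partial least squares regression (PLSR) near-infrared spectral model for predicting protein content in corn is built on U_(UVE)-VIP, with the number of latent variables chosen so that the cumulative contribution rate exceeds 99.9%. The model uses few variables, is robust and interpretable, and is fast to compute. Its mean absolute relative error (MARE) for predicting protein in samples measured on two secondary instruments is 1.64% and 1.88%, respectively, lower than the secondary-instrument MARE of the MC-UVE model (5.40% and 5.19%) and of the VIP model (6.23% and 7.16%). The corn protein near-infrared model built with the MC-UVE-VIP two-step wavelength selection method can therefore be transferred directly to the secondary instruments without correcting the model or the secondary-instrument spectra. Its transfer performance is better than that of models built on wavelengths selected by MC-UVE or VIP alone, and better than that of the full-wavelength model.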
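The two selection rules and the error metric can be written compactly as follows. This is a sketch under assumptions, not the paper's own notation: the symbols s_j, b_j, n and the exact form of the stability parameter and of MARE are not given in this record and follow definitions commonly used for MC-UVE and for relative prediction error.

```latex
% Assumed notation: b_j = PLS regression coefficient of wavelength j
% collected over the Monte Carlo runs; s_j = its stability parameter.
\[
  s_j = \left| \frac{\operatorname{mean}(b_j)}{\operatorname{std}(b_j)} \right|,
  \qquad
  U_{\mathrm{UVE}} = \{\, j : s_j > M_{\mathrm{threshold}} \,\}
\]
\[
  U_{\mathrm{UVE\text{-}VIP}} =
  \Bigl\{\, j \in U_{\mathrm{UVE}} :
    \mathrm{VIP}_j > \frac{1}{|U_{\mathrm{UVE}}|}
    \sum_{k \in U_{\mathrm{UVE}}} \mathrm{VIP}_k \,\Bigr\}
\]
% Assumed definition of the mean absolute relative error over n samples,
% with y_i the reference value and \hat{y}_i the prediction.
\[
  \mathrm{MARE} = \frac{100\%}{n} \sum_{i=1}^{n}
  \frac{\lvert \hat{y}_i - y_i \rvert}{\lvert y_i \rvert}
\]
```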
Previous studies have shown that selecting suitable wavelengths when building a near-infrared (NIR) spectral model allows the calibration model to be transferred without standard samples. Using corn spectra from three NIR instruments and the protein content of corn as a case study, this paper proposes a two-step wavelength selection method, MC-UVE-VIP, that combines Monte Carlo uninformative variable elimination (MC-UVE) with variable importance in projection (VIP). The method selects important, informative wavelength variables for building a partial least squares (PLS) calibration model for protein in corn. First, MC-UVE is used to select the set U_(UVE) of informative wavelengths whose stability parameter is greater than a threshold (Mthreshold). Then the VIP algorithm is used to select the set U_(UVE)-VIP of important, informative wavelengths whose VIP parameter is greater than the mean VIP of all wavelengths in U_(UVE). A PLS NIR spectral model for predicting protein in corn is built on U_(UVE)-VIP. The number of latent variables in the PLS model is chosen so that the cumulative contribution rate of the leading latent variables is at least 99.9%. The model uses few variables, is robust and interpretable, and is fast to compute. Its mean absolute relative error (MARE) for predicting protein in samples measured on the two secondary instruments is 1.64% and 1.88%, respectively, whereas the MARE for secondary-instrument samples predicted by the MC-UVE and VIP models is 5.19%-5.40% and 6.23%-7.16%, respectively. The NIR spectral model of corn protein based on the MC-UVE-VIP two-step wavelength selection method can therefore be transferred directly to the secondary instruments without correcting the model or the secondary-instrument spectra. Its transfer performance is better than that of models based on MC-UVE or VIP wavelength selection alone, and better than that of the full-wavelength model.
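The two thresholding steps described in the abstract can be illustrated with a minimal Python sketch. It covers only the selection logic and the error metric, and assumes the PLS regression coefficients from the Monte Carlo runs and the VIP scores have already been computed elsewhere. All function names, the toy numbers, and the stability formula (absolute mean over standard deviation of each coefficient) are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, stdev

def stability(coef_runs):
    """Assumed MC-UVE stability: |mean / std| of each wavelength's PLS
    coefficient across Monte Carlo runs (rows = runs, columns = wavelengths).
    Assumes each coefficient varies between runs (non-zero std)."""
    return [abs(mean(col) / stdev(col)) for col in zip(*coef_runs)]

def select_uve(stab, m_threshold):
    """Step 1: keep wavelength indices whose stability exceeds the threshold."""
    return [j for j, s in enumerate(stab) if s > m_threshold]

def select_vip(vip_by_index, u_uve):
    """Step 2: within U_UVE, keep wavelengths whose VIP exceeds the mean VIP
    of all wavelengths in U_UVE."""
    cutoff = mean(vip_by_index[j] for j in u_uve)
    return [j for j in u_uve if vip_by_index[j] > cutoff]

def mare(y_true, y_pred):
    """Mean absolute relative error in percent (assumed definition)."""
    return 100.0 * mean(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred))

# Hypothetical toy data: 3 Monte Carlo runs x 4 wavelengths.
coef_runs = [
    [1.0, 0.1, -2.0, 0.5],
    [1.2, -0.1, -2.2, 0.4],
    [0.8, 0.0, -1.8, 0.6],
]
u_uve = select_uve(stability(coef_runs), m_threshold=3.0)  # indices 0, 2, 3

# Hypothetical VIP scores, computed on the U_UVE wavelengths only.
vip = {0: 1.4, 2: 1.2, 3: 0.4}
u_uve_vip = select_vip(vip, u_uve)  # indices 0, 2

error_percent = mare([10.0, 8.0], [10.2, 7.8])
```

In practice the coefficient matrix would come from PLS models fitted on random subsets of the calibration samples, and the VIP scores from a PLS model refitted on the U_(UVE) wavelengths; both steps are outside this sketch.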
Authors
张站鸽
倪力军
张立国
栾绍嵘
ZHANG Zhan-ge; NI Li-jun; ZHANG Li-guo; LUAN Shao-rong (College of Chemistry and Molecular Engineering, East China University of Science and Technology, Shanghai 200237, China)
Source
《分析测试学报》
CAS
CSCD
Peking University Core Journals
2023, Issue 2, pp. 204-209 (6 pages)
Journal of Instrumental Analysis
Funding
Shanghai Municipal Scientific Research Program Project (19391902400).