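Abstract
Near-infrared reflectance spectra acquired in the field with on-the-go equipment are subject to many interfering factors and a complex, variable acquisition environment, so building a soil carbon prediction model is harder than under laboratory conditions. This study examined how effectively variable selection improves model quality. Judged on an independent test dataset, prediction accuracy after variable selection improved to varying degrees compared with the model built from all variables, which indicates that spectral variable selection is beneficial and necessary when building soil carbon prediction models. Models built after variable selection by uninformative variable elimination (UVE) and by UVE combined with the successive projections algorithm (UVE-SPA) gave higher prediction accuracy, whereas SPA and genetic algorithm-partial least squares (GA-PLS) performed poorly. For synergy interval partial least squares, the number of intervals and the number of subintervals used for modeling both affect prediction accuracy. With a suitable number of intervals and a suitable subinterval combination, it can match UVE and UVE-SPA, but its drawback is that selecting the optimal subinterval combination requires a large amount of computation.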
This paper evaluates how effectively variable selection before modeling improves partial least squares regression (PLSR) models. On the independent test dataset, prediction accuracy after variable selection was higher than that of the PLSR model derived from all spectral variables. The results therefore show that variable selection is beneficial and necessary for soil carbon modeling by on-the-go NIRS. UVE (uninformative variable elimination) and UVE-SPA (successive projections algorithm) performed effective variable selection and produced promising models, whereas SPA and GA-PLS (genetic algorithm PLS) failed to produce appropriate models. For synergy interval PLS (siPLS), changes in the number of intervals and in the number of intervals used for modeling clearly affected prediction accuracy. With an appropriate number of intervals and an appropriate number of intervals used for modeling, siPLS produced promising models with prediction accuracy similar to that of UVE or UVE-SPA. Its shortcoming is that it requires a lot of computing time to find the optimal combination of intervals for modeling.
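The computational cost attributed to siPLS can be made concrete by counting the candidate interval combinations. The sketch below is a minimal illustration, not the authors' procedure: the function names, the variable count, and the interval counts are hypothetical values chosen for the example, and it assumes an exhaustive search in which every combination would be scored with one cross-validated PLS fit.

```python
from itertools import combinations
from math import comb


def interval_bounds(n_variables, n_intervals):
    """Split variable indices 0..n_variables-1 into contiguous, near-equal intervals.

    Returns a list of (start, stop) pairs with stop exclusive.
    """
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for i in range(n_intervals):
        # The first `extra` intervals take one additional variable each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def sipls_candidates(n_intervals, n_combined):
    """Enumerate every set of n_combined interval indices out of n_intervals."""
    return combinations(range(n_intervals), n_combined)


def search_cost(interval_counts, combined_counts):
    """Count the models an exhaustive search would have to fit.

    Each candidate combination is assumed to need one cross-validated PLS fit.
    """
    return {
        (n, k): comb(n, k)
        for n in interval_counts
        for k in combined_counts
        if k <= n
    }


if __name__ == "__main__":
    # Hypothetical settings, not taken from the paper.
    for (n, k), count in search_cost((10, 20, 30, 40), (2, 3, 4)).items():
        print(f"{n} intervals, {k} combined: {count} candidate models")
```

Under these assumptions, 20 intervals with 4 combined already give 4845 candidates, and the total grows further when several interval counts are tried, which is consistent with the computing-time drawback noted in the abstract.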
Source
《光谱学与光谱分析》
SCIE
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2013, Issue 7, pp. 1775-1780 (6 pages)
Spectroscopy and Spectral Analysis
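Funding
Zhejiang Province "San Nong Wu Fang" Cooperation Program project (20100015)
Zhejiang Provincial Key Science and Technology Innovation Team project (2010R50030)
Keywords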
On-the-go measurement
Near-infrared spectra
Soil carbon
Partial least squares regression
Variable selection