Abstract
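Combining environmental covariates with machine learning regression models to build combined models for the spatial prediction of soil organic matter is of great importance for precision nutrient management. However, information redundancy and correlation among multidimensional variables can lead to problems such as excessively long training times and reduced prediction accuracy. Taking the farming area of Xianyang City, Shaanxi Province, as a case study, 10 environmental variables were selected: elevation, aspect, slope, profile curvature, plane curvature, relief, topographic wetness index, mean annual precipitation, mean annual temperature, and normalized difference vegetation index. Following feature extraction by principal component analysis (PCA) and kernel principal component analysis (KPCA), random forest (RF), support vector regression (SVR), and K-nearest neighbor (KNN) machine learning models were combined with the extracted features to predict the spatial distribution of soil organic matter (SOM) content. With single models as controls, the prediction accuracy of the different models was evaluated using the coefficient of determination (R^(2)), root mean square error (RMSE), and relative absolute error (RAE). The results show that combined models built from principal component extraction methods and machine learning algorithms can eliminate the correlation among variables and, to some extent, improve the accuracy of SOM content prediction models. The KPCA-RF model predicted SOM content more accurately than the other models, with R^(2), RMSE, and RAE of 0.791, 1.970 g·kg^(-1), and 50.100%, respectively. Its good predictive ability can provide a scientific basis for the spatial prediction and mapping of SOM content.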
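The three accuracy measures named above can be sketched as follows. The abstract does not give the formulas, so this assumes the commonly used definitions: R^(2) as one minus the ratio of residual to total sum of squares, RMSE as the square root of the mean squared error, and RAE as the total absolute error relative to the total absolute deviation from the observed mean, expressed as a percentage. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target (e.g. g/kg)."""
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq_err / len(y_true))


def rae_percent(y_true, y_pred):
    """Relative absolute error in percent (assumed definition):
    sum|y - y_hat| divided by sum|y - mean(y)|, times 100."""
    mean_y = sum(y_true) / len(y_true)
    abs_err = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    abs_dev = sum(abs(t - mean_y) for t in y_true)
    return 100.0 * abs_err / abs_dev


# Hypothetical observed and predicted SOM contents (g/kg), for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 17.0]

print(r2(observed, predicted))           # 0.85
print(rmse(observed, predicted))         # about 0.866
print(rae_percent(observed, predicted))  # 37.5
```

Under this definition, an RAE below 100% means the model's absolute errors are smaller than those of a baseline that always predicts the observed mean.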
Spatial prediction models of soil nutrients that combine environmental covariates with machine learning regression models are of great significance for precise nutrient management, but information redundancy and correlation among multidimensional variables can lead to problems such as long model training times and low prediction accuracy. In this study, the farming area of Xianyang City, Shaanxi Province, China, was taken as an example, and 10 environmental variables were selected: elevation, aspect, slope, plane curvature, profile curvature, relief, topographic wetness index, annual average temperature, annual average precipitation, and normalized difference vegetation index. Features were extracted by principal component analysis (PCA) and kernel PCA (KPCA) and then combined with the random forest (RF), support vector regression (SVR), and K-nearest neighbor (KNN) models to develop spatial prediction models for soil organic matter (SOM). Single models were used as controls. The prediction accuracy of the different models was evaluated according to the coefficient of determination (R^(2)), root mean square error (RMSE), and relative absolute error (RAE). The following results were obtained: (1) PCA and KPCA reduced the data dimensionality, which eliminated the correlation and redundancy between variables and helped improve the accuracy and stability of the SOM spatial prediction model. (2) The PCA-RF model had a higher prediction accuracy than the RF model (R^(2) increased by 0.023, and RMSE and RAE decreased by 0.070 g·kg^(−1) and 2.440%, respectively), whereas PCA-SVR and PCA-KNN performed worse than SVR and KNN alone. (3) The KPCA-RF model had a higher accuracy than the RF model (R^(2), RMSE, and RAE were 0.791, 1.970 g·kg^(−1), and 50.100%, respectively). The KPCA-SVR and KPCA-KNN models had better prediction accuracies than the SVR and KNN models. (4) The combined prediction model based on KPCA feature extraction and machine learning had a higher prediction accuracy than the PCA-based combined prediction model.
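The combined-model idea described in the abstract (feature extraction followed by a regressor, compared against the regressor alone) might be sketched with scikit-learn as below. This is only an illustration under stated assumptions and not the authors' implementation: the data are synthetic, and the number of components, the RBF kernel, the forest size, and the 70/30 split are all hypothetical choices that the abstract does not specify.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for 10 environmental covariates (NOT the study data).
rng = np.random.default_rng(0)
n_samples, n_covariates = 300, 10
X = rng.normal(size=(n_samples, n_covariates))
X[:, 1] = 0.8 * X[:, 0] + 0.2 * X[:, 1]  # make two covariates correlated
y = 15.0 + 2.0 * np.sin(X[:, 0]) + X[:, 2] ** 2 + rng.normal(scale=0.5, size=n_samples)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)


def make_rf():
    # Hypothetical hyperparameters, chosen only to keep the sketch fast.
    return RandomForestRegressor(n_estimators=100, random_state=0)


def build_models(n_components=5):
    """Single RF model (control) and two combined models."""
    return {
        "RF": make_pipeline(StandardScaler(), make_rf()),
        "PCA-RF": make_pipeline(
            StandardScaler(), PCA(n_components=n_components), make_rf()
        ),
        "KPCA-RF": make_pipeline(
            StandardScaler(),
            KernelPCA(n_components=n_components, kernel="rbf"),
            make_rf(),
        ),
    }


results = {}
for name, model in build_models().items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "R2": float(r2_score(y_test, pred)),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
    }
    print(name, results[name])
```

Replacing `make_rf()` with `sklearn.svm.SVR()` or `sklearn.neighbors.KNeighborsRegressor()` would give the analogous SVR and KNN variants. Which variant scores best on synthetic data says nothing about the rankings reported in the abstract.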
Authors
HU Guigui; YANG Fenli; YANG Lian'an; ZHENG Yurong; WANG Hui; CHEN Weijun; LI Yali (Shaanxi Key Laboratory of Earth Surface System and Environmental Carrying Capacity, Northwest University, Xi'an 710127, Shaanxi, China; College of Urban and Environmental Sciences, Northwest University, Xi'an 710127, Shaanxi, China; Xianyang Station of Soil and Fertilizer, Xianyang 712000, Shaanxi, China; Academy of Agriculture Sciences of Xianyang, Xianyang 712000, Shaanxi, China; Xunyi Station of Soil and Fertilizer, Xunyi 711300, Shaanxi, China)
Source
Arid Land Geography (《干旱区地理》)
CSCD
Peking University Core Journals
2021, Issue 4, pp. 1114-1124 (11 pages)
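Funding
Supported by the National Natural Science Foundation of China (41771129)
and the Shaanxi Provincial Agricultural Science and Technology Research Project (2011K02-11).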
Keywords
soil organic matter
machine learning
kernel principal component analysis
farming area
Xianyang City