Abstract
The water quality index (WQI) is the most commonly used indicator for assessing surface water quality. Conventional WQI calculation is time-consuming and often introduces errors when sub-indices are derived. To address this, four standalone algorithms (random forest (RF), extra trees regression (ETR), gradient boosting regression (GBR), and XGBoost) and five new hybrid ensemble algorithms (Adaboost+RF, Adaboost+ETR, Adaboost+GBR, Adaboost+XGBoost, and a Stacking hybrid model) were used to predict WQI values for the Lam Tsuen River in Hong Kong. Data from 1987 to 2019 were collected at the downstream TR-12 monitoring station in Tai Po District, Hong Kong. Pearson correlation coefficients were used to construct 11 input parameter combinations, the data were split 7:3 into a training set and a test set, and the models were assessed with five statistical and visual evaluation metrics. The results show that biochemical oxygen demand (BOD) and chemical oxygen demand (COD) had the greatest influence on WQI prediction, and that the Stacking hybrid model performed best.
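The abstract does not say how the Pearson coefficients were turned into input combinations, so the following is only a minimal sketch under stated assumptions: parameters are ranked by the absolute value of their correlation with WQI and added one at a time to form nested input sets, and the 7:3 split is done by shuffling row indices. The data are synthetic placeholders rather than TR-12 records, and the helper names (`pearson`, `nested_combinations`, `split_70_30`) are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def nested_combinations(features, target):
    """Rank parameters by |r| with the target, then grow the input set one by one.

    Assumption: the paper's exact rule for building its 11 combinations is not
    given in the abstract; this nested scheme is one plausible reading.
    """
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return [ranked[:k] for k in range(1, len(ranked) + 1)]


def split_70_30(n_rows, seed=0):
    """Shuffle row indices and split them 7:3 into training and test indices.

    Assumption: a random split; the abstract does not say whether the split
    was random or chronological.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(round(0.7 * n_rows))
    return indices[:cut], indices[cut:]


# Synthetic placeholder data, NOT measurements from the TR-12 station.
rng = random.Random(42)
n = 200
bod = [rng.uniform(1, 10) for _ in range(n)]
cod = [rng.uniform(5, 40) for _ in range(n)]
ph = [rng.uniform(6.5, 8.5) for _ in range(n)]
# Hypothetical target that depends on BOD and COD only, plus noise.
wqi = [100 - 5 * b - c + rng.gauss(0, 2) for b, c in zip(bod, cod)]

features = {"BOD": bod, "COD": cod, "pH": ph}
combos = nested_combinations(features, wqi)
train_idx, test_idx = split_70_30(n)

for combo in combos:
    print(combo)
print(len(train_idx), len(test_idx))
```

Each entry of `combos` would then be used as the feature set for one model run, with every model fitted on the `train_idx` rows and scored on the `test_idx` rows.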
Authors
GU Zhi-xin (谷志新)
GUO Yu (郭宇)
College of Information and Computer Engineering, Northeast Forestry University, Harbin 150038, China
Source
Water Resources and Power (《水电能源科学》)
Peking University Core Journal (北大核心)
2022, No. 5, pp. 46-49 (4 pages)
Funding
Fundamental Research Funds for the Central Universities (2572017CB08)
National Natural Science Foundation of China (51975114)
Keywords
Lam Tsuen River
water quality index (WQI)
ensemble machine learning
prediction