Abstract
To assess surface water quality in China more effectively, this paper applies ensemble methods and deep networks to model and analyze surface water data from the country's river basins and to study surface water quality indicators. The data come from the National Surface Water Environmental Quality Monitoring Network and comprise 1,620 samples from 13 basins, including the Yangtze River Basin and the Pearl River Basin. Because the selected dataset is class-imbalanced, random oversampling is first applied to balance the class sizes. Single-learner, multi-learner (ensemble), and neural network models are then built on the oversampled national dataset and compared. The comparison shows that the gradient boosting tree trained on the oversampled data achieves the best classification performance on this dataset. The paper therefore concludes that the oversampled gradient boosting tree model can classify China's surface water quality accurately while simplifying the existing, relatively complex set of monitoring items, and that it can offer practical suggestions for surface water monitoring in China.
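The random oversampling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `random_oversample`, the two features, the class labels, and the toy values are all hypothetical and chosen only to show how minority classes are resampled with replacement until every class matches the largest one.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())

    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Keep every original sample, then draw the extras with replacement.
        extras = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extras:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Made-up toy data, not taken from the paper: two numeric features per
# sample and an assumed water-quality class label.
X = [(7.1, 8.0), (7.3, 7.5), (6.9, 7.9), (7.0, 8.2), (8.4, 3.1), (6.2, 2.0)]
y = ["II", "II", "II", "II", "IV", "V"]

X_bal, y_bal = random_oversample(X, y, seed=42)
print(Counter(y_bal))
```

After resampling, each class has the same number of samples, and every added sample is an exact copy of an existing one. The balanced set would then be passed to a classifier such as a gradient boosting tree. A common practice is to oversample only the training split so that duplicated samples do not appear in the evaluation data; the abstract does not state how the authors split their data.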