Research on hourly PM2.5 concentration prediction of random forest based on optimal selection of surrounding stations

Cited by: 6
Abstract: PM2.5 is a major air pollutant that threatens human health, so effective prediction and timely warning are important. Many studies have shown that random forest (RF) models that incorporate information from surrounding stations perform well in predicting PM2.5 at a single station. However, targeted research on how to select the surrounding stations is still lacking, and some existing selection methods are subjective. This paper proposes a method for optimizing the selection of surrounding stations based on time-lag cross-correlation (TLCC) analysis. Taking the Shanghai Shiwuchang air quality monitoring station (a national-level station) as an example, a set of RF regression models was constructed to predict the PM2.5 concentration at the station over the next 1 to 24 h, and the importance of each input factor in the prediction models was compared. The current PM2.5 concentration at the target station was the most important input for predictions 1 to 16 h ahead, while wind direction, a meteorological variable, was the most important input for predictions 17 to 24 h ahead. As the forecast horizon lengthened, the importance ranking of PM2.5 information from surrounding stations rose markedly, and different stations affected predictions at different horizons to significantly different degrees, so surrounding stations should be treated individually and selected optimally when building the models. The comparison showed that models built with surrounding stations selected by the proposed method not only had an advantage in accuracy metrics such as RMSE (the RMSE of the 12 h and 24 h forecasts decreased by 11.8% and 13.3%, respectively), but also had a markedly lower false alarm ratio for pollution events, a metric of practical value (down by 16.1% and 25.6% for the 12 h and 24 h forecasts, respectively). The method therefore has potential for operational application in air pollution prediction and early warning.
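The abstract does not give the details of the TLCC-based selection (correlation measure, lag range, selection threshold), so the following is only a minimal sketch of the general idea under stated assumptions: for a given forecast horizon, each surrounding station's PM2.5 series is shifted by that many hours and correlated with the target station's series, and the stations with the highest lagged Pearson correlation are kept. All function names, the `top_k` cutoff, and the synthetic data are hypothetical and are not taken from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(target, neighbour, lag):
    """Correlate neighbour[t] with target[t + lag] for a non-negative lag in hours."""
    if lag == 0:
        return pearson(neighbour, target)
    return pearson(neighbour[:-lag], target[lag:])


def best_lag(target, neighbour, max_lag):
    """Return (lag, correlation) with the highest correlation over lags 0..max_lag."""
    scores = [(lagged_correlation(target, neighbour, k), k) for k in range(max_lag + 1)]
    corr, lag = max(scores)
    return lag, corr


def select_stations(target, neighbours, horizon, top_k):
    """Rank neighbours by correlation at lag == horizon and keep the top_k names.

    The top_k cutoff is an assumption made for illustration only.
    """
    scores = {name: lagged_correlation(target, series, horizon)
              for name, series in neighbours.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores


def make_demo_data(n=500, delay=3, seed=0):
    """Synthetic data: the target follows the 'upwind' station `delay` hours later."""
    rng = random.Random(seed)
    upwind = [rng.gauss(0, 1) for _ in range(n)]
    unrelated = [rng.gauss(0, 1) for _ in range(n)]
    target = [(upwind[t - delay] if t >= delay else 0.0) + rng.gauss(0, 0.1)
              for t in range(n)]
    return target, {"upwind": upwind, "unrelated": unrelated}


if __name__ == "__main__":
    target, neighbours = make_demo_data()
    print(best_lag(target, neighbours["upwind"], max_lag=6))
    print(select_stations(target, neighbours, horizon=3, top_k=1))
```

In a setup like the one the abstract describes, a selection of this kind would be repeated for each forecast horizon from 1 to 24 h, so that each RF model in the set could receive its own subset of surrounding stations.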
Authors: YAO Hongyan; SHI Runhe (Key Laboratory of Geographical Information Science, Ministry of Education, East China Normal University, Shanghai 200241; School of Geographic Sciences, East China Normal University, Shanghai 200241; Joint Laboratory for Environmental Remote Sensing and Data Assimilation, East China Normal University, Shanghai 200241; Joint Research Institute of Resources and Environment, East China Normal University, Shanghai 200062; Institute of Eco-Chongming, East China Normal University, Shanghai 202162)
Source: Acta Scientiae Circumstantiae (indexed in CAS, CSCD, and the Peking University Core Journals list), 2021, Issue 4, pp. 1565-1573 (9 pages)
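Funding: National Key Research and Development Program of China (No. 2016YFC1302602); Major Project of Philosophy and Social Sciences Research, Ministry of Education (No. 19JZD023); Science and Technology Innovation Action Plan of the Shanghai Science and Technology Commission (No. 19DZ1201505); Fundamental Research Funds for the Central Universities.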
Keywords: time lag cross-correlation; time series; atmospheric pollutants; machine learning