Method for identifying and repairing abnormal data in environmental monitoring systems (cited by: 12)
Abstract To accurately capture how enterprise atmospheric monitoring data vary and to obtain complete, near-true atmospheric environmental monitoring data, a combined abnormal-data detection algorithm (SWDS-LOF) is proposed to detect outliers, and polynomial fitting is used to correct the abnormal data. For cases where the corrected data are still incomplete, a multivariate seasonal time series model (SARIMA) is proposed to recover missing values in atmospheric monitoring data that show a pronounced seasonal trend. SARIMA models are also built between groups of highly correlated pollutant sensors, so that the relationship between highly correlated pollutants allows the missing portion to be recovered from complete monitoring data, finally yielding complete and accurate monitoring data. An empirical analysis using monitoring data from an automobile company verifies the proposed algorithm and model: SWDS-LOF and SARIMA detected and corrected 97% of the abnormal data, all data in the missing segments were recovered, and the recovery accuracy reached 94.60%. This indicates that, for atmospheric monitoring data, the algorithm and model are highly accurate and can provide effective technical support for the completeness and reliability of such data.

To accurately capture how corporate atmospheric monitoring data change and to obtain complete atmospheric environmental monitoring data that are close to the true values, a combined abnormal-data detection algorithm (SWDS-LOF) based on the sliding window model, DBSCAN and LOF is proposed. SWDS-LOF first encapsulates the dynamic data stream into a static data set through the sliding window model. The DBSCAN method is then used to cluster the data and make an initial identification of the outliers, which reduces time and space complexity compared with other algorithms and narrows the neighborhood query range for the traditional LOF method. The algorithm therefore has higher accuracy than the other algorithms. After the abnormal data are eliminated, polynomial fitting is used to establish a functional relationship with a high degree of fit for the remaining normal values, and this relationship is used to correct the eliminated abnormal data so as to obtain approximately accurate monitoring data. Next, because the detected and corrected data are still incomplete, the seasonal autoregressive integrated moving average (SARIMA) model is proposed to recover missing atmospheric monitoring data that show an obvious seasonal trend. At the same time, another pollutant that is highly correlated with the pollutant whose monitoring data are partly missing is identified, and the two form a pollutant sensor group. The data before the missing point, the monitoring data of the highly correlated pollutant over the missing period, and the original pollutant data are then used to build a SARIMA model that restores the missing data, so as to eventually obtain complete and accurate monitoring data, which can help to verify the accuracy of the proposed model and the corresp…
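The detection and correction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' SWDS-LOF implementation: the DBSCAN pre-clustering stage is omitted, LOF is computed by brute force on 1-D values inside each window, and the function names (`lof_scores`, `polyfit`, `detect_and_repair`) and every parameter value (`window`, `step`, `k`, `threshold`, `degree`) are assumptions chosen for illustration only.

```python
from math import inf


def lof_scores(values, k=3):
    """Brute-force Local Outlier Factor for a small list of 1-D values."""
    n = len(values)  # assumes n > k
    neigh, kdist = [], []
    for i in range(n):
        # Distances from point i to every other point, nearest first.
        d = sorted((abs(values[i] - values[j]), j) for j in range(n) if j != i)
        neigh.append(d[:k])        # simplification: exactly k neighbours, ties cut
        kdist.append(d[k - 1][0])  # k-distance of point i
    lrd = []
    for i in range(n):
        # Reachability distance of i from neighbour j: max(k-distance(j), d(i, j)).
        reach = sum(max(kdist[j], dist) for dist, j in neigh[i])
        lrd.append(k / reach if reach > 0 else inf)
    scores = []
    for i in range(n):
        if lrd[i] == inf:
            scores.append(1.0)  # point sits among exact duplicates: treat as inlier
        else:
            scores.append(sum(lrd[j] for _, j in neigh[i]) / (k * lrd[i]))
    return scores


def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    m = degree + 1
    # Augmented normal equations: sum(x^(r+c)) * coeff_c = sum(y * x^r).
    a = [[sum(x ** (r + c) for x in xs) for c in range(m)]
         + [sum(y * x ** r for x, y in zip(xs, ys))] for r in range(m)]
    for col in range(m):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(m):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * v for u, v in zip(a[r], a[col])]
    return [a[r][m] / a[r][r] for r in range(m)]


def detect_and_repair(series, window=10, step=10, k=3, threshold=2.0, degree=2):
    """Flag LOF outliers window by window, then refit them with a polynomial."""
    n = len(series)
    starts = list(range(0, max(n - window, 0) + 1, step))
    if starts[-1] + window < n:
        starts.append(n - window)  # make sure the tail of the series is covered
    flagged = set()
    for s in starts:
        chunk = series[s:s + window]
        for offset, score in enumerate(lof_scores(chunk, k)):
            if score > threshold:
                flagged.add(s + offset)
    # Fit the polynomial to the remaining (unflagged) points only.
    clean_x = [i for i in range(n) if i not in flagged]
    coeffs = polyfit(clean_x, [series[i] for i in clean_x], degree)
    repaired = list(series)
    for i in flagged:
        repaired[i] = sum(c * i ** p for p, c in enumerate(coeffs))
    return sorted(flagged), repaired


# Synthetic example: a linear trend with one injected spike at index 7.
series = [10.0 + 0.5 * i for i in range(20)]
series[7] = 60.0
flagged, repaired = detect_and_repair(series)
```

On this synthetic series the spike at index 7 is the only point flagged, and the fitted polynomial replaces it with a value on the underlying trend. The abstract does not state the window length, LOF threshold, or polynomial degree used in the study, so the values above would need tuning on real monitoring data.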
Authors LU Qiu-qin (陆秋琴); WEI Wei (魏巍); HUANG Guang-qiu (黄光球) (School of Management, Xi'an University of Architecture & Technology, Xi'an 710055, China)
Source Journal of Safety and Environment (《安全与环境学报》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2021, No. 3, pp. 1300-1310 (11 pages)
Funding National Natural Science Foundation of China (71874134); Natural Science Basic Research Program of Shaanxi Province, Key Project (2019JZ-30).
Keywords environmental engineering; atmospheric environment; monitoring data; fault tree; SWDS-LOF algorithm; polynomial fitting; SARIMA model