Abstract
Massive data acquisition in fully mechanized working faces currently suffers from poor real-time performance and integrity, time-consuming cleaning of abnormal data, and high data mining latency. As a result, the utilization rate of fully mechanized mining data is low, and the data cannot support management in issuing decision-making instructions in real time. To address these problems, a massive data mining and analysis platform for fully mechanized working faces is designed. The platform consists of a data source layer, a data acquisition and storage layer, a data mining layer, and a front-end application layer. In the data source layer, the various hardware devices on the working face provide the raw data. The data acquisition and storage layer uses an OPC UA gateway to collect monitoring information from underground sensors in real time, and then stores the data in the InfluxDB storage engine through the MQTT protocol and a RESTful interface. The data mining layer uses the Hive data engine and the Yarn resource manager to filter out abnormal data caused by on-site interference during acquisition, which resolves the local disorder in acquisition sequence caused by network latency. It also uses the Spark distributed mining engine to mine the potential value of the massive working-condition data from the working face equipment group and to increase the running speed of the data mining models. The front-end application layer links visualization components to the back-end database and exchanges data with the back end in real time through AJAX, providing a visual display of model mining results and all kinds of monitoring data. The test results show that the platform fully ensures the real-time performance and integrity of data acquisition. Cleaning efficiency is improved by a factor of 5 compared with a standalone MySQL query engine, and mining efficiency is improved by a factor of 4 compared with a standalone Python mining engine.
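The cleaning step described in the abstract (filtering interference-induced abnormal data and restoring the acquisition order disturbed by network latency) can be illustrated with a minimal single-machine sketch. This is not the platform's Hive/Yarn implementation: the `Reading` record, the `clean_readings` function, the fixed valid range, and the sensor name are all hypothetical, and it assumes each record carries a timestamp stamped at the source.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Reading:
    sensor_id: str    # hypothetical sensor identifier
    timestamp: float  # acquisition time in seconds, assumed stamped at the source
    value: float      # measured value


def clean_readings(readings: Iterable[Reading], low: float, high: float) -> List[Reading]:
    """Drop out-of-range values and duplicates, then restore time order."""
    seen = set()
    kept = []
    for r in readings:
        if not (low <= r.value <= high):
            continue  # assumption: out-of-range values are interference-induced anomalies
        key = (r.sensor_id, r.timestamp)
        if key in seen:
            continue  # skip retransmitted duplicates of the same sample
        seen.add(key)
        kept.append(r)
    # Network latency can deliver records out of order; sort by source timestamp.
    return sorted(kept, key=lambda r: (r.sensor_id, r.timestamp))


# Illustrative input: one late arrival, one anomaly, one duplicate.
raw = [
    Reading("support_01", 2.0, 31.5),
    Reading("support_01", 1.0, 30.9),   # arrived late
    Reading("support_01", 3.0, 999.0),  # outside the assumed valid range
    Reading("support_01", 2.0, 31.5),   # duplicate
]
cleaned = clean_readings(raw, low=0.0, high=60.0)
```

In a distributed setting the same two operations (a range filter and an ordering by source timestamp) would be expressed as queries over partitioned data rather than an in-memory loop; the threshold rule here is only a stand-in for whatever anomaly criterion the platform actually applies.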
Authors
王宏伟
杨焜
付翔
李进
贾思锋
WANG Hongwei; YANG Kun; FU Xiang; LI Jin; JIA Sifeng (Center of Shanxi Engineering Research for Coal Mine Intelligent Equipment, Taiyuan University of Technology, Taiyuan 030024, China; College of Mining Engineering, Taiyuan University of Technology, Taiyuan 030024, China; College of Mechanical and Vehicle Engineering, Taiyuan University of Technology, Taiyuan 030024, China)
Source
《工矿自动化》
CSCD
PKU Core (Peking University Chinese Core Journals)
2023, No. 5, pp. 30-36, 126 (8 pages in total)
Journal of Mine Automation
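Funding
National Natural Science Foundation of China (52274157)
Shanxi Province Open-Competition ("Jiebang") Bidding Project (20201101005)
Key Special Project of the "Science and Technology Revitalizing Inner Mongolia" Initiative (2022EEDSKJXM010).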
Keywords
fully mechanized working face
massive data
data mining
data acquisition
data storage
data cleaning
data visualization