
A study on multisource heterogeneous data integration methods based on ETL technology
Abstract: When multi-source big data is fused, processing heterogeneous data accumulated over many years involves multiple indicators and multiple dimensions, and requires cleaning, transformation, mapping, and alignment. Data processing tools and methods continue to emerge, but cross-fusion of large volumes of data remains difficult. This paper therefore studies multi-source heterogeneous data fusion methods based on ETL (extract-transform-load) technology. It reviews common ETL tools and data fusion techniques, including tools for data extraction, transformation, and loading, as well as data processing algorithms. It then analyzes the problems that arise under flexible requirements, heavily overlapping business domains, and real-time data streams: heterogeneous data sources, differing data structures, and difficulties with data update frequency. To support modular extension of ETL tools and reuse of components, it examines modular design, separation of logic from parameters, a standardized component library, and configuration files in lightweight JSON format, with the aim of handling large-scale heterogeneous data more effectively. The authors report that the approach solves the cross-fusion problem in the multi-source big data fusion stage, which matters for improving data processing efficiency, ensuring data quality, and supporting deeper data analysis and decision-making.
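The abstract names four design ideas (modular design, separation of logic from parameters, a standardized component library, and lightweight JSON configuration) without showing how they fit together. The following is a minimal sketch of one way they could combine, not the authors' implementation. The component names (`rename`, `drop_missing`, `cast`), the config schema (`steps`, `use`, `params`), and the sample record fields are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical component library: maps a component name to a reusable function.
REGISTRY = {}


def component(name):
    """Register a transformation under a name that a JSON config can refer to."""
    def register(fn):
        REGISTRY[name] = fn
        return fn
    return register


@component("rename")
def rename(rows, mapping):
    # Align field names from different sources onto one target schema.
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]


@component("drop_missing")
def drop_missing(rows, field):
    # Cleaning step: discard rows where the field is absent or empty.
    return [row for row in rows if row.get(field) not in (None, "")]


@component("cast")
def cast(rows, field, to):
    # Conversion step: coerce one field to a target type named in the config.
    casters = {"int": int, "float": float, "str": str}
    return [{**row, field: casters[to](row[field])} for row in rows]


def run_pipeline(rows, config_text):
    """Apply the steps listed in a JSON config, in order.

    The logic lives in the registered components; the parameters live in the
    config, so a new source needs a new config rather than new code.
    """
    config = json.loads(config_text)
    for step in config["steps"]:
        fn = REGISTRY[step["use"]]
        rows = fn(rows, **step.get("params", {}))
    return rows


# Assumed config for one yearly source whose column is named "stu_cnt".
CONFIG = """
{
  "steps": [
    {"use": "rename", "params": {"mapping": {"stu_cnt": "students"}}},
    {"use": "drop_missing", "params": {"field": "students"}},
    {"use": "cast", "params": {"field": "students", "to": "int"}}
  ]
}
"""

source_rows = [
    {"school": "A", "stu_cnt": "120"},
    {"school": "B", "stu_cnt": ""},
]
result = run_pipeline(source_rows, CONFIG)
print(result)  # [{'school': 'A', 'students': 120}]
```

Under these assumptions, a source with a different layout would be handled by writing another config that reuses the same registered components, which is the kind of component reuse the abstract describes.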
Authors: YANG Guoli; JIANG Shuming (Inspur Genersoft Co., Ltd., Jinan 250101, China; Information Research Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China)
Source: Journal of Qilu University of Technology (《齐鲁工业大学学报》), CAS, 2024, No. 4, pp. 18-24 (7 pages)
Funding: National Key Research and Development Program of China (2019YFB1404700).
Keywords: educational statistics; data mining; extract-transform-load; software engineering
