Abstract
To supply a data warehouse with high-quality data, the data must be preprocessed through extraction, transformation, and loading (Extraction-Transformation-Loading, ETL) before it is loaded into the warehouse. Complexity and usability are the two basic problems that constrain ETL systems. To address them, we present a unified architecture design for ETL processes, covering ETL metadata object modeling, ETL transformation function design, ETL task modeling, and a description language for the ETL task model (XTDL). Based on this architecture and design, we developed an ETL system named MSETL, whose purpose is to provide high-quality data for our multi-strategy data mining platform (MSMiner). MSETL offers a friendly interface and unified metadata management for ETL processes, including registering and deleting ETL transformation functions; constructing, executing, and deleting task models; and browsing task execution results.
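The management operations listed in the abstract (registering and deleting transformation functions; constructing and executing a task) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `TransformRegistry`, `EtlTask`, `trim_name`, and `upper_name` are hypothetical, and the sketch reflects neither MSETL's actual interfaces nor the XTDL language, which the abstract does not describe.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Transform = Callable[[Row], Row]


class TransformRegistry:
    """Hypothetical registry of named transformation functions."""

    def __init__(self) -> None:
        self._functions: Dict[str, Transform] = {}

    def register(self, name: str, fn: Transform) -> None:
        if name in self._functions:
            raise ValueError(f"transformation already registered: {name}")
        self._functions[name] = fn

    def delete(self, name: str) -> None:
        # Raises KeyError if the name was never registered.
        del self._functions[name]

    def get(self, name: str) -> Transform:
        return self._functions[name]


class EtlTask:
    """Hypothetical task model: an ordered list of transformation names."""

    def __init__(self, name: str, steps: List[str]) -> None:
        self.name = name
        self.steps = steps

    def execute(self, registry: TransformRegistry, rows: List[Row]) -> List[Row]:
        result = []
        for row in rows:
            current = dict(row)  # copy so the extracted input is not mutated
            for step in self.steps:
                current = registry.get(step)(current)
            result.append(current)
        return result


def trim_name(row: Row) -> Row:
    return {**row, "name": row["name"].strip()}


def upper_name(row: Row) -> Row:
    return {**row, "name": row["name"].upper()}


registry = TransformRegistry()
registry.register("trim_name", trim_name)
registry.register("upper_name", upper_name)

task = EtlTask("clean_names", ["trim_name", "upper_name"])
cleaned = task.execute(registry, [{"id": 1, "name": "  alice "}])
```

Storing a task as a list of registered names rather than function objects keeps the task description pure data, which is the kind of separation a task description language would presumably rely on.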
Source
《系统仿真学报》
CAS
CSCD
2004, Issue 5, pp. 907-911, 914 (6 pages in total)
Journal of System Simulation
Funding
National Natural Science Foundation of China (60173017, 90104021)
Beijing Natural Science Foundation (4011003)