Abstract
The ETL process is an important step in building a data warehouse. Most existing ETL systems process large volumes of data inefficiently. This paper improves the original ETL architecture based on a practical application of ETL in a public data center. Given the confidential nature of public data, different data acquisition methods are designed; domain knowledge is incorporated when setting data transformation rules to ensure data quality; and load balancing is implemented among the front-end machines, while the different steps of data transformation are assigned to two ETL servers, to ensure extraction and transformation efficiency. Experiments show that the resulting ETL system achieves good efficiency.
Source
《计算机应用与软件》
CSCD
2011, Issue 10, pp. 167-169, 190 (4 pages in total)
Computer Applications and Software