Abstract
As data warehouses grow in scale and the demand for real-time data rises, the performance requirements on the ETL process keep increasing. Based on an analysis of the parallel processing framework JPPF, this paper proposes an architecture for building a high-performance ETL system on JPPF, together with an algorithm for submitting ETL data processing tasks. Testing and performance comparison show that the approach has a clear advantage for ETL processes that contain large-scale computation tasks.
Source
Journal of Computer Applications (《计算机应用》)
CSCD
PKU Core Journal (北大核心)
2008, Issue S2, pp. 223-225, 270 (4 pages)
Funding
National High-Tech R&D Program of China, 863 Program (2006AA04Z166)
National Key Technology R&D Program of China (2007BAH19B01)
Keywords
Extract, Transform, Load (ETL)
data warehouse
parallel computing
grid computing