Abstract

Nowadays even moderate GPUs on moderate servers have powerful parallel computing capability. In contrast to recent research efforts, a moderate server may be equipped with several high-end CPUs and a moderate GPU, so the GPU provides additional computing power rather than outperforming the CPUs. In this paper, we focus on Co-OLAP (Cooperated OLAP) processing on a moderate workstation to illustrate how to make a moderate GPU cooperate with powerful CPUs, and how to distribute data and computation across the balanced heterogeneous platforms to keep the Co-OLAP model simple and efficient. According to the real-world configuration, we propose a maximal high-performance data distribution model based on RAM size, GPU device memory size, the dataset schema, and a specially designed AIR (array index referencing) algorithm. The Co-OLAP model divides the dataset into host-memory-resident and device-memory-resident datasets, and OLAP processing is likewise divided into adaptive CPU and GPU computing to minimize data movement between CPU and GPU memories. The experimental results show that two Xeon six-core CPUs slightly outperform one NVIDIA Quadro 5000 GPU (352 CUDA cores) on the SF = 20 SSB (Star Schema Benchmark) dataset; the Co-OLAP model can assign a balanced workload and keep each platform simple and efficient.
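The abstract's AIR (array index referencing) idea can be illustrated with a minimal sketch: when a dimension table's surrogate keys are dense array offsets, a fact table's foreign key can be used as a direct positional index into a dimension attribute array, replacing a hash-join probe with an O(1) lookup. All table names and values below are illustrative assumptions, not data from the paper.

```python
# Minimal sketch of array index referencing (AIR) for a star join:
# dimension attributes live in plain arrays, and the fact table's
# foreign keys are treated as direct offsets into those arrays.
# All names and values are illustrative, not from the paper.

# Dimension table "date": surrogate keys 0..3 map to years by position.
dim_year = [1997, 1998, 1999, 2000]

# Fact table rows: (date_fk, revenue); date_fk indexes into dim_year.
fact = [(0, 10), (2, 20), (2, 5), (3, 7)]

def air_group_by_year(fact_rows, year_col):
    """Aggregate revenue per year via direct array index referencing."""
    totals = {}
    for fk, revenue in fact_rows:
        year = year_col[fk]          # AIR: positional lookup, no hash probe
        totals[year] = totals.get(year, 0) + revenue
    return totals

print(air_group_by_year(fact, dim_year))  # → {1997: 10, 1999: 25, 2000: 7}
```

Because the lookup is a plain array access, the same pattern maps naturally onto GPU threads, with each thread resolving one fact row's foreign keys against device-memory-resident dimension arrays.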
Source
Journal of East China Normal University (Natural Science)
CAS
CSCD
Peking University Core Journals
2014, No. 5, pp. 240-251 (12 pages in total)
Funding
The Fundamental Research Funds for the Central Universities (12XNQ072, 13XNLF01)