GPU Performance Optimization Targeting OpenCL Model (面向OpenCL模型的GPU性能优化)

Cited by: 21
Abstract: The high performance-to-cost ratio of GPUs is drawing a growing share of general-purpose computing. To fully exploit the general-purpose computing capability of GPUs on heterogeneous processing platforms, this paper proposes performance optimization methods targeting the OpenCL model. The approach builds a polyhedral representation of the source program and uses it to optimize and allocate the GPU's global memory and fast memory separately. By examining the program's memory access patterns, it identifies access instances that can be vectorized, grouping them by means of graph coloring. Data space transformation then converts irregular access patterns so that vector data types can be used, which improves off-chip memory bandwidth utilization. In addition, data reuse in the program is detected, and data are allocated to distinct fast memory regions according to both the properties of the accesses and the characteristics of the OpenCL memory model, which improves the efficiency of on-chip memory use. Experiments on 6 benchmark programs show that the optimized programs run 1.6x to 8.4x faster than the unoptimized versions, which demonstrates the effectiveness of the proposed methods.
Authors: Chen Gang (陈钢), Wu Baifeng (吴百锋)
Source: Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2011, No. 4, pp. 571-581 (11 pages)
Funding: Shanghai Leading Academic Discipline Project (B114); AMD University Cooperation Program
Keywords: OpenCL; GPU; performance optimization; heterogeneous processing; general-purpose computing; polyhedral representation
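
The data space transformation mentioned in the abstract can be pictured with a small sketch. It is not the paper's polyhedral algorithm. It assumes one simple case: a column-wise walk over a row-major matrix, which is strided, becomes contiguous after the data layout is transposed, so the accesses could be served by 4-wide vector loads (such as OpenCL's `float4`). All function names here are hypothetical.

```python
# Hypothetical illustration only; the paper derives such layout changes
# from a polyhedral representation, which is not reproduced here.

VECTOR_WIDTH = 4  # mirrors a 4-wide vector type such as OpenCL's float4


def column_access_offsets(col, n_rows, n_cols):
    """Linear offsets touched when reading one column of a row-major matrix."""
    return [row * n_cols + col for row in range(n_rows)]


def transformed_offsets(col, n_rows, n_cols):
    """Offsets of the same logical elements after the layout is transposed."""
    return [col * n_rows + row for row in range(n_rows)]


def is_vectorizable(offsets, width=VECTOR_WIDTH):
    """True if offsets split into aligned groups of `width` consecutive addresses."""
    if len(offsets) % width != 0:
        return False
    for start in range(0, len(offsets), width):
        group = offsets[start:start + width]
        if group[0] % width != 0:
            return False  # group does not start on a vector boundary
        if any(b - a != 1 for a, b in zip(group, group[1:])):
            return False  # addresses inside the group are not consecutive
    return True


def transpose_layout(data, n_rows, n_cols):
    """Move element (r, c) from offset r*n_cols+c to offset c*n_rows+r."""
    out = [None] * (n_rows * n_cols)
    for r in range(n_rows):
        for c in range(n_cols):
            out[c * n_rows + r] = data[r * n_cols + c]
    return out
```

In this sketch, the column walk fails the `is_vectorizable` check in the original layout and passes it after `transpose_layout`, while reading the same values.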
