A prefetch architecture design based on graphics processor compression architecture
Abstract Memory access utilization has become one of the key bottlenecks limiting graphics processing unit (GPU) performance. In processor design, prefetching is one of the main methods for improving memory access utilization. Because graphics processors are memory-intensive, improving prefetch performance while minimizing the impact on the normal efficiency of the graphics pipeline has become an active research direction. Based on a lossless compression architecture for graphics processors, this paper proposes a prefetch architecture design for graphics processors. The design effectively improves memory access utilization in memory-intensive graphics pipelines without affecting the efficiency of the current graphics pipeline. Experimental results show that on the Godson GPU (GSGPU) graphics processor platform, compared with a traditional prefetch architecture, the cache hit rate increases by more than 15% for memory-intensive test programs. For test programs with little memory access activity, the design has no negative impact on the pipeline.
Authors ZHAO Shipeng (赵士彭), ZHANG Lizhi (张立志), ZHANG Longbing (章隆兵). Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049
Source Chinese High Technology Letters (《高技术通讯》), CAS, 2022, No. 4, pp. 351-357 (7 pages)
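Funding Supported by the National Natural Science Foundation of China (61521092, 61432016) and the Key Deployment Project of the Chinese Academy of Sciences (ZDRW-XH-2017-1).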
Keywords graphics processing unit (GPU); memory access subsystem; prefetch architecture; compression architecture
