Storage Optimization for Struct Vectorization (cited by: 2)
Abstract: Mainstream optimizing compilers currently cannot use existing loop transformation techniques to optimize loops that contain references to struct computation types. Such types appear in a large number of scientific computing programs, so this limitation seriously restricts performance gains. This paper proposes a storage optimization method for struct vectorization that approaches the problem from two directions. First, because the in-memory layout of a struct contains "gaps", a storage pre-optimization algorithm is proposed that compresses the storage space of the struct. Second, to expose more vectorization opportunities, a dynamic data regrouping optimization for struct arrays within a program unit (PU) is proposed: it changes how a struct array is stored in memory within the current PU, so that loops containing struct array references can be vectorized. Experimental results show that the proposed method gives a clear performance improvement for some applications in the SPEC CPU benchmark suite.
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, PKU Core (北大核心), 2016, No. 9, pp. 1889-1897 (9 pages)
Funding: Supported by the National Natural Science Foundation of China (61472447)
Keywords: structure array; data regrouping; storage optimization; SIMD vectorization
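
The two steps named in the abstract can be illustrated by hand in C. This is only a minimal sketch under stated assumptions, not the paper's algorithm: the paper applies the transformations inside the compiler, whereas here they are written manually. All type and function names (particle_padded, particle_packed, particle_soa, regroup, scale_add, scatter) are hypothetical, and the exact amount of padding depends on the target ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: this field order forces padding after each char
 * on typical ABIs, leaving "gaps" in the stored struct. */
struct particle_padded {
    char   tag;
    double x;
    char   flag;
    double y;
};

/* Same fields ordered largest-first, so the two chars share one slot.
 * This stands in for the storage pre-optimization step. */
struct particle_packed {
    double x;
    double y;
    char   tag;
    char   flag;
};

/* Regrouped layout: each hot field becomes its own contiguous array. */
struct particle_soa {
    double *x;
    double *y;
};

/* Copy the hot fields out of the struct array on entry to the region.
 * Returns 0 on success, -1 if allocation fails. */
int regroup(const struct particle_packed *aos, size_t n,
            struct particle_soa *soa)
{
    assert(aos != NULL && soa != NULL);
    soa->x = malloc(n * sizeof *soa->x);
    soa->y = malloc(n * sizeof *soa->y);
    if (soa->x == NULL || soa->y == NULL) {
        free(soa->x);
        free(soa->y);
        soa->x = soa->y = NULL;
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        soa->x[i] = aos[i].x;
        soa->y[i] = aos[i].y;
    }
    return 0;
}

/* Unit-stride loop over plain arrays: a candidate for auto-vectorization,
 * unlike the strided accesses aos[i].x and aos[i].y. */
void scale_add(struct particle_soa *soa, size_t n, double a)
{
    double *restrict x = soa->x;
    const double *restrict y = soa->y;
    for (size_t i = 0; i < n; i++) {
        x[i] = a * x[i] + y[i];
    }
}

/* Write the modified field back and release the temporary arrays. */
void scatter(struct particle_packed *aos, size_t n, struct particle_soa *soa)
{
    for (size_t i = 0; i < n; i++) {
        aos[i].x = soa->x[i];
    }
    free(soa->x);
    free(soa->y);
    soa->x = soa->y = NULL;
}
```

A caller would invoke regroup once before the hot loops of a region, run loops such as scale_add on the regrouped arrays, and call scatter before leaving the region. The copy in and copy out are pure overhead, so a scheme like this is only likely to pay off when the vectorized loops do enough work to amortize it.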