Research on the Irregular Memory Access Problem in Program Vectorization (cited by: 2)
Abstract: Existing program vectorization methods usually support only contiguous memory access and cannot handle non-contiguous access. To bring vector parallelism to more programs, this paper proposes a method for handling irregular memory access during vectorization. The method detects and classifies memory access features, gives a corresponding vectorization scheme for each feature, and provides a cost-benefit analysis to ensure that vectorization is profitable. Experimental results show that the method effectively improves vectorization capability and allows programs with complex memory access patterns to be vectorized.
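Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2015, No. 12, pp. 86-90 (5 pages)
Funding: National High Technology Research and Development Program of China ("863" Program) (2009AA01220); "Core Electronic Devices, High-end General Chips and Basic Software" ("HeGaoJi") National Major Project (2009zx10036-001-001)
Keywords: non-contiguous memory access; vectorization; memory access feature; data reorganization; array memory access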
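
The abstract says the method classifies memory accesses by their features before choosing a vectorization scheme, but it does not give the categories or the detection algorithm. As a rough illustration of the idea only, the sketch below sorts the sequence of array indices touched by consecutive loop iterations into three hypothetical categories: contiguous (stride 1), strided (any other constant stride), and irregular (no constant stride). The names `access_kind` and `classify_access` are assumptions made for this example, and a real compiler would analyze subscript expressions statically rather than inspect concrete index values.

```c
#include <stddef.h>

/* Hypothetical categories for illustration; the paper's own
 * classification is not spelled out in the abstract. */
typedef enum {
    ACCESS_CONTIGUOUS, /* a[i]: unit stride, maps to a plain vector load */
    ACCESS_STRIDED,    /* a[k*i + c]: constant non-unit stride */
    ACCESS_IRREGULAR   /* a[b[i]]: no constant stride, e.g. indirect access */
} access_kind;

/* Classify the element indices touched by n consecutive iterations.
 * idx[i] is the array index accessed in iteration i. */
access_kind classify_access(const long *idx, size_t n)
{
    if (n < 2)
        return ACCESS_CONTIGUOUS; /* a single access is trivially contiguous */

    long stride = idx[1] - idx[0];

    /* Every later pair of neighbours must repeat the first stride. */
    for (size_t i = 2; i < n; i++) {
        if (idx[i] - idx[i - 1] != stride)
            return ACCESS_IRREGULAR;
    }
    return stride == 1 ? ACCESS_CONTIGUOUS : ACCESS_STRIDED;
}
```

Under these assumptions, the index sequence 0, 1, 2, 3 is classified as contiguous, 0, 2, 4, 6 as strided, and 3, 0, 7, 1 as irregular. A strided access is the kind of case where data reorganization (loading full vectors and shuffling out the needed elements) might pay off, which is the sort of decision a cost-benefit analysis would have to make.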