
Partition Method for Automatic Parallelization of Irregular Problems (cited by: 1)
Abstract: Many large-scale scientific applications contain irregular problems. In automatic parallelization for distributed memory, prior work has had difficulty partitioning the loops and arrays of irregular problems at compile time. This paper proposes a partition method that automatically finds computation and data decompositions for a common class of irregular problems. The method partitions the data space of irregular arrays at compile time by means of the computation decomposition, and it reduces array redistribution by exploiting the correlation among regular arrays. It uses the computation decomposition and the access expressions of array references to distribute the data accessed through irregular arrays to the processors, and it uses an array redistribution graph to find consistent decompositions across loops. Experimental results show that the method is effective and that the test applications achieved the expected speedup.
Affiliation: Information Engineering University
Source: Journal of Information Engineering University, 2013, No. 2, pp. 235-242 (8 pages)
Funding: National 863 Program of China (2009AA01120 2009ZX10036-001-001)
Keywords: automatic parallelization; computation decomposition; irregular loops; irregular arrays
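
The abstract's idea of deriving a data decomposition from a computation decomposition plus an array access expression can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the paper: the loop form `y[index[i]] += x[i]`, the block distribution of iterations, and the names `block_owner`, `derive_data_decomposition`, and `shared_elements`. The sketch also enumerates a concrete index array, so it does not reproduce the paper's compile-time analysis.

```python
# Illustrative sketch only: names, loop form, and the block distribution
# are assumptions, not the paper's algorithm.

def block_owner(i, n_iters, n_procs):
    """Computation decomposition: assign iteration i to a processor by blocks."""
    block = -(-n_iters // n_procs)  # ceiling division
    return i // block

def derive_data_decomposition(index, n_procs):
    """For the loop `for i: y[index[i]] += x[i]`, map each processor to the
    elements of y touched by the iterations it owns."""
    touched = {p: set() for p in range(n_procs)}
    for i, target in enumerate(index):
        touched[block_owner(i, len(index), n_procs)].add(target)
    return touched

def shared_elements(touched):
    """Elements accessed by more than one processor (candidates for communication)."""
    seen, shared = set(), set()
    for elems in touched.values():
        shared |= seen & elems
        seen |= elems
    return shared

index = [0, 2, 2, 5, 1, 5, 3, 4]  # hypothetical indirection array
touched = derive_data_decomposition(index, n_procs=2)
print(touched)
print(shared_elements(touched))
```

In this toy case the data placement follows the iteration owner, and any element that appears in more than one processor's set marks a place where a real scheme would need communication or redistribution.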


