Partition Method for Automatic Parallelization of Irregular Problems
(Original title: 自动并行化中不规则问题的划分方法)

Cited by: 1

Abstract: Many large-scale scientific applications contain irregular problems. In automatic parallelization for distributed memory, prior work has had difficulty partitioning the loops and arrays of irregular problems at compile time. This paper proposes a partition method that automatically finds computation and data decompositions for a common class of irregular problems. It can allocate the data space of irregular arrays at compile time through the computation decomposition, and it reduces array redistribution by exploiting the correlation between regular arrays. The method assigns the data accessed through irregular arrays to processors using the computation decomposition and the access expressions of the array references, and it searches for consistent decompositions across loops using an array redistribution graph. Experimental results demonstrate the effectiveness of the method, which achieved the expected speedup on the test cases.
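The idea of deriving a data decomposition from a computation decomposition and an access expression can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a single loop with an irregular reference of the form `x[idx[i]]`, a block decomposition of the iterations, and an index array whose contents are available as ordinary data. The names `block_owner`, `partition_irregular`, `idx`, and `parts` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def block_owner(i: int, n: int, p: int) -> int:
    """Processor that executes iteration i when n iterations are split
    into contiguous blocks over p processors (assumed decomposition)."""
    block = (n + p - 1) // p  # ceiling division: iterations per processor
    return i // block


def partition_irregular(idx: List[int], p: int) -> Dict[int, Set[int]]:
    """For a loop `for i in range(n): ... x[idx[i]] ...`, collect the
    elements of x that each processor touches, following the iteration
    (computation) decomposition through the access expression idx[i]."""
    n = len(idx)
    needed: Dict[int, Set[int]] = {proc: set() for proc in range(p)}
    for i in range(n):
        needed[block_owner(i, n, p)].add(idx[i])
    return needed


# Hypothetical index array for an 8-iteration loop on 2 processors.
idx = [3, 0, 3, 5, 1, 1, 4, 2]
parts = partition_irregular(idx, 2)
print(parts)
```

With these inputs, processor 0 runs iterations 0 to 3 and processor 1 runs iterations 4 to 7, so each processor's set holds exactly the elements of `x` referenced in its own block of iterations.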

Authors: Ding Rui (丁锐), Zhao Rongcai (赵荣彩), Liu Xiaoxian (刘晓娴), Fu Liguo (傅立国)

Affiliation: Information Engineering University (信息工程大学)

Source: Journal of Information Engineering University (《信息工程大学学报》), 2013, No. 2, pp. 235-242 (8 pages)

Funding: National 863 Program of China (2009AA01120 2009ZX10036-001-001)

Keywords: automatic parallelization; computation decomposition; irregular loops; irregular arrays

Classification: TP314 [Automation and Computer Technology: Computer Software and Theory]






