
A Computation Partition Based on Uniform Partitioning Schemes for Parallel Loops
Abstract: Computation partitioning is one of the most important problems in parallel compilation and optimization. For parallel loops whose data distribution has already been determined, a computation partition algorithm based on a subset of uniform partitioning schemes is proposed. The method for obtaining this subset is given, along with an algorithm that selects the optimal scheme by jointly considering communication and load balance. Experimental results show that, for parallel loops, this algorithm is simpler and more effective than several previous algorithms, and that the p_HPF compiler, which adopts this algorithm, achieves good speedup and efficiency on data-parallel applications. The compiler has been applied in the petroleum industry.
Source: Journal of Software (《软件学报》), 2003, No. 3, pp. 362-368 (7 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60173004.
Keywords: uniform partitioning set; parallel loop; computation partition; parallel compilation; compiler; parallel computation; node program

