
Parallel Time-Space Processing Model Based Fast N-body Simulation (cited by: 3)
Abstract: Recent advances in graphics processing units (GPUs) make high-performance general-purpose computing available at low cost. The GPU programming models CUDA (Compute Unified Device Architecture) and OpenCL (Open Computing Language) provide rich, C-like application programming interfaces (APIs) that let programmers exploit the parallel computing power of GPUs. This paper uses graphics hardware to accelerate N-body gravitational simulation. It analyzes existing GPU implementations of N-body simulation under a new GPU processing model, the parallel time-space processing model (PTPM), and on that basis proposes a new algorithm for fast N-body simulation on GPUs, implemented on an AMD Radeon HD 5850. Experimental results show a speedup of about 400 times over a CPU implementation and of 2 to 5 times over existing GPU implementations.
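Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, 2011, No. 11, pp. 1006-1013 (8 pages)
Funding: National Natural Science Foundation of China (Nos. 61103068, 61174158); NSFC-Microsoft Research Asia Joint Funding Project (No. 60970155); Doctoral Program Foundation of the Ministry of Education (No. 20090072110035); Shanghai Excellent Academic Leaders Program (No. 10XD1404400); Open Fund of the State Key Laboratory of High-end Server and Storage Technology (No. 2009HSSA06); Tongji University Youth Fund (Nos. 0800219105, 2009KJ030)
Keywords: N-body; parallel computing; general purpose graphic processing unit (GPGPU); time-space model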

