Performance Study of Seismic Wave Propagation Simulation Using GPU

Cited by: 3
Abstract: High-performance numerical simulation of seismic wave propagation plays an important role in seismic research. This paper presents an optimized algorithm for simulating seismic wave propagation on the graphics processing unit (GPU). Building on an analysis of the parallelism in the elastodynamic equations and their finite-difference discretization, it focuses on optimizations that target the GPU architecture directly. The paper describes a GPU-specific domain decomposition method and two on-chip memory access schemes that maximize memory coalescing on the subdomain meshes. Experiments show that the optimized GPU implementation is more than 42 times faster than a dual-core CPU implementation optimized with Intel Threading Building Blocks (TBB).
Source: Journal of System Simulation (《系统仿真学报》), 2009, Issue S1, pp. 170-174 (5 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Keywords: numerical simulation of seismic wave propagation; seismic wave visualization; graphics processing unit (GPU); Compute Unified Device Architecture (CUDA)
