Abstract
Building on a conventional parallel scheme, this work further exploits CPU computing capacity through instruction-level optimization: the CPU's vector arithmetic logic unit (VALU) and the SSE instruction set are used to complete four floating-point operations in parallel within one instruction cycle. The following conclusions are drawn: (1) for finite-difference wave-equation forward modeling, the SSE instruction set yields good acceleration and provides a second level of CPU speedup; (2) the speedup ratio after introducing SSE grows slowly as the volume of forward-modeling data increases, but because SSE processes at most four floating-point values per operation, the theoretical speedup cannot exceed 4; (3) SSE acceleration improves computational efficiency without additional hardware, so it is low-cost and widely applicable; (4) three-level parallelism on a single machine achieves the best execution efficiency, whereas the efficiency of three-level parallelism across multiple machines depends on the speed of the inter-machine data-transfer network. Numerical simulation experiments show that the new parallel scheme runs substantially faster than the conventional parallel scheme. © 2016, Editorial Department OIL GEOPHYSICAL PROSPECTING. All rights reserved.
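To make the "four floating-point operations per instruction" idea concrete, the following is a minimal sketch in C using SSE intrinsics. It is not the authors' code: the one-dimensional second-order stencil, the function names `laplacian_scalar` and `laplacian_sse`, and the parameter `inv_dx2` (the reciprocal of the squared grid spacing) are all assumptions chosen for illustration. The scalar routine updates one grid point per iteration, while the SSE routine updates four adjacent points per iteration.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, ... */

/* Scalar reference (hypothetical helper): second difference at interior points,
 * d2u[i] = (u[i-1] - 2*u[i] + u[i+1]) * inv_dx2. Boundary entries are left untouched. */
void laplacian_scalar(const float *u, float *d2u, size_t n, float inv_dx2)
{
    for (size_t i = 1; i + 1 < n; ++i)
        d2u[i] = (u[i - 1] - 2.0f * u[i] + u[i + 1]) * inv_dx2;
}

/* SSE version (hypothetical helper): each loop iteration computes the same
 * stencil for four adjacent grid points using 128-bit registers. */
void laplacian_sse(const float *u, float *d2u, size_t n, float inv_dx2)
{
    const __m128 two   = _mm_set1_ps(2.0f);     /* {2, 2, 2, 2} */
    const __m128 scale = _mm_set1_ps(inv_dx2);  /* broadcast 1/dx^2 */
    size_t i = 1;

    /* Vector body: reads u[i-1] .. u[i+4], so it requires i + 4 < n. */
    for (; i + 4 < n; i += 4) {
        __m128 left   = _mm_loadu_ps(u + i - 1);  /* u[i-1] .. u[i+2] */
        __m128 centre = _mm_loadu_ps(u + i);      /* u[i]   .. u[i+3] */
        __m128 right  = _mm_loadu_ps(u + i + 1);  /* u[i+1] .. u[i+4] */
        __m128 sum = _mm_add_ps(_mm_sub_ps(left, _mm_mul_ps(two, centre)), right);
        _mm_storeu_ps(d2u + i, _mm_mul_ps(sum, scale));
    }

    /* Scalar tail: remaining interior points when the count is not a multiple of 4. */
    for (; i + 1 < n; ++i)
        d2u[i] = (u[i - 1] - 2.0f * u[i] + u[i + 1]) * inv_dx2;
}
```

The sketch uses unaligned loads and stores for simplicity. Because the arithmetic instructions handle four values each while loads, stores, and the scalar tail add overhead, the achievable speedup of such a loop stays below the factor of 4 noted in the abstract.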
Source
《石油地球物理勘探》 (Oil Geophysical Prospecting)
EI
CSCD
Peking University Core Journal (北大核心)
2016, No. 5, pp. 1049-1054, 840 (6 pages in total)
Oil Geophysical Prospecting
Keywords
3D wave-equation forward modeling
Parallel computing
VALU acceleration
SSE instruction set
Acceleration
Computation theory
Computer circuits
Digital arithmetic
Efficiency
Logic circuits
Reconfigurable hardware