摘要
针对最短路径算法处理大规模数据集低效的问题,提出了基于图形处理器(Graphics Processing Unit,GPU)加速的全源对最短路径并行算法.首先通过优化矩阵乘法算法实现了在工作组内和组间进行并行运算数据,然后减少了非规则行造成的工作项分支,最后降低了工作项对邻接矩阵计算条带存储资源的访问延时.实验结果表明,与基于AMD Ryzen5 1600X CPU的串行算法、基于开放多处理(Open Multi-Processing, OpenMP)并行算法和基于统一计算设备架构(Compute Unified Device Architecture, CUDA)并行算法相比,最短路径并行算法在开放式计算语言(Open Computing Language, OpenCL)架构下NVIDIA GeForce GTX 1 070计算平台上分别获得了196.35、36.76和2.25倍的加速比,验证了提出的并行优化方法的有效性和性能可移植性.
Aiming at the problem that the shortest path algorithm is inefficient in processing large-scale datasets,this paper proposes an all-pairs shortest paths parallel algorithm based on Graphics Processing Unit(GPU)acceleration.Firstly,the optimized matrix multiplication algorithm is used to realize the parallel computing data inter-workgroup and intra-workgroup.Then the branch of work-items caused by irregular rows is reduced.Finally,the access delay of work-items to the strip storage resources of adjacency matrix calculation is reduced.The experimental results show that compared with the performance of the serial algorithm based on AMD Ryzen51600X CPU,parallel algorithm based on Open Multi-Processing(OpenMP)and parallel algorithm based on Compute Unified Device Architecture(CUDA),the shortest path parallel algorithm obtains 196.35,36.76 and 2.25 times speedup in the NVIDIA GeForce GTX 1070 computing platform under the Open Computing Language(OpenCL)architecture respectively.The validity and performance portability of the proposed parallel optimization method are verified.
作者
肖汉
肖诗洋
李焕勤
周清雷
XIAO Han;XIAO Shi-yang;LI Huan-qin;ZHOU Qing-lei(School of Information Science and Technology,Zhengzhou Normal University,Zhengzhou 450044,Henan,China;School of Civil Engineering,Southeast University,Nanjing 211189,Jiangsu,China;School of Computer and Artificial Intelligence,Zhengzhou University,Zhengzhou 450001,Henan,China)
出处
《云南大学学报(自然科学版)》
CAS
CSCD
北大核心
2023年第5期1022-1032,共11页
Journal of Yunnan University(Natural Sciences Edition)
基金
国家自然科学基金(61572444)
河南省高等学校重点科研项目(22A520049).
关键词
最短路径
重复平方法
图形处理器
开放式计算语言
并行算法
shortest path
repeating square method
Graphics Processing Unit(GPU)
Open Computing Language(OpenCL)
parallel algorithm