Abstract
Based on the compute unified device architecture (CUDA) for graphics processing units (GPUs), this paper describes the principles and methods of general-purpose computation on GPUs. A matrix multiplication experiment was carried out on a GeForce 8800 GT. The results show that computation slows down on both the GPU and the CPU as the matrix order increases. When the amount of data increased 100-fold, the running time increased only 3.95-fold on the GPU, compared with 216.66-fold on the CPU.
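The abstract does not reproduce the paper's kernel, so the following is only a minimal sketch of how matrix multiplication is commonly decomposed for a GPU: one thread per output element, each computing a single row-by-column dot product. It emulates that decomposition on the CPU in plain Python. The function names `matmul_element` and `matmul_grid`, the row-major flat-list layout, and the restriction to square matrices are all assumptions made for illustration, not details taken from the paper.

```python
def matmul_element(a, b, row, col, n):
    """Work of one hypothetical GPU thread: the dot product for C[row][col].

    a and b are n x n matrices stored as row-major flat lists (an assumption).
    """
    acc = 0.0
    for k in range(n):
        acc += a[row * n + k] * b[k * n + col]
    return acc


def matmul_grid(a, b, n):
    """Emulate a 2D thread grid on the CPU: one 'thread' per output element.

    In a CUDA kernel the two loops below would disappear; row and col would
    instead be derived from the built-in block and thread indices.
    """
    c = [0.0] * (n * n)
    for row in range(n):
        for col in range(n):
            c[row * n + col] = matmul_element(a, b, row, col, n)
    return c


if __name__ == "__main__":
    n = 2
    a = [1.0, 2.0, 3.0, 4.0]  # [[1, 2], [3, 4]]
    b = [5.0, 6.0, 7.0, 8.0]  # [[5, 6], [7, 8]]
    print(matmul_grid(a, b, n))  # [19.0, 22.0, 43.0, 50.0]
```

Because every output element is independent of the others, the n × n calls to `matmul_element` could in principle run concurrently, which is the property a GPU implementation exploits. On the CPU they run one after another, which is consistent with the much steeper growth in CPU time reported above.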
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
Peking University Core Journals (北大核心)
2009, Issue 14, pp. 3359-3361 (3 pages)
Funding
Nanjing Institute of Technology Research Start-up Fund for Introduced Talents (KXJ07056)
Keywords
graphics processing unit (GPU)
compute unified device architecture (CUDA)
general-purpose computation
matrix multiplication
matrix order