Convolutional neural network parallel acceleration design based on FPGA (基于FPGA的卷积神经网络并行加速设计)

Cited by: 7
Abstract: To improve the speed and energy efficiency of deep convolutional network algorithms running on embedded platforms with limited resources and power budgets, a convolution parallel acceleration scheme based on a field programmable gate array (FPGA) is proposed. Fusing the convolutional layer with the batch normalization (BN) layer reduces computational complexity; data tiling reduces on-chip storage consumption; data reuse and parallel computation increase operation speed and reduce system hardware overhead; and design space exploration finds the degree of computational parallelism that best fits the hardware resource constraints. Experimental results show that at a working frequency of 100 MHz the accelerator reaches a peak computing performance of 52.56 GFLOPS, 4.1 times that of the CPU, while consuming only 9.9% of the energy of the GPU. Compared with other FPGA solutions, the overall performance shows a measurable improvement.
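The first technique named in the abstract, folding the BN layer into the preceding convolution at inference time, can be illustrated with a short sketch. The code below is a minimal NumPy example, assuming standard per-output-channel BN parameters; the function names and shapes are illustrative and not taken from the paper's implementation.

```python
# Minimal sketch of convolution + batch normalization (BN) fusion, assuming
# standard inference-time BN with per-output-channel parameters. Names and
# shapes are illustrative, not taken from the paper's implementation.
import numpy as np

def fuse_conv_bn(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BN parameters into convolution weights and bias.

    W: (out_ch, in_ch, kh, kw) convolution weights
    b: (out_ch,) convolution bias
    gamma, beta, mean, var: (out_ch,) BN scale, shift, running mean/variance
    """
    scale = gamma / np.sqrt(var + eps)        # per-output-channel scale
    W_fused = W * scale[:, None, None, None]  # scale each output-channel filter
    b_fused = (b - mean) * scale + beta       # fold mean and shift into the bias
    return W_fused, b_fused

def conv2d(x, W, b):
    """Naive valid convolution (cross-correlation) on one image, for checking."""
    in_ch, H, Wid = x.shape
    out_ch, _, kh, kw = W.shape
    out = np.zeros((out_ch, H - kh + 1, Wid - kw + 1))
    for oc in range(out_ch):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[oc, i, j] = np.sum(x[:, i:i+kh, j:j+kw] * W[oc]) + b[oc]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((3, 8, 8))
    W = rng.standard_normal((4, 3, 3, 3))
    b = rng.standard_normal(4)
    gamma, beta = rng.standard_normal(4), rng.standard_normal(4)
    mean, var = rng.standard_normal(4), rng.random(4) + 0.5

    # Reference path: convolution followed by inference-time BN
    y = conv2d(x, W, b)
    scale = gamma / np.sqrt(var + 1e-5)
    ref = y * scale[:, None, None] + (beta - mean * scale)[:, None, None]

    # Fused path: a single convolution with rewritten weights and bias
    Wf, bf = fuse_conv_bn(W, b, gamma, beta, mean, var)
    print("max abs difference:", np.max(np.abs(ref - conv2d(x, Wf, bf))))
```

Because the fused weights and bias reproduce the conv-then-BN result exactly (up to floating-point rounding), the BN layer costs nothing at inference time, which is what reduces the computational complexity on the FPGA.

The abstract's data tiling and design space exploration steps can likewise be sketched as a toy search over tile sizes and parallelism factors under an on-chip buffer budget and a multiplier budget. The cost model, parameter names, and candidate values below are simplifications invented for illustration, not the paper's actual resource model.

```python
# Toy design space exploration: pick tile sizes and parallelism factors for a
# tiled 3x3 convolution under an on-chip buffer budget and a MAC-unit budget.
# The cost model and candidate values are invented for illustration only.

def onchip_words(tm, tn, tr, tc, k=3):
    """Rough on-chip buffer size (in data words) for one tile."""
    in_buf = tn * (tr + k - 1) * (tc + k - 1)  # input feature-map tile
    w_buf = tm * tn * k * k                    # weight tile
    out_buf = tm * tr * tc                     # output feature-map tile
    return in_buf + w_buf + out_buf

def explore(mem_budget_words, mac_budget):
    """Enumerate (tile, parallelism) points; keep the one with the most
    multiply-accumulate units active that still fits both budgets."""
    best = None
    for tm in (4, 8, 16, 32, 64):        # output-channel tiling / parallelism
        for tn in (1, 2, 4, 8):          # input-channel tiling / parallelism
            for tr in (7, 14, 28):       # output-row tile size
                for tc in (7, 14, 28):   # output-column tile size
                    macs = tm * tn       # MACs working in parallel per cycle
                    if macs > mac_budget:
                        continue
                    if onchip_words(tm, tn, tr, tc) > mem_budget_words:
                        continue
                    if best is None or macs > best[0]:
                        best = (macs, (tm, tn, tr, tc))
    return best

if __name__ == "__main__":
    # e.g. roughly 256K data words of on-chip buffer and 512 parallel MAC units
    print(explore(mem_budget_words=256 * 1024, mac_budget=512))
```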
Authors: GONG Hao-jie (龚豪杰); ZHOU Hai (周海); FENG Shui-chun (冯水春). Affiliations: Key Laboratory of Electronic Information Technology for Complex Aerospace Systems, National Space Science Center, Chinese Academy of Sciences, Beijing 101499, China; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 101408, China.
Source: Computer Engineering and Design (《计算机工程与设计》, PKU core journal), 2022, No. 7, pp. 1872-1878 (7 pages).
Funding: Youth Innovation Promotion Association of the Chinese Academy of Sciences (E0293401).
Keywords: convolutional neural network; FPGA (field programmable gate array); batch normalization; parallel computing; data reuse.