Design and Implementation of a High-Performance CNN Accelerator Based on FPGA (一种基于FPGA的高性能卷积神经网络加速器的设计与实现)
Cited by: 3
Abstract: In recent years, with the development of artificial intelligence, convolutional neural networks (CNNs), a common deep learning algorithm, have been widely applied in computer vision, speech recognition, natural language processing, and many other fields. Field-programmable gate arrays (FPGAs) are often used to accelerate CNNs because of their high parallelism and flexibility. On this basis, this paper studies the design of a high-performance FPGA-based CNN accelerator. The design uses DSP cascading, a "ping-pong" structure for convolution kernel data, multichannel parallelism, and reuse of feature map and convolution kernel data to provide high-performance acceleration for CNN computation on a resource-constrained FPGA platform. Experimental results show that the design uses relatively few LUT resources, reaches a peak performance of 1.6 TOPS on a Virtex7 VX690T, and achieves a throughput of 1.334 TOPS when accelerating the VGG16 network, combining high computing performance with low resource consumption.
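The "ping-pong" structure for convolution kernel data named in the abstract can be illustrated with a small software model. The sketch below is only an illustration under stated assumptions, not the authors' hardware design: the function names `dot` and `run_ping_pong`, the flat-list data layout, and the tile sizes are all hypothetical, and the load and compute steps run sequentially here, whereas on an FPGA the load of the next kernel tile would overlap with the computation on the current one.

```python
def dot(a, b):
    """Multiply-accumulate over two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def run_ping_pong(window, kernel_tiles):
    """Compute one output per kernel tile while alternating two buffers.

    Hypothetical model: `window` is a flattened feature-map window and
    each entry of `kernel_tiles` is a flattened kernel of the same length.
    """
    if not kernel_tiles:
        return []
    # Two kernel buffers; the first tile is preloaded before computing.
    buffers = [list(kernel_tiles[0]), None]
    active = 0
    results = []
    for i in range(len(kernel_tiles)):
        # Fill the idle buffer with the next tile (in hardware this load
        # would run concurrently with the multiply-accumulate below).
        if i + 1 < len(kernel_tiles):
            buffers[1 - active] = list(kernel_tiles[i + 1])
        # Compute using the buffer that was filled in the previous step.
        results.append(dot(window, buffers[active]))
        # Swap roles: the freshly loaded buffer becomes the active one.
        active = 1 - active
    return results


if __name__ == "__main__":
    window = [1, 2, 3]
    tiles = [[1, 0, 0], [0, 1, 0], [1, 1, 1]]
    print(run_ping_pong(window, tiles))  # prints [1, 2, 6]
```

The point of the two buffers is that the compute units never wait for kernel data: while one buffer feeds the multipliers, the other is refilled, so the results match a plain loop over the tiles but the load latency can be hidden.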
Authors: CAO Xuecheng (曹学成); LIAO Xiangping (廖湘萍); LI Yingying (李盈盈); DING Yonglin (丁永林); LI Wei (李炜). China Electronics Technology Group Corporation 52nd Research Institute, Hangzhou 311100, China
Source: Technology of IoT & AI (《智能物联技术》), 2021, No. 5, pp. 11-17 (7 pages)
Keywords: convolutional neural network; FPGA; DSP cascading; CNN accelerator