Abstract
Among all the sublayers of a convolutional neural network (CNN), the convolutional layers consume the most computational resources in the network. This paper proposes a parallel implementation scheme for the convolutional layers of a CNN. First, the overall processing structure of the system is analyzed. Then the structure of the computing core is discussed in detail. Finally, the convolution operations of the convolutional layer are mapped in parallel onto an array processor. Experimental results show that, at an operating frequency of 250 MHz, the structure improves the peak operation speed of the FPGA (Field Programmable Gate Array).
Authors
杨博文
杨海涛
高浩浩
YANG Bo-wen; YANG Hai-tao; GAO Hao-hao (Xi'an University of Posts, Xi'an, Shaanxi 710121)
Source
《数字技术与应用》
2019, No. 10, pp. 136-137 (2 pages)
Digital Technology & Application