Abstract
A CNN acceleration system based on FPGA quantized inference is designed. After analyzing the computational characteristics of mainstream deep neural network structures, an INT8 quantized inference method is adopted in which the clipping threshold is selected with the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. Combined with fusion of the fully connected layers of the deep neural network, this reduces the operand bit width and the network size, compressing the network structure effectively with very little loss of accuracy. Quantized CNN acceleration systems are designed and verified for the LeNet-5, VGG-16, and ResNet-50 network structures. Experimental results show that, with network parameters and input feature data quantized to 8 bits and a network compression rate of 25%, the loss in network accuracy is below 1%. On the Xilinx XC7K325 platform, the quantized-inference CNN acceleration system runs at 450 MHz, and its GOPS performance is improved by a factor of 2 compared with other accelerators of a similar type.
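The threshold-clipped INT8 quantization named in the abstract can be illustrated with a minimal sketch. It assumes symmetric quantization to the range [-127, 127] and takes the clipping threshold as a plain input; the abstract says the threshold is chosen with DBSCAN but does not give the procedure, so that step is not reproduced here. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, not taken from the paper.

```python
def quantize_int8(values, threshold):
    """Symmetric INT8 quantization with clipping at +/- threshold.

    `threshold` is assumed to be given; the paper derives it with DBSCAN,
    which this sketch does not attempt to reproduce.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    scale = threshold / 127.0  # real value represented by one integer step
    quantized = []
    for v in values:
        clipped = max(-threshold, min(threshold, v))  # saturate outliers
        quantized.append(int(round(clipped / scale)))
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 codes back to approximate real values."""
    return [q * scale for q in quantized]


# Hypothetical weights; 6.3 stands in for an outlier beyond the threshold.
weights = [0.02, -0.5, 0.75, 1.0, -1.0, 6.3]
q, scale = quantize_int8(weights, threshold=1.0)
restored = dequantize(q, scale)

# Storage ratio of 8-bit codes relative to 32-bit floats.
compression = 8 / 32
```

Under these assumptions, values inside the threshold are recovered to within half a quantization step, while the outlier saturates at the largest code. The 8/32 ratio matches the 25% compression rate reported in the abstract, assuming the baseline network stores 32-bit floating-point values.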
Authors
HE Jiajun (何家俊); SU Chenyue (苏成悦); LUO Rongfang (罗荣芳); SHI Zhenhua (施振华); CHEN Duiyu (陈堆钰); LUO Junfeng (罗俊丰)
School of Physics and Optoelectronic Engineering, Guangdong University of Technology, Guangzhou 510006, China
Source
Computer Measurement & Control (《计算机测量与控制》)
2022, No. 9, pp. 162-169 (8 pages)