Abstract
Commercial x86 and ARM CPUs are constrained by patents and licensing, which raises their customization cost and limits their flexibility. To address this problem, this paper proposes a special instruction set processor for Convolutional Neural Networks (CNN), built on the open-source RISC-V instruction set architecture and targeting the Internet of Things (IoT). The processor uses custom extended instructions to invoke an accelerator that speeds up the convolution and pooling operations of lightweight CNNs, improving the energy efficiency of terminal devices. The information of each CNN layer is configured to make the accelerator compute in groups, so that it adapts to input data of different sizes. The accelerator's data path is also adjustable, so that time-consuming operations can run separately or in combination to suit different lightweight networks. Verification on an FPGA platform shows that, at 100 MHz, the processor infers SqueezeNet in about 40.89 ms with a power consumption of 1.966 W, which is faster than single-core computation on a mobile phone processor. Compared with the AMD Ryzen 7 3700X, NVIDIA RTX 2070 Super, and Qualcomm Snapdragon 835 platforms, it consumes fewer resources and less power, and it also has an advantage in performance per watt.
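The reported latency and power figures can be combined into per-inference metrics. The following is a minimal sketch that uses only the three numbers quoted in the abstract (40.89 ms, 1.966 W, 100 MHz); the function and constant names are illustrative and do not come from the paper, and the cycle count assumes the whole inference runs at the stated clock frequency.

```python
# Figures quoted in the abstract (FPGA prototype running SqueezeNet).
LATENCY_S = 40.89e-3   # time per inference, in seconds
POWER_W = 1.966        # reported power consumption, in watts
CLOCK_HZ = 100e6       # reported operating frequency, in hertz


def energy_per_inference_j(latency_s: float, power_w: float) -> float:
    """Energy for one inference: power multiplied by time."""
    return latency_s * power_w


def throughput_ips(latency_s: float) -> float:
    """Inferences per second, assuming back-to-back single-image runs."""
    return 1.0 / latency_s


def inferences_per_joule(latency_s: float, power_w: float) -> float:
    """Throughput per watt; numerically equal to inferences per joule."""
    return throughput_ips(latency_s) / power_w


# Clock cycles spent on one inference (assumption: constant 100 MHz clock).
cycles = LATENCY_S * CLOCK_HZ

if __name__ == "__main__":
    print(f"energy per inference : {energy_per_inference_j(LATENCY_S, POWER_W) * 1e3:.2f} mJ")
    print(f"throughput           : {throughput_ips(LATENCY_S):.2f} inferences/s")
    print(f"efficiency           : {inferences_per_joule(LATENCY_S, POWER_W):.2f} inferences/J")
    print(f"cycles per inference : {cycles:.0f}")
```

With these inputs, one inference costs about 80.4 mJ, throughput is roughly 24.5 inferences per second, and efficiency is about 12.4 inferences per joule. The abstract gives no figures for the comparison platforms, so no ratio against them is computed here.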
Authors
廖汉松
吴朝晖
李斌
LIAO Hansong; WU Zhaohui; LI Bin (School of Microelectronics, South China University of Technology, Guangzhou 510641, China; Guangdong Artificial Intelligence and Digital Economy Laboratory (Guangzhou), Guangzhou 510330, China)
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals (北大核心)
2021, Issue 7, pp. 196-204 (9 pages)
Computer Engineering
Funding
Key-Area Research and Development Program of Guangdong Province (2018B010142001).
Keywords
RISC-V instruction set
Convolutional Neural Network (CNN)
Domain Specific Architecture (DSA)
special instruction set processor
hardware acceleration