Abstract
From smart cities to industrial automation, the Artificial Intelligence of Things (AIoT) is being deployed in a growing range of scenarios. Model inference, the core technology behind intelligent decision-making and response, plays a pivotal role in AIoT systems. However, AIoT devices are usually highly constrained in resources such as computing power, communication bandwidth, memory capacity, and battery life, which makes the resource overhead of model inference a key technical challenge. This survey summarizes techniques for optimizing the resource overhead of model inference in AIoT scenarios, gives an overview of the mainstream inference optimization techniques currently used in AIoT applications, and analyzes in depth their strengths and weaknesses in terms of resource efficiency. The paper designs a new taxonomy organized around the three modules involved in inference (sensor data, intelligent models, and IoT hardware) and five categories of key resources, and proposes the first general optimization workflow for AIoT model inference, which can help developers locate and optimize inference efficiency bottlenecks. Finally, the paper discusses four future research directions related to AIoT inference efficiency.
Authors
袁牧
张兰
姚云昊
张钧洋
罗溥晗
李向阳
YUAN Mu; ZHANG Lan; YAO Yun-Hao; ZHANG Jun-Yang; LUO Pu-Han; LI Xiang-Yang (School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026)
Source
《计算机学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2024, No. 10, pp. 2247-2273 (27 pages)
Chinese Journal of Computers
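Funding
National Key R&D Program of China (2021ZD0110400)
Science and Technology Innovation 2030 Major Project "Quantum Communication and Quantum Computer" (2021ZD0302900)
National Natural Science Foundation of China (62132018, 62231015, 623B2093)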
Zhejiang Province "Pioneer" and "Leading Goose" R&D Program (2023C01029, 2023C01143)
Keywords
Artificial Intelligence of Things
model inference
resource efficiency
input filtering
collaborative inference