Abstract
Edge computing (EC) deploys computing and storage resources at the network edge to meet application requirements on latency and energy consumption. Computation offloading is one of the key technologies in EC. When estimating task queuing delay, existing computation offloading methods use the M/M/1/∞/∞/FCFS or M/M/n/∞/∞/FCFS queuing model and do not give priority to highly delay-sensitive tasks, so delay-insensitive tasks may occupy computing resources for long periods, greatly increasing the delay cost of the system. In addition, most existing experience replay methods use random sampling, which cannot distinguish good experiences from poor ones, resulting in low experience utilization and slow convergence of the neural network. Moreover, computation offloading methods based on deterministic-policy deep reinforcement learning (DRL) suffer from weak exploration of the environment and low robustness, which reduces the accuracy of the solutions to the computation offloading problem. To solve these problems, this paper considers a computation offloading scenario with multi-task mobile devices and multiple edge servers, studies the task scheduling and offloading decision problems with the goal of minimizing the joint delay and energy-consumption cost of the system, and proposes COURIER (Computation Offloading qUeuing pRioritIzed Experience Replay DRL), a computation offloading method based on non-preemptive priority queuing and prioritized experience replay DRL. For task scheduling, COURIER designs a non-preemptive priority queuing model (M/M/n/∞/∞/NPR) to optimize the queuing delay of tasks. For offloading decisions, it proposes a prioritized-experience-replay offloading decision mechanism based on Soft Actor-Critic (SAC): an information-entropy term is added to the objective function so that the agent adopts a stochastic policy, and the experience sampling scheme of the mechanism is optimized to accelerate the convergence of the network. Simulation results show that COURIER can effectively reduce the joint delay and energy-consumption cost of EC systems.
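To make the queuing model concrete: an M/M/n/∞/∞/NPR queue with a common exponential service rate admits a standard closed form for per-class mean waiting time (Cobham's formula generalized to n servers via the Erlang-C probability). The sketch below is illustrative and not code from the paper; the function name npr_mean_waits and its parameters are invented for this example.

```python
import math

def erlang_c(n, a):
    """Probability that an arriving task must wait in an M/M/n queue
    with offered load a = lambda/mu (requires a < n)."""
    tail = a ** n / math.factorial(n) * n / (n - a)
    return tail / (sum(a ** k / math.factorial(k) for k in range(n)) + tail)

def npr_mean_waits(lams, mu, n):
    """Mean queuing delay per priority class (index 0 = highest) in an
    M/M/n/inf/inf/NPR queue with common service rate mu, via Cobham's
    formula: W_k = W0 / ((1 - s_{k-1}) * (1 - s_k)),
    where s_k = sum_{i<=k} lam_i / (n * mu).
    Illustrative sketch, not the paper's implementation."""
    total = sum(lams)
    assert total < n * mu, "unstable system: need total arrival rate < n*mu"
    w0 = erlang_c(n, total / mu) / (n * mu)  # residual delay seen on arrival
    waits, s_prev = [], 0.0
    for lam in lams:
        s_k = s_prev + lam / (n * mu)
        waits.append(w0 / ((1.0 - s_prev) * (1.0 - s_k)))
        s_prev = s_k
    return waits

# Two classes on 4 servers: the delay-sensitive class waits far less.
print(npr_mean_waits([2.0, 1.5], mu=1.0, n=4))  # ~[0.37, 2.95]
```

Note that the class means satisfy the usual conservation law: their load-weighted average equals the FCFS M/M/n waiting time, so prioritization reallocates delay from delay-sensitive tasks to insensitive ones rather than creating capacity.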
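The information-entropy term mentioned in the abstract corresponds to the standard maximum-entropy objective that SAC optimizes (Haarnoja et al.); the paper's exact formulation is not given in this record, so the canonical form is shown here for reference:

```latex
J(\pi)=\sum_{t}\mathbb{E}_{(s_t,a_t)\sim\rho_\pi}
\Bigl[\,r(s_t,a_t)+\alpha\,\mathcal{H}\bigl(\pi(\cdot\mid s_t)\bigr)\Bigr],
\qquad
\mathcal{H}\bigl(\pi(\cdot\mid s_t)\bigr)=-\mathbb{E}_{a\sim\pi}\bigl[\log\pi(a\mid s_t)\bigr]
```

Here the temperature α trades off expected reward against policy randomness; a larger α keeps the policy stochastic and improves exploration, which is the abstract's stated motivation for preferring SAC over deterministic-policy DRL.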
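Likewise, "optimizing the experience sampling scheme" points at prioritized experience replay. Below is a minimal proportional-prioritization sketch in the style of Schaul et al.; the class name, buffer layout, and hyperparameters are assumptions for illustration, not COURIER's implementation.

```python
import numpy as np

class PrioritizedReplay:
    """Proportional prioritized experience replay (Schaul et al. style):
    transitions with larger TD error are sampled more often, and
    importance-sampling (IS) weights correct the induced bias.
    Illustrative sketch only -- names and hyperparameters are assumed."""

    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-6):
        self.capacity, self.alpha, self.beta, self.eps = capacity, alpha, beta, eps
        self.data, self.prios = [], []

    def add(self, transition, td_error=1.0):
        if len(self.data) >= self.capacity:           # evict the oldest transition
            self.data.pop(0)
            self.prios.pop(0)
        self.data.append(transition)
        self.prios.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size):
        p = np.asarray(self.prios)
        p = p / p.sum()                               # P(i) proportional to priority^alpha
        idx = np.random.choice(len(self.data), batch_size, p=p)
        w = (len(self.data) * p[idx]) ** -self.beta   # IS weights for bias correction
        return idx, [self.data[i] for i in idx], w / w.max()

    def update_priorities(self, idx, td_errors):
        for i, e in zip(idx, td_errors):              # refresh after each learning step
            self.prios[i] = (abs(e) + self.eps) ** self.alpha
```

Transitions with larger TD error are replayed more often, so informative experiences are consumed sooner and the networks converge faster, at the cost of a sampling bias that the IS weights correct.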
Authors
YANG Xiuwen, CUI Yunhe, QIAN Qing, GUO Chun, SHEN Guowei
(College of Computer Science and Technology, Guizhou University, Guiyang 550025, China; State Key Laboratory of Public Big Data, Guiyang 550025, China; Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Guiyang 550025, China; School of Information, Guizhou University of Finance and Economics, Guiyang 550000, China)
Source
Computer Science (《计算机科学》)
Indexed in CSCD
Peking University Core Journal (北大核心)
2024, No. 5, pp. 293-305 (13 pages)
Funds
National Natural Science Foundation of China (62102111)
Guizhou Provincial Science and Technology Program ([2020]1Y267; Qiankehe Major Special Project [2024]003)
Natural Science Research Project of the Department of Education of Guizhou Province ([2021136])
Talent Introduction Research Project of Guizhou University ((2019)52).
Keywords
Edge computing
Computation offloading
Non-preemptive priority queuing
Information entropy
Deep reinforcement learning
Prioritized experience replay