Abstract
Edge artificial intelligence will empower traditionally simple industrial wireless networks (IWNs) to support complex and dynamic tasks by collaboratively exploiting the computation and communication resources of both machine-type devices (MTDs) and edge servers. In this paper, we propose a multi-agent deep reinforcement learning based resource allocation (MADRL-RA) algorithm for end-edge orchestrated IWNs to support computation-intensive and delay-sensitive applications. First, we present the system model of IWNs, wherein each MTD is regarded as a self-learning agent. Then, we apply the Markov decision process to formulate a minimum system overhead problem with joint optimization of delay and energy consumption. Next, we employ MADRL to overcome the explosive state space and learn an effective resource allocation policy with respect to computing decision, computation capacity, and transmission power. To break the time correlation of training data while accelerating the learning process of MADRL-RA, we design a weighted experience replay to store and sample experiences categorically. Furthermore, we propose a step-by-step ε-greedy method to balance exploitation and exploration. Finally, we verify the effectiveness of MADRL-RA by comparing it with several benchmark algorithms in many experiments, showing that MADRL-RA converges quickly and learns an effective resource allocation policy that achieves the minimum system overhead.
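The abstract does not detail the step-by-step ε-greedy method. As one common interpretation, the sketch below assumes ε is annealed gradually over training steps so agents shift from exploration toward exploitation; all function names, the linear schedule, and the parameter values are hypothetical, not taken from the paper.

```python
import random

def make_epsilon_schedule(eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Return a function mapping a training step to an exploration rate,
    linearly annealed from eps_start down to eps_end over decay_steps."""
    def epsilon(step):
        frac = min(step / decay_steps, 1.0)
        return eps_start + frac * (eps_end - eps_start)
    return epsilon

def epsilon_greedy(q_values, step, schedule):
    """With probability epsilon(step) pick a random action (explore);
    otherwise pick the action with the highest Q-value (exploit)."""
    if random.random() < schedule(step):
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In a MADRL setting, each agent would typically hold its own Q-network and call such a policy on its local observation; annealing ε step by step keeps early training exploratory while letting the learned policy dominate later.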
Funding
Project supported by the National Key R&D Program of China (No. 2020YFB1710900),
the National Natural Science Foundation of China (Nos. 62173322, 61803368, and U1908212),
the China Postdoctoral Science Foundation (No. 2019M661156),
and the Youth Innovation Promotion Association, Chinese Academy of Sciences (No. 2019202).