Abstract
Edge artificial intelligence will empower traditionally simple industrial wireless networks (IWNs) to support complex and dynamic tasks by collaboratively exploiting the computation and communication resources of both machine-type devices (MTDs) and edge servers. In this paper, we propose a multi-agent deep reinforcement learning based resource allocation (MADRL-RA) algorithm for end-edge orchestrated IWNs to support computation-intensive and delay-sensitive applications. First, we present the system model of IWNs, wherein each MTD is regarded as a self-learning agent. Then, we apply the Markov decision process to formulate a minimum system overhead problem with joint optimization of delay and energy consumption. Next, we employ MADRL to overcome the explosive state space and learn an effective resource allocation policy with respect to computing decision, computation capacity, and transmission power. To break the time correlation of training data while accelerating the learning process of MADRL-RA, we design a weighted experience replay to store and sample experiences categorically. Furthermore, we propose a step-by-step ε-greedy method to balance exploitation and exploration. Finally, we verify the effectiveness of MADRL-RA by comparing it with several benchmark algorithms in many experiments, showing that MADRL-RA converges quickly and learns an effective resource allocation policy that achieves the minimum system overhead.
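The abstract does not detail the step-by-step ε-greedy method. As one common interpretation, the sketch below assumes ε is annealed gradually over training steps so agents shift from exploration toward exploitation; all function names, the linear schedule, and the parameter values are hypothetical, not taken from the paper.

```python
import random

def make_epsilon_schedule(eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Return a function mapping a training step to an exploration rate,
    linearly annealed from eps_start down to eps_end over decay_steps."""
    def epsilon(step):
        frac = min(step / decay_steps, 1.0)
        return eps_start + frac * (eps_end - eps_start)
    return epsilon

def epsilon_greedy(q_values, step, schedule):
    """With probability epsilon(step) pick a random action (explore);
    otherwise pick the action with the highest Q-value (exploit)."""
    if random.random() < schedule(step):
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In a MADRL setting, each agent would typically hold its own Q-network and call such a policy on its local observation; annealing ε step by step keeps early training exploratory while letting the learned policy dominate later.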
Funding
Project supported by the National Key R&D Program of China (No. 2020YFB1710900),
the National Natural Science Foundation of China (Nos. 62173322, 61803368, and U1908212),
the China Postdoctoral Science Foundation (No. 2019M661156),
and the Youth Innovation Promotion Association, Chinese Academy of Sciences (No. 2019202).