
An Improved Double Deep Q Network for Multi-user Dynamic Spectrum Access
Abstract  With the rapid development of mobile communication technology, the contradiction between limited spectrum resources and the large demand for spectrum communication has become increasingly acute, and new intelligent methods are needed to improve spectrum utilization. This paper proposes a multi-user dynamic spectrum access method based on a distributed prioritized experience pool combined with a Double Deep Q Network. With this method, secondary users in a dynamically changing cognitive radio network environment can continuously learn from their own sensing information and select idle channels to complete spectrum access tasks, thereby improving spectrum utilization. The method adopts a distributed reinforcement learning framework in which each secondary user is treated as an agent, and each agent learns with a standard single-agent reinforcement learning method to reduce the underlying computational overhead. In addition, prioritized sampling is added to the neural network training, which improves training efficiency and helps secondary users select the optimal policy. Simulation results show that the method improves the channel access success rate, reduces the collision rate, and increases the communication rate.
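Only the abstract-level description of the method is available here. As a rough illustration of the two ingredients it names, a Double Deep Q Network update combined with a prioritized experience pool, with each secondary user acting as an independent agent, a minimal PyTorch-style sketch might look like the following. All class names, network sizes, and hyperparameters are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a sensed-channel observation to Q-values, one per candidate channel."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, x):
        return self.net(x)

class PrioritizedReplay:
    """Proportional prioritized experience replay (simplified list-based version, no sum-tree)."""
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.prios, self.pos = [], [], 0

    def push(self, transition):
        max_p = max(self.prios, default=1.0)      # new samples get the current max priority
        if len(self.data) < self.capacity:
            self.data.append(transition); self.prios.append(max_p)
        else:
            self.data[self.pos] = transition; self.prios[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        p = np.asarray(self.prios) ** self.alpha
        p /= p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=p)
        weights = (len(self.data) * p[idx]) ** (-beta)   # importance-sampling correction
        weights /= weights.max()
        return [self.data[i] for i in idx], idx, torch.as_tensor(weights, dtype=torch.float32)

    def update_priorities(self, idx, td_errors, eps=1e-5):
        for i, e in zip(idx, td_errors):
            self.prios[i] = abs(float(e)) + eps

class SecondaryUserAgent:
    """One independent learner per secondary user (distributed single-agent framework)."""
    def __init__(self, obs_dim, n_actions, gamma=0.99, lr=1e-3):
        self.q, self.q_target = QNetwork(obs_dim, n_actions), QNetwork(obs_dim, n_actions)
        self.q_target.load_state_dict(self.q.state_dict())
        self.opt = torch.optim.Adam(self.q.parameters(), lr=lr)
        self.buffer, self.gamma = PrioritizedReplay(10_000), gamma

    def act(self, obs, epsilon):
        if np.random.rand() < epsilon:
            return np.random.randint(self.q.net[-1].out_features)
        with torch.no_grad():
            return int(self.q(torch.as_tensor(obs, dtype=torch.float32)).argmax())

    def sync_target(self):
        self.q_target.load_state_dict(self.q.state_dict())

    def learn(self, batch_size=32):
        if len(self.buffer.data) < batch_size:
            return
        # Transitions are (obs, action, reward, next_obs); spectrum access is treated
        # here as a continuing task with no terminal flag (an assumption).
        batch, idx, w = self.buffer.sample(batch_size)
        obs, act, rew, nxt = map(
            lambda xs: torch.as_tensor(np.array(xs), dtype=torch.float32), zip(*batch))
        q_sa = self.q(obs).gather(1, act.long().unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            # Double DQN: the online network picks the next action, the target network evaluates it.
            next_a = self.q(nxt).argmax(dim=1, keepdim=True)
            target = rew + self.gamma * self.q_target(nxt).gather(1, next_a).squeeze(1)
        td_error = target - q_sa
        loss = (w * td_error.pow(2)).mean()        # importance-sampling weighted TD loss
        self.opt.zero_grad(); loss.backward(); self.opt.step()
        self.buffer.update_priorities(idx, td_error.detach())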
Authors  He Yi-shan, Wang Yong-hua, Wan Pin, Wang Lei, Wu Wen-tao (School of Automation, Guangdong University of Technology, Guangzhou 510006, China)
Source  Journal of Guangdong University of Technology (CAS), 2023, No. 4, pp. 85-93 (9 pages)
Funding  National Natural Science Foundation of China (61971147)
Keywords  dynamic spectrum access; distributed reinforcement learning; prioritized experience pool; deep reinforcement learning
