
Dynamic Spectrum Access Based on Prior Knowledge Enabled Reinforcement Learning with Double Actions in Complex Electromagnetic Environment (Cited by: 2)

Abstract: This paper addresses the spectrum access problem of cognitive users in a fast-changing dynamic interference spectrum environment. The prior knowledge for dynamic spectrum access is modeled, and a reliability quantification scheme is presented to guide the use of that prior knowledge in the learning process. Furthermore, a spectrum access scheme based on prior knowledge enabled reinforcement learning (PKRL) is designed, which effectively improves learning efficiency and offers users a way to better adapt to the fast-changing, high-density electromagnetic environment. Compared with existing methods, the proposed algorithm can adjust the access channel online according to historical information and obtains the optimal access policy more efficiently. Simulation results show that the convergence speed of learning improves by about 66% while the average throughput remains unchanged.
Source: China Communications (中国通信(英文版)), SCIE, CSCD, 2022, Issue 7, pp. 13-24 (12 pages)
Funding: Supported by the National Natural Science Foundation of China (No. 62131005)
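
Illustrative sketch (not from the paper): the abstract's general idea, seeding a reinforcement learner with prior knowledge weighted by a reliability score, can be shown with plain tabular Q-learning. Everything below is an assumption made for illustration: the sweeping-interference model, the 0/1 reward, the function names `train` and `greedy_policy`, and the rule that scales the prior Q-table by `reliability`. It is not the PKRL algorithm or the double-action design described by the authors.

```python
import random


def train(num_channels=5, steps=20000, prior=None, reliability=0.0,
          alpha=0.2, gamma=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning for channel selection (hypothetical toy model).

    State  = channel currently hit by interference.
    Action = channel the user accesses in the next slot.
    The interference sweeps one channel per slot (an assumed model).
    """
    rng = random.Random(seed)
    q = [[0.0] * num_channels for _ in range(num_channels)]

    # Assumed use of prior knowledge: start from the prior Q-table,
    # scaled by a reliability weight in [0, 1].
    if prior is not None:
        for s in range(num_channels):
            for a in range(num_channels):
                q[s][a] = reliability * prior[s][a]

    jammed = 0
    for _ in range(steps):
        # Epsilon-greedy choice of the channel to access.
        if rng.random() < epsilon:
            action = rng.randrange(num_channels)
        else:
            action = max(range(num_channels), key=lambda a: q[jammed][a])

        next_jammed = (jammed + 1) % num_channels
        reward = 0.0 if action == next_jammed else 1.0

        # Standard Q-learning update.
        target = reward + gamma * max(q[next_jammed])
        q[jammed][action] += alpha * (target - q[jammed][action])
        jammed = next_jammed
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    n = 5
    # A hand-made prior that already knows the sweep pattern.
    good_prior = [[0.0 if a == (s + 1) % n else 2.0 for a in range(n)]
                  for s in range(n)]
    print("learned from scratch:", greedy_policy(train(num_channels=n)))
    print("prior only, no learning:",
          greedy_policy(train(num_channels=n, steps=0,
                              prior=good_prior, reliability=1.0)))
```

In this toy setting a fully trusted, accurate prior yields a collision-free policy before any learning step, whereas the learner that starts from zeros has to discover the sweep pattern through exploration. A lower `reliability` shrinks the prior toward zero so that observed rewards dominate sooner.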