Abstract
Multi-user resource allocation in cognitive radio networks requires users to exchange a large amount of channel and power strategy information, which occupies and consumes substantial system resources. To address this problem, this paper studies user strategies with a non-cooperative game model and proposes a joint channel selection and power control algorithm based on multi-user Q-learning. During self-learning, all users follow the same strategy and perform Q-learning by observing only their own rewards, gradually converging to the optimal set of channel and power allocations. Simulation results show that the algorithm converges to a Nash equilibrium with high probability, and that the overall reward users obtain through channel selection is very close to the maximum overall reward.
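The idea of each user learning only from its own reward can be illustrated with a minimal sketch. The abstract does not give the paper's reward function, state definition, or update rule, so everything below is an assumption made for illustration: a stateless (single-state) Q update, epsilon-greedy exploration, unit channel gains, and a hypothetical reward equal to a Shannon-style rate minus a linear power cost. The function name `run_q_learning` and all parameter values are likewise hypothetical.

```python
import math
import random


def run_q_learning(n_users=3, n_channels=3, powers=(0.1, 0.5, 1.0),
                   steps=5000, alpha=0.1, epsilon=0.1,
                   noise=0.1, power_cost=0.2, seed=0):
    """Stateless multi-user Q-learning over joint (channel, power) actions.

    Illustrative only: the reward model and every default value are
    assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # Joint action space: every (channel, power level) pair.
    actions = [(ch, pw) for ch in range(n_channels) for pw in powers]
    # One Q-table per user; a user reads and writes only its own table.
    q = [[0.0] * len(actions) for _ in range(n_users)]

    def best(u):
        # Index of the action with the highest Q-value for user u.
        return max(range(len(actions)), key=lambda a: q[u][a])

    for _ in range(steps):
        # All users follow the same epsilon-greedy rule, with no exchange
        # of strategy information between them.
        chosen = [rng.randrange(len(actions)) if rng.random() < epsilon
                  else best(u) for u in range(n_users)]
        for u in range(n_users):
            ch, pw = actions[chosen[u]]
            # Interference: total power of other users on the same channel
            # (unit channel gains assumed).
            interference = sum(actions[chosen[v]][1]
                               for v in range(n_users)
                               if v != u and actions[chosen[v]][0] == ch)
            # Assumed reward: achievable rate minus a linear power cost.
            reward = math.log2(1 + pw / (noise + interference)) - power_cost * pw
            # Stateless Q update driven only by the user's own reward.
            q[u][chosen[u]] += alpha * (reward - q[u][chosen[u]])

    greedy = [actions[best(u)] for u in range(n_users)]
    return greedy, q


if __name__ == "__main__":
    greedy, _ = run_q_learning()
    for user, (channel, power) in enumerate(greedy):
        print(f"user {user}: channel {channel}, power {power}")
```

Under these assumed rewards, sharing a channel lowers each user's rate, so the learned greedy choices tend to spread users across channels. Whether a given run reaches a Nash equilibrium depends on the seed and parameters.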
Authors
Jiang Taotao (蒋涛涛)
Zhu Jiang (朱江)
Chongqing Key Laboratory of Mobile Communications Technology, School of Communication & Information Engineering, Chongqing University of Posts & Telecommunications, Chongqing 400065, China
Source
Application Research of Computers (《计算机应用研究》)
CSCD
PKU Core (北大核心)
2020, No. 8, pp. 2500-2503 (4 pages)
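Funding
National Natural Science Foundation of China (61102062)
Key Project of Science and Technology Research, Ministry of Education of China (212145)
Natural Science Foundation Project of the Chongqing Science and Technology Commission (cstc2015jcyjA40050)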
Keywords
cognitive wireless network
Q-learning
channel selection
power control