Abstract
For orthogonal frequency division multiplexing (OFDM) systems, an adaptive pilot design algorithm based on deep reinforcement learning is proposed. The pilot design problem is formulated as a Markov decision process in which the indices of the pilot positions are defined as actions and the reward function is defined by a strategy based on reducing the mean squared error (MSE); deep reinforcement learning is then used to update the pilot positions. Pilots are allocated adaptively and dynamically according to channel conditions, so that channel characteristics are exploited to combat channel fading. Simulation results show that, under three typical 3GPP multipath channels, the proposed algorithm significantly improves channel estimation performance compared with the traditional uniform pilot allocation scheme.
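The Markov decision process described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the channel taps, the noise-free least-squares estimate with linear interpolation, the (slot, new index) action encoding, and the greedy one-step policy that stands in for the deep reinforcement learning agent are all assumptions made for illustration. Only the overall shape (pilot indices as actions, MSE reduction as reward) comes from the abstract.

```python
import cmath
import random


def channel_freq_response(taps, n_sub):
    """Frequency response H[k] of a multipath channel from its time-domain taps."""
    return [sum(h * cmath.exp(-2j * cmath.pi * k * l / n_sub)
                for l, h in enumerate(taps))
            for k in range(n_sub)]


def interpolate(pilots, h_true):
    """Assumed estimator: noise-free LS values at pilots, linear interpolation
    between pilots, nearest-pilot hold outside the pilot span."""
    p = sorted(pilots)
    est = []
    for k in range(len(h_true)):
        if k <= p[0]:
            est.append(h_true[p[0]])
        elif k >= p[-1]:
            est.append(h_true[p[-1]])
        else:
            hi = next(i for i, q in enumerate(p) if q >= k)
            lo = hi - 1
            w = (k - p[lo]) / (p[hi] - p[lo])
            est.append((1 - w) * h_true[p[lo]] + w * h_true[p[hi]])
    return est


def mse(pilots, h_true):
    """Mean squared error of the interpolated estimate over all subcarriers."""
    est = interpolate(pilots, h_true)
    return sum(abs(e - t) ** 2 for e, t in zip(est, h_true)) / len(h_true)


def step(pilots, action, h_true):
    """Apply action (slot, new_index); reward is the resulting MSE reduction.
    Moving a pilot onto an occupied subcarrier is rejected with zero reward."""
    slot, new_index = action
    new_pilots = list(pilots)
    new_pilots[slot] = new_index
    if len(set(new_pilots)) < len(new_pilots):
        return list(pilots), 0.0
    reward = mse(pilots, h_true) - mse(new_pilots, h_true)
    return new_pilots, reward


def greedy_policy(pilots, h_true):
    """Stand-in for the learned agent: choose the action with the largest reward."""
    actions = [(s, k) for s in range(len(pilots)) for k in range(len(h_true))]
    return max(actions, key=lambda a: step(pilots, a, h_true)[1])


random.seed(0)
N_SUB, N_PILOT = 32, 4  # hypothetical sizes, not taken from the paper
# Hypothetical 4-tap channel with a decaying power profile.
taps = [complex(random.gauss(0, 1), random.gauss(0, 1)) * 0.5 ** l
        for l in range(4)]
H = channel_freq_response(taps, N_SUB)

uniform = list(range(0, N_SUB, N_SUB // N_PILOT))  # baseline: uniform pilots
pilots = list(uniform)
for _ in range(10):
    pilots, reward = step(pilots, greedy_policy(pilots, H), H)
    if reward <= 0:  # no remaining move lowers the MSE
        break

mse_uniform = mse(uniform, H)
mse_adapted = mse(pilots, H)
print(f"uniform MSE = {mse_uniform:.4g}, adapted MSE = {mse_adapted:.4g}")
```

Because every accepted move has a non-negative reward, the adapted MSE in this sketch can never exceed the uniform baseline. In the setting the abstract describes, a learned agent would take the place of `greedy_policy`, and the reward would presumably be computed from noisy channel estimates rather than from the true response used here.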
Authors
LIU Qiaoshou (刘乔寿)
ZHOU Xiong (周雄)
LIU Shuang (刘爽)
DENG Yifeng (邓义锋)
LIU Qiaoshou; ZHOU Xiong; LIU Shuang; DENG Yifeng (School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; Advanced Network and Intelligent Connection Technology Key Laboratory of Chongqing Education Commission of China, Chongqing 400065, China; Chongqing Key Laboratory of Ubiquitous Sensing and Networking, Chongqing 400065, China)
Source
Journal on Communications (《通信学报》)
EI
CSCD
Peking University Core Journals (北大核心)
2023, No. 9, pp. 104-114 (11 pages)
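Funding
National Natural Science Foundation of China (No. 61901075)
Science and Technology Fund of the Chongqing Municipal Education Commission (No. KJZDK202200604)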