Abstract
In modern electronic warfare, radar faces a more complex jamming environment than ever before, as airborne jammers change their jamming mode according to the raid mission and its stage. In recent years, reinforcement learning-based radar anti-jamming methods have made some progress in single-jamming confrontation scenarios, but research on realistic, complex multi-jamming scenarios remains insufficient. To address this issue, this paper proposes a multi-jamming-scenario radar anti-jamming method based on complex-domain deep reinforcement learning to optimize the anti-jamming strategy of frequency agile radar. First, according to the stage characteristics of raid missions, three jamming models are established (noise spot jamming, range false-target deception jamming, and dense false-target forwarding jamming), and three jamming sequence strategies are designed to simulate actual jamming scenarios. Second, for the multi-jamming scenario model, a reinforcement learning reward function that combines the signal-to-interference-plus-noise ratio with target track integrity is constructed, and, to exploit the complex-domain characteristics of the jamming signals, a complex-domain deep reinforcement learning method for radar anti-jamming in multi-jamming scenarios is proposed. Finally, radar anti-jamming simulation experiments are designed around the three jamming sequence strategies. The results show that the proposed method effectively handles main-lobe jamming in complex multi-jamming scenarios under time-sequence conditions; compared with two classical deep reinforcement learning algorithms, its anti-jamming decision performance is substantially improved and its average decision time is reduced to 405.3 ms.
Authors
解烽
刘环宇
胡锡坤
钟平
李君宝
XIE Feng; LIU Huanyu; HU Xikun; ZHONG Ping; LI Junbao (Information Countermeasure Technique Institute, Faculty of Computing, Harbin Institute of Technology, Harbin 150080, China; College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China)
Source
《雷达学报(中英文)》
EI
CSCD
PKU Core Journal (北大核心)
2023, No. 6, pp. 1290-1304 (15 pages)
Journal of Radars
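Funding
National Natural Science Foundation of China (62271166)
Harbin Institute of Technology Interdisciplinary Research Fund for Medicine, Engineering, and Science (IR2021104)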
Keywords
Complex domain
Deep Reinforcement Learning(DRL)
Main-lobe jamming
Sequential jamming
Frequency agile radar