Abstract
In modern warfare, radars are becoming multifunctional, and multiple radars may even detect targets cooperatively, which strengthens their anti-jamming capability. Traditional jamming systems, however, still follow fixed jamming strategies, and their decision-making has poor real-time performance when many radars must be jammed, so research on cognitive jamming is urgently needed. This paper explains the concept of reinforcement learning, compares the Q-learning and double Q-learning algorithms, and uses reinforcement learning to build a model, based on cognitive electronic warfare, that allocates radar jamming strategies. Simulation of the decision-making method verifies that both algorithms can complete the jamming strategy allocation task and that double Q-learning performs better in a multi-radar environment. The results show that reinforcement learning algorithms can learn autonomously and make cognitive decisions on the allocation of jamming resources.
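The difference between the two algorithms named in the abstract can be illustrated with the textbook tabular update rules. The sketch below is not the authors' model: the function names, the learning rate `alpha`, the discount `gamma`, and the reading of states as radar working modes and actions as jamming patterns are all assumptions made for illustration.

```python
import random


def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Q-learning: one table both selects and evaluates the next action."""
    # Assumed encoding: Q[state][action], state = radar working mode,
    # action = jamming pattern (hypothetical, not taken from the paper).
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def double_q_update(QA, QB, s, a, r, s_next, alpha=0.1, gamma=0.9, rng=random):
    """Double Q-learning: the updated table selects, the other table evaluates."""
    # Pick with equal probability which table is updated on this step.
    if rng.random() < 0.5:
        updated, other = QA, QB
    else:
        updated, other = QB, QA
    # Greedy next action according to the table being updated ...
    n_actions = len(updated[s_next])
    best = max(range(n_actions), key=lambda i: updated[s_next][i])
    # ... valued by the other table, which reduces overestimation bias.
    target = r + gamma * other[s_next][best]
    updated[s][a] += alpha * (target - updated[s][a])
```

Decoupling action selection from action evaluation is the usual explanation for why double Q-learning overestimates values less than Q-learning; whether that accounts for the multi-radar result reported here is not stated in the abstract.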
Authors
Huang Xingyuan (黄星源)
Li Yanyi (李岩屹)
College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, China
Source
Journal of System Simulation (《系统仿真学报》)
CAS
CSCD
PKU Core (北大核心)
2021, No. 8, pp. 1801-1808 (8 pages)
Keywords
multifunctional radar
adaptive interference
double Q-learning
jamming decision-making