Abstract
Attack-defense interaction between guided missiles is expected to be an important mode of operation in future local wars. This paper simulates the missile attack-defense process with a pursuit-evasion game between intelligent mini-cars and proposes a pursuit-evasion game algorithm built on the Deep Deterministic Policy Gradient (DDPG) algorithm. The state consists of the line-of-sight (LOS) distance and the LOS angle, and the reward function is designed by borrowing ideas from PID control. The algorithm was verified both in mathematical simulation and in experiments on physical mini-cars. The results show that it can effectively control the mini-car to capture the evader and that it adapts well.
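The abstract names the state (LOS distance and LOS angle) and a PID-inspired reward but gives no formulas. The following is a minimal sketch of what such quantities could look like, not the authors' method: the function names `los_state` and `pid_style_reward`, the gains `kp`, `kd`, `k_angle`, the choice to measure the LOS angle relative to the pursuer's heading, and the specific reward terms are all assumptions made for illustration.

```python
import math


def los_state(pursuer_xy, pursuer_heading, evader_xy):
    """Return (distance, angle) for a planar pursuit-evasion setup.

    Assumption: the LOS angle is taken relative to the pursuer's heading
    (radians) and wrapped to [-pi, pi).
    """
    dx = evader_xy[0] - pursuer_xy[0]
    dy = evader_xy[1] - pursuer_xy[1]
    distance = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx)
    angle = (bearing - pursuer_heading + math.pi) % (2 * math.pi) - math.pi
    return distance, angle


def pid_style_reward(distance, prev_distance, angle,
                     kp=1.0, kd=0.5, k_angle=0.2):
    """Hypothetical reward loosely modeled on P and D terms of a PID loop.

    - proportional term: penalize the current LOS distance
    - derivative term: reward a decrease in distance since the last step
    - angle term: penalize pointing away from the evader
    The gains are illustrative placeholders, not values from the paper.
    """
    proportional = -kp * distance
    derivative = kd * (prev_distance - distance)
    angle_penalty = -k_angle * abs(angle)
    return proportional + derivative + angle_penalty


if __name__ == "__main__":
    d, a = los_state((0.0, 0.0), 0.0, (3.0, 4.0))
    print(d, a)
    print(pid_style_reward(d, prev_distance=6.0, angle=a))
```

In a DDPG setup, a pair like `(distance, angle)` would be the observation fed to the actor and critic networks, and the scalar reward would be returned by the environment at each step.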
Authors
Tan Lang (谭浪)
Gong Qinghai (巩庆海)
Wang Huixia (王会霞)
Affiliations: Beijing Aerospace Automatic Control Institute, Beijing 100854, China; National Key Laboratory of Science and Technology on Aerospace Intelligence Control, Beijing 100854, China
Source
Aerospace Control (《航天控制》)
CSCD
PKU Core Journal (北大核心)
2018, Issue 6, pp. 3-8, 19 (7 pages)
Funding
National Natural Science Foundation of China (61773341)
Keywords
Missile attack-defense interaction
Pursuit-evasion game
Deep reinforcement learning
DDPG