Abstract
To address three-dimensional interception guidance against maneuvering targets under constraints such as a low acceleration (overload) ratio and bearings-only measurement, this paper proposes an interception guidance law based on hierarchical reinforcement learning. The problem is first modeled as a Markov decision process, and a heuristic reward function is designed that accounts for both interception energy consumption and the missile-to-target line-of-sight (LOS) angular rate. A two-level policy network is then constructed in which the upper-level policy plans stage-wise subgoals that guide the lower-level policy in generating the required guidance commands. This drives the LOS angular rate to converge during the engagement so that the maneuvering target can be intercepted. Simulation results show that, compared with augmented proportional navigation, the proposed method achieves higher interception accuracy and interception probability while requiring lower acceleration during interception.
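As a rough illustration of the kind of reward the abstract describes, the sketch below penalizes the squared LOS angular rate and the squared commanded acceleration at each step. It is not the paper's reward function: the function names, the quadratic form, and the weights `w_los` and `w_energy` are assumptions chosen for illustration. Only the kinematic identity for the LOS rate vector, omega = (r x v) / |r|^2, is standard.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def los_rate(rel_pos, rel_vel):
    """LOS angular rate vector (rad/s): omega = (r x v) / |r|^2."""
    r2 = sum(x * x for x in rel_pos)
    return tuple(c / r2 for c in cross(rel_pos, rel_vel))


def step_reward(rel_pos, rel_vel, accel_cmd, w_los=1.0, w_energy=0.01):
    """Hypothetical per-step reward: penalize LOS rate and control effort.

    w_los and w_energy are illustrative weights, not values from the paper.
    """
    omega = los_rate(rel_pos, rel_vel)
    los_penalty = sum(x * x for x in omega)
    energy_penalty = sum(a * a for a in accel_cmd)
    return -(w_los * los_penalty + w_energy * energy_penalty)


# Target 1000 m ahead, closing at 300 m/s with 50 m/s of lateral drift.
omega = los_rate((1000.0, 0.0, 0.0), (-300.0, 50.0, 0.0))
reward = step_reward((1000.0, 0.0, 0.0), (-300.0, 50.0, 0.0), (0.0, 0.0, 0.0))
```

On a perfect collision course the relative velocity is aligned with the LOS, so the LOS rate and this penalty both vanish, which is why driving the LOS rate to zero is a natural shaping signal for interception.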
Authors
王旭
蔡远利
张学成
张荣良
韩成龙
WANG Xu; CAI Yuanli; ZHANG Xuecheng; ZHANG Rongliang; HAN Chenglong (Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China; Third Military Representative Office of Army Equipment Department in Shanghai, Shanghai 200031, China; Shanghai Electro-Mechanical Engineering Institute, Shanghai 201109, China)
Source
《空天防御》
2024, No. 1, pp. 40-47 (8 pages)
Air & Space Defense
Funding
National Natural Science Foundation of China (62203349, 12302061).
Keywords
terminal guidance
guidance law
maneuvering target intercept
low acceleration ratio
hierarchical reinforcement learning