Attack Placeholder Decision Based on Improved Approximate Dynamic Programming
改进近似动态规划法的攻击占位决策 (cited by 6)
Abstract: The rapidly changing air-combat environment and increasingly complex air-combat tasks make dynamic programming prone to the "curse of dimensionality" when applied to maneuvering-decision problems. This paper approximates the value function by function fitting, which handles the continuity of air-combat states. Further, because approximate dynamic programming does not account for "overshoot" maneuvers and collisions when solving maneuvering-decision problems, a penalty factor is introduced to improve the attack-placeholder decision method. The improved method responds effectively to rapidly changing battlefield situations and does not require building a dedicated library of air-combat tactics. To verify the model, the improved approximate dynamic programming method was simulated; the results show that the improved attack decision method effectively avoids "overshoot" maneuvers and collisions and exhibits strong robustness.
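The abstract describes two ideas that can be sketched concretely: a value function approximated by function fitting over continuous air-combat states, and a one-step lookahead whose reward is reduced by penalty factors for collisions and "overshoot" maneuvers. The sketch below is illustrative only; all feature choices, thresholds, and weights (`D_COLLIDE`, `W_COLLIDE`, `W_OVERSHOOT`, the state layout) are assumptions, not the paper's actual parameters.

```python
import numpy as np

GAMMA = 0.9          # discount factor (assumed)
D_COLLIDE = 100.0    # collision distance threshold, meters (assumed)
W_COLLIDE = 50.0     # collision penalty factor (assumed)
W_OVERSHOOT = 10.0   # overshoot penalty factor (assumed)

def features(state):
    """Feature vector for function fitting over a continuous state
    (distance, aspect angle, antenna train angle) -- illustrative choice."""
    d, aa, ata = state
    return np.array([1.0, d, aa, ata, d * aa])

def value(state, theta):
    """Approximate value function: linear in the fitted features."""
    return features(state) @ theta

def reward(state):
    """Attack-position reward minus the penalty-factor terms."""
    d, aa, ata = state
    base = np.cos(aa) + np.cos(ata)       # favor pointing at the target's tail
    penalty = 0.0
    if d < D_COLLIDE:                     # collision penalty factor
        penalty += W_COLLIDE
    if abs(ata) > np.pi / 2 and d < 500:  # crude overshoot indicator (assumed)
        penalty += W_OVERSHOOT
    return base - penalty

def choose_maneuver(state, actions, transition, theta):
    """One-step lookahead: pick the maneuver maximizing the penalized
    immediate reward plus the discounted approximate value."""
    scores = [reward(transition(state, a)) + GAMMA * value(transition(state, a), theta)
              for a in actions]
    return actions[int(np.argmax(scores))]
```

In a full implementation, `theta` would be fitted iteratively (e.g. by least squares against sampled Bellman targets), which is what makes the approach tractable where exact dynamic programming hits the curse of dimensionality.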
Authors: 姜龙亭 寇雅楠 王栋 张彬超 胡涛 JIANG Long-ting; KOU Ya-nan; WANG Dong; ZHANG Bin-chao; HU Tao (Aerospace Engineering Academy, Air Force Engineering University, Xi'an 710038, China; Unit 95974 of PLA, Cangzhou 061000, China; Unit 95356 of PLA, Leiyang 421800, China)
Source: Fire Control & Command Control (《火力与指挥控制》, CSCD, Peking University Core), 2019, No. 7, pp. 135-141.
Fund: Aeronautical Science Foundation of China (20141396012).
Keywords: curse of dimensionality; approximate dynamic programming; autonomous attack; placeholder decision; penalty factor
