Optimal Secure Tracking Control in Multi-UAVs Based on Online Reinforcement Learning
Abstract: In unmanned aerial vehicle (UAV) formation tracking missions, a false data injection (FDI) attacker can inject misleading data into the control commands, preventing the UAVs from forming the specified formation configuration, so a secure formation tracking controller is needed. The attack-defense process is modeled as a zero-sum graphical game in which the FDI attacker and the secure controller are the players: the attacker seeks to maximize a prescribed cost function, while the secure controller seeks to minimize it. Solving the game and obtaining the optimal secure control policy requires solving the Hamilton-Jacobi-Isaacs (HJI) equation. Because the HJI equation is a coupled partial differential equation that is difficult to solve directly, a finite-time convergent online reinforcement learning algorithm combined with an experience replay mechanism is introduced, and a critic-only neural network is designed to approximate the value function and obtain the optimal secure control policy. A numerical simulation verifies the effectiveness of the proposed algorithm.
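Authors: Gong Zhenyu; Yang Feisheng (Northwestern Polytechnical University, Xi'an 710072, China)
Affiliation: Northwestern Polytechnical University
Source: Aeronautical Science & Technology (《航空科学技术》), 2024, No. 4, pp. 25-30 (6 pages)
Funding: National Natural Science Foundation of China (62073269); Aeronautical Science Foundation of China (2020Z034053002); Key Research and Development Program of Shaanxi Province (2022GY-244); Natural Science Foundation of Chongqing (CSTB2022NSCQ-MSX0963); Guangdong Basic and Applied Basic Research Foundation (2023A1515011220).
Keywords: FDI attack; multi-UAVs; online reinforcement learning; optimal control; zero-sum graphical game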
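Note on the zero-sum formulation: the abstract does not give the paper's equations, so the following is only a textbook-style sketch of how a zero-sum game between a controller and an attacker leads to an HJI equation. Every symbol here is an assumption made for illustration and is not taken from the paper: a tracking error e, control input u, attack signal d, dynamics f, g, k, weights Q and R, and an attenuation level \gamma. It is written for a single agent; in a graphical game each UAV would have its own value function defined on its local neighborhood error, which is what makes the resulting equations coupled.

```latex
% Assumed error dynamics (illustrative, not the paper's model)
\dot{e} = f(e) + g(e)\,u + k(e)\,d

% Cost minimized by the controller u and maximized by the attacker d
V(e(t)) = \int_t^{\infty} \left( e^{\top} Q e + u^{\top} R u - \gamma^{2} d^{\top} d \right) \mathrm{d}\tau

% Stationarity of the Hamiltonian gives the saddle-point policies
u^{*} = -\tfrac{1}{2} R^{-1} g^{\top}(e)\, \nabla V^{*}, \qquad
d^{*} = \tfrac{1}{2\gamma^{2}} k^{\top}(e)\, \nabla V^{*}

% Substituting u^* and d^* yields the HJI equation
0 = e^{\top} Q e + \nabla V^{*\top} f(e)
    - \tfrac{1}{4} \nabla V^{*\top} g R^{-1} g^{\top} \nabla V^{*}
    + \tfrac{1}{4\gamma^{2}} \nabla V^{*\top} k k^{\top} \nabla V^{*}
```

In a critic-only scheme of the kind the abstract mentions, a single network approximates V^{*}, and both policies follow from its gradient through the two stationarity expressions, so no separate actor network is required.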