Optimal Secure Tracking Control in Multi-UAVs Based on Online Reinforcement Learning (多无人机系统在线强化学习最优安全跟踪控制)

Abstract: In unmanned aerial vehicle (UAV) formation tracking missions, a false data injection (FDI) attacker can inject misleading data into the control commands and prevent the UAVs from forming the specified formation, so a secure formation tracking controller is needed. The attack-defense process is modeled as a zero-sum graphical game in which the FDI attacker and the secure controller are the players: the attacker aims to maximize a prescribed cost function, while the secure controller aims to minimize it. Solving the game and obtaining the optimal secure control policy require solving the Hamilton-Jacobi-Isaacs (HJI) equation. Because the HJI equation is a coupled partial differential equation that is difficult to solve directly, a finite-time convergent online reinforcement learning algorithm combined with an experience replay mechanism is introduced, and a critic-only neural network is designed to approximate the value function and obtain the optimal secure control policy. Simulation results verify the effectiveness of the algorithm.
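For readers unfamiliar with the terminology, the relation between a zero-sum game, the HJI equation, and a critic-only approximation can be sketched with the standard single-agent continuous-time formulation below. This is a generic textbook form, not the paper's model: the dynamics $f$, $g$, $k$, the weights $Q$, $R$, the attenuation level $\gamma$, the basis $\phi$, and the weight vector $\hat{W}$ are all assumed symbols introduced here for illustration. In the graphical game described in the abstract, each UAV would presumably have its own local formation tracking error and value function coupled to those of its neighbors.

```latex
% Assumed error dynamics: control input u (minimizer), injected attack d (maximizer)
\dot{x} = f(x) + g(x)\,u + k(x)\,d

% Assumed zero-sum cost: u minimizes, d maximizes
V(x(t)) = \int_{t}^{\infty} \left( x^{\top} Q x + u^{\top} R u - \gamma^{2} d^{\top} d \right) \mathrm{d}\tau

% Stationarity of the Hamiltonian gives the saddle-point policies
u^{*} = -\tfrac{1}{2} R^{-1} g^{\top}(x)\, \nabla V^{*}(x), \qquad
d^{*} = \tfrac{1}{2\gamma^{2}} k^{\top}(x)\, \nabla V^{*}(x)

% Substituting u^* and d^* yields the HJI equation
0 = x^{\top} Q x + \nabla V^{*\top} f
    - \tfrac{1}{4} \nabla V^{*\top} g R^{-1} g^{\top} \nabla V^{*}
    + \tfrac{1}{4\gamma^{2}} \nabla V^{*\top} k k^{\top} \nabla V^{*}

% Critic-only approximation: one network supplies the value and both policies
\hat{V}(x) = \hat{W}^{\top} \phi(x), \qquad
\hat{u} = -\tfrac{1}{2} R^{-1} g^{\top}(x)\, \nabla \phi^{\top}(x)\, \hat{W}
```

In a scheme of this kind, the critic weights $\hat{W}$ are typically tuned online to drive the HJI residual toward zero, and experience replay reuses stored samples in that update. The abstract does not give the paper's specific update law or its finite-time convergence mechanism, so neither is reproduced here.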

Authors: Gong Zhenyu (弓镇宇); Yang Feisheng (杨飞生) (Northwestern Polytechnical University, Xi'an 710072, China)

Affiliation: Northwestern Polytechnical University

Source: Aeronautical Science & Technology (《航空科学技术》), 2024, No. 4, pp. 25-30 (6 pages)

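Funding: National Natural Science Foundation of China (62073269); Aeronautical Science Foundation of China (2020Z034053002); Key Research and Development Program of Shaanxi Province (2022GY-244); Natural Science Foundation of Chongqing (CSTB2022NSCQ-MSX0963); Guangdong Basic and Applied Basic Research Foundation (2023A1515011220).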

Keywords: FDI attack; multi-UAVs; online reinforcement learning; optimal control; zero-sum graphical game

Classification code: V249.1 [Aerospace Science and Technology: Flight Vehicle Design]
