Robust observer-based deep reinforcement learning for attitude stabilization of vertical takeoff and landing vehicles
Abstract: This paper addresses attitude stabilization of a vertical takeoff and landing (VTOL) vehicle subject to elastic vibration and model-uncertainty disturbances. It combines a robust observer with the proximal policy optimization (PPO) algorithm from deep reinforcement learning into a robust observer-based proximal policy optimization (ROB-PPO) control method. The robust observer reconstructs the vehicle attitude information that is corrupted by elastic vibration. The observer and the vehicle dynamics model together form the environment, and the reconstructed attitude serves as the state of the deep reinforcement learning algorithm. The agent interacts with this environment repeatedly and is thereby trained to stabilize the vehicle attitude. Simulation results show that ROB-PPO is more robust and converges faster than the commonly used adaptive fuzzy proportional-integral-derivative (PID) algorithm. Finally, the effectiveness of the proposed algorithm is verified on a self-developed VTOL vehicle.
Authors: LI Yanling (李彦铃); LUO Feizhou (罗飞舟); GE Zhilei (葛致磊). Affiliations: School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, China; China Academy of Launch Vehicle Technology, Beijing 100076, China
Source: Systems Engineering and Electronics (《系统工程与电子技术》), indexed in EI, CSCD, and the Peking University Core Journals list; 2024, No. 3, pp. 1038-1047 (10 pages)
Keywords: vertical takeoff and landing vehicle; attitude control; robust observer; deep reinforcement learning
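
The structure described in the abstract, where the observer and the vehicle dynamics together form the environment and the agent sees only the reconstructed attitude, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the single-axis double-integrator plant, the sinusoidal stand-in for elastic vibration, the plain Luenberger-type observer used in place of the paper's robust observer, the gains, the reward, and the names `ObserverAttitudeEnv` and `ppo_clip_term`. The last function shows the standard per-sample PPO clipped surrogate term.

```python
import math
import random


class ObserverAttitudeEnv:
    """Single-axis toy environment: rigid-body plant plus observer.

    Hypothetical illustration only; the plant, observer, gains, and
    reward are assumptions, not the models used in the paper.
    """

    def __init__(self, dt=0.01, l1=20.0, l2=100.0,
                 vib_amp=0.02, vib_freq=30.0, seed=0):
        self.dt = dt
        self.l1, self.l2 = l1, l2    # assumed observer gains
        self.vib_amp = vib_amp       # assumed vibration amplitude, rad
        self.vib_freq = vib_freq     # assumed vibration frequency, rad/s
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.t = 0.0
        self.theta = self.rng.uniform(-0.1, 0.1)   # true attitude, rad
        self.omega = 0.0                           # true rate, rad/s
        self.theta_hat, self.omega_hat = 0.0, 0.0  # observer estimates
        return (self.theta_hat, self.omega_hat)

    def step(self, u):
        dt = self.dt
        # Plant: double integrator, theta_ddot = u (unit inertia assumed).
        self.theta += self.omega * dt
        self.omega += u * dt
        self.t += dt
        # Measurement corrupted by a sinusoid standing in for elastic vibration.
        y = self.theta + self.vib_amp * math.sin(self.vib_freq * self.t)
        # Luenberger-type observer update (stand-in for the robust observer).
        err = y - self.theta_hat
        self.theta_hat += (self.omega_hat + self.l1 * err) * dt
        self.omega_hat += (u + self.l2 * err) * dt
        # The agent observes only the reconstructed state.
        state = (self.theta_hat, self.omega_hat)
        reward = -(self.theta_hat ** 2 + 0.1 * self.omega_hat ** 2)
        return state, reward


def ppo_clip_term(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate: min(r*A, clip(r, 1-eps, 1+eps)*A)."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)
```

In a full training loop, a PPO policy network would map the state returned by `step` to the control input `u`, and the policy update would maximize the average of `ppo_clip_term` over sampled transitions. A fixed feedback law on the estimated state can be substituted for the policy to check that the environment behaves sensibly before any training.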