
Autonomous Control Algorithm for Quadrotor Based on Deep Reinforcement Learning (基于深度强化学习的四旋翼无人机自主控制方法) Cited by: 1
Abstract With the wide application of unmanned aerial vehicles (UAVs), UAV controller design has become an active research topic in recent years. Control algorithms widely used in UAVs today, such as PID and MPC, are constrained by difficult parameter tuning, complex model construction, and heavy computation. To address these problems, a UAV autonomous control method based on deep reinforcement learning is proposed. The method uses a neural network to fit the UAV controller and maps the UAV state directly to actuator (servo) outputs to control the vehicle's motion. Through repeated interaction with the environment during training, it yields a general-purpose UAV controller and avoids complex operations such as parameter tuning and model construction. To further improve the convergence speed and accuracy of the model, expert information is introduced on top of the Soft Actor-Critic (SAC) reinforcement learning algorithm, giving the ESAC algorithm, which guides the UAV's exploration of the environment and improves the usability and extensibility of the control policy. Finally, in UAV position control and trajectory tracking tasks, the ESAC controller is compared with a traditional PID controller and with controllers built using reinforcement learning algorithms such as SAC and DDPG. Experimental results show that the ESAC controller achieves control performance equal to or even better than that of the PID controller, and that it outperforms the SAC and DDPG controllers in stability and accuracy.
Authors LIANG Ji (梁吉); WANG Lisong (王立松); HUANG Yuzhou (黄昱洲); QIN Xiaolin (秦小麟) (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China)
Source Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2023, Issue S02, pp. 1-7 (7 pages)
Funding National Natural Science Foundation of China (61972198).
Keywords Reinforcement learning; Quadrotor UAV; Autonomous control; Expert policy
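
Illustrative sketch (not from the paper) The abstract describes two ideas: a neural network that maps the UAV state directly to actuator outputs, and expert information that guides exploration. The abstract does not say how ESAC injects the expert, so the Python sketch below is only one plausible illustration under stated assumptions. The 12-dimensional state layout, the four normalized rotor commands, the PD altitude "expert", the probability `beta`, its decay schedule, and every function name are hypothetical. The network uses random weights as a stand-in for a trained SAC actor.

```python
import math
import random

STATE_DIM = 12   # assumed layout: position, velocity, attitude, angular rate (3 each)
ACTION_DIM = 4   # assumed: one normalized command per rotor, in [-1, 1]


def init_policy(rng, hidden=16):
    """Random weights for a two-layer network (stand-in for a trained actor)."""
    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"w1": matrix(STATE_DIM, hidden), "w2": matrix(hidden, ACTION_DIM)}


def layer(x, w):
    """Compute tanh(x @ w) for a list-of-lists weight matrix (biases omitted)."""
    return [math.tanh(sum(xi * w[i][j] for i, xi in enumerate(x)))
            for j in range(len(w[0]))]


def policy_action(params, state):
    """Map the state directly to rotor commands, with no explicit dynamics model."""
    return layer(layer(state, params["w1"]), params["w2"])


def expert_action(state, target_z=1.0, kp=0.8, kd=0.4):
    """Hypothetical expert: PD altitude control, same command sent to all rotors."""
    z, vz = state[2], state[5]
    u = max(-1.0, min(1.0, kp * (target_z - z) - kd * vz))
    return [u] * ACTION_DIM


def guided_action(params, state, beta, rng):
    """With probability beta follow the expert, otherwise follow the policy."""
    if rng.random() < beta:
        return expert_action(state), True
    return policy_action(params, state), False


rng = random.Random(0)
params = init_policy(rng)
state = [0.1] * STATE_DIM
for step in range(5):
    beta = 0.9 * (0.5 ** step)  # assumed schedule: rely on the expert less over time
    action, from_expert = guided_action(params, state, beta, rng)
    print(step, from_expert, [round(a, 3) for a in action])
```

In a full training loop the chosen action would be applied to a simulator and the resulting transition stored for the SAC update. Those parts are omitted here because the abstract gives no details about them.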