Reinforcement learning based fractional gradient descent RBF neural network control of inverted pendulum (cited by: 2)
Abstract To improve the control performance of reinforcement learning, a reinforcement learning algorithm based on a fractional-order gradient descent RBF neural network is proposed. The reinforcement learning system consists of an evaluation (critic) neural network and an action (actor) neural network. It uses the memory and association capabilities of the neural networks to learn to control an inverted pendulum, improving control accuracy and driving the error toward zero until learning succeeds. The stability of the closed-loop system is proved. Physical experiments on the inverted pendulum show that when the fractional order is large, the differential effect is more pronounced and angular velocity and velocity are controlled better, so the mean square error and mean absolute error of angular velocity and velocity are smaller. When the fractional order is small, the integral effect is more pronounced and tilt angle and displacement are controlled better, so the mean square error and mean absolute error of tilt angle and displacement are smaller. Simulation results show that the proposed algorithm has good dynamic response, small overshoot, short settling time, high accuracy, and good generalization performance. It outperforms both the reinforcement learning algorithm based on the RBF neural network and the traditional reinforcement learning algorithm, effectively accelerating the convergence of gradient descent and improving control performance. After an appropriate disturbance is introduced, the proposed algorithm quickly self-adjusts and returns to a stable state. The robustness and dynamic performance of the controller meet practical requirements.
Authors XUE Han (薛晗); SHAO Zhe-ping (邵哲平); FANG Qiong-lin (方琼林); LIU Xiao-jia (刘晓佳) (Institute of Navigation, Jimei University, Xiamen 361021, China)
Source Control and Decision (《控制与决策》), indexed in EI, CSCD, and the Peking University Core Journals list; 2021, No. 1, pp. 125-134 (10 pages)
Funding National Natural Science Foundation of China (51579114); Natural Science Foundation of Fujian Province (2018J05085).
Keywords reinforcement learning; RBF neural network; inverted pendulum; fractional order; gradient descent; neural network control
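
The abstract does not give the update rule, so the following is only a minimal sketch of what a fractional-order gradient step on RBF output weights might look like. It is not the authors' algorithm. The step formula (a Caputo-style form that scales the gradient by |w_k - w_{k-1}|^(1-alpha) / Gamma(2-alpha)), the toy sine-fitting task, and all names (`rbf_features`, `fractional_step`, `train`, `mse_mae`) are assumptions made for illustration. The sketch is restricted to orders 0 < alpha <= 1, where alpha = 1 recovers ordinary gradient descent.

```python
import math
import numpy as np


def rbf_features(x, centers, width):
    """Gaussian RBF activations for scalar inputs; shape (n_samples, n_centers)."""
    diff = x[:, None] - centers[None, :]
    return np.exp(-(diff ** 2) / (2.0 * width ** 2))


def fractional_step(w, w_prev, grad, lr, alpha, eps=1e-8):
    """One fractional-order gradient step (assumed form, not taken from the paper).

    w_next = w - lr * grad * |w - w_prev|^(1 - alpha) / Gamma(2 - alpha)
    For alpha = 1 the scale factor is exactly 1 (plain gradient descent).
    """
    scale = (np.abs(w - w_prev) + eps) ** (1.0 - alpha) / math.gamma(2.0 - alpha)
    return w - lr * grad * scale


def mse_mae(err):
    """Mean square error and mean absolute error, the two metrics named in the abstract."""
    return float(np.mean(err ** 2)), float(np.mean(np.abs(err)))


def train(alpha, steps=200, lr=0.2, seed=0):
    """Fit the output weights of an RBF network to a toy target (hypothetical task)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, 50)
    y = np.sin(np.pi * x)                      # toy target, not pendulum data
    centers = np.linspace(-1.0, 1.0, 10)
    phi = rbf_features(x, centers, width=0.3)

    w_prev = np.zeros(centers.size)
    w = rng.normal(scale=0.1, size=centers.size)
    losses = []
    for _ in range(steps):
        err = phi @ w - y
        losses.append(mse_mae(err)[0])
        grad = 2.0 * phi.T @ err / x.size      # gradient of the MSE w.r.t. w
        w, w_prev = fractional_step(w, w_prev, grad, lr, alpha), w
    return w, losses


if __name__ == "__main__":
    for alpha in (1.0, 0.9):
        _, losses = train(alpha)
        print(f"alpha={alpha}: first MSE={losses[0]:.4f}, last MSE={losses[-1]:.4f}")
```

In an actor-critic setting like the one the abstract describes, a step of this kind would replace the ordinary gradient update in both the evaluation and action networks. How the order is chosen and how stability is guaranteed are covered by the paper itself, not by this sketch.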
