Abstract
To address the long planned trajectories, slow travel speed, and poor robustness of dynamic obstacle avoidance for mobile robots in unknown environments, a dynamic obstacle avoidance method based on improved reinforcement learning is proposed. The mobile robot derives its action signal directly from its own velocity, the target position, and lidar data, achieving end-to-end control. Distance-gradient and angle-gradient guidance steer the robot's optimization toward the goal and accelerate convergence of the algorithm. A convolutional neural network extracts high-quality features from the multi-dimensional observation data, improving policy training. Simulation results show that, in environments with multiple dynamic obstacles, the proposed method increases training speed by 40%, shortens trajectory length by more than 2.69%, and raises average linear velocity by more than 11.87%. Compared with existing mainstream obstacle avoidance methods, it offers shorter planned trajectories, faster travel, and stable performance, and it enables smooth obstacle avoidance for mobile robots in multi-obstacle environments.
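The abstract does not give the reward formulation, so the following is only a minimal sketch of what distance-gradient and angle-gradient guidance could look like as reward-shaping terms. The function names (`heading_error`, `shaped_reward`), the gains `k_dist` and `k_angle`, and the use of per-step differences are all assumptions made for illustration, not the authors' definitions.

```python
import math


def heading_error(robot_xy, robot_yaw, goal_xy):
    """Signed angle (rad) between the robot's heading and the bearing to the goal.

    Hypothetical helper: wraps the result into [-pi, pi].
    """
    dx = goal_xy[0] - robot_xy[0]
    dy = goal_xy[1] - robot_xy[1]
    bearing = math.atan2(dy, dx)
    err = bearing - robot_yaw
    return math.atan2(math.sin(err), math.cos(err))


def shaped_reward(prev_dist, dist, prev_heading_err, heading_err,
                  k_dist=1.0, k_angle=0.5):
    """Assumed shaping reward: positive when the robot moves closer to the
    goal (distance term) and turns toward it (angle term).

    k_dist and k_angle are illustrative gains, not values from the paper.
    """
    distance_term = k_dist * (prev_dist - dist)
    angle_term = k_angle * (abs(prev_heading_err) - abs(heading_err))
    return distance_term + angle_term


if __name__ == "__main__":
    # One step in which the robot gets 0.5 m closer and reduces its
    # heading error from 0.4 rad to 0.2 rad.
    print(shaped_reward(2.0, 1.5, 0.4, 0.2))
```

In a sketch like this, both terms vanish when the robot makes no progress, so the shaping signal only rewards motion that reduces distance or heading error relative to the goal; collision and arrival rewards would be separate terms.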
Authors
徐建华
邵康康
王佳惠
刘学聪
XU Jianhua; SHAO Kangkang; WANG Jiahui; LIU Xuecong (School of Automation, Beijing Institute of Technology, Beijing 100081, China)
Source
《中国惯性技术学报》
EI
CSCD
Peking University Core Journal (北大核心)
2023, No. 1, pp. 92-99 (8 pages)
Journal of Chinese Inertial Technology
Funding
Major Basic Research Project on Equipment (5140502A03).
Keywords
mobile robot
dynamic obstacle avoidance
reinforcement learning
soft actor-critic algorithm
convolutional neural network