Abstract
Tracked robots with flippers can adapt to varied terrain, and autonomous flipper control is important for raising the level of intelligent operation of such robots in complex environments. Drawing on expert obstacle-crossing knowledge and technical indicators, the flipper control problem is modeled as a Markov decision process (MDP), and a simulation environment for obstacle-crossing training is built on the physics simulation engine Pymunk. A deep reinforcement learning flipper control algorithm based on the D3QN (dueling double DQN) network model is proposed. It takes terrain information and the robot state as input and outputs the rotation angles of the robot's four front and rear flippers, enabling self-learned flipper control on challenging terrain. The learned control policy is compared with manual operation in the Gazebo 3D simulation environment. The results show that the proposed algorithm gives the flippers an adaptive adjustment ability and lets the robot traverse complex terrain more efficiently than manual operation.
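The two ideas behind the D3QN name can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the network outputs are stood in for by plain lists, and it assumes the flipper-angle commands are encoded as a small set of discrete action indices. The first function shows the commonly used mean-subtracted dueling aggregation of a state value and per-action advantages; the second shows the double DQN target, where the online network selects the next action and the target network evaluates it.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    done: bool,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
) -> float:
    """Double DQN target for one transition.

    q_online_next and q_target_next are the Q-values of the next state
    from the online and target networks (hypothetical inputs here).
    """
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    # The online network picks the greedy action ...
    best_action = max(range(len(q_online_next)), key=lambda i: q_online_next[i])
    # ... and the target network supplies the value of that action.
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # Each index is assumed to stand for one discrete flipper-angle command.
    print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))
    print(double_dqn_target(1.0, 0.5, False, [0.1, 0.9, 0.3], [4.0, 2.0, 8.0]))
```

Separating action selection from action evaluation in the target is what distinguishes double DQN from plain DQN, which would take the maximum of the target network's values directly.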
Authors
潘海南
陈柏良
黄开宏
任君凯
程创
卢惠民
张辉
Pan Hainan; Chen Bailiang; Huang Kaihong; Ren Junkai; Cheng Chuang; Lu Huimin; Zhang Hui (College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China)
Source
《系统仿真学报》
CAS
CSCD
PKU Core (北大核心)
2024, No. 2, pp. 405-414 (10 pages)
Journal of System Simulation
Funding
Key Program of the Joint Funds of the National Natural Science Foundation of China (U1813205, U1913202).