Abstract
This paper studies a robot hole-searching strategy for the hole-searching operation in robotic automatic charging tasks, based on the soft actor-critic (SAC) deep reinforcement learning algorithm. A policy controller is designed within the actor-critic framework; it takes the pose of the charging-gun head and the contact force as input, and outputs an action in the XY plane of the gun-head coordinate frame at the robot end. The policy controller comprises five neural networks: one actor network, two target critic networks, and two critic networks. The actor network outputs the hole-searching action, the target critic networks evaluate the value of the hole-searching action in the next state, and the critic networks evaluate the value of the hole-searching action in the current state. Following the double-Q trick, the smaller of the two target critic outputs is used to update the critic networks, and the smaller of the two critic outputs is used to update the actor network, thereby training the policy controller. A hybrid force/position control structure converts the XY-plane displacement action output by the actor network into a desired translational velocity, which is combined with the desired velocity output by the Z-axis force-tracking admittance controller to form the desired robot velocity that guides the charging gun to the hole. Simulations and experiments verify the effectiveness of the proposed method.
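The two mechanisms named in the abstract, the double-Q minimum and the composition of the XY action with the Z-axis admittance output, can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function and class names, the discount factor `gamma`, the entropy coefficient `alpha`, the first-order admittance law, its sign convention, and the mass and damping values are all assumptions made for the example.

```python
from dataclasses import dataclass


def soft_td_target(reward, done, q1_next, q2_next, logp_next,
                   gamma=0.99, alpha=0.2):
    """Critic target built from the smaller of the two target-critic values.

    gamma and alpha are assumed hyperparameters, not values from the paper.
    """
    min_q_next = min(q1_next, q2_next)            # double-Q trick
    soft_value = min_q_next - alpha * logp_next   # entropy-regularised value
    return reward + gamma * (1.0 - float(done)) * soft_value


def actor_loss(q1, q2, logp, alpha=0.2):
    """Per-sample actor loss built from the smaller of the two critic values."""
    return alpha * logp - min(q1, q2)


@dataclass
class ZAxisAdmittance:
    """Assumed admittance law: M * dv/dt + B * v = f_desired - f_measured."""
    mass: float = 1.0
    damping: float = 50.0
    velocity: float = 0.0

    def step(self, f_desired, f_measured, dt):
        # Explicit Euler integration of the assumed first-order dynamics.
        accel = (f_desired - f_measured - self.damping * self.velocity) / self.mass
        self.velocity += accel * dt
        return self.velocity


def compose_velocity(dx, dy, vz, dt):
    """Convert the actor's XY displacement into a velocity and append the
    Z-axis velocity from the admittance controller."""
    return (dx / dt, dy / dt, vz)


if __name__ == "__main__":
    admittance = ZAxisAdmittance()
    vz = admittance.step(f_desired=10.0, f_measured=6.0, dt=0.01)
    print(compose_velocity(dx=0.002, dy=-0.001, vz=vz, dt=0.01))
    print(soft_td_target(1.0, False, 2.0, 3.0, logp_next=-1.0))
```

In a full SAC implementation the two losses would be averaged over a replay-buffer minibatch and the target critics would track the critics through soft updates; those steps are omitted here.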
Authors
XU Jianming (徐建明); CHEN Fu (陈阜); DONG Jianwei (董建伟)
College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023
Source
《高技术通讯》
CAS
2023, No. 1, pp. 63-71 (9 pages)
Chinese High Technology Letters
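Funding
Supported by the NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization (U1709213)
and the General Program of the National Natural Science Foundation of China (61374103).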