Adaptive walking control for bipedal robots based on an improved PPO algorithm


Abstract: To address unstable gait when a bipedal robot walks in an unknown environment, a control method based on proximal policy optimization (PPO) is proposed. First, an action network and a value network are constructed, and long short-term memory (LSTM) is introduced to reduce the deviation between the estimated and expected state values when the robot interacts with the unknown environment. Second, an attention mechanism is introduced into the action network to adaptively adjust the weight coefficients learned by the neural network, which improves learning efficiency and yields a stable gait adapted to different environments. Finally, simulation experiments verify the effectiveness of the proposed algorithm. The results show that the improved PPO algorithm converges faster, learns more efficiently, and effectively improves the stability of adaptive walking for bipedal robots.
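For orientation, the two generic building blocks named in the abstract can be sketched as follows. This is a minimal illustration of the standard PPO clipped surrogate objective and a softmax attention weighting, not the authors' implementation: the abstract gives no network sizes, reward design, or hyperparameters, so the function names, the clip range of 0.2, and the sample numbers are all assumptions made for the example.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps=0.2 is an assumed,
    commonly used value, not one taken from the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio r_t
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def softmax_attention(scores):
    """Turn raw relevance scores into attention weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the functions.
    obj = ppo_clip_objective([math.log(2.0)], [0.0], [1.0])
    print(round(obj, 6))
    print(softmax_attention([0.0, 0.0]))
```

With a positive advantage, a probability ratio above 1 + clip_eps earns no extra objective value, which is what keeps each policy update close to the policy that collected the data. How the paper combines the LSTM state estimate and the attention weights inside the action network is not specified in the abstract.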

Authors: WU Wanyi (吴万毅), LIU Fanghua (刘芳华), GUO Wenlong (郭文龙) (School of Mechanical Engineering, Jiangsu University of Science and Technology, Zhenjiang 212000, China)

Affiliation: School of Mechanical Engineering, Jiangsu University of Science and Technology (江苏科技大学机械工程学院)

Source: Journal of Yangzhou University: Natural Science Edition (《扬州大学学报（自然科学版）》), indexed in CAS and the Peking University Core Journals list, 2023, No. 6, pp. 44-50 (7 pages)

Funding: National Natural Science Foundation of China (62002141).

Keywords: proximal policy optimization algorithm; long short-term memory; attention mechanism; biped walking robot; neural network

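Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]; TP242.6 [Automation and Computer Technology: Control Science and Engineering]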



