Online optimal control of nonlinear discrete-time systems using approximate dynamic programming (Cited by: 4)

Abstract: In this paper, the optimal control of a class of general affine nonlinear discrete-time (DT) systems is undertaken by solving the Hamilton-Jacobi-Bellman (HJB) equation online and forward in time. The proposed approach, commonly referred to as adaptive or approximate dynamic programming (ADP), uses online approximators (OLAs) to solve the infinite-horizon optimal regulation and tracking control problems for affine nonlinear DT systems in the presence of unknown internal dynamics. Both the regulation and tracking controllers are designed using OLAs to obtain the optimal feedback control signal and its associated cost function. Additionally, the tracking controller design entails a feedforward portion that is derived and approximated using an additional OLA for steady-state conditions. Novel update laws for tuning the unknown parameters of the OLAs online are derived. Lyapunov techniques are used to show that all signals are uniformly ultimately bounded and that the approximated control signals approach the optimal control inputs with small bounded error. In the absence of OLA reconstruction errors, an optimal control is demonstrated. Simulation results verify that all OLA parameter estimates remain bounded, and the proposed OLA-based optimal control scheme tunes itself to reduce the cost HJB equation.
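
The abstract does not state the underlying equations, so the following is only a sketch of the standard infinite-horizon regulation problem for an affine nonlinear DT system, which is the setting the abstract describes. All symbols ($x_k$, $u_k$, $f$, $g$, $Q$, $R$, $J$, $\hat{\Theta}_k$, $\phi$) are assumed notation and are not taken from this record; the linear-in-parameters approximator in the last line is one common choice shown for illustration, not necessarily the authors' exact structure.

```latex
% Affine nonlinear discrete-time system; f is the (unknown) internal dynamics
x_{k+1} = f(x_k) + g(x_k)\,u_k

% Infinite-horizon cost with Q(x) positive definite and R symmetric positive definite
J(x_k) = \sum_{i=k}^{\infty} \left( Q(x_i) + u_i^{\top} R\, u_i \right)

% Discrete-time HJB (Bellman optimality) equation
J^{*}(x_k) = \min_{u_k} \left[ Q(x_k) + u_k^{\top} R\, u_k + J^{*}(x_{k+1}) \right]

% Stationarity in u_k: 2 R u_k + g(x_k)^T dJ*/dx_{k+1} = 0, which gives
u_k^{*} = -\tfrac{1}{2}\, R^{-1} g(x_k)^{\top}
          \frac{\partial J^{*}(x_{k+1})}{\partial x_{k+1}}

% Illustrative online approximator of the cost (assumed form)
\hat{J}(x_k) = \hat{\Theta}_k^{\top} \phi(x_k)
```

Under these assumptions, the optimal control depends on the gradient of the optimal cost at the next state, which is why the cost function and the control signal are each approximated and tuned online rather than computed backward in time.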

Authors: Travis DIERKS, Sarangapani JAGANNATHAN

Affiliations: DRS Sustainment Systems; Department of Electrical and Computer Engineering

Source: Control Theory and Applications (English Edition) (《控制理论与应用（英文版）》), EI-indexed, 2011, No. 3, pp. 361-369 (9 pages)

Funding: Partly supported by the National Science Foundation (No. ECCS#0621924, ECCS-#0901562) and the Intelligent Systems Center

Keywords: Online nonlinear optimal control; Hamilton-Jacobi-Bellman; Online approximators; Discrete-time systems

Classification: TP13 [Automation and Computer Technology: Control Theory and Control Engineering]
