Abstract: To meet ever-growing energy demand while reducing environmental damage, energy conservation has become a long-term strategic priority for global economic and social development, and stronger energy management can raise energy-use efficiency and promote energy saving and emission reduction. However, the integration of renewable energy sources and flexible loads has turned the integrated energy system (IES) into a complex dynamic system with high uncertainty, posing major challenges for modern energy management. Reinforcement learning (RL), a typical interactive trial-and-error learning method, is well suited to optimization problems in uncertain, complex dynamic systems and has therefore attracted wide attention for IES management. This paper systematically reviews existing work on applying RL to IES management at both the model and algorithm levels, and offers an outlook from four perspectives: multi-time-scale characteristics, interpretability, transferability, and information security.
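As a concrete illustration of the trial-and-error idea surveyed above, here is a minimal tabular Q-learning sketch for a hypothetical battery-arbitrage toy: electricity price alternates between a cheap and an expensive hour, and the agent learns to charge when cheap and discharge when expensive. All states, prices, and parameters are invented for illustration and are not taken from the surveyed work.

```python
import random

PRICES = {0: 1.0, 1: 5.0}      # hypothetical cheap (even) / expensive (odd) hours
ACTIONS = (-1, 0, 1)           # discharge / idle / charge
random.seed(0)

def step(parity, soc, a):
    """Toy environment: battery state of charge in {0, 1, 2}."""
    a = max(-soc, min(2 - soc, a))   # clip action to battery limits
    reward = -a * PRICES[parity]     # pay to charge, earn by discharging
    return (1 - parity, soc + a), reward

Q = {(p, s): {a: 0.0 for a in ACTIONS} for p in (0, 1) for s in (0, 1, 2)}
alpha, gamma, eps = 0.1, 0.9, 0.2

for episode in range(3000):
    state = (0, 1)
    for t in range(24):              # one hypothetical day per episode
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(Q[state], key=Q[state].get)
        nxt, r = step(*state, a)
        Q[state][a] += alpha * (r + gamma * max(Q[nxt].values()) - Q[state][a])
        state = nxt

policy = {s: max(Q[s], key=Q[s].get) for s in Q}
print(policy[(0, 0)], policy[(1, 2)])   # greedy actions: (cheap, empty) and (expensive, full)
```

The learned greedy policy charges (+1) when the price is low and the battery is empty, and discharges (-1) when the price is high and the battery is full, which is the arbitrage behavior one would expect.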
Funding: Supported in part by the National Natural Science Foundation of China (61533019, 91720000), the Beijing Municipal Science and Technology Commission (Z181100008918007), and the Intel Collaborative Research Institute for Intelligent and Automated Connected Vehicles ("ICRI-IACV").
Abstract: As a complex and critical cyber-physical system (CPS), the hybrid electric powertrain is significant for mitigating air pollution and improving fuel economy. The energy management strategy (EMS) plays a key role in improving the energy efficiency of this CPS. This paper presents a novel bidirectional long short-term memory (LSTM) network based parallel reinforcement learning (PRL) approach to construct the EMS for a hybrid tracked vehicle (HTV). The method contains two levels. The high level first establishes a parallel system, which includes a real powertrain system and an artificial system; a bidirectional LSTM network is then trained on the synthesized data from this parallel system. The lower level determines the optimal EMS using the trained action-state function in a model-free reinforcement learning (RL) framework. PRL is a fully data-driven, learning-enabled approach that does not depend on any prediction or predefined rules. Finally, real vehicle testing is implemented, and the relevant experimental data are collected and calibrated. Experimental results validate that the proposed EMS achieves considerable energy efficiency improvement compared with the conventional RL and deep RL approaches.
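The parallel-system idea can be caricatured in a few lines of Dyna-style Q-learning: a table of observed transitions stands in for the paper's artificial system (and its bidirectional LSTM data synthesis), and extra "planning" updates are drawn from that table alongside updates from the real environment. Everything below, including the surrogate powertrain, its costs, and all parameters, is an invented toy, not the authors' method.

```python
import random
random.seed(1)

def real_env(soc, a):
    """Hypothetical powertrain surrogate: SOC in {0..4}.
    a=1 meets demand from the battery (free); a=0 runs the engine,
    which burns fuel but regenerates one unit of charge (capped at 4)."""
    if a == 1:
        if soc > 0:
            return soc - 1, 0.0      # battery covers demand, no fuel burned
        return 0, -1.0               # battery empty: engine must cover demand
    return min(4, soc + 1), -1.0     # engine: fuel cost, regenerates charge

Q = [[0.0, 0.0] for _ in range(5)]
model = {}                           # learned transition table: the "artificial system"
alpha, gamma, eps, n_plan = 0.1, 0.95, 0.1, 20

for step_i in range(8000):
    soc = random.randrange(5)        # sample a state (instead of a real rollout)
    if random.random() < eps:
        a = random.randrange(2)
    else:
        a = max((0, 1), key=lambda x: Q[soc][x])
    nxt, r = real_env(soc, a)
    model[(soc, a)] = (nxt, r)       # refine the artificial system from real data
    Q[soc][a] += alpha * (r + gamma * max(Q[nxt]) - Q[soc][a])
    for _ in range(n_plan):          # extra updates from synthesized transitions
        (s, b), (s2, r2) = random.choice(list(model.items()))
        Q[s][b] += alpha * (r2 + gamma * max(Q[s2]) - Q[s][b])

policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(5)]
print(policy)
```

The learned policy runs the engine only when the battery is empty and otherwise draws on stored charge, which is the cheapest long-run behavior in this toy.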
Funding: Supported by the National Natural Science Foundation of China (61973228, 61973330).
Abstract: In this paper, we present an optimal neuro-control scheme for continuous-time (CT) nonlinear systems with asymmetric input constraints. Initially, we introduce a discounted cost function for the CT nonlinear systems in order to handle the asymmetric input constraints. Then, we develop the Hamilton-Jacobi-Bellman equation (HJBE) that arises in the discounted-cost optimal control problem. To obtain the optimal neuro-controller, we utilize a critic neural network (CNN) to solve the HJBE within the reinforcement learning framework. The CNN's weight vector is tuned via gradient descent. Based on the Lyapunov method, we prove that uniform ultimate boundedness of the CNN's weight vector and of the closed-loop system is guaranteed. Finally, we verify the effectiveness of the presented optimal neuro-control strategy through simulations of two examples.
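To make the critic-tuning step concrete, the sketch below applies gradient descent to the squared HJBE residual for a scalar linear system dx/dt = -x + u with running cost x^2 + u^2, whose optimal value function is known analytically: V*(x) = (sqrt(2) - 1) x^2 from the scalar Riccati equation. The paper treats discounted costs, asymmetric input constraints, and general nonlinear dynamics; this undiscounted linear toy only illustrates the weight-update mechanics.

```python
def hjb_residual(w, x):
    """HJB residual for a one-parameter critic V_w(x) = w*x^2 on dx/dt = -x + u
    with cost x^2 + u^2; the minimizing control is u = -V_w'(x)/2 = -w*x."""
    u = -w * x
    dV = 2.0 * w * x                     # V_w'(x)
    return x * x + u * u + dV * (-x + u)

w, lr = 0.0, 0.01                        # critic weight, learning rate
for sweep in range(2000):
    for x in (0.5, 1.0):                 # sampled states
        e = hjb_residual(w, x)
        de_dw = x * x * (-2.0 - 2.0 * w) # d(residual)/dw, since e = x^2 (1 - 2w - w^2)
        w -= lr * e * de_dw              # gradient step on (1/2) e^2

print(round(w, 4))                       # converges near sqrt(2) - 1 ≈ 0.4142
```

The weight converges to the Riccati solution, confirming that minimizing the squared HJBE residual recovers the optimal critic in this simple case; the paper's scheme does the same with a multi-weight neural network and a Lyapunov-based boundedness proof.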