Abstract
Energy management strategy is one of the key technologies for hybrid electric vehicles. As computing power and hardware continue to improve, a growing number of researchers are studying learning-based energy management strategies. In reinforcement-learning-based energy management for hybrid electric vehicles, the reward function determines the direction of the interaction between the agent and the environment. However, most current reward functions are designed subjectively or from experience, which makes it difficult to describe the expert's intent objectively, so there is no guarantee that the agent learns the optimal driving strategy under a given reward function. To address these problems, this paper proposes an energy management strategy based on inverse reinforcement learning: the reward function weights are obtained from expert trajectories by inverse reinforcement learning and are used to guide the behavior of the engine agent and the battery agent. The revised weights are then fed back into forward reinforcement learning training. Fuel consumption, SOC curves, the reward training process, and power-source torque are used to verify the accuracy of the weights and their advantage in fuel saving. Overall, the algorithm improves fuel-saving performance by 5% to 10%.
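The idea of recovering reward weights from expert trajectories can be illustrated with a minimal sketch of maximum entropy inverse reinforcement learning. It assumes a linear reward over trajectory features and a small finite set of candidate trajectories; the function name, the two features (negative fuel use, negative SOC deviation), and all numbers are hypothetical and are not taken from the paper.

```python
import math

def maxent_irl_weights(expert_features, candidate_features, lr=0.1, steps=500):
    """Fit linear reward weights w so that P(traj) ~ exp(w . phi(traj)),
    defined over a finite candidate set, matches the expert's mean features.
    Illustrative only: a real setup would sample trajectories from a vehicle model."""
    dim = len(expert_features[0])
    # Mean feature vector of the expert demonstrations.
    mu_expert = [sum(f[k] for f in expert_features) / len(expert_features)
                 for k in range(dim)]
    w = [0.0] * dim
    for _ in range(steps):
        # Linear reward of every candidate trajectory under the current weights.
        scores = [sum(wk * fk for wk, fk in zip(w, f)) for f in candidate_features]
        # Softmax over candidates (shifted by the max for numerical stability).
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        probs = [e / z for e in exps]
        # Expected feature vector under the current maximum entropy distribution.
        mu_model = [sum(p * f[k] for p, f in zip(probs, candidate_features))
                    for k in range(dim)]
        # Gradient of the expert log-likelihood: expert mean minus model mean.
        w = [wk + lr * (me - mm) for wk, me, mm in zip(w, mu_expert, mu_model)]
    return w

# Hypothetical features per trajectory: (-fuel use, -SOC deviation).
candidates = [(-1.0, -0.2), (-0.6, -0.1), (-0.8, -0.5)]
expert = [(-0.6, -0.1)]  # the expert behaves like the second candidate

w = maxent_irl_weights(expert, candidates)
rewards = [sum(wk * fk for wk, fk in zip(w, f)) for f in candidates]
best = rewards.index(max(rewards))
print("learned weights:", w)
print("highest-reward candidate index:", best)
```

In this toy setting the learned weights could then be plugged into the reward of a forward reinforcement learning agent, which is the role the abstract describes for the recovered weights.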
Authors
Qi Chunyang (齐春阳)
Song Chuanxue (宋传学)
Song Shixin (宋世欣)
Jin Liqiang (靳立强)
Wang Da (王达)
Xiao Feng (肖峰)
Affiliations: State Key Laboratory of Automotive Simulation and Control, Jilin University, Changchun 130022; College of Automotive Engineering, Jilin University, Changchun 130022; School of Mechanical and Aerospace Engineering, Jilin University, Changchun 130022
Source
Automotive Engineering (《汽车工程》)
EI
CSCD
PKU Core (北大核心)
2023, No. 10, pp. 1954-1964, 1974 (12 pages)
Funding
Supported by the National Key Research and Development Program of China (2021YFB2500704).
Keywords
hybrid electric vehicle
maximum entropy inverse reinforcement learning
energy management strategy
forward reinforcement learning