Abstract
To address energy-efficient operation in urban rail transit, an energy-saving train control strategy based on the Sarsa reinforcement learning algorithm is proposed. Under automatic driving and varying line conditions, the strategy reduces energy consumption while maintaining punctuality and comfort. The train state is discretized according to the line conditions, and the continuous driving process is divided into several subintervals that are solved segment by segment. Reward functions based on energy consumption and on running time are constructed subject to constraints such as section speed limits, the initial state, and the terminal state. In addition, the set of selectable speeds is bounded by the maximum and minimum speeds reachable from the current state, which shrinks the exploration space and speeds up convergence. The method is simulated on the Xiaohongmen to Xiaocun section of the Yizhuang urban rail line in Beijing. The results show that, compared with the traditional dynamic programming method, the Sarsa algorithm saves 9.32% energy while meeting comfort and punctuality requirements. Compared with the Q-learning algorithm, the number of overspeed events during speed selection also drops markedly. These results indicate that the Sarsa algorithm offers better energy savings and safety. With the algorithm parameters unchanged and the speed limits adjusted, a second comparison with dynamic programming still shows a 4.21% energy saving, which supports the robustness of the algorithm.
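The two mechanisms named in the abstract, bounding the selectable speeds by what is reachable from the current state and learning with an on-policy Sarsa update, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the paper: the function names, the constant acceleration bound `a_max`, the fixed segment length `ds`, the speed grid, and the `(segment index, speed)` state encoding are all hypothetical, and the reward is left as an input because the paper's reward functions are not given in this record.

```python
import math
import random
from collections import defaultdict


def reachable_speeds(v, speed_grid, a_max, ds, v_limit):
    """Grid speeds reachable at the end of a segment of length ds from speed v.

    Assumption: |acceleration| <= a_max over the segment, so the end speed
    satisfies v_end^2 = v^2 +/- 2 * a_max * ds, capped by the speed limit.
    """
    v_hi = min(math.sqrt(v * v + 2.0 * a_max * ds), v_limit)
    v_lo = math.sqrt(max(v * v - 2.0 * a_max * ds, 0.0))
    return [u for u in speed_grid if v_lo <= u <= v_hi]


def epsilon_greedy(Q, state, actions, eps, rng):
    """Pick a random allowed action with probability eps, else the greedy one."""
    if rng.random() < eps:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy TD update: the target uses the action actually chosen in s_next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


# Hypothetical usage: state = (segment index, speed in m/s).
speed_grid = [0.0, 5.0, 10.0, 15.0, 20.0]
Q = defaultdict(float)
rng = random.Random(0)
state = (0, 0.0)
actions = reachable_speeds(state[1], speed_grid, a_max=1.0, ds=100.0, v_limit=20.0)
action = epsilon_greedy(Q, state, actions, eps=0.1, rng=rng)
```

Restricting `actions` before the epsilon-greedy choice means exploration never proposes a speed the train cannot physically reach within the segment, which is one plausible reading of how the exploration space is reduced. Unlike Q-learning, the update uses the next action the policy actually selects and not the maximum over all next actions.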
Authors
MENG Jianjun (孟建军)
JIANG Xiaoyi (蒋小一)
CHEN Xiaoqiang (陈晓强)
XU Ruxun (胥如迅)
Affiliations: Mechatronics T&R Institute, Lanzhou Jiaotong University, Lanzhou 730070, China; Gansu Provincial Industry Technology Center of Logistics and Transport Equipment, Lanzhou 730070, China; Gansu Provincial Engineering Technology Center for Informatization of Logistics and Transport Equipment, Lanzhou 730070, China; School of Mechanical Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
Source
Railway Standard Design (《铁道标准设计》)
Indexed in the Peking University Core Journals list (北大核心)
2024, Issue 8, pp. 8-14 (7 pages)
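Funding
National Natural Science Foundation of China (62063013)
Lanzhou Jiaotong University Youth Fund Project (2021018)
Gansu Province Outstanding Graduate Student "Innovation Star" Project (2022CXZX-517)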
Keywords
urban rail transit
energy saving
reinforcement learning
Sarsa algorithm
control strategy