
Path Planning of Unmanned Delivery Vehicle Based on Improved Q-learning Algorithm (Cited by: 2)
Abstract: When applied to unmanned-vehicle path planning, the traditional Q-learning algorithm suffers from low planning efficiency and slow convergence. To address these problems, a path-planning algorithm for unmanned logistics delivery vehicles based on an improved Q-learning algorithm is proposed. Drawing on the energy-iteration principle of simulated annealing, the greedy factor ε is adjusted so that it changes dynamically during training, balancing exploration and exploitation and thereby improving planning efficiency. The reward in the reward mechanism is changed from a discrete value to a continuous one that increases as the Euclidean distance between the delivery vehicle and the target point decreases, so that the target point pulls the vehicle toward it and accelerates the algorithm's convergence. Simulation experiments on the improved Q-learning algorithm in two different environments show that it efficiently plans a path from the start point to the target point in 34 steps, with better path quality than the comparison algorithms. Changing the road environment verifies the adaptability of the improved Q-learning algorithm to different environments: its planning efficiency and convergence speed remain superior to those of the traditional Q-learning algorithm.
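The ε-adjustment described in the abstract can be sketched as an annealing-style schedule. This is a minimal illustration, not the paper's exact formula: the start/end values and decay rate below are assumed constants, and `annealed_epsilon` is a hypothetical helper name.

```python
import math

def annealed_epsilon(episode, eps_start=0.9, eps_end=0.05, decay=0.01):
    """Greedy factor that decays with the episode index, mimicking the
    energy (temperature) iteration of simulated annealing: explore heavily
    in early episodes, then increasingly exploit the learned Q-values."""
    return eps_end + (eps_start - eps_end) * math.exp(-decay * episode)

# Early in training the agent mostly explores; late in training it mostly
# exploits, which is the exploration/exploitation balance the paper targets.
print(annealed_epsilon(0))      # high epsilon at the start
print(annealed_epsilon(1000))   # approaches eps_end after many episodes
```

An agent would compare a uniform random draw against this ε each step: below ε, take a random action; otherwise take the greedy (max-Q) action.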
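The continuous reward can likewise be sketched as a function of Euclidean distance to the goal. This is an assumed shaping, not the paper's published reward values: the terminal rewards (±100) and the normalization by `max_dist` are illustrative choices.

```python
import math

def shaped_reward(pos, goal, hit_obstacle, reached_goal, max_dist):
    """Continuous reward that grows as the Euclidean distance between the
    delivery vehicle and the target point shrinks, so the goal 'pulls' the
    agent; terminal bonuses/penalties handle goal and collision states."""
    if reached_goal:
        return 100.0          # illustrative terminal bonus
    if hit_obstacle:
        return -100.0         # illustrative collision penalty
    d = math.hypot(goal[0] - pos[0], goal[1] - pos[1])
    return 1.0 - d / max_dist  # larger reward when closer to the goal
```

With a purely discrete reward (e.g. nonzero only at the goal), most updates carry no gradient toward the target; this distance-based signal gives every step useful feedback, which is what speeds up convergence.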
Authors: Wang Xiaokang; Ji Jie; Liu Yang; He Qing (College of Engineering and Technology, Southwest University, Chongqing 400715, China)
Source: Journal of System Simulation, 2024, No. 5, pp. 1211-1221 (11 pages); indexed in CAS, CSCD, and the PKU Core list.
Funding: Chongqing Science and Technology Bureau Key R&D Program in Agriculture and Rural Areas (cstc2021jscx-gksbX0003); Chongqing Municipal Education Commission Science and Technology Research Project (KJZDM202201302); Chongqing Postdoctoral Research Project (2021XM3070).
Keywords: Q-learning; path planning; convergence speed; planning efficiency; path quality

References (10)


  1. Zhao Ming, Zheng Zeyu, Yao Qingfeng, Pan Yijun, Liu Zhi. Mobile robot path planning method based on improved artificial potential field [J]. Application Research of Computers, 2020, 37(S02): 66-68.
  2. Zhao Zhenming, Meng Zhengda. Path planning for service robots based on weighted A* algorithm [J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2008, 36(S1): 196-198.
  3. Wei Yingzi, Zhao Mingyang. A reinforcement learning based approach to dynamic job shop scheduling [J]. Acta Automatica Sinica, 2005, 31(5): 765-771.
  4. Agirrebeitia, J., Aviles, R., de Bustos, I.F., Ajuria, G., 2005. A new APF strategy for path planning in environments with obstacles. Mech. Mach. Theory, 40(6): 645-658. doi:10.1016/j.mechmachtheory.2005.01.006.
  5. Alexopoulos, C., Griffin, P.M., 1992. Path planning for a mobile robot. IEEE Trans. Syst. Man Cybern., 22(2): 318-322. doi:10.1109/21.148404.
  6. Al-Taharwa, I., Sheta, A., Al-Weshah, M., 2008. A mobile robot path planning using genetic algorithm in static environment. J. Comput. Sci., 4(4): 341-344.
  7. Barraquand, J., Langlois, B., Latombe, J.C., 1992. Numerical potential field techniques for robot path planning. IEEE Trans. Syst. Man Cybern., 22(2): 224-241. doi:10.1109/21.148426.
  8. Cao, Q., Huang, Y., Zhou, J., 2006. An evolutionary artificial potential field algorithm for dynamic path planning of mobile robot. Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, p. 3331-3336. doi:10.1109/IROS.2006.282508.
  9. Castillo, O., Trujillo, L., Melin, P., 2007. Multiple objective genetic algorithms for path-planning optimization in autonomous mobile robots. Soft Comput., 11(3): 269-279. doi:10.1007/s00500-006-0068-4.
  10. Dearden, R., Friedman, N., Russell, S., 1998. Bayesian Q-learning. Proc. National Conf. on Artificial Intelligence, p. 761-768.
