Abstract
In dynamic job shop scheduling systems based on the Q-learning algorithm, the states, actions, and reward values have typically been set subjectively, which leads to unsatisfactory learning results that deviate considerably from the known optimal solutions. To address this, the elements of the Q-learning algorithm are redesigned according to the characteristics of the job shop scheduling problem, and the method is evaluated in simulation on standard benchmark instances. The results are compared with the known optimal solutions and with the hybrid grey wolf optimization algorithm, the discrete cuckoo search algorithm, and the quantum whale swarm algorithm in terms of closeness to the optimum and minimum value found. The experiments show that, compared with existing domestic Q-learning approaches to the job shop scheduling problem, the proposed method approximates the optimal solution significantly more closely; compared with the swarm intelligence algorithms, it shows significantly better optimization ability on most instances.
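The abstract does not reproduce the paper's redesigned state, action, and reward definitions themselves. As background, the standard tabular Q-learning update that such a scheduler builds on can be sketched on a toy single-machine dispatch problem; all job data, names, and the flow-time reward below are illustrative assumptions, not the paper's design:

```python
import random
from collections import defaultdict

# Toy single-machine dispatch problem (hypothetical, for illustration only):
# the agent repeatedly chooses which remaining job to dispatch next, and the
# reward is the negative completion time of that job, so maximizing return
# minimizes total flow time.
PROC = {"J1": 3, "J2": 5, "J3": 2}  # job -> processing time (assumed data)

def step(state, job):
    """state = (elapsed_time, frozenset of remaining jobs)."""
    t, rem = state
    t2 = t + PROC[job]
    rem2 = rem - {job}
    return (t2, rem2), -float(t2), not rem2  # next state, reward, done

def train(episodes=3000, alpha=0.1, gamma=1.0, eps=0.2, seed=1):
    rng = random.Random(seed)
    Q = defaultdict(float)  # (state, action) -> estimated value
    for _ in range(episodes):
        s = (0, frozenset(PROC))
        done = False
        while not done:
            acts = sorted(s[1])  # only remaining jobs are legal actions
            # epsilon-greedy behavior policy
            if rng.random() < eps:
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda x: Q[(s, x)])
            s2, r, done = step(s, a)
            # Q-learning update: Q(s,a) += alpha * (r + gamma*max_a' Q(s',a') - Q(s,a))
            target = r if done else r + gamma * max(Q[(s2, b)] for b in sorted(s2[1]))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q

def greedy_order(Q):
    """Dispatch jobs greedily with respect to the learned Q-table."""
    s, order = (0, frozenset(PROC)), []
    while s[1]:
        a = max(sorted(s[1]), key=lambda x: Q[(s, x)])
        order.append(a)
        s, _, _ = step(s, a)
    return order
```

On this toy instance the greedy policy learned from the Q-table recovers the shortest-processing-time dispatch order (J3, J1, J2), which is optimal for total flow time on a single machine; the paper's contribution lies in choosing state, action, and reward definitions so that the same update rule performs well on full job shop instances.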
Authors
WANG Wei-Qi; YE Chun-Ming; TAN Xiao-Jun (College of Management, University of Shanghai for Science and Technology, Shanghai 200093, China)
Source
Computer Systems & Applications, 2020, No. 11, pp. 218-226 (9 pages)
Funding
National Natural Science Foundation of China (71840003)
Science and Technology Development Fund of University of Shanghai for Science and Technology (2018KJFZ043)
Keywords
intelligent manufacturing
job shop scheduling
Q-learning algorithm