摘要
旅行商问题(Traveling Salesman Problem,TSP)是组合最优化问题(Combinatorial Optimization Problem,COP)中的经典问题,多年以来一直被反复研究.近年来深度强化学习(Deep Reinforcement Learning,DRL)在无人驾驶、工业自动化、游戏等领域的广泛应用,显示了强大的决策力和学习能力.结合DRL和图注意力模型,通过最小化路径长度求解TSP问题.改进REINFORCE算法,训练行为网络参数,可以有效地减小方差,防止局部最优;在编码结构中采用位置编码(Positional Encoding,PE),使多重的初始节点在嵌入的过程中满足平移不变性,可以增强模型的稳定性;进一步结合图神经网络(Graph Neural Network,GNN)和Transformer架构,首次将GNN聚合操作处理应用到Transformer的解码阶段,有效捕捉图上的拓扑结构及点与点之间的潜在关系.实验结果显示,模型在100-TSP问题上的优化效果超越了目前基于DRL的方法和部分传统算法.
Traveling Salesman Problem(TSP) is a classic problem in Combinatorial Optimization Problem(COP),which has been repeatedly studied for many years. In recent years,Deep Reinforcement Learning(DRL)has been widely applied in driverless,industrial automation,game and other fields,showing strong decision-making and learning ability. In this paper,DRL and graph attention model are combined to solve TSP by minimizing the path length. Specifically,the behavioral network parameters are trained by an improved REINFORCE algorithm to effectively reduce the variance and prevent local optima;Positional Encoding(PE) is used to the encoding structure to make the multiple node satisfy translation invariance during the embedding process and enhance the stability of the model. Further,we combine Graph Neural Network(GNN) and Transformer architecture,and apply GNN aggregate operation processing to transformer decoding stage for the first time,which effectively capture the topological structure of the graph and the potential relationships between points. The experimental results show that the optimization effect of the model on the 100-TSP problem surpasses the current DRL-based methods and some traditional algorithms.
作者
王扬
陈智斌
杨笑笑
吴兆蕊
Wang Yang;Chen Zhibin;Yang Xiaoxiao;Wu Zhaorui(Faculty of Science,Kunming University of Science and Technology,Kunming,650000,China)
出处
《南京大学学报(自然科学版)》
CAS
CSCD
北大核心
2022年第3期420-429,共10页
Journal of Nanjing University(Natural Science)
基金
国家自然科学基金(11761042)。
关键词
深度强化学习
旅行商问题
图注意力模型
图神经网络
组合最优化
Deep Reinforcement Learning(DRL)
Travel Salesman Problem(TSP)
graph attention model
Graph Neural Network(GNN)
Combinatorial Optimization(CO)