The Influence of Decision Mechanisms on Network Cooperation Level in Q-learning Evolutionary Game
Abstract To address the problem that individuals cannot observe their neighbors' payoffs during game decision-making, this study exploits the self-experience learning property of Q-learning and proposes a Q-learning evolutionary game model. Because different Q-learning decision mechanisms affect the network cooperation level differently, three mechanisms (ε-greedy, Boltzmann, and Max-plus) are compared in experiments across different network types, game model parameters, and reinforcement learning parameters, and their influence on the network cooperation level is analyzed quantitatively. The experimental results show that, compared with traditional evolutionary game models, the Q-learning evolutionary game model generally improves the cooperation level of the network, and that the decision mechanisms differ in their effect: the cooperation level of the model using the ε-greedy decision mechanism is approximately 35% and 37% higher than that of the models using the Boltzmann and Max-plus decision mechanisms, respectively. A lower learning rate, a higher discount rate, and moderate payoff uniformity promote cooperation among individuals in the network; with the ε-greedy decision mechanism, the cooperation level under these settings is approximately 40% and 45% higher than under a higher learning rate and a lower discount rate, respectively. At a higher exploration rate, the model using the Max-plus decision mechanism, which takes the global attributes of individuals into account, achieves an average network payoff approximately 22% and 17% higher than the Q-learning models using the ε-greedy and Boltzmann decision mechanisms, respectively.
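The abstract names the ε-greedy and Boltzmann decision mechanisms but does not give their formulas. As a rough illustration only, the sketch below uses the standard textbook forms of the Q-learning update and of these two action-selection rules for a two-action (cooperate/defect) game. The state encoding, the function names, and the `temperature` parameter are assumptions made for this example and are not taken from the paper; Max-plus is omitted because the abstract gives no detail on how it is formulated.

```python
import math
import random

ACTIONS = ("C", "D")  # cooperate / defect (assumed two-action game)


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Textbook Q-learning update with learning rate alpha and discount rate gamma."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def epsilon_greedy(q_row, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_row[a])


def boltzmann_probs(q_row, temperature):
    """Softmax over Q-values; subtracting the max keeps exp() numerically stable."""
    q_max = max(q_row.values())
    weights = {a: math.exp((q_row[a] - q_max) / temperature) for a in ACTIONS}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def boltzmann(q_row, temperature, rng=random):
    """Sample an action from the Boltzmann (softmax) distribution."""
    probs = boltzmann_probs(q_row, temperature)
    return rng.choices(ACTIONS, weights=[probs[a] for a in ACTIONS])[0]
```

In this sketch an agent learns only from its own reward, which matches the abstract's premise that neighbors' payoffs are not observable. A small `alpha` makes Q-values change slowly, and a large `gamma` weights future payoffs more heavily; these are the two parameters the abstract associates with higher cooperation.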
Authors ZHANG Zundong (张尊栋); WANG Yannan (王岩楠); ZHOU Huijuan (周慧娟); ZHANG Yifan (张艺帆) (Beijing Key Laboratory of Urban Intelligent Traffic Control Technology, North China University of Technology, Beijing 100144, China; Intelligent Urban Transportation Systems Laboratory, University of Washington, Seattle 98195, USA; State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044, China)
Source Computer Engineering (《计算机工程》), 2023, No. 6, pp. 99-106, 114 (9 pages in total). Indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心).
Funding National Key Research and Development Program of China, 13th Five-Year Plan period (2018YFB1601000).
Keywords Q-learning; decision mechanism; network evolutionary game; cooperation level; discount rate