Abstract
The new wave of artificial intelligence brought about by deep learning and reinforcement learning provides an "end-to-end" solution that takes agents from perceptual input to action-decision output. Multi-agent learning is a frontier topic in the study of intelligent game confrontation, and it faces many difficulties and challenges, such as adversarial environments, non-stationary opponents, incomplete information, and uncertain actions. Taking a game-theoretic perspective, this paper first presents the composition of a multi-agent learning system, gives an overview of multi-agent learning, and briefly introduces the various classes of multi-agent learning research methods. Second, centering on the framework of multi-agent learning in games, it introduces the basic multi-agent game models and meta-game models, equilibrium solution concepts and game dynamics, and challenges such as diverse learning objectives, non-stationary environments (opponents), and equilibria that are hard to compute and prone to change. Third, it comprehensively reviews multi-agent game strategy learning methods, offline game strategy learning methods, and online game strategy learning methods. Finally, it discusses frontier research directions in multi-agent learning from three aspects: agent cognitive behavior modelling and collaboration, general game strategy learning methods, and distributed game strategy learning frameworks.
Authors
LUO Junren (罗俊仁)
ZHANG Wanpeng (张万鹏)
SU Jiongming (苏炯铭)
YUAN Weilin (袁唯淋)
CHEN Jing (陈璟)
College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Source
Systems Engineering and Electronics (《系统工程与电子技术》)
EI
CSCD
Peking University Core Journals (北大核心)
2024, Issue 5, pp. 1628-1655 (28 pages)
Funding
National Natural Science Foundation of China (61806212)
Postgraduate Scientific Research Innovation Project of Hunan Province (CX20210011)
Keywords
learning in games
multi-agent learning
meta-game
online no-regret learning