Abstract
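Intelligent collective systems are an important branch of artificial intelligence. The form of intelligence that emerges from them is called collective intelligence, characterized by self-organization when individuals are stimulated and strong robustness when the group converges. The collaborative decision-making process of an intelligent collective system is a complex nonlinear problem that integrates humans, machines, and objects, spans multiple spaces, and encompasses perception, decision-making, feedback, and optimization; it has open decision models and a vast solution space. However, traditional algorithms rely on large amounts of knowledge and experience, which makes it difficult for them to support the continuous evolution of the system. Reinforcement learning is a class of end-to-end methods that combine perception and decision-making; it optimizes iteratively through trial and error and has a strong capacity for autonomous learning. In recent years, inspired by biological swarms and artificial intelligence, reinforcement learning algorithms have evolved from solving individual decision-making problems toward optimizing the joint coordination of collectives, injecting new momentum into the convergence and emergence of collective intelligence. However, when handling collective tasks, reinforcement learning faces challenges such as the spatiotemporal sensitivity of the perceived environment, highly autonomous individuals within a group, complex and changing relationships between groups, and multidimensional task objectives. Grounded in the collaborative decision-making process of intelligent collective systems and the operating mechanism of reinforcement learning, this paper reviews how reinforcement learning algorithms address these challenges from four aspects: joint communication, collaborative decision-making, reward feedback, and policy optimization. It discusses typical applications of reinforcement learning algorithms for intelligent collective systems and lists relevant open-source platforms and the algorithms they support. Finally, starting from practical needs, it discusses and summarizes future research directions.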
Intelligent Collective System (ICS) is an essential branch of artificial intelligence, encompassing various intelligent components that collectively give rise to an emergent phenomenon known as Collective Intelligence (CI). CI exhibits self-organization in individual excitation, strong robustness in swarm convergence, and other characteristics. Based on ICS, AI enables the emergence of CI, providing a powerful framework for harnessing the potential of intelligent systems. Specifically, the decision-making process of ICS is a multifaceted and intricate nonlinear problem that integrates humans, machines, and objects. This process spans diverse spaces and encompasses various stages, including perception, decision-making, feedback, and optimization, forming a dynamic loop of information flow. Within this framework, there exist abundant decision models that enable the system to consider a wide range of possibilities and alternatives. Traditional algorithms mainly rely on a large amount of knowledge and experience, which makes it difficult for them to support the development of the system. The reliance on vast amounts of explicit knowledge and predefined rules limits the system's ability to adapt and evolve in dynamic and complex environments. As the system encounters new situations or scenarios, its performance may suffer due to the lack of flexibility and adaptability inherent in these traditional algorithms. Reinforcement Learning (RL) is a powerful and comprehensive approach that integrates perception and decision-making within an end-to-end framework. RL exhibits a remarkable autonomous learning capability, enabling systems to improve their performance through iterative optimization driven by trial and error. In RL, the system interacts with its environment, receiving feedback in the form of rewards or penalties based on its actions. Through this iterative process, the system learns to navigate complex decision spaces by exploring different actions and evaluating their consequences
Authors
李璐璐
朱睿杰
隋璐瑶
李亚飞
徐明亮
樊会涛
LI Lu-Lu; ZHU Rui-Jie; SUI Lu-Yao; LI Ya-Fei; XU Ming-Liang; FAN Hui-Tao (School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001; Engineering Research Center of Intelligent Swarm Systems, Ministry of Education, Zhengzhou 450001; National Supercomputing Center in Zhengzhou, Zhengzhou 450001)
Source
《计算机学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2023, No. 12, pp. 2573-2596 (24 pages)
Chinese Journal of Computers
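Funding
Supported by the Key Program of the National Natural Science Foundation of China (62036010)
Young Scientists Fund of the National Natural Science Foundation of China (62001422)
General Program of the National Natural Science Foundation of China (61972362, 62372416)
National Key Research and Development Program of China (2021YFB3301504).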
Keywords
intelligent collective system
collective intelligence
swarm intelligence
reinforcement learning
perception and decision-making