Abstract
In recent years, deep reinforcement learning (DRL) has developed rapidly. To improve the ability of DRL to handle high-dimensional state spaces and dynamic, complex environments, researchers have introduced memory-augmented neural networks (MANN) into DRL and proposed a variety of memory-augmented deep reinforcement learning (MADRL) algorithms, making MADRL a current research hotspot. According to the type of MANN used, this paper divides MADRL algorithms into four classes: MADRL based on experience replay, MADRL based on memory networks, MADRL based on episodic memory, and MADRL based on the differentiable neural computer. It systematically summarizes and analyzes the advantages and shortcomings of existing research on MADRL, and introduces the training environments commonly used for DRL. Finally, prospects and future research directions for MADRL are discussed.
Authors
汪晨
曾凡玉
郭九霞
WANG Chen; ZENG Fan-yu; GUO Jiu-xia (School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; College of Air Traffic Management, Civil Aviation Flight University of China, Guanghan 618307, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals
2021, No. 3, pp. 454-461 (8 pages)
Journal of Chinese Computer Systems
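Funding
Supported by the Joint Funds of the National Natural Science Foundation of China (U181320052)
Supported by the General Program of the National Natural Science Foundation of China (6177020680)
Supported by the Young Scientists Fund of the National Natural Science Foundation of China (62003381)
Supported by the National Key Research and Development Program of China (2018YFC0831801)
Supported by the Key Research and Development Project of Sichuan Province (17ZDYF3184)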
Keywords
deep reinforcement learning
experience replay
memory networks
episodic memory
differentiable neural computer