Abstract
To address the "curse of dimensionality" that arises when Q-learning is applied to intelligent systems with continuous state spaces, a Q-learning algorithm based on ART2 (ARTQL) is proposed. By introducing an ART2 neural network, the Q-learning agent learns a task-appropriate incremental clustering of state-space patterns, so that without any prior knowledge it can carry out two tiers of online learning in an unknown environment: action decision-making and state-space pattern clustering. Through continual interaction with the environment, the agent steadily improves its control policy and thereby its learning accuracy. Mobile-robot navigation simulations show that a robot using the ARTQL algorithm continuously improves its navigation performance by learning through interaction with the environment.
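The two-tier scheme described above (an ART-style network that incrementally clusters continuous states, with tabular Q-learning running over the resulting cluster indices) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class and method names are invented, and the full ART2 dynamics are replaced by a simplified vigilance test on cosine similarity, which preserves the key property that new state categories are created on demand.

```python
import random
import math

class ARTQLAgent:
    """Hedged sketch of ART2-style clustering combined with Q-learning.

    A simplified ART-style clusterer (vigilance test on cosine
    similarity) maps continuous states to discrete category indices;
    one-step tabular Q-learning then runs over those indices. The
    match rule here is an illustrative stand-in for the ART2 equations.
    """

    def __init__(self, n_actions, vigilance=0.9, lr=0.5,
                 alpha=0.1, gamma=0.95, epsilon=0.1):
        self.n_actions = n_actions
        self.vigilance = vigilance   # higher -> finer state clustering
        self.lr = lr                 # prototype learning rate
        self.alpha = alpha           # Q-learning step size
        self.gamma = gamma           # discount factor
        self.epsilon = epsilon       # exploration probability
        self.prototypes = []         # cluster centers (grow incrementally)
        self.q = []                  # one Q-value row per cluster

    def _similarity(self, s, w):
        # cosine similarity as a stand-in for the ART2 match function
        dot = sum(a * b for a, b in zip(s, w))
        ns = math.sqrt(sum(a * a for a in s)) or 1e-12
        nw = math.sqrt(sum(b * b for b in w)) or 1e-12
        return dot / (ns * nw)

    def categorize(self, s):
        """Return the cluster index for state s, committing a new
        cluster when no prototype passes the vigilance test."""
        best, best_sim = -1, -1.0
        for j, w in enumerate(self.prototypes):
            sim = self._similarity(s, w)
            if sim > best_sim:
                best, best_sim = j, sim
        if best >= 0 and best_sim >= self.vigilance:
            # resonance: nudge the winning prototype toward s
            w = self.prototypes[best]
            self.prototypes[best] = [
                (1 - self.lr) * wi + self.lr * si
                for wi, si in zip(w, s)]
            return best
        # mismatch: create a new category with zero-initialized Q-values
        self.prototypes.append(list(s))
        self.q.append([0.0] * self.n_actions)
        return len(self.prototypes) - 1

    def act(self, c):
        # epsilon-greedy action selection over the Q-row of cluster c
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        row = self.q[c]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, c, a, r, c_next):
        # standard one-step Q-learning backup over cluster indices
        target = r + self.gamma * max(self.q[c_next])
        self.q[c][a] += self.alpha * (target - self.q[c][a])
```

In a navigation loop the agent would call `categorize` on each raw sensor reading, choose an action with `act`, and apply `update` with the observed reward, so the clustering and the policy are refined together online, as in the two-tier learning the abstract describes.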
Source
《控制与决策》
EI
CSCD
Peking University Core Journal (北大核心)
2011, No. 2, pp. 227-232 (6 pages)
Control and Decision
Funding
National Natural Science Foundation of China (61070113)
Zhejiang Provincial Natural Science Foundation (20080376)