
Study on Q-learning Algorithm Based on ART2 (基于ART2的Q学习算法研究)
Cited by: 3
Abstract: To overcome the "curse of dimensionality" that arises when Q-learning is applied to intelligent systems with continuous state spaces, a Q-learning algorithm based on ART2 is proposed. By introducing an ART2 neural network, the Q-learning agent learns a task-appropriate incremental clustering of the state space. Without any prior knowledge, the agent can then carry out two tiers of online learning in an unknown environment: action decision-making and state-space pattern clustering. By continually interacting with the environment, the agent improves its control policy and thereby raises learning accuracy. Simulation experiments show that a mobile robot using the ARTQL algorithm continually improves its navigation performance through interactive learning with the environment.
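The two-tier scheme the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: a simplified vigilance-based nearest-prototype clusterer stands in for the full ART2 network (no F1/F2 field dynamics), and the class names, parameters (`vigilance`, `lr`, `alpha`, `gamma`, `eps`), and the toy corridor task are all assumptions made for this sketch.

```python
import random
import numpy as np

class IncrementalClusterer:
    """Vigilance-style nearest-prototype clusterer, a simplified stand-in
    for ART2: an input joins its closest prototype when it passes the
    vigilance test, otherwise it creates a new category."""

    def __init__(self, vigilance=0.15, lr=0.2):
        self.vigilance = vigilance  # max distance accepted as a match
        self.lr = lr                # prototype update rate
        self.prototypes = []

    def assign(self, x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        if self.prototypes:
            dists = [float(np.linalg.norm(x - p)) for p in self.prototypes]
            j = int(np.argmin(dists))
            if dists[j] <= self.vigilance:   # resonance: update the winner
                self.prototypes[j] += self.lr * (x - self.prototypes[j])
                return j
        self.prototypes.append(x.copy())     # mismatch: spawn a new cluster
        return len(self.prototypes) - 1

class ARTQLAgent:
    """Two-tier online learner: the clusterer discretizes the continuous
    state on the fly, and tabular Q-learning runs over cluster indices."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.95, eps=0.1):
        self.clusterer = IncrementalClusterer()
        self.q = {}                          # (cluster, action) -> value
        self.n_actions = n_actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, state):
        s = self.clusterer.assign(state)
        if random.random() < self.eps:       # epsilon-greedy exploration
            return s, random.randrange(self.n_actions)
        vals = [self.q.get((s, a), 0.0) for a in range(self.n_actions)]
        return s, int(np.argmax(vals))

    def update(self, s, a, r, next_state):
        s2 = self.clusterer.assign(next_state)
        best_next = max(self.q.get((s2, b), 0.0) for b in range(self.n_actions))
        old = self.q.get((s, a), 0.0)
        self.q[(s, a)] = old + self.alpha * (r + self.gamma * best_next - old)

# Toy 1-D "corridor": the state is a position in [0, 1] and the reward is
# the position itself, so the agent should come to prefer stepping right.
random.seed(0)
agent = ARTQLAgent(n_actions=2)              # 0: step left, 1: step right
for episode in range(200):
    x = random.random()
    for t in range(20):
        s, a = agent.act([x])
        x = min(1.0, max(0.0, x + (0.05 if a == 1 else -0.05)))
        agent.update(s, a, x, [x])           # dense reward r = position
```

In the paper, the full ART2 dynamics would take the place of `IncrementalClusterer` while the Q-learning tier stays the same; the point of the sketch is that the cluster index, not the raw continuous state, indexes the Q-table, so the table grows only as new state-space patterns appear.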
Source: Control and Decision (《控制与决策》, EI, CSCD, Peking University Core Journal), 2011, No. 2, pp. 227-232 (6 pages).
Funding: National Natural Science Foundation of China (61070113); Zhejiang Provincial Natural Science Foundation (20080376).
Keywords: Q-learning; ART2; incremental learning; two-tier online learning; mobile robot navigation

