Improved Q-learning Algorithm and Its Application in the RoboCup Environment (cited by: 2)
Abstract: Traditional Q-learning has been applied effectively to the passing-strategy problem in RoboCup, but it can only handle continuous state and action spaces by crudely discretizing them. This paper proposes an improved Q-learning algorithm that applies a neural network to Q-learning. The system needs to learn the Q values of only some state-action pairs; from these it obtains a continuous approximation of the Q values, which improves generalization and speeds up convergence. Finally, the algorithm is evaluated in the RoboCup environment, where it is reported to reach an optimal passing strategy and to raise the pass success rate.
Author: 周燕艳 (Zhou Yanyan)
Source: Journal of Sichuan University of Science & Engineering (Natural Science Edition) (《四川理工学院学报(自然科学版)》), CAS, 2011, No. 4, pp. 417-421 (5 pages)
Keywords: RoboCup; neural network; Q-learning; agent
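
The abstract describes replacing a discretized Q table with a neural network that generalizes from a subset of state-action pairs. The following is a minimal sketch of that general idea, not the paper's implementation: the network size, learning rate, and the one-dimensional "passing" task (state = normalized distance to a teammate, action 0 = short pass, action 1 = long pass) are all hypothetical choices made for illustration. It uses only the Python standard library.

```python
import math
import random


class QNetwork:
    """One-hidden-layer network: continuous state in, one Q value per action out."""

    def __init__(self, n_inputs, n_hidden, n_actions, lr=0.02, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_actions)]
        self.b2 = [0.0] * n_actions
        self.lr = lr

    def forward(self, state):
        hidden = [math.tanh(sum(w * x for w, x in zip(row, state)) + b)
                  for row, b in zip(self.w1, self.b1)]
        q = [sum(w * h for w, h in zip(row, hidden)) + b
             for row, b in zip(self.w2, self.b2)]
        return hidden, q

    def q_values(self, state):
        return self.forward(state)[1]

    def update(self, state, action, target):
        """One gradient step on 0.5 * (target - Q(state, action))**2."""
        hidden, q = self.forward(state)
        error = target - q[action]
        for j, h in enumerate(hidden):
            # Backpropagate through tanh using the output weight before it changes.
            grad_h = error * self.w2[action][j] * (1.0 - h * h)
            self.w2[action][j] += self.lr * error * h
            for i, x in enumerate(state):
                self.w1[j][i] += self.lr * grad_h * x
            self.b1[j] += self.lr * grad_h
        self.b2[action] += self.lr * error
        return error


def q_learning_step(net, state, action, reward, next_state, done, gamma=0.9):
    """Standard Q-learning target, with the network standing in for the Q table."""
    if done:
        target = reward
    else:
        target = reward + gamma * max(net.q_values(next_state))
    return net.update(state, action, target)


def greedy_action(net, state):
    q = net.q_values(state)
    return q.index(max(q))


def train_toy_passing(steps=20000, seed=0):
    """Hypothetical one-step task: a short pass (0) succeeds when the teammate
    is near (distance < 0.5), a long pass (1) succeeds otherwise."""
    rng = random.Random(seed)
    net = QNetwork(n_inputs=1, n_hidden=8, n_actions=2, seed=seed)
    for _ in range(steps):
        distance = rng.random()
        action = rng.randrange(2)  # pure exploration, for simplicity
        success = (distance < 0.5) == (action == 0)
        # Episodes end after one pass, so the target reduces to the reward.
        q_learning_step(net, [distance], action, 1.0 if success else 0.0,
                        [distance], done=True)
    return net
```

After training, `greedy_action(net, [d])` can be queried for any distance `d` in the continuous range, including values never sampled during training, which is the generalization property the abstract attributes to the network-based approach.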