
基于Boltzmann机的机器人自主学习算法

Self-learning Algorithm for Robot Based on Boltzmann Machine
Abstract To address the self-balancing motion control problem of a two-wheeled robot, a bionic self-learning algorithm is proposed that combines Skinner's operant conditioning learning mechanism with the Boltzmann machine. The algorithm uses the Metropolis criterion of the Boltzmann machine to balance the proportion of exploration and exploitation in operant-conditioning learning, and selects the optimal behavior with a certain probability according to a probabilistic orientation mechanism. In this way the robot can acquire, in an unknown environment, a bionic self-learning skill similar to that of humans or animals and achieve self-balancing motion control. Finally, simulation experiments were carried out comparing the operant-conditioning learning algorithm based on the Boltzmann machine with one based on a greedy strategy. The results show that the Boltzmann-machine-based operant-conditioning learning algorithm gives the robot stronger balance-control skill and better dynamic performance, demonstrating the robot's self-learning capability.
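The exploration-exploitation balance described in the abstract rests on the Metropolis criterion: a sub-optimal (exploratory) behavior is accepted with probability exp(-ΔQ/T), so a high temperature T favors exploration while a low temperature favors exploiting the currently best-valued behavior. The sketch below only illustrates this selection rule; the function name, the behavior-value table, and the temperature schedule are illustrative assumptions, not the authors' implementation.

```python
import math
import random

def select_action(q_values, temperature):
    """Metropolis-style behavior selection: prefer the greedy behavior,
    but accept a random exploratory behavior with probability exp(-dQ/T)."""
    greedy = max(q_values, key=q_values.get)      # currently best-valued behavior
    candidate = random.choice(list(q_values))     # randomly proposed behavior
    dq = q_values[greedy] - q_values[candidate]   # value lost by exploring
    if dq <= 0 or random.random() < math.exp(-dq / temperature):
        return candidate                          # explore (always if no worse)
    return greedy                                 # exploit

# Usage: as the temperature is annealed toward zero, selection shifts from
# exploration to exploitation of the learned balancing behavior.
q = {"lean_left": 0.2, "lean_right": 0.5, "hold": 0.4}
for t in (1.0, 0.1, 0.01):
    print(f"T={t}: {select_action(q, t)}")
```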
Source Journal of Beijing University of Technology (《北京工业大学学报》; indexed in EI, CAS, CSCD; Peking University core journal), 2012, No. 1, pp. 60-64 (5 pages)
Funding National High-Tech R&D (863) Program of China (2007AA04Z226); National Natural Science Foundation of China (60774077); Key Project of the Beijing Municipal Education Commission (KZ200810005002)
Keywords Boltzmann machine; Skinner's operant conditioning; greedy strategy; self-learning; two-wheeled robot

References (16)

1. RAPHAEL B. The robot 'Shakey' and 'his' successors [J]. Computers and People, 1976, 25: 7-21.
2. BROOKS R A. From earwigs to humans [J]. Robotics and Autonomous Systems, 1997, 20: 291-304.
3. WOLF R, HEISENBERG M. Basic organization of operant behavior as revealed in Drosophila flight orientation [J]. Journal of Comparative Physiology A, 1991, 169: 699-705.
4. TOURETZKY D S, SAKSIDA L M. Operant conditioning in Skinnerbots [J]. Adaptive Behavior, 1997, 5(3/4): 219-247.
5. ZALAMA E, GOMEZ J, PAUL M, et al. Adaptive behavior navigation of a mobile robot [J]. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 2002, 32(1): 160-169.
6. DOMINGUEZ S, ZALAMA E. Robot learning in a social robot [J]. Lecture Notes in Computer Science, 2006, 4095: 691-702.
7. HINTON G E, SEJNOWSKI T J, ACKLEY D H. Boltzmann machines: constraint satisfaction networks that learn [R]. Pittsburgh: Carnegie Mellon University, technical report, 1984: 1-37.
8. HINTON G E, SEJNOWSKI T J. Learning and relearning in Boltzmann machines [M]// Parallel Distributed Processing. Cambridge: MIT Press, 1986: 282-317.
9. GUO Mao-zu, LIU Yang, MALEC J. A new Q-learning algorithm based on the Metropolis criterion [J]. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2004, 34(5): 2140-2143.
10. DAHMANI Y, BENYETTOU A. Seek of an optimal way by Q-learning [J]. Journal of Computer Science, 2005, 1(1): 28-30.

Secondary references (19)

1. YANG Xing-ming, DING Xue-ming, ZHANG Pei-ren, ZHAO Peng. Motion control of a two-wheeled mobile inverted pendulum [J]. Journal of Hefei University of Technology (Natural Science), 2005, 28(11): 1485-1488.
2. HA Y S, YUTA S. Trajectory tracking control for navigation of the inverse pendulum type self-contained mobile robot [J]. Robotics and Autonomous Systems, 1996, 17(1/2): 65-80.
3. GRASSER F, D'ARRIGO A, COLOMBI S, et al. JOE: a mobile, inverted pendulum [J]. IEEE Transactions on Industrial Electronics, 2002, 49(1): 107-114.
4. SALERNO A, ANGELES J. On the nonlinear controllability of a quasiholonomic mobile robot [C]// Proc of IEEE International Conference on Robotics and Automation, 2003: 3379-3384.
5. BARTO A G, SUTTON R S, ANDERSON C W. Neuronlike adaptive elements that can solve difficult learning control problems [J]. IEEE Transactions on Systems, Man, and Cybernetics, 1983, 13(5): 834-846.
6. ANDERSON C W. Learning to control an inverted pendulum using neural networks [J]. IEEE Control Systems Magazine, 1989, 9(4): 31-35.
7. WHITE D A, SOFGE D A. Handbook of intelligent control: neural, fuzzy, and adaptive approaches [M]. New York: Van Nostrand Reinhold, 1992.
8. SI J N, WANG Y T. On-line learning control by association and reinforcement [J]. IEEE Transactions on Neural Networks, 2001, 12(2): 264-276.
9. LIN C T, LEE C S G. Reinforcement structure/parameter learning for neural-network-based fuzzy logic control systems [J]. IEEE Transactions on Fuzzy Systems, 1994, 2(1): 46-63.
10. BERENJI H R, KHEDKAR P. Learning and tuning fuzzy logic controllers through reinforcements [J]. IEEE Transactions on Neural Networks, 1992, 3(5): 724-740.
