
Multi-Agent Hierarchical Collaborative Planning and Its Application in RoboCup (cited by: 3)

Multi-Agent Decomposition Collaborative Planning and the Application in RoboCup
Abstract: To better solve a class of multi-agent collaborative task-planning problems in communication-limited environments, this paper proposes a multi-agent online planning method based on MAXQ-OP. The algorithm was evaluated on two problems in the RoboCup 2D soccer simulation: defensive wall positioning and multi-player passing. The experimental results show that, in environments that require cooperation, agents using this method perform markedly better than with traditional methods.
Source: Computer Systems & Applications (《计算机系统应用》), 2016, No. 1, pp. 17-23 (7 pages)
Keywords: multi-agent decision-making; RoboCup; Markov decision process; MAXQ hierarchical decomposition

References (17)

  • 1. Sutton RS, Precup D, Singh S. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 1999, 112(2):181-211. Cited by: 1
  • 2. Parr R, Russell S. Reinforcement learning with hierarchies of machines. Advances in Neural Information Processing Systems. 1998. 1043-1049. Cited by: 1
  • 3. Dietterich TG. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Machine Learning Research, 1999, 13(1):63. Cited by: 1
  • 4. Bai AJ, Wu F, Chen XP. Online planning for large MDPs with MAXQ decomposition. Proc. of the 11th International Conference on Autonomous Agents and Multiagent Systems, Volume 3. International Foundation for Autonomous Agents and Multiagent Systems. 2012. 1215-1216. Cited by: 1
  • 5. Kitano H, Asada M, Kuniyoshi Y, et al. RoboCup: A challenge problem for AI. AI Magazine, 1997, 18(1):73-85. Cited by: 1
  • 6. Dai P, Goldsmith J. Topological value iteration algorithm for Markov decision processes. IJCAI. 2007. Cited by: 1
  • 7. Meyn SP. The policy iteration algorithm for average reward Markov decision processes with general state space. IEEE Trans. on Automatic Control, 1997, 42(12):1663-1680. Cited by: 1
  • 8. Fan CJ, Chen XP. Optimal action criterion and algorithm improvement of real-time dynamic programming. Journal of Software (软件学报), 2008, 19(11):2869-2878. Cited by: 8
  • 9. Nilsson NJ. Principles of Artificial Intelligence. Springer, 1982. Cited by: 1
  • 10. Browne C, Powley EJ, Whitehouse D, Lucas SM, Cowling PI, Rohlfshagen P, Tavener S, Perez D, Samothrakis S, Colton S. A survey of Monte Carlo tree search methods. IEEE Trans. Comput. Intellig. and AI in Games, 2012, 4(1):1-43. Cited by: 1

Secondary references (26)

  • 1. Boutilier C, Dean T, Hanks S. Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 1999, 11:1-94. Cited by: 1
  • 2. Hansen EA, Zilberstein S. LAO*: A heuristic search algorithm that finds solutions with loops. Artificial Intelligence, 2001, 129(1-2):35-62. Cited by: 1
  • 3. Bonet B, Geffner H. Faster heuristic search algorithms for planning with uncertainty and full feedback. In: Proc. of the 18th Int'l Joint Conf. on Artificial Intelligence. Acapulco: Morgan Kaufmann Publishers, 2003. 1233-1238. Cited by: 1
  • 4. Dean T, Kaelbling LP, Kirman J, Nicholson A. Planning under time constraints in stochastic domains. Artificial Intelligence, 1995, 76(1-2):35-74. Cited by: 1
  • 5. Ferguson D, Stentz A. Focused dynamic programming: Extensive comparative results. Technical Report, CMU-RI-TR-04-13, Pittsburgh: Robotics Institute, Carnegie Mellon University, 2004. Cited by: 1
  • 6. Barto AG, Bradtke SJ, Singh SP. Learning to act using real-time dynamic programming. Artificial Intelligence, 1995, 72(1-2):81-138. Cited by: 1
  • 7. Pemberton JC, Korf RE. Incremental search algorithms for real-time decision making. In: Proc. of the 2nd Artificial Intelligence Planning Systems Conf. 1994. 140-145. Cited by: 1
  • 8. Bonet B, Geffner H. Labeled RTDP: Improving the convergence of real-time dynamic programming. In: Giunchiglia E, Muscettola N, Nau D, eds. Proc. of the ICAPS 2003. AAAI Press, 2003. 12-21. Cited by: 1
  • 9. McMahan HB, Likhachev M, Gordon GJ. Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees. In: Proc. of the 22nd Int'l Conf. on Machine Learning. 2005. Cited by: 1
  • 10. Smith T, Simmons R. Focused real-time dynamic programming for MDPs: Squeezing more out of a heuristic. In: Proc. of the 21st AAAI Conf. on Artificial Intelligence. AAAI Press, 2006. Cited by: 1

Co-citing documents: 8

Co-cited documents: 32

Citing documents: 3

Secondary citing documents: 6
