Abstract
Reinforcement learning has been moving toward relational reinforcement learning, and a large number of new algorithms have been proposed. Most of them upgrade propositional representations to relational or computational logic representations. This paper presents a novel representation formalism, the logical semi-Markov decision process, which integrates semi-Markov decision processes with logic programs. Within this framework, abstraction (of states or actions) is fundamental. A Q-learning algorithm for the logical semi-Markov decision process is then given, and its convergence is proved. The framework provides a sound basis for the further development of relational reinforcement learning in continuous-time domains.
Source
Journal of Jinling Institute of Technology (《金陵科技学院学报》)
2013, No. 2, pp. 13-19 (7 pages)
Funding
Supported by the Scientific Research Fund of Jinling Institute of Technology (No. jit-b-201207)
Keywords
relational reinforcement learning
semi-Markov
logical semi-Markov
decision process