Abstract
This paper studies the constrained discounted semi-Markov decision programming (CDSMDP) problem: maximizing the expected discounted reward subject to a constraint on an expected discounted cost. The state set is assumed to be countable, and the action set is a non-empty compact Borel subset of a complete separable metric space. Necessary and sufficient conditions for a p-constrained optimal policy are given, and it is proved that a p-constrained optimal stochastic stationary policy exists under suitable assumptions. Finally, a linear program (LP) is constructed, and a one-to-one correspondence between the optimal solutions of the LP and the p-constrained optimal stochastic stationary policies is proved.
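For orientation, the kind of linear program referred to in the abstract can be sketched in the finite state and action case. This is the standard occupation-measure formulation for discounted constrained decision processes, not the paper's own construction, which treats countable states and compact action sets. All symbols are assumptions introduced for illustration: p is read here as the initial state distribution, r(i,a) and c(i,a) are the expected discounted reward and cost accrued over one sojourn, β(i,a,j) is the discounted transition kernel of the semi-Markov process, and d is the cost bound.

```latex
% Illustrative finite-case sketch; every symbol below is an assumption,
% not notation taken from the paper.
% x(i,a): discounted occupation measure of the state-action pair (i,a)
% beta(i,a,j) = \int_0^\infty e^{-\alpha t}\, Q(dt, j \mid i, a), discount rate \alpha > 0
\begin{align*}
\max_{x \ge 0}\quad & \sum_{i,a} r(i,a)\, x(i,a) \\
\text{s.t.}\quad & \sum_{a} x(j,a) - \sum_{i,a} \beta(i,a,j)\, x(i,a) = p(j)
  \quad \text{for all states } j, \\
& \sum_{i,a} c(i,a)\, x(i,a) \le d .
\end{align*}
% A stochastic stationary policy is read off a feasible x by normalisation:
\[
\pi(a \mid i) = \frac{x(i,a)}{\sum_{a'} x(i,a')}
  \quad \text{whenever } \sum_{a'} x(i,a') > 0 .
\]
```

Under this reading, the correspondence stated in the abstract pairs each optimal solution x of the program with the policy π obtained by the normalisation above.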
Source
《应用数学学报》
CSCD
Peking University Core Journals (北大核心)
1997, No. 2, pp. 187-195 (9 pages)
Acta Mathematicae Applicatae Sinica