Abstract
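Optimal models in distribution (stochastic ordering) and minimum-risk models for the first arrival time in discrete time are studied. Several necessary and sufficient conditions for the existence of an optimal policy are given, together with important properties. The sequence of optimality equations is proved to have a unique solution, and an algorithm for finding optimal policies is given. It is proved that an m-horizon minimum-risk E-optimal policy always exists.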
The optimal model for the first arrival time distribution function (FATDF), in the sense of stochastic ordering, is investigated in discrete time with countable state and action spaces. Several necessary and sufficient conditions for the existence of an optimal policy for the FATDF are obtained. Structures and properties of optimal policies under convex combination and under cutting and piecing together are described. It is shown that the sequence of systems of optimality equations has a unique solution. Algorithms for finding all optimal policies and the optimal value are given. The minimum-risk model is presented, and its relationships with the optimal model for the FATDF are derived. The existence of an optimal Markov policy for the finite horizon is verified. Such models arise in resource-sharing systems, reliability, queueing systems, etc.
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core
1996, No. 2, pp. 53-59 (7 pages)
Journal of Tsinghua University(Science and Technology)
Keywords
Markov decision processes
first arrival time
optimal model
optimal policy
optimal model for distribution function