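Application of Markov Processes to the Study of Price Fluctuations: Implementing Policy Iteration in an Economic System That Accounts for Monetary Loss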
1
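Author: 马文 (Ma Wen) 《贵州师范大学学报(自然科学版)》 (Journal of Guizhou Normal University, Natural Science Edition) CAS 1993, No. 1, pp. 24-32 (9 pages)
This paper uses Markov process theory to study the patterns governing price changes of a particular commodity. It is the third part of a series of studies.
Keywords: Markov process; transition probability matrix; commodity prices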
CONSTRAINED DENUMERABLE STATE NON-STATIONARY MDPs WITH EXPECTED TOTAL REWARD CRITERION
2
Author: 郭先平 (Guo Xianping) 《Acta Mathematicae Applicatae Sinica》 SCIE CSCD 2000, No. 2, pp. 205-212 (8 pages)
In this paper, we consider constrained denumerable state non-stationary Markov decision processes (MDPs, for short) with the expected total reward criterion. By introducing a Lagrange multiplier and using methods of probability and analysis, we prove the existence of constrained optimal policies. Moreover, we prove that a constrained optimal policy may be a Markov policy, or a randomized Markov policy that randomizes between two Markov policies that differ in only one state.
Keywords: non-stationary MDPs; expected total reward criterion; constrained optimal policies
Asymptotic Evaluations of the Stability Index for a Markov Control Process with the Expected Total Discounted Reward Criterion
3
Author: Jaime Eduardo Martínez-Sánchez 《American Journal of Operations Research》 2021, No. 1, pp. 62-85 (24 pages)
In this work, a numerical estimate of the stability index is made for a control consumption-investment process with the discounted reward optimization criterion. Using explicit formulas for the optimal stationary policies and for the value functions, the stability index is calculated explicitly, and its asymptotic behavior as the discount coefficient approaches 1 is investigated through statistical techniques (using numerical experiments). The results obtained define the conditions under which an approximate optimal stationary policy can be used to control the original process.
Keywords: control consumption-investment process; discrete-time Markov control process; expected total discounted reward; probabilistic metrics; stability index estimation