Funding: This work was supported by the National Natural Science Foundation of China.
Abstract: This paper investigates the design of satisfaction output-feedback controls for stochastic nonlinear systems in strict-feedback form under a long-term tracking risk-sensitive index. The index function adopted here is of the quadratic form usually encountered in practice, rather than the quartic form used to sidestep the essential difficulty in controller design and performance analysis of the closed-loop system. For any given risk-sensitive parameter and desired index value, an output-feedback control is constructively designed by the integrator backstepping method so that the closed-loop system is bounded in probability and the risk-sensitive index is upper bounded by the desired value.
Funding: Supported by the Natural Science Foundation of Shandong Province (Grant Nos. ZR2020MA032, ZR2022MA029), the National Natural Science Foundation of China (Grant No. 72171133), and the high-quality course for postgraduate education in Shandong Province "Intermediate Econometrics (Graded Teaching)" (SDYKC21137).
Funding: Supported in part by the National Natural Science Foundation of China under Grant No. 62261136550 and in part by the Basic Research Project of Shanghai Science and Technology Commission under Grant No. 20JC1414000.
Abstract: The authors propose a data-driven direct adaptive control law based on the adaptive dynamic programming (ADP) algorithm for continuous-time stochastic linear systems with partially unknown system dynamics and infinite-horizon quadratic risk-sensitive indices. Online data from the system are used to iteratively solve the generalized algebraic Riccati equation (GARE) and to learn the optimal control law directly. For the case with measurable system noises, the authors show that the adaptive control law approximates the optimal control law as time goes on. For the case with unmeasurable system noises, the GARE is solved iteratively using the least-squares solution calculated only from the measurable data instead of the real solution of the regression equation. The authors also study how the intensity of the system noises, the intensity of the exploration noises, the initial iterative matrix, and the sampling period influence the convergence of the ADP algorithm. Finally, two numerical simulation examples demonstrate the effectiveness of the proposed algorithms.
Funding: Supported by PRFU project N (Grant No. C00L03UN070120220004).
Abstract: This study extends the G-stochastic maximum principle (G-SMP) from a risk-neutral framework to a risk-sensitive one. A salient feature of this extension is its applicability to systems governed by stochastic differential equations under G-Brownian motion (G-SDEs), where the control variable may influence all terms. We aim to generalize our findings from a risk-neutral context to a risk-sensitive performance cost. First, we introduce an auxiliary process to address risk-sensitive performance costs within the G-expectation framework. We then establish and validate the relationship between the G-expected exponential utility and the G-quadratic backward stochastic differential equation. Furthermore, we simplify the G-adjoint process from a two-component structure to a single component. Moreover, we derive the necessary optimality conditions for this model by considering a convex set of admissible controls. To illustrate the main findings, we present two examples: the first addresses the linear-quadratic problem and the second examines a Merton-type problem characterized by power utility.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 11801154 and 11901112.
Abstract: The paper considers partially observed optimal control problems for risk-sensitive stochastic systems, where the control domain is non-convex and the diffusion term contains the control v. Using Girsanov's theorem, the spike variational technique, and the duality method, the authors obtain four adjoint equations and establish a maximum principle under partial information. As an application, an example is presented to demonstrate the result.
Funding: Supported by the Air Force Office of Scientific Research (AFOSR) under Grant No. FA9550-19-1-0353 and the Army Research Office MURI under Grant No. AG285.
Abstract: This is an overview paper on the relationship between risk-averse designs based on exponential loss functions, with or without an additional unknown (adversarial) term, and some classes of stochastic games. In particular, the paper discusses the equivalences between risk-averse controller and filter designs and saddle-point solutions of some corresponding risk-neutral stochastic differential games with different information structures for the players. One of the by-products of these analyses is that risk-averse controllers and filters (or estimators) for control and signal-measurement models are robust, through stochastic dissipation inequalities, to unmodeled perturbations in the controlled system dynamics as well as in the signal and measurement processes. The paper also discusses equivalences between risk-sensitive stochastic zero-sum differential games and some corresponding risk-neutral three-player stochastic zero-sum differential games, as well as robustness issues in stochastic nonzero-sum differential games with finite and infinite populations of players, with the latter belonging to the domain of mean-field games.
Funding: Supported by the National Basic Research Program of China (973 Program, 2007CB814904), the National Natural Science Foundations of China (10921101) and Shandong Province (2008BS01024, ZR2010AQ004), the Science Funds for Distinguished Young Scholars of Shandong Province (JQ200801) and Shandong University (2009JQ004), and the Independent Innovation Foundations of Shandong University (IIFSDU, 2009TS036, 2010TS060).
Abstract: A stochastic maximum principle for the risk-sensitive optimal control problem of jump diffusion processes with an exponential-of-integral cost functional is derived under the assumption that the value function is smooth, where the diffusion and jump terms may both depend on the control. The form of the maximum principle is similar to its risk-neutral counterpart, but the adjoint equations and the maximum condition depend heavily on the risk-sensitive parameter. As an application, a linear-quadratic risk-sensitive control problem is solved using the derived maximum principle, and an explicit optimal control is obtained.
Funding: Supported by the National Natural Science Foundation of China (No. 60874044) and the Doctoral Foundation of the Ministry of Education (No. 20111102110006).
Abstract: More and more data fusion models contain state constraints that carry valuable information for the filtering process. In this study, a risk-sensitive optimal filter with quasi-equality constraints is formulated using the reference probability method. Through recursions of the probability density obtained from the change of probability measure, the derived algorithm is optimal in the sense of the risk-sensitive parameter. The system and constraint models are statistically consistent. Simulation results show that the filter is more robust and efficient than projection filters under worst-case noise and model uncertainty.
Abstract: Tail risk is a classic topic in stressed portfolio optimization for treating unprecedented risks, where the traditional mean–variance approach may fail to perform well. This study proposes an innovative semiparametric method consisting of two modeling components: nonparametric estimation for each marginal distribution of the portfolio and the copula method for their joint distribution. We then focus on the optimal weights of the stressed portfolio and its optimal scale beyond the Gaussian restriction. Empirical studies include statistical estimation for the semiparametric method, risk measure minimization for the optimal weights, and value measure maximization for the optimal scale to enlarge the investment. In both the short-term and long-term data analyses, the optimal stressed portfolios demonstrate the advantage of model flexibility in accounting for tail risk over the traditional mean–variance method.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 10471088 and 60572126).
Abstract: A new algorithm is proposed that potentially sacrifices the optimality of control policies to gain robustness of solutions. Robustness may become a very important property for a learning system when there is a mismatch between the theoretical model and the practical physical system, when the practical system is not static, or when the availability of a control action changes over time. The main contribution is a set of approximation algorithms together with their convergence results. A generalized average operator, instead of the usual optimality operator max (or min), is applied to study an important class of learning algorithms, dynamic programming algorithms, and their convergence is discussed from a theoretical point of view. The purpose of this research is to improve the robustness of reinforcement learning algorithms theoretically.
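As a rough illustration of replacing max with a generalized average in dynamic programming, the sketch below runs value iteration on a small tabular MDP. It is not the algorithm of the paper above, whose operator the abstract does not specify. The log-mean-exp average, the names `generalized_average` and `value_iteration`, the parameter `beta`, and the MDP encoding are all assumptions made for this example.

```python
import math


def generalized_average(values, beta):
    """Log-mean-exp average (an assumed choice of generalized average).

    Approaches max(values) as beta grows and the arithmetic mean as
    beta approaches 0 from above.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    m = max(values)  # subtract the maximum for numerical stability
    mean_exp = sum(math.exp(beta * (v - m)) for v in values) / len(values)
    return m + math.log(mean_exp) / beta


def value_iteration(P, R, gamma, beta, tol=1e-10, max_iter=10_000):
    """Value iteration with max replaced by generalized_average.

    P[s][a] is a list of (probability, next_state) pairs and R[s][a]
    is the immediate reward (hypothetical encoding for this sketch).
    """
    V = [0.0] * len(P)
    for _ in range(max_iter):
        new_V = []
        for s in range(len(P)):
            # One-step lookahead value of every action in state s.
            q = [
                R[s][a] + gamma * sum(p * V[t] for p, t in P[s][a])
                for a in range(len(P[s]))
            ]
            new_V.append(generalized_average(q, beta))
        if max(abs(x - y) for x, y in zip(new_V, V)) < tol:
            return new_V
        V = new_V
    return V
```

Because this average is monotone and shifts by c when every argument shifts by c, it is a non-expansion in the sup norm, so with a discount factor `gamma` below 1 the backup remains a contraction and the iteration converges. A small `beta` averages over actions and a large `beta` approaches ordinary value iteration.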
Abstract: The risk-sensitive filtering design problem with respect to the exponential mean-square cost criterion is considered for stochastic Gaussian systems with polynomial drift terms of second and third degree and intensity parameters multiplying the diffusion terms in the state and observation equations. The closed-form optimal filtering equations are obtained using quadratic value functions as solutions to the corresponding Fokker-Planck-Kolmogorov equation. The performance of the obtained risk-sensitive filtering equations for stochastic polynomial systems of second and third degree is verified in a numerical example against the optimal polynomial filtering equations (and the extended Kalman-Bucy filter for the second-degree polynomial system) by comparing the exponential mean-square cost criterion values. The simulation results reveal strong advantages in favor of the designed risk-sensitive equations for some values of the intensity parameters.
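Abstract: This paper studies edge computing for mission-critical tasks in smart factories. Considering the risk of high latency that may arise in edge computing from channel uncertainty and limited computing resources, the tail of the latency distribution is first characterized using the conditional value at risk (CVaR), and an upper bound on the latency CVaR is derived by exploiting the convexity and translation equivariance of CVaR. Furthermore, the average latency with which machines process computing tasks and the CVaR upper bound are jointly optimized through edge-server selection and computing-resource allocation. Simulation experiments verify that the model effectively characterizes the high-latency part of the distribution. The simulation results show that the proposed strategy not only improves computing reliability but also reduces the high-risk latency values.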
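To make the CVaR quantity in this abstract concrete, here is a minimal sketch of an empirical CVaR estimate from a list of latency samples, using the Rockafellar-Uryasev form evaluated at the empirical value at risk. The function name `empirical_cvar`, the confidence level `alpha`, and the sample data are assumptions for illustration. The paper's own upper bound and optimization model are not reproduced here.

```python
import math


def empirical_cvar(samples, alpha=0.95):
    """Empirical CVaR at level alpha of a list of losses (e.g. latencies).

    Computes VaR + mean(max(x - VaR, 0)) / (1 - alpha), where VaR is the
    empirical alpha-quantile of the samples.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    xs = sorted(samples)
    n = len(xs)
    # Smallest sample whose empirical CDF value is at least alpha.
    k = max(math.ceil(alpha * n) - 1, 0)
    var = xs[k]
    # Average amount by which samples exceed the VaR.
    mean_excess = sum(max(x - var, 0.0) for x in xs) / n
    return var + mean_excess / (1.0 - alpha)
```

For ten latencies 1, 2, ..., 10 and `alpha = 0.5`, the estimate equals the mean of the worst half of the samples. Adding a constant to every sample shifts the estimate by the same constant, which is the translation equivariance the abstract relies on.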
Funding: Supported by the National Natural Science Foundations of China under Grant Nos. 61273124, 61174141, the China Postdoctoral Science Foundation under Grant No. 2011M501132, the Special Funds for Postdoctoral Innovative Projects of Shandong Province under Grant No. 201103043, the Doctoral Foundation of Taishan University under Grant No. Y11-2-02, and A Project of Shandong Province Higher Education Science and Technology Program under Grant No. J12LN90.
Abstract: This paper investigates risk-sensitive fixed-point smoothing estimation for linear discrete-time systems with multiple time-delay measurements. The problem considered can be converted into an optimization problem in indefinite space. The risk-sensitive fixed-point smoother is then obtained by solving the optimization problem via innovation analysis theory in indefinite space. Necessary and sufficient conditions guaranteeing the existence of the risk-sensitive smoother are also given for the case where the risk-sensitive parameter is negative. Compared with the conventional approach, a significant advantage of the presented approach is its lower computational cost.