Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 62122043 and 62192753, in part by the Natural Science Foundation of Shandong Province for Distinguished Young Scholars under Grant No. ZR2022JQ31, and in part by the Innovative Research Groups of the National Natural Science Foundation of China under Grant No. 61821004.
Abstract: In this paper, the authors design a reinforcement learning algorithm to solve the adaptive linear-quadratic stochastic n-player nonzero-sum differential game with completely unknown dynamics. For each player, a critic network is used to estimate the Q-function, and an actor network is used to estimate the control input. A model-free online Q-learning algorithm is obtained for solving this kind of problem. It is proved that, under some mild conditions, the system state and weight estimation errors are uniformly ultimately bounded. A simulation with five players is given to verify the effectiveness of the algorithm.
Funding: Supported by the National Natural Science Foundation of China (No. 61101223).
Abstract: To keep secrecy performance from being degraded by untrusted relays (URs), a multi-UR network using an amplify-and-forward (AF) cooperative scheme is proposed, which takes the relay weight and harmful factor into account. A nonzero-sum game is established to capture the interaction between the URs and the detection strategies. Secrecy capacity is used as the game payoff to indicate the untrusted behaviors of the relays. The maximum probabilities of the relays' behaviors and the optimal system detection strategy can be obtained using the proposed algorithm.
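As an illustration of how equilibrium probabilities arise in a nonzero-sum game of this kind, the following minimal Python sketch computes the interior mixed-strategy Nash equilibrium of a 2x2 bimatrix game from the indifference conditions. It is not the algorithm proposed in the paper: the function name `mixed_equilibrium_2x2`, the two-action simplification (relay: misbehave or cooperate; detector: detect or skip), and the integer payoffs are all hypothetical and are not derived from the secrecy-capacity model.

```python
from fractions import Fraction


def mixed_equilibrium_2x2(A, B):
    """Interior mixed-strategy Nash equilibrium of a 2x2 bimatrix game.

    A[i][j] is the row player's payoff and B[i][j] the column player's
    payoff when row plays action i and column plays action j.
    Payoffs are assumed to be integers (or Fractions).

    Returns (p, q), where p is the probability that row plays action 0
    and q is the probability that column plays action 0, or None when
    no strictly mixed equilibrium exists.
    """
    # Row's mix p must make the column player indifferent:
    #   p*B[0][0] + (1-p)*B[1][0] == p*B[0][1] + (1-p)*B[1][1]
    denom_p = B[0][0] - B[1][0] - B[0][1] + B[1][1]
    # Column's mix q must make the row player indifferent:
    #   q*A[0][0] + (1-q)*A[0][1] == q*A[1][0] + (1-q)*A[1][1]
    denom_q = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    if denom_p == 0 or denom_q == 0:
        return None
    p = Fraction(B[1][1] - B[1][0], denom_p)
    q = Fraction(A[1][1] - A[0][1], denom_q)
    # Only strictly interior probabilities describe a mixed equilibrium.
    if not (0 < p < 1 and 0 < q < 1):
        return None
    return p, q


# Hypothetical payoffs (illustration only).
# Rows: relay misbehaves (0) or cooperates (1).
# Columns: system detects (0) or skips detection (1).
RELAY_PAYOFF = [[-2, 3], [1, 1]]
DETECTOR_PAYOFF = [[2, -3], [-1, 0]]

if __name__ == "__main__":
    print(mixed_equilibrium_2x2(RELAY_PAYOFF, DETECTOR_PAYOFF))
```

With these made-up payoffs, the relay misbehaves with probability 1/6 and the detector inspects with probability 2/5. Exact rational arithmetic is used so that the indifference conditions can be checked without rounding error.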
Funding: Supported by the National Science Foundation for Distinguished Young Scholars of China (Grant No. 10925107) and the Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (2011).
Abstract: This paper studies two-person nonzero-sum games for denumerable continuous-time Markov chains determined by transition rates, with an expected average criterion. The transition rates are allowed to be unbounded, and the payoff functions may be unbounded from above and from below. We give suitable conditions under which the existence of a Nash equilibrium is ensured. More precisely, using the so-called "vanishing discount" approach, a Nash equilibrium for the average criterion is obtained as a limit point of a sequence of equilibrium strategies for the discounted criterion as the discount factors tend to zero. Our results are illustrated with a birth-and-death game.
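The two criteria linked by the vanishing discount approach can be written out as follows. This is a sketch in standard textbook notation; the symbols (state x_t, actions a_t and b_t, payoff rate r^k, strategy pair (π^1, π^2), discount factor α) are assumptions for illustration and may differ from the paper's own definitions.

```latex
% Assumed notation (not taken from the paper):
%   x_t          state of the continuous-time Markov chain
%   (a_t, b_t)   actions of players 1 and 2
%   r^k          payoff rate of player k
%   \alpha > 0   discount factor
\begin{align*}
V_\alpha^k(i,\pi^1,\pi^2)
  &= \mathbb{E}_i^{\pi^1,\pi^2}\!\left[\int_0^\infty e^{-\alpha t}\,
     r^k(x_t,a_t,b_t)\,dt\right]
  && \text{(discounted criterion)} \\
J^k(i,\pi^1,\pi^2)
  &= \liminf_{T\to\infty}\frac{1}{T}\,
     \mathbb{E}_i^{\pi^1,\pi^2}\!\left[\int_0^T r^k(x_t,a_t,b_t)\,dt\right]
  && \text{(expected average criterion)}
\end{align*}
% Vanishing discount: choose \alpha_n \downarrow 0 and a Nash equilibrium
% (\pi^1_{\alpha_n}, \pi^2_{\alpha_n}) of each \alpha_n-discounted game.
% A limit point (\pi^1_*, \pi^2_*) of this sequence is the candidate
% equilibrium for the average criterion.
```

Heuristically, the scaled value α V_α approximates the long-run average payoff as α tends to zero, which is why equilibria of the discounted games are natural candidates; the conditions under which the limit is actually an equilibrium are those established in the paper.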
Funding: Project supported by the National Key R&D Program of China (No. 2018YFB1702300) and the National Natural Science Foundation of China (Nos. 61722312 and 61533017).
Abstract: This paper presents a novel optimal synchronization control method for multi-agent systems with input saturation. Multi-agent game theory is introduced to transform the optimal synchronization control problem into a multi-agent nonzero-sum game. The Nash equilibrium can then be achieved by solving the coupled Hamilton–Jacobi–Bellman (HJB) equations with nonquadratic input energy terms. A novel off-policy reinforcement learning method is presented to obtain the Nash equilibrium solution without the system models, and critic neural networks (NNs) and actor NNs are introduced to implement the presented method. Theoretical analysis shows that the iterative control laws converge to the Nash equilibrium. Simulation results show the good performance of the presented method.
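A common way to encode input saturation through a nonquadratic input energy term is sketched below. This is the form widely used in the constrained optimal control literature and is given only as an assumed illustration: the saturation bound λ, the weight R, the input map g, and the value function V are hypothetical symbols, and the paper's exact cost may differ.

```latex
% Assumed notation (not taken from the paper):
%   |u| \le \lambda   saturation bound on a scalar input u
%   R > 0             input weight
%   g(x)              input map, V(x) value function
\begin{align*}
W(u) &= 2\int_0^{u} \lambda \tanh^{-1}\!\left(\frac{v}{\lambda}\right) R \, dv ,\\
% Stationarity of \nabla V^\top g\,u + W(u) with respect to u gives
% g^\top \nabla V + 2\lambda R \tanh^{-1}(u/\lambda) = 0, hence
u^{*}(x) &= -\lambda \tanh\!\left(\frac{1}{2\lambda}\, R^{-1} g(x)^{\top} \nabla V(x)\right).
\end{align*}
```

Because tanh is bounded by one in magnitude, the resulting control respects the bound |u*| ≤ λ by construction, which is the reason for replacing the usual quadratic term with W(u).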
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 90716028).
Abstract: Based on the theory of nonlinear quadratic two-person nonzero-sum differential games, it is noted that the time-limited mixed H2/H∞ control problem can be converted into the problem of finding a state-feedback Nash equilibrium point. On this basis, a theorem on the solution of the state-feedback control is given, and the Lyapunov stability of the nonlinear system under this control is proved. The solution is then used to design a nonlinear H2/H∞ guidance law for the relative motion between a missile and a target in three-dimensional (3D) space. By solving two coupled Hamilton-Jacobi partial differential inequalities (HJPDI), a control with improved robust stability and robust performance is obtained. For different H∞ performance indexes, the corresponding weighting factors of the control are designed analytically. Finally, simulations are carried out under different robust performance indexes, different initial conditions, and interception of different maneuvering targets. All results indicate that the designed law is valid.
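The game-theoretic reading of the mixed H2/H∞ problem is usually stated with two cost functionals, one per "player". The sketch below uses the standard formulation from the mixed H2/H∞ literature; the symbols (control u, disturbance w, controlled output z, attenuation level γ, horizon T) are assumptions for illustration and are not taken from the paper.

```latex
% Assumed notation (not taken from the paper):
%   u  control input,  w  disturbance,  z  controlled output,
%   \gamma > 0  prescribed H-infinity attenuation level,  [0, T]  time horizon
\begin{align*}
J_1(u,w) &= \int_0^T \left(\gamma^2 \lVert w(t)\rVert^2 - \lVert z(t)\rVert^2\right) dt ,
& J_2(u,w) &= \int_0^T \lVert z(t)\rVert^2 \, dt .
\end{align*}
% A pair (u^*, w^*) is a Nash equilibrium if neither player can lower
% its own cost by deviating unilaterally:
\begin{align*}
J_1(u^*,w^*) &\le J_1(u^*,w) \quad \text{for all admissible } w ,\\
J_2(u^*,w^*) &\le J_2(u,w^*) \quad \text{for all admissible } u .
\end{align*}
```

In this reading, the disturbance player's cost J_1 captures the H∞ attenuation requirement, while the control player's cost J_2 captures the H2 performance evaluated against the worst-case disturbance.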
Funding: Supported by the National Natural Science Foundation of China (10671112), the National Basic Research Program of China (973 Program) (2007CB814904), and the Natural Science Foundation of Shandong Province (Z2006A01).
Abstract: This paper studies the existence and uniqueness of solutions of fully coupled forward-backward stochastic differential equations with Brownian motion and random jumps. The result is applied to solve a linear-quadratic optimal control problem and a nonzero-sum differential game of backward stochastic differential equations. The optimal control and Nash equilibrium point are explicitly derived. The solvability of a kind of Riccati equation is also discussed. These results extend those of Lim and Zhou (2001) and Yu and Ji (2008).