Citation:
M. I. Abouheaf, F. L. Lewis, M. S. Mahmoud, D. G. Mikulski. Discrete-time dynamic graphical games: model-free reinforcement learning solution [J]. Control Theory and Technology, 2015, 13(1): 55-69.



Discrete-time dynamic graphical games: model-free reinforcement learning solution
M. I. Abouheaf, F. L. Lewis, M. S. Mahmoud, D. G. Mikulski
(Systems Engineering Department, King Fahd University of Petroleum & Minerals; UTA Research Institute, University of Texas at Arlington; State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University; Ground Vehicle Robotics (GVR), U.S. Army TARDEC)
Abstract:
This paper introduces a model-free reinforcement learning technique to solve a class of dynamic games known as dynamic graphical games. The graphical game arises from multi-agent dynamical systems in which pinning control is used to make all the agents synchronize to the state of a command generator or leader agent. Novel coupled Bellman equations and Hamiltonian functions are developed for the dynamic graphical games, and Hamiltonian mechanics are used to derive the necessary conditions for optimality. The Nash equilibrium solution of the dynamic graphical game is given in terms of the solution to a set of coupled Hamilton-Jacobi-Bellman equations developed herein. An online model-free policy iteration algorithm is developed to learn this Nash solution without requiring any knowledge of the agents' dynamics. A proof of convergence for this multi-agent learning algorithm is given under a mild assumption on the interconnectivity properties of the graph. A gradient descent technique with critic network structures is used to implement the policy iteration algorithm and solve the graphical game online in real time.
Key words:  Dynamic graphical games, Nash equilibrium, discrete mechanics, optimal control, model-free reinforcement learning, policy iteration
DOI:
Received: December 31, 2014    Revised: January 15, 2015
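For context, the sketch below records a generic form of the local neighborhood tracking error and the coupled Bellman equation on which dynamic graphical games are built. The notation (error delta_i, edge weights e_ij, pinning gain g_i, neighbor policies u_{-i}, weighting matrices Q_ii, R_ii, R_ij) follows the graphical-game literature and is illustrative; the exact coupled equations derived in the paper may differ.

```latex
% Local neighborhood tracking error of agent i with respect to its
% graph neighbors N_i and the leader x_0 (pinning gain g_i):
\delta_{i,k} = \sum_{j \in N_i} e_{ij}\,(x_{i,k} - x_{j,k}) + g_i\,(x_{i,k} - x_{0,k})

% A typical quadratic stage cost coupling agent i to its neighbors:
U_i(\delta_{i,k}, u_{i,k}, u_{-i,k}) =
  \tfrac{1}{2}\Big( \delta_{i,k}^{\top} Q_{ii}\, \delta_{i,k}
  + u_{i,k}^{\top} R_{ii}\, u_{i,k}
  + \sum_{j \in N_i} u_{j,k}^{\top} R_{ij}\, u_{j,k} \Big)

% Coupled Bellman equation for agent i's value function V_i; the
% coupling between agents enters through the neighbor policies u_{-i}:
V_i(\delta_{i,k}) = U_i(\delta_{i,k}, u_{i,k}, u_{-i,k}) + V_i(\delta_{i,k+1})
```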
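To make the model-free policy-iteration idea concrete, here is a minimal, self-contained Python sketch for a single agent's local error system. It learns a feedback gain from sampled transitions only, using a quadratic Q-function critic fitted by least squares (a standard model-free evaluation step in the style of Q-learning for LQR), rather than the paper's specific coupled actor-critic structure. The matrices A and B are hypothetical and are used only to simulate the environment and to verify the learned gain.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical local error dynamics d_{k+1} = A d_k + B u_k (illustrative
# numbers; the learner touches A, B only through sampled transitions).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
Q, R = np.eye(2), np.array([[1.0]])
n, m = 2, 1

def quad_features(z):
    """Features of z for a symmetric quadratic form z^T S z:
    z_i^2 on the diagonal, 2 z_i z_j for i < j (upper triangle)."""
    outer = np.outer(z, z)
    outer = outer + outer.T - np.diag(np.diag(outer))
    return outer[np.triu_indices(len(z))]

K = np.zeros((m, n))  # initial stabilizing policy u = -K d (A is stable)

for it in range(15):
    # Policy evaluation (model-free): fit Q_K(d, u) = [d; u]^T S [d; u]
    # from the Bellman relation Q_K(d_k, u_k) = r_k + Q_K(d_{k+1}, -K d_{k+1}).
    Phi, y = [], []
    d = rng.normal(size=n)
    for k in range(300):
        if k % 15 == 0:                         # occasional resets keep the
            d = rng.normal(size=n)              # regression well excited
        u = -K @ d + 0.1 * rng.normal(size=m)   # exploratory action
        r = d @ Q @ d + u @ R @ u               # stage cost
        d_next = A @ d + B @ u                  # environment step
        u_next = -K @ d_next                    # on-policy next action
        z, z_next = np.concatenate([d, u]), np.concatenate([d_next, u_next])
        Phi.append(quad_features(z) - quad_features(z_next))
        y.append(r)
        d = d_next
    s, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    S = np.zeros((n + m, n + m))
    S[np.triu_indices(n + m)] = s
    S = S + S.T - np.diag(np.diag(S))           # recover the symmetric form

    # Policy improvement: greedy action minimizing the learned Q-function,
    # u = -S_uu^{-1} S_ud d.
    K = np.linalg.solve(S[n:, n:], S[n:, :n])

print("learned gain K :", K)

# Model-based check (verification only): iterate the discrete Riccati
# equation to the optimal gain and compare.
P = np.eye(n)
for _ in range(500):
    G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ G)
K_opt = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
print("Riccati gain K*:", K_opt)
```

In the multi-agent graphical game, one such critic is maintained per agent, the error d is the local neighborhood error delta_i, and the Bellman regression additionally conditions on the neighbors' current policies, which is what couples the learning problems together.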