This book is intended as a text covering the central concepts and techniques of Competitive Markov Decision Processes. It is an attempt to present a rigorous treatment that combines two significant research topics: Stochastic Games and Markov Decision Processes, which have been studied extensively, and at times quite independently, by mathematicians, operations researchers, engineers, and economists. Since Markov decision processes can be viewed as a special noncompetitive case of stochastic games, we introduce the new terminology Competitive Markov Decision Processes that emphasizes the importance of the link between these two topics and of the properties of the underlying Markov processes.

The book is designed to be used either in a classroom or for self-study by a mathematically mature reader. In the Introduction (Chapter 1) we outline a number of advanced undergraduate and graduate courses for which this book could usefully serve as a text.

A characteristic feature of competitive Markov decision processes - and one that inspired our long-standing interest - is that they can serve as an "orchestra" containing the "instruments" of much of modern applied (and at times even pure) mathematics. They constitute a topic where the instruments of linear algebra, applied probability, mathematical programming, analysis, and even algebraic geometry can be "played" sometimes solo and sometimes in harmony to produce either beautifully simple or equally beautiful, but baroque, melodies, that is, theorems.
Publisher: Springer New York
Product dimensions: 6.10(w) x 9.25(h) x 0.03(d)
Table of Contents

1 Introduction.- 1.0 Background.- 1.1 Raison d’Etre and Limitations.- 1.2 A Menu of Courses and Prerequisites.- 1.3 For the Cognoscenti.- 1.4 Style and Nomenclature.

I Mathematical Programming Perspective.

2 Markov Decision Processes: The Noncompetitive Case.- 2.0 Introduction.- 2.1 The Summable Markov Decision Processes.- 2.2 The Finite Horizon Markov Decision Process.- 2.3 Linear Programming and the Summable Markov Decision Models.- 2.4 The Irreducible Limiting Average Process.- 2.5 Application: The Hamiltonian Cycle Problem.- 2.6 Behavior and Markov Strategies.- 2.7 Policy Improvement and Newton’s Method in Summable MDPs.- 2.8 Connection Between the Discounted and the Limiting Average Models.- 2.9 Linear Programming and the Multichain Limiting Average Process.- 2.10 Bibliographic Notes.- 2.11 Problems.

3 Stochastic Games via Mathematical Programming.- 3.0 Introduction.- 3.1 The Discounted Stochastic Games.- 3.2 Linear Programming and the Discounted Stochastic Games.- 3.3 Modified Newton’s Method and the Discounted Stochastic Games.- 3.4 Limiting Average Stochastic Games: The Issues.- 3.5 Zero-Sum Single-Controller Limiting Average Game.- 3.6 Application: The Travelling Inspector Model.- 3.7 Nonlinear Programming and Zero-Sum Stochastic Games.- 3.8 Nonlinear Programming and General-Sum Stochastic Games.- 3.9 Shapley’s Theorem via Mathematical Programming.- 3.10 Bibliographic Notes.- 3.11 Problems.

II Existence, Structure and Applications.

4 Summable Stochastic Games.- 4.0 Introduction.- 4.1 The Stochastic Game Model.- 4.2 Transient Stochastic Games.- 4.2.1 Stationary Strategies.- 4.2.2 Extension to Nonstationary Strategies.- 4.3 Discounted Stochastic Games.- 4.3.1 Introduction.- 4.3.2 Solutions of Discounted Stochastic Games.- 4.3.3 Structural Properties.- 4.3.4 The Limit Discount Equation.- 4.4 Positive Stochastic Games.- 4.5 Total Reward Stochastic Games.- 4.6 Nonzero-Sum Discounted Stochastic Games.- 4.6.1 Existence of Equilibrium Points.- 4.6.2 A Nonlinear Complementarity Problem.- 4.6.3 Perfect Equilibrium Points.- 4.7 Bibliographic Notes.- 4.8 Problems.

5 Average Reward Stochastic Games.- 5.0 Introduction.- 5.1 Irreducible Stochastic Games.- 5.2 Existence of the Value.- 5.3 Stationary Strategies.- 5.4 Equilibrium Points.- 5.5 Bibliographic Notes.- 5.6 Problems.

6 Applications and Special Classes of Stochastic Games.- 6.0 Introduction.- 6.1 Economic Competition and Stochastic Games.- 6.2 Inspection Problems and Single-Control Games.- 6.3 The Presidency Game and Switching-Control Games.- 6.4 Fishery Games and AR-AT Games.- 6.5 Applications of SER-SIT Games.- 6.6 Advertisement Models and Myopic Strategies.- 6.7 Spend and Save Games and the Weighted Reward Criterion.- 6.8 Bibliographic Notes.- 6.9 Problems.

Appendix G Matrix and Bimatrix Games and Mathematical Programming.- G.1 Introduction.- G.2 Matrix Game.- G.3 Linear Programming.- G.4 Bimatrix Games.- G.5 Mangasarian-Stone Algorithm for Bimatrix Games.- G.6 Bibliographic Notes.

Appendix H A Theorem of Hardy and Littlewood.- H.1 Introduction.- H.2 Preliminaries, Results and Examples.- H.3 Proof of the Hardy-Littlewood Theorem.

Appendix M Markov Chains.- M.1 Introduction.- M.2 Stochastic Matrix.- M.3 Invariant Distribution.- M.4 Limit Discounting.- M.5 The Fundamental Matrix.- M.6 Bibliographic Notes.

Appendix P Complex Varieties and the Limit Discount Equation.- P.1 Background.- P.2 Limit Discount Equation as a Set of Simultaneous Polynomials.- P.3 Algebraic and Analytic Varieties.- P.4 Solution of the Limit Discount Equation via Analytic Varieties.

References.