
Book Description

The book begins with a chapter on traditional methods of supervised learning, covering recursive least squares learning, mean square error methods, and stochastic approximation. Chapter 2 covers single-agent reinforcement learning. Topics include learning value functions, Markov games, and TD learning with eligibility traces. Chapter 3 discusses two-player matrix games with both pure and mixed strategies. Numerous algorithms and examples are presented. Chapter 4 covers learning in multi-player games, stochastic games, and Markov games, focusing on learning in multi-player grid games: two-player grid games, Q-learning, and Nash Q-learning. Chapter 5 discusses differential games, including multi-player differential games, the actor–critic structure, adaptive fuzzy control and fuzzy inference systems, the evader–pursuer game, and the game of guarding a territory. Chapter 6 discusses new ideas on learning within robotic swarms and the innovative idea of the evolution of personality traits.
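To give a flavor of the reinforcement-learning material covered in Chapters 2 and 4, the tabular Q-learning update can be sketched as below. This is a minimal illustrative example, not code from the book; the corridor environment, rewards, and hyperparameters are assumptions chosen for brevity.

```python
import random

# Tabular Q-learning on a 1-D corridor: states 0..4, goal at state 4.
# Actions: 0 = left, 1 = right. Reward 1 on reaching the goal, else 0.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # step size, discount, exploration rate

Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action):
    """One environment transition; returns (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

random.seed(0)
for _ in range(500):  # episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < EPSILON:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

# After learning, the greedy policy moves right in every non-goal state.
policy = [max((0, 1), key=lambda x: Q[s][x]) for s in range(GOAL)]
print(policy)
```

The multi-agent chapters extend this single-agent update by making each player's value estimate depend on the joint actions of all players, as in minimax-Q and Nash Q-learning.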

• Provides a framework for understanding a variety of methods and approaches in multi-agent machine learning.

• Discusses methods of reinforcement learning, including several forms of multi-agent Q-learning.

• Applicable to research professors and graduate students studying electrical and computer engineering, computer science, and mechanical and aerospace engineering.

Table of Contents

  1. Cover
  2. Title
  3. Copyright
  4. Preface
    1. References
  5. Chapter 1: A Brief Review of Supervised Learning
    1. 1.1 Least Squares Estimates
    2. 1.2 Recursive Least Squares
    3. 1.3 Least Mean Squares
    4. 1.4 Stochastic Approximation
    5. References
  6. Chapter 2: Single-Agent Reinforcement Learning
    1. 2.1 Introduction
    2. 2.2 n-Armed Bandit Problem
    3. 2.3 The Learning Structure
    4. 2.4 The Value Function
    5. 2.5 The Optimal Value Functions
    6. 2.6 Markov Decision Processes
    7. 2.7 Learning Value Functions
    8. 2.8 Policy Iteration
    9. 2.9 Temporal Difference Learning
    10. 2.10 TD Learning of the State-Action Function
    11. 2.11 Q-Learning
    12. 2.12 Eligibility Traces
    13. References
  7. Chapter 3: Learning in Two-Player Matrix Games
    1. 3.1 Matrix Games
    2. 3.2 Nash Equilibria in Two-Player Matrix Games
    3. 3.3 Linear Programming in Two-Player Zero-Sum Matrix Games
    4. 3.4 The Learning Algorithms
    5. 3.5 Gradient Ascent Algorithm
    6. 3.6 WoLF-IGA Algorithm
    7. 3.7 Policy Hill Climbing (PHC)
    8. 3.8 WoLF-PHC Algorithm
    9. 3.9 Decentralized Learning in Matrix Games
    10. 3.10 Learning Automata
    11. 3.11 Linear Reward–Inaction Algorithm
    12. 3.12 Linear Reward–Penalty Algorithm
    13. 3.13 The Lagging Anchor Algorithm
    14. 3.14 LR–I Lagging Anchor Algorithm
    15. References
  8. Chapter 4: Learning in Multiplayer Stochastic Games
    1. 4.1 Introduction
    2. 4.2 Multiplayer Stochastic Games
    3. 4.3 Minimax-Q Algorithm
    4. 4.4 Nash Q-Learning
    5. 4.5 The Simplex Algorithm
    6. 4.6 The Lemke–Howson Algorithm
    7. 4.7 Nash-Q Implementation
    8. 4.8 Friend-or-Foe Q-Learning
    9. 4.9 Infinite Gradient Ascent
    10. 4.10 Policy Hill Climbing
    11. 4.11 WoLF-PHC Algorithm
    12. 4.12 Guarding a Territory Problem in a Grid World
    13. 4.13 Extension of Lagging Anchor Algorithm to Stochastic Games
    14. 4.14 The Exponential Moving-Average Q-Learning (EMA Q-Learning) Algorithm
    15. 4.15 Simulation and Results Comparing EMA Q-Learning to Other Methods
    16. References
  9. Chapter 5: Differential Games
    1. 5.1 Introduction
    2. 5.2 A Brief Tutorial on Fuzzy Systems
    3. 5.3 Fuzzy Q-Learning
    4. 5.4 Fuzzy Actor–Critic Learning
    5. 5.5 Homicidal Chauffeur Differential Game
    6. 5.6 Fuzzy Controller Structure
    7. 5.7 Q(λ)-Learning Fuzzy Inference System
    8. 5.9 Learning in the Evader–Pursuer Game with Two Cars
    9. 5.10 Simulation of the Game of Two Cars
    10. 5.11 Differential Game of Guarding a Territory
    11. 5.12 Reward Shaping in the Differential Game of Guarding a Territory
    12. 5.13 Simulation Results
    13. References
  10. Chapter 6: Swarm Intelligence and the Evolution of Personality Traits
    1. 6.1 Introduction
    2. 6.2 The Evolution of Swarm Intelligence
    3. 6.3 Representation of the Environment
    4. 6.4 Swarm-Based Robotics in Terms of Personalities
    5. 6.5 Evolution of Personality Traits
    6. 6.6 Simulation Framework
    7. 6.7 A Zero-Sum Game Example
    8. 6.8 Implementation for Next Sections
    9. 6.9 Robots Leaving a Room
    10. 6.10 Tracking a Target
    11. 6.11 Conclusion
    12. References
  11. Index
  12. End User License Agreement