This book is dedicated to my parents, Abhijit and Sharbari Majumder, and my late grandfather, Shri Paresh Chandra Majumder.
Machine learning has been instrumental in shaping the scope of technology since its inception. ML has played an important role in the development of things such as autonomous vehicles and robotics. Deep reinforcement learning is the field of learning in which agents learn with the help of rewards, an idea derived from nature. Through this book, the author tries to present the diversity of reinforcement learning algorithms in game development as well as in scientific research. Unity, the cross-platform engine used in a plethora of tasks, from visual effects and cinematography to machine learning and high-performance graphics, is the primary tool used in this book. With the power of the Unity ML Agents Toolkit, the deep reinforcement learning framework built by Unity, the author tries to show the vast possibilities of this learning paradigm.
The book starts with an introduction to state-based reinforcement learning, from Markov processes to Bellman equations and Q-learning, which sets the ground for the successive sections. A diverse set of pathfinding algorithms, from Dijkstra to sophisticated variants of A*, is provided along with simulations in Unity. The book also covers how navigation meshes work for automated pathfinding in Unity. An introduction to the ML Agents Toolkit, from the standard installation process to training an AI agent with a deep reinforcement learning algorithm (proximal policy optimization [PPO]), is provided as a starter. Throughout the book, the TensorFlow framework is used extensively along with OpenAI Gym environments to visualize complex deep reinforcement learning algorithms in terms of simulations, robotics, and autonomous agents. Successive sections of the book involve an in-depth study of a variety of on- and off-policy algorithms, ranging from discrete SARSA/Q-learning to actor critic variants, deep Q-network variants, and PPO, together with their implementations using the Keras TensorFlow framework on Gym. These sections are instrumental in understanding how different simulations such as the famous Puppo (Unity Berlin), Tiny agents, and other ML Agents samples from Unity are created and built. The book also provides sections with detailed descriptions of how to build simulations in Unity using the C# software development kit for ML Agents and how to train them using soft actor critic (SAC), PPO, or imitation learning algorithms such as behavioral cloning and GAIL.
The latter part of this book provides an insight into curriculum learning and adversarial networks with an analysis of how AI agents are trained in games such as FIFA. In all these sections, the variants of neural networks (MLP, convolutional networks, and recurrent networks, including long short-term memory and GRU) are described in detail along with their implementations and performance. This is especially helpful because they are used extensively in building the deep learning algorithms. The importance of convolutional networks for image sampling in Atari-based 2D games such as Pong is also discussed. Knowledge of computer vision and deep reinforcement learning is combined to produce autonomous vehicles and driverless cars, and an example template (game) is provided for readers to build upon.
Finally, this book also contains an in-depth review of the Obstacle Tower Challenge, which was organized by Unity Technologies to challenge state-of-the-art deep reinforcement learning algorithms. Sections on certain evolutionary algorithms along with the Google Dopamine framework have been provided for understanding the vast field of reinforcement learning. Through this book, the author hopes to infuse enthusiasm and foster research among the readers in the field of deep reinforcement learning.
The amount of dedication and support that I have received in the making of this book has left me amazed. First, I would like to thank my family, Mr. Abhijit Majumder and Mrs. Sharbari Majumder, who have been instrumental in supporting me all the way. I would also like to extend my heartfelt thanks to the entire Apress Team, without whom this would not have been possible. Special thanks to Mrs. Spandana Chatterjee, the Acquisition Editor, Mr. Shrikant Vishwakarma, the Coordinating Editor, and Laura Berendson, the Development Editor, for their constant support and thorough reviews. Ansh Shah, the Technical Reviewer of this book, has also played an important role and I extend my thanks to him.
I would also like to use this space to thank my mentor, Carl Domingo from Unity Technologies, who has guided me from the beginning of my journey with Unity. The Unity Machine Learning team deserves mention, as this book would not have been possible without their constant efforts to make the ML Agents platform amazing. I especially thank Dr. Danny Lange, whose sessions on machine learning have been instrumental in my understanding of the framework and the concepts.
I am grateful to everyone who helped in the entire process of making this book, which I hope will help readers understand the beauty of deep reinforcement learning.
Abhilash is a former apprentice/student ambassador for Unity Technologies, where he educated corporate employees and students on using Unity for game development. He was a technical mentor (AI programming) for the Unity Ambassadors Community and Content Production. He has been associated with Unity Technologies for general education, with an emphasis on graphics and machine learning. He is a community moderator for machine learning (ML Agents) sessions organized by Unity Technologies (Unity Learn). He is one of the first content creators for Unity Technologies India (since 2017) and is responsible for the growth of the community in India under the guidance of Unity Technologies.
is pursuing an MSc in Physics and a BE in Mechanical Engineering from BITS Pilani University, India. By day, he is a student. By night, he is a robotics and machine learning enthusiast. He is a core member of BITS ROBOCON, a technical team at college, and is currently working on quadcopter and quadruped projects.