The Asynchronous Advantage Actor Critic Network

In the previous chapters, we saw how powerful a Deep Q Network (DQN) is and how it succeeded in generalizing its learning to play a series of Atari games at human-level performance. But the problem we faced is that it requires a large amount of computation power and training time. To address this, Google's DeepMind introduced a new algorithm called Asynchronous Advantage Actor Critic (A3C), which outperforms many of the other deep reinforcement learning algorithms while requiring less computation power and training time. The main idea behind A3C is that it uses several agents learning in parallel and aggregates their overall experience. In this chapter, we will see how A3C networks work. Following this, we will learn how to build an agent that drives a car up a mountain using A3C.
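The parallel-learning idea can be sketched in just a few lines. The following is a minimal toy illustration, not the A3C implementation we build later in this chapter: a simple quadratic loss stands in for the actor-critic gradients, and the names global_params and worker are our own placeholders. Each worker thread copies the shared parameters, computes a local gradient from its own "experience", and pushes the update back asynchronously:

    import threading
    import numpy as np

    # Toy illustration of A3C's asynchronous update pattern: several
    # worker threads share one set of "global network" parameters,
    # compute gradients locally, and push their updates back without
    # waiting for each other.

    global_params = np.zeros(4)   # parameters of the shared global network
    lock = threading.Lock()       # guards access to the shared parameters

    def worker(worker_id, n_steps=200, lr=0.05):
        global global_params
        rng = np.random.default_rng(worker_id)
        target = np.ones(4)       # hypothetical optimum the workers seek
        for _ in range(n_steps):
            # each worker starts from its own copy of the global parameters
            with lock:
                local_params = global_params.copy()
            # a noisy local gradient stands in for gradients from experience
            grad = (local_params - target) + rng.normal(0.0, 0.1, size=4)
            # asynchronously apply the local gradient to the global network
            with lock:
                global_params -= lr * grad

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print("shared parameters after asynchronous training:", global_params)

In the real algorithm, each worker interacts with its own copy of the environment, and the shared parameters belong to the actor and critic networks, but the update pattern is the same.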

In this chapter, you will learn the following:

  • The Asynchronous Advantage Actor Critic Algorithm
  • The three As
  • The architecture of A3C
  • How A3C works
  • Driving up a mountain with A3C
  • Visualization in TensorBoard