The Asynchronous Advantage Actor Critic Network

In the previous chapters, we saw how powerful a Deep Q Network (DQN) is and how it succeeded in generalizing its learning to play a series of Atari games at human-level performance. But the problem we faced is that it requires a large amount of computation power and training time. To address this, Google's DeepMind introduced a new algorithm called Asynchronous Advantage Actor Critic (A3C), which outperforms many of the other deep reinforcement learning algorithms while requiring less computation power and training time. The main idea behind A3C is that it uses several agents learning in parallel and aggregates their overall experience. In this chapter, we will see how A3C networks work. Following this, we will learn how to build an agent that drives a car up a mountain using A3C.
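The parallel-learning idea can be sketched in just a few lines. The following is a minimal toy illustration, not the A3C implementation we build later in this chapter: a simple quadratic loss stands in for the actor-critic gradients, and the names global_params and worker are our own placeholders. Each worker thread copies the shared parameters, computes a local gradient from its own "experience", and pushes the update back asynchronously:

    import threading
    import numpy as np

    # Toy illustration of A3C's asynchronous update pattern: several
    # worker threads share one set of "global network" parameters,
    # compute gradients locally, and push their updates back without
    # waiting for each other.

    global_params = np.zeros(4)   # parameters of the shared global network
    lock = threading.Lock()       # guards access to the shared parameters

    def worker(worker_id, n_steps=200, lr=0.05):
        global global_params
        rng = np.random.default_rng(worker_id)
        target = np.ones(4)       # hypothetical optimum the workers seek
        for _ in range(n_steps):
            # each worker starts from its own copy of the global parameters
            with lock:
                local_params = global_params.copy()
            # a noisy local gradient stands in for gradients from experience
            grad = (local_params - target) + rng.normal(0.0, 0.1, size=4)
            # asynchronously apply the local gradient to the global network
            with lock:
                global_params -= lr * grad

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print("shared parameters after asynchronous training:", global_params)

In the real algorithm, each worker interacts with its own copy of the environment, and the shared parameters belong to the actor and critic networks, but the update pattern is the same.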

In this chapter, you will learn the following:

  • The Asynchronous Advantage Actor Critic Algorithm
  • The three As
  • The architecture of A3C
  • How A3C works
  • Driving up a mountain with A3C
  • Visualization in TensorBoard