Deep Q-Network

Deep Q-Network (DQN) algorithms combine reinforcement learning with deep learning. A DQN learns by itself, empirically, from experience, rather than through rigid programming aimed at a particular objective, such as winning a game of chess.

DQN is an application of Q-learning that uses deep learning to approximate the evaluation function. It was proposed by Mnih et al. in an article published in Nature on February 26, 2015. The paper drew many research institutes into the field, because deep neural networks enable reinforcement learning algorithms to deal directly with high-dimensional states.

Deep neural networks were not adopted sooner because researchers had observed that naively using a neural network to approximate the Q-evaluation function makes a reinforcement learning system unstable or divergent. Small updates to Q can significantly change the policy, the distribution of the data, and the correlation between Q and the target values. These correlations, together with those present in the sequence of observations, are the cause of the instability of the algorithms.
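For concreteness, the naive approach trains the network to minimize the squared difference between its prediction and a bootstrapped target (a standard formulation; the notation here is ours, not the book's, with $r$ the reward, $\gamma$ the discount factor, and $s'$ the next state):

$$L(\theta) = \mathbb{E}_{(s,a,r,s')}\Big[\big(r + \gamma \max_{a'} Q(s',a';\theta) - Q(s,a;\theta)\big)^2\Big]$$

Because the same parameters $\theta$ appear in both the prediction $Q(s,a;\theta)$ and the target $r + \gamma \max_{a'} Q(s',a';\theta)$, every update shifts the very target the network is chasing, which is exactly the correlation between Q and the target values described above.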

To transform a normal Q-network into a DQN, the following modifications are needed (a minimal sketch of the resulting setup follows the list):

  • Replace the single-layer neural network with a multi-layer convolutional network to approximate the Q-function
  • Implement experience replay
  • Use a second network to calculate the target Q-values during the updates
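As a rough illustration of these three ingredients, here is a minimal sketch in PyTorch. The framework choice, the Atari-style 84x84x4 input, and all layer sizes and buffer capacities are assumptions for illustration, not values taken from the text.

```python
# A minimal sketch of the three DQN ingredients (illustrative, not prescriptive).
import random
from collections import deque

import torch
import torch.nn as nn


class QNetwork(nn.Module):
    """Multi-layer convolutional approximator of the Q-function."""

    def __init__(self, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            # Assumed input: a stack of 4 grayscale 84x84 frames.
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),  # one Q-value per action
        )

    def forward(self, x):
        return self.net(x)


n_actions = 4
q_net = QNetwork(n_actions)             # online network, updated every step
target_net = QNetwork(n_actions)        # second network for the target Q-values
target_net.load_state_dict(q_net.state_dict())  # start as an exact copy

replay_buffer = deque(maxlen=100_000)   # experience replay: the stored transitions
```

The second network is kept frozen between periodic synchronizations with the online network, so the targets change slowly rather than with every gradient step.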

What is meant by the term experience replay? Instead of running Q-learning on state/action pairs as they occur during a simulation or actual experience, the system stores the transitions it discovers, typically in a large table. The network can then train itself on memories drawn from this table rather than only on its most recent experience, which breaks the correlations in the sequence of observations that cause instability.
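Continuing the hypothetical sketch above (it reuses q_net, target_net, and replay_buffer; the transition format, batch size, and learning rate are assumptions), this is one common way to wire experience replay and the target network into a training step:

```python
import random

import torch
import torch.nn as nn

optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-4)


def store(state, action, reward, next_state, done):
    """Append one transition to the replay table as it occurs."""
    replay_buffer.append((state, action, reward, next_state, done))


def train_step(batch_size: int = 32, gamma: float = 0.99):
    if len(replay_buffer) < batch_size:
        return  # not enough stored experience yet
    # Train on a random minibatch of stored memories, not the latest transition.
    batch = random.sample(replay_buffer, batch_size)
    states, actions, rewards, next_states, dones = zip(*batch)

    states = torch.stack(states)        # assumes states were stored as tensors
    next_states = torch.stack(next_states)
    actions = torch.tensor(actions).unsqueeze(1)
    rewards = torch.tensor(rewards, dtype=torch.float32)
    dones = torch.tensor(dones, dtype=torch.float32)

    # Target Q-values come from the frozen second network.
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + gamma * (1.0 - dones) * next_q

    q_values = q_net(states).gather(1, actions).squeeze(1)
    loss = nn.functional.mse_loss(q_values, targets)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Periodically: target_net.load_state_dict(q_net.state_dict())
```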
