Defining Hyperparameters for DQN

Below are the hyperparameters we will use throughout the code; all of them are configurable.

# Discount in Bellman Equation
gamma = 0.95

# Initial epsilon for epsilon-greedy exploration
epsilon = 1.0

# Minimum Epsilon
epsilon_min = 0.01

# Decay multiplier for epsilon
epsilon_decay = 0.99

# Size of deque container
deque_len = 20000

# Average score over the last 100 episodes needed to stop training
target_score = 200

# Number of games
episodes = 2000

# Data points per episode used to train the agent
batch_size = 64

# Optimizer for training the agent
optimizer = 'adam'

# Loss for training the agent
loss = 'mse'

  • gamma - Discount factor in the Bellman equation
  • epsilon - Initial probability of taking a random action under the epsilon-greedy policy
  • epsilon_decay - Multiplier by which the value of 'epsilon' is discounted after each episode/game (see the sketch after this list)
  • epsilon_min - Minimum value of 'epsilon' below which it is not decayed further
  • deque_len - Size of the deque container used to store the training examples (state, reward, done, and action)
  • target_score - Average score over the last 100 episodes at which the learning process stops
  • episodes - Maximum number of games the agent will play
  • batch_size - Size of the batch of training data (sampled from the deque container) used to train the agent after each episode
  • optimizer - Optimizer of choice for training the agent
  • loss - Loss function of choice for training the agent
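
To make the exploration schedule concrete, here is a minimal sketch of how epsilon shrinks after each episode while never falling below epsilon_min. It uses only the values defined above; the loop length and print statement are illustrative.

# Illustrative sketch of the epsilon decay schedule (not part of the agent)
epsilon = 1.0
epsilon_min = 0.01
epsilon_decay = 0.99

for episode in range(500):
    # ... play one episode, acting randomly with probability epsilon ...
    # Decay epsilon after the episode, but never below the floor
    epsilon = max(epsilon_min, epsilon * epsilon_decay)
    if episode % 100 == 0:
        print(episode, round(epsilon, 4))

With epsilon_decay = 0.99, epsilon hits the 0.01 floor after roughly 460 episodes, so the agent acts almost entirely greedily for the remainder of the 2000-episode budget.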
Experiment with different learning rates, optimizers, batch sizes, and epsilon_decay values to see how these factors affect the quality of your model, and if you get better results, share them with the deep learning community.
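
Before experimenting, it helps to see where these values plug in. The sketch below is a rough illustration assuming the Keras API; the network sizes are placeholders, not the architecture used for the agent (8 state inputs and 4 actions would match an environment such as LunarLander-v2). The replay memory is a bounded deque, optimizer and loss are passed straight to model.compile, and batch_size controls how many stored transitions are sampled for each training pass.

from collections import deque
import random

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Replay memory: a bounded deque that silently drops the oldest
# transitions once deque_len entries have been stored
memory = deque(maxlen=deque_len)

# Placeholder network: the 8 state inputs and 4 outputs are assumptions
# made for this sketch only
model = Sequential([
    Dense(64, activation='relu', input_shape=(8,)),
    Dense(64, activation='relu'),
    Dense(4, activation='linear')   # one Q-value per action
])

# The optimizer and loss hyperparameters plug in here, so trying a
# different optimizer or loss is a one-line change
model.compile(optimizer=optimizer, loss=loss)

# Training samples a random minibatch once enough transitions are stored
if len(memory) >= batch_size:
    minibatch = random.sample(memory, batch_size)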