Defining the Memory

Let's define a deque object to store the information (state, new state, reward, done, action) for every relevant step we take when playing the game. We will then use the data stored in this deque object for training.

from collections import deque
training_data = deque(maxlen=deque_len)

We have defined the deque object with a maximum length of 20000. Once this container holds 20000 data points, every new append at one end pops a data point from the other end. As a result, we retain only the most recent information over time.
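To make the eviction behavior concrete, here is a minimal sketch that uses a much smaller maxlen than the 20000 used in the text. The name demo_buffer and the integer data points are assumptions made purely for illustration.

```python
from collections import deque

# Illustration only: a tiny maxlen (the text uses 20000) so eviction is easy to see.
demo_buffer = deque(maxlen=3)

# Append five data points; the container can only hold the latest three.
for step in range(5):
    demo_buffer.append(step)

# The two oldest entries (0 and 1) were dropped from the left end.
print(list(demo_buffer))  # [2, 3, 4]
```

Because appends happen on the right, the oldest entries fall off the left, which is why the container always holds the most recent experience.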

We will define a function called memory which, when called during the game, accepts the state, new state, reward, done flag and action for that time step as input and stores them in the training_data deque container we have defined above. You will see that we are storing these five variables as a single tuple entry at each time step.

def memory(state, new_state, reward, done, action):
    """Function to store data points in the deque container."""
    training_data.append((state, new_state, reward, done, action))
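As a rough usage sketch, the following self-contained snippet stores one transition and reads it back. The state vectors, reward and action values are hypothetical placeholders, and deque_len is set to 20000 on the assumption that it matches the size mentioned in the text.

```python
from collections import deque

deque_len = 20000  # assumed to match the size described in the text
training_data = deque(maxlen=deque_len)


def memory(state, new_state, reward, done, action):
    """Function to store data points in the deque container."""
    training_data.append((state, new_state, reward, done, action))


# Hypothetical values for a single time step (not taken from any real game).
state = [0.0, 0.1]
new_state = [0.1, 0.2]
memory(state, new_state, reward=1.0, done=False, action=1)

# Each entry unpacks in the same order in which it was stored.
s, s_next, r, d, a = training_data[-1]
print(len(training_data), r, d, a)
```

Keeping the tuple order identical when storing and unpacking matters, since nothing in the tuple itself labels which field is which.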