Summary

We learned a lot in this chapter. More importantly, we implemented an agent that learned a smart policy for solving the Mountain Car problem in about seven minutes of training!

We started by understanding the famous Mountain Car problem and looking at how the environment, the observation space, the state space, and the rewards are designed in Gym's MountainCar-v0 environment. We then revisited the reinforcement learning Gym boilerplate code we used in the previous chapter and made some improvements to it, which are also available in this book's code repository.
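As a quick refresher, the boilerplate amounts to the standard reset-step loop. The following is a minimal sketch assuming the classic (pre-0.26) Gym step/reset API used in this chapter, with a random placeholder policy and an illustrative episode count:

```python
import gym

env = gym.make("MountainCar-v0")
MAX_NUM_EPISODES = 5  # illustrative value, not the chapter's setting

for episode in range(MAX_NUM_EPISODES):
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = env.action_space.sample()  # random policy as a placeholder
        next_obs, reward, done, info = env.step(action)
        total_reward += reward
        obs = next_obs
    print(f"Episode {episode} finished with reward {total_reward}")
env.close()
```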

We then defined the hyperparameters for our Q-learning agent and implemented the Q-learning algorithm from scratch. We first wrote the agent's initialization function, which sets up the agent's internal state variables, including the Q-value representation as a NumPy n-dimensional array. We then implemented the discretize method to discretize the continuous state space; the get_action(...) method to select an action based on an epsilon-greedy policy; and finally the learn(...) function, which implements the Q-learning update rule and forms the core of the agent. We saw how simple it was to implement them all! We also implemented functions to train, test, and evaluate the agent's performance.
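To tie those pieces together, here is a minimal sketch of such an agent. The hyperparameter values and the bin count are illustrative assumptions, not the exact values used in the chapter:

```python
import numpy as np

NUM_BINS = 30   # bins per state dimension (assumed value)
ALPHA = 0.05    # learning rate (assumed value)
GAMMA = 0.98    # discount factor (assumed value)
EPSILON = 0.05  # exploration probability (assumed value)

class QLearningAgent:
    def __init__(self, obs_low, obs_high, num_actions):
        self.obs_low = obs_low
        self.obs_high = obs_high
        self.bin_width = (obs_high - obs_low) / NUM_BINS
        self.num_actions = num_actions
        # Q-value table: one cell per (discretized state, action) pair
        self.Q = np.zeros((NUM_BINS + 1, NUM_BINS + 1, num_actions))

    def discretize(self, obs):
        # Map a continuous observation to integer bin indices
        return tuple(((obs - self.obs_low) / self.bin_width).astype(int))

    def get_action(self, obs):
        state = self.discretize(obs)
        # Epsilon-greedy: explore with probability EPSILON, else exploit
        if np.random.random() < EPSILON:
            return np.random.randint(self.num_actions)
        return int(np.argmax(self.Q[state]))

    def learn(self, obs, action, reward, next_obs):
        state, next_state = self.discretize(obs), self.discretize(next_obs)
        # Q-learning update: move Q(s, a) toward the TD target
        td_target = reward + GAMMA * np.max(self.Q[next_state])
        self.Q[state][action] += ALPHA * (td_target - self.Q[state][action])
```

Combined with the environment loop shown earlier, training amounts to calling agent.get_action(obs) to pick each action and agent.learn(obs, action, reward, next_obs) after each step.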

I hope you had a lot of fun implementing the agent and watching it solve the Mountain Car problem in the Gym! In the next chapter, we will move on to advanced methods for solving a variety of more challenging problems.
