Markov Decision Process

A Markov Decision Process (MDP) is an extension of the Markov chain: it adds actions and rewards to the chain's states and transitions. It provides a mathematical framework for modeling decision-making situations. Almost all Reinforcement Learning problems can be modeled as an MDP.

An MDP is represented by five important elements:

  • A set of states (S) the agent can actually be in.
  • A set of actions (A) that can be performed by an agent, for moving from one state to another.
  • A transition probability (P(s' | s, a)), which is the probability of moving from one state (s) to another state (s') by performing some action (a).
  • A reward probability (R(s, a, s')), which is the probability of a reward acquired by the agent for moving from one state (s) to another state (s') by performing some action (a).
  • A discount factor (γ), which controls the importance of immediate and future rewards. We will discuss this in detail in the upcoming sections.
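
To make the five elements concrete, here is a minimal Python sketch of an MDP as a plain data structure. It is an illustration only, not the book's implementation: the names MDP, is_valid, discounted_return, and the two-state toy problem are all hypothetical, and the reward is simplified to a single number per (state, action, next state) triple. The discounted_return helper shows how the discount factor weights a reward received t steps in the future by the discount factor raised to the power t.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MDP:
    """Hypothetical container for the five elements of an MDP."""
    states: tuple      # set of states
    actions: tuple     # set of actions
    transition: dict   # (s, a) -> {s_next: probability}
    reward: dict       # (s, a, s_next) -> reward (simplified to a number)
    gamma: float       # discount factor, between 0 and 1


def is_valid(mdp):
    """Check that each (state, action) pair has probabilities summing to 1."""
    for dist in mdp.transition.values():
        if abs(sum(dist.values()) - 1.0) > 1e-9:
            return False
    return 0.0 <= mdp.gamma <= 1.0


def discounted_return(rewards, gamma):
    """Sum the rewards, weighting the reward at step t by gamma ** t."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total


# A made-up two-state problem, used only to show the shape of the data.
toy = MDP(
    states=("cool", "hot"),
    actions=("slow", "fast"),
    transition={
        ("cool", "slow"): {"cool": 1.0},
        ("cool", "fast"): {"cool": 0.5, "hot": 0.5},
        ("hot", "slow"): {"cool": 1.0},
        ("hot", "fast"): {"hot": 1.0},
    },
    reward={
        ("cool", "slow", "cool"): 1.0,
        ("cool", "fast", "cool"): 2.0,
        ("cool", "fast", "hot"): 2.0,
        ("hot", "slow", "cool"): 1.0,
        ("hot", "fast", "hot"): -10.0,
    },
    gamma=0.9,
)

print(is_valid(toy))                         # True
print(discounted_return([1.0, 1.0, 1.0], 0.5))  # 1.75
```

With a discount factor of 0.5, three rewards of 1 contribute 1 + 0.5 + 0.25 = 1.75. With a discount factor of 0 only the immediate reward counts, and with a discount factor of 1 all three rewards count equally.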