Doing MCMC the manual way

In this recipe, we will go through a full example of coding an MCMC algorithm ourselves. This will give us a much better grasp of MCMC mechanics.

In the Bayesian world, we place prior densities on the parameters and use data to update those priors, obtaining posterior densities. The problem is that there are only a few cases in which we can calculate those posterior densities analytically; these are called conjugate families.
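A standard illustration of a conjugate family (a textbook example, not taken from this recipe) is the Beta-Binomial pair, where the posterior has the same functional form as the prior:

```latex
\theta \sim \mathrm{Beta}(a, b), \qquad
y \mid \theta \sim \mathrm{Binomial}(n, \theta)
\;\Longrightarrow\;
\theta \mid y \sim \mathrm{Beta}(a + y,\; b + n - y)
```

Because the posterior is again a Beta density, no numerical integration is needed; outside such conjugate pairs, this shortcut is unavailable, which is where MCMC comes in.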

The Bayesian problem can be formulated as recovering the conditional density of the parameters given the data. By Bayes' theorem, which tells us that we can invert a conditional probability by dividing the joint probability by the appropriate marginal density, this posterior equals the joint density of the parameters and the data divided by the marginal density of the data. This is the density that we want to compute, but even if we had its unnormalized form, we would still need very complex calculations to properly marginalize it for each parameter. The idea behind MCMC is to build random values following a Markov chain (a sequence of values) whose distribution converges to that posterior distribution. In essence, what we are doing here is just generating random values according to the posterior density.
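Written out in generic notation (with θ the parameters and y the data; the symbols are ours, chosen for illustration), the relationship described above is:

```latex
\pi(\theta \mid y) \;=\; \frac{f(y \mid \theta)\,\pi(\theta)}{m(y)},
\qquad
m(y) \;=\; \int f(y \mid \theta)\,\pi(\theta)\,d\theta
```

The marginal density m(y) is the integral that is usually intractable; MCMC sidesteps it because, as we will see, the algorithm only ever needs ratios of π, in which m(y) cancels.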

There are several MCMC algorithms, but the simplest ones are known as Metropolis-Hastings (MH) algorithms. The intention in MH is to generate a chain of correlated random values that eventually converges to the stationary distribution of our target (in a regression model, this is the joint posterior density of all the parameters).

In pseudocode, the algorithm works in the following way, where π is the posterior density (the product of the prior and the data density):

  1. Step 0: generate random starting values for each parameter.

  2. Propose new values for each parameter conditional on the current values, drawing from a symmetric proposal distribution (for example, a Gaussian distribution centered at the current values).

  3. Compute the acceptance ratio p = min(1, π(proposed) / π(current)).

  4. Accept the values proposed in step 2 with probability p. That means that we draw another random number, and if it is smaller than p, we accept the proposed move. Note that whenever the density at the proposed values is larger than at the current ones, we always accept the proposed values (probability = 1). On the other hand, when the density at the proposed values is smaller, we accept them only sometimes.

  5. Keep repeating steps 2-4 until we have the necessary amount of random values.
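The steps above can be sketched in Python as follows. This is a minimal illustration, not the recipe's own implementation: the target (the posterior of a Normal mean with a wide Normal prior), the true parameter value, and all names are assumptions chosen for the example. Note that the posterior is evaluated on the log scale, so the acceptance ratio becomes a difference of log densities.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical synthetic data: y_i ~ Normal(mu=3, sigma=1); we want the
# posterior of mu under a Normal(0, 10) prior (both choices are assumptions).
data = rng.normal(loc=3.0, scale=1.0, size=100)

def log_posterior(mu):
    # log prior: Normal(0, 10) on mu (constants dropped; they cancel in ratios)
    log_prior = -0.5 * (mu / 10.0) ** 2
    # log likelihood: Normal(mu, 1) for each observation
    log_lik = -0.5 * np.sum((data - mu) ** 2)
    return log_prior + log_lik

def metropolis_hastings(n_iter=5000, proposal_sd=0.5):
    mu_current = 0.0                      # step 1: arbitrary starting value
    samples = np.empty(n_iter)
    for i in range(n_iter):
        # step 2: symmetric Gaussian proposal centered at the current value
        mu_proposed = rng.normal(mu_current, proposal_sd)
        # step 3: log of p = min(1, pi(proposed) / pi(current))
        log_p = log_posterior(mu_proposed) - log_posterior(mu_current)
        # step 4: accept with probability p (uniform draw on the log scale)
        if np.log(rng.uniform()) < log_p:
            mu_current = mu_proposed      # accept the proposed move
        samples[i] = mu_current           # otherwise keep the current value
    return samples

samples = metropolis_hastings()
burned = samples[1000:]                   # discard burn-in draws
print(burned.mean())                      # chain mean should sit near mu = 3
```

Working on the log scale is a standard trick: multiplying a hundred small likelihood terms underflows to zero in floating point, while their log-sum stays well behaved.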

In this example, we will generate a synthetic dataset for which we already know the parameters, with the intention of estimating them via an MCMC sampler coded entirely by ourselves. Since we will be using quite simple priors, the posterior estimates should not be far from the values used to generate the data.
