Markov chain Monte Carlo sampling

A Markov chain is a dynamic stochastic model that describes a random walk over a set of states connected by transition probabilities. The Markov property stipulates that the process has no memory and that the next step depends only on the current state. In other words, conditional on the present, the past and the future are independent: information about past states does not help to predict the future beyond what we already know from the present.

Monte Carlo methods rely on repeated random sampling to approximate results that may be deterministic but do not permit an exact analytic solution. The method was developed during the Manhattan Project to estimate energy at the atomic level and received its enduring code name to ensure secrecy.

Many algorithms apply the Monte Carlo method to a Markov chain and generally proceed as follows (a minimal code sketch follows the list):

  1. Start at the current position.
  2. Draw a new position from a proposal distribution.
  3. Evaluate the probability of the new position in light of data and prior distributions:
    1. If sufficiently likely, move to the new position.
    2. Otherwise, remain at the current position.
  4. Repeat from step 1.
  5. After a given number of iterations, return all accepted positions.
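
As a concrete illustration of these steps, the following minimal sketch implements random-walk Metropolis-Hastings in Python (this example is not from the source text; the target `log_posterior`, the Gaussian proposal scale, and the standard normal example are illustrative assumptions):

```python
import numpy as np

def metropolis_hastings(log_posterior, start, n_steps=10_000, scale=1.0, seed=42):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = np.random.default_rng(seed)
    samples = np.empty(n_steps)
    current, current_logp = start, log_posterior(start)
    accepted = 0
    for i in range(n_steps):
        # Draw a new position from the proposal distribution (steps 1-2)
        proposal = current + rng.normal(scale=scale)
        proposal_logp = log_posterior(proposal)
        # Accept with probability min(1, p(proposal)/p(current)) (step 3);
        # the symmetric proposal density cancels in the Hastings ratio
        if np.log(rng.uniform()) < proposal_logp - current_logp:
            current, current_logp = proposal, proposal_logp
            accepted += 1
        # Otherwise remain at the current position; record it either way
        samples[i] = current
    return samples, accepted / n_steps

# Illustration: sample a standard normal 'posterior' (log density up to a constant),
# starting far from the mode so the burn-in phase is visible
samples, acceptance_rate = metropolis_hastings(lambda x: -0.5 * x**2, start=10.0)
```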

MCMC aims to identify and explore regions of the posterior that concentrate significant probability density. The memoryless process is said to converge when it consistently moves through nearby high-probability states of the posterior, where the acceptance rate is high. A key challenge is to balance the need for random exploration of the sample space against the risk of lowering the acceptance rate.

The initial steps of this process are likely to reflect the starting position more than the posterior and are typically discarded as burn-in samples. A key MCMC property is that the process should forget its initial position after a certain (but unknown) number of iterations.
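
Continuing the sketch above (and assuming its `samples` array), discarding burn-in is a simple slice; the cutoff below is an illustrative choice that would, in practice, be guided by convergence diagnostics:

```python
burn_in = 1_000            # illustrative cutoff, not a universal rule
trace = samples[burn_in:]  # keep draws made after the chain has 'forgotten' its start
```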

The remaining samples are called the trace of the process. Assuming convergence, the relative frequency of samples approximates the posterior and can be used to compute expected values based on the law of large numbers.
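
Under that assumption, expected values reduce to sample averages over the trace, for example:

```python
posterior_mean = trace.mean()                        # estimate of E[x | data]
prob_positive = (trace > 0).mean()                   # P(x > 0 | data) as a relative frequency
ci_low, ci_high = np.percentile(trace, [2.5, 97.5])  # central 95% credible interval
```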

As indicated previously, the precision of the estimate depends on the serial correlation of the samples collected by the random walk, each of which, by design, depends only on the previous state. Higher correlation limits the effective exploration of the posterior and should be checked with diagnostic tests.
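
One simple diagnostic (again assuming the `trace` array from the sketch above) is to inspect the sample autocorrelation at increasing lags; values that decay slowly indicate poor mixing and a small effective sample size:

```python
def autocorr(x, lag):
    """Sample autocorrelation of a 1-D trace at the given lag."""
    x = x - x.mean()
    return (x[:-lag] * x[lag:]).sum() / (x * x).sum()

for lag in (1, 5, 10, 50):
    print(f"lag {lag:>2}: {autocorr(trace, lag):+.3f}")
```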

General techniques to design such a Markov chain include Gibbs sampling, the Metropolis-Hastings algorithm, and more recent Hamiltonian Monte Carlo (HMC) methods, which tend to perform better.
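
To make the contrast with Metropolis-Hastings concrete, here is a minimal Gibbs sampling sketch (illustrative, not from the source) for a bivariate normal with correlation `rho`, where each full conditional is a known univariate normal and every draw is accepted:

```python
def gibbs_bivariate_normal(rho=0.8, n_steps=10_000, seed=42):
    """Gibbs sampling: alternate exact draws from each full conditional."""
    rng = np.random.default_rng(seed)
    x = y = 0.0
    out = np.empty((n_steps, 2))
    sd = np.sqrt(1 - rho**2)
    for i in range(n_steps):
        x = rng.normal(rho * y, sd)  # x | y ~ N(rho*y, 1 - rho^2)
        y = rng.normal(rho * x, sd)  # y | x ~ N(rho*x, 1 - rho^2)
        out[i] = x, y
    return out
```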
