Quadratic method

The quadratic approximation, also known as the Laplace method or the normal approximation, consists of approximating the posterior, p(θ | y), with a Gaussian distribution, q(θ).

This method consists of two steps:

  1. Find the mode of the posterior distribution. This will be the mean of q(θ).
  2. Compute the Hessian matrix. From this, we can compute the standard deviation of q(θ).

The first step can be done numerically using optimization methods, that is, methods to find the maximum or minimum of a function; there are many off-the-shelf methods for this purpose. Because for a Gaussian the mode and the mean are equal, we can use the mode as the mean of the approximating distribution, q(θ). The second step is not that transparent. We can approximately compute the standard deviation of q(θ) by evaluating the curvature of the posterior at its mode/mean. This can be done by computing the square root of the inverse of the Hessian matrix. A Hessian matrix is the matrix of second derivatives of a function, and its inverse provides the covariance matrix. Using PyMC3, we can do the following:

with pm.Model() as normal_approximation:
    p = pm.Beta('p', 1., 1.)
    w = pm.Binomial('w', n=1, p=p, observed=data)
    mean_q = pm.find_MAP()
    std_q = ((1 / pm.find_hessian(mean_q, vars=[p]))**0.5)[0]

mean_q['p'], std_q
If you try using the pm.find_MAP function in PyMC3, you will get a warning message. Because of the curse of dimensionality, using the maximum a posteriori (MAP) estimate to represent the posterior, or even to initialize a sampling method, is generally not a good idea.
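To see what PyMC3 is doing for us, we can carry out the same two steps by hand with NumPy and SciPy. The following is a minimal sketch for the beta-binomial model with a uniform prior; the counts h = 6 heads and t = 3 tails are hypothetical example values, not data from the text:

```python
import numpy as np
from scipy.optimize import minimize_scalar

h, t = 6, 3  # hypothetical data: 6 heads, 3 tails

# Step 1: find the mode of the posterior by minimizing the negative
# log-posterior (a Beta(h+1, t+1) density, up to a constant).
neg_log_post = lambda p: -(h * np.log(p) + t * np.log(1 - p))
mode = minimize_scalar(neg_log_post, bounds=(1e-6, 1 - 1e-6),
                       method='bounded').x

# Step 2: the (1x1) Hessian of the negative log-posterior at the mode;
# the standard deviation is the square root of its inverse.
hessian = h / mode**2 + t / (1 - mode)**2
std = (1 / hessian) ** 0.5

print(mode, std)  # mode = h / (h + t) ≈ 0.667, std ≈ 0.157
```

For this model the mode is available in closed form, h / (h + t), so the numerical optimizer is overkill, but the same recipe applies when no analytical mode exists.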

Let's see how the quadratic approximation looks for the beta-binomial model:

# analytic calculation
x = np.linspace(0, 1, 100)
plt.plot(x, stats.beta.pdf(x, h + 1, t + 1),
         label='True posterior')

# quadratic approximation
plt.plot(x, stats.norm.pdf(x, mean_q['p'], std_q),
         label='Quadratic approximation')
plt.legend(loc=0, fontsize=13)

plt.title(f'heads = {h}, tails = {t}')
plt.xlabel('θ', fontsize=14)
plt.yticks([]);

Figure 8.3

Figure 8.3 shows that the quadratic approximation is not that bad, at least for this example. Strictly speaking, we can only apply the Laplace method to unbounded variables, that is, variables living in (-∞, ∞). This is because the Gaussian is an unbounded distribution, so if we use it to model a bounded distribution (such as the beta distribution), we will end up estimating a positive density where in fact the density should be zero (outside the [0, 1] interval for the beta distribution). Nevertheless, the Laplace method can still be used if we first transform the bounded variable to make it unbounded. For example, we usually use a HalfNormal to model the standard deviation precisely because it is restricted to the [0, ∞) interval; we can make a HalfNormal variable unbounded by taking its logarithm.
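The transform trick can be sketched with a toy example. Suppose the distribution we want to approximate is a standard HalfNormal over σ (standing in for a posterior over a standard deviation; this is an illustrative choice, not a model from the text). Working in x = log(σ), and remembering to include the Jacobian term, the variable becomes unbounded and the Laplace recipe applies directly:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Negative log-density of x = log(sigma) when sigma ~ HalfNormal(1):
# log p(sigma) + log|d sigma / dx| = -exp(2x)/2 + x + const,
# where the + x term is the log of the Jacobian, d sigma/dx = exp(x).
neg_log_post = lambda x: 0.5 * np.exp(2 * x) - x

x_mode = minimize_scalar(neg_log_post).x  # mode in log-space
hessian = 2 * np.exp(2 * x_mode)          # 2nd derivative of neg_log_post
x_std = (1 / hessian) ** 0.5

print(x_mode, x_std)  # x_mode ≈ 0 (i.e., sigma = 1), x_std ≈ 0.707
```

Note that in the original σ space the HalfNormal has its mode at the boundary σ = 0, where a Gaussian approximation would make no sense; after the log transform the mode sits comfortably in the interior.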

The Laplace method is limited, but it can work well for some models and can be used to obtain analytical expressions to approximate the posterior. It is also one of the building blocks of a more advanced method known as the Integrated Nested Laplace Approximation (INLA).

In the next section, we will discuss variational methods, which are somewhat similar to the Laplace approximation but are more flexible and powerful; some of them can be applied automatically to a wide range of models.
