Non-centered parameterization

I have presented the non-centered model as a magic trick that solves the sampling problem. Let's take a moment to uncover the trick and remove the magic.

From Figure 8.13, we can see that the a and b parameters are correlated. Since b is a vector of shape 10, we have chosen to plot b(0), but any other element of b should show the same pattern; in fact, this is made crystal clear in Figure 8.14. The correlation, and this particular funnel shape, are a consequence of the model definition and the ability of the model to partially pool data. As the values of a decrease, the individual values of b become closer to each other and closer to the global mean value. In other words, the shrinkage level gets higher and higher, and thus the data gets more and more pooled (up to the point of being completely pooled). The same structure that allows for partial pooling also introduces correlations that hurt the performance of sampling methods.
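To make the funnel concrete, here is a minimal sketch that draws from a hierarchical prior with NumPy alone. The specific distributions (a as a half-normal scale with standard deviation 10, and the elements of b as normals centered at 0 with standard deviation a) are assumptions made for illustration, since the model definition is not repeated in this section. The sketch compares the spread of the first element of b in the narrow part of the funnel (small a) and in the wide part (large a).

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 20_000

# Assumed hierarchical prior (illustrative, not taken from the text):
#   a    ~ HalfNormal(10)
#   b[i] ~ Normal(0, a)   for i = 0..9
a = np.abs(rng.normal(0.0, 10.0, size=n_draws))
b = rng.normal(0.0, a[:, None], size=(n_draws, 10))

# Select the draws in the neck (smallest 10% of a) and
# in the mouth (largest 10% of a) of the funnel.
small_a = a < np.quantile(a, 0.1)
large_a = a > np.quantile(a, 0.9)

# Spread of the first element of b in each region.
spread_small = b[small_a, 0].std()
spread_large = b[large_a, 0].std()

print(f"std of b[0] when a is small: {spread_small:.2f}")
print(f"std of b[0] when a is large: {spread_large:.2f}")
```

The spread of b is much smaller when a is small, which is the shrinkage described above: the scale of b depends on where we are along a, and a scatter plot of a against any element of b would show the funnel.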

In Chapter 3, Modeling with Linear Regression, we saw that linear models also lead to correlation (of a different nature); for those models, an easy fix is to center the data. We may be tempted to do the same here, but unfortunately that will not help us get rid of the sampling issues introduced by the funnel shape. The tricky feature of the funnel shape is that the correlations vary with the position in parameter space, so centering the data will not reduce that kind of correlation. As we saw, MCMC methods such as Metropolis-Hastings have problems exploring highly correlated spaces; the only way these methods can obtain valid samples is to propose each new step in the close neighborhood of the previous one. As a result, the exploration becomes highly autocorrelated and painfully slow. The slow mixing can be so dramatic that simply increasing the number of samples (draws) is not a reasonable or practical solution. Samplers such as NUTS are better at this job, as they propose steps based on the curvature of the parameter space, but as we already discussed, the efficiency of the sampling process is highly dependent on the tuning phase. For some posterior geometries, such as those induced by hierarchical models, the sampler becomes overly tuned to the local neighborhood where the chains started. This makes the exploration of other regions inefficient, because the new proposals are more random, resembling the behavior of Metropolis-Hastings.
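The effect of correlation on a random-walk sampler can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not the model from this chapter: the target is a zero-mean bivariate normal with unit variances (a stand-in for a correlated posterior, not a funnel), the step size of 1.0 is arbitrary and left untuned, and the helper names log_density, random_walk_metropolis, and lag1_autocorr are made up for this sketch. It runs the same sampler on an uncorrelated and on a highly correlated target and compares the lag-1 autocorrelation of the first coordinate.

```python
import numpy as np


def log_density(x, rho):
    """Unnormalized log density of a zero-mean bivariate normal
    with unit variances and correlation rho."""
    quad = x[0] ** 2 - 2.0 * rho * x[0] * x[1] + x[1] ** 2
    return -quad / (2.0 * (1.0 - rho ** 2))


def random_walk_metropolis(rho, n_steps=20_000, step=1.0, seed=0):
    """Random-walk Metropolis with an isotropic Gaussian proposal."""
    rng = np.random.default_rng(seed)
    x = np.zeros(2)
    logp = log_density(x, rho)
    chain = np.empty((n_steps, 2))
    for i in range(n_steps):
        proposal = x + rng.normal(0.0, step, size=2)
        logp_new = log_density(proposal, rho)
        # Accept with probability min(1, p(proposal) / p(current)).
        if np.log(rng.uniform()) < logp_new - logp:
            x, logp = proposal, logp_new
        chain[i] = x
    return chain


def lag1_autocorr(values):
    """Lag-1 autocorrelation of a one-dimensional chain."""
    centered = values - values.mean()
    return float(np.dot(centered[:-1], centered[1:]) / np.dot(centered, centered))


acf_uncorrelated = lag1_autocorr(random_walk_metropolis(rho=0.0)[:, 0])
acf_correlated = lag1_autocorr(random_walk_metropolis(rho=0.99)[:, 0])

print(f"lag-1 autocorrelation, rho = 0.00: {acf_uncorrelated:.3f}")
print(f"lag-1 autocorrelation, rho = 0.99: {acf_correlated:.3f}")
```

With the correlated target, most proposals fall outside the narrow ridge of high density and are rejected, so consecutive samples are nearly identical and the autocorrelation is closer to 1. In a funnel the situation is worse, because no single step size suits both the narrow and the wide regions.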
