Linear models and high autocorrelation

Linear models lead to posterior distributions in which α and β are highly correlated. See the following code and Figure 3.4 for an example:

import arviz as az

az.plot_pair(trace_g, var_names=['α', 'β'], plot_kwargs={'alpha': 0.1})
Figure 3.4
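
For context, trace_g is the posterior sample from a simple linear model. The following is a minimal sketch of such a model in PyMC3; the priors and the observed arrays x and y are illustrative assumptions, not necessarily the exact ones used to produce Figure 3.4:

import pymc3 as pm

with pm.Model() as model_g:
    α = pm.Normal('α', mu=0, sd=10)    # intercept
    β = pm.Normal('β', mu=0, sd=1)     # slope
    ε = pm.HalfCauchy('ε', 5)          # noise scale
    y_pred = pm.Normal('y_pred', mu=α + β * x, sd=ε, observed=y)
    trace_g = pm.sample(2000)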

The correlation we are seeing in Figure 3.4 is a direct consequence of our assumptions. No matter which line we fit to our data, all of them should pass through one point: the mean of the x variable and the mean of the y variable. Hence, the line-fitting process is somehow equivalent to spinning a straight line fixed at the center of the data, like a wheel of fortune. An increase in the slope means a decrease in the intercept, and vice versa. Both parameters are correlated by the very definition of the model. Hence, the shape of the posterior (excluding ε) is a very narrow, diagonal space. This can be problematic for samplers such as Metropolis-Hastings and, to a lesser extent, NUTS. For details of why this is true, see Chapter 8, Inference Engines.
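
We can also quantify this correlation directly from the posterior samples. A minimal sketch, assuming trace_g is a PyMC3 trace that can be indexed by variable name:

import numpy as np

# Pearson correlation between the posterior samples of α and β;
# this is strongly negative when the mean of x is positive
print(np.corrcoef(trace_g['α'], trace_g['β'])[0, 1])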

Before continuing, and for the sake of accuracy, let me clarify something. The fact that the line is constrained to pass through the mean of the data is strictly true only for the least squares method (under its assumptions). Using Bayesian methods, this constraint is relaxed. Later in the examples, we will see that, in general, we get lines around the mean values of x and y, and not exactly through their means. Moreover, if we use strong priors, we could end up with lines far away from the means of x and y. Nevertheless, the idea that the autocorrelation is related to the line pivoting around a more or less well-defined point remains true, and that is all we need to understand about the correlation of the α and β parameters.
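
To see the least squares constraint concretely, the following sketch (using synthetic, purely illustrative data) checks that the ordinary least squares line passes exactly through the point (x̄, ȳ), that is, that the fitted intercept equals ȳ - β̂x̄:

import numpy as np

np.random.seed(0)
x = np.random.normal(10, 1, 100)        # synthetic data, illustrative only
y = 2 * x + np.random.normal(0, 1, 100)

β_hat, α_hat = np.polyfit(x, y, deg=1)  # least squares slope and intercept
# The fitted line passes exactly through (x̄, ȳ)
print(np.isclose(α_hat, y.mean() - β_hat * x.mean()))  # prints True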
