Divergences

We will now explore tests that are exclusive to NUTS, as they are based on the inner workings of the method rather than on properties of the generated samples. These tests are based on divergences, and they are a powerful and sensitive way of diagnosing samples.

While I have tried to set up the models in this book to avoid divergences, you may have seen PyMC3 messages indicating that a divergence had occurred. Divergences may indicate that NUTS has encountered regions of high curvature in the posterior that it cannot explore properly; the sampler is telling us that it could be missing a region of the parameter space, and thus our results will be biased. Divergences are generally much more sensitive than the tests discussed so far, and thus they can signal problems even when the rest of the tests pass. A nice feature of divergences is that they tend to appear close to the problematic regions of the parameter space, so we can use them to identify where the problem may be. One way to visualize divergences is with az.plot_pair, using the divergences=True argument:

import arviz as az
import matplotlib.pyplot as plt

_, ax = plt.subplots(1, 2, sharey=True, figsize=(10, 5), constrained_layout=True)

# Plot regular samples and divergences for the centered and non-centered traces
for idx, tr in enumerate([trace_cm, trace_ncm]):
    az.plot_pair(tr, var_names=['b', 'a'], coords={'b_dim_0': [0]}, kind='scatter',
                 divergences=True, contour=False, divergences_kwargs={'color': 'C1'},
                 ax=ax[idx])
    ax[idx].set_title(['centered', 'non-centered'][idx])

Figure 8.13

In Figure 8.13, the small (blue) dots are regular samples and the larger (black and orange) dots represent divergences. We can see that the divergences of the centered model are mostly concentrated at the tip of the funnel, while the non-centered model has no divergences and a sharper tip. The sampler, through the divergences, is telling us that it is having a hard time sampling from the region close to the tip of the funnel. Indeed, we can check in Figure 8.13 that the centered model has no samples around the tip, close to where the divergences are concentrated. How neat is that?!

Divergences are also represented in ArviZ's trace plots as black "|" markers; see Figure 8.7 for an example. Notice how the divergences are concentrated around the pathological flat portions of the trace.
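As a minimal sketch, assuming the trace_cm trace from the preceding example, these markers can be requested explicitly through the divergences argument of az.plot_trace:

# Trace plot with divergences drawn as "|" markers along the bottom of each panel
az.plot_trace(trace_cm, var_names=['a', 'b'], divergences='bottom')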

Another useful way to visualize divergences is with a parallel plot:

az.plot_parallel(trace_cm)

Figure 8.14

Here, we can see that divergences are concentrated around 0 for both b and a. Plots such as those in Figures 8.13 and 8.14 are super important, because they let us know which part of the parameter space could be problematic, and also because they help us to spot false positives. Let me explain this last point. PyMC3 uses a heuristic to label divergences; sometimes, this heuristic can report a divergence when there is none. As a general rule, if divergences are scattered across the parameter space, we probably have false positives; if divergences are concentrated, then we likely have a problem. When you get divergences, there are three general ways to get rid of them, or at least to reduce their number:

  • Increase the number of tuning steps, something like pm.sample(tune=1000).
  • Increase the value of the target_accept parameter from its default value of 0.8. The maximum value is 1, so you can try values such as 0.85 or 0.9.
  • Re-parameterize the model. As we just saw, the non-centered model is a re-parameterization of the centered model that leads to better samples and an absence of divergences (see the sketch after this list).
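
As a minimal sketch of these options, assuming the centered model from earlier in the chapter (the shape value and variable names here are placeholders, and note that older PyMC3 releases take target_accept through nuts_kwargs rather than directly):

import pymc3 as pm

# Options 1 and 2: more tuning steps and a higher target acceptance rate
with centered_model:
    trace = pm.sample(tune=1000, target_accept=0.9)

# Option 3: non-centered re-parameterization. Instead of sampling
# b ~ Normal(0, a) directly, sample a standard Normal offset and scale
# it by a; this removes the funnel-shaped dependence between a and b.
with pm.Model() as non_centered_model:
    a = pm.HalfNormal('a', 10)
    b_offset = pm.Normal('b_offset', mu=0, sd=1, shape=10)  # placeholder shape
    b = pm.Deterministic('b', b_offset * a)

The first two options make the sampler work harder on the same posterior geometry, while the re-parameterization changes the geometry itself, which is why it is often the most effective fix.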
