Convergence

We can visualize the samples over time, along with their distributions, to check the quality of the results. The following charts show the posterior distributions after an initial 100 samples and after an additional 100,000 samples, respectively; they illustrate how convergence implies that multiple chains identify the same distribution. The pm.traceplot() function also shows the evolution of the samples (see the notebook for more information):

Posterior distributions

PyMC3 produces various summary statistics for a sampler. These are available as individual functions in the stats module, or by providing a trace to the pm.summary() function:

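|              | statsmodels |  mean |   sd | hpd_2.5 | hpd_97.5 |     n_eff | Rhat |
|--------------|-------------|-------|------|---------|----------|-----------|------|
| Intercept    |       -1.97 | -1.97 | 0.04 |   -2.04 |    -1.89 | 69,492.17 | 1.00 |
| sex[T. Male] |        1.20 |  1.20 | 0.04 |    1.12 |     1.28 | 72,374.10 | 1.00 |
| age          |        1.10 |  1.10 | 0.03 |    1.05 |     1.15 | 68,446.73 | 1.00 |
| I(age ** 2)  |       -0.54 | -0.54 | 0.02 |   -0.58 |    -0.50 | 66,539.66 | 1.00 |
| hours        |        0.32 |  0.32 | 0.02 |    0.28 |     0.35 | 93,008.86 | 1.00 |
| educ         |        0.84 |  0.84 | 0.02 |    0.80 |     0.87 | 98,125.26 | 1.00 |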

The preceding table includes the (separately computed) statsmodels logit coefficients in the first column to show that, in this simple case, both models agree: the posterior sample means are very close to the statsmodels coefficients.

Beyond the posterior mean and standard deviation, the remaining columns contain the bounds of the highest posterior density (HPD) interval, that is, the minimum-width credible interval (the Bayesian counterpart of a confidence interval), computed here at the 95% level. The n_eff statistic estimates the effective sample size: roughly, the number of independent draws that would carry the same information as the ~100K autocorrelated draws.
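To make the idea of a minimum-width credible interval concrete, the following is a minimal sketch that uses only the Python standard library. The function name hpd_interval and the synthetic normal draws (mean 1.20, standard deviation 0.04, chosen to resemble the sex[T. Male] row) are assumptions for illustration; they are not the PyMC3 implementation or the actual trace.

```python
import math
import random


def hpd_interval(samples, cred_mass=0.95):
    """Return the narrowest interval that contains `cred_mass` of the samples.

    Illustrative helper (hypothetical name), suitable for unimodal posteriors.
    """
    ordered = sorted(samples)
    n = len(ordered)
    # Number of draws that every candidate interval has to cover.
    n_inside = math.ceil(cred_mass * n)
    if n_inside < 2 or n_inside > n:
        raise ValueError("need more samples for this credible mass")
    best_lo, best_hi = ordered[0], ordered[n_inside - 1]
    # Slide a window of n_inside sorted draws and keep the narrowest one.
    for start in range(n - n_inside + 1):
        lo, hi = ordered[start], ordered[start + n_inside - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


# Synthetic stand-in for posterior draws (assumption, not the real trace).
rng = random.Random(42)
draws = [rng.gauss(1.20, 0.04) for _ in range(20_000)]
lo, hi = hpd_interval(draws, cred_mass=0.95)
print(f"95% HPD: [{lo:.2f}, {hi:.2f}]")
```

For a symmetric, unimodal posterior such as this one, the HPD interval nearly coincides with the equal-tailed interval; the two differ mainly when the posterior is skewed.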

R-hat, also known as the Gelman-Rubin statistic, checks convergence by comparing the variance between chains to the variance within each chain. If the sampler converged, these variances should be approximately equal, that is, the chains should look similar, so the statistic should be near 1. The pm.forestplot() function also summarizes this statistic for the multiple chains (see the notebook for more information).
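The comparison of between-chain and within-chain variance can be sketched in a few lines. The version below implements the classic Gelman-Rubin formula with the standard library only; the function name gelman_rubin and the synthetic chains are assumptions for illustration, and library implementations may use refined variants (for example, split or rank-normalized chains) that give slightly different values.

```python
import math
import random
import statistics


def gelman_rubin(chains):
    """Classic R-hat for a list of equal-length chains (illustrative helper)."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [statistics.mean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.mean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


rng = random.Random(0)

# Four chains drawn from the same distribution: R-hat should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
rhat_mixed = gelman_rubin(mixed)

# Chains stuck around different values: R-hat should be well above 1.
stuck = [[rng.gauss(mu, 1.0) for _ in range(2000)] for mu in (0.0, 0.0, 3.0, 3.0)]
rhat_stuck = gelman_rubin(stuck)

print(f"well-mixed chains: {rhat_mixed:.3f}")
print(f"stuck chains:      {rhat_stuck:.3f}")
```

When the chains disagree about the location of the posterior, the between-chain term inflates the pooled variance relative to the within-chain variance, which pushes the statistic above 1.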

For high-dimensional models with many variables, it becomes cumbersome to inspect numerous traces. When using NUTS, the energy plot helps to assess convergence problems. It summarizes how efficiently the random process explores the posterior. The plot shows the marginal energy distribution and the energy transition distribution, which should be well matched, as in the following example (see references for conceptual detail):
