Summary

Posterior predictive checks are a general concept and practice that can help us understand how well a model captures the data and the aspects of a problem we are interested in. We can perform posterior predictive checks with just one model or with many models, and thus we can use them as a method for model comparison. Posterior predictive checks are generally done via visualizations, but numerical summaries like Bayesian p-values can also be helpful.

Good models strike a balance between complexity and predictive accuracy. We illustrated this with the classical example of polynomial regression. We discussed two methods to estimate the out-of-sample accuracy without leaving data aside: cross-validation and information criteria, focusing our discussion on the latter. From a practical point of view, information criteria are a family of methods that balance two contributions: one measuring how well a model fits the data and another penalizing model complexity. Of the many information criteria available, WAIC is the most useful for Bayesian models. Another useful measure, which in practice provides very similar results to WAIC, is PSIS-LOO-CV (or LOO for short). This method approximates leave-one-out cross-validation without the high computational cost of actually refitting the model several times. WAIC and LOO can be used for model selection and can also be helpful for model averaging. Instead of selecting a single best model, model averaging combines all available models by taking a weighted average of them.
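The complexity/accuracy trade-off mentioned above can be sketched in a few lines of NumPy. The data-generating function, sample sizes, and polynomial degrees below are made up for illustration: higher-degree polynomials always fit the training data at least as well, but that in-sample improvement does not guarantee better out-of-sample accuracy.

```python
# A minimal sketch of the complexity vs. predictive-accuracy trade-off,
# using least-squares polynomial fits (the data and degrees are hypothetical).
import numpy as np

rng = np.random.default_rng(42)
true_f = lambda x: 0.5 * x**2 - x + 1          # the "unknown" data-generating function
x_train = np.linspace(-3, 3, 20)
x_test = np.linspace(-3, 3, 200)
y_train = true_f(x_train) + rng.normal(0, 1, x_train.size)
y_test = true_f(x_test) + rng.normal(0, 1, x_test.size)

mse_in, mse_out = {}, {}
for degree in (1, 2, 9):
    coefs = np.polyfit(x_train, y_train, degree)
    mse_in[degree] = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    mse_out[degree] = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    print(f"degree {degree}: in-sample MSE {mse_in[degree]:.2f}, "
          f"out-of-sample MSE {mse_out[degree]:.2f}")
```

Because the polynomial bases are nested, the in-sample error can only decrease as the degree grows; information criteria counteract exactly this by adding a complexity penalty.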

A different approach to model selection, comparison, and model averaging is Bayes factors, which are the ratio of the marginal likelihoods of two models. Computing Bayes factors can be really challenging. In this chapter, we showed two routes to compute them with PyMC3: a hierarchical model where we directly try to estimate the relative probability of each model using a discrete index, and a sampling method known as Sequential Monte Carlo (SMC). We suggested using the latter.
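For conjugate models the marginal likelihood, and hence the Bayes factor, is available in closed form, which gives some intuition for what the PyMC3 machinery is estimating. The data and priors in this toy sketch are hypothetical and it is not the chapter's PyMC3 code:

```python
# Toy sketch (not the chapter's PyMC3 code): for a binomial likelihood with a
# Beta(a, b) prior, the marginal likelihood has a closed form,
#   p(y) = C(n, k) * B(k + a, n - k + b) / B(a, b),
# so a Bayes factor is just a ratio of two such quantities.
from math import comb, lgamma, log, exp

def log_beta(a, b):
    # log of the Beta function, via log-gamma for numerical stability
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal(k, n, a, b):
    # log marginal likelihood of k successes in n trials under a Beta(a, b) prior
    return log(comb(n, k)) + log_beta(k + a, n - k + b) - log_beta(a, b)

k, n = 3, 30                                   # hypothetical data: 3 heads in 30 tosses
# model 0: vague Beta(1, 1) prior; model 1: Beta(30, 30), concentrated at 0.5
bf_01 = exp(log_marginal(k, n, 1, 1) - log_marginal(k, n, 30, 30))
print(f"BF01 = {bf_01:.1f}")                   # large, since the data sit far from 0.5
```

The result also illustrates the prior sensitivity discussed next: changing only the priors, with the data fixed, can swing the Bayes factor by orders of magnitude.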

Besides being computationally challenging to compute, Bayes factors are problematic to use, given that they are very sensitive to the prior specification. We also compared Bayes factors and information criteria, and walked you through an example in which they answer two related but different questions: one puts the focus on identifying the right model, the other on obtaining the best predictions or the lowest generalization loss. None of these methods are free of problems, but WAIC and LOO are much more robust in practice.

We briefly discussed how priors relate to overfitting, bias, and regularization, all topics that bear on the important task of building models with good generalization properties.

Finally, we closed this book with a more in-depth discussion of WAIC, including commentary on the interrelated concepts of entropy, the maximum entropy principle, and the Kullback-Leibler (KL) divergence.
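The relationship between these concepts can be made concrete for discrete distributions. This minimal NumPy sketch (the distributions are made up) shows that the uniform distribution attains the maximum entropy, and that the KL divergence is non-negative and zero only when the two distributions coincide:

```python
# Entropy and KL divergence for discrete distributions (illustrative values).
import numpy as np

def entropy(p):
    # Shannon entropy H(p) = -sum p * log(p)
    p = np.asarray(p)
    return -np.sum(p * np.log(p))

def kl_divergence(p, q):
    # KL(p || q) = sum p * log(p / q); non-negative, zero only when p == q
    p, q = np.asarray(p), np.asarray(q)
    return np.sum(p * np.log(p / q))

uniform = np.full(4, 0.25)                 # the maximum-entropy distribution on 4 outcomes
skewed = np.array([0.7, 0.1, 0.1, 0.1])

print(entropy(uniform))                    # log(4) ≈ 1.386, the maximum possible
print(entropy(skewed))                     # strictly smaller than log(4)
print(kl_divergence(skewed, uniform))      # positive, since the distributions differ
```

Note that KL(p || q) = H(p, q) - H(p), so minimizing the KL divergence to the true distribution is equivalent to minimizing the cross-entropy, which is the connection WAIC exploits.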
