Akaike information criterion

This is a very well-known and widely used information criterion, especially for non-Bayesians, and is defined as follows:

$$\text{AIC} = -2 \log p(y \mid \hat{\theta}_{mle}) + 2k$$

Here, $k$ is just the number of parameters and $\hat{\theta}_{mle}$ is the maximum likelihood estimate of $\theta$. Maximum likelihood estimation is a common practice for non-Bayesians and, in general, is equivalent to the Bayesian maximum a posteriori (MAP) estimation when using flat priors. Notice that $\hat{\theta}_{mle}$ is a point estimate and not a distribution.
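As a concrete illustration, the formula translates directly into code. This is a minimal sketch, assuming NumPy and SciPy; the simulated dataset and the two-parameter Gaussian model are choices made just for the example:

```python
import numpy as np
from scipy import stats

# Simulated data; in practice this would be the observed data
rng = np.random.default_rng(42)
y = rng.normal(loc=2.0, scale=1.5, size=100)

# Maximum likelihood estimates for a Gaussian: sample mean and std
mu_mle = y.mean()
sigma_mle = y.std()  # NumPy's default ddof=0 matches the MLE

# Maximized log-likelihood evaluated at the point estimates
log_lik = stats.norm.logpdf(y, loc=mu_mle, scale=sigma_mle).sum()

# AIC = -2 log p(y | theta_mle) + 2k, with k the number of parameters (here 2)
k = 2
aic = -2 * log_lik + 2 * k
print(f"AIC: {aic:.2f}")
```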

Once again, the -2 is there for historical reasons. The important observation, from a practical point of view, is that the first term takes into account how well the model fits the data and the second term penalizes complex models. Hence, if two models explain the data equally well, but one has more parameters than the other, AIC tells us that we should choose the one with fewer parameters.
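To illustrate the penalty term, here is a toy comparison; it is a sketch, not anything prescribed by the text, and it assumes a Gaussian error model (the data, the gaussian_aic helper, and the use of np.polyfit are all illustrative choices). A linear and a quadratic polynomial are fit to data that is truly linear; the extra parameter barely improves the fit, so AIC typically favors the simpler model:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=x.size)  # truly linear data

def gaussian_aic(y, y_hat, n_coeffs):
    """AIC for a least-squares fit, counting the error scale as one more parameter."""
    resid = y - y_hat
    sigma_mle = resid.std()                    # MLE of the error scale (ddof=0)
    log_lik = stats.norm.logpdf(resid, scale=sigma_mle).sum()
    return -2 * log_lik + 2 * (n_coeffs + 1)   # +1 accounts for sigma

for degree in (1, 2):
    coeffs = np.polyfit(x, y, degree)          # least squares == Gaussian MLE
    y_hat = np.polyval(coeffs, x)
    print(f"degree {degree}: AIC = {gaussian_aic(y, y_hat, degree + 1):.2f}")
```

Both fits explain the data about equally well, but the quadratic pays a penalty of 2 for its extra coefficient, which the negligible gain in log-likelihood usually fails to offset.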

AIC works well for non-Bayesian approaches, but is problematic otherwise. One reason is that it does not use the posterior, and hence it discards information about the uncertainty in the estimation; it also assumes flat priors, so this measure is incompatible with informative and weakly informative priors, like those used in this book.
