Summary

Although Bayesian statistics is conceptually simple, fully probabilistic models often lead to analytically intractable expressions. For many years, this was a huge barrier, hindering the wide adoption of Bayesian methods. Fortunately, maths, statistics, physics, and computer science came to the rescue in the form of numerical methods that are capable—at least in principle—of solving any inference problem. The possibility of automating the inference process has led to the development of probabilistic programming languages, allowing for a clear separation between model definition and inference.

PyMC3 is a Python library for probabilistic programming with a simple, intuitive, and easy-to-read syntax that is very close to the statistical notation used to describe probabilistic models. We introduced the PyMC3 library by revisiting the coin-flip model from Chapter 1, Thinking Probabilistically, this time without analytically deriving the posterior. PyMC3 models are defined inside a context manager. To add a probability distribution to a model, we just need to write a single line of code. Distributions can be combined and can be used as priors (unobserved variables) or likelihoods (observed variables); if we pass data to a distribution through the observed argument, it becomes a likelihood. Sampling can be achieved with a single line as well: PyMC3 lets us draw samples from the posterior distribution. If everything goes right, these samples will be representative of the correct posterior distribution and thus a representation of the logical consequences of our model and data.
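To make this workflow concrete, here is a minimal sketch of a coin-flip model in PyMC3; the variable names and the synthetic data are illustrative, not the exact code from the chapter:

```python
import numpy as np
import pymc3 as pm

# Illustrative coin-flip data: 1 = heads, 0 = tails
np.random.seed(123)
data = np.random.binomial(n=1, p=0.35, size=40)

with pm.Model() as coin_model:
    # Prior (unobserved variable)
    theta = pm.Beta('theta', alpha=1., beta=1.)
    # Passing data through `observed` turns the distribution into a likelihood
    y = pm.Bernoulli('y', p=theta, observed=data)
    # A single line to sample from the posterior
    trace = pm.sample(1000)
```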

We can explore the posterior generated by PyMC3 using ArviZ, a Python library that works hand-in-hand with PyMC3 and can be used, among other tasks, to help us interpret and visualize posterior distributions. One way of using a posterior to make inference-driven decisions is to compare a ROPE against the HPD interval. We also briefly mentioned loss functions, a formal way to quantify the trade-offs and costs associated with making decisions in the presence of uncertainty, and we learned that loss functions and point estimates are intimately related.
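As a sketch of that workflow, the following continues from the `trace` obtained above; the ROPE limits (0.45 to 0.55) are chosen purely for illustration:

```python
import arviz as az

# Numerical summary of the posterior sampled above
print(az.summary(trace))

# Compare an illustrative ROPE (0.45-0.55) against the HPD of theta
az.plot_posterior(trace, rope=[0.45, 0.55])
```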

Up to this point, the discussion was restricted to a simple one-parameter model. Generalizing to an arbitrary number of parameters is trivial with PyMC3; we exemplified how to do this with the Gaussian and Student's t models. The Gaussian distribution is a special case of the Student's t-distribution, and we showed you how to use the latter to perform robust inferences in the presence of outliers. In the next chapter, we will look at how these models can be used as part of linear regression models.
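The two models might look as follows; the synthetic data, prior bounds, and prior choices here are illustrative rather than the chapter's exact ones:

```python
import numpy as np
import pymc3 as pm

# Illustrative data: roughly Gaussian values plus a few outliers
np.random.seed(42)
obs = np.concatenate([np.random.normal(53, 3, size=47), [70., 74., 79.]])

with pm.Model() as model_g:
    mu = pm.Uniform('mu', lower=40, upper=80)
    sigma = pm.HalfNormal('sigma', sd=10)
    y = pm.Normal('y', mu=mu, sd=sigma, observed=obs)
    trace_g = pm.sample(1000)

with pm.Model() as model_t:
    mu = pm.Uniform('mu', lower=40, upper=80)
    sigma = pm.HalfNormal('sigma', sd=10)
    nu = pm.Exponential('nu', 1/30)  # degrees of freedom; small nu means heavy tails
    y = pm.StudentT('y', mu=mu, sd=sigma, nu=nu, observed=obs)
    trace_t = pm.sample(1000)
```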

We used Gaussian models to perform a common data analysis task: comparing groups. While this is sometimes framed in the context of hypothesis testing, we took another route and framed it as a problem of inferring the effect size, an approach we generally consider richer and more productive. We also explored different ways to interpret and report effect sizes.
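One way to compute such effect sizes directly from posterior samples is sketched below, assuming two hypothetical traces, `trace_a` and `trace_b`, from Gaussian group models like the one above:

```python
import numpy as np
from scipy import stats

# trace_a and trace_b are assumed to hold posterior samples of 'mu' and 'sigma'
# for two groups fitted with Gaussian models
mu_diff = trace_a['mu'] - trace_b['mu']
pooled_sd = np.sqrt((trace_a['sigma']**2 + trace_b['sigma']**2) / 2)

# Cohen's d: the difference of means in units of the pooled standard deviation
cohen_d = (mu_diff / pooled_sd).mean()

# Probability of superiority: chance that a value drawn from one group
# exceeds a value drawn from the other
prob_superiority = stats.norm.cdf(cohen_d / np.sqrt(2))
```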

We saved for last, as we usually do with a fine dessert, one of the most important concepts in this book: hierarchical models. We can build hierarchical models every time we can identify subgroups in our data. In such cases, instead of treating the subgroups as separate entities or ignoring the subgroups and treating them as a single group, we can build a model that partially pools information among groups. The main effect of this partial pooling is that the estimate for each subgroup is pulled toward the estimates of the rest of the subgroups. This effect is known as shrinkage, and it is generally a very useful trick that improves inferences by making them more conservative (as each subgroup informs the others, pulling estimates toward the shared mean) and more informative: we get estimates at both the subgroup level and the group level. We will see more examples of hierarchical models in the following chapters, each one helping us understand them from a slightly different perspective.
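As a reminder of the structure, here is a sketch of one common pattern, a hierarchical model for several subgroups of binary outcomes; the subgroup counts and hyperprior choices are illustrative:

```python
import numpy as np
import pymc3 as pm

# Three hypothetical subgroups: number of trials and successes per subgroup
N_samples = np.array([30, 30, 30])
G_samples = np.array([18, 3, 3])

group_idx = np.repeat(np.arange(len(N_samples)), N_samples)
data = np.hstack([np.repeat([1, 0], [g, n - g])
                  for g, n in zip(G_samples, N_samples)])

with pm.Model() as model_h:
    # Hyperpriors, shared by every subgroup
    mu = pm.Beta('mu', 1., 1.)
    kappa = pm.HalfNormal('kappa', 10)
    # One theta per subgroup; the shared hyperpriors induce partial pooling
    theta = pm.Beta('theta', alpha=mu * kappa, beta=(1.0 - mu) * kappa,
                    shape=len(N_samples))
    y = pm.Bernoulli('y', p=theta[group_idx], observed=data)
    trace_h = pm.sample(1000)
```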
