Who cares about coin flips? Well, virtually no one. However, (a) coin flips are a great, simple application for getting the hang of Bayesian analysis, and (b) the kinds of problems that a beta prior and a binomial likelihood function solve go way beyond assessing the fairness of coin flips. We are now going to apply the same technique to a real-life problem that I actually came across in my work.
For my job, I had to create a career recommendation system that asked the user a few questions about their preferences and spat out some careers they may be interested in. After a few hours, I had a working prototype. In order to justify putting more resources into improving the project, I had to prove that I was on to something and that my current recommendations performed better than chance.
In order to test this, we got 40 people together, asked them the questions, and presented them with two sets of recommendations. One was the true set of recommendations that I came up with, and one was a control set—the recommendations of a person who answered the questions randomly. If my set of recommendations performed better than chance would dictate, then I had a good thing going, and could justify spending more time on the project.
Simply performing better than chance is no great feat on its own—I also wanted really good estimates of how much better than chance my initial recommendations were.
For this problem, I broke out my Bayesian toolbox! The parameter of interest is the proportion of the time my recommendations performed better than chance. If .5 and lower were very unlikely values of the parameter, as far as the posterior depicted, then I could conclude that I was on to something.
Even though I had strong suspicions that my recommendations were good, I used a uniform beta prior to preemptively thwart criticisms that my prior biased the conclusions. As for the likelihood function, it is the same function family we used for the coin flips (just with different parameters).
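If you want to see just how non-committal this prior is, you can plot it the same way we plotted the coin-flip priors; a uniform beta prior is a Beta(1, 1), so the curve is completely flat (the axis labels below are just suggestions):

> # the uniform Beta(1, 1) prior: every value of θ is equally credible
> curve(dbeta(x, 1, 1),
+       xlab="θ", ylab="prior belief",
+       type="l", yaxt='n', ylim=c(0, 2))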
It turns out that 36 out of the 40 people preferred my recommendations to the random ones (three liked them both the same, and one weirdo liked the random ones better). The posterior distribution, therefore, was a beta distribution with parameters 37 and 5.
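Just as with the coin flips, those parameters come from adding the observed counts to the prior's parameters. Here is a quick sketch of that bookkeeping, counting the three ties and the lone vote for the random set as non-successes (the variable names are mine):

> # conjugate updating: with a uniform Beta(1, 1) prior, the posterior
> # is Beta(1 + successes, 1 + non-successes)
> successes     <- 36
> non.successes <- 4    # the three ties plus the one vote for random
> post.alpha    <- 1 + successes        # 37
> post.beta     <- 1 + non.successes    # 5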
> curve(dbeta(x, 37, 5),
+       xlab="θ",
+       ylab="posterior belief",
+       type="l", yaxt='n')
Again, the end result of the Bayesian analysis proper is the posterior distribution that illustrates credible values of the parameter. The decision to set an arbitrary threshold for concluding that my recommendations were effective or not is a separate matter.
Let's say that, before the fact, we stated that if .5 or lower were not among the 95% most credible values, we would conclude that my recommendations were effective. How do we know what the credible interval bounds are?
Even though it is relatively straightforward to determine the bounds of the credible interval analytically, doing so ourselves computationally will help us understand how the posterior distribution is summarized in the examples given later in this chapter.
To find the bounds, we will draw thousands of samples from a beta distribution with hyper-parameters 37 and 5, and find the quantiles at .025 and .975.
> samp <- rbeta(10000, 37, 5)
> quantile(samp, c(.025, .975))
     2.5%     97.5% 
0.7674591 0.9597010 
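Because R also exposes the beta quantile function directly, we can sanity-check the simulated bounds against the exact ones; the two should agree to about two decimal places:

> # exact quantiles of the Beta(37, 5) posterior, for comparison
> qbeta(c(.025, .975), 37, 5)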
Neat! With the previous plot already up, we can add lines to the plot indicating this 95% credible interval, like so:
> # horizontal line
> lines(c(.767, .96), c(0.1, 0.1))
> # tiny vertical left boundary
> lines(c(.767, .767), c(0.15, 0.05))
> # tiny vertical right boundary
> lines(c(.96, .96), c(0.15, 0.05))
If you plot this yourself, you'll see that even the lower bound is far from the decision boundary—it looks like my work was worth it after all!
The technique of sampling from a distribution many many times to obtain numerical results is known as Monte Carlo simulation.
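The same samples can answer other questions, too. For example, assuming samp from the earlier snippet is still in your workspace, the fraction of samples above the .5 decision boundary gives a Monte Carlo estimate of the posterior probability that my recommendations really do beat chance (for this posterior, it is essentially 1):

> # proportion of posterior samples above the decision boundary
> mean(samp > .5)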