INTRODUCTION

Virtually everything in life is, to some extent, uncertain. This may seem like a bit of an exaggeration, but to see the truth of it you can try a quick experiment. At the start of the day, write down something you think will happen in the next half-hour, hour, three hours, and six hours. Then see how many of these things happen exactly like you imagined. You’ll quickly realize that your day is full of uncertainties. Even something as predictable as “I will brush my teeth” or “I’ll have a cup of coffee” may not, for some reason or another, happen as you expect.

For most of the uncertainties in life, we’re able to get by quite well by planning our day. For example, even though traffic might make your morning commute longer than usual, you can make a pretty good estimate about what time you need to leave home in order to get to work on time. If you have a super-important morning meeting, you might leave earlier to allow for delays. We all have an innate sense of how to deal with uncertain situations and reason about uncertainty. When you think this way, you’re starting to think probabilistically.

Why Learn Statistics?

The subject of this book, Bayesian statistics, helps us get better at reasoning about uncertainty, just as studying logic in school helps us to see the errors in everyday logical thinking. Given that virtually everyone deals with uncertainty in their daily life, as we just discussed, this makes the audience for this book pretty wide. Data scientists and researchers already using statistics will benefit from a deeper understanding and intuition for how these tools work. Engineers and programmers will learn a lot about how they can better quantify decisions they have to make (I’ve even used Bayesian analysis to identify causes of software bugs!). Marketers and salespeople can apply the ideas in this book when running A/B tests, trying to understand their audience, and better assessing the value of opportunities. Anyone making high-level decisions should have at least a basic sense of probability so they can make quick back-of-the-envelope estimates about the costs and benefits of uncertain decisions. I wanted this book to be something a CEO could study on a flight and develop a solid enough foundation by the time they land to better assess choices that involve probabilities and uncertainty.

I honestly believe that everyone will benefit from thinking about problems in a Bayesian way. With Bayesian statistics, you can use mathematics to model uncertainty so you can make better choices given limited information. For example, suppose you need to be on time for work for a particularly important meeting and there are two different routes you could take. The first route is usually faster, but has pretty regular traffic back-ups that can cause huge delays. The second route takes longer in general but is less prone to traffic. Which route should you take? What type of information would you need to decide this? And how certain can you be in your choice? Even just a small amount of added complexity requires some extra thought and technique.
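To make the two-route question a little more concrete, here is a minimal sketch of how such a comparison might be set up. It is written in Python rather than the R used later in the book, and everything in it is an assumption made for illustration: the helper name `route_summary`, the idea that each route is either on time or delayed by a fixed amount, and every number (travel times, delay probabilities, and the 35-minute budget).

```python
# Hypothetical sketch: all travel times and probabilities are invented
# for illustration and do not come from the book.

def route_summary(usual_minutes, delay_minutes, p_delay, budget_minutes):
    """Return (expected travel time, probability of arriving within budget)
    for a route that is either on time or delayed by a fixed amount."""
    delayed_minutes = usual_minutes + delay_minutes
    # Average travel time, weighting each outcome by its probability.
    expected = (1 - p_delay) * usual_minutes + p_delay * delayed_minutes
    # Add up the probability of every outcome that fits within the budget.
    p_on_time = 0.0
    if usual_minutes <= budget_minutes:
        p_on_time += 1 - p_delay
    if delayed_minutes <= budget_minutes:
        p_on_time += p_delay
    return expected, p_on_time

# Route 1: usually faster, but with fairly regular, large delays (assumed values).
fast_but_risky = route_summary(usual_minutes=20, delay_minutes=40,
                               p_delay=0.2, budget_minutes=35)
# Route 2: slower in general, but rarely delayed (assumed values).
slow_but_steady = route_summary(usual_minutes=30, delay_minutes=10,
                                p_delay=0.05, budget_minutes=35)

print("fast but risky: ", fast_but_risky)
print("slow but steady:", slow_but_steady)
```

With these made-up inputs, the first route has the lower average travel time (about 28 minutes versus about 30.5), yet the second route gives the better chance of arriving within 35 minutes (0.95 versus 0.8). Which one is "best" depends on what you care about, which is exactly the kind of question the rest of the book gives you tools to answer.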

Typically when people think of statistics, they think of scientists working on a new drug, economists following trends in the market, analysts predicting the next election, baseball managers trying to build the best team with fancy math, and so on. While all of these are certainly fascinating uses of statistics, understanding the basics of Bayesian reasoning can help you in far more areas in everyday life. If you’ve ever questioned some new finding reported in the news, stayed up late browsing the web wondering if you have a rare disease, or argued with a relative over their irrational beliefs about the world, learning Bayesian statistics will help you reason better.

What Is “Bayesian” Statistics?

You may be wondering what all this “Bayesian” stuff is. If you’ve ever taken a statistics class, it was likely based on frequentist statistics. Frequentist statistics is founded on the idea that probability represents the frequency with which something happens. If the probability of getting heads in a single coin toss is 0.5, that means after a single coin toss we can expect to get one-half of a head of a coin (with two tosses we can expect to get one head, which makes more sense).
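The arithmetic behind the coin example can be sketched in a couple of lines. This is an illustrative Python snippet (the book itself uses R), and the function name `expected_heads` is introduced here only for this example. It assumes the tosses are independent and that each one comes up heads with the same probability.

```python
from fractions import Fraction

def expected_heads(n_tosses, p_heads=Fraction(1, 2)):
    """Expected number of heads in n_tosses tosses of the same coin
    (hypothetical helper, not from the book)."""
    return n_tosses * p_heads

one_toss = expected_heads(1)    # one-half of a head, on average
two_tosses = expected_heads(2)  # one head, on average
print(one_toss, two_tosses)
```

The "half a head" result is an average over many repetitions, not something a single toss can ever produce. That is why the two-toss version reads more naturally.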

Bayesian statistics, on the other hand, is concerned with how probabilities represent how uncertain we are about a piece of information. In Bayesian terms, if the probability of getting heads in a coin toss is 0.5, that means we are equally unsure about whether we’ll get heads or tails. For problems like coin tosses, both frequentist and Bayesian approaches seem reasonable, but when you’re quantifying your belief that your favorite candidate will win the next election, the Bayesian interpretation makes much more sense. After all, there’s only one election, so speaking about how frequently your favorite candidate will win doesn’t make much sense. When doing Bayesian statistics, we’re just trying to accurately describe what we believe about the world given the information we have.

One particularly nice thing about Bayesian statistics is that, because we can view it simply as reasoning about uncertain things, all of the tools and techniques of Bayesian statistics make intuitive sense.

Bayesian statistics is about looking at a problem you face, figuring out how you want to describe it mathematically, and then using reason to solve it. There are no mysterious tests that give results that you aren’t quite sure of, no distributions you have to memorize, and no traditional experiment designs you must perfectly replicate. Whether you want to figure out the probability that a new web page design will bring you more customers, if your favorite sports team will win the next game, or if we really are alone in the universe, Bayesian statistics will allow you to start reasoning about these things mathematically using just a few simple rules and a new way of looking at problems.

What’s in This Book

Here’s a quick breakdown of what you’ll find in this book.

Part I: Introduction to Probability

Chapter 1: Bayesian Thinking and Everyday Reasoning This first chapter introduces you to Bayesian thinking and shows you how similar it is to everyday methods of thinking critically about a situation. We’ll explore the probability that a bright light outside your window at night is a UFO based on what you already know and believe about the world.

Chapter 2: Measuring Uncertainty In this chapter you’ll use coin toss examples to assign actual values to your uncertainty in the form of probabilities: a number from 0 to 1 that represents how certain you are in your belief about something.

Chapter 3: The Logic of Uncertainty In logic we use AND, NOT, and OR operators to combine true or false facts. It turns out that probability has similar notions of these operators. We’ll investigate how to reason about the best mode of transport to get to an appointment, and the chances of you getting a traffic ticket.

Chapter 4: Creating a Binomial Probability Distribution Using the rules of probability as logic, in this chapter, you’ll build your own probability distribution, the binomial distribution, which you can apply to many probability problems that share a similar structure. You’ll try to predict the probability of getting a specific famous statistician collectable card in a Gacha card game.

Chapter 5: The Beta Distribution Here you’ll learn about your first continuous probability distribution and get an introduction to what makes statistics different from probability. The practice of statistics involves trying to figure out what unknown probabilities might be based on data. In this chapter’s example, we’ll investigate a mysterious coin-dispensing box and the chances of making more money than you lose.

Part II: Bayesian Probability and Prior Probabilities

Chapter 6: Conditional Probability In this chapter, you’ll condition probabilities based on your existing information. For example, knowing whether someone is male or female tells us how likely they are to be color blind. You’ll also be introduced to Bayes’ theorem, which allows us to reverse conditional probabilities.

Chapter 7: Bayes’ Theorem with LEGO Here you’ll gain a better intuition for Bayes’ theorem by reasoning about LEGO bricks! This chapter will give you a spatial sense of what Bayes’ theorem is doing mathematically.

Chapter 8: The Prior, Likelihood, and Posterior of Bayes’ Theorem Bayes’ theorem is typically broken into three parts, each of which performs its own function in Bayesian reasoning. In this chapter, you’ll learn what they’re called and how to use them by investigating whether an apparent break-in was really a crime or just a series of coincidences.

Chapter 9: Bayesian Priors and Working with Probability Distributions This chapter explores how we can use Bayes’ theorem to better understand the classic asteroid scene from Star Wars: The Empire Strikes Back, through which you’ll gain a stronger understanding of prior probabilities in Bayesian statistics. You’ll also see how you can use entire distributions as your prior.

Part III: Parameter Estimation

Chapter 10: Introduction to Averaging and Parameter Estimation Parameter estimation is the method we use to formulate a best guess for an uncertain value. The most basic tool in parameter estimation is to simply average your observations. In this chapter we’ll see why this works by analyzing snowfall levels.

Chapter 11: Measuring the Spread of Our Data Finding the mean is a useful first step in estimating parameters, but we also need a way to account for how spread out our observations are. Here you’ll be introduced to mean absolute deviation (MAD), variance, and standard deviation as ways to measure how spread out our observations are.

Chapter 12: The Normal Distribution By combining our mean and standard deviation, we get a very useful distribution for making estimates: the normal distribution. In this chapter, you’ll learn how to use the normal distribution to not only estimate unknown values but also to know how certain you are in those estimates. You’ll use these new skills to time your escape during a bank heist.

Chapter 13: Tools of Parameter Estimation: The PDF, CDF, and Quantile Function Here you’ll learn about the PDF, CDF, and quantile function to better understand the parameter estimations you’re making. You’ll estimate email conversion rates using these tools and see what insights each provides.

Chapter 14: Parameter Estimation with Prior Probabilities The best way to improve our parameter estimates is to include a prior probability. In this chapter, you’ll see how adding prior information about the past success of email click-through rates can help us better estimate the true conversion rate for a new email.

Chapter 15: From Parameter Estimation to Hypothesis Testing: Building a Bayesian A/B Test Now that we can estimate uncertain values, we need a way to compare two uncertain values in order to test a hypothesis. You’ll create an A/B test to determine how confident you are in a new method of email marketing.

Part IV: Hypothesis Testing: The Heart of Statistics

Chapter 16: Introduction to the Bayes Factor and Posterior Odds: The Competition of Ideas Ever stay up late, browsing the web, wondering if you might have a super-rare disease? This chapter will introduce another approach to testing ideas that will help you determine how worried you should actually be!

Chapter 17: Bayesian Reasoning in The Twilight Zone How much do you believe in psychic powers? In this chapter, you’ll develop your own mind-reading skills by analyzing a situation from a classic episode of The Twilight Zone.

Chapter 18: When Data Doesn’t Convince You Sometimes data doesn’t seem to be enough to change someone’s mind about a belief or help you win an argument. Learn how you can change a friend’s mind about something you disagree on and why it’s not worth your time to argue with your belligerent uncle!

Chapter 19: From Hypothesis Testing to Parameter Estimation Here we come full circle back to parameter estimation by looking at how to compare a range of hypotheses. You’ll derive your first example of statistics, the beta distribution, using the tools that we’ve covered for simple hypothesis tests to analyze the fairness of a particular fairground game.

Appendix A: A Quick Introduction to R This quick appendix will teach you the basics of the R programming language.

Appendix B: Enough Calculus to Get By Here we’ll cover just enough calculus to get you comfortable with the math used in the book.

Appendix C: Answers to the Exercises This appendix provides the answers to the exercises at the end of each chapter.

Background for Reading the Book

The only requirement of this book is basic high school algebra. If you flip forward, you’ll see a few instances of math, but nothing particularly onerous. We’ll be using a bit of code written in the R programming language, which I’ll provide and talk through, so there’s no need to have learned R beforehand. We’ll also touch on calculus, but again no prior experience is required, and the appendixes will give you enough information to cover what you’ll need.

In other words, this book aims to help you start thinking about problems in a mathematical way without requiring significant mathematical background. When you finish reading it, you may find yourself inadvertently writing down equations to describe problems you see in everyday life!

If you do happen to have a strong background in statistics (even Bayesian statistics), I believe you’ll still have a fun time reading through this book. I have always found that the best way to understand a field well is to revisit the fundamentals over and over again, each time in a different light. Even as the author of this book, I found plenty of things that surprised me just in the course of the writing process!

Now Off on Your Adventure!

As you’ll soon see, aside from being very useful, Bayesian statistics can be a lot of fun! To help you learn Bayesian reasoning we’ll be taking a look at LEGO bricks, The Twilight Zone, Star Wars, and more. You’ll find that once you begin thinking probabilistically about problems, you’ll start using Bayesian statistics all over the place. This book is designed to be a pretty quick and enjoyable read, so turn the page and let’s begin our adventure in Bayesian statistics!
