Overviewing Bayes' theorem

To be honest, there are as many interpretations of Bayes' theorem as there are books about it. The one shown previously is the main one that we will be discussing. I would also encourage you to refer to https://brilliant.org/wiki/bayes-theorem/ for further reading.

Before going further, let's start with a bit of intuition and a little formal notation; it will help us set the stage for what is to come.

When we use Bayes' theorem, we are measuring a degree of belief in something: the likelihood that an event will occur. Let's just keep it that simple for now:

P(A|B)

The preceding formula means the probability of A given B.
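
For reference, Bayes' theorem ties this conditional probability to its converse. The statement below uses standard textbook symbols, which are assumed here rather than recovered from the original figure:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}, \qquad P(B) > 0
```

Here P(A) is the prior belief in A, P(B|A) is the likelihood of observing B when A holds, and P(A|B) is the updated belief in A once B has been observed.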

Probability is usually quantified as a number between 0 and 1, inclusive; 0 indicates impossibility and 1 indicates absolute certainty. The higher the probability, the greater the certainty. The chance of a die rolling a 6 and the chance of a coin flip coming up heads are two examples of probability that you are no doubt very familiar with. There's also another example you are familiar with and encounter daily: spam.

All of us usually have our email open right beside us, all day long (some of us all night long too!). And with the messages that we are expecting also come the ones that we are not and do not care to receive. We all hate dealing with spam, that nasty email that has nothing to do with anything but Viagra; yet we somehow always seem to get it. What is the probability that any one of those emails I get each day is spam? What is the probability that I care about its content? How would we even know?

So let's talk a little bit about how a spam filter works because, you see, it's perhaps the best example of probability we can use! To be precise and more formal, we are dealing with conditional probability, which is the probability of event A given the occurrence of event B.

The way most spam filters work, at least at the very basic level, is by defining a list of words that are used to indicate emails that we do not want or did not ask to receive. If the email contains those words, it's considered spam and we deal with it accordingly. So, using Bayes' theorem, we look for the probability that an email is spam given a list of words, which would look like this in a formulaic view:

P(spam | words)

The probability that an email is spam, given a set of words.

User Qniemiec on Wikipedia has an excellent diagram that lays out every combination of outcomes as the superposition of two event trees. If you are a visual person like I am, that diagram is a complete visualization of Bayes' theorem.
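
To make the spam example concrete, here is a minimal sketch in C# that applies Bayes' theorem to a single word. The class name `SpamSketch`, the method `ProbabilitySpamGivenWord`, and all of the counts are hypothetical and chosen only for illustration; they do not come from a real mail corpus or from any library.

```csharp
using System;

public static class SpamSketch
{
    // P(spam | word) = P(word | spam) * P(spam) / P(word), where
    // P(word) = P(word | spam) * P(spam) + P(word | ham) * P(ham).
    public static double ProbabilitySpamGivenWord(
        double pSpam, double pWordGivenSpam, double pWordGivenHam)
    {
        double pHam = 1.0 - pSpam;
        double pWord = pWordGivenSpam * pSpam + pWordGivenHam * pHam;
        return pWordGivenSpam * pSpam / pWord;
    }

    public static void Main()
    {
        // Hypothetical training counts: 400 spam and 600 legitimate messages.
        // The word appears in 120 of the spam and 6 of the legitimate ones.
        double pSpam = 400.0 / 1000.0;
        double pWordGivenSpam = 120.0 / 400.0;
        double pWordGivenHam = 6.0 / 600.0;

        double p = ProbabilitySpamGivenWord(pSpam, pWordGivenSpam, pWordGivenHam);
        Console.WriteLine($"P(spam | word) = {p:F3}");
    }
}
```

With these made-up counts, 120 of the 126 messages that contain the word are spam, so the result is roughly 0.95. A practical filter would combine many words rather than rely on one.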

Now, let's move on to a very famous problem. It is called by many names, but the basic problem is what is known as the taxicab problem. Here's our scenario, which we will attempt to solve using probability and Bayes' theorem.

A car was involved in a hit-and-run accident. The famous yellow taxi cabs and Uber are the two services that operate in the city, and their cars can be seen everywhere. We are given the following data:

85% of the cabs in the city are yellow and 15% are Uber.
A witness identified the car involved in the hit and run and stated that it had an Uber sticker on it. That being said, we know how reliable witness testimony is, so the court decided to test the witness and determine their reliability. Using the same set of circumstances that existed on the night of the accident, the court concluded that the witness correctly identified each of the two kinds of vehicle 80% of the time, but failed 20% of the time. This is going to be important, so stay with me on this!

Our dilemma: What is the probability that the car involved in the accident was an Uber rather than a yellow cab?

Mathematically, here's how we get to the answer we need. Consider 100 cabs, of which 15 are Uber and 85 are yellow.

The number of Uber cars that the witness would correctly identify as Uber is:

15 * 0.8 = 12

The witness is incorrect 20% of the time, so the number of yellow cabs that the witness would incorrectly identify as Uber is:

85 * 0.2 = 17

Therefore, the total number of vehicles that the witness would identify as Uber is 12 + 17 = 29. The probability that a car identified as Uber really is an Uber is hence:

12/29 ≈ 41.4%
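
The same counts can be written as a direct application of Bayes' theorem. In this sketch, U stands for "the car was an Uber", Y for "the car was a yellow cab", and W for "the witness reported an Uber"; these symbols are introduced here for illustration and do not appear in the original text.

```latex
P(U \mid W)
  = \frac{P(W \mid U)\, P(U)}{P(W \mid U)\, P(U) + P(W \mid Y)\, P(Y)}
  = \frac{0.80 \times 0.15}{0.80 \times 0.15 + 0.20 \times 0.85}
  = \frac{0.12}{0.29}
  \approx 0.414
```

The denominator is the total probability that the witness reports an Uber at all, whether correctly or by mistake.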

Now, let's see whether we can develop a simple program that can help us arrive at that number to prove our solution works and is viable. To accomplish this, we are going to dive into our first open source toolkit: Encog. Encog is designed to handle problems exactly like this.

The Encog framework is a full-fledged machine learning framework and was developed by Mr. Jeff Heaton. Mr. Heaton has also published several books on the Encog framework, as well as other subjects, and I encourage you to seek them out if you plan to use this framework extensively. I personally own every one of them and I consider them seminal works.

Let's look at the code it's going to take to solve our problem. As you'll notice, the math, statistics, and probability are all abstracted away from you. Encog lets you focus on the business problem you are trying to solve.

The complete execution block looks like the following code. We'll begin to dissect it in a moment.

public void Execute(IExampleInterface app)
{
    // Create a Bayesian network
    BayesianNetwork network = new BayesianNetwork();
    // Create the Uber driver event
    BayesianEvent UberDriver = network.CreateEvent("uber_driver");
    // Create the witness event
    BayesianEvent WitnessSawUberDriver = network.CreateEvent("saw_uber_driver");
    // Attach the two: the witness report depends on which car was involved
    network.CreateDependency(UberDriver, WitnessSawUberDriver);
    network.FinalizeStructure();

    // Build the truth tables.
    // The 0.85 prior is the yellow-cab share from the problem statement, so in
    // these tables the true state stands for "yellow" and false for "Uber".
    UberDriver?.Table?.AddLine(0.85, true);
    // The witness reports the true state 80% of the time when it holds...
    WitnessSawUberDriver?.Table?.AddLine(0.80, true, true);
    // ...and 20% of the time when it does not.
    WitnessSawUberDriver?.Table?.AddLine(0.20, true, false);
    network.Validate();

    Console.WriteLine(network.ToString());
    Console.WriteLine($"Parameter count: {network.CalculateParameterCount()}");

    EnumerationQuery query = new EnumerationQuery(network);
    // The evidence is what the witness reported
    query.DefineEventType(WitnessSawUberDriver, EventType.Evidence);
    // The outcome is which car was involved
    query.DefineEventType(UberDriver, EventType.Outcome);
    // false/false selects the 15% (Uber) state for both the report and the outcome
    query.SetEventValue(WitnessSawUberDriver, false);
    query.SetEventValue(UberDriver, false);
    query.Execute();
    Console.WriteLine(query.ToString());
}

OK, let's break this down into more digestible pieces. The first thing we are going to do is create a Bayesian network. This object will be at the center of solving our mystery. The BayesianNetwork object is a wrapper around a probability and classification engine.

The Bayesian network is composed of one or more BayesianEvents. An event will be one of three distinct types (Evidence, Outcome, or Hidden) and will usually correspond to a number in the training data. An event is always discrete, but continuous values (if present and desired) can be mapped to a range of discrete values.
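
As a rough illustration of mapping a continuous value to a discrete one, the sketch below assigns a number to a bucket by comparing it against a list of upper bounds. The helper `ToBucket`, the class `DiscretizeSketch`, and the speed ranges are hypothetical; this is plain C# and not part of the Encog API.

```csharp
using System;

public static class DiscretizeSketch
{
    // Returns the index of the first range whose upper bound exceeds the value.
    // Values at or above the last bound fall into the final bucket.
    public static int ToBucket(double value, double[] upperBounds)
    {
        for (int i = 0; i < upperBounds.Length; i++)
        {
            if (value < upperBounds[i])
            {
                return i;
            }
        }
        return upperBounds.Length;
    }

    public static void Main()
    {
        // Hypothetical speed ranges in km/h: below 30, 30 up to 60, 60 and above.
        double[] bounds = { 30.0, 60.0 };
        string[] labels = { "slow", "medium", "fast" };

        foreach (double speed in new[] { 12.5, 30.0, 75.0 })
        {
            Console.WriteLine($"{speed} -> {labels[ToBucket(speed, bounds)]}");
        }
    }
}
```

Each bucket index could then serve as one discrete choice of an event, at the cost of losing the detail inside each range.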

After creating the initial network object, we create an event for the Uber driver as well as for the witness who claimed they saw the driver involved in the hit and run. We will create a dependency between the Uber driver and the witness, and then finalize the structure of our network:

// Create a Bayesian network
BayesianNetwork network = new BayesianNetwork();
// Create the Uber driver event
BayesianEvent UberDriver = network.CreateEvent("uber_driver");
// Create the witness event
BayesianEvent WitnessSawUberDriver = network.CreateEvent("saw_uber_driver");
// Attach the two
network.CreateDependency(UberDriver, WitnessSawUberDriver);
network.FinalizeStructure();

Next, we need to build the actual truth tables. A truth table is a listing of all possible values a function can have. It has one or more input columns, and the last column gives the resulting function value. If you remember logic theory, there are three basic operations: NOT, AND, and OR. 0 usually represents false, and 1 usually represents true.

Writing -A for NOT A, A+B for A OR B, and A*B for A AND B, we end up with the following rules:

If A = 0, -A = 1

If A = 1, -A = 0

A+B = 1, except when A and B = 0

A+B = 0 if A and B = 0

A*B = 0, except when A and B = 1

A*B = 1 if A and B = 1
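
The rules above can be checked mechanically. The following sketch prints the table for NOT, OR, and AND using the same 0/1 convention; the class `TruthTableSketch` and the helper `Bit` are names made up for this example.

```csharp
using System;

public static class TruthTableSketch
{
    // Convert a Boolean to the 0/1 convention used in the rules above.
    public static int Bit(bool value) => value ? 1 : 0;

    public static void Main()
    {
        bool[] values = { false, true };
        Console.WriteLine("A B | -A A+B A*B");
        foreach (bool a in values)
        {
            foreach (bool b in values)
            {
                // -A is NOT, A+B is OR, A*B is AND.
                Console.WriteLine(
                    $"{Bit(a)} {Bit(b)} |  {Bit(!a)}  {Bit(a || b)}   {Bit(a && b)}");
            }
        }
    }
}
```

Each printed row is one combination of inputs, and the last three columns are the values of the three operations for that combination.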

Now, back to our code.

To build the truth table, we need to know the probability and the result value. The problem statement says that 85% of the cabs are yellow and 15% are Uber, so the first line assigns 0.85 to the true state of the first event; with this encoding, true stands for the yellow-cab majority and false for Uber. As for the witness, there is an 80% chance that they identify the car correctly and a 20% chance that they are mistaken. We will use the AddLine function of the truth table to add this information:

// build the truth tables
UberDriver?.Table?.AddLine(0.85, true);
WitnessSawUberDriver?.Table?.AddLine(0.80, true, true);
WitnessSawUberDriver?.Table?.AddLine(0.20, true, false);
network.Validate();

Let's talk a bit more about truth tables. Here is an extended truth table showing all the possible truth functions of two variables, P and Q.

If we were to program our truth table more extensively, here is an example of how we could do so:

a?.Table?.AddLine(0.5, true);         // P(a) = 0.5
x1?.Table?.AddLine(0.2, true, true);  // P(x1|a) = 0.2
x1?.Table?.AddLine(0.6, true, false); // P(x1|~a) = 0.6
x2?.Table?.AddLine(0.2, true, true);  // P(x2|a) = 0.2
x2?.Table?.AddLine(0.6, true, false); // P(x2|~a) = 0.6
x3?.Table?.AddLine(0.2, true, true);  // P(x3|a) = 0.2
x3?.Table?.AddLine(0.6, true, false); // P(x3|~a) = 0.6

Now that our network and truth tables are built, it's time to define some events. As we mentioned earlier, events are any one of Evidence, Hidden, or Outcome. The Hidden event, which is neither Evidence nor Outcome, is still involved in the Bayesian graph itself. We will not be using Hidden but I wanted you to know that it does exist.

To solve our mystery, we must accumulate evidence. In our case, the evidence that we have is that the witness reported seeing an Uber driver involved in the hit and run. We will define an event type of Evidence and assign it to what the witness reported. The result, or outcome, is that it was an Uber driver, so we will assign an event type of Outcome to that.

Finally, we set the event values for the query, and both are set to false. Because the prior of 0.85 was attached to the true state, true corresponds to the majority yellow cabs and false corresponds to Uber. Setting both values to false therefore asks for the probability that the car was an Uber, given that the witness reported an Uber:

EnumerationQuery query = new EnumerationQuery(network);

// The evidence is that someone saw the Uber driver hit the car
query.DefineEventType(WitnessSawUberDriver, EventType.Evidence);
// The result was the Uber driver did it
query.DefineEventType(UberDriver, EventType.Outcome);
query.SetEventValue(WitnessSawUberDriver, false);
query.SetEventValue(UberDriver, false);
query.Execute();

Notice that the query we are going to execute is an EnumerationQuery. This object allows probabilistic queries against a Bayesian network. This is done by calculating every combination of hidden nodes and using total probability to find the result. The performance can be weak if our Bayesian network is large, but fortunately for us, it is not.
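
To show the idea behind enumeration, here is a minimal sketch for the two-node network in this example. It is not Encog's implementation: the class `EnumerationSketch` and its methods are hypothetical, and the parameter names assume that the true state stands for the 85% yellow-cab majority, as implied by the 0.85 prior in the listing.

```csharp
using System;

public static class EnumerationSketch
{
    // Joint probability P(isYellow, sawYellow) for the two-node network.
    // pYellow is the prior that the cab is yellow; the other two parameters
    // describe how often the witness reports "yellow" for each kind of car.
    public static double Joint(bool isYellow, bool sawYellow,
        double pYellow, double pSawYellowGivenYellow, double pSawYellowGivenUber)
    {
        double prior = isYellow ? pYellow : 1.0 - pYellow;
        double pSaw = isYellow ? pSawYellowGivenYellow : pSawYellowGivenUber;
        double likelihood = sawYellow ? pSaw : 1.0 - pSaw;
        return prior * likelihood;
    }

    // P(isYellow = outcome | sawYellow = evidence), found by enumerating
    // every value of the unobserved variable and normalizing.
    public static double Query(bool outcome, bool evidence,
        double pYellow, double pSawYellowGivenYellow, double pSawYellowGivenUber)
    {
        double total = 0.0;
        foreach (bool candidate in new[] { true, false })
        {
            total += Joint(candidate, evidence,
                pYellow, pSawYellowGivenYellow, pSawYellowGivenUber);
        }
        return Joint(outcome, evidence,
            pYellow, pSawYellowGivenYellow, pSawYellowGivenUber) / total;
    }

    public static void Main()
    {
        // Same numbers as the listing: 0.85 prior, 0.80 and 0.20 report rates.
        // Both values false: the witness reported an Uber, and we ask for Uber.
        double p = Query(false, false, 0.85, 0.80, 0.20);
        Console.WriteLine($"{p:F4}");
    }
}
```

The two joint terms are 0.15 * 0.80 = 0.12 and 0.85 * 0.20 = 0.17, so the query evaluates to 0.12 / 0.29, which matches the 12/29 found by hand. With more nodes, the loop would have to cover every combination of hidden values, which is why this approach slows down on large networks.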

Finally, we execute our query against our Bayesian network definition and print the results:

The result, as we had hoped for, was 41%.

As an exercise for you, see whether you can now use Encog to solve another very famous example. In this example, we wake up in the morning and find out that the grass was wet. Did it rain, was the sprinkler on, or both? Here's what our truth tables look like on pen and paper:

The probability that it rained:

The complete truth table:
