A bit deeper

Without getting too deep into the machine learning terminology, this test is what is known as a binary classifier, which means that it predicts one of only two options: having cancer or not having cancer. When we are dealing with binary classifiers, we can draw what are called confusion matrices, which are 2 x 2 matrices that hold all four possible outcomes of our experiment.

Let's try some different numbers. Let's say 165 people walked in for the study, so our n (sample size) is 165. All 165 people are given the test, and we also record whether each person actually has cancer (determined through various other means). The following confusion matrix shows us the results of this experiment:

n = 165               Predicted: no cancer    Predicted: cancer
Actual: no cancer              50                     10
Actual: cancer                  5                    100

The matrix shows that 50 people were predicted to have no cancer and did not have it, 100 people were predicted to have cancer and actually did have it, and so on. We have the following four classes, again, all with different names:

  • The true positives are the tests correctly predicting positive (cancer) == 100
  • The true negatives are the tests correctly predicting negative (no cancer) == 50
  • The false positives are the tests incorrectly predicting positive (cancer) == 10
  • The false negatives are the tests incorrectly predicting negative (no cancer) == 5

The first two classes indicate where the test was correct, or true. The last two classes indicate where the test was incorrect, or false.
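As a minimal sketch of how the four counts fit into a 2 x 2 confusion matrix, the following Python snippet rebuilds one (actual, predicted) pair per person from the numbers above and tallies them. The variable names, the `confusion_matrix` helper, and the row/column ordering (rows for actual, columns for predicted, "no cancer" first) are assumptions made for illustration, not part of the original study.

```python
# Counts taken from the study described above (n = 165).
true_positives = 100   # predicted cancer, actually has cancer
true_negatives = 50    # predicted no cancer, actually has no cancer
false_positives = 10   # predicted cancer, actually has no cancer
false_negatives = 5    # predicted no cancer, actually has cancer

# Rebuild one (actual, predicted) pair per person; True means "cancer".
outcomes = (
    [(True, True)] * true_positives
    + [(False, False)] * true_negatives
    + [(False, True)] * false_positives
    + [(True, False)] * false_negatives
)


def confusion_matrix(pairs):
    """Tally pairs into a 2 x 2 nested list.

    Rows are the actual class (no cancer, cancer);
    columns are the predicted class (no cancer, cancer).
    """
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in pairs:
        matrix[int(actual)][int(predicted)] += 1
    return matrix


matrix = confusion_matrix(outcomes)
n = sum(sum(row) for row in matrix)
correct = matrix[0][0] + matrix[1][1]    # true negatives + true positives
incorrect = matrix[0][1] + matrix[1][0]  # false positives + false negatives

print(matrix)                 # [[50, 10], [5, 100]]
print(n, correct, incorrect)  # 165 150 15
```

The diagonal of the matrix holds the cases where the test was correct, and the off-diagonal cells hold the cases where it was incorrect.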

A false positive is sometimes called a Type I error, whereas a false negative is called a Type II error:

Type I and Type II errors

(Source: http://marginalrevolution.com/marginalrevolution/2014/05/type-i-and-type-ii-errors-simplified.html)

We will get into this in later chapters. For now, we just need to understand why we use set notation to denote probabilities for compound events: events are sets of outcomes. When events A and B exist in the same universe, we can use intersections to represent both happening at the same time and unions to represent at least one of them happening.

We will go into this much more in later chapters, but it is good to introduce it now.
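To make the set view concrete, here is a small sketch that treats the 165 people from the study as the universe and uses Python's built-in set operations for intersection and union. The person IDs and how they are assigned to groups are hypothetical, chosen only so that the group sizes match the counts above; the `probability` helper is likewise an illustrative assumption.

```python
from fractions import Fraction

# Universe: one hypothetical ID per person in the study (n = 165).
universe = set(range(165))

# Hypothetical ID assignment that matches the counts above:
#   0-99    true positives   (100 people)
#   100-104 false negatives  (5 people)
#   105-114 false positives  (10 people)
#   115-164 true negatives   (50 people)
has_cancer = set(range(0, 105))                               # event A
tested_positive = set(range(0, 100)) | set(range(105, 115))   # event B


def probability(event):
    """Probability of an event as (size of event) / (size of universe)."""
    return Fraction(len(event), len(universe))


both = has_cancer & tested_positive    # intersection: A and B together
either = has_cancer | tested_positive  # union: A or B (or both)

print(len(both), probability(both))      # 100 20/33
print(len(either), probability(either))  # 115 23/33
```

Here the intersection is exactly the group of true positives, while the union covers everyone except the true negatives.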
