Computing the performance of classification

There are several measures of performance that we can compute from the preceding table:

  • The true positive rate (or sensitivity) is computed as the number of true positives divided by the number of true positives plus the number of false negatives. In our example, sensitivity is the probability that a survivor is classified as such. Sensitivity in this case is:

    168 / (168 + 186) = 0.4745763

  • The true negative rate (or specificity) is computed as the number of true negatives divided by the number of true negatives plus the number of false positives. In our example, specificity is the probability that a nonsurvivor is classified as such. Specificity in this case is:

    645 / (645 + 60) = 0.9148936

  • The positive predictive value (or precision) is computed as the number of true positives divided by the number of true positives plus the number of false positives. In our example, precision is the probability that individuals classified as survivors are actually survivors. Precision in this case is:

    168 / (168 + 60) = 0.7368421

  • The negative predictive value is computed as the number of true negatives divided by the number of true negatives plus the number of false negatives. In our example, the negative predictive value is the probability that individuals classified as nonsurvivors are actually nonsurvivors. In this case, its value is:

    645 / (645 + 186) = 0.7761733

  • The accuracy is computed as the number of correctly classified instances (true positives plus true negatives) divided by the total number of instances (correctly classified plus incorrectly classified). In our example, accuracy is the proportion of survivors and nonsurvivors correctly classified as such. In this case, accuracy is:

    (645 + 168) / ( (645 + 168) + (60 + 186) ) = 0.7677054

  • Cohen's kappa is a measure that can be used to assess the performance of classification. Its advantage is that it adjusts for correct classification happening by chance. Its drawback is that it is slightly more complex to compute. Consider two possible class values, No and Yes. The kappa coefficient is computed as the accuracy minus the probability of correct classification by chance, divided by 1 minus the probability of correct classification by chance. The probability of correct classification by chance is the probability of correct classification of "Yes" by chance plus the probability of correct classification of "No" by chance. The probability of correct classification of "Yes" by chance is the probability of observed "Yes" multiplied by the probability of classified "Yes". Likewise, the probability of correct classification of "No" by chance is the probability of observed "No" multiplied by the probability of classified "No". Let's compute the kappa for the preceding example.
    In the preceding example, the probability of observed "No" is (645 + 60) / (645 + 60 + 186 + 168) = 0.6657224. The probability of classified "No" is (645 + 186) / (645 + 186 + 60 + 168) = 0.7847025. The probability of correct classification of "No" by chance is therefore 0.6657224 * 0.7847025 = 0.5223940. The probability of observed "Yes" is (186 + 168) / (186 + 168 + 645 + 60) = 0.3342776. The probability of classified "Yes" is (60 + 168) / (60 + 168 + 645 + 186) = 0.2152975. The probability of correct classification of "Yes" by chance is therefore 0.3342776 * 0.2152975 = 0.0719691. We can now compute the probability of correct classification by chance as 0.5223940 + 0.0719691 = 0.5943631. We have seen that accuracy is 0.7677054. We can compute the kappa as follows:

    (0.7677054 - 0.5943631) / (1 - 0.5943631) ≈ 0.4273
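
The five ratio-based measures in the list above (sensitivity, specificity, precision, negative predictive value, and accuracy) can be reproduced in a few lines of code. What follows is a minimal sketch in Python, which is an assumption made for illustration and not a language used in the text. The counts come from the worked example; the names `tp`, `fn`, `tn`, `fp`, and `ratio_metrics` are hypothetical.

```python
# Counts from the worked example; "positive" means classified as a survivor.
# All names below are illustrative and do not come from the original text.
tp, fn = 168, 186   # survivors classified as survivors / as nonsurvivors
tn, fp = 645, 60    # nonsurvivors classified as nonsurvivors / as survivors


def ratio_metrics(tp, fn, tn, fp):
    """Return the five ratio-based measures as a dictionary."""
    return {
        "sensitivity": tp / (tp + fn),                # true positive rate
        "specificity": tn / (tn + fp),                # true negative rate
        "precision": tp / (tp + fp),                  # positive predictive value
        "negative_predictive_value": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


metrics = ratio_metrics(tp, fn, tn, fp)
for name, value in metrics.items():
    print(f"{name}: {value:.7f}")
```

With these counts, the printed values should match the worked calculations for the five measures (0.4745763, 0.9148936, 0.7368421, 0.7761733, and 0.7677054).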

Kappa values can range from -1 to 1. Values below zero indicate a classification that is worse than what can be obtained by chance. A kappa of zero means that the classification is not better than what can be obtained by chance. A kappa of one means a perfect classification. Usually, values below .60 are considered bad, and values above .80 are preferred. In the case of our example (kappa below .60), we would not trust the classification, and would refrain from using the classifier any further.
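
The kappa computation and the .60 rule of thumb can also be sketched in code. The following Python function is an illustration under the same assumptions as the description above (two classes, with chance agreement estimated from the observed and classified proportions); the function name `cohen_kappa` and its parameter names are hypothetical.

```python
def cohen_kappa(tp, fn, tn, fp):
    """Cohen's kappa for a two-class confusion matrix (illustrative sketch)."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    # Chance agreement per class: P(observed class) * P(classified as class).
    chance_yes = ((tp + fn) / total) * ((tp + fp) / total)
    chance_no = ((tn + fp) / total) * ((tn + fn) / total)
    chance = chance_yes + chance_no
    return (accuracy - chance) / (1 - chance)


# Counts from the worked example (Yes = survivor, No = nonsurvivor).
kappa = cohen_kappa(tp=168, fn=186, tn=645, fp=60)
print(f"kappa = {kappa:.4f}")
print("below the .60 rule of thumb" if kappa < 0.60 else "at or above .60")
```

Working from the four raw counts, and not from rounded intermediate probabilities, avoids small rounding differences in the last decimal places.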
