Classification metrics

Although we looked at the test set accuracy for our model, we know from Chapter 1, Gearing Up for Predictive Modeling, that the binary confusion matrix can be used to compute a number of other useful performance metrics for our data, such as precision, recall, and the F measure.

We'll compute these for our training set now:

> (confusion_matrix <- table(predicted = train_class_predictions, actual = heart_train$OUTPUT))
         actual
predicted   0   1
        0 118  16
        1  10  86
> (precision <- confusion_matrix[2, 2] / sum(confusion_matrix[2,]))
[1] 0.8958333
> (recall <- confusion_matrix[2, 2] / sum(confusion_matrix[,2]))
[1] 0.8431373
> (f <- 2 * precision * recall / (precision + recall))
[1] 0.8686869 

Here, we used the trick of bracketing our assignment statements in parentheses to simultaneously assign the result of an expression to a variable and print out the value assigned. Now, recall is the number of correctly identified instances of class 1 divided by the total number of observations that actually belong to class 1. In a medical context such as ours, this is also known as sensitivity, as it is an effective measure of a model's ability to detect, or be sensitive to, a particular condition. Recall is also known as the true positive rate. There is an analogous measure known as specificity, which is the true negative rate. This involves the mirror computation of recall for class 0, that is, the correctly identified members of class 0 divided by all the observations of class 0 in our data set. In our medical context, the interpretation of specificity is that it measures the model's ability to reject observations that do not have the condition represented by class 1 (in our case, heart disease). We can compute the specificity of our model as follows:

> (specificity <- confusion_matrix[1, 1] / sum(confusion_matrix[, 1]))
[1] 0.921875

In computing these metrics, we begin to see the importance of our choice of threshold, which we set at 0.5. If we were to choose a different threshold, all of the preceding metrics would change. In particular, there are many circumstances, our current medical context being a prime example, in which we may want to bias our threshold toward identifying members of class 1. For example, suppose our model was being used by a clinician to decide whether a patient should undergo a more detailed and expensive examination for heart disease. We would probably consider mislabeling a patient with a heart condition as healthy to be a more serious mistake than asking a healthy patient to undergo further tests. To achieve this bias, we could lower our classification threshold to 0.3 or 0.2, for example.
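
As a rough sketch of what this would look like (assuming heart_model and heart_train are as defined earlier in the chapter; the variable names below are illustrative and the output is omitted here), we could regenerate our class predictions at a threshold of 0.3 and recompute recall:

> # Recompute fitted probabilities and reclassify with a lower cutoff of 0.3
> train_probabilities <- predict(heart_model, newdata = heart_train, type = "response")
> lowered_class_predictions <- as.numeric(train_probabilities > 0.3)
> (lowered_confusion_matrix <- table(predicted = lowered_class_predictions, actual = heart_train$OUTPUT))
> (lowered_recall <- lowered_confusion_matrix[2, 2] / sum(lowered_confusion_matrix[, 2]))

Lowering the threshold in this way will typically raise recall for class 1 at the cost of precision, which is exactly the trade-off that the plot introduced next lets us visualize.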

Ideally, what we would like is a visual way to assess the effect of changing the threshold on our performance metrics, and the precision-recall curve is one such useful plot. In R, we can use the ROCR package to obtain precision-recall curves:

> library(ROCR)
> train_predictions <- predict(heart_model, newdata = heart_train, type = "response")
> pred <- prediction(train_predictions, heart_train$OUTPUT)
> perf <- performance(pred, measure = "prec", x.measure = "rec")

We can then plot the perf object to obtain our precision-recall curve.
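
ROCR provides a plot method for performance objects, so this takes a single call:

> plot(perf)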

[Figure: precision-recall curve for the training data]

The graph shows us, for example, that to obtain values of recall above 0.8, we'll have to sacrifice precision quite abruptly. To fine-tune our threshold, we'll want to see the individual threshold values that were used to compute this graph. A useful exercise is to create a data frame of the cutoff values, that is, the threshold values at which precision and recall change in our data, along with their corresponding precision and recall values. We can then subset this data frame to inspect individual thresholds that interest us.

For example, suppose we want to find a suitable threshold so that we have at least 90 percent recall and 80 percent precision. We can do this as follows:

> thresholds <- data.frame(cutoffs = perf@alpha.values[[1]], recall = perf@x.values[[1]], precision = perf@y.values[[1]])
> subset(thresholds,(recall > 0.9) & (precision > 0.8))
      cutoffs    recall precision
112 0.3491857 0.9019608 0.8288288
113 0.3472740 0.9019608 0.8214286
114 0.3428354 0.9019608 0.8141593
115 0.3421438 0.9019608 0.8070175

As we can see, a threshold of roughly 0.35 will satisfy our requirements.
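
To put this threshold to use, we would simply regenerate our class predictions from the fitted probabilities with the new cutoff; a one-line sketch using the train_predictions vector computed earlier (the name tuned_class_predictions is illustrative):

> tuned_class_predictions <- as.numeric(train_predictions > 0.35)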

Tip

You may have noticed that we used the @ symbol to access some of the attributes of the perf object. This is because perf is an object of a special type known as an S4 class. S4 classes are used to provide object-oriented features in R. A good reference to learn about S4 classes and object-oriented programming in R more generally is Advanced R, Hadley Wickham, Chapman and Hall.
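
A quick way to explore such objects is the built-in slotNames() function, which lists the slots that can be accessed with @; for example, running it on our perf object shows the slots we used above (output omitted):

> slotNames(perf)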
