Choosing the right classification metric

We talked about several essential scoring functions in Chapter 3, First Steps in Supervised Learning. Among the most fundamental metrics for classification were the following:

  • Accuracy: This counts the number of data points in the test set that have been predicted correctly and returns that number as a fraction of the test set size (sklearn.metrics.accuracy_score). This is the most basic scoring function for classifiers, and we have made extensive use of it throughout this book.
  • Precision: This describes the ability of a classifier not to label as positive a sample that is negative (sklearn.metrics.precision_score).
  • Recall (or sensitivity): This describes the ability of a classifier to retrieve all of the positive samples (sklearn.metrics.recall_score).

Although precision and recall are important measures, looking at only one of them will not give us a good idea of the big picture. One way to summarize the two measures is known as the f-score or f-measure (sklearn.metrics.f1_score), which computes the harmonic mean of precision and recall as 2 × (precision × recall) / (precision + recall).
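
The following is a minimal sketch of how these four scores can be computed with scikit-learn. The y_true and y_pred labels are toy values made up for illustration; only the metric functions are real API:

    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # ground-truth class labels
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # labels predicted by some classifier

    print(accuracy_score(y_true, y_pred))   # fraction of correct predictions
    print(precision_score(y_true, y_pred))  # fraction of predicted positives that are truly positive
    print(recall_score(y_true, y_pred))     # fraction of true positives that were retrieved
    print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall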

Sometimes we need to do more than maximize accuracy. For example, if we are using machine learning in a commercial application, then the decision-making should be driven by the business goals. One of these goals might be to guarantee at least 90% recall. The challenge then becomes to develop a model that still has reasonable accuracy while satisfying all secondary requirements. Setting goals like this is often called setting the operating point.
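
As a rough sketch of setting such an operating point, the snippet below trains a classifier on synthetic data and sweeps decision thresholds until the recall requirement is met. The dataset, the logistic regression model, and the 90% target are assumptions chosen purely for illustration:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import precision_score, recall_score

    # Synthetic, slightly imbalanced binary classification problem (an assumption).
    X, y = make_classification(n_samples=1000, weights=[0.8], random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    scores = clf.predict_proba(X_test)[:, 1]   # predicted probability of the positive class

    # Sweep thresholds from high to low and stop at the first one that
    # satisfies the recall >= 0.9 requirement.
    for threshold in np.linspace(0.9, 0.0, 10):
        y_pred = (scores >= threshold).astype(int)
        if recall_score(y_test, y_pred) >= 0.9:
            print(f"threshold {threshold:.2f}: "
                  f"precision {precision_score(y_test, y_pred):.2f}, "
                  f"recall {recall_score(y_test, y_pred):.2f}")
            break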

However, it is often not clear what the operating point should be when developing a new system. To understand the problem better, it is important to investigate all possible trade-offs of precision and recall at once. This is possible using a tool called the precision-recall curve (sklearn.metrics.precision_recall_curve).
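
Continuing the sketch above (y_test and scores are assumed to be the test labels and predicted probabilities from that example), the precision-recall curve can be computed and plotted as follows:

    import matplotlib.pyplot as plt
    from sklearn.metrics import precision_recall_curve

    # One precision/recall pair per candidate decision threshold.
    precision, recall, thresholds = precision_recall_curve(y_test, scores)

    plt.plot(recall, precision)
    plt.xlabel('recall')
    plt.ylabel('precision')
    plt.title('precision-recall curve')
    plt.show()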

Another commonly used tool to analyze the behavior of classifiers is the Receiver Operating Characteristic (ROC) curve.
Similar to the precision-recall curve, the ROC curve considers all possible thresholds for a given classifier, but it shows the false positive rate against the true positive rate instead of precision and recall.
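
A matching sketch for the ROC curve, again reusing y_test and scores from the earlier operating-point example (an assumption; any classifier exposing continuous scores would do):

    import matplotlib.pyplot as plt
    from sklearn.metrics import roc_curve

    # False positive rate and true positive rate for every candidate threshold.
    fpr, tpr, thresholds = roc_curve(y_test, scores)

    plt.plot(fpr, tpr)
    plt.xlabel('false positive rate')
    plt.ylabel('true positive rate')
    plt.title('ROC curve')
    plt.show()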