ROC curves

Originating from World War II radar engineering, the receiver operating characteristic (ROC) curve is a common way to compare the effectiveness of ML models against each other. It plots the Recall rate against the fall-out rate (calculated as 1 - specificity) across a range of threshold values. The fall-out rate is also known as the False Positive Rate (FPR) or the probability of false alarm.

You may have noticed a trend: many of these measures have several different names. It can get confusing, but it is important to know the pseudonyms of the measures, since different articles, blogs, and research papers will use different names for them.

In most cases, the ROC curve will be used with binary classification problems. Some examples of binary classification questions include failure/no failure, operating/not operating, raining/not raining, and purchase/no purchase. An ROC curve shows how the trade-off between the benefits of correct classifications (true positives) and the costs of incorrect classifications (false positives) changes along intervals of a threshold parameter.

It can help to think of the true positive rate as follows:

True positive rate = Correctly classified positives / All positives in the training set

And think of the false positive rate as follows:

False positive rate = Negatives incorrectly classified as positive / All negatives in the training set
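
To make the two rates concrete, here is a minimal sketch in R that computes both at a single fixed threshold. The actual and predicted vectors are invented for illustration, not taken from any dataset in this chapter:

#computing the two rates from one set of predictions (vectors are invented)
actual    <- factor(c("pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg"))
predicted <- factor(c("pos", "pos", "pos", "neg", "pos", "neg", "neg", "neg"))

truePositives  <- sum(predicted == "pos" & actual == "pos") #3 here
falsePositives <- sum(predicted == "pos" & actual == "neg") #1 here
tpr <- truePositives / sum(actual == "pos")  #3/4 = 0.75
fpr <- falsePositives / sum(actual == "neg") #1/4 = 0.25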

The following graph is an example of an ROC curve for a single model. The true positive rate (also called Recall or Sensitivity) is shown on the vertical y axis, and the false positive rate (the false alarm rate) is shown on the horizontal x axis. The diagonal dotted line represents what a model made of random guesses would look like. The closer a model's ROC curve is to that line, the less distinguishable it is from random guessing. A curve below the dotted line is worse than random guessing, but it can still have some useful value, simply by taking the opposite of what it predicts. The curved blue line represents the ML model being graphed:

The ROC example chart. All ROC images generated using the demonstration tool at https://kennis-research.shinyapps.io/ROC-Curves/.

Normally, you would be comparing multiple ML models on the same graph, each with its own curve. However, it is easier to grasp the intuition of the method by using only one ML curve. Read the chart by following the curve from left to right. The left side of the chart represents more conservative threshold settings for the ML model, while moving to the right represents more and more liberal settings.

A perfect model would discriminate perfectly between positive and negative instances along all threshold settings until you get to the most liberal setting, which would consider all examples as positives. Its curve would hug the upper-left corner of the chart. An extremely good model is shown on an ROC chart in the following figure:

A near-perfect ML model. Don't believe it; this is about as rare in the real world as leprechauns playing polo while riding unicorns.

ROC curves are generated by running the training data through the trained model to generate the predicted score (or probability) of a positive classification for each instance. The results are sorted with the most likely first. Each prediction is compared to the actual classification of the instance, and a running score of the true positive rate and the false positive rate is kept as you move down the list.

Each true positive rate and false positive rate combination is graphed on the chart, with a line connecting the points. In reality, an ROC curve is a step function that approaches a true curve as the size of your training set grows.

As you move down the list into less and less likely positives (based on the model's prediction), you effectively sweep through a range of thresholds, from the most conservative (your top-scoring instance) to the most liberal (your lowest-scoring instance). The resulting curve shows the performance of your ML model along a wide range of threshold scenarios. When you are comparing ML models using ROC curves, you are comparing their performance under many different threshold conditions. This gives you a fuller picture of performance.
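
The sweep just described is easy to reproduce by hand. Here is a minimal sketch in base R; the scores and labels are invented for illustration, and the plot call draws the step function mentioned earlier:

#a minimal sketch of the threshold sweep, with invented scores and labels
scores <- c(0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10)
labels <- c(1, 1, 0, 1, 1, 0, 0, 0) #1 = positive, 0 = negative

ord <- order(scores, decreasing = TRUE)          #most likely positives first
tpr <- cumsum(labels[ord]) / sum(labels)         #running true positive rate
fpr <- cumsum(1 - labels[ord]) / sum(1 - labels) #running false positive rate

plot(c(0, fpr), c(0, tpr), type = "s",           #"s" draws the step function
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)                            #random-guess diagonal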

ROC curve charts can be generated easily in R. Using the pROC package, you can generate chart images just using the roc and plot functions. Here is a simple example:

#load pROC library
library(pROC)
#create ROC object based on the model and its class labels
rocCurve <- roc(response = classFactors,
                predictor = testPredictionsProb,
                levels = rev(levels(classFactors))) #the roc function assumes the second class is the target, so we reverse the labels

#plot curve
plot(rocCurve, legacy.axes = TRUE, identity = TRUE, col = "blue", add = FALSE)
#you can also add another model curve to the chart by calling this function again with the 'add = TRUE' option

The code will produce an ROC curve plot similar to the following example:

Example of an ROC curve plot in R.
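
The classFactors and testPredictionsProb objects in the preceding code are assumed to come from your own trained model. For readers who want something runnable end to end, here is a self-contained sketch of the same calls using simulated data, with pROC's auc function added at the end:

#a self-contained sketch with simulated data (all names are illustrative)
library(pROC)
set.seed(42)
classFactors <- factor(sample(c("fail", "ok"), 200, replace = TRUE))
#fake probability scores that loosely track the true class
testPredictionsProb <- ifelse(classFactors == "fail",
                              rnorm(200, mean = 0.6, sd = 0.2),
                              rnorm(200, mean = 0.4, sd = 0.2))
rocCurve <- roc(response = classFactors,
                predictor = testPredictionsProb,
                levels = rev(levels(classFactors)))
plot(rocCurve, legacy.axes = TRUE, identity = TRUE, col = "blue")
auc(rocCurve) #area under the curve summarizes the chart in a single number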

The following are some benefits of using ROC curves:

  • Not sensitive to changes in class distributions: Shifts in the numbers of positive and negative examples will not affect the ML model's ROC curve, assuming the model remains as effective in discriminating between them.
  • Useful even if classification error costs are unequal: When the costs of false positives or the benefits of correctly classified positives change, the ROC curve is unaffected. Only the region of interest along the curve changes. In other words, the optimal threshold setting will shift but the shape of the curve will not (see the sketch after this list).
  • Allows the comparison of the performance of different ML models along various threshold settings: Some models will perform best at conservative settings while others may be the best at more liberal settings. The best model, therefore, depends on the underlying business case.
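
The last two points can be made concrete with pROC's coords function, which reports the threshold that best balances sensitivity and specificity. Here is a minimal sketch, reusing the rocCurve object built earlier; the cost and prevalence weights are invented for illustration:

#find the best threshold by Youden's index (sensitivity + specificity - 1)
coords(rocCurve, x = "best", best.method = "youden")
#the same search, assuming a false negative costs twice as much as a false
#positive at a 50% prevalence (both weights invented for illustration)
coords(rocCurve, x = "best", best.method = "youden",
       best.weights = c(2, 0.5))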

The following are things to keep in mind with ROC curve analysis:

  • Difficult to use intuitively for multi-class models: It is hard to visualize beyond 2-dimensional space, and the dimensions grow quickly with multi-class prediction models. There are methods to do this, but they are still not easy to interpret (a minimal multi-class sketch follows this list).
  • Hard to explain to non-analytical audiences: There is a learning curve to ROC curves (it is fun to use the word curve). However, non-specialists can certainly learn how to understand them with repeated exposure. The author recommends educating them if possible. Organizations that understand these charts can use them when making cost/benefit business decisions. Once they get over the hump of learning to interpret the charts, they will find them very useful.
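
For the multi-class case, the pROC package offers a multiclass.roc function, which averages the AUC over every pair of classes (the Hand and Till approach). Here is a minimal sketch; the response factor and score vector are invented for illustration:

#a minimal multi-class sketch using pROC's multiclass.roc function
library(pROC)
set.seed(7)
y <- factor(sample(c("a", "b", "c"), 150, replace = TRUE)) #invented classes
fakeScores <- rnorm(150) + as.numeric(y) #scores that loosely track the class
multiclass.roc(response = y, predictor = fakeScores) #averaged pairwise AUC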