Evaluating classification

Is our classifier doing well? Is it better than the other one? In classification, we count how many times we classify something correctly and how many times incorrectly. Suppose there are two possible classification labels, yes and no; then there are four possible outcomes, as shown in the following table:

                          Predicted as positive?
                          Yes                   No
Really positive?   Yes    TP (true positive)    FN (false negative)
                   No     FP (false positive)   TN (true negative)

The four outcomes (tallied in the code sketch after this list):

  • True positive (hit): This indicates a yes instance correctly predicted as yes
  • True negative (correct rejection): This indicates a no instance correctly predicted as no
  • False positive (false alarm): This indicates a no instance predicted as yes
  • False negative (miss): This indicates a yes instance predicted as no
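
To make the counts concrete, here is a minimal Python sketch that tallies the four outcomes from a pair of label lists; the data and variable names are hypothetical, not from the text:

```python
# Tally the four outcomes by comparing actual labels with predictions.
# The two lists below are hypothetical examples.
actual    = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no",  "no", "yes", "yes", "no"]

tp = sum(1 for a, p in zip(actual, predicted) if a == "yes" and p == "yes")
tn = sum(1 for a, p in zip(actual, predicted) if a == "no"  and p == "no")
fp = sum(1 for a, p in zip(actual, predicted) if a == "no"  and p == "yes")
fn = sum(1 for a, p in zip(actual, predicted) if a == "yes" and p == "no")

print(tp, tn, fp, fn)  # 2 2 1 1
```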

The two basic performance measures of a classifier are, firstly, the classification error:

$$\text{error} = \frac{FP + FN}{TP + TN + FP + FN}$$

And, secondly, the classification accuracy:

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = 1 - \text{error}$$
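
In code, both measures follow directly from the four counts; this continues the hypothetical example above:

```python
# Classification error and accuracy from the four outcome counts.
# The counts are the hypothetical ones from the sketch above.
tp, tn, fp, fn = 2, 2, 1, 1

total = tp + tn + fp + fn
error = (fp + fn) / total      # fraction of instances misclassified
accuracy = (tp + tn) / total   # fraction of instances classified correctly

print(f"error={error:.2f}, accuracy={accuracy:.2f}")  # error=0.33, accuracy=0.67
```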

The main problem with these two measures is that they cannot handle unbalanced classes. Classifying whether a credit card transaction is fraudulent is an example of a problem with unbalanced classes: 99.99% of transactions are normal, and only a tiny fraction are abuses. A classifier that declares every transaction normal is 99.99% accurate, yet the rare positive cases are exactly the ones we are interested in.
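
The sketch below illustrates this failure on a hypothetical sample of 10,000 transactions containing a single fraud: the trivial classifier that always answers "normal" scores 99.99% accuracy while catching nothing.

```python
# Accuracy is misleading on unbalanced classes: predicting "normal"
# for every transaction looks near-perfect yet detects no fraud.
# Hypothetical data: 9,999 normal transactions and 1 fraudulent one.
actual = ["normal"] * 9999 + ["fraud"]
predicted = ["normal"] * 10000  # the trivial "always normal" classifier

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
frauds_caught = sum(a == "fraud" and p == "fraud"
                    for a, p in zip(actual, predicted))

print(f"accuracy = {accuracy:.2%}")       # accuracy = 99.99%
print(f"frauds caught: {frauds_caught}")  # frauds caught: 0
```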
