Understanding supervised learning

We have previously established that the goal of supervised learning is always to predict labels (or target values) for data. However, depending on the nature of these labels, supervised learning can come in two distinct forms:

Classification: Supervised learning is called classification whenever we use the data to predict categories. A good example is predicting whether an image contains a cat or a dog. Here, the labels are categorical: each label is exactly one category, never a mixture of categories. A picture contains either a cat or a dog, never 50% cat and 50% dog (before you ask, no, we do not consider pictures of the cartoon character CatDog here), and our job is simply to tell which one it is. When there are only two choices, this is called two-class or binary classification. When there are more than two categories, as when predicting the species of an Iris flower (recall the Iris flower dataset we used in Chapter 1, A Taste of Machine Learning), it is known as multi-class classification.
Regression: Supervised learning is called regression whenever we use the data to predict real values, that is, numbers on a continuous scale. A good example is predicting stock prices. Rather than assigning each stock to a category, the goal of regression is to predict a target value as accurately as possible, for example, to predict stock prices with the smallest possible error.
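
To make the distinction concrete, here is a minimal sketch that uses the same nearest-neighbor idea for both tasks. It is an illustration, not a method taken from this chapter: the helper names (nearest_neighbors, predict_class, predict_value) and the toy data are hypothetical. The only difference between the two predictors is how the neighbors' labels are combined: a majority vote yields a category, an average yields a real value.

```python
from collections import Counter
from statistics import mean


def nearest_neighbors(train, query, k=3):
    """Return the k (feature, label) pairs whose feature is closest to query."""
    return sorted(train, key=lambda pair: abs(pair[0] - query))[:k]


def predict_class(train, query, k=3):
    """Classification: majority vote over the neighbors' categorical labels."""
    labels = [label for _, label in nearest_neighbors(train, query, k)]
    return Counter(labels).most_common(1)[0][0]


def predict_value(train, query, k=3):
    """Regression: average of the neighbors' real-valued targets."""
    targets = [target for _, target in nearest_neighbors(train, query, k)]
    return mean(targets)


# Hypothetical toy data: one feature per sample (say, body weight in kg).
animals = [(3.5, "cat"), (4.0, "cat"), (4.5, "cat"),
           (20.0, "dog"), (25.0, "dog"), (30.0, "dog")]

# Hypothetical toy data: a day index paired with a made-up closing price.
prices = [(1, 10.0), (2, 11.0), (3, 12.0), (4, 13.0), (5, 14.0)]

label = predict_class(animals, 5.0)   # always one of the known categories
price = predict_value(prices, 4.5)    # any real number, not a category

print(label, price)
```

Note that predict_class can only ever return a label that appears in the training data, whereas predict_value can return a number that no training sample has.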

There are many different ways to do classification and regression, but we'll only look at a few. Perhaps the easiest way to figure out whether we are dealing with a classification or regression problem is to ask ourselves the following question: what are we actually trying to predict? If the answer is a category, we are dealing with classification; if it is a real value, we are dealing with regression. This is summarized in the following diagram:

The preceding image shows the difference between classification and regression problems.
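
The same question can be asked of the labels themselves. The following is a rough sketch under a loud assumption: labels stored as floats are treated as real-valued targets, and everything else (strings, booleans, integers) is treated as a category. The function name guess_task is hypothetical, and the rule is only a heuristic; integer targets such as counts or ages would be misread as categories.

```python
def guess_task(labels):
    """Guess the supervised learning task from the labels (heuristic only)."""
    distinct = set(labels)
    if len(distinct) < 2:
        raise ValueError("need at least two distinct labels to guess a task")
    # Assumption: float labels stand for real-valued targets.
    if any(isinstance(value, float) for value in distinct):
        return "regression"
    # Otherwise the labels are treated as categories.
    if len(distinct) == 2:
        return "binary classification"
    return "multi-class classification"


print(guess_task(["cat", "dog", "dog", "cat"]))
print(guess_task(["setosa", "versicolor", "virginica"]))
print(guess_task([101.5, 99.8, 102.3]))
```

In practice, the decision should come from what the labels mean rather than from how they happen to be stored.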
