Learning optimal decision boundaries

Let's look at a simple example. Consider some training samples with only two features (x and y values) and a corresponding target label (positive (+) or negative (-)). Since the labels are categorical, we know that this is a classification task. Moreover, because we only have two distinct classes (+ and -), it's a binary classification task.

In a binary classification task with two features, a decision boundary is a line that partitions the feature space, and with it the training set, into two regions, one per class. An optimal decision boundary splits the data so that all the data samples from one class (say, +) are to the left of the decision boundary, and all the data samples from the other class (say, -) are to the right of it.
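To make this concrete, here is a minimal sketch in Python (assuming NumPy is installed; the feature values and the boundary parameters w and b are made up for illustration). A linear decision boundary is the set of points where w . x + b = 0, and each side of that line corresponds to one class:

    import numpy as np

    # Made-up toy training set: two features per sample, labels +1 (the +
    # class) and -1 (the - class)
    X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0],   # + samples
                  [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])  # - samples
    y = np.array([+1, +1, +1, -1, -1, -1])

    # A hand-picked linear boundary: w . x + b > 0 puts a sample on the
    # + side, w . x + b < 0 on the - side
    w = np.array([-1.0, -1.0])
    b = 9.0
    print(np.sign(X @ w + b))  # [ 1.  1.  1. -1. -1. -1.]: a perfect split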

An SVM updates its choice of decision boundary throughout the training procedure. For example, at the beginning of training, the classifier has seen only a few data points, and it tries to draw the decision boundary that best separates the two classes. As training progresses, the classifier sees more and more data samples, and so it keeps updating the decision boundary at each step. This process is illustrated in the following diagram:

The preceding diagram shows that the decision boundary is constantly updated during the training phase.
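One way to watch this updating happen is with an online learner. The following is a minimal sketch (assuming scikit-learn is installed) that uses SGDClassifier with the hinge loss, an SVM-style linear classifier that can be refit one small batch at a time; the data values are made up:

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    # The same made-up toy data as before
    X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0],
                  [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
    y = np.array([+1, +1, +1, -1, -1, -1])

    clf = SGDClassifier(loss='hinge', random_state=0)
    for i in range(0, len(X), 2):
        # Feed the classifier two samples at a time; after every call, the
        # current boundary (clf.coef_, clf.intercept_) has been updated
        clf.partial_fit(X[i:i + 2], y[i:i + 2], classes=[-1, +1])
        print(clf.coef_[0], clf.intercept_[0])

Note that this is only an analogy for the diagram: a batch SVM solver sees all the training data at once, but the effect of accumulating evidence about the boundary is the same.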

As the training progresses, the classifier gets to see more and more data samples, and thus gets an increasingly better idea of where the optimal decision boundary should lie. In this scenario, a misclassification error would occur if the decision boundary were drawn in such a way that a - sample ended up to the left of it, or a + sample to the right of it.
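Using the same made-up toy data, counting misclassification errors for a candidate boundary takes one line. The badly placed boundary below (again hand-picked for illustration) puts the - sample at (6, 5) on the + side:

    import numpy as np

    X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0],
                  [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
    y = np.array([+1, +1, +1, -1, -1, -1])

    # A badly placed boundary, shifted too far toward the - class
    w, b = np.array([-1.0, -1.0]), 12.0

    # A sample is misclassified when the sign of w . x + b disagrees with
    # its label
    errors = np.sum(np.sign(X @ w + b) != y)
    print('misclassification errors:', errors)  # 1: the - sample at (6, 5)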

We know from the previous chapter that, once training is complete, the classification model is no longer modified. This means that the classifier has to predict the target label of new data points using the decision boundary it obtained during training.
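As a minimal sketch of this train-then-predict workflow (assuming scikit-learn is installed; the toy data and the query point are made up), notice that prediction does nothing but check which side of the now-fixed boundary the new point falls on:

    import numpy as np
    from sklearn.svm import SVC

    X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0],
                  [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
    y = np.array([+1, +1, +1, -1, -1, -1])

    # Training fixes the decision boundary...
    clf = SVC(kernel='linear').fit(X, y)

    # ...and prediction only consults that fixed boundary; the model itself
    # is no longer modified
    new_point = np.array([[4.5, 4.0]])  # a made-up '?' between the classes
    print(clf.predict(new_point))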

In other words, during testing, we want to know which class the new data point (the ? in the following diagram) belongs to, based on the decision boundary we learned during the training phase. The diagram shows how the learned decision boundary is used to predict the target label of a new data point:

You can see why this is usually a tricky problem. If the location of the question mark were further to the left, we would be certain that the corresponding target label is +. However, in this case, there are several ways to draw a decision boundary so that all the + samples are to the left of it and all the - samples are to the right of it, as illustrated in this diagram:

The preceding diagram shows multiple decision boundaries. Which decision boundary would you choose?

All three decision boundaries get the job done: they split the training data perfectly into + and - subsets without making a single misclassification error. However, depending on the decision boundary we choose, the (?) would come to lie either to the left of it (dotted and solid lines), in which case it would be assigned to the + class, or to the right of it (dashed line), in which case it would be assigned to the - class.

This is what SVMs are really good at. An SVM would most likely pick the solid line because this is the decision boundary that maximizes the margin between the data points from the + and - classes. This is illustrated in the following diagram:

The preceding diagram shows an example of a decision boundary that an SVM would learn.
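As a minimal sketch of this idea (assuming scikit-learn, and reusing the made-up toy data), a linear SVM with a large C value behaves like a hard-margin classifier, and the width of the margin it maximizes can be read off the learned weights as 2 / ||w||:

    import numpy as np
    from sklearn.svm import SVC

    X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0],
                  [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
    y = np.array([+1, +1, +1, -1, -1, -1])

    # A large C approximates a hard margin: no training errors are tolerated
    clf = SVC(kernel='linear', C=1e6).fit(X, y)

    w = clf.coef_[0]       # the learned boundary is the line w . x + b = 0
    b = clf.intercept_[0]
    print('margin width:', 2.0 / np.linalg.norm(w))  # what the SVM maximizes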

It turns out that, in order to find the maximal margin, it is only necessary to consider the data points that lie on the class margins. These data points are called the support vectors, and that is where SVMs get their name.
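Continuing the previous sketch, the fitted classifier exposes these points directly; with the made-up toy data, the support vectors are typically the training samples closest to the other class:

    import numpy as np
    from sklearn.svm import SVC

    X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0],
                  [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
    y = np.array([+1, +1, +1, -1, -1, -1])

    clf = SVC(kernel='linear', C=1e6).fit(X, y)

    # Only these samples pin down the maximal-margin boundary; moving any
    # other training point (without crossing the margin) would not change it
    print(clf.support_vectors_)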