Implementing nonlinear SVMs

In order to test some of the SVM kernels we just talked about, we will return to our code sample mentioned earlier. We want to repeat the process of building and training the SVM on the dataset generated earlier, but this time, we want to use a whole range of different kernels:

In [13]: kernels = [cv2.ml.SVM_LINEAR, cv2.ml.SVM_INTER,
... cv2.ml.SVM_SIGMOID, cv2.ml.SVM_RBF]

Do you remember what all of these stand for? cv2.ml.SVM_LINEAR is the linear kernel, cv2.ml.SVM_INTER the histogram intersection kernel, cv2.ml.SVM_SIGMOID the sigmoid kernel, and cv2.ml.SVM_RBF the Gaussian (radial basis function) kernel.

Setting a different SVM kernel is relatively simple. We take an entry from the kernels list and pass it to the setKernel method of the SVM class. That's all.

The laziest way to repeat things is to use a for loop as shown here:

In [14]: for idx, kernel in enumerate(kernels):

Then the steps are as follows:

  1. Create the SVM and set the kernel method. Note that the kernel parameters are left at their default values:
...          svm = cv2.ml.SVM_create()
...          svm.setKernel(kernel)
  2. Train the classifier:
...          svm.train(X_train, cv2.ml.ROW_SAMPLE, y_train)
  3. Score the model using scikit-learn's metrics module imported previously:
...          _, y_pred = svm.predict(X_test)
...          accuracy = metrics.accuracy_score(y_test, y_pred)
  4. Plot the decision boundary in a 2 x 2 subplot (remember that subplots in Matplotlib are 1-indexed to mimic MATLAB, so we have to call idx + 1):
...          plt.subplot(2, 2, idx + 1)
...          plot_decision_boundary(svm, X_test, y_test)
...          plt.title('accuracy = %.2f' % accuracy)

The result looks like this:

Let's break the preceding results down step by step.

First, we find that the linear kernel (top-left panel) still looks like the one in the earlier plot. We now realize that it's also the only version of the SVM that produces a straight line as a decision boundary (although cv2.ml.SVM_C produces results almost identical to cv2.ml.SVM_LINEAR).

The histogram intersection kernel (top-right panel) allows for a more complex decision boundary. However, this did not improve our generalization performance (accuracy is still at 80 percent).

Although the sigmoid kernel (bottom-left panel) allows for a nonlinear decision boundary, the boundary it chose here is really poor, leading to only 25% accuracy.

On the other hand, the RBF kernel (bottom-right panel) was able to improve our performance to 90% accuracy. It did so by having the decision boundary wrap around the lowest red dot and reach up to put the two leftmost blue dots into the blue zone. It still makes two mistakes, but it definitely draws the best decision boundary we have seen to date! Also, note that the RBF kernel is the only one that narrows down the blue zone in the lower two corners.
