Multiclass classification with support vector machines

Just like logistic regression, the support vector machine is, as we have seen, inherently a binary classifier: it is designed to distinguish between two classes. Of course, we often face situations with a greater number of classes, such as classifying different plant species based on a variety of physical characteristics. One way to handle this is the one versus all approach. Here, if we have K classes, we train K SVM classifiers, each of which attempts to distinguish one particular class from all the rest. To classify a new observation, we assign it to the class whose classifier places it farthest on the positive side of its separating hyperplane, that is, the class for which the observation has the greatest signed distance from the decision boundary. More formally, we pick the class for which our linear feature combination attains the maximum value across all the different classifiers.
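The one versus all decision rule can be sketched in a few lines of R. This is purely illustrative: the decision values below are made up, standing in for the signed distances that K = 3 trained binary SVMs might produce for a single new observation.

```r
# Hypothetical signed distances of one observation from each of the
# K = 3 separating hyperplanes (one binary SVM per class).
decision_values <- c(setosa = -1.2, versicolor = 0.4, virginica = 1.7)

# One versus all: pick the class whose classifier places the observation
# farthest on the positive side of its hyperplane.
predicted_class <- names(which.max(decision_values))
predicted_class
```

Here the observation would be assigned to the class with the largest decision value, since that classifier is the most confident that the observation belongs to its positive class.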

An alternative approach is known as the (balanced) one versus one approach. Here, we train a classifier for every possible pair of output classes, giving K(K - 1) / 2 classifiers for K classes. We then classify our observation with each of these classifiers and tally up a vote for the winning class each time. Finally, we pick the class that receives the most votes. This latter approach is what is implemented by the svm() function in the e1071 package, so we can use this function directly when we have a problem with multiple classes.
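As a minimal sketch, assuming the e1071 package is installed, we can fit a multiclass SVM on the built-in iris data set, which has three species. The one versus one voting scheme is applied internally by svm(); no extra configuration is needed.

```r
# A minimal multiclass SVM sketch; assumes e1071 is installed and uses
# the built-in iris data set (three classes in the Species column).
library(e1071)

# svm() detects that Species has more than two levels and trains
# one-versus-one classifiers internally, combining them by voting.
model <- svm(Species ~ ., data = iris)

# Predict on the training data and inspect the confusion matrix.
predictions <- predict(model, iris)
table(predictions, iris$Species)
```

Note that evaluating on the training data, as done here for brevity, gives an optimistic picture of performance; in practice we would assess the model on a held-out test set.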
