Support vector machines

Support vector machines, commonly known as SVMs, are another class of machine learning algorithm used to classify data into one of two categories by means of a hyperplane, a linear boundary that demarcates the separation between points.

For instance, given a set of black and white points on an x-y plane, we can find multiple lines that separate them. The line, in this case, represents the function that determines which category each point belongs to. In the following image, lines H1 and H2 both separate the points accurately. How, then, can we determine which of H1 and H2 is the optimal line?

Intuitively, we can say that the line closest to the points - for instance, the vertical line H1 - might not be the optimal line to separate the points. Because the line lies so close to the points, and is therefore too specific to the given dataset, a new point may be misclassified if it falls even slightly to the right or left of the line. In other words, the line is too sensitive to small changes in the data (which could be due to stochastic or deterministic noise, such as imperfections in the data).

On the other hand, the line H2 separates the data while maintaining the maximum possible distance from the points closest to it. Slight imperfections in the data are unlikely to affect the classification of the points to the extent that they might with line H1. This, in essence, is the principle of the maximum margin of separation, as shown in the image below.

The points close to the line, also known as the hyperplane, are known as the 'support vectors' (hence the name). In the image, the points that lie on the dashed line are therefore the support vectors.
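
To make this concrete, the following minimal sketch fits a maximum-margin classifier with svm() from the e1071 package, using a linear kernel on a small synthetic dataset (the variable names and parameter values here are purely illustrative, not taken from the example that follows later), and then inspects which points were chosen as support vectors:

library(e1071) 
set.seed(123) 

# Two well-separated clusters of 2D points (synthetic, for illustration only) 
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2), 
           matrix(rnorm(40, mean = 4), ncol = 2)) 
y <- factor(rep(c("black", "white"), each = 20)) 
toy <- data.frame(x1 = x[, 1], x2 = x[, 2], class = y) 

# A linear kernel yields the maximum-margin separating hyperplane 
svm_linear <- svm(class ~ ., data = toy, kernel = "linear", cost = 1) 

svm_linear$index   # row numbers of the support vectors in 'toy' 
svm_linear$SV      # their (scaled) coordinates 
plot(svm_linear, toy) 

In e1071, the index component holds the row numbers of the support vectors and SV holds their (scaled) coordinates; only these boundary points determine the fitted hyperplane.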

In the real world, however, not all points may be 'linearly separable'. To handle such cases, SVMs leverage a concept known as the 'kernel trick': points that are not linearly separable in the original space can be projected, or mapped, onto a higher-dimensional surface where they become separable. For example, given a set of points on a 2D x-y plane that are not linearly separable, it may be possible to separate them by projecting the points onto a 3-dimensional space. In the following image, the points colored in red could not be separated by a line in 2D, but once mapped to a 3-dimensional surface they can be separated by a hyperplane:
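
As a rough illustration of the kernel trick with e1071 (again on a synthetic dataset; the gamma and cost values below are assumptions rather than tuned choices), a radial (RBF) kernel can separate an inner cluster from a surrounding ring, a configuration that no straight line in the original 2D space could split:

library(e1071) 
set.seed(123) 

# An inner disc surrounded by an outer ring, not linearly separable in 2D 
n <- 100 
r <- c(runif(n, 0, 1), runif(n, 2, 3)) 
theta <- runif(2 * n, 0, 2 * pi) 
circles <- data.frame(x1 = r * cos(theta), 
                      x2 = r * sin(theta), 
                      class = factor(rep(c("inner", "outer"), each = n))) 

# The radial kernel implicitly maps the points to a higher-dimensional space 
# in which a separating hyperplane exists 
svm_radial <- svm(class ~ ., data = circles, kernel = "radial", gamma = 1, cost = 1) 
table(predicted = fitted(svm_radial), actual = circles$class) 
plot(svm_radial, circles) 

The table of fitted versus actual classes shows how well the radial kernel recovers the two groups that a linear boundary could not.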

There are several packages in R that let users leverage SVMs, such as kernlab, e1071, klaR, and others. Here, we illustrate the use of the SVM implementation from the e1071 package, as shown below:

library(mlbench) 
library(caret) 
library(e1071) 
set.seed(123) 

# Load the Pima Indians Diabetes dataset from the mlbench package 
data("PimaIndiansDiabetes") 
diab <- PimaIndiansDiabetes 

# Create an 80/20 train-test split, stratified on the outcome variable 
train_ind <- createDataPartition(diab$diabetes, p = 0.8, list = FALSE, times = 1) 

training_diab <- diab[train_ind, ] 
test_diab <- diab[-train_ind, ] 

# Fit an SVM (radial kernel by default) to predict diabetes status 
svm_model <- svm(diabetes ~ ., data = training_diab) 

# Visualise the decision regions over the glucose and mass variables 
plot(svm_model, training_diab, glucose ~ mass) 

# The plot below illustrates the areas that are classified 'positive' and 'negative'
# Creating and evaluating the Confusion Matrix for the SVM model

svm_predicted <- predict(svm_model, test_diab[,-ncol(test_diab)]) 
confusionMatrix(svm_predicted, test_diab$diabetes)

Confusion Matrix and Statistics

          Reference
Prediction neg pos
       neg  93  26
       pos   7  27

               Accuracy : 0.7843
                 95% CI : (0.7106, 0.8466)
    No Information Rate : 0.6536
    P-Value [Acc > NIR] : 0.0003018

                  Kappa : 0.4799
 Mcnemar's Test P-Value : 0.0017280

            Sensitivity : 0.9300
            Specificity : 0.5094
         Pos Pred Value : 0.7815
         Neg Pred Value : 0.7941
             Prevalence : 0.6536
         Detection Rate : 0.6078
   Detection Prevalence : 0.7778
      Balanced Accuracy : 0.7197

       'Positive' Class : neg