Support vector machines

Support vector machines, commonly known as SVMs, are another class of machine learning algorithm used to classify data into one of two categories by means of a hyperplane, a linear boundary that demarcates the separation between points.

For instance, given a set of black and white points on an x-y plane, we can find multiple lines that separate them. The line, in this case, represents the function that determines which category each point belongs to. In the following image, lines H1 and H2 both separate the points accurately. How, then, can we determine which of H1 and H2 is the optimal line?

Intuitively, we can say that the line closest to the points - for instance, the vertical line H1 - might not be the optimal line to separate the points. Because the line lies so close to the points, and is therefore too specific to the given dataset, a new point may be misclassified if it falls even slightly to the right or left of the line. In other words, the line is too sensitive to small changes in the data (which could be due to stochastic or deterministic noise, such as imperfections in the data).

On the other hand, the line H2 separates the data while maintaining the maximum possible distance from the points closest to it. Slight imperfections in the data are unlikely to affect the classification of the points to the extent that they might with line H1. This, in essence, is the principle of the maximum margin of separation, as shown in the image below.

The points close to the line, also known as the hyperplane, are known as the 'support vectors' (hence the name). In the image, the points that lie on the dashed line are therefore the support vectors.
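
To make this concrete, the following minimal sketch fits a maximum-margin classifier with svm() from the e1071 package, using a linear kernel on a small synthetic dataset (the variable names and parameter values here are purely illustrative, not taken from the example that follows later), and then inspects which points were chosen as support vectors:

library(e1071) 
set.seed(123) 

# Two well-separated clusters of 2D points (synthetic, for illustration only) 
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2), 
           matrix(rnorm(40, mean = 4), ncol = 2)) 
y <- factor(rep(c("black", "white"), each = 20)) 
toy <- data.frame(x1 = x[, 1], x2 = x[, 2], class = y) 

# A linear kernel yields the maximum-margin separating hyperplane 
svm_linear <- svm(class ~ ., data = toy, kernel = "linear", cost = 1) 

svm_linear$index   # row numbers of the support vectors in 'toy' 
svm_linear$SV      # their (scaled) coordinates 
plot(svm_linear, toy) 

In e1071, the index component holds the row numbers of the support vectors and SV holds their (scaled) coordinates; only these boundary points determine the fitted hyperplane.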

In the real world, however, not all points may be 'linearly separable'. To handle such cases, SVMs leverage a concept known as the 'kernel trick': points that are not linearly separable in the original space can be projected, or mapped, onto a higher-dimensional surface where they become separable. For example, given a set of points on a 2D x-y plane that are not linearly separable, it may be possible to separate them by projecting the points onto a 3-dimensional space. In the following image, the points colored in red could not be separated by a line in 2D, but once mapped to a 3-dimensional surface they can be separated by a hyperplane:
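
As a rough illustration of the kernel trick with e1071 (again on a synthetic dataset; the gamma and cost values below are assumptions rather than tuned choices), a radial (RBF) kernel can separate an inner cluster from a surrounding ring, a configuration that no straight line in the original 2D space could split:

library(e1071) 
set.seed(123) 

# An inner disc surrounded by an outer ring, not linearly separable in 2D 
n <- 100 
r <- c(runif(n, 0, 1), runif(n, 2, 3)) 
theta <- runif(2 * n, 0, 2 * pi) 
circles <- data.frame(x1 = r * cos(theta), 
                      x2 = r * sin(theta), 
                      class = factor(rep(c("inner", "outer"), each = n))) 

# The radial kernel implicitly maps the points to a higher-dimensional space 
# in which a separating hyperplane exists 
svm_radial <- svm(class ~ ., data = circles, kernel = "radial", gamma = 1, cost = 1) 
table(predicted = fitted(svm_radial), actual = circles$class) 
plot(svm_radial, circles) 

The table of fitted versus actual classes shows how well the radial kernel recovers the two groups that a linear boundary could not.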

There are several packages in R that let users leverage SVMs, such as kernlab, e1071, klaR, and others. Here, we illustrate the use of the SVM implementation from the e1071 package, as shown below:

library(mlbench) 
library(caret) 
library(e1071) 
set.seed(123) 

# Load the Pima Indians Diabetes dataset from the mlbench package 
data("PimaIndiansDiabetes") 
diab <- PimaIndiansDiabetes 

# Create an 80/20 train-test split, stratified on the outcome variable 
train_ind <- createDataPartition(diab$diabetes, p = 0.8, list = FALSE, times = 1) 

training_diab <- diab[train_ind, ] 
test_diab <- diab[-train_ind, ] 

# Fit an SVM (radial kernel by default) to predict diabetes status 
svm_model <- svm(diabetes ~ ., data = training_diab) 

# Visualise the decision regions over the glucose and mass variables 
plot(svm_model, training_diab, glucose ~ mass) 

# The plot below illustrates the areas that are classified 'positive' and 'negative'
# Creating and evaluating the Confusion Matrix for the SVM model

svm_predicted <- predict(svm_model, test_diab[,-ncol(test_diab)]) 
confusionMatrix(svm_predicted, test_diab$diabetes)

Confusion Matrix and Statistics

          Reference
Prediction neg pos
       neg  93  26
       pos   7  27

               Accuracy : 0.7843
                 95% CI : (0.7106, 0.8466)
    No Information Rate : 0.6536
    P-Value [Acc > NIR] : 0.0003018

                  Kappa : 0.4799
 Mcnemar's Test P-Value : 0.0017280

            Sensitivity : 0.9300
            Specificity : 0.5094
         Pos Pred Value : 0.7815
         Neg Pred Value : 0.7941
             Prevalence : 0.6536
         Detection Rate : 0.6078
   Detection Prevalence : 0.7778
      Balanced Accuracy : 0.7197

       'Positive' Class : neg