Classification algorithms can be used to detect outliers. The usual strategy is to train a one-class model on only the normal data points in the training dataset. Once the model is set up, any data point that is not accepted by the model is marked as an outlier.
The OCSVM (One-Class SVM) algorithm projects the input data into a high-dimensional feature space and, in the process, iteratively finds the maximum-margin hyperplane. The hyperplane, defined in a Gaussian reproducing kernel Hilbert space, best separates the training data from the origin. With \nu \in (0, 1] acting as an upper bound on the fraction of outliers in the training data, the solution of OCSVM can be represented by the solution of the following optimization problem (subject to \langle w, \Phi(x_i) \rangle \ge \rho - \xi_i and \xi_i \ge 0):

\min_{w,\,\xi,\,\rho} \; \frac{1}{2}\|w\|^2 + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho
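As a minimal illustration (not part of the book's bundled code), a one-class SVM of this kind can be fitted in R with the e1071 package; the toy data and the nu and gamma values below are assumptions chosen only for this sketch:

library(e1071)

# Train a one-class SVM on "normal" points only; any point the model
# does not accept is treated as an outlier (nu and gamma are illustrative)
set.seed(1)
normal <- matrix(rnorm(200 * 2), ncol = 2)                  # normal training data
test   <- rbind(matrix(rnorm(20 * 2), ncol = 2),            # new normal points
                matrix(rnorm(5 * 2, mean = 4), ncol = 2))   # obvious outliers

ocsvm <- svm(normal, type = "one-classification",
             kernel = "radial", nu = 0.05, gamma = 0.5)

accepted <- predict(ocsvm, test)   # TRUE for points accepted by the model
which(!accepted)                   # indices of the points marked as outliers

Here, nu plays the role of the bound on the fraction of outliers described above; raising it makes the model reject more points.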
This algorithm is based on the k-Nearest Neighbor (kNN) algorithm, extended with a few formulas for scoring outliers.
The local density is denoted as follows:

\hat{\rho}(x) = \frac{1}{n \, V\!\left(\|x - NN^{tr}(x)\|\right)}

Here, NN^{tr}(x) is the nearest neighbor of x in the training set, n is the number of training objects, and V(r) is the volume of a hypersphere with radius r.

The distance between the test object, x, and its nearest neighbor in the training set, NN^{tr}(x), is defined like this:

d_1(x) = \|x - NN^{tr}(x)\|

The distance between this nearest neighbor (NN^{tr}(x)) and its own nearest neighbor in the training set (NN^{tr}(NN^{tr}(x))) is defined as follows:

d_2(x) = \|NN^{tr}(x) - NN^{tr}(NN^{tr}(x))\|

A data object is marked as an outlier once d_1(x) / d_2(x) > \theta for a chosen threshold \theta (commonly \theta = 1), or, in another format, x is marked as an outlier when

\frac{\|x - NN^{tr}(x)\|}{\|NN^{tr}(x) - NN^{tr}(NN^{tr}(x))\|} > \theta
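The following is a minimal sketch of this nearest-neighbor distance ratio in R, using the FNN package; the toy data and the threshold of 1 are assumptions made only for illustration:

library(FNN)

set.seed(2)
train <- matrix(rnorm(300 * 2), ncol = 2)       # training (normal) data
test  <- rbind(matrix(rnorm(10 * 2), ncol = 2),
               c(6, 6))                         # one clear outlier

# d1: distance from each test object to its nearest neighbor in the training set
nn1 <- get.knnx(train, test, k = 1)
d1  <- nn1$nn.dist[, 1]

# d2: distance from that training neighbor to its own nearest training neighbor
nn_train <- get.knn(train, k = 1)
d2 <- nn_train$nn.dist[nn1$nn.index[, 1], 1]

# Mark a test object as an outlier when d1/d2 exceeds the threshold (here 1)
ratio <- d1 / d2
which(ratio > 1)

A larger threshold makes the detector more conservative, flagging only objects that are much farther from the training data than the training data is from itself.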
Look up the R code file, ch_07_classification_based.R, in the bundle of R code for the previously mentioned algorithms. The code can be tested with the following command:

> source("ch_07_classification_based.R")
Web server performance measurements are very important, both to the business and to operating system management. These measurements can take the form of CPU usage, network bandwidth, storage, and so on.
The dataset can come from various sources, such as benchmark data and logs. The types of outliers that appear during the monitoring of a web server are point outliers, contextual outliers, and collective outliers.
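As a hypothetical illustration of the point-outlier case (the simulated CPU-usage values and parameters below are invented for this sketch, not taken from any real monitoring dataset), the one-class SVM approach from this section could be applied like this:

library(e1071)

set.seed(3)
cpu_normal <- matrix(rnorm(500, mean = 35, sd = 5), ncol = 1)             # typical load, in %
cpu_new    <- matrix(c(rnorm(20, mean = 35, sd = 5), 95, 99), ncol = 1)   # two sudden spikes

model    <- svm(cpu_normal, type = "one-classification", nu = 0.02)
accepted <- predict(model, cpu_new)
cpu_new[!accepted, 1]   # measurements rejected by the model: the point outliers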