Monitoring the performance of the web server and classification-based methods

Classification algorithms can be used to detect outliers. The usual strategy is to train a one-class model on only the normal data points in the training dataset. Once the model is trained, any data point that it does not accept is marked as an outlier.
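The train-on-normal, reject-everything-else workflow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class CentroidOneClassModel and the helper flag_outliers are hypothetical names, and the centroid-plus-radius rule is a deliberately simple stand-in for a real one-class model, not one of the algorithms described in this section.

```python
import math


class CentroidOneClassModel:
    """Hypothetical stand-in for a one-class model.

    It accepts any point that lies no farther from the training centroid
    than the farthest normal training point.
    """

    def fit(self, normal_points):
        n = len(normal_points)
        dims = len(normal_points[0])
        # Centroid of the normal training data.
        self.center = [sum(p[d] for p in normal_points) / n for d in range(dims)]
        # Acceptance radius: the largest training distance from the centroid.
        self.radius = max(math.dist(p, self.center) for p in normal_points)
        return self

    def accepts(self, point):
        return math.dist(point, self.center) <= self.radius


def flag_outliers(model, points):
    """Return the points that the trained model does not accept."""
    return [p for p in points if not model.accepts(p)]


# Train on normal data only, then screen new observations.
normal = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2)]
model = CentroidOneClassModel().fit(normal)
outliers = flag_outliers(model, [(1.0, 1.05), (5.0, 5.0)])
print(outliers)
```

The same two-step shape (fit on normal data, flag whatever is rejected) applies when the stand-in is swapped for a real one-class classifier.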

The OCSVM algorithm

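The OCSVM (One-Class SVM) algorithm projects the input data into a high-dimensional feature space and, in that space, iteratively finds the maximum-margin hyperplane. The hyperplane, defined in a Gaussian reproducing kernel Hilbert space, is the one that best separates the training data from the origin. Given a parameter $\nu \in (0, 1]$, which upper-bounds the fraction of training points treated as outliers, the solution of OCSVM is the solution of the following optimization problem:

$$\min_{w,\, \xi,\, \rho} \; \frac{1}{2}\lVert w \rVert^2 + \frac{1}{\nu n}\sum_{i=1}^{n} \xi_i - \rho$$

subject to

$$\langle w, \Phi(x_i) \rangle \ge \rho - \xi_i \quad \text{and} \quad \xi_i \ge 0, \qquad i = 1, \dots, n$$

Here, $n$ is the number of training points, $\Phi$ is the feature map induced by the kernel, $\xi_i$ are slack variables, and $\rho$ is the offset of the hyperplane from the origin.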
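Once the hyperplane has been found, a new point x is classified by the sign of the kernel expansion f(x) = sum_i alpha_i k(x_i, x) - rho, where k is the Gaussian kernel. The Python sketch below only evaluates that decision function. The support vectors, the coefficients alphas, the offset rho, and the kernel width gamma are hand-picked, hypothetical values for illustration; in practice they come from a solver for the optimization problem.

```python
import math


def gaussian_kernel(a, b, gamma):
    """Gaussian (RBF) kernel: k(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * math.dist(a, b) ** 2)


def ocsvm_decision(x, support_vectors, alphas, rho, gamma):
    """Return +1 if x lies on the training-data side of the hyperplane, else -1."""
    score = sum(
        alpha * gaussian_kernel(sv, x, gamma)
        for sv, alpha in zip(support_vectors, alphas)
    )
    return 1 if score - rho >= 0 else -1


# Hypothetical values, NOT the output of a trained model.
support_vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
alphas = [0.4, 0.3, 0.3]  # assumed dual coefficients (chosen to sum to 1)
rho = 0.3                 # assumed offset
gamma = 0.5               # assumed kernel width

near_label = ocsvm_decision((0.3, 0.3), support_vectors, alphas, rho, gamma)
far_label = ocsvm_decision((4.0, 4.0), support_vectors, alphas, rho, gamma)
print(near_label, far_label)
```

A point close to the support vectors gets a large kernel sum and is accepted; a distant point gets a kernel sum near zero, falls below rho, and is flagged as an outlier.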

The one-class nearest neighbor algorithm

This algorithm is based on the k-nearest neighbor (k-NN) algorithm, extended with a few additional formulas.

The local density of an object $x$ is denoted as follows, where $n$ is the number of training objects and $V_k$ is the volume of the hypersphere centered at $x$ that reaches its $k$-th nearest neighbor in the training set, $NN_k^{tr}(x)$:

$$p_{NN}(x) = \frac{k / n}{V_k\left(\lVert x - NN_k^{tr}(x) \rVert\right)}$$

The distance between the test object, $x$, and its nearest neighbor in the training set, $NN^{tr}(x)$, is defined like this:

$$d_1 = \lVert x - NN^{tr}(x) \rVert$$

The distance between this nearest neighbor ($NN^{tr}(x)$) and its own nearest neighbor in the training set ($NN^{tr}(NN^{tr}(x))$) is defined as follows:

$$d_2 = \lVert NN^{tr}(x) - NN^{tr}(NN^{tr}(x)) \rVert$$

A data object is marked as an outlier once $d_1 > d_2$ or, in another format, once $\frac{d_1}{d_2} > 1$.
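The decision rule can be sketched in Python as follows. This is a minimal illustration, not the book's R implementation: the function names nearest_neighbor and is_outlier and the variable names d1 and d2 (the two nearest-neighbor distances described above) are assumptions, Euclidean distance is assumed, and the training set must hold at least two points.

```python
import math


def nearest_neighbor(point, candidates):
    """Return the candidate closest to point (Euclidean distance assumed)."""
    return min(candidates, key=lambda c: math.dist(point, c))


def is_outlier(x, training_set):
    """One-class nearest neighbor rule: flag x when d1 > d2."""
    nn = nearest_neighbor(x, training_set)
    d1 = math.dist(x, nn)  # test object to its nearest training neighbor
    # Exclude that neighbor itself before looking for its own nearest neighbor.
    others = [p for p in training_set if p is not nn]
    nn_of_nn = nearest_neighbor(nn, others)
    d2 = math.dist(nn, nn_of_nn)  # neighbor to its nearest training neighbor
    # Comparing d1 > d2 instead of d1 / d2 > 1 avoids dividing by zero
    # when the training set contains duplicate points.
    return d1 > d2


training_set = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
inside = is_outlier((0.5, 0.5), training_set)
outside = is_outlier((3.0, 3.0), training_set)
print(inside, outside)
```

The point in the middle of the training square is closer to its neighbor than that neighbor is to the rest of the training set, so it is accepted; the distant point is not.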

The R implementation

The R code for the previously mentioned algorithms is in the file ch_07_ classification _based.R in the bundle of R code. The code can be tested with the following command:

> source("ch_07_ classification _based.R")

Monitoring the performance of the web server

Web server performance measurements are important both to the business and to operating system management. These measurements can take the form of CPU usage, network bandwidth, storage, and so on.

The data come from various sources, such as benchmark data and logs. The types of outliers that appear when monitoring a web server are point outliers, contextual outliers, and collective outliers.
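To make the difference between a point outlier and a contextual outlier concrete, the following Python sketch screens bandwidth readings against two baselines: all readings, and only the readings from the same hour of the day. Everything here is an assumption for illustration: the readings are synthetic, the classify function and its threshold of 3 are hypothetical, and the z-score test is a simple stand-in rather than one of the classification-based methods above. Collective outliers, which involve a run of related readings, are not covered.

```python
from statistics import mean, stdev

# Synthetic (hour_of_day, bandwidth_mbps) readings; the values are made up.
history = [
    (3, 10), (3, 12), (3, 11), (3, 9), (3, 13),
    (14, 70), (14, 75), (14, 72), (14, 68), (14, 74),
]


def z_score(value, sample):
    """Number of sample standard deviations between value and the sample mean."""
    return (value - mean(sample)) / stdev(sample)


def classify(hour, bandwidth, history, threshold=3.0):
    """Label a reading as a point outlier, a contextual outlier, or normal."""
    all_values = [v for _, v in history]
    same_hour = [v for h, v in history if h == hour]
    # Point outlier: unusual against every reading, whatever the context.
    if abs(z_score(bandwidth, all_values)) > threshold:
        return "point outlier"
    # Contextual outlier: plausible overall, but unusual for this hour.
    if abs(z_score(bandwidth, same_hour)) > threshold:
        return "contextual outlier"
    return "normal"


spike = classify(14, 500, history)       # far beyond anything seen
night_load = classify(3, 70, history)    # daytime level at 3 a.m.
typical = classify(14, 71, history)      # ordinary afternoon reading
print(spike, night_load, typical)
```

A reading of 70 is ordinary in the afternoon but suspicious at 3 a.m., which is why the context (here, the hour) has to be part of the comparison.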
