Choosing between the linear and RBF kernel

The rule of thumb, of course, is linear separability. In practice, however, separability is usually very hard to determine unless you have sufficient prior knowledge or the features are low-dimensional (1 to 3 dimensions) and can be visualized.
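In two dimensions, separability can be checked directly. The following is a minimal sketch, assuming scikit-learn is installed, that fits a linear and an RBF support vector classifier on the four points of the XOR pattern. The values of C and gamma are illustrative assumptions, not tuned settings.

```python
from sklearn.svm import SVC

# The four XOR points: the label is 1 when exactly one input is 1.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# C and gamma are illustrative assumptions, not tuned values.
linear_svm = SVC(kernel="linear", C=10.0).fit(X, y)
rbf_svm = SVC(kernel="rbf", gamma=1.0, C=10.0).fit(X, y)

# Accuracy on the training points themselves.
linear_acc = linear_svm.score(X, y)
rbf_acc = rbf_svm.score(X, y)

print(f"linear kernel accuracy: {linear_acc:.2f}")
print(f"RBF kernel accuracy:    {rbf_acc:.2f}")
```

No straight line can classify more than three of the four XOR points correctly, so the linear kernel cannot reach full accuracy here, while the RBF kernel can separate all four.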

As examples of prior knowledge, text data is often linearly separable, whereas data generated by the XOR function is not. Let's look at three scenarios in which the linear kernel is favored over RBF:

Case 1: both the number of features and the number of instances are large (more than 10^4 or 10^5). Because the dimension of the feature space is already high enough, the additional features produced by the RBF transformation will not improve performance, but they will increase the computational expense. Some examples from the UCI Machine Learning Repository are of this type:

Case 2: the number of features is noticeably large compared to the number of training samples. In addition to the reasons stated in Case 1, the RBF kernel is significantly more prone to overfitting here. Such a scenario occurs in, for example:

Case 3: the number of instances is significantly large compared to the number of features. For a low-dimensional dataset, the RBF kernel will, in general, boost performance by mapping the data to a higher-dimensional space. However, because of its training complexity, it usually stops being efficient on a training set with more than 10^6 or 10^7 samples. Some example datasets include:

Outside these three cases, RBF is practically the first choice.
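One way to check these rules of thumb on a particular dataset is to compare the two kernels with cross-validation. The sketch below assumes scikit-learn is installed and uses a synthetic dataset from make_classification as a stand-in for a "many features, few samples" problem. The dataset sizes, the C value, and the helper name mean_cv_accuracy are hypothetical choices made for illustration, and the scores will differ on real data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a dataset with far more features than samples.
# The sizes are illustrative assumptions.
X, y = make_classification(
    n_samples=80,
    n_features=500,
    n_informative=20,
    n_redundant=0,
    random_state=0,
)


def mean_cv_accuracy(kernel):
    """Return the mean 5-fold cross-validated accuracy for one kernel."""
    # Standardize the features first, since SVMs are sensitive to scale.
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=1.0))
    return cross_val_score(model, X, y, cv=5).mean()


scores = {kernel: mean_cv_accuracy(kernel) for kernel in ("linear", "rbf")}
for kernel, score in scores.items():
    print(f"{kernel:>6}: {score:.3f}")
```

If the two scores are close, the linear kernel is usually the more economical choice, since it is cheaper to train and has fewer hyperparameters to tune.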

The rules for choosing between the linear and RBF kernel can be summarized as follows:

| Case | Linear | RBF |
|---|---|---|
| Expert prior knowledge | If linearly separable | If nonlinearly separable |
| Visualizable data of 1 to 3 dimensions | If linearly separable | If nonlinearly separable |
| Both numbers of features and instances are large | First choice | |
| Features >> Instances | First choice | |
| Instances >> Features | First choice | |
| Others | | First choice |
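The rules summarized above can also be written as a small helper. This is only a sketch: the function name suggest_kernel and the thresholds large, very_large, and ratio are hypothetical. The text gives rough magnitudes (10^4 to 10^5 and 10^6 to 10^7) but no exact cutoffs, and no number for what counts as "much larger", so the factor of 10 is an assumption.

```python
def suggest_kernel(n_samples, n_features, linearly_separable=None,
                   large=10**4, very_large=10**6, ratio=10):
    """Suggest "linear" or "rbf" from the rules of thumb in this section.

    linearly_separable: True or False when prior knowledge or a plot of
        1 to 3 dimensional data settles the question, None when unknown.
    large, very_large, ratio: illustrative thresholds (assumptions).
    """
    # Prior knowledge or visualization decides directly.
    if linearly_separable is not None:
        return "linear" if linearly_separable else "rbf"
    # Case 1: both the number of features and of instances are large.
    if n_samples >= large and n_features >= large:
        return "linear"
    # Case 2: many more features than instances.
    if n_features >= ratio * n_samples:
        return "linear"
    # Case 3: many more instances than features, on a very large training set.
    if n_samples >= very_large and n_samples >= ratio * n_features:
        return "linear"
    # Everything else: RBF is the first choice.
    return "rbf"


print(suggest_kernel(n_samples=100_000, n_features=50_000))
print(suggest_kernel(n_samples=100, n_features=20_000))
print(suggest_kernel(n_samples=5_000_000, n_features=20))
print(suggest_kernel(n_samples=2_000, n_features=10))
```

The first three calls correspond to Cases 1 to 3 and the last one to the "Others" row. The suggestion is a starting point; cross-validation on the actual data remains the final check.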
