How it works...

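The varImpPlot function belongs to the randomForest package and is described in its documentation (caret provides the related varImp function). For a regression forest, the variable importance plot that we used in this recipe shows two measures. On the left is the increase in the MSE (mean squared error), labeled %IncMSE, measured on out-of-bag data when a feature's values are randomly permuted; the larger the increase, the more the model depends on that feature. On the right is the increase in node purity, labeled IncNodePurity, obtained from splits on that feature and averaged over all the trees (a random forest is built from many trees). A pure node is one whose observations have similar values of the target, so a split that raises purity is separating the data into groups with similar outcomes. Good variables rank high on both scales. The rfe function (that we used in Step 5) performs recursive feature elimination: it builds a large model (using random forests) containing all the features, ranks them by variable importance, and then refits the model on progressively smaller subsets, dropping the least important features each time and using resampling to choose the best subset size.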
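As a rough illustration of the two ideas above, the following sketch fits a regression forest, prints and plots both importance measures, and then runs recursive feature elimination. The built-in mtcars dataset, the seed, the number of trees, the candidate subset sizes, and the 5-fold cross-validation are all assumptions chosen for illustration; they are not the data or settings used in the recipe.

```r
# Sketch only: mtcars and every tuning value below are illustrative assumptions.
library(randomForest)
library(caret)

set.seed(42)

# Fit a regression forest; importance = TRUE is required for the %IncMSE measure.
fit <- randomForest(mpg ~ ., data = mtcars, ntree = 500, importance = TRUE)

# For regression, importance() returns two columns:
# %IncMSE (permutation-based) and IncNodePurity (decrease in residual sum of squares).
imp <- importance(fit)
print(imp[order(imp[, "%IncMSE"], decreasing = TRUE), ])

# Draw the two dot charts: %IncMSE on the left, IncNodePurity on the right.
varImpPlot(fit)

# Recursive feature elimination using caret's random forest helpers (rfFuncs),
# evaluated with 5-fold cross-validation over a few candidate subset sizes.
ctrl <- rfeControl(functions = rfFuncs, method = "cv", number = 5)
x <- mtcars[, setdiff(names(mtcars), "mpg")]
y <- mtcars$mpg
result <- rfe(x, y, sizes = c(2, 4, 6, 8), rfeControl = ctrl)

# Summary of performance per subset size, then the features that were retained.
print(result)
predictors(result)
```

Features that appear near the top of both panels of the plot will usually also survive in the subset returned by predictors(result), although the exact ranking can change from run to run because both the forest and the resampling are random.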
