Using random forest with Shark-ML

The random forest algorithm in the Shark-ML library is implemented in the RFClassifier class, and the corresponding trainer is the RFTrainer class. For the random forest implementation, we use the original dataset values without preprocessing. First, we configure the trainer for this type of classifier. The following methods are available for configuration:

  • setNTrees: Set the number of trees.
  • setMinSplit: Set the minimum number of samples a node must contain to be split.
  • setMaxDepth: Set the maximum depth of the tree.
  • setNodeSize: Set the node size threshold at which a node is considered pure.
  • minImpurity: Set the minimum impurity level below which a node is considered pure.
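
To make the impurity threshold more concrete, the following minimal sketch computes the impurity of a node from its per-class sample counts and compares it with a threshold. It assumes Gini impurity as the measure; the giniImpurity and isPureNode functions are hypothetical helpers written for illustration, not part of the Shark-ML API, and the library's internal criterion may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Gini impurity of a node, given the number of samples per class.
// Returns 0 for a node that holds a single class (or no samples at all).
double giniImpurity(const std::vector<std::size_t>& classCounts) {
  std::size_t total = 0;
  for (std::size_t count : classCounts) total += count;
  if (total == 0) return 0.0;

  double sumOfSquares = 0.0;
  for (std::size_t count : classCounts) {
    const double p = static_cast<double>(count) / static_cast<double>(total);
    sumOfSquares += p * p;
  }
  return 1.0 - sumOfSquares;
}

// Hypothetical helper (an assumption, not Shark-ML code): a node is treated
// as pure when its impurity falls below the given threshold.
bool isPureNode(const std::vector<std::size_t>& classCounts,
                double minImpurity) {
  assert(minImpurity >= 0.0);
  return giniImpurity(classCounts) < minImpurity;
}
```

With a threshold such as 1.e-10, a node whose samples all belong to one class is treated as pure, while an evenly mixed two-class node (impurity 0.5) would still be a candidate for splitting.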

After we configure the trainer object, we can use its train method to run the training process. This method takes two parameters: the RFClassifier object to be trained and the ClassificationDataset object that represents the training dataset.

When the training is complete, we can use the classifier object as a function object to evaluate it on other data. For example, if we have a test dataset of the ClassificationDataset type, we can obtain the predicted classes by calling rf(test.inputs()), where rf is the RFClassifier object; the call returns a Data<unsigned int> object that holds the predictions. The following code block illustrates the whole process:

#include <shark/Algorithms/Trainers/RFTrainer.h>
#include <shark/Data/Dataset.h>
#include <shark/ObjectiveFunctions/Loss/ZeroOneLoss.h>
#include <iostream>

using namespace shark;

void RFClassification(const ClassificationDataset& train,
                      const ClassificationDataset& test) {
  // configure the trainer
  RFTrainer<unsigned int> trainer;
  trainer.setNTrees(100);
  trainer.setMinSplit(10);
  trainer.setMaxDepth(10);
  trainer.setNodeSize(5);
  trainer.minImpurity(1.e-10);

  // train the classifier
  RFClassifier<unsigned int> rf;
  trainer.train(rf, train);

  // compute the accuracy on the test set
  ZeroOneLoss<unsigned int> loss;
  Data<unsigned int> predictions = rf(test.inputs());
  double accuracy = 1. - loss.eval(test.labels(), predictions);
  std::cout << "Random Forest accuracy = " << accuracy << std::endl;
}

The output of this sample on the dataset is Random Forest accuracy = 0.971014.
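
The accuracy value in the sample is derived from the zero-one loss. As a rough illustration of that relationship, the following sketch computes the fraction of misclassified samples over plain std::vector containers and subtracts it from one. The zeroOneLoss and accuracy functions here are hypothetical stand-ins written for this example; they are not the Shark-ML ZeroOneLoss class, and they assume that the loss is averaged over all samples.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of predictions that differ from the true labels
// (the zero-one loss averaged over all samples).
double zeroOneLoss(const std::vector<unsigned int>& labels,
                   const std::vector<unsigned int>& predictions) {
  assert(labels.size() == predictions.size() && !labels.empty());
  std::size_t errors = 0;
  for (std::size_t i = 0; i < labels.size(); ++i) {
    if (labels[i] != predictions[i]) ++errors;
  }
  return static_cast<double>(errors) / static_cast<double>(labels.size());
}

// Accuracy is the complement of the averaged zero-one loss.
double accuracy(const std::vector<unsigned int>& labels,
                const std::vector<unsigned int>& predictions) {
  return 1.0 - zeroOneLoss(labels, predictions);
}
```

For instance, with the labels {0, 1, 1, 0} and the predictions {0, 1, 0, 0}, one of four samples is misclassified, so the loss is 0.25 and the accuracy is 0.75.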
