Random forest algorithm

The random forest algorithm builds on the bagging technique. The trees are planted and grown in the following manner:

  • Suppose there are N observations in the training set. For each tree, N samples are drawn from these observations at random and with replacement; this bootstrap sample acts as the training set for that tree.
  • If there are M input features (variables), a subset of m features (with m < M) is drawn at random at each node of the tree, and the best split is chosen from among these m features.
  • Every tree is grown to the largest extent possible.

  • Prediction is based on the aggregation of the results from all the trees. In the case of classification, the aggregation method is majority voting, whereas in the case of regression it is the average of all the results, as sketched below.

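To make these steps concrete, here is a minimal from-scratch sketch, assuming scikit-learn's DecisionTreeClassifier is available; the function names fit_random_forest and predict_random_forest and parameters such as n_trees and m_features are illustrative choices, not part of any library API.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_random_forest(X, y, n_trees=10, m_features="sqrt", random_state=0):
    """Grow n_trees fully-grown trees on bootstrap samples of (X, y)."""
    rng = np.random.RandomState(random_state)
    n_samples = X.shape[0]
    trees = []
    for _ in range(n_trees):
        # Step 1: N observations drawn at random with replacement (bootstrap).
        idx = rng.randint(0, n_samples, size=n_samples)
        # Steps 2-3: only m of the M features are considered at each node
        # (max_features), and the tree is grown to the largest extent
        # possible (no depth limit).
        tree = DecisionTreeClassifier(max_features=m_features,
                                      random_state=rng.randint(1 << 30))
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def predict_random_forest(trees, X):
    # Step 4: aggregate by majority vote across all trees (classification).
    votes = np.stack([tree.predict(X) for tree in trees])
    return np.apply_along_axis(
        lambda col: np.bincount(col.astype(int)).argmax(), 0, votes)
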
Let's work through a case study, since that will help us understand this concept in more detail. We will use breast cancer data.
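
As a preview of what such a case study can look like, here is a minimal sketch assuming the breast cancer dataset bundled with scikit-learn; the split ratio and hyperparameter values are placeholder choices, not necessarily those used in the walkthrough that follows.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the data and hold out a test set.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# n_estimators = number of trees; max_features="sqrt" draws m out of M
# features at each node; bootstrap=True resamples N observations with
# replacement, as described above.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                            bootstrap=True, random_state=42)
rf.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, rf.predict(X_test)))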
