Learning parameters

Gradient boosting models typically use decision trees to capture feature interactions, and the size of the individual trees is the most important tuning parameter. XGBoost and CatBoost set the max_depth default to 6. In contrast, LightGBM uses a default num_leaves value of 31, which corresponds to five levels for a balanced tree, but imposes no constraints on the number of levels. To avoid overfitting, num_leaves should be lower than 2^max_depth. For example, for a well-performing max_depth value of 7, you would set num_leaves to 70–80 rather than 2^7 = 128, or directly constrain max_depth.
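The arithmetic behind this rule of thumb can be sketched in a few lines of Python. This is only an illustration: the helper names and the 0.6 scaling fraction are assumptions made for this example and are not part of any library. Only the num_leaves and max_depth keys correspond to actual LightGBM parameter names.

```python
def max_leaves_for_depth(max_depth: int) -> int:
    """Largest number of leaves a binary tree with max_depth levels of splits can have."""
    return 2 ** max_depth


def suggested_num_leaves(max_depth: int, fraction: float = 0.6) -> int:
    """Hypothetical heuristic: keep num_leaves well below the 2^max_depth ceiling.

    The 0.6 fraction is an assumption chosen so that max_depth=7 lands in
    the 70-80 range mentioned in the text; tune it on a validation set.
    """
    return max(2, int(max_leaves_for_depth(max_depth) * fraction))


max_depth = 7

# Parameter dictionary in the style LightGBM accepts; it is only built here,
# not passed to a model.
params = {
    "max_depth": max_depth,
    "num_leaves": suggested_num_leaves(max_depth),
}

print(max_leaves_for_depth(max_depth))  # 128
print(params["num_leaves"])             # 76
```

Setting both parameters, as in this sketch, caps the depth and keeps the leaf count below what a fully grown tree of that depth would allow.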

The number of trees or boosting iterations defines the overall size of the ensemble. All libraries support early_stopping to abort training once the loss function registers no further improvement over a given number of iterations. As a result, it is usually best to set a large number of iterations and stop training based on the predictive performance on a validation set.
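The stopping rule itself is simple enough to sketch without any library. The following is a minimal, library-agnostic illustration and not the implementation used by XGBoost, LightGBM, or CatBoost; the function name, the patience argument, and the validation losses are all made up for this example.

```python
def early_stopping_round(val_losses, patience):
    """Return (best_iteration, stopped_at) for a sequence of validation losses.

    Training stops at the first iteration where the loss has failed to
    improve on the best value for `patience` consecutive iterations. If that
    never happens, stopped_at is the last iteration.
    """
    best_loss = float("inf")
    best_iter = -1
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember where it occurred.
            best_loss, best_iter = loss, i
        elif i - best_iter >= patience:
            # No improvement for `patience` iterations: abort training.
            return best_iter, i
    return best_iter, len(val_losses) - 1


# Illustrative validation losses, one per boosting iteration (invented values).
losses = [0.70, 0.55, 0.48, 0.45, 0.46, 0.47, 0.46, 0.45, 0.44]

best_iter, stopped_at = early_stopping_round(losses, patience=3)
print(best_iter, stopped_at)  # 3 6
```

Note that the loop stops at iteration 6 and never sees the later improvement to 0.44, which is why the patience window is itself worth tuning alongside a generous iteration budget.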
