Fast scalable GBM implementations

Over the last few years, several new gradient boosting implementations have introduced various innovations that accelerate training, improve resource efficiency, and allow the algorithm to scale to very large datasets. The new implementations and their origins are as follows:

  • XGBoost (extreme gradient boosting), started in 2014 by Tianqi Chen at the University of Washington
  • LightGBM, first released by Microsoft in January 2017
  • CatBoost, first released by Yandex in April 2017
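All three libraries expose a scikit-learn-compatible estimator interface, so they can be swapped with minimal code changes. The following minimal sketch fits each implementation on synthetic data; the parameter values are illustrative rather than tuned recommendations:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier

# Synthetic binary classification data for illustration only
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# The three implementations share the scikit-learn fit/score API
models = {
    'XGBoost': XGBClassifier(n_estimators=100),
    'LightGBM': LGBMClassifier(n_estimators=100),
    'CatBoost': CatBoostClassifier(n_estimators=100, verbose=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f'{name}: test accuracy = {model.score(X_test, y_test):.3f}')
```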

These innovations address specific challenges of training a gradient boosting model (see this chapter's README on GitHub for detailed references). XGBoost was the first of these implementations to gain widespread popularity: among the 29 winning solutions published by Kaggle in 2015, 17 used XGBoost. Eight of these relied solely on XGBoost, while the others combined it with neural networks.

We will first introduce the key innovations that have emerged over time and have since largely converged (so that most features are now available across all three libraries) before illustrating how to use them in practice.
