Implementing a bagging regressor

Similarly, we can use the BaggingRegressor class to form an ensemble of regressors.

For example, we could build an ensemble of decision trees to predict housing prices using the Boston dataset from Chapter 3, First Steps in Supervised Learning.

In the following steps, you will learn how to use a bagging regressor to form an ensemble of regressors:

  1. The syntax is almost identical to setting up a bagging classifier:
In [7]: from sklearn.ensemble import BaggingRegressor
...     from sklearn.tree import DecisionTreeRegressor
...     bag_tree = BaggingRegressor(DecisionTreeRegressor(),
...                                 max_features=0.5, n_estimators=10,
...                                 random_state=3)
  2. Of course, we need to load and split the dataset, as we did for the breast cancer dataset (see the note after these steps if your scikit-learn version no longer ships load_boston):
In [8]: from sklearn.datasets import load_boston
...     dataset = load_boston()
...     X = dataset.data
...     y = dataset.target
In [9]: from sklearn.model_selection import train_test_split
...     X_train, X_test, y_train, y_test = train_test_split(
...         X, y, random_state=3
...     )
  3. Then, we can fit the bagging regressor on X_train and score it on X_test:
In [10]: bag_tree.fit(X_train, y_train)
...      bag_tree.score(X_test, y_test)
Out[10]: 0.82704756225081688
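
Note that load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so step 2 only runs as-is on older versions. On a recent release, the same data can be fetched from OpenML instead; a minimal sketch, assuming network access and that the OpenML copy named 'boston' (version 1) matches the dataset that used to be bundled:

In [11]: from sklearn.datasets import fetch_openml
...      # Assumption: the OpenML dataset 'boston' (version 1) provides the
...      # same features and MEDV target that load_boston used to return
...      dataset = fetch_openml(name='boston', version=1, as_frame=False)
...      X = dataset.data
...      y = dataset.target.astype(float)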

In the preceding example, we find a performance boost of roughly 5 percentage points, from a score of 77.3% for a single decision tree to 82.7% for the bagged ensemble. (For regressors, score returns the R² coefficient of determination, not classification accuracy.)
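If you want to reproduce the single-tree baseline yourself, you can fit an unbagged DecisionTreeRegressor on the same split. A minimal sketch; the random_state=3 here is our assumption for reproducibility, not a value taken from the earlier chapter:

In [12]: from sklearn.tree import DecisionTreeRegressor
...      # A single, unbagged decision tree on the same train/test split
...      tree = DecisionTreeRegressor(random_state=3)
...      tree.fit(X_train, y_train)
...      tree.score(X_test, y_test)

The last line should print an R² score in the neighborhood of 0.773, although the exact value depends on the tree's random state and your scikit-learn version.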

Of course, we wouldn't just stop here. Nobody said the ensemble needs to consist of 10 individual estimators, so we are free to explore different-sized ensembles. On top of that, the max_samples and max_features parameters allow for a great deal of customization.
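For instance, a small grid search over these knobs might look like the following sketch; the parameter ranges are illustrative choices of ours, not recommendations from the library:

In [13]: from sklearn.model_selection import GridSearchCV
...      # Search over ensemble size and the per-estimator sampling fractions
...      grid = GridSearchCV(
...          BaggingRegressor(DecisionTreeRegressor(), random_state=3),
...          param_grid={'n_estimators': [10, 25, 50, 100],
...                      'max_samples': [0.5, 0.75, 1.0],
...                      'max_features': [0.5, 0.75, 1.0]},
...          cv=5)
...      grid.fit(X_train, y_train)
...      grid.best_params_, grid.best_score_

Here, grid.best_score_ is the mean cross-validated R² of the best parameter combination; since GridSearchCV refits the best estimator on the full training set by default, grid.score(X_test, y_test) then evaluates it on the held-out data for comparison with the 82.7% above.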

A more sophisticated version of bagged decision trees is called random forests, which we will talk about later in this chapter.