Using decision trees for regression

Although we have so far focused on using decision trees for classification tasks, you can also use them for regression. However, you will need to switch to scikit-learn again, as OpenCV does not provide this flexibility. We will therefore only briefly review its functionality here:

  1. Let's say we wanted to use a decision tree to fit a sine wave. To make things interesting, we will also add some noise to the data points using NumPy's random number generator:
In [1]: import numpy as np
... rng = np.random.RandomState(42)
  2. We then create 100 randomly spaced x values between 0 and 5 and calculate the corresponding sine values:
In [2]: X = np.sort(5 * rng.rand(100, 1), axis=0)
... y = np.sin(X).ravel()
  3. We then add noise to every other data point in y (using y[::2]), scaled by 0.5 so we don't introduce too much jitter:
In [3]: y[::2] += 0.5 * (0.5 - rng.rand(50))

  4. You can then create a regression tree in the same way as any other tree before.

A small difference is that the gini and entropy split criteria do not apply to regression tasks. Instead, scikit-learn provides two different split criteria (see the short sketch after the following list):

    • mse (also known as variance reduction): This criterion calculates the Mean Squared Error (MSE) between ground truth and prediction and chooses the split that leads to the smallest MSE.
    • mae: This criterion calculates the Mean Absolute Error (MAE) between ground truth and prediction and chooses the split that leads to the smallest MAE.
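
As a quick aside (not part of the numbered session), the criterion is simply a constructor argument of DecisionTreeRegressor. The following sketch assumes the scikit-learn version used here; newer releases rename the two criteria to squared_error and absolute_error:

# Hypothetical aside: build a regressor with the MAE criterion instead of
# the default MSE criterion (newer scikit-learn versions spell these
# 'squared_error' and 'absolute_error').
from sklearn import tree
regr_mae = tree.DecisionTreeRegressor(criterion='mae', random_state=42)
regr_mae.fit(X, y)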
  5. Using the MSE criterion, we will build two trees. Let's first build a tree with a depth of 2:
In [4]: from sklearn import tree
In [5]: regr1 = tree.DecisionTreeRegressor(max_depth=2,
... random_state=42)
... regr1.fit(X, y)
Out[5]: DecisionTreeRegressor(criterion='mse', max_depth=2,
max_features=None, max_leaf_nodes=None,
min_impurity_split=1e-07,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0,
presort=False, random_state=42,
splitter='best')
  6. Next, we will build a decision tree with a maximum depth of 5:
In [6]: regr2 = tree.DecisionTreeRegressor(max_depth=5,
... random_state=42)
... regr2.fit(X, y)
Out[6]: DecisionTreeRegressor(criterion='mse', max_depth=5,
max_features=None, max_leaf_nodes=None,
min_impurity_split=1e-07,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0,
presort=False, random_state=42,
splitter='best')

We can then use the decision tree like a linear regressor from Chapter 3, First Steps in Supervised Learning.
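
Because DecisionTreeRegressor follows the same estimator interface, fit, predict, and score work just as they did there. As a small sketch (not part of the original session), score returns the coefficient of determination R^2 on the data you pass in:

# Sketch only: the regression tree exposes the same fit/predict/score API
# as the linear regressor; score() reports R^2 on the training data here.
regr1.score(X, y)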

  7. For this, we create a test set with x values densely sampled over the whole range from 0 to 5:
In [7]: X_test = np.arange(0.0, 5.0, 0.01)[:, np.newaxis]
  8. The predicted y values can then be obtained with the predict method:
In [8]: y_1 = regr1.predict(X_test)
... y_2 = regr2.predict(X_test)
  9. If we plot all of these together, we can see how the decision trees differ:
In [9]: import matplotlib.pyplot as plt
... %matplotlib inline
... plt.style.use('ggplot')

... plt.scatter(X, y, c='k', s=50, label='data')
... plt.plot(X_test, y_1, label="max_depth=2", linewidth=5)
... plt.plot(X_test, y_2, label="max_depth=5", linewidth=3)
... plt.xlabel("data")
... plt.ylabel("target")
... plt.legend()
Out[9]: <matplotlib.legend.Legend at 0x12d2ee345f8>

This will produce the following plot:

Here, the thick red line represents the regression tree with depth 2. You can see how the tree tries to approximate the data using these crude steps. The thinner blue line belongs to the regression tree with depth 5; the added depth has allowed the tree to make many finer-grained approximations. Therefore, this tree can approximate the data even better. However, because of this added power, the tree is also more susceptible to fitting noisy values, as can be seen especially from the spikes on the right-hand side of the plot.
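
One way to make this concrete is to compare the training error of the two trees; the following is a minimal sketch (the exact numbers depend on the random noise we added):

from sklearn.metrics import mean_squared_error

# The depth-5 tree will typically report a much smaller training MSE than
# the depth-2 tree because it also fits part of the injected noise.
mse1 = mean_squared_error(y, regr1.predict(X))
mse2 = mean_squared_error(y, regr2.predict(X))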
