7. Evaluating Regressors

In [1]:

# Setup
from mlwpy import *
%matplotlib inline

diabetes = datasets.load_diabetes()

tts = skms.train_test_split(diabetes.data,
                            diabetes.target,
                            test_size=.25,
                            random_state=42)

(diabetes_train_ftrs, diabetes_test_ftrs,
 diabetes_train_tgt,  diabetes_test_tgt) = tts

We’ve discussed evaluation of learning systems and evaluation techniques specific to classifiers. Now, it is time to turn our focus to evaluating regressors. There are fewer ad-hoc techniques in evaluating regressors than in classifiers. For example, we don’t have confusion matrices and ROC curves, but we’ll see an interesting alternative in residual plots. Since we have some extra mental and physical space, we’ll spend a bit of our time in this chapter on some auxiliary evaluation topics: we’ll create our own sklearn-pluggable evaluation metric and take a first look at processing pipelines. Pipelines are used when learning systems require multiple steps. We’ll use a pipeline to standardize some data before we attempt to learn from it.

7.1 Baseline Regressors

As with classifiers, regressors need simple baseline strategies to compete against. We’ve already been exposed to predicting the middle value, for various definitions of middle. In sklearn’s bag of tricks, we can easily create baseline models that predict the mean and the median. These are fixed values for a given training dataset; once we train on the dataset, we get a single value which serves as our prediction for all examples. We can also pick arbitrary constants out of a hat. We might have background or domain knowledge that gives us a reason to think some value—a minimum, maximum, or maybe 0.0—is a reasonable baseline. For example, if a rare disease causes fever and most people are healthy, a temperature near 98.6 degrees Fahrenheit could be a good baseline temperature prediction.

A last option, quantile, generalizes the idea of the median. When math folks say generalize, they mean that a specific thing can be phrased within a more general template. In Section 4.2.1, we saw that the median is the sorted data’s middle value. It had an interesting property: half the values are less than it and half the values are greater than it. In more general terms, the median is one specific percentile—it is called the 50th percentile. We can take the idea of a median from a halfway point to an arbitrary midway point. For example, the 75th percentile has 75% of the data less than it and 25% of the data greater.

Using the quantile strategy, we can pick an arbitrary percent as our break point. Why’s it called quantile and not percentile? It’s because quantile refers to any set of evenly spaced break points from 0 to 100. For example, quartiles—phonetically similar to quarters—are the values 25%, 50%, 75%, and 100%. Percentiles are specifically the 100 values from 1% to 100%. Quantiles can be more finely grained than single-percent steps—for example, the 1000 values 0.1%, 0.2%, . . ., 1.0%, . . ., 99.8%, 99.9%, 100.0%.
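Before we put the quantile strategy to work, here's a quick illustrative aside (not part of the chapter's notebook): NumPy's percentile function computes these break points directly on a small, made-up dataset.

# a small, made-up dataset to illustrate percentiles
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

print(np.percentile(values, 50))   # the median
print(np.percentile(values, 75))   # the 75th percentile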

In [2]:

baseline = dummy.DummyRegressor(strategy='median')

In [3]:

strategies = ['constant', 'quantile', 'mean', 'median', ]
baseline_args = [{"strategy":s} for s in strategies]

# additional args for constant and quantile
baseline_args[0]['constant'] = 50.0
baseline_args[1]['quantile'] =  0.75

# similar to ch 5, but using a list comprehension
# process a single argument package (a dict)
def do_one(**args):
    baseline = dummy.DummyRegressor(**args)
    baseline.fit(diabetes_train_ftrs, diabetes_train_tgt)
    base_preds = baseline.predict(diabetes_test_ftrs)
    return metrics.mean_squared_error(base_preds, diabetes_test_tgt)

# gather all results via a list comprehension
mses = [do_one(**bla) for bla in baseline_args]

display(pd.DataFrame({'mse':mses}, index=strategies))

 

                   mse
constant   14,657.6847
quantile   10,216.3874
mean        5,607.1979
median      5,542.2252

7.2 Additional Measures for Regression

So far, we’ve used mean squared error as our measure of success—or perhaps more accurately, a measure of failure—in regression problems. We also modified MSE to the root mean squared error (RMSE) because the scale of MSE is a bit off compared to our predictions. MSE is on the same scale as the squares of the errors; RMSE moves us back to the scale of the errors. We’ve done this conversion in an ad-hoc manner by applying square roots here and there. However, RMSE is quite commonly used. So, instead of hand-spinning it—manually writing code—all the time, let’s integrate RMSE more deeply into our sklearn-fu.

7.2.1 Creating Our Own Evaluation Metric

Generally, sklearn wants to work with scores, where bigger is better. So we’ll develop our new regression evaluation metric in three steps. We’ll define an error measure, use the error to define a score, and use the score to create a scorer. Once we’ve defined the scorer function, we can simply pass the scorer with a scoring parameter to cross_val_score. Remember: (1) for errors and loss functions, lower values are better and (2) for scores, higher values are better. So, we need some sort of inverse relationship between our error measure and our score. One of the easiest ways to do that is to negate the error. There is a mental price to be paid. For RMSE, all our scores based on it will be negative, and being better means being closer to zero while still negative. That can make you turn your head sideways, the first time you think about it. Just remember, it’s like losing less money than you would have otherwise. If we must lose some money, our ideal is to lose zero.

Let’s move on to implementation details. The scorer function has to receive three arguments: a fit model, predictors, and a target; the error and score functions receive just the actual and predicted values. Yes, the names below are a bit odd; sklearn has a naming convention that “smaller is better” ends with _error or _loss and “bigger is better” ends with _score. The *_scorer form is responsible for applying a model on features to make predictions and compare them with the actual known values. It quantifies the success using an error or score function. It is not necessary to define all three of these pieces. We could pick and choose which RMS components we implement. However, writing code for all three demonstrates how they are related.

In [4]:

def rms_error(actual, predicted):
    ' root-mean-squared-error function '
    # lesser values are better (a < b means a is better)
    mse = metrics.mean_squared_error(actual, predicted)
    return np.sqrt(mse)

def neg_rmse_score(actual, predicted):
    ' rmse based score function '
    # greater values are better (a < b means b better)
    return -rms_error(actual, predicted)

def neg_rmse_scorer(mod, ftrs, tgt_actual):
    ' rmse scorer suitable for scoring arg '
    tgt_pred = mod.predict(ftrs)
    return neg_rmse_score(tgt_actual, tgt_pred)

knn = neighbors.KNeighborsRegressor(n_neighbors=3)
skms.cross_val_score(knn,
                     diabetes.data, diabetes.target,
                     cv=skms.KFold(5, shuffle=True),
                     scoring=neg_rmse_scorer)

Out[4]:

array([-58.0034, -64.9886, -63.1431, -61.8124, -57.6243])

Our hand_and_till_M_statistic from Section 6.4.1 acted like a score and we turned it into a scorer with make_scorer. Here, we’ve laid out all the sklearn subcomponents for RMSE: an error measure, a score, and a scorer. make_scorer can be told to treat larger values as the better result with the greater_is_better argument.
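If we only need the final scorer, a hedged alternative sketch is to hand rms_error to make_scorer and let greater_is_better=False do the negation for us. Reusing the knn regressor from In [4], this should give comparable (though not identical, because of the random fold shuffling) results:

# build a scorer directly from our error function;
# greater_is_better=False makes make_scorer negate rms_error for us
neg_rmse_scorer_2 = metrics.make_scorer(rms_error,
                                        greater_is_better=False)

skms.cross_val_score(knn,
                     diabetes.data, diabetes.target,
                     cv=skms.KFold(5, shuffle=True),
                     scoring=neg_rmse_scorer_2)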

7.2.2 Other Built-in Regression Metrics

We saw the laundry list of metrics in the last chapter. As a reminder, they were available through metrics.SCORERS.keys(). We can check out the default metric for linear regression by looking at help(lr.score).

In [5]:

lr = linear_model.LinearRegression()

# help(lr.score) # for full output
print(lr.score.__doc__.splitlines()[0])

Returns the coefficient of determination R^2 of the prediction.

The default metric for linear regression is R2. In fact, R2 is the default metric for all regressors. We’ll have more to say about R2 in a few minutes. The other major built-in performance evaluation metrics for regressors are mean absolute error, mean squared error (which we’ve been using), and median absolute error.

We can compare mean absolute error (MAE) and mean squared error (MSE) from a practical standpoint. We’ll ignore the M (mean) part for the moment, since in both cases it simply means—ha!—dividing by the number of examples. MAE penalizes large and small differences from actual by the same amount as the size of the error. Being off by 10 gets you a penalty of 10. However, MSE penalizes bigger errors more: being off by 10 gets a penalty of 100. Going from an error of 2 to 4 in MAE goes from a penalty of 2 to a penalty of 4; in MSE, we go from a penalty of 4 to a penalty of 16. The net effect is that larger errors become really large penalties. One last take: if two predictions are off by 5—their errors are 5 each—the contributions to MAE are 5 + 5 = 10. With MSE, the contributions from two errors of 5 are 5² + 5² = 25 + 25 = 50. To reach a total of 50 with MAE, we could have 10 data points off by 5 (since 5 * 10 = 50), two points off by 25 each, or one point off by 50. To reach 50 with MSE, we could only have one point off by about 7, since 7² = 49. Even worse, for MSE a single data point with an error of 50 will cost us 50² = 2500 squared error points. I think we’re broke.

Median absolute error enters in a slightly different way. Recall our discussion of mean and median with balances in Section 4.2.1. The reason to use median is to protect us from single large errors overwhelming other well-behaved errors. If we’re OK with a few real whammies of wrongness—as long as the rest of the predictions are on track—MAE may be a good fit.
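To make those penalty comparisons concrete, here is a small sketch on made-up numbers (not the diabetes data) contrasting MAE, MSE, and median absolute error when one prediction is badly wrong:

# one large error among several small ones
actual    = np.array([10, 10, 10, 10, 10])
predicted = np.array([11,  9, 10, 10, 60])   # last prediction is off by 50

print(metrics.mean_absolute_error(actual, predicted))    # (1+1+0+0+50)/5   = 10.4
print(metrics.mean_squared_error(actual, predicted))     # (1+1+0+0+2500)/5 = 500.4
print(metrics.median_absolute_error(actual, predicted))  # median of {1,1,0,0,50} = 1.0

The single bad prediction dominates MSE, nudges MAE, and barely touches the median absolute error.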

7.2.3 R2

R2 is an inherently statistical concept. It comes with a large amount of statistical baggage—but this is not a book on statistics per se. R2 is also conceptually tied to linear regression—and this is not a book on linear regression per se. While these topics are certainly within the scope of a book on machine learning, I want to stay focused on the big-picture issues and not get held up on mathematical details—especially when there are books, classes, departments, and disciplines that deal with it. So, I’m only going to offer a few words on R2. Why even bother? Because R2 is the default regression metric in sklearn.

What is R2? I’ll define it in the manner that sklearn computes it. To get to that, let me first draw out two quantities. Note, this is not the exact phrasing you’ll see in a statistics textbook, but I’m going to describe R2 in terms of two models. Model 1 is our model of interest. From it, we can calculate how well we do with our model using the sum of squared errors which we saw back in Section 2.5.3.

$SSE_{ours} = \sum_i \text{our errors}_i^2 = \sum_i (\text{our preds}_i - \text{actual}_i)^2$

Our second model is a simple baseline model that always predicts the mean of the target. The sum of squared errors for the mean model is:

$SSE_{mean} = \sum_i \text{mean errors}_i^2 = \sum_i (\text{mean preds}_i - \text{actual}_i)^2 = \sum_i (\text{mean} - \text{actual}_i)^2$

Sorry to drop all of those Σ’s on you. Here’s the code view:

In [6]:

our_preds  = np.array([1, 2, 3])
mean_preds = np.array([2, 2, 2])
actual     = np.array([2, 3, 4])

sse_ours = np.sum(( our_preds - actual)**2)
sse_mean = np.sum((mean_preds - actual)**2)

With these two components, we can compute R2 just like sklearn describes in its documentation. Strictly speaking, we’re doing something slightly different here, which we’ll get to in just a moment.

In [7]:

r_2 = 1 - (sse_ours / sse_mean)
print("manual r2:{:5.2f}".format(r_2))
manual r2: 0.40

The formula referenced by the sklearn docs is:

$R^2 = 1 - \frac{SSE_{ours}}{SSE_{mean}}$

What is the second term there? $\frac{SSE_{ours}}{SSE_{mean}}$ is the ratio between how well we do versus how well a simple model does, when both models are measured in terms of sum of squared errors. In fact, the specific baseline model is the dummy.DummyRegressor(strategy='mean') that we saw at the start of the chapter. For example, if the errors of our fancy predictor were 2500 and the errors of simply predicting the mean were 10000, the ratio between the two would be $\frac{2500}{10000} = \frac{1}{4}$ and we would have $R^2 = 1 - \frac{1}{4} = \frac{3}{4} = 0.75$. We are normalizing, or rescaling, our model’s performance to a model that always predicts the mean target value. Fair enough. But, what is one minus that ratio?

7.2.3.1 An Interpretation of R2 for the Machine Learning World

In the linear regression case, that second term $\frac{SSE_{ours}}{SSE_{mean}}$ will be between zero and one. At the high end, if the linear regression uses only a constant term, it will be identical to the mean model and the ratio will be 1. At the low end, if the linear regression makes no errors on the data, the ratio will be zero. A linear regression model, when fit to the training data and evaluated on the training data, can’t do worse than the mean model.

However, that limitation is not necessarily the case for us because we are not necessarily using a linear model. The easiest way to see this is to realize that our “fancy” model could be worse than the mean. While it is hard to imagine failing that badly on the training data, it starts to seem plausible on test data. If we have a worse-than-mean model and we use sklearn’s formula for R2, all of a sudden we have a negative value. For a value labeled R2, this is really confusing. Squared numbers are usually positive, right?
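Here's a tiny sketch on made-up values showing both edges: a model that predicts the mean gets an R2 of exactly zero, and a model that does worse than the mean goes negative.

actual     = np.array([2, 3, 4])
mean_preds = np.array([3, 3, 3])    # the mean of actual
bad_preds  = np.array([6, 1, 8])    # worse than just guessing the mean

print(metrics.r2_score(actual, mean_preds))   # 0.0
print(metrics.r2_score(actual, bad_preds))    # negative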

So, we had a formula of 1–something and it looked like we could read it as “100% minus something”—which would give us the leftovers. We can’t do that because our something might be positive or negative and we don’t know what to call a value above a true maximum of 100% that accounts for all possibilities.

Given what we’re left with, how can we think about sklearn’s R2 in a reasonable way? The ratio between the SSEs gives us a normalized performance versus a standard, baseline model. Under the hood, the SSE is really the same as our mean squared error but without the mean. Interestingly, we can put to work some of the algebra you never thought you’d need. If we divide both SSEs in the ratio by n, we get

$R^2 = 1 - \frac{SSE_{ours}/n}{SSE_{mean}/n} = 1 - \frac{MSE_{ours}}{MSE_{mean}}$

We see that we’ve really been working with ratios of MSEs in disguise. Let’s get rid of the 1– to ease our interpretation of the right-hand side:

$R^2 = 1 - \frac{SSE_{ours}}{SSE_{mean}} = 1 - \frac{MSE_{ours}}{MSE_{mean}} \quad\Longrightarrow\quad 1 - R^2 = \frac{SSE_{ours}}{SSE_{mean}} = \frac{MSE_{ours}}{MSE_{mean}}$

The upshot is that we can view 1 − R2 (for arbitrary machine learning models) as an MSE that is normalized by the MSE we get from a simple, baseline model that predicts the mean. If we have two models of interest and if we compare (divide) the 1 − R2 of our model 1 and the 1 − R2 of our model 2, we get:

$\frac{1 - R^2_{M_1}}{1 - R^2_{M_2}} = \frac{MSE_{M_1} / MSE_{mean}}{MSE_{M_2} / MSE_{mean}} = \frac{MSE_{M_1}}{MSE_{M_2}}$

which is just the ratios of MSEs—or SSEs—between the two models.

7.2.3.2 A Cold Dose of Reality: sklearn’s R2

Let’s compute a few simple R2 values manually and with sklearn. We compute r2_score from actuals and test set predictions from a simple predict-the-mean model:

In [8]:

baseline = dummy.DummyRegressor(strategy='mean')

baseline.fit(diabetes_train_ftrs, diabetes_train_tgt)
base_preds = baseline.predict(diabetes_test_ftrs)

# r2 is not symmetric because true values have priority
# and are used to compute target mean
base_r2_sklearn = metrics.r2_score(diabetes_test_tgt, base_preds)
print(base_r2_sklearn)
-0.014016723490579253

Now, let’s look at those values with some manual computations:

In [9]:

# sklearn-train-mean to predict test tgts
base_errors    = base_preds - diabetes_test_tgt
sse_base_preds = np.dot(base_errors, base_errors)

# train-mean to predict test targets
train_mean_errors = np.mean(diabetes_train_tgt) - diabetes_test_tgt
sse_mean_train    = np.dot(train_mean_errors, train_mean_errors)

# test-mean to predict test targets (Danger Will Robinson!)
test_mean_errors = np.mean(diabetes_test_tgt) - diabetes_test_tgt
sse_mean_test    = np.dot(test_mean_errors, test_mean_errors)

print("sklearn train-mean model SSE(on test):", sse_base_preds)
print(" manual train-mean model SSE(on test):", sse_mean_train)
print(" manual test-mean  model SSE(on test):", sse_mean_test)
sklearn train-mean model SSE(on test): 622398.9703179051
 manual train-mean model SSE(on test): 622398.9703179051
 manual test-mean  model SSE(on test): 613795.5675675676

Why on Earth did I do the third alternative? I calculated the mean of the test set and looked at my error against the test targets. Not surprisingly, since we are teaching to the test, we do a bit better than the other cases. Let’s see what happens if we use those taught-to-the-test values as our baseline for computing r2:

In [10]:

1 - (sse_base_preds / sse_mean_test)

Out[10]:

-0.014016723490578809

Shazam. Did you miss it? I’ll do it one more time.

In [11]:

print(base_r2_sklearn)
print(1 - (sse_base_preds / sse_mean_test))
-0.014016723490579253
-0.014016723490578809

sklearn’s R2 is specifically calculating its base model—the mean model—from the true values we are testing against. We are not comparing the performance of my_model.fit(train) and mean_model.fit(train). With sklearn’s R2, we are comparing my_model.fit(train) with mean_model.fit(test) and evaluating them against test. Since that’s counterintuitive, let’s draw it out the long way:

In [12]:

#
# WARNING!  Don't try this at home, boys and girls!
# We are fitting on the *test* set... to mimic the behavior
# of sklearn R^2.
#
testbase = dummy.DummyRegressor(strategy='mean')
testbase.fit(diabetes_test_ftrs, diabetes_test_tgt)
testbase_preds = testbase.predict(diabetes_test_ftrs)
testbase_mse = metrics.mean_squared_error(testbase_preds,
                                          diabetes_test_tgt)

models = [neighbors.KNeighborsRegressor(n_neighbors=3),
          linear_model.LinearRegression()]
results = co.defaultdict(dict)
for m in models:
    preds = (m.fit(diabetes_train_ftrs, diabetes_train_tgt)
              .predict(diabetes_test_ftrs))

    mse = metrics.mean_squared_error(preds, diabetes_test_tgt)
    r2  = metrics.r2_score(diabetes_test_tgt, preds)
    results[get_model_name(m)]['R^2'] = r2
    results[get_model_name(m)]['MSE'] = mse

print(testbase_mse)

df = pd.DataFrame(results).T
df['Norm_MSE'] = df['MSE'] / testbase_mse
df['1-R^2'] = 1-df['R^2']
display(df)
5529.689797906013

 

                            MSE     R^2  Norm_MSE   1-R^2
KNeighborsRegressor  3,471.4194  0.3722    0.6278  0.6278
LinearRegression     2,848.2953  0.4849    0.5151  0.5151

So, 1 − R2 computed by sklearn is equivalent to the MSE of our model normalized by the fit-to-the-test-sample mean model. If we knew the mean of our test targets, the value tells us how well we would do in comparison to predicting that known mean.

7.2.3.3 Recommendations on R2

With all that said, I’m going to recommend against using R2 unless you are an advanced user and you believe you know what you are doing. Here are my reasons:

  1. R2 has a lot of scientific and statistical baggage. When you say R2, people may think you mean more than the calculations given here. If you google R2 by its statistical name, the coefficient of determination, you will find thousands of statements that don’t apply to our discussion here. Any statements that include “percent” or “linear” or “explained” should be viewed with extreme skepticism when applied to sklearn’s R2. Some of the statements are true, under certain circumstances. But not always.

  2. There are a number of formulas for R2; sklearn uses one of them. The multiple formulas for R2 are equivalent when using a linear model with an intercept, but there are other scenarios where they are not equivalent. This Babel of formulas drives the confusion with the previous point. We have a calculation for R2 that, under certain circumstances, means things beyond what we use it for with sklearn. We don’t care about those additional things right now.

  3. R2 has a simple relationship to a very weird thing: a normalized MSE computed on a test-sample-trained mean model. We can avoid the baggage by going straight to the alternative.

Instead of R2, we will simply use MSE or RMSE. If we really want to normalize these scores, we can compare our regression model with a training-set-trained mean model.
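Here is a minimal sketch of that normalization on the diabetes split from the start of the chapter; the only twist compared to sklearn's R2 is that the mean model is fit on the training data:

# baseline: a mean model fit on the *training* data
mean_model = dummy.DummyRegressor(strategy='mean')
mean_model.fit(diabetes_train_ftrs, diabetes_train_tgt)
mean_mse = metrics.mean_squared_error(diabetes_test_tgt,
                                      mean_model.predict(diabetes_test_ftrs))

# our model of interest
knn = neighbors.KNeighborsRegressor(n_neighbors=3)
knn.fit(diabetes_train_ftrs, diabetes_train_tgt)
knn_mse = metrics.mean_squared_error(diabetes_test_tgt,
                                     knn.predict(diabetes_test_ftrs))

# < 1.0 means we beat the train-fit mean baseline
print(knn_mse / mean_mse)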

7.3 Residual Plots

We took a deep dive into some mathematical weeds. Let’s step back and look at some graphical techniques for evaluating regressors. We’re going to develop a regression analog of confusion matrices.

7.3.1 Error Plots

To get started, let’s graph an actual, true target value against a predicted value. The graphical distance between the two represents our error. So, if a particular example was really 27.5 and we predicted 31.5, we’d need to plot the point (x = 27.5, y = 31.5) on axes labeled (Reality, Predicted). One quick note: perfect predictions would be points along the line y = x because for each output, we’d be predicting exactly that value. Since we’re usually not perfect, we can calculate the error between what we predicted and the actual value. Often, we throw out the signs of our errors by squaring or taking absolute value. Here, we’ll keep the directions of our errors for now. If we predict too high, the error will be positive; if we predict too low, the error will be negative. Our second graph will simply swap the predicted-actual axes so our point above will become (x = 31.5, y = 27.5): (Predicted, Reality). It may all sound too easy—stay tuned.

In [13]:

ape_df = pd.DataFrame({'predicted' : [4, 2, 9],
                       'actual'    : [3, 5, 7]})

ape_df['error'] = ape_df['predicted'] - ape_df['actual']

ape_df.index.name = 'example'
display(ape_df)

 

         predicted  actual  error
example
0                4       3      1
1                2       5     -3
2                9       7      2

In [14]:

def regression_errors(figsize, predicted, actual, errors='all'):
    ''' figsize -> subplots;
        predicted/actual data -> columns in a DataFrame
        errors -> "all" or sequence of indices '''
    fig, axes = plt.subplots(1, 2, figsize=figsize,
                             sharex=True, sharey=True)
    df = pd.DataFrame({'actual':actual,
                       'predicted':predicted})

    for ax, (x,y) in zip(axes, it.permutations(['actual',
                                                'predicted'])):
        # plot the data as '.'; perfect as y=x line
        ax.plot(df[x], df[y], '.', label='data')
        ax.plot(df['actual'], df['actual'], '-',
                label='perfection')
        ax.legend()

        ax.set_xlabel('{} Value'.format(x.capitalize()))
        ax.set_ylabel('{} Value'.format(y.capitalize()))
        ax.set_aspect('equal')

    axes[1].yaxis.tick_right()
    axes[1].yaxis.set_label_position("right")

    # show connecting bars from data to perfect
    # for all or only those specified?
    if errors == 'all':
        errors = range(len(df))
    if errors:
        acts  = df.actual.iloc[errors]
        preds = df.predicted.iloc[errors]
        axes[0].vlines(acts, preds, acts, 'r')
        axes[1].hlines(acts, preds, acts, 'r')

regression_errors((6, 3), ape_df.predicted, ape_df.actual)
Two graphs are drawn for displaying the data and perfection.

In both cases, the orange line (y = x which in this case is predicted = actual) is conceptually an omniscient model where the predicted value is the true actual value. The error on that line is zero. On the left graph, the difference between prediction and reality is vertical. On the right graph, the difference between prediction and reality is horizontal. Flipping the axes results in flipping the datapoints over the line y = x. By the wonderful virtue of reuse, we can apply that to our diabetes dataset.

In [15]:

lr  = linear_model.LinearRegression()
preds = (lr.fit(diabetes_train_ftrs, diabetes_train_tgt)
           .predict(diabetes_test_ftrs))

regression_errors((8, 4), preds, diabetes_test_tgt, errors=[-20])
Two scatter plots are drawn for displaying the data and perfection.

The difference between these graphs is that the left-hand graph is an answer to the question “Compared to reality (Actual Value), how did we do?” The right-hand graph is an answer to “For a given prediction (Predicted Value), how do we do?” This difference is similar to that between sensitivity (and specificity) being calculated with respect to the reality of sick, while precision is calculated with respect to a prediction of sick. As an example, when the actual value ranges from 200 to 250, we seem to consistently predict low. When the predicted value is near 200, we have real values ranging from 50 to 300.

7.3.2 Residual Plots

Now, we are ready to introduce residual plots. Unfortunately, we’re about to run smack into a wall of terminological problems. We talked about the error of our predictions: error = predicted – actual. But for residual plots, we need the mirror image of these: residuals = actual – predicted. Let’s make this concrete:

In [16]:

ape_df = pd.DataFrame({'predicted' : [4, 2, 9],
                       'actual'    : [3, 5, 7]})

ape_df['error'] = ape_df['predicted'] - ape_df['actual']
ape_df['resid'] = ape_df['actual'] - ape_df['predicted']
ape_df.index.name = 'example'
display(ape_df)

 

         predicted  actual  error  resid
example
0                4       3      1     -1
1                2       5     -3      3
2                9       7      2     -2

When talking about errors, we can interpret the value as how much we over- or undershot by. An error of 2 means our prediction was over by 2. We can think about it as what happened. With residuals, we are thinking about what adjustment we need to do to fix up our prediction. A residual of -2 means that we need to subtract 2 to get to the right answer.

Residual plots are made by graphing the predicted value against the residual for that prediction. So, we need a slight variation of the right graph above (predicted versus actual) but written in terms of residuals, not errors. We’re going to take the residual values–the signed distance from predicted back to actual–and graph them against their predicted value.

For example, we predict 31.5 for an example that is actually 27.5. The residual is -4.0. So, we’ll have a point at (x = predicted = 31.5, y = residual = -4.0). Incidentally, these can be thought of as what’s left over after we make a prediction. What’s left over is sometimes called a residue — think the green slime in Ghostbusters.

Alright, two graphs are coming your way: (1) actual against predicted and (2) predicted against residual.

In [17]:

def regression_residuals(ax, predicted, actual,
                         show_errors=None, right=False):
    ''' ax -> axis to draw on;
        predicted/actual data -> columns of a DataFrame
        show_errors -> "all" or sequence of indices '''
    df = pd.DataFrame({'actual':actual,
                       'predicted':predicted})
    df['error'] = df.actual - df.predicted   # actual - predicted: these are the residuals
    ax.plot(df.predicted, df.error, '.')
    ax.plot(df.predicted, np.zeros_like(predicted), '-')

    if right:
        ax.yaxis.tick_right()
        ax.yaxis.set_label_position("right")
        ax.set_xlabel('Predicted Value')
        ax.set_ylabel('Residual')

    if show_errors == 'all':
        show_errors = range(len(df))
    if show_errors:
        preds = df.predicted.iloc[show_errors]
        errors = df.error.iloc[show_errors]
        ax.vlines(preds, 0, errors, 'r')

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))

ax1.plot(ape_df.predicted, ape_df.actual, 'r.', # pred vs actual
         [0, 10], [0, 10], 'b-')                # perfect line
ax1.set_xlabel('Predicted')
ax1.set_ylabel('Actual')
regression_residuals(ax2, ape_df.predicted, ape_df.actual,
                     'all', right=True)
Two plots are shown: predicted versus actual values on the left, and the corresponding residual plot (predicted value versus residual) on the right.

Now, we can compare two different learners based on their residual plots. I’m going to shift to using a proper train-test split, so these residuals are called predictive residuals. In a traditional stats class, the plain vanilla residuals are computed from the training set (sometimes called in sample). That’s not our normal method for evaluation: we prefer to evaluate against a test set.
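As a quick sketch of the difference (using the chapter's diabetes split), in-sample residuals come from predicting on the training data, while the predictive residuals we plot below come from predicting on the held-out test data:

lr = linear_model.LinearRegression()
lr.fit(diabetes_train_ftrs, diabetes_train_tgt)

# in-sample residuals: actual - predicted on the training data
in_sample_resids  = diabetes_train_tgt - lr.predict(diabetes_train_ftrs)
# predictive residuals: actual - predicted on the held-out test data
predictive_resids = diabetes_test_tgt - lr.predict(diabetes_test_ftrs)

print(np.abs(in_sample_resids).mean(), np.abs(predictive_resids).mean())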

In [18]:

lr  = linear_model.LinearRegression()
knn = neighbors.KNeighborsRegressor()

models = [lr, knn]

fig, axes = plt.subplots(1, 2, figsize=(10, 5),
                         sharex=True, sharey=True)
fig.tight_layout()

for model, ax, on_right in zip(models, axes, [False, True]):
    preds = (model.fit(diabetes_train_ftrs, diabetes_train_tgt)
                  .predict(diabetes_test_ftrs))

    regression_residuals(ax, preds, diabetes_test_tgt, [-20], on_right)

axes[0].set_title('Linear Regression Residuals')
axes[1].set_title('k-NN-Regressor Residuals');
Two scatter plots are drawn for linear regression residuals and kNN-regressor residuals.

A few comments are in order. Since the two models predict different values for our point of interest, it shows up at different spots on the horizontal x axis. With the linear regression model, the value is predicted a hint under 250. With the k-NN-R model, the value is predicted to be a bit above 250. In both cases, it is underpredicted (remember, residuals tell us how to fix up the prediction: we need to add a bit to these predictions). The actual value is:

In [19]:

print(diabetes_test_tgt[-20])
280.0

If either of our models predicted 280, the residual would be zero.

In a classical stats class, when you look at residual plots, you are trying to diagnose if the assumptions of linear regression are violated. Since we’re using linear regression as a black box prediction method, we’re less concerned about meeting assumptions. However, we can be very concerned about trends among the residuals with respect to the predicted values. Potential trends are exactly what these graphs give us a chance to evaluate. At the very low end of the predicted values for the linear regression model, we have pretty consistent negative error (positive residual). That means when we predict a small value, we are probably undershooting. We also see undershooting in the predicted values around 200 to 250. For k-NN-R, we see a wider spread among the negative errors while the positive errors are a bit more clumped around the zero error line. We’ll discuss some methods of improving model predictions by diagnosing their residuals in Chapter 10.

7.4 A First Look at Standardization

I’d like to analyze a different regression dataset. To do that, I need to introduce the concept of normalization. Broadly, normalization is the process of taking different measurements and putting them on a directly comparable footing. Often, this involves two steps: (1) adjusting the center of the data and (2) adjusting the scale of the data. Here’s a quick warning that you may be getting used to: some people use normalization in this general sense, while others use this term in a more specific sense–and still others even do both.

I’m going to hold off on a more general discussion of normalization and standardization until Section 10.3. You’ll need to wait a bit–or flip ahead–if you want a more thorough discussion. For now, two things are important. First, some learning methods require normalization before we can reasonably press “Go!” Second, we’re going to use one common form of normalization called standardization. When we standardize our data, we do two things: (1) we center the data around zero and (2) we scale the data so it has a standard deviation of 1. Those two steps happen by (1) subtracting the mean and then (2) dividing by the standard deviation. Standard deviation is a close friend of variance from Section 5.6.1: we get standard deviation by taking the square root of variance. Before we get into the calculations, I want to show you what this looks like graphically.

In [20]:

# 1D standardization
# place evenly spaced values in a dataframe
xs = np.linspace(-5, 10, 20)
df = pd.DataFrame(xs, columns=['x'])

# center ( - mean) and scale (/ std)
df['std-ized'] = (df.x - df.x.mean()) / df.x.std()

# show original and new data; compute statistics
fig, ax = plt.subplots(1, 1, figsize=(3, 3))
sns.stripplot(data=df)
display(df.describe().loc[['mean', 'std']])

 

           x  std-ized
mean  2.5000    0.0000
std   4.6706    1.0000

A strip plot compares the original x values with their standardized counterparts, which are centered at zero with a standard deviation of 1.

That’s all well and good, but things get far more interesting in two dimensions:

In [21]:

# 2 1D standardizations
xs = np.linspace(-5, 10, 20)
ys = 3*xs + 2 + np.random.uniform(20, 40, 20)

df = pd.DataFrame({'x':xs, 'y':ys})
df_std_ized = (df - df.mean()) / df.std()

display(df_std_ized.describe().loc[['mean', 'std']])

 

           x        y
mean  0.0000  -0.0000
std   1.0000   1.0000

We can look at the original data and the standardized data on two different scales: the natural scale that matplotlib wants to use for the data and a simple, fixed, zoomed-out scale:

In [22]:

fig, ax = plt.subplots(2, 2, figsize=(5, 5))

ax[0,0].plot(df.x, df.y, '.')
ax[0,1].plot(df_std_ized.x, df_std_ized.y, '.')
ax[0,0].set_ylabel('"Natural" Scale')

ax[1,0].plot(df.x, df.y, '.')
ax[1,1].plot(df_std_ized.x, df_std_ized.y, '.')

ax[1,0].axis([-10, 50, -10, 50])
ax[1,1].axis([-10, 50, -10, 50])

ax[1,0].set_ylabel('Fixed/Shared Scale')
ax[1,0].set_xlabel('Original Data')
ax[1,1].set_xlabel('Standardized Data');
Four scatter plots compare the natural, fixed or shared scale with the original data and standardized data.

From our grid-o-graphs, we can see a few things. After standardizing, the shape of the data stays the same. You can see this clearly in the top row–where the data are on different scales and in different locations. In the top row, we let matplotlib use different scales to emphasize that the shape of the data is the same. In the bottom row, we use a fixed common scale to emphasize that the location and the spread of the data are different. Standardizing shifts the data to be centered at zero and scales the data so that the resulting values have a standard deviation and variance of 1.0.

We can perform standardization in sklearn using a special “learner” named StandardScaler. Learning has a special meaning in this case: the learner figures out the mean and standard deviation from the training data and applies these values to transform the training or test data. The name for these critters in sklearn is a Transformer. fit works the same way as it does for the learners we’ve seen so far. However, instead of predict, we use transform.

In [23]:

train_xs, test_xs = skms.train_test_split(xs.reshape(-1, 1), test_size=.5)

scaler = skpre.StandardScaler()
scaler.fit(train_xs).transform(test_xs)

Out[23]:

array([[ 0.5726],
       [ 0.9197],
       [ 1.9608],
       [ 0.7462],
       [ 1.7873],
       [-0.295 ],
       [ 1.6138],
       [ 1.4403],
       [-0.1215],
       [ 1.0932]])

Now, managing the train-test splitting and multiple steps of fitting and then multiple steps of predicting by hand would be quite painful. Extending that to cross-validation would be an exercise in frustration. Fortunately, sklearn has support for building a sequence of training and testing steps. These sequences are called pipelines. Here’s how we can standardize and then fit a model using a pipeline:

In [24]:

(train_xs, test_xs,
 train_ys, test_ys)= skms.train_test_split(xs.reshape(-1, 1),
                                           ys.reshape(-1, 1),
                                           test_size=.5)

scaler = skpre.StandardScaler()
lr  = linear_model.LinearRegression()

std_lr_pipe = pipeline.make_pipeline(scaler, lr)

std_lr_pipe.fit(train_xs, train_ys).predict(test_xs)

Out[24]:

array([[17.0989],
       [29.4954],
       [41.8919],
       [36.9333],
       [61.7263],
       [24.5368],
       [31.9747],
       [49.3298],
       [51.8091],
       [59.247 ]])

The pipeline itself acts just like any other learner we’ve seen: it has fit and predict methods. We can use a pipeline as a plug-in substitute for any other learning method. The consistent interface for learners is probably the single biggest win of sklearn. We use the same interface whether the learners are stand-alone components or built up from primitive components. This consistency is the reason we should all buy the sklearn developers celebratory rounds at conferences.

A detail on pipelines: even though the StandardScaler uses transform when it is applied stand-alone, the overall pipeline uses predict to apply the transformation. That is, calling my_pipe.predict() will do the transformations necessary to get to the final predict step.
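Here's a small sketch of that behavior using the pipeline we just fit: after fitting, calling predict on the pipeline is equivalent to transforming the features with the fitted scaler and then calling the final regressor's predict. (The named_steps keys are the lowercased class names that make_pipeline generates.)

pipe_preds = std_lr_pipe.predict(test_xs)

# doing the two steps by hand with the pipeline's own fitted pieces
scaled_xs    = std_lr_pipe.named_steps['standardscaler'].transform(test_xs)
manual_preds = std_lr_pipe.named_steps['linearregression'].predict(scaled_xs)

print(np.allclose(pipe_preds, manual_preds))   # True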

Finally, I’ll leave you with one last warning. You may have noticed that we are learning the parameters we use to standardize (our training mean and standard deviation). We do that from the training set. Just like we don’t want to peek with a full-blown learner, we don’t want to peek with our preprocessing. There may be some wiggle room around what exactly constitutes peeking. Still, for safety’s sake, I encourage you never to peek–in any way, shape, or form–unless there is (1) a formal proof that peeking won’t bias or invalidate your results and (2) you understand the limits of the formal proof and when it may not apply–so that, again, you’re back in a scenario where you shouldn’t peek. Be cool, be safe, don’t peek.

7.5 Evaluating Regressors in a More Sophisticated Way: Take Two

We’re going to turn back to the Portuguese student data for our larger example. We have the same data we used in Chapter 6, except we’ve kept the target feature as a numerical value. So, we have just the numerical features from the original dataset and the G3 column as our target.

In [25]:

student_df = pd.read_csv('data/portugese_student_numeric.csv')
display(student_df[['absences']].describe().T)

 

           count  mean   std   min   25%   50%   75%    max
absences  395.00  5.71  8.00  0.00  0.00  4.00  8.00  75.00

In [26]:

student_ftrs = student_df[student_df.columns[:-1]]
student_tgt  = student_df['G3']

7.5.1 Cross-Validated Results on Multiple Metrics

The following code uses skms.cross_validate to score over multiple metrics. This is a very nice convenience function: it allows us to evaluate multiple metrics with one call. It also does some work to capture the amount of time spent to fit and predict with the given model. We’re going to ignore those other pieces and simply make use of the multiple metric evaluations that it returns. Like skms.cross_val_score, it requires scorers passed into the scoring argument.

In [27]:

scaler = skpre.StandardScaler()

lr     = linear_model.LinearRegression()
knn_3  = neighbors.KNeighborsRegressor(n_neighbors=3)
knn_10 = neighbors.KNeighborsRegressor(n_neighbors=10)

std_lr_pipe    = pipeline.make_pipeline(scaler, lr)
std_knn3_pipe  = pipeline.make_pipeline(scaler, knn_3)
std_knn10_pipe = pipeline.make_pipeline(scaler, knn_10)

# mean with/without Standardization should give same results
regressors = {'baseline'  : dummy.DummyRegressor(strategy='mean'),
              'std_knn3'  : std_knn3_pipe,
              'std_knn10' : std_knn10_pipe,
              'std_lr'    : std_lr_pipe}

msrs = {'MAE'  : metrics.make_scorer(metrics.mean_absolute_error),
        'RMSE' : metrics.make_scorer(rms_error)}

fig, axes = plt.subplots(2, 1, figsize=(6, 4))
fig.tight_layout()
for mod_name, model in regressors.items():
    cv_results = skms.cross_validate(model,
                                     student_ftrs, student_tgt,
                                     scoring = msrs, cv=10)

    for ax, msr in zip(axes, msrs):
        msr_results = cv_results["test_" + msr]

        my_lbl = "{:12s} {:.3f} {:.2f}".format(mod_name,
                                               msr_results.mean(),
                                               msr_results.std())
        ax.plot(msr_results, 'o--', label=my_lbl)
        ax.set_title(msr) 
        # ax.legend() # uncomment for summary stats
Two plots show the fold-by-fold cross-validation results for each regressor: one for MAE and one for RMSE.

A few things stand out here. 3-NN is not serving us very well in this problem: the baseline method generally has less error than 3-NN. For several folds, 10-NN and LR perform very similarly and their overall performance is mostly on par with one another and slightly better than baseline.

We can tease out some of the close values–and get a more direct comparison with the baseline regressor–by looking at the ratio of each model's RMSE to the baseline's RMSE: $\frac{RMSE_{model}}{RMSE_{baseline}}$

In [28]:

fig,ax = plt.subplots(1, 1, figsize=(6, 3))
baseline_results = skms.cross_val_score(regressors['baseline'],
                                        student_ftrs, student_tgt,
                                        scoring = msrs['RMSE'], cv=10)

for mod_name, model in regressors.items():
    if mod_name.startswith("std_"):
        cv_results = skms.cross_val_score(model,
                                          student_ftrs, student_tgt,
                                          scoring = msrs['RMSE'], cv=10)

        my_lbl = "{:12s} {:.3f} {:.2f}".format(mod_name,
                                               cv_results.mean(),
                                               cv_results.std())

        ax.plot(cv_results / baseline_results, 'o--', label=my_lbl)
ax.set_title("RMSE(model) / RMSE(baseline)
$<1$ is better than baseline")
ax.legend();
A graph presents three trendlines by direct comparison with the baseline regressor.

Here, it is quite clear that 3-NN is generating substantially more error than baseline (its ratios are bigger than 1) and it is worse than the other two regressors. We also see that LR seems to be a bit of a winner on more folds, although 10-NN does eke out a few victories in folds 6–9.

Although it is easily abused (as we discussed in Section 7.2.3), let’s see the default R2 scoring for this problem:

In [29]:

fig, ax = plt.subplots(1, 1, figsize=(6, 3))
for mod_name, model in regressors.items():
        cv_results = skms.cross_val_score(model,
                                          student_ftrs, student_tgt,
                                          cv=10)
        my_lbl = "{:12s} {:.3f} {:.2f}".format(mod_name,
                                               cv_results.mean(),
                                               cv_results.std())

        ax.plot(cv_results, 'o--', label=my_lbl)
ax.set_title("$R^2$");
# ax.legend(); # uncomment for summary stats
A graph presents the default R squared scoring.

There are two interesting patterns here. The first is that our linear regression seems to be consistently better than k-NN-R. Also, the pattern between our two metrics appears pretty consistent. Certainly the order of the winners on each fold–remember that R2 approaching 1 is a better value–seems to be the same. Given the relationships between R2 and MSE we saw above, we probably aren’t too surprised by that.

Second, if you’ve been paying close attention, you might wonder why the baseline model–which was a mean predictor model–doesn’t have an R2 of zero. By the way, you get a gold star for noticing. Can you figure out why? As we discussed, there are going to be different values for the means of the training and testing sets. Since we’re doing train-test splitting wrapped up in cross-validation, our training and testing means are going to be a bit off from each other, but not too much. You’ll notice that most of the R2 values for the mean model are in the vicinity of zero. That’s the price we pay for randomness and using R2.

7.5.2 Summarizing Cross-Validated Results

Another approach to cross-validated predictions is to view the entire cross-validation process as a single learner. If you dig through your notes, you might find that when we do cross-validation, each example is in one and only one testing scenario. As a result, we can simply gather up all the predictions–made by a basket of learners training on different partitions of the data–and compare them with our known targets. Applying an evaluation metric to these predictions and targets gives us a net result of a single value for each model and metric. We access these predictions with cross_val_predict.

In [30]:

msrs = {'MAD'  : metrics.mean_absolute_error,
        'RMSE' : rms_error} # raw metric functions, not scorers; no model needed

results = {}
for mod_name, model in regressors.items():
    cv_preds = skms.cross_val_predict(model,
                                      student_ftrs, student_tgt,
                                      cv=10)
    for msr in msrs:
        msr_results = msrs[msr](student_tgt, cv_preds)
        results.setdefault(msr, []).append(msr_results)
df = pd.DataFrame(results, index=regressors.keys())
df

Out[30]:

 

              MAD    RMSE
baseline   3.4470  4.6116
std_knn3   3.7797  4.8915
std_knn10  3.3666  4.4873
std_lr     3.3883  4.3653

7.5.3 Residuals

Since we did some basic residual plots earlier, let’s make this interesting by (1) looking at the residuals of our baseline model and (2) using the standardized, preprocessed data.

In [31]:

fig, axes = plt.subplots(1, 4, figsize=(10, 5),
                         sharex=True, sharey=True)
fig.tight_layout()
for model_name, ax in zip(regressors, axes):
    model = regressors[model_name]
    preds = skms.cross_val_predict(model,
                                   student_ftrs, student_tgt,
                                   cv=10)

    regression_residuals(ax, preds, student_tgt)
    ax.set_title(model_name + " residuals")
pd.DataFrame(student_tgt).describe().T

Out[31]:

 

     count   mean   std   min   25%    50%    75%    max
G3  395.00  10.42  4.58  0.00  8.00  11.00  14.00  20.00

Graphs present the residuals of the baseline model and the standardized data.

A few interesting points:

  • Even though we are using the mean model as our baseline model, we have multiple means–there’s one for each training-split. But take heart, our predicted values only have a slight variation for the mean-only model(s).

  • The residuals for our standardize-then-fit models have some striking patterns.

    1. They all show banding. The banding is due to the integer values of the target: there are target values of 17 and 18, but not 17.5. So, we have distinct gaps.

    2. The overall patterns seem quite similar for each of the nonbaseline models.

    3. There’s a whole band of “error outliers” where all of the residuals are negative and keep decreasing with the amount of the prediction. Negative residuals are positive errors. They indicate we are overpredicting. On the right, we predict 15 and we’re over by about 15, as the actual is near zero. On the left, we predict near 5 and we’re over by about 5; the actual is, again, near zero. So, the reason we see that band is that it shows us our maximum error (minimum residual) at each possible predicted value. If we predict x and the actual value is zero, our error is x (residual of -x).

7.6 EOC

7.6.1 Summary

We’ve added a few tools to our toolkit for assessing regression methods: (1) baseline regression models, (2) residual plots, and (3) some appropriate metrics. We’ve also explained some of the difficulties in using these metrics. We took a first look at pipelines and standardization, which we’ll come back to in more detail later.

7.6.2 Notes

In the residual plots, you might have noticed that the (actual, predicted) points are equidistant from the y = x line in both the horizontal and vertical directions. That regularity, if unexpected, might indicate that something is wrong. However, it turns out that we should expect that regularity.

Any triangle made up of one 90-degree angle and two 45-degree angles is going to have its two short sides (the legs, not the hypotenuse) be the same length. (Ah, what a stroll down memory lane to high school geometry.) Conceptually, this means that the distance (1) from actual to predicted is the same as the distance (2) from predicted to actual. That’s not surprising. The distance from Pittsburgh to Philadelphia is the same as the distance from Philadelphia to Pittsburgh. Nothing to see here, carry on.

When we see troublesome patterns in a residual plot, we might ask, “What do we do to fix it?” Here are some recommendations that come out of the statistical world (see, for example, Chapter 3 of Applied Linear Statistical Models by Kutner and friends)— with the caveat that these recommendations are generally made with respect to linear regression and are merely starting points. Your mileage might vary.

  • If the spread of the residuals is pretty even, transform the inputs (not the outputs).

  • If the spread of the residuals is increasing (they look like a funnel), taking the logarithm of the output can force it back to an even spread.

  • If there is a clear functional relationship in the residuals (for example, if we try to model a true x² with a linear regression, our residuals would look like x² − (mx + b), which in turn looks like a parabola), then transforming the inputs may help.

We’ll talk about performing these tasks in Chapter 10.
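As a tiny preview of the second recommendation, here's a hedged sketch of log-transforming the target before fitting and undoing the transform at prediction time, using the chapter's diabetes split; whether it actually helps depends on the data.

# fit on log1p(target), undo with expm1 when predicting
lr = linear_model.LinearRegression()
lr.fit(diabetes_train_ftrs, np.log1p(diabetes_train_tgt))
log_preds = np.expm1(lr.predict(diabetes_test_ftrs))

print(rms_error(diabetes_test_tgt, log_preds))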

Statisticians—there they go again—like to work with something called Studentized residuals. Studentization depends on using a linear regression (LR) model. Since we don’t necessarily have an LR model, we can use semi-Studentized residuals simply by dividing our errors—the residuals—by the root mean squared error: $\frac{\text{errors}}{RMSE}$. Basically, we are normalizing their size to the average-ish error size.
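Here's a minimal sketch of that computation with the chapter's linear regression predictions on the diabetes test set:

lr = linear_model.LinearRegression()
preds  = (lr.fit(diabetes_train_ftrs, diabetes_train_tgt)
            .predict(diabetes_test_ftrs))
resids = diabetes_test_tgt - preds

# semi-Studentized residuals: residuals scaled by the overall RMSE
semi_studentized = resids / rms_error(diabetes_test_tgt, preds)
print(semi_studentized[:5])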

If you insist on a statistical approach to comparing algorithms—that’s t-tests, comparing means, and so forth—you may want to check out:

  • Dietterich, Thomas G. (1998). “Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms.” Neural Computation 10 (7): 1895–1923.

We’ll make use of another idea from that paper—5 × 2 cross-validation—in Section 11.3.

In a 2010 article “What You Can and Can’t Properly Do with Regression,” Richard Berk describes three levels of application of linear regression.

  • The first is as a purely descriptive model. This is how we are generally using models throughout this book. We say, “We are going to get the best description of the relationship between the predictors and the target using a linear model.” We don’t make any assumptions about how the data came to be, nor do we have a preconceived idea of the relationship. We just hold up a ruler and see how well it measures what we are seeing.

  • Level two is statistical inference: computing confidence intervals and constructing formal tests of statistical hypotheses. To access level two, our data must be a suitable random sampling of a larger population.

  • Level three involves making claims about causality: the predictors cause the target. A follow-on is that if we manipulate the predictor values, we can tell you what the change in the target will be. This is a very strong statement of belief and we are not going there. If we could intervene in our variables and generate test cases to see the outcomes (also known as performing experiments), we could potentially rule out things like confounding factors and illusory correlations.

More Comments on R2

Strictly speaking, R2 is sklearn’s default metric for classes that inherit from sklearn’s parent regression class RegressorMixin. Discussing inheritance is beyond the scope of this book; the quick summary is that we can place common behavior in a parent class and access that behavior without reimplementing it in a child class. The idea is similar to how genetic traits are inherited from parent to child.
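As a hedged illustration of that inheritance, here is a toy regressor (MeanRegressor is made up for this sketch, not part of sklearn or the chapter) that only defines fit and predict; the score method, which computes R2, comes along for free from RegressorMixin.

from sklearn.base import BaseEstimator, RegressorMixin

class MeanRegressor(BaseEstimator, RegressorMixin):
    ' toy regressor that always predicts the training mean '
    def fit(self, ftrs, tgt):
        self.mean_ = np.mean(tgt)
        return self

    def predict(self, ftrs):
        return np.full(len(ftrs), self.mean_)

mr = MeanRegressor().fit(diabetes_train_ftrs, diabetes_train_tgt)
print(mr.score(diabetes_test_ftrs, diabetes_test_tgt))   # R^2, via RegressorMixin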

If you want to dive into some of the limitations of R2, start here:

  • Kvalseth, Tarald O. (1985). “Cautionary Note about R2.” The American Statistician 39 (4): 279–285.

  • Anscombe, F. J. (1973). “Graphs in Statistical Analysis.” American Statistician 27 (1): 17–21.

Yes, they are both statistical journals. No, I’m not sorry. The first is primarily about calculations surrounding R2 and the second is about interpreting its value. Above, I limited my critiques of R2 to its use in predictive evaluation of learning systems. But in the larger statistical context, (1) even in cases where it is completely appropriate to use R2, people often have strong misconceptions about its interpretation and (2) it is often applied inappropriately. Basically, lots of people are running around with scissors, just waiting to trip and hurt themselves. If you’d like to fortify yourself against some of the most common misuses, check out Chapter 2 of Advanced Data Analysis by Shalizi.

R2, as it is used by many scientists, means much more than the specific calculation I’ve presented here. To many folks in the wide academic world, R2 is inherently tied to our model being a linear model—that is, a linear regression. There are often additional assumptions placed on top of that. Thus, you will commonly hear phrases like “R2 is the percentage of variance explained by a linear relationship between the predictors and the target.” That statement is true, if you are using a linear model as your prediction model. We are interested in models that go beyond a typical linear regression.

Fetching the Student Data

Here is the code to download and preprocess the data we used in the final example.

In [32]:

student_url = ('https://archive.ics.uci.edu/' +
               'ml/machine-learning-databases/00320/student.zip')
def grab_student_numeric():
    # download zip file and unzip
    # unzipping unknown files can be a security hazard
    import urllib.request, zipfile
    urllib.request.urlretrieve(student_url,
                               'port_student.zip')
    zipfile.ZipFile('port_student.zip').extract('student-mat.csv')

    # preprocessing
    df = pd.read_csv('student-mat.csv', sep=';')

    # g1 & g2 are highly correlated with g3;
    # dropping them makes the problem significantly harder
    # we also remove all non-numeric columns
    df = df.drop(columns=['G1', 'G2']).select_dtypes(include=['number'])

    # save as
    df.to_csv('portugese_student_numeric.csv', index=False)

# grab_student_numeric()

7.6.3 Exercises

Try applying our regression evaluation techniques against the boston dataset that you can pull in with datasets.load_boston.

Here’s a more thought-provoking question: what relationship between true and predicted values would give a linear residual plot as the result?
