Regression is a supervised learning method for predicting a continuous outcome based on the relationship between variables (or features) obtained from a dataset. A continuous outcome is a real value, such as an integer or floating-point value, often quantified as an amount or a size. Regression is a widely used type of machine learning modeling.
Since regression predicts a quantity, performance is measured as the error in those predictions. Regression performance can be measured in many ways, but the most common are mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE).
MSE is one of the most commonly used metrics. Because each error is squared, large errors are penalized much more heavily than small ones. This makes MSE most useful when a single bad prediction is costly and should weigh heavily on the score. It is least useful when the dataset contains a lot of noise or outliers, since a few unexpected values can dominate the score. Unexpected values are those that are too high or too low.
MAE is less sensitive to outliers than MSE because errors are not squared, so huge errors are not punished disproportionately. It is typically used when performance is measured on continuous variable data. It provides a linear score in which all individual differences are weighted equally in the average.
With RMSE, errors are squared before they are averaged, and the square root of the average is then taken. As such, RMSE assigns a higher weight to larger errors. So RMSE is more useful when large errors are present and they drastically affect the model’s performance. A benefit of RMSE is that the units of the error score are the same as those of the predicted value.
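To make the three metrics concrete, here is a minimal NumPy sketch. The arrays y_true and y_pred are made-up values chosen for illustration. They are not taken from any dataset used in this chapter.

```python
import numpy as np

# hypothetical actual and predicted values (illustration only)
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.0, 5.0, 4.0, 10.0])

errors = y_pred - y_true           # [-1, 0, 2, 3]

mse = np.mean(errors ** 2)         # average of squared errors
mae = np.mean(np.abs(errors))      # average of absolute errors
rmse = np.sqrt(mse)                # square root of MSE, same units as the target

print('MSE:', mse)
print('MAE:', mae)
print('RMSE:', rmse)
```

Notice that the single error of 3 contributes 9 of the 14 squared-error units, which is why MSE and RMSE react more strongly to large errors than MAE does.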
We thoroughly work through the famous Boston Housing dataset. We demonstrate how to load the data, build the input pipeline, and model the data. We also show you how to use the model to make predictions. We end by modeling a different dataset. The Cars dataset might not be as popular, but we want to give you experience with another set of data.
Notebooks for chapters are located at the following URL: https://github.com/paperd/tensorflow.
Enable the GPU (if not already enabled):
1. Click Runtime in the top-left menu.
2. Click Change runtime type from the drop-down menu.
3. Choose GPU from the Hardware accelerator drop-down menu.
4.
Import the tensorflow library and test if GPU is active:
import tensorflow as tf
# display tf version and test if GPU is active
tf.__version__, tf.test.gpu_device_name()
If ‘/device:GPU:0’ is displayed, the GPU is active. If an empty string (‘’) is displayed, the regular CPU is active.
Boston Housing Dataset
Boston Housing is a dataset derived from information collected by the US Census Service concerning housing in the area of Boston, Massachusetts. It was obtained from the StatLib archive (http://lib.stat.cmu.edu/datasets/boston) and has been used extensively throughout the machine learning literature to benchmark algorithms. The dataset is small in size with only 506 cases.
The name of this dataset is boston. It contains 12 features and 1 outcome (or target). The features are as follows:
1. CRIM – Per capita crime rate by town
2. ZN – Proportion of residential land zoned for lots over 25,000 sq. ft.
3. INDUS – Proportion of non-retail business acres per town
4. CHAS – Charles River dummy variable (1 if tract bounds river and 0 otherwise)
5. NOX – Nitric oxide concentration (parts per 10 million)
6. RM – Average number of rooms per dwelling
7. AGE – Proportion of owner-occupied units built prior to 1940
8. DIS – Weighted distances to five Boston employment centers
9. RAD – Index of accessibility to radial highways
10. TAX – Full-value property tax rate per $10,000
11. PTRATIO – Pupil-teacher ratio by town
12. LSTAT – % lower status of the population
Data was collected in the 1970s, so don’t be shocked by the low median value of homes.
Boston Data
You can access any dataset for this book directly from GitHub with a few simple steps:
1.
2. Locate the dataset and click it.
3.
4. Copy the URL to Colab and assign it to a variable.
5. Read the dataset with the Pandas read_csv method.
For convenience, we’ve already located the appropriate URL and assigned it to a variable:
url = 'https://raw.githubusercontent.com/paperd/tensorflow/master/chapter6/data/boston.csv'
Read the dataset into a pandas dataframe:
import pandas as pd
data = pd.read_csv(url)
Verify that the dataset was read properly:
data.head()
Explore the Dataset
All features are datatype float64 or int64. The label MEDV is datatype float64.
The dataset contains 506 records.
Create a dataframe that holds basic statistics with the describe method and transpose it for easier viewing:
data_t = data.describe()
desc = data_t.T
desc
Target specific statistics from the transposed dataframe:
Describe a specific feature from the original dataframe:
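One way these two steps could look is sketched here on a small made-up dataframe. The name toy and its values are hypothetical stand-ins for the housing data, and the choice of the mean, min, and max columns is only an example.

```python
import pandas as pd

# hypothetical stand-in for the housing dataframe (values are made up)
toy = pd.DataFrame({
    'RM': [6.0, 7.0, 8.0, 5.0],
    'MEDV': [20.0, 30.0, 40.0, 10.0],
})

# one row per column, one column per statistic
toy_desc = toy.describe().T

# target specific statistics from the transposed dataframe
mean_min_max = toy_desc[['mean', 'min', 'max']]
print(mean_min_max)

# describe a specific feature from the original dataframe
rm_stats = toy['RM'].describe()
print(rm_stats)
```

The same pattern should apply to the transposed dataframe desc and the original dataframe data created earlier.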
Create Feature and Target Sets
Create the target:
# create a copy of the DataFrame
df = data.copy()
# create the target
target = df.pop('MEDV')
print (target.head())
Since we popped MEDV from the copy, it should only contain the features:
Get Feature Names from the Features DataFrame
Since the target is no longer part of the dataframe, it’s easy to get the features:
feature_cols = list(df)
feature_cols
Get the number of features:
Convert Features and Labels
Convert pandas dataframe values to NumPy arrays with the values attribute:
features = df.values
labels = target.values
type(features), type(labels)
Split Dataset into Train and Test Sets
Listing 6-1 creates train and test sets.
from sklearn.model_selection import train_test_split
# br is not defined in this section; a newline string is assumed
br = '\n'
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.33, random_state=0)
print ('X_train shape:', end=' ')
print (X_train.shape, br)
print ('X_test shape:', end=' ')
print (X_test.shape)
Listing 6-1. Train and test sets
Scale Data and Create TensorFlow Tensors
With image data, we scale by dividing each element by 255.0 to ensure that each input parameter (a pixel, in our case) has a similar data distribution. However, features represented by continuous values are scaled differently. We scale continuous data to have a mean (μ) of 0 and standard deviation (σ) of 1. Symbol μ is pronounced mu, and symbol σ is pronounced sigma. A sigma of 1 is called unit variance.
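The arithmetic behind this kind of scaling can be sketched with NumPy. The two small matrices are made up for illustration. The sketch assumes the common practice of learning μ and σ from the training data only and reusing them on the test data.

```python
import numpy as np

# hypothetical feature matrices (rows are samples, columns are features)
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test = np.array([[2.0, 40.0]])

# learn mean and standard deviation from the training data only
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)    # population standard deviation (ddof=0)

X_train_std = (X_train - mu) / sigma
X_test_std = (X_test - mu) / sigma    # reuse the training statistics

print(X_train_std.mean(axis=0))   # approximately 0 for each feature
print(X_train_std.std(axis=0))    # approximately 1 for each feature
print(X_test_std)
```

Scaled test values need not have a mean of 0 or a standard deviation of 1, because they are expressed relative to the training distribution.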
Listing 6-2 scales train and test data.
# scale feature data and create TensorFlow tensors
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# fit the scaler on train data only
X_train_std = scaler.fit_transform(X_train)
# reuse the train mean and standard deviation on test data
X_test_std = scaler.transform(X_test)
train = tf.data.Dataset.from_tensor_slices(
    (X_train_std, y_train))
test = tf.data.Dataset.from_tensor_slices(
    (X_test_std, y_test))
Listing 6-2. Scale train and test sets
Let’s view the first tensor as shown in Listing 6-3.
def see_samples(data, num):
    for feat, targ in data.take(num):
        print ('Features: {}'.format(feat), br)
        print ('Target: {}'.format(targ))
n = 1
see_samples(train, n)
Listing 6-3. Display the first tensor
The first sample looks as we expect.
Prepare Tensors for Training
Shuffle train data, batch, and prefetch train and test data:
BATCH_SIZE, SHUFFLE_BUFFER_SIZE = 16, 100
train_bs = train.shuffle(
SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE).prefetch(1)
test_bs = test.batch(BATCH_SIZE).prefetch(1)
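To get a feel for what batching does to the training set, the following sketch computes how many batches one epoch contains. The training size of 339 is an assumption derived from 506 records and a 33% test split. Check it against the X_train shape printed earlier.

```python
import math

n_train = 339       # assumed training size (506 records, 33% held out for testing)
batch_size = 16

# number of batches per epoch and the size of the final, partial batch
n_batches = math.ceil(n_train / batch_size)
last_batch = n_train - (n_batches - 1) * batch_size

print(n_batches, last_batch)
```

Under that assumption, each epoch processes 21 full batches of 16 samples followed by one smaller batch.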
Create a Model
If we don’t have a lot of training data, one technique to avoid overfitting is to create a small network with few hidden layers. We do just that!
The 64-neuron input layer accommodates our 12 input features. We have one hidden layer with 64 neurons. The output layer has one neuron because we are using regression with one target.
Create a model as shown in Listing 6-4.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
import numpy as np
# clear any previous model
tf.keras.backend.clear_session()
# generate a seed for replication purposes
np.random.seed(0)
tf.random.set_seed(0)
# notice input shape accommodates 12 features!
model = Sequential([
Dense(64, activation="relu", input_shape=[12,]),
Dense(64, activation="relu"),
Dense(1)
])
Listing 6-4. Create a model
Model Summary
Output shape of the first layer is (None, 64) because we have 64 neurons at this layer. We get 832 parameters by multiplying the 64 neurons at this layer by the 12 features (768 weights) and adding one bias term for each of the 64 neurons.
Output shape of the second layer is (None, 64) because we have 64 neurons at this layer. We get 4,160 parameters by multiplying the 64 neurons at this layer by the 64 neurons from the previous layer (4,096 weights) and adding one bias term for each of the 64 neurons.
Output shape of the third layer is (None, 1) because we have one target. We get 65 parameters by multiplying the 64 neurons from the previous layer by the 1 neuron at this layer (64 weights) and adding 1 bias term.
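The parameter counts can be checked with a few lines of plain Python. The helper name dense_params is hypothetical and is not part of Keras. It simply applies the weights-plus-biases rule described above.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# 12 input features, two hidden layers of 64 neurons, one output neuron
layer_sizes = [12, 64, 64, 1]
counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

print(counts)        # [832, 4160, 65]
print(sum(counts))   # 5057
```

The total of 5,057 should match the number of trainable parameters reported for this network.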
Compile the Model
Compile:
rmse = tf.keras.metrics.RootMeanSquaredError()
model.compile(loss='mse', optimizer="RMSProp",
metrics=[rmse, 'mae', 'mse'])
Mean squared error (MSE) is a common loss function used for regression problems. Mean absolute error (MAE) and RMSE are also commonly used metrics. With some experimentation, we found that optimizer RMSProp performed pretty well with this dataset. RMSProp is an adaptive learning rate algorithm designed for mini-batch optimization. It addresses the problem that gradients may vary widely in magnitude by dividing each gradient by a moving average of the root mean square of recent gradients for that weight, which keeps weight updates on a comparable scale.
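The core idea of RMSProp can be sketched for a single weight in plain Python. This is a simplified illustration, not the Keras implementation. The function name rmsprop_step, the toy loss, and the learning rate of 0.01 are assumptions made for the example.

```python
import math

def rmsprop_step(w, grad, avg_sq, lr=0.01, rho=0.9, eps=1e-7):
    """One simplified RMSProp update for a single weight."""
    # moving average of squared gradients
    avg_sq = rho * avg_sq + (1 - rho) * grad ** 2
    # scale the step by the root mean square of recent gradients
    w = w - lr * grad / (math.sqrt(avg_sq) + eps)
    return w, avg_sq

# minimize the toy loss f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w, avg_sq = 0.0, 0.0
for _ in range(2000):
    grad = 2 * (w - 3)
    w, avg_sq = rmsprop_step(w, grad, avg_sq)

print(w)
```

Because each gradient is divided by a running estimate of its own magnitude, the step size stays close to the learning rate whether the raw gradient is large or small.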
Train the Model
Train the model for 50 epochs:
history = model.fit(train_bs, epochs=50,
validation_data=test_bs)
Visualize Training
Create variable hist to hold the model’s history as a pandas dataframe. Add a column hist['epoch'] to hold epoch history. Display the last five rows to get an idea about performance.
Here’s the code:
hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
hist.tail()
Build the plots as shown in Listing 6-5.
import matplotlib.pyplot as plt
def plot_history(history, limit1, limit2):
    # collect per-epoch metrics in a dataframe
    hist = pd.DataFrame(history.history)
    hist['epoch'] = history.epoch
    # MAE by epoch
    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('MAE')
    plt.plot(hist['epoch'], hist['mae'],
             label='Train Error')
    plt.plot(hist['epoch'], hist['val_mae'],
             label='Val Error')
    plt.ylim([0, limit1])
    plt.legend()
    plt.title('MAE by Epoch')
    plt.show()
    # MSE by epoch
    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('MSE')
    plt.plot(hist['epoch'], hist['mse'],
             label='Train Error')
    plt.plot(hist['epoch'], hist['val_mse'],
             label='Val Error')
    plt.ylim([0, limit2])
    plt.legend()
    plt.title('MSE by Epoch')
    plt.show()
    # RMSE by epoch (same units as MAE, so it shares limit1)
    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('RMSE')
    plt.plot(hist['epoch'], hist['root_mean_squared_error'],
             label='Train Error')
    plt.plot(hist['epoch'], hist['val_root_mean_squared_error'],
             label='Val Error')
    plt.ylim([0, limit1])
    plt.legend()
    plt.title('RMSE by Epoch')
    plt.show()
# set limits to make plot readable
mae_limit, mse_limit = 10, 100
plot_history(history, mae_limit, mse_limit)
Listing 6-5. Visualize training performance
Since the validation error is worse than the train error, the model is overfitting. What can we do? The first step is to estimate when performance begins to degrade. From the visualizations, can you see when this happens?
Early Stopping
With classification, our goal is to maximize accuracy. Of course, we also want to minimize loss. With regression, our goal is to minimize MSE or one of the other common error metrics. From the visualizations, we see that our model is overfitting because validation error is higher than training error. We also see that once training error and validation error cross, performance begins to degrade.
There is one simple tuning experiment we can run to make this model more useful. We can stop the model when training and validation errors are very close to each other. This technique is called early stopping. Early stopping is a widely used approach that stops training at the point when performance on a validation dataset starts to degrade.
Let’s modify our training experiment to automatically stop training when the validation score doesn’t improve. We use an EarlyStopping callback that tests a training condition for every epoch. When performance starts to degrade, training is automatically stopped.
All we need to do is update the fit method and retrain as shown in Listing 6-6.
# clear the previous model
tf.keras.backend.clear_session()
# generate a seed for replication purposes
np.random.seed(0)
tf.random.set_seed(0)
# monitor 'val_loss' for early stopping
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss')
history = model.fit(train_bs, epochs=50,
validation_data=test_bs,
callbacks=[early_stop])
Listing 6-6. Early stopping
Instead of allowing the algorithm to automatically early stop, we can add some control. Just include a parameter that forces the model to continue to a point that gives us the best performance. The patience parameter can be set to a given number of epochs after which training will be stopped if there is no improvement.
Let’s try this and see what happens as shown in Listing 6-7.
# clear the previous model
tf.keras.backend.clear_session()
# generate a seed for replication purposes
np.random.seed(0)
tf.random.set_seed(0)
# set number of patience epochs
n = 4
early_stop = tf.keras.callbacks.EarlyStopping(
monitor='val_loss', patience=n)
history = model.fit(train_bs, epochs=50,
validation_data=test_bs,
callbacks=[early_stop])
Listing 6-7. Early stopping with patience
Experiment with the patience parameter to find better results.
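The effect of patience can be illustrated without training a model. The sketch below is a simplified version of the stopping rule, not the Keras callback itself. The function name early_stop_epoch and the list of validation losses are hypothetical.

```python
def early_stop_epoch(val_losses, patience=1):
    """Return the epoch index at which training would stop.

    Training stops once the monitored loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs.
    """
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0    # improvement resets the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1      # never triggered: ran all epochs

# made-up validation losses for illustration
val_losses = [10.0, 8.0, 7.0, 7.5, 7.2, 6.9, 7.1, 7.3, 7.0, 7.4, 6.8]

print(early_stop_epoch(val_losses, patience=1))
print(early_stop_epoch(val_losses, patience=4))
```

With a patience of 1, training stops at the first setback and never reaches the lower loss of 6.9. With a patience of 4, training continues through the temporary setback and stops later.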
Let’s visualize:
hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
train_limit, test_limit = 10, 100
plot_history(history, train_limit, test_limit)
Remove Bad Data
The Boston dataset has some inherent bad data. What is wrong with the data? Prices of homes are capped at $50,000 because the Census Service censored the data. They decided to set the maximum value of the price variable to 50,000 USD, so no price can go beyond that value.
What do we do? While maybe not ideal, we can remove data with prices at or above 50,000 USD. This is not ideal because we may be removing perfectly good data, but there is no way to know this. Another reason is that the dataset is so small to begin with. Neural nets are meant to perform at their best with larger datasets.
To explore this topic further, we recommend this URL:
https://towardsdatascience.com/things-you-didnt-know-about-the-boston-housing-dataset-2e87a6f960e8
Get Data
We don’t want to attempt to clean a dataset processed for TensorFlow consumption. So just reload the raw dataset:
# get the raw data
url = 'https://raw.githubusercontent.com/paperd/tensorflow/master/chapter6/data/boston.csv'
boston = pd.read_csv(url)
Remove Noise
Remove the bad data, which hopefully reduces inherent noise:
print ('data set before removing noise:', boston.shape)
# remove noise
noise = boston.loc[boston['MEDV'] >= 50]
data = boston.drop(noise.index)
print ('data set without noise:', data.shape)
Noise is the irrelevant information or randomness in a dataset. We removed several records from the dataset.
Create Feature and Target Data
Create feature and target sets:
# create a copy of the dataframe
df = data.copy()
# create feature and target sets
target, features = df.pop('MEDV'), df.values
labels = target.values
Build the Input Pipeline
Create the input pipeline by splitting data into train and test sets, scaling feature data, and slicing data into TensorFlow consumable pieces. Finish the pipeline by shuffling train data, batching, and prefetching train and test data.
Listing 6-8 includes the code to build the input pipeline.
X_train, X_test, y_train, y_test = train_test_split(
features, labels, test_size=0.33, random_state=0)
# standardize feature data: fit on train, reuse the same statistics on test
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)
# slice data for TensorFlow consumption
train = tf.data.Dataset.from_tensor_slices(
(X_train_std, y_train))
test = tf.data.Dataset.from_tensor_slices(
(X_test_std, y_test))
# shuffle, batch, prefetch
BATCH_SIZE = 16
SHUFFLE_BUFFER_SIZE = 100
train_n = train.shuffle(
SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE).prefetch(1)
test_n = test.batch(BATCH_SIZE).prefetch(1)
Listing 6-8. Build the input pipeline
Compile and Train
Listing 6-9 includes the code for compiling and training the model.
rmse = tf.keras.metrics.RootMeanSquaredError()
model.compile(loss='mse', optimizer="RMSProp",
metrics=[rmse, 'mae', 'mse'])
tf.keras.backend.clear_session()
# generate a seed for replication purposes
np.random.seed(0)
tf.random.set_seed(0)
early_stop = tf.keras.callbacks.EarlyStopping(
monitor='val_loss', patience=n)
history = model.fit(train_n, epochs=50,
validation_data=test_n,
callbacks=[early_stop])
Listing 6-9. Compile and train the model
Visualize
Plot results:
hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
train_limit, test_limit = 10, 100
plot_history(history, train_limit, test_limit)
Our model is not perfect, but we did improve performance.
Generalize on Test Data
Evaluate:
loss, rmse, mae, mse = model.evaluate(test_n, verbose=2)
print('Testing set Mean Abs Error: {:5.2f} thousand dollars'.
      format(mae))
MAE provides a good idea of performance for linear continuous data in an easy-to-understand manner. With this dataset, we can expect model predictions to be off by the MAE value in thousands of dollars on average.
Make Predictions
Use the predict method to make predictions from processed test data test_n:
predictions = model.predict(test_n)
Display the first prediction:
# predicted housing price
first = predictions[0]
print ('predicted price:', first[0], 'thousand')
# actual housing price
print ('actual price:', y_test[0], 'thousand')
Compare predicted and actual prices to gauge model performance.
Display the first five predictions and compare against actual target values:
five = predictions[:5]
actuals = y_test[:5]
print ('pred', 'actual')
# pair each prediction with its actual value
for pred, actual in zip(five, actuals):
    print (np.round(pred[0], 1), actual)
Visualize Predictions
Listing 6-10 displays predictions against actual housing prices.
fig, ax = plt.subplots()
ax.scatter(y_test, predictions)
ax.plot([y_test.min(), y_test.max()],
[y_test.min(), y_test.max()], 'k--', lw=4)
ax.set_xlabel('Measured')
ax.set_ylabel('Predicted')
plt.show()
Listing 6-10. Predictions against actual prices plot
The diagonal line marks where predicted prices equal actual prices. The further away a prediction is from the diagonal, the more erroneous it is.
Load Boston Data from Scikit-Learn
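Since Boston data is included in sklearn.datasets, we can load it from this environment. Note that load_boston was deprecated in scikit-learn 1.0 and removed in version 1.2, so the following code requires an earlier release: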
from sklearn import datasets
dataset = datasets.load_boston()
data, target = dataset.data, dataset.target
The list of keys informs us about accessing feature names:
feature_names = dataset.feature_names
feature_names
Notice that the sklearn dataset has an extra feature B. This column might be considered by some to be controversial because it singles out Black (or African American) people in a township.
We want to remove noise from the entire dataset, so build a dataframe with feature data and add target data:
df_sklearn = pd.DataFrame(dataset.data, columns=feature_names)
df_sklearn['MEDV'] = dataset.target
df_sklearn.head()
Remove Noise
Remove noisy data:
print ('data set before removing noise:', df_sklearn.shape)
noise = df_sklearn.loc[df_sklearn['MEDV'] >= 50]
df_clean = df_sklearn.drop(noise.index)
print ('data set without noise:', df_clean.shape)
Build the Input Pipeline
Build the pipeline as shown in Listing 6-11.
# create a copy of the dataframe
df = df_clean.copy()
# create the target
target = df.pop('MEDV')
# convert features and target data
features = df.values
labels = target.values
# create train and test sets
X_train, X_test, y_train, y_test = train_test_split(
features, labels, test_size=0.33, random_state=0)
# fit the scaler on train data and reuse its statistics on test data
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)
# slice data into a TensorFlow consumable form
train = tf.data.Dataset.from_tensor_slices(
(X_train_std, y_train))
test = tf.data.Dataset.from_tensor_slices(
(X_test_std, y_test))
BATCH_SIZE = 16
SHUFFLE_BUFFER_SIZE = 100
train_sk = train.shuffle(
SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE).prefetch(1)
test_sk = test.batch(BATCH_SIZE).prefetch(1)
Listing 6-11. Build the input pipeline
Model Data
Model data as shown in Listing 6-12.
# clear any previous model
tf.keras.backend.clear_session()
# generate a seed for replication purposes
np.random.seed(0)
tf.random.set_seed(0)
# new model with 13 input features
model = Sequential([
Dense(64, activation="relu", input_shape=[13,]),
Dense(64, activation="relu"),
Dense(1)
])
rmse = tf.keras.metrics.RootMeanSquaredError()
model.compile(loss='mse', optimizer="RMSProp",
metrics=[rmse, 'mae', 'mse'])
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
patience=n)
history = model.fit(train_sk, epochs=50,
validation_data=test_sk,
callbacks=[early_stop])
Model Cars Data
Let’s practice with another dataset.
Get Cars Data from GitHub
We’ve already located the URL and assigned it to a variable:
cars_url = 'https://raw.githubusercontent.com/paperd/tensorflow/master/chapter6/data/cars.csv'
Read data into a pandas dataframe:
cars = pd.read_csv(cars_url)
Get information about the dataset:
Convert Categorical Column to Numeric
Machine learning algorithms can only train on numeric data. So we must convert any non-numeric feature. The Origin column is categorical, not numeric. To remedy this, one solution is to encode the data as one-hot. One-hot encoding is a process that converts categorical data into a numeric form for machine learning algorithm consumption.
We start by slicing off the Origin feature column from the original dataframe into its own dataframe. We then use this dataframe as a template to build a new feature column in the original dataframe for each category from the original Origin feature.
Create a copy of the dataframe:
# create a copy of dataframe
df = cars.copy()
origin = df.pop('Origin')
Define one-hot encoded feature columns for US, Europe, and Japanese cars:
df['US'] = (origin == 'US') * 1.0
df['Europe'] = (origin == 'Europe') * 1.0
df['Japan'] = (origin == 'Japan') * 1.0
df.tail(8)
For US origin, we assign 1.0 0.0 0.0. For Europe origin, we assign 0.0 1.0 0.0. For Japan origin, we assign 0.0 0.0 1.0. So we replace the Origin feature with three one-hot encoded features.
Alternatively, we can one-hot encode with pandas. Start by creating a copy of the dataframe:
# create a copy of df
alt = cars.copy()
One-hot encode:
# get one hot encoding of columns 'Origin'
one_hot = pd.get_dummies(alt['Origin'])
Drop the original column:
# drop column as it is now encoded
alt = alt.drop('Origin', axis=1)
Join the encoded column to the dataframe:
# join the encoded df
alt = alt.join(one_hot)
alt.tail(8)
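The two encoding approaches can be compared side by side on a tiny example. The dataframe toy below is made up for illustration. Passing dtype=float to get_dummies is an addition not used in the text above. It requests 1.0 and 0.0 values, since recent pandas versions return booleans by default.

```python
import pandas as pd

# hypothetical miniature of the cars data (values chosen for illustration)
toy = pd.DataFrame({'MPG': [18.0, 26.0, 31.0],
                    'Origin': ['US', 'Europe', 'Japan']})

# manual encoding: one comparison per category
manual = pd.DataFrame({
    'US': (toy['Origin'] == 'US') * 1.0,
    'Europe': (toy['Origin'] == 'Europe') * 1.0,
    'Japan': (toy['Origin'] == 'Japan') * 1.0,
})

# pandas encoding with float output
dummies = pd.get_dummies(toy['Origin'], dtype=float)

# align column order before comparing the two results
cols = ['US', 'Europe', 'Japan']
same = (manual[cols].to_numpy() == dummies[cols].to_numpy()).all()
print(same)
```

Both approaches yield the same one-hot values. Only the column order may differ, which is why the columns are aligned before comparing.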
Slice Extraneous Data
Since the name of each car has no impact on any predictions we might want to make, we can tuck it away into its own dataframe in case we want to revisit it in the future:
try:
    name = df.pop('Car')
except KeyError:
    # the Car column was already removed by an earlier run
    print("An exception occurred")
If an exception occurs, the Car column has already been removed. You can run this piece of code several times with no impact to results.
Create Features and Labels
Our goal is to predict miles per gallon for cars in this dataset. So the target is MPG, and the features are the remaining feature columns.
Create feature and target sets as shown in Listing 6-13.
# create data sets
features = df.copy()
target = features.pop('MPG')
# get feature names
feature_cols = list(features)
print (feature_cols)
# get number of features
num_features = len(feature_cols)
print (num_features)
# convert feature and target data to NumPy arrays
features = features.values
labels = target.values
(type(features), type(labels))
Listing 6-13. Create feature and target sets
Build the Input Pipeline
Build the input pipeline as shown in Listing 6-14.
# split
X_train, X_test, y_train, y_test = train_test_split(
features, labels, test_size=0.33, random_state=0)
print ('X_train shape:', end=' ')
print (X_train.shape, br)
print ('X_test shape:', end=' ')
print (X_test.shape)
# scale
# fit the scaler on train data and reuse its statistics on test data
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)
train = tf.data.Dataset.from_tensor_slices(
(X_train_std, y_train))
test = tf.data.Dataset.from_tensor_slices(
(X_test_std, y_test))
# shuffle, batch, prefetch
BATCH_SIZE = 16
SHUFFLE_BUFFER_SIZE = 100
train_cars = train.shuffle(
SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE).prefetch(1)
test_cars = test.batch(BATCH_SIZE).prefetch(1)
Listing 6-14. Build the input pipeline
Model Data
Model data as shown in Listing 6-15.
# clear any previous model
tf.keras.backend.clear_session()
model = Sequential([
Dense(64, activation="relu", input_shape=[num_features]),
Dense(64, activation="relu"),
Dense(1)
])
# compile
rmse = tf.keras.metrics.RootMeanSquaredError()
optimizer = tf.keras.optimizers.RMSprop(0.001)
model.compile(loss='mse',
optimizer=optimizer,
metrics=[rmse, 'mae', 'mse'])
# train
tf.keras.backend.clear_session()
early_stop = tf.keras.callbacks.EarlyStopping(
monitor='val_loss', patience=n)
car_history = model.fit(train_cars, epochs=100,
validation_data=test_cars,
callbacks=[early_stop])
Inspect the Model
Output shape of the first layer is (None, 64) because we have 64 neurons at this layer. We get 640 parameters by multiplying the 64 neurons at this layer by the 9 features (576 weights) and adding one bias term for each of the 64 neurons.
Output shape of the second layer is (None, 64) because we have 64 neurons at this layer. We get 4,160 parameters by multiplying the 64 neurons at this layer by the 64 neurons from the previous layer (4,096 weights) and adding one bias term for each of the 64 neurons.
Output shape of the third layer is (None, 1) because we have one target. We get 65 parameters by multiplying the 64 neurons from the previous layer by the 1 neuron at this layer (64 weights) and adding 1 bias term.
Visualize Training
Visualize:
# use the history returned when training on the cars data
hist = pd.DataFrame(car_history.history)
hist['epoch'] = car_history.epoch
train_limit, test_limit = 10, 100
plot_history(car_history, train_limit, test_limit)
Generalize on Test Data
Generalize:
loss, rmse, mae, mse = model.evaluate(test_cars, verbose=2)
print ()
print('Testing set Mean Abs Error: {:5.2f} MPG'.format(mae))
Make Predictions
Predictions:
predictions = model.predict(test_cars)
Visualize Predictions
Visualize predictions as shown in Listing 6-16.
fig, ax = plt.subplots()
ax.scatter(y_test, predictions)
ax.plot([y_test.min(), y_test.max()],
[y_test.min(), y_test.max()], 'k--', lw=4)
ax.set_xlabel('Measured')
ax.set_ylabel('Predicted')
plt.show()
Listing 6-16. Visualize predictions for cars data
The further the prediction is away from the diagonal true values line, the more erroneous it is.