
10. Time Series Forecasting with RNNs

David Paper, Logan, UT, USA

We’ve already leveraged RNNs for NLP. In this chapter, we create experiments to forecast with time series data. We use the famous Weather dataset to demonstrate both a univariate and a multivariate example.

An RNN is well suited for time series forecasting because it remembers the past: its decisions are influenced by what it has learned from earlier observations, so it adapts as the data changes. Time series forecasting deploys a model to predict future values based on previously observed values.

Time series data differs from what we've worked with so far because it is a sequence of observations ordered in time. The time dimension imposes an explicit order dependence between observations.

Weather Forecasting

Forecasting the weather is a difficult and complex endeavor. But leading-edge companies like Google, IBM, Monsanto, and Facebook are leveraging AI technology to realize accurate and timely weather forecasts. Given the introductory nature of our lessons, we cannot hope to demonstrate such complex AI experiments. But we show you how to build simple time series forecasting models with weather data.

Notebooks for chapters are located at the following URL: https://github.com/paperd/tensorflow.

The Weather Dataset

We introduce time series forecasting with RNNs for a univariate problem. We then forecast a multivariate time series. We use weather time series data to train our models. The data we use is recorded by the Max Planck Institute for Biogeochemistry.

Find out more about the institute by perusing the following URL:

www.bgc-jena.mpg.de/index.php/Main/HomePage

Enable the GPU (if not already enabled):
  1. Click Runtime in the top-left menu.

  2. Click Change runtime type from the drop-down menu.

  3. Choose GPU from the Hardware accelerator drop-down menu.

  4. Click SAVE.
Test if GPU is active:
import tensorflow as tf
# display tf version and test if GPU is active
tf.__version__, tf.test.gpu_device_name()

Import the tensorflow library. If '/device:GPU:0' is displayed, the GPU is active. If an empty string is displayed, the regular CPU is active.

Get the dataset as shown in Listing 10-1.
import os
p1 = 'https://storage.googleapis.com/tensorflow/'
p2 = 'tf-keras-datasets/jena_climate_2009_2016.csv.zip'
url = p1 + p2
zip_path = tf.keras.utils.get_file(
    origin = url,
    fname='jena_climate_2009_2016.csv.zip',
    extract=True)
csv_path, _ = os.path.splitext(zip_path)
Listing 10-1

Get weather data

Use the splitext method to strip the .zip extension from the downloaded file path, which yields the path to the extracted CSV file for easy loading into pandas.

Load the CSV file into pandas:
import pandas as pd
df = pd.read_csv(csv_path)

Explore the Data

Display the features of the dataframe:
list(df)
Display the first few records:
df.head(3)
Display the last few records:
df.tail(3)

We see that data collection began on January 1, 2009, and ended on December 31, 2016. The last data recorded was on January 1, 2017, but it’s irrelevant for our purposes since it is the only recorded piece of data for that year. We also see that data is recorded every 10 minutes. So the time step for this experiment is 10 minutes. A time step is a single occurrence of an event.

The first timestamp is on January 1, 2009 (01.01.2009), covering the interval from 00:00:00 to 00:10:00. The second timestamp covers the next interval, from 00:10:00 to 00:20:00. This pattern continues throughout the day, with the last time step at 23:50:00. The second day, January 2, 2009 (02.01.2009), and all subsequent days follow the same pattern. Generally, time series forecasting predicts the observation at the next time step.
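
To confirm the 10-minute spacing programmatically, a quick check like the following works (a supplementary sketch, not one of the chapter listings; it assumes the Date Time column uses the dd.mm.yyyy HH:MM:SS format shown in the head/tail output):
# parse the Date Time column and inspect the spacing between records
dt = pd.to_datetime(df['Date Time'], format='%d.%m.%Y %H:%M:%S')
print (dt.min(), dt.max())
print (dt.diff().value_counts().head())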

Display a concise summary of the dataframe:
df.info()
The dataset contains 15 columns:
  • Date Time – Date-time reference

  • p (mbar) – Atmospheric pressure in millibars

  • T (degC) – Temperature in Celsius

  • Tpot (K) – Temperature in Kelvin

  • Tdew (degC) – Temperature in Celsius relative to humidity

  • rh (%) – Relative humidity

  • VPmax (mbar) – Saturation vapor pressure in millibars

  • VPact (mbar) – Vapor pressure in millibars

  • VPdef (mbar) – Vapor pressure deficit in millibars

  • sh (g/kg) – Specific humidity in grams per kilogram

  • H2OC (mmol/mol) – Water vapor concentration in millimoles per mole

  • rho (g/m**3) – Air density in grams per meter cubed

  • wv (m/s) – Wind speed in meters per second

  • max. wv (m/s) – Maximum wind speed in meters per second

  • wd (deg) – Wind direction in degrees

We have 14 features because Date Time is a reference column. There is no missing data, and every column is float64 except the Date Time reference, which is an object. The dataset contains 420,551 rows with indexes ranging from 0 to 420550.
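
To verify the no-missing-data claim yourself, a quick pandas check suffices (a supplementary aside, not one of the chapter listings):
# count missing values per column; every count should be zero
df.isnull().sum()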

Display statistical data for the 14 features:
stats = df.describe()
stats.transpose()

The describe method generates descriptive statistics. The transpose method transposes indexes and columns.

We can also display statistics for one or more features:
stats = df.describe()
stats[['p (mbar)', 'T (degC)']].transpose()
Display the shape of the dataframe:
df.shape

The dataframe contains 420,551 rows with 15 columns included in each row.

Plot Relative Humidity Over Time

Since data is in a pandas dataframe, it is easy to plot any of the 14 features against Date Time time steps.

Plot yearly periodicity of relative humidity:
import matplotlib.pyplot as plt
# create a new series with just the relative humidity column
rh = df['rh (%)']
# plot it!
rh.plot()

Since we have an observation every 10 minutes, each hour has six observations. And each day has 144 (6 observations × 24 hours) observations.

Plot the first 10 days:
rh10 = df['rh (%)'][0:1440]
rh10.plot()

Since we have 144 observations per day, we plot a total of 1,440 (10 days × 144 observations/day) observations. Notice that indexing begins at 0 and that the end of a slice is exclusive, so [0:1440] selects rows 0 through 1439.

With a narrow view of the data (the first 10 days), we can see daily periodicity. We also see that fluctuation is pretty chaotic, which means that prediction is more difficult.

Explore humidity by time step at a granular level:
df[['Date Time','rh (%)']].head()

Forecast a Univariate Time Series

Scale Data

Convert the dataframe into a numpy array:
rh_np = rh.to_numpy()
Scale numpy data for efficient training as shown in Listing 10-2.
br = '\n'  # line break for print spacing
# original data
print ('first five unscaled observations:', rh_np[:5], br)
# scale relative humidity data
rh_sc = tf.keras.utils.normalize(rh_np)
print ('shape after tf function:', rh_sc.shape)
# squeeze out '1' dimension
rh_sq = tf.squeeze(rh_sc)
print ('shape after squeeze:', rh_sq.shape, br)
# convert to numpy
rh_scaled = rh_sq.numpy()
print ('first five scaled observations:', rh_scaled[:5])
Listing 10-2

Scale data

Scale relative humidity data. Squeeze out the extra 1 dimension added by the TensorFlow function so we can convert the TensorFlow tensor into a numpy array for easier processing. Display the first five scaled observations to verify that scaling works as expected.

Establish Training Split

Calculate train split size:
import numpy as np
# train split with 75% of data
train_split = int(np.round(df.shape[0] * .75))
train_split
Calculate test split size:
# test split with 25% of data
test_split = df.shape[0] - train_split
test_split
Calculate number of days of train and test data:
# calculate number of days of data
print (np.round(train_split / 144, 2))
print (np.round(test_split / 144, 2))

For this experiment, use the first 315,413 rows of data for training and the remaining 105,138 (420,551 – 315,413) rows for the test set. Training data accounts for about 2,190 (315,413/144) days of data. And test data accounts for about 730 (105,138/144) days of data.

Create Features and Labels

Create a function that splits a dataset into features and labels as shown in Listing 10-3.
def create_datasets(data, origin, end, window, target_size):
  # list to hold feature set of windows
  features = []
  # list to hold labels
  labels = []
  # establish starting point that reflects window size
  origin = origin + window
  # enable split for test data
  if end is None:
    end = len(data) - target_size
  # create feature set of 'window-sized' elements
  for i in range(origin, end):
    # create index set to identify each window
    indices = range(i-window, i)
    # reshape data from (window,) to (window, 1)
    features.append(np.reshape(data[indices], (window, 1)))
    # create labels
    labels.append(data[i+target_size])
  return np.array(features), np.array(labels)
Listing 10-3

Function that creates features and labels

The function accepts a dataset, an index where we want to start the split, an ending index, the size of each window, and the target size. Parameter window is the size of the past window of information. The target_size is how far in the future we want our model to learn how to predict.

The function creates lists to hold features and labels. It then establishes the starting point that reflects the window size. To create a test set, the function checks the end value. If None, it uses the length of the entire dataset less the target size as the ending value so the test set can start where the training set left off.

Once training and test starting points are established, the function creates the feature windows and labels. During iteration, the window slides forward one time step at a time, so consecutive windows overlap in all but one observation. Feature windows are reshaped for TensorFlow consumption and added to the features list. Each label is the observation that sits target_size steps past the end of its window (with a target_size of 0, the very next observation, which is also the last observation of the next window) and is added to the labels list. Both features and labels are returned as numpy arrays.

The function may seem confusing, but all it really does is create a feature set that holds windows of relative humidity observations (for our experiment) and another that holds targets. Targets are based on the last relative humidity observation from the next window of data. This makes sense because the last relative humidity from the next window is a pretty good indication of future relative humidity. So the feature set becomes a set of windows that contain time step observations. And the label set contains predictions for each window.
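
To make the windowing concrete, here is a tiny run of the function on made-up data (a supplementary sketch; the ten toy values are invented purely for illustration):
# toy data: ten evenly spaced values standing in for scaled readings
toy = np.arange(10, dtype=np.float64) / 10.
toy_x, toy_y = create_datasets(toy, 0, None, 3, 0)
print (toy_x.shape, toy_y.shape)          # (7, 3, 1) and (7,)
print (toy_x[0].ravel(), '->', toy_y[0])  # [0.  0.1 0.2] -> 0.3
print (toy_x[1].ravel(), '->', toy_y[1])  # [0.1 0.2 0.3] -> 0.4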

Create Train and Test Sets

For the train set, start at index 0 from the dataset and continue up to 315,412. For the test set, take the remainder. Set the window size to 20 and target to 0.

Invoke the function as shown in Listing 10-4.
# create train and test sets
import numpy as np
window = 20
target = 0
x_train, y_train = create_datasets(rh_scaled, 0, train_split,
                                   window, target)
x_test, y_test = create_datasets(rh_scaled, train_split, None,
                                 window, target)
Listing 10-4

Create train and test sets

Inspect train and test data:
print ('train:', end=' ')
print (x_train.shape, y_train.shape)
print ('test:', end=' ')
print (x_test.shape, y_test.shape)

As expected, shapes reflect the size of each dataset, the window size, and a trailing 1 dimension, which is the single feature (relative humidity) recorded at each time step. The train set contains 315,393 records composed of windows holding 20 relative humidity readings. The test set contains 105,118 records composed of windows holding 20 relative humidity readings.

So why do we have 315,393 training observations instead of the original 315,413? The reason is that we need the first window to act as history. So just subtract the first window of 20 from 315,413. For test data, subtract the first window of 20 from 105,138 to get 105,118.

We can create bigger windows, but this dramatically increases the amount of data we must process. With only 20 observations per window, we already have 6,307,860 (315,393 × 20) data points for training and 2,102,360 (105,118 × 20) data points for testing!
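
These counts are easy to verify in code (a quick supplementary check, not one of the chapter listings):
# each split loses one window of history at its start
print (x_train.shape[0] == train_split - window)
print (x_test.shape[0] == (df.shape[0] - train_split) - window)
print (x_train.shape[0] * window, x_test.shape[0] * window)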

View Windows of Past History

Inspect the first window from the train set:
print ('length of window:', len(x_train[0]), br)
print ('first window of past history:')
print (x_train[0], br)
print ('target relative humidity to predict:')
print (y_train[0])

As expected, the window contains 20 relative humidity readings. So how did we get the target?

Take the last entry from the next window:
print ('target from the 1st window:', end='   ')
print (np.round(y_train[0], 8))
print ('last obs from the 2nd window:', end=' ')
print (np.round(x_train[1][19][0], 8))
Verify by inspecting the second window:
print ('second window of past history:')
print (x_train[1], br)
print ('target relative humidity to predict:')
print (y_train[1])
Let’s see if the pattern holds for the next few windows as shown in Listing 10-5.
print ('target from the 2nd window:', end='   ')
print (np.round(y_train[1], 8))
print ('last obs from the 3rd window:', end=' ')
print (np.round(x_train[2][19][0], 8), br)
print ('target from the 3rd window:', end='   ')
print (np.round(y_train[2], 8))
print ('last obs from the 4th window:', end=' ')
print (np.round(x_train[3][19][0], 8), br)
print ('target from the 4th window:', end='   ')
print (np.round(y_train[3], 8))
print ('last obs from the 5th window:', end=' ')
print (np.round(x_train[4][19][0], 8), br)
print ('target from the 5th window:', end='   ')
print (np.round(y_train[4], 8))
print ('last obs from the 6th window:', end=' ')
print (np.round(x_train[5][19][0], 8))
Listing 10-5

Inspect the patterns for the next few windows

Plot a Single Example

Create a function that returns a list of time steps from -length to 0:
def create_time_steps(length):
  return list(range(-length, 0))

The range runs from -length up to (but not including) 0 so that a window's observations are plotted as past history leading up to the prediction at time step 0.
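
For example, a window of length 5 maps to the five time steps just before the prediction point (a quick supplementary check):
print (create_time_steps(5))  # [-5, -4, -3, -2, -1]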

Create another function that accepts a single data window and its target, delta, and a title as shown in Listing 10-6.
def plot(plot_data, delta=0, title='Data Window'):
  labels = ['history', 'actual future', 'model prediction']
  marker = ['r.-', 'b*', 'g>']
  time_steps = create_time_steps(plot_data[0].shape[0])
  if delta: future = delta
  else: future = 0
  plt.title(title)
  for i, obs in enumerate(plot_data):
    if i:
      plt.plot(future, obs, marker[i], markersize=10,
               label=labels[i])
    else:
      plt.plot(time_steps, obs.flatten(), marker[i],
               label=labels[i])
  plt.legend()
  plt.xlim([time_steps[0], (future+5)*2])
  plt.xlabel('time step')
  return plt
Listing 10-6

Function that plots an example

Parameter delta indicates how many time steps into the future the target lies; the default of 0 plots the actual and predicted values at time step 0. The function plots each element in the data window against its associated time step.

Invoke the function based on the first data window and target from the train set:
plot([x_train[0], y_train[0]])

The actual future for each window is its label, which is the last entry of the next window.

Create a Visual Performance Baseline

Before training, it is a good idea to create a simple visual performance baseline to compare against model performance. Of course, there are many ways to do this, but a very simple way is to use the average of the last 20 observations.

Create a function that returns the average of a window of observations:
def baseline(history):
  return np.mean(history)
Plot the first data window with a baseline prediction:
plot([x_train[0], y_train[0], baseline(x_train[0])], 0,
     'baseline prediction')

Create a Baseline Metric

It’s also a good idea to create a baseline metric to compare against our model. The simplest approach is to predict the last value for each window. We can then find the average mean squared error of the predictions and use this value as our metric.

Create a baseline metric as shown in Listing 10-7.
# display shape of test set
print ('TensorFlow shape:', x_test[0].shape, br)
# remove '1' dimension for easier processing
x_test_np = tf.squeeze(x_test)
print ('numpy shape:', x_test_np[0].shape, br)
# predict last value for each window
y_pred = x_test_np[:, -1]
# compute average MSE
MSE = np.mean(tf.keras.losses.mean_squared_error(
    y_test, y_pred))
print ('MSE:', MSE)
Listing 10-7

Create a baseline metric

The MSE is very small, which means that our baseline metric might be hard to beat. Why? In general, machine learning has a pretty significant limitation. Unless the learning algorithm is hardcoded to look for a specific kind of simple model, parameter learning can sometimes fail to find a simple solution to a simple problem. Our time series problem is a very simple problem.

Finish the Input Pipeline

Shuffle train data, batch, and cache as shown in Listing 10-8.
BATCH_SIZE = 256
BUFFER_SIZE = 10000
# prepare the train set
train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_one = (train.cache()
             .shuffle(BUFFER_SIZE)
             .batch(BATCH_SIZE)
             .repeat())
# prepare the test set
test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
test_one = test.batch(BATCH_SIZE).repeat()
Listing 10-8

Finish the input pipeline

Inspect tensors:
train_one, test_one

As expected, each window holds 20 observations (with one feature each) and maps to a single label.

Explore a Data Window

Verify that features and labels are batched in 256-element windows:
for feature, label in train_one.take(1):
  print (len(feature), len(label))
Display the first window from the batch:
for feature, label in train_one.take(1):
  print ('feature:')
  print (feature[0].numpy(), br)
  print ('label:', label[0].numpy())

As expected, the window contains a feature set with 20 observations and a label with one prediction.

Create the Model

Establish the input shape:
input_shape = x_train.shape[-2:]
input_shape

The input shape indicates a window size of 20 time steps with 1 feature per step.

Import requisite libraries, clear previous model sessions, generate a seed for reproducibility, and create the model as shown in Listing 10-9.
# clear any previous models
tf.keras.backend.clear_session()
#import libraries
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense
# generate seed to ensure reproducibility
tf.random.set_seed(0)
neurons = 32  # number of neurons in GRU layer
model = Sequential([
  GRU(neurons, input_shape=input_shape),
  Dense(1)
])
Listing 10-9

Create the model

An RNN is well suited to time series data because its recurrent layers feed their own output back in at the next time step. Specifically, it processes time series data time step by time step while carrying forward information from the steps it has already seen. Our model uses a GRU layer, which is a specialized RNN layer capable of remembering information over long stretches of a sequence, so it is especially well suited for time series modeling.

Model Summary

Inspect the model:
model.summary()

We use the formula 3 × (n² + mn + 2n) to calculate the number of learnable parameters for a GRU layer, where m is the input dimension and n is the output dimension. As for any neural net layer, multiply the input dimension by the number of neurons at the current layer (m × n). Because the layer is recurrent, each of its n units also receives the layer's own previous output, adding n² weights. The 2n term accounts for the bias terms, which Keras stores twice per gate in its default GRU implementation. Multiply by 3 because a GRU has three sets of weights (for the update gate, the reset gate, and the candidate hidden state).

Here is the breakdown to calculate parameters for the GRU layer:
  • 3 × (32² + 1 × 32 + 2 × 32)

  • 3 × (1024 + 32 + 64)

  • 3 × 1120

  • 3,360

The output dimension n is 32, and the input dimension m is 1 because each time step carries a single feature (relative humidity).

The dense layer has 33 learnable parameters calculated by multiplying neurons from the previous layer (32) by neurons at the current layer (1) and adding the number of neurons at the current layer (1).
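
If you want to sanity-check these counts in code, the small helper below recomputes them (a supplementary sketch; the gru_params and dense_params names are ours, and the formula assumes the TensorFlow 2 default GRU setting reset_after=True):
def gru_params(m, n):
  # 3 gates, each with input weights (m*n), recurrent weights (n*n),
  # and two bias vectors (2*n) under reset_after=True
  return 3 * (m * n + n * n + 2 * n)
def dense_params(m, n):
  # weights plus one bias per output neuron
  return m * n + n
print (gru_params(1, 32))     # 3360
print (dense_params(32, 1))   # 33
print (model.count_params())  # 3360 + 33 = 3393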

Verify Model Output

Make an untrained prediction from the model:
for x, y in test_one.take(1):
  print(model.predict(x).shape)

The prediction shows batch size of 256 with 1 prediction. So the model is working as expected.

Compile the Model

Compile:
model.compile(optimizer='adam', loss="mse")

Train the Model

Train:
num_train_steps = 400
epochs = 10
history = model.fit(train_one, epochs=epochs,
                    steps_per_epoch=num_train_steps,
                    validation_data=test_one,
                    validation_steps=50)

Generalize on Test Data

Generalize:
test_loss = model.evaluate(test_one, steps=num_train_steps)

Make Predictions

Make three predictions as shown in Listing 10-10.
n = 3
title = 'GRU prediction'
for i, (x, y) in enumerate(test_one.take(n)):
  p = model.predict(x)[0]
  plot([x[0].numpy(), y[0].numpy(), p], 0,
       title + ' window ' + str(i))
  plt.show()
Listing 10-10

Make some predictions

Although visual inspection is pretty, it doesn’t efficiently gauge overall performance.
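
To put a single number on overall performance that we can compare against the earlier baseline MSE, score the trained model over the test windows (a supplementary sketch, not one of the chapter listings; exact values vary from run to run):
# average MSE of the GRU model over all test windows
gru_preds = model.predict(x_test, batch_size=256).squeeze()
gru_MSE = np.mean(tf.keras.losses.mean_squared_error(
    y_test, gru_preds))
print ('GRU MSE:', gru_MSE, ' baseline MSE:', MSE)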

Plot Model Performance

Listing 10-11 plots model performance:
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(loss) + 1)
plt.figure()
plt.plot(epochs, loss, 'bo', label='training loss')
plt.plot(epochs, val_loss, 'b', label='validation loss')
plt.title('training and validation loss')
plt.legend()
plt.show()
Listing 10-11

Plot model performance

Voilà! Our model performs pretty well!

Forecast a Multivariate Time Series

We just demonstrated how to make a single prediction based on a single feature. Now, let’s make multiple predictions on multiple variables. We can choose any of the 14 features (we don’t want to predict from the date-time reference).

The following are the available 14 features:
  • p (mbar) – Atmospheric pressure in millibars

  • T (degC) – Temperature in Celsius

  • Tpot (K) – Temperature in Kelvin

  • Tdew (degC) – Temperature in Celsius relative to humidity

  • rh (%) – Relative humidity

  • VPmax (mbar) – Saturation vapor pressure in millibars

  • VPact (mbar) – Vapor pressure in millibars

  • VPdef (mbar) – Vapor pressure deficit in millibars

  • sh (g/kg) – Specific humidity in grams per kilogram

  • H2OC (mmol/mol) – Water vapor concentration in millimoles per mole

  • rho (g/m**3) – Air density in grams per meter cubed

  • wv (m/s) – Wind speed in meters per second

  • max. wv (m/s) – Maximum wind speed in meters per second

  • wd (deg) – Wind direction in degrees

Create a variable to hold the features we wish to consider for our experiment:
mv_features = ['Tdew (degC)', 'sh (g/kg)', 'H2OC (mmol/mol)', 'T (degC)']

We chose four features – Tdew (degC), sh (g/kg), H2OC (mmol/mol), and T (degC). Tdew (degC) is the temperature in Celsius relative to humidity. sh (g/kg) is the specific humidity in grams per kilogram. H2OC (mmol/mol) is the water vapor concentration in millimoles per mole. And T (degC) is the temperature in Celsius. We chose these features for demonstration purposes only. Choice of features should be based on the problem domain.

Create a dataframe to hold the features:
mv_features = df[mv_features]
mv_features.index = df['Date Time']
mv_features.head()
Visualize features:
mv_features.plot(subplots=True)
Visualize a single feature:
mv_features['T (degC)'].plot(subplots=True)

Scale Data

Convert the dataframe to a numpy array:
f_np = mv_features.to_numpy()
f_np[:5]
Check the number of observations:
len(f_np)

As expected, we have 420,551 observations.

Scale data as shown in Listing 10-12.
# scale features
f_sc = tf.keras.utils.normalize(f_np)
print ('shape after tf function:', f_sc.shape, br)
# squeeze
f_sq = tf.squeeze(f_sc)
# convert to numpy
f_scaled = f_sq.numpy()
print ('first five scaled observations:')
print (f_scaled[:5])
Listing 10-12

Scale data

Multistep Model

With relative humidity, we only predicted a single future point. But we can create a model to learn to predict a range of future values, which is what we are going to do with the multivariate data we just established.

Let’s say that we want to train our multistep model to learn to predict for the next 6 hours. Since our data time steps are 10 minutes (one observation every 10 minutes), there are six observations every hour. Given that we want to predict for the next 6 hours, our model makes 36 (6 observations per hour × 6 hours) predictions.

Let’s also say that we want to show our model data from the last 3 days for each sample. Since there are 24 hours in a day, we have 144 (6 × 24) observations each day, so 3 days total 432 (3 × 144) observations. But we sample only once per hour because we don’t expect a drastic change in any of our features within 60 minutes. Thus, each window of data is represented by 72 (432 observations / 6 observations per hour) observations.
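
The same arithmetic in code (a quick supplementary check):
obs_per_hour = 6                    # one reading every 10 minutes
print (6 * obs_per_hour)            # predictions for the next 6 hours = 36
print (3 * 24 * obs_per_hour)       # 3 days of history = 432 observations
print (3 * 24 * obs_per_hour // 6)  # sampled hourly = 72 per window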

Generators

Since we are training on multiple features to predict a range of future values, we create a generator function to build the train and test splits. A generator is a function that returns an iterator object that we iterate over one value at a time. A generator is defined like a normal function, but it produces values with the yield keyword rather than return. So adding the yield keyword automatically turns a normal function into a generator function.

Generators are easy to implement, but a bit difficult to understand. We invoke a generator function in the same way as a normal function, but invoking it merely creates a generator object; we must iterate over that object to see its contents. As we iterate, the function body runs until it reaches a yield statement. At that point, the generator yields a value and hands execution back to the caller (for example, the for loop); when iteration resumes, the body continues from where it left off. So a generator yields one element for each yield statement encountered.
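
A minimal example illustrates the mechanics (a supplementary sketch, not part of the chapter code):
def count_up_to(n):
  # the body runs lazily; each yield hands one value back to the caller
  i = 0
  while i < n:
    yield i
    i += 1
gen = count_up_to(3)  # calling the function only creates the generator object
print (list(gen))     # iterating produces [0, 1, 2]
print (list(gen))     # a second pass yields nothing: the generator is exhausted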

Advantages of Using a Generator

Generator functions let us declare a function that behaves like an iterator, so we can build iterators in a fast, easy, and clean way. An iterator is an object that can be iterated upon; it abstracts a container of data so that it behaves like any other iterable. As programmers, we use iterables such as strings, lists, and dictionaries all the time.

Generators save memory space because they don’t compute the value of each item when instantiated. Generators only compute a value when explicitly asked to do so. Such behavior is known as lazy evaluation. Lazy evaluation is useful when we process large datasets because it allows us to start using data immediately rather than having to wait until the entire dataset is processed. It also saves memory because data is generated only when needed.
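
A quick comparison shows the memory difference (a supplementary sketch):
import sys
eager = [i * i for i in range(1000000)]  # list: all values computed and stored
lazy = (i * i for i in range(1000000))   # generator: values computed on demand
print (sys.getsizeof(eager), 'bytes vs', sys.getsizeof(lazy), 'bytes')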

Generator Caveats

  • Calling a generator function creates a single generator object, no matter how many values it will yield; we assign that one object to a single variable rather than receiving the values directly.

  • Once fully iterated, a generator is exhausted, so the generator function must be called again to produce a fresh generator.

Create a Generator Function

Create a generator as shown in Listing 10-13:
def generator(d, t, o, e, w, ts, s):
  # hold features and labels
  features, labels = [], []
  # initialize variables
  data, target = d, t
  origin, end = o, e
  window, target_size = w, ts
  step = s
  # establish starting point that reflects window size
  origin = origin + window
  # enable split for test data
  if end < 0:
    end = len(data) - target_size
  # create feature set of 'window-sized' elements
  for i in range(origin, end):
    # create index set to identify each window
    indices = range(i-window, i, step)
    # create features
    features.append(data[indices])
    # create labels
    labels.append(target[i:i+target_size])
  yield np.array(features), np.array(labels)
Listing 10-13

Generator function

Generate Train and Test Data

Invoke the generator to create train and test sets as shown in Listing 10-14.
window = 432  # observations for 3 days
future_target = 36  # predictions for the next 6 hours
step = 6  # number of timesteps per hour
train_gen = generator(f_scaled, f_scaled[:, 1], 0,
                      train_split, window,
                      future_target, step)
test_gen = generator(f_scaled, f_scaled[:, 1],
                     train_split, -1, window,
                     future_target, step)
Listing 10-14

Invoke the generator

Notice that we assign the generator object to a single variable!

Reconstitute Generated Tensors

Remake generated data into numpy arrays. Since train and test data are generators, we must iterate to create features and labels.

Begin with train data by iterating the generator to create features and labels lists:
train_f, train_l = [], []
for i, row in enumerate(train_gen):
  train_f.append(row[0])
  train_l.append(row[1])
Convert list data to numpy arrays:
train_features = np.asarray(train_f, dtype=np.float64)
train_labels = np.asarray(train_l, dtype=np.float64)
Inspect shapes:
train_features.shape, train_labels.shape

As expected, features have 72 observations per window and 4 features. And labels have 36 predictions.

Remove the 1 dimension with the tf.squeeze function:
train_features, train_labels = tf.squeeze(train_features), tf.squeeze(train_labels)
train_features.shape, train_labels.shape
Reconstitute test data as shown in Listing 10-15.
# create test data
test_f, test_l = [], []
for i, row in enumerate(test_gen):
  test_f.append(row[0])
  test_l.append(row[1])
# convert lists to numpy arrays
test_features = np.asarray(test_f, dtype=np.float64)
test_labels = np.asarray(test_l, dtype=np.float64)
# squeeze out the '1' dimension created by the generator
test_features, test_labels = tf.squeeze(test_features), tf.squeeze(test_labels)
test_features.shape, test_labels.shape
Listing 10-15

Reconstitute test data

Finish the Input Pipeline

Finish the input pipeline as shown in Listing 10-16.
train_mv = tf.data.Dataset.from_tensor_slices(
    (train_features, train_labels))
train_mv = train_mv.cache().shuffle(
    BUFFER_SIZE).batch(BATCH_SIZE).repeat()
test_mv = tf.data.Dataset.from_tensor_slices(
    (test_features, test_labels))
test_mv = test_mv.batch(BATCH_SIZE).repeat()
Listing 10-16

Finish the input pipeline

Check tensors:
train_mv, test_mv
Check a batch:
for feature, label in train_mv.take(1):
  print (feature.shape)
  print (label.shape)

Each batch contains 256 windows of feature data and 256 labels. Each window has 72 observations with each observation containing 4 features. Each label has 36 predictions.

Check the first training example:
for feature, label in train_mv.take(1):
  print ('observations:', len(feature[0]))
  print (feature[0], br)
  print ('predictions:', len(label[0]))
  print (label[0])

As expected, the first window has 72 observations with 4 features, and the first label has 36 predictions.

Establish input shape:
input_shape_multi = feature.shape[-2:]
input_shape_multi

As expected, window size is 72 with 4 features for each window.

Create the Model

Create the model as shown in Listing 10-17.
# clear any previous models
tf.keras.backend.clear_session()
# generate seed to ensure reproducibility
tf.random.set_seed(0)
neurons = 32  # neurons in GRU layer
outputs = 36  # predictions
gen_model = Sequential([
  GRU(neurons, input_shape=input_shape_multi),
  Dense(outputs)
])
Listing 10-17

Create the model

Model Summary

Inspect model:
gen_model.summary()
Here is the breakdown to calculate learnable parameters for the GRU layer:
  • 3 × (32² + 4 × 32 + 2 × 32)

  • 3 × (1024 + 128 + 64)

  • 3 × 1216

  • 3,648

The only difference from the univariate model is that the input dimension m is now 4 (one value per feature at each time step), so the m × n term becomes 4 × 32 = 128.

The dense layer has 1,188 learnable parameters calculated by multiplying neurons from the previous layer (32) by neurons at the current layer (36) and adding the number of neurons at the current layer (36).
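
Using the gru_params and dense_params helpers sketched earlier, these counts check out as well:
print (gru_params(4, 32))         # 3648
print (dense_params(32, 36))      # 1188
print (gen_model.count_params())  # 3648 + 1188 = 4836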

Compile the Model

Compile:
gen_model.compile(optimizer='adam', loss="mse")

Train the Model

Train:
num_train_steps = 400
epochs = 10
gen_history = gen_model.fit(train_mv, epochs=epochs,
                            steps_per_epoch=num_train_steps,
                            validation_data=test_mv,
                            validation_steps=50)

Generalize on Test Data

Generalize:
test_loss = gen_model.evaluate(test_mv, steps=num_train_steps)

Plot Performance

Plot performance as shown in Listing 10-18.
loss = gen_history.history['loss']
val_loss = gen_history.history['val_loss']
epochs = range(1, len(loss) + 1)
plt.figure()
plt.plot(epochs, loss, 'bo', label='training loss')
plt.plot(epochs, val_loss, 'b', label='validation loss')
plt.title('training and validation loss')
plt.legend()
plt.show()
Listing 10-18

Plot performance

The model is overfitting a bit, but performance is still pretty good.

Plot a Data Window

Create a plotting function as shown in Listing 10-19.
def multi_step_plot(window, true_future, pred):
  plt.figure(figsize=(12, 6))
  num_in = create_time_steps(len(window))
  num_out = len(true_future)
  plt.plot(num_in, np.array(window[:, 1]), 'm',
           label='history')
  plt.plot(np.arange(num_out)/step, np.array(true_future),
           'bo', label='true future')
  if pred.any():
    plt.plot(np.arange(num_out)/step, np.array(pred), 'go',
             label='predicted future')
  plt.legend(loc='upper left')
  plt.show()
Listing 10-19

Plotting function

Plot the first training window from the first batch:
for x, y in train_mv.take(1):
  multi_step_plot(x[0], y[0], np.array([0]))

Make a Prediction

Make a prediction based on the first training window:
for x, y in test_mv.take(1):
  y_pred = gen_model.predict(x)[0]
  multi_step_plot(x[0], y[0], y_pred)

Not bad.
