Convolutional neural networks for time series forecasting

Convolutional neural networks (CNNs) were developed for, and remain very popular in, the image classification domain. However, they can also be applied to 1-dimensional problems, such as predicting the next value in a sequence, be it a time series or the next word in a sentence.

In the following diagram, we present a simplified schema of a 1D CNN:

Based on the preceding diagram, we briefly describe the elements of a typical CNN architecture:

  • Convolutional layer: The goal of this layer is to apply convolutional filtering to extract potential features.
  • Pooling layer: This layer reduces the size of the image or series while preserving the important characteristics identified by the convolutional layer.
  • Fully connected layer: Usually, there are a few fully connected layers at the end of the network to map the features extracted by the network to classes or values.

The convolutional layers read the input (such as a 1D time series) and slide a kernel (of a specified, tunable length) over the series. The kernel represents the features we want to locate in the series and has the same width as the time series, so it can only move in one direction, from the beginning of the time series toward its end. At each step, the input values under the kernel are multiplied element-wise by the kernel's weights and summed, and a non-linear activation function is applied to the result. This way, the original input series is transformed into an interpretation of the input called the feature map (also referred to as the filter map). The next step of the CNN architecture is to apply a pooling layer (maximum or average), which reduces the size of the series while preserving the characteristics identified by the convolution.
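The sliding kernel, activation, and pooling described above can be sketched in a few lines of NumPy. The series and kernel values here are hypothetical; in a real CNN the kernel weights are learned during training:

```python
import numpy as np

# A toy univariate time series and a kernel (filter) of length 3.
# Both are made-up values for illustration only.
series = np.array([1.0, 2.0, 4.0, 3.0, 1.0, 0.0, 2.0, 5.0])
kernel = np.array([0.5, 1.0, -0.5])

# Slide the kernel over the series: at each step, multiply the input
# values under the kernel element-wise by the kernel weights, sum them,
# and apply a non-linear activation (ReLU) to produce the feature map.
steps = len(series) - len(kernel) + 1
feature_map = np.array([
    max(0.0, np.dot(series[i:i + len(kernel)], kernel))
    for i in range(steps)
])

# Max pooling with a window (and stride) of 2 halves the length of the
# feature map while keeping the strongest activation in each window.
pooled = feature_map[: len(feature_map) // 2 * 2].reshape(-1, 2).max(axis=1)

print(feature_map)  # [0.5 3.5 4.5 2.5 0.  0. ]
print(pooled)       # [3.5 4.5 0. ]
```

Note how the feature map is shorter than the input (no padding is used here) and how pooling halves it again, which is exactly the size reduction described above.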

Convolutional and pooling layers can be stacked on top of each other to provide multiple layers of abstraction (see popular CNN architectures such as AlexNet, VGG-16, Inception, and ResNet50). Alternatively, the results of the pooling layer can be passed to a fully connected (dense) layer.
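As a sketch of such stacking, the following (hypothetical) PyTorch model chains two convolution/pooling blocks and ends with a fully connected head; the channel counts and the input window of 20 time steps are arbitrary choices for illustration:

```python
import torch
from torch import nn

# Two stacked conv + pooling blocks, then a dense head that maps the
# extracted features to a single predicted value.
model = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=2),          # 20 time steps -> 10
    nn.Conv1d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=2),          # 10 time steps -> 5
    nn.Flatten(),
    nn.Linear(32 * 5, 1),                 # assumes an input window of 20
)

# A batch of 4 windows, each with 1 channel and 20 time steps.
x = torch.randn(4, 1, 20)
print(model(x).shape)  # torch.Size([4, 1])
```

Each pooling layer halves the temporal dimension, so the dense layer's input size must match the flattened output of the last block (32 channels x 5 remaining time steps here).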

Some of the benefits of using 1D CNNs for time series forecasting include:

  • 1D CNNs can be very effective for discovering features in fixed-length segments of the entire dataset (for example, predicting the next value based on past n observations), mostly for cases in which it is not important where the feature is located within the segment.
  • CNNs are able to extract informative features that are independent of the time component (translation invariant). The network can identify a pattern at one position in the series (for example, the beginning) and then locate it at another position (for example, at the end of the sequence) and use it to make a prediction of the target.
  • The process of feature learning, in which the model learns the internal representation of the one (or higher) dimensional input, removes the need for domain knowledge for manual feature creation.
  • 1D networks allow for larger filter sizes: in a 1D setting, a filter of size 4 covers 4 observations, while in a 2D setting, a filter of the same size (4×4) covers 16 pixels, so 1D filters can be enlarged at a much lower computational cost.
  • CNNs are considered to be noise-resistant.
  • 1D CNNs are computationally cheaper than RNNs and sometimes perform better.

In this recipe, we will learn how to train a CNN for one-step-ahead forecasting of stock prices.
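Before training such a model, the price series must be converted into fixed-length input windows paired with the next value as the target. A minimal sketch of this preprocessing step, using a made-up stand-in series and a hypothetical helper name:

```python
import numpy as np

def create_windows(series, window_size):
    """Split a 1D series into (input window, next value) pairs
    for one-step-ahead forecasting."""
    X, y = [], []
    for i in range(len(series) - window_size):
        X.append(series[i:i + window_size])
        y.append(series[i + window_size])
    return np.array(X), np.array(y)

# A stand-in for a stock price series, for illustration only.
prices = np.arange(10, dtype=float)
X, y = create_windows(prices, window_size=3)
print(X.shape, y.shape)  # (7, 3) (7,)
print(X[0], y[0])        # [0. 1. 2.] 3.0
```

Each row of `X` is one window of past observations (the model's input), and the corresponding entry of `y` is the value immediately following that window (the one-step-ahead target).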
