Univariate time series forecasting

With this task, the objective is to produce a univariate forecast of the surface temperature, choosing among an exponential smoothing model, an ARIMA model, and an ensemble of methods that includes a neural network. We'll train the models and measure their predictive accuracy on an out-of-time test set, just as we've done in other learning tasks. The following code creates the train and test sets:

> temp_ts <- ts(climate$Temp, start = 1919, frequency = 1)

> train <- window(temp_ts, end = 2007)

> test <- window(temp_ts, start = 2008)

To build our exponential smoothing model, we'll use the ets() function found in the forecast package. The function will find the best model with the lowest AIC:

> fit.ets <- forecast::ets(train)

> fit.ets
ETS(A,A,N)

Call:
forecast::ets(y = train)

Smoothing parameters:
alpha = 0.3429
beta = 1e-04

Initial states:
l = -0.2817
b = 0.0095

sigma: 0.1025

AIC AICc BIC
-0.1516906 0.5712010 12.2914912

The model object returns a number of parameters of interest. The first thing to check is what (A,A,N) means. It tells us that the selected model has additive errors, an additive trend, and no seasonal component; in other words, Holt's linear trend method with additive errors. The first letter denotes the error type, the second letter the trend, and the third letter the seasonality. The possible letters are as follows:

  • A = additive
  • M = multiplicative
  • N = none
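The automatic search is convenient, but ets() also lets you force a particular structure through its model argument, which is a three-letter string built from these same codes (a Z in any position means that component is selected automatically). A minimal sketch, assuming you wanted to impose the additive error, additive trend, no seasonality form chosen above:

# force ETS(A,A,N) rather than searching over all admissible forms
fit.aan <- forecast::ets(train, model = "AAN")

# the default, model = "ZZZ", lets ets() pick every component itself
fit.auto <- forecast::ets(train, model = "ZZZ")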

We also see the parameter estimates: alpha is the smoothing parameter for the level (error correction), and beta is the smoothing parameter for the slope. The initial states are the values used to start the estimation, sigma is the standard deviation of the residuals, and the information criteria are provided. You can plot how the level and slope estimates change over time:

> forecast::autoplot(fit.ets)

The output of the preceding code is as follows:

We'll now plot the forecast and see how well it performs visually against the test data:

> plot(forecast::forecast(fit.ets, h = 6))

> lines(test, type = "o")

The output of the preceding code is as follows:
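If you prefer ggplot2 graphics, roughly the same picture can be drawn with autoplot() and autolayer() from the forecast package; this is a sketch and assumes ggplot2 is installed:

library(ggplot2)

# forecast with prediction intervals, plus the held-out observations
forecast::autoplot(forecast::forecast(fit.ets, h = 6)) +
  forecast::autolayer(test, series = "Actual")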

Looking at the plot, it seems that this forecast is showing a slight linear uptrend and is overestimating versus the actual values. We'll now look at the accuracy measures for the model:

> fit.ets %>% forecast::forecast(h = 6) %>%
forecast::accuracy(temp_ts)
ME RMSE MAE MPE MAPE MASE
Training set -0.00160570 0.10012357 0.08052241 -Inf Inf 0.8752436
Test set -0.06410776 0.08303044 0.07086704 -14.90784 16.12354 0.7702939
ACF1 Theil's U
Training set 0.1058923 NA
Test set -0.1743445 0.7940449

There are eight measures of error. The one I believe we should focus on is Theil's U (strictly speaking U2, as the original Theil's U had some flaws), which is available only for the test data. Theil's U is an interesting statistic as it isn't dependent on scale, so you can compare multiple models. For instance, if in one model you transform the time series using a logarithmic scale, you can still compare the statistic with a model that doesn't transform the data. You can think of it as a measure of how much the forecast improves on a naive forecast; equivalently, it can be described as the root mean square error (RMSE) of the model divided by the RMSE of a naive model. Therefore, a Theil's U statistic greater than 1 means the model performs worse than a naive forecast, a value of 1 means it matches the naive forecast, and a value less than 1 indicates the model outperforms it. Further discussion of how the statistic is derived is available at this link: http://www.forecastingprinciples.com/data/definitions/theil's%20u.html.
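As a rough sanity check on that interpretation, you can compare the model's RMSE on the test set with that of a naive forecast. This is only a sketch of the ratio idea: the value reported by accuracy() is the U2 statistic, which works on relative changes, so the two numbers won't match exactly:

# simple RMSE helper
rmse <- function(e) sqrt(mean(e^2))

# naive (last observed value) forecast over the same horizon
naive.fc <- forecast::naive(train, h = 6)
ets.fc <- forecast::forecast(fit.ets, h = 6)

# ratio of forecast RMSEs: values under 1 favor the model over naive
rmse(test - ets.fc[["mean"]]) / rmse(test - naive.fc[["mean"]])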

The smoothing model provided a statistic of 0.7940449. That isn't very impressive even though it's below one. We should strive for values at or below 0.5, in my opinion.

We'll now develop an ARIMA model, using auto.arima(), which is also from the forecast package. There are many options that you can specify in the function, or you can just include your time series data and it will find the best ARIMA fit. I recommend using the function with caution, as it can often return a model that violates assumptions for the residuals, as we shall see:

> fit.arima <- forecast::auto.arima(train)

> summary(fit.arima)
Series: train
ARIMA(1,1,1) with drift

Coefficients:
ar1 ma1 drift
0.2089 -0.7627 0.0087
s.e. 0.1372 0.0798 0.0033

sigma^2 estimated as 0.01021: log likelihood=78.09
AIC=-148.18 AICc=-147.7 BIC=-138.28

Training set error measures:
ME RMSE MAE MPE MAPE MASE
Training set -8.396214e-05 0.09874311 0.07917484 Inf Inf 0.8605961
ACF1
Training set 0.02010508

The abbreviated output shows that the model selected has one autoregressive term, one order of differencing, and one moving average term, or ARIMA(1,1,1) with drift (the drift term is equivalent to an intercept in the differenced data, that is, a linear trend in the undifferenced series). We can examine the plot of its performance on the test data in the same fashion as before:

> plot(forecast::forecast(fit.arima, h = 6))

> lines(test, type = "o")

The output of the preceding code is as follows: 

This is very similar to the prior method. Let's check those accuracy statistics, of course with a focus on Theil's U:

> fit.arima %>% forecast::forecast(h = 6) %>%
forecast::accuracy(temp_ts)
ME RMSE MAE MPE MAPE MASE
Training set -8.396214e-05 0.09874311 0.07917484 Inf Inf 0.8605961
Test set -4.971043e-02 0.07242892 0.06110011 -11.84965 13.89815 0.6641316
ACF1 Theil's U
Training set 0.02010508 NA
Test set -0.18336583 0.6729521

The forecast error is slightly better with the ARIMA model. You should always review the residuals of your models, especially with ARIMA, which relies on the assumption of no serial correlation in those residuals:

> forecast::checkresiduals(fit.arima)

Ljung-Box test

data: Residuals from ARIMA(1,1,1) with drift
Q* = 18.071, df = 7, p-value = 0.01165

Model df: 3. Total lags used: 10

The output of the preceding code is as follows:

First of all, take a look at the Ljung-Box Q test. The null hypothesis is that the correlations in the residuals are zero, and the alternative is that the residuals exhibit serial correlation. We see a significant p-value, so we can reject the null. This is confirmed visually in the ACF plot of the residuals, where significant correlation exists at lag 4 and lag 10. With serial correlation present, the model coefficients are unbiased, but the standard errors and any statistics that rely on them are wrong. This fact may require you to select an appropriate ARIMA model manually through trial and error. Explaining how to do that in full would require a separate chapter, so it's out of scope for this book.
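That said, here's a minimal sketch of what the trial-and-error process can look like, fitting a couple of candidate orders with Arima() and comparing their information criteria and residual diagnostics. The orders shown are arbitrary illustrations, not recommendations:

# candidate specifications fit on the same training data
fit.a <- forecast::Arima(train, order = c(2, 1, 1), include.drift = TRUE)
fit.b <- forecast::Arima(train, order = c(0, 1, 2), include.drift = TRUE)

# compare corrected AIC across candidates...
c(fit.a[["aicc"]], fit.b[["aicc"]])

# ...and re-run the residual diagnostics on the preferred one
forecast::checkresiduals(fit.a)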

With a couple of relatively weak models, we could try other methods, but let's look at creating an ensemble similar to what we produced in Chapter 8, Creating Ensembles and Multiclass Methods. We'll put together the two models just created and add a feed-forward neural network from the nnetar() function available in the forecast package. We won't stack the models, but simply take the average of the three models for comparison on the test data.

The first step in this process is to develop the forecasts for each of the models. This is straightforward:

> ETS <- forecast::forecast(forecast::ets(train), h = 6)

> ARIMA <- forecast::forecast(forecast::auto.arima(train), h = 6)

> NN <- forecast::forecast(forecast::nnetar(train), h = 6)

The next step is to create the ensemble values, which again is just a simple average:

> ensemble.fit <-
(ETS[["mean"]] + ARIMA[["mean"]] + NN[["mean"]]) / 3

The comparison step is something of an open canvas for you to produce the statistics you want. Notice that I'm pulling only the test-data accuracy and only Theil's U. You can pull other stats, such as RMSE or MAPE, should you so desire:

> c(ets = forecast::accuracy(ETS, temp_ts)["Test set", c("Theil's U")],
arima = forecast::accuracy(ARIMA, temp_ts)["Test set", c("Theil's U")],
nn = forecast::accuracy(NN, temp_ts)["Test set", c("Theil's U")],
ef = forecast::accuracy(ensemble.fit, temp_ts)["Test set", c("Theil's U")])
ets arima nn ef
0.7940449 0.6729521 0.6794704 0.7104893

This is interesting, I think, as the exponential smoothing is dragging the ensemble performance down, while ARIMA and the neural net are almost equal. Just for a visual comparison, let's plot the neural network forecast:

> plot(NN)

> lines(test, type = "o")

The output of the preceding code is as follows:
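Since the smoothing model seems to be the weak link, one quick experiment, purely a sketch and not part of the workflow above, is to drop it and average only the ARIMA and neural network forecasts:

# ensemble of the two stronger models only
ensemble.2 <- (ARIMA[["mean"]] + NN[["mean"]]) / 2

# Theil's U on the test data for the reduced ensemble
forecast::accuracy(ensemble.2, temp_ts)["Test set", "Theil's U"]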

What are we to do with all of this? Here are a couple of thoughts. If you look at the time series pattern, you notice that it goes through what we could call different structural changes. There are a number of R packages to examine this structure and determine a point where it makes more sense to start the time series for forecasting. For example, there seems to be a discernible change in the slope of the time series in the mid-1960s. When you do this with your data, you're throwing away what may be valuable data points, so judgment comes into play. The implication is that if you want to totally automate your time series models, you'll need to take this into consideration.

You might try to transform the entire time series with log values (this doesn't work well with negative values) or a Box-Cox transformation. In the forecast package, you can set lambda = "auto" in your model function. I did this and the performance didn't improve. For the sake of example, let's try to detect structural changes and build an ARIMA model on a selected starting point. I'll demonstrate structural change detection with the strucchange package, which computationally determines changes in linear regression relationships. You can find a full discussion and vignette on the package at this link: https://cran.r-project.org/web/packages/strucchange/vignettes/strucchange-intro.pdf.

I find this method useful in discussions with stakeholders, as it helps them to understand when, and even why, the underlying data-generating process changed. Here goes:

> temp_struc <- strucchange::breakpoints(temp_ts ~ 1)

> summary(temp_struc)

Optimal (m+1)-segment partition:

Call:
breakpoints.formula(formula = temp_ts ~ 1)

Breakpoints at observation number:

m = 1 68
m = 2 60 78
m = 3 18 60 78
m = 4 18 45 60 78
m = 5 17 31 45 60 78

Corresponding to breakdates:

m = 1 1986
m = 2 1978 1996
m = 3 1936 1978 1996
m = 4 1936 1963 1978 1996
m = 5 1935 1949 1963 1978 1996
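Before interpreting the table, note that you can also plot the breakpoints object to see how the residual sum of squares and BIC change with the number of breaks, and pull the breakdates of the BIC-optimal segmentation directly; a quick sketch:

# RSS and BIC against the number of breakpoints
plot(temp_struc)

# breakdates for the optimal (minimum BIC) segmentation
strucchange::breakdates(temp_struc)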

The summary reports the optimal breakpoints for one through five breaks, returning the information both as observation numbers and as years. Sure enough, 1963 shows up as a structural change, but 1978 and 1996 qualify as well. Let's pursue the 1963 break as the start of our time series for an ARIMA model:

> train_bp <- window(temp_ts, start = 1963, end = 2007)

> fit.arima2 <- forecast::auto.arima(train_bp)

> fit.arima2 %>% forecast::forecast(h = 6) %>%
forecast::accuracy(temp_ts)
ME RMSE MAE MPE MAPE
Training set -0.007696066 0.1034046 0.08505900 53.68130 99.93869
Test set -0.086625082 0.1017767 0.08676477 -19.61829 19.64341
MASE ACF1 Theil's U
Training set 0.7951128 0.09310454 NA
Test set 0.8110579 -0.08291170 1.057287

There you have it: much to my surprise, the performance is even worse than a naive forecast, but at least we've covered how to implement the methodology.
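If you want to push on this, the same comparison is easy to repeat from the other candidate breakdates, for example 1978. This is just a sketch of the mechanics; I make no claim about how it performs:

# restart the training window at the 1978 breakpoint and refit
train_bp2 <- window(temp_ts, start = 1978, end = 2007)

forecast::auto.arima(train_bp2) %>%
  forecast::forecast(h = 6) %>%
  forecast::accuracy(temp_ts)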

With this, we've completed the building of a univariate forecast model for the surface temperature anomalies, and now we'll move on to the next task of seeing whether CO2 levels cause these anomalies.
