Analytical tools for diagnostics and feature extraction

Time series data is a sequence of values observed at discrete points in time that are typically evenly spaced (except for missing values). A time series is often modeled as a stochastic process, that is, a collection of random variables, y(t1), ..., y(tT), with one variable for each point in time, ti, i = 1, ..., T. A univariate time series consists of a single value, y, at each point in time, whereas a multivariate time series consists of several observations at each point in time that can be represented by a vector.
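As a minimal sketch of this distinction, a univariate series can be held in a pandas Series with one value per timestamp, and a multivariate series in a DataFrame whose rows are the observation vectors; the data values below are made up for illustration:

```python
import pandas as pd

# Illustrative (made-up) data: a univariate series holds one value per time
# step, while a multivariate series holds a vector of observations per step.
idx = pd.date_range("2020-01-01", periods=4, freq="D")
univariate = pd.Series([1.0, 1.2, 0.9, 1.1], index=idx, name="y")
multivariate = pd.DataFrame({"y1": [1.0, 1.2, 0.9, 1.1],
                             "y2": [10.0, 9.8, 10.3, 10.1]}, index=idx)
print(univariate.shape)    # (4,)  -> one value per point in time
print(multivariate.shape)  # (4, 2) -> a vector per point in time
```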

The number of periods, Δt = ti − tj, between distinct points in time, ti and tj, is called the lag, with T − 1 distinct lags for a time series of length T. Just as relationships between different variables at a given point in time are key for cross-sectional models, relationships between data points separated by a given lag are fundamental to analyzing and exploiting patterns in time series. For cross-sectional models, we distinguished between input and output variables, or target and predictors, with the labels y and x, respectively. In a time series context, the lagged values of the outcome, y, play the role of the input or x values of the cross-sectional context.
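The role of lagged values as predictors can be sketched with the pandas shift method, which aligns each observation with the values that preceded it by a given lag; the series values here are made up:

```python
import pandas as pd

# Sketch: lagged copies of a series serve as the x variables, analogous to
# the predictors of a cross-sectional model. Data values are illustrative.
y = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], name="y")
lags = pd.concat({f"lag_{k}": y.shift(k) for k in (1, 2)}, axis=1)
# Each row pairs y(t) with y(t-1) and y(t-2); the earliest rows contain NaN
# because no prior observations exist at those lags, so we drop them.
data = lags.assign(y=y).dropna()
print(data)
```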

A time series is called white noise if it is a sequence of independent and identically distributed random variables, εt, with finite mean and variance. In particular, the series is called Gaussian white noise if the random variables are normally distributed with a mean of zero and a constant variance of σ2.
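A quick simulation illustrates the definition: draws from a normal distribution have sample moments close to the stated mean of zero and variance of σ2, and independence implies near-zero correlation between observations at any nonzero lag (σ = 1 and the seed are arbitrary choices here):

```python
import numpy as np

# Sketch: simulate Gaussian white noise and verify its defining properties
# empirically; sigma = 1 and the seed are assumptions for illustration.
rng = np.random.default_rng(seed=42)
sigma = 1.0
eps = rng.normal(loc=0.0, scale=sigma, size=100_000)
print(eps.mean())  # close to 0
print(eps.std())   # close to sigma
# Independence implies (near-)zero sample autocorrelation at lag 1:
lag1_corr = np.corrcoef(eps[:-1], eps[1:])[0, 1]
print(lag1_corr)   # close to 0
```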

A time series is linear if it can be written as a weighted sum of past disturbances, εt, also called innovations, which are here assumed to be white noise, and the mean of the series, μ:

y(t) = μ + Σ a_i ε(t−i), summing over i = 0, 1, 2, ..., with a_0 = 1
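A linear series of this kind can be simulated by truncating the weighted sum at a finite number of past innovations; the coefficients, mean, and seed below are arbitrary choices for illustration:

```python
import numpy as np

# Sketch: a linear series truncated at two past innovations,
# y(t) = mu + a_0*eps(t) + a_1*eps(t-1) + a_2*eps(t-2), with made-up weights.
rng = np.random.default_rng(seed=0)
mu = 2.0
a = np.array([1.0, 0.6, 0.3])                 # a_0 = 1 by convention
eps = rng.standard_normal(1_000 + len(a) - 1)  # white-noise innovations
# np.convolve with mode="valid" forms the weighted sum over each window
y = mu + np.convolve(eps, a, mode="valid")
print(y.shape)   # (1000,)
print(y.mean())  # close to mu
```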

A key goal of time series analysis is to understand the dynamic behavior driven by the coefficients, ai. The analysis of time series offers methods tailored to this type of data with the goal of extracting useful patterns that, in turn, help us build predictive models. We will introduce the most important tools for this purpose, including the decomposition into key systematic elements, the analysis of autocorrelation, and rolling window statistics such as moving averages. Linear time series models often make certain assumptions about the data, such as stationarity, so we will also introduce the concept itself, diagnostic tools for it, and typical transformations to achieve stationarity.
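Two of the tools named above, rolling window statistics and autocorrelation, are available directly on a pandas Series; the simulated random walk below (an arbitrary choice) also previews why differencing is a typical transformation to achieve stationarity:

```python
import numpy as np
import pandas as pd

# Sketch of two diagnostics named in the text: a moving average over a
# rolling window and the lag-1 autocorrelation. The random walk is simulated.
rng = np.random.default_rng(seed=1)
y = pd.Series(rng.standard_normal(500).cumsum())  # random walk: non-stationary
ma = y.rolling(window=21).mean()                  # 21-period moving average
print(ma.isna().sum())       # the first 20 windows are incomplete
print(y.autocorr(lag=1))     # near 1 for a random walk
# Differencing is a typical transformation to achieve stationarity:
print(y.diff().dropna().autocorr(lag=1))  # near 0 after differencing
```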

For most of the examples in this chapter, we work with data provided by the Federal Reserve that you can access using the pandas-datareader library that we introduced in Chapter 2, Market and Fundamental Data. The code examples for this section are available in the tsa_and_arima notebook.
