Linear regression for inference and prediction

As the name suggests, linear regression models assume that the output is the result of a linear combination of the inputs. The model also assumes a random error term that allows each observation to deviate from the expected linear relationship. The reasons the model does not describe the relationship between inputs and output in a perfectly deterministic way include, for example, missing variables, measurement error, or data collection issues.
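In its general form, the relationship between an output y and p inputs can be written as follows (standard notation, not specific to this text):

\[
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i, \qquad i = 1, \dots, N
\]

where the coefficients \(\beta_0, \dots, \beta_p\) capture the linear relationship and \(\varepsilon_i\) is the random error for observation i.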

If we want to draw statistical conclusions about the true (but unobserved) linear relationship in the population based on the regression parameters estimated from the sample, we need to add assumptions about the statistical nature of these errors. The baseline regression model makes the strong assumption that the distribution of the errors is identical across observations and that the errors are independent of each other, that is, knowing one error does not help to forecast the next. The assumption of independent and identically distributed (iid) errors implies that their covariance matrix is the identity matrix multiplied by a constant representing the error variance.
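In matrix form, and using the notation from the preceding equation, the iid assumption amounts to the following (a standard formulation, not quoted from the original text):

\[
\mathbb{E}[\varepsilon] = 0, \qquad \operatorname{Cov}(\varepsilon) = \mathbb{E}\big[\varepsilon \varepsilon^{\top}\big] = \sigma^2 I_N
\]

where \(\sigma^2\) is the error variance and \(I_N\) is the N-by-N identity matrix.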

These assumptions guarantee that the OLS method delivers estimates that are not only unbiased but also efficient, that is, they have the lowest sampling error among linear learning algorithms. However, these assumptions are rarely met in practice. In finance, we often encounter panel data with repeated observations on a given cross-section. The attempt to estimate the systematic exposure of a universe of assets to a set of risk factors over time typically surfaces correlation in the time or cross-sectional dimension, or both. Hence, alternative learning algorithms have emerged that accommodate error covariance matrices that differ from multiples of the identity matrix.
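To make this concrete, the following sketch contrasts baseline OLS with a cluster-robust covariance estimate using statsmodels; the data and variable names are synthetic and purely illustrative:

import numpy as np
import statsmodels.api as sm

# Illustrative panel: 100 assets observed over 60 periods
rng = np.random.default_rng(0)
n_assets, n_periods = 100, 60
factor = rng.normal(size=n_assets * n_periods)       # a single risk factor
groups = np.repeat(np.arange(n_assets), n_periods)   # asset identifier for each row
returns = 0.5 * factor + rng.normal(size=factor.size)

X = sm.add_constant(factor)

# Baseline OLS assumes iid errors
ols = sm.OLS(returns, X).fit()

# Cluster-robust covariance allows errors to be correlated within each asset
robust = sm.OLS(returns, X).fit(cov_type='cluster', cov_kwds={'groups': groups})

print(ols.bse)     # standard errors under the iid assumption
print(robust.bse)  # standard errors allowing within-asset correlation

The point estimates are identical in both cases; only the estimated sampling variability of the coefficients changes once the iid assumption is relaxed.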

On the other hand, methods that learn biased parameters for a linear model may yield estimates with a lower variance and, hence, improve the predictive performance. Shrinkage methods reduce the model complexity by applying regularization that adds a penalty term to the linear objective function. The penalty is positively related to the absolute size of the coefficients so that these are shrunk relative to the baseline case. Larger coefficients imply a more complex model that reacts more strongly to variations in the inputs. Properly calibrated, the penalty can limit the growth of the model's coefficients beyond what an optimal bias-variance trade-off would suggest.
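As a minimal sketch of how the penalty works, ridge regression adds the squared L2 norm of the coefficients to the least-squares objective and lasso adds the L1 norm; both are available in scikit-learn (the data below is synthetic and the penalty strengths are arbitrary):

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))
y = X @ np.array([1.5, -2.0] + [0.0] * 8) + rng.normal(scale=0.5, size=500)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)    # alpha scales the L2 penalty
lasso = Lasso(alpha=0.1).fit(X, y)     # alpha scales the L1 penalty

# Penalized fits shrink coefficients toward zero relative to OLS
print(np.abs(ols.coef_).sum(), np.abs(ridge.coef_).sum(), np.abs(lasso.coef_).sum())

A larger alpha shrinks the coefficients more aggressively; in the lasso case, some coefficients are driven exactly to zero, so the penalty also performs variable selection.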

We will introduce the baseline cross-section and panel techniques for linear models and important enhancements that produce accurate estimates when key assumptions are violated. We will then illustrate these methods by estimating factor models that are ubiquitous in the development of algorithmic trading strategies. Lastly, we will focus on regularization methods.
