Linear models and non-linear data

In Chapter 3, Modeling with Linear Regression, and Chapter 4, Generalizing Linear Models, we learned to build models of the general form:

$$\theta = \psi(\phi(X)\beta)$$

Here, $\theta$ is a parameter for some probability distribution, for example, the mean of a Gaussian, the $p$ parameter of a binomial, the rate of a Poisson distribution, and so on. We call $\psi$ the inverse link function, and $\phi$ is a function such as the square root or a polynomial function. For the simple linear regression case, $\psi$ is the identity function.
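As a concrete instance of this template (a worked example consistent with Chapter 4, not a quote from it), logistic regression takes $\theta$ to be the probability $p$ of a binomial, $\psi$ to be the logistic function, and $\phi$ to be the identity:

$$p = \psi(X\beta) = \frac{1}{1 + e^{-X\beta}}$$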

Fitting (or learning) a Bayesian model can be seen as finding the posterior distribution of the weights $\beta$, and thus this is known as the weight-view of approximating functions. As we have already seen with the polynomial regression example, by letting $\phi$ be a non-linear function, we can map the inputs onto a feature space. We then fit a relation that is linear in the feature space but not linear in the actual space. We saw that by using a polynomial of the proper degree, we can perfectly fit any function. But unless we apply some form of regularization (for example, using prior distributions), this will lead to models that memorize the data or, in other words, models with very poor generalization properties. A minimal sketch of this weight-view is shown below.
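The following sketch (ours, not taken from the earlier chapters; the synthetic data and variable names are assumptions for illustration) uses NumPy to build a cubic feature mapping $\phi$ and PyMC3 to place Gaussian priors on the weights, which provide the regularization just mentioned:

```python
import numpy as np
import pymc3 as pm

# Synthetic data, assumed for illustration only
np.random.seed(42)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + np.random.normal(0, 0.1, len(x))

# phi: map the inputs onto a polynomial feature space [1, x, x^2, x^3]
degree = 3
X = np.vander(x, degree + 1, increasing=True)

with pm.Model() as poly_model:
    # Priors over the weights act as regularization
    beta = pm.Normal('beta', mu=0, sd=1, shape=degree + 1)
    sigma = pm.HalfNormal('sigma', sd=1)
    # Linear in the feature space, non-linear in x
    mu = pm.math.dot(X, beta)
    pm.Normal('obs', mu=mu, sd=sigma, observed=y)
    trace = pm.sample(1000)  # posterior over the weights
```

Raising the polynomial degree while loosening the priors on $\beta$ reproduces the memorization problem described above.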

Gaussian processes provide a principled solution to modeling arbitrary functions by effectively letting the data decide on the complexity of the function, while avoiding, or at least minimizing, the chance of overfitting.

The following sections explain Gaussian processes from a very practical point of view; we skip almost all of the mathematics surrounding them. For a more formal explanation, please check the resources listed in Chapter 9, Where To Go Next?
