Understanding the bias and variance characteristics

The bias and variance characteristics are used to predict model behavior, and both terms are used throughout machine learning. Before we describe what they mean, we should consider validation. Validation is a technique for testing model performance: it estimates how well the model makes predictions on new data, that is, data that was not used in the training process. To perform validation, we usually divide the initial dataset into two or three parts. One part should contain most of the data and is used for training, while the others are used to validate and test the model. For iterative algorithms, validation is usually performed after each training cycle (often called an epoch), whereas testing is performed once, after the overall training process is complete.
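The three-way split described above can be sketched in a few lines. This is a minimal illustration rather than a prescribed procedure: the function name `split_dataset`, the 70/15/15 proportions, and the fixed seed are all assumptions chosen for the example.

```python
import random


def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle samples and split them into train, validation, and test parts.

    The fractions are illustrative defaults; whatever is left after the
    training and validation parts becomes the test part.
    """
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1:
        raise ValueError("fractions must lie between 0 and 1")
    if train_frac + val_frac > 1:
        raise ValueError("train_frac + val_frac must not exceed 1")

    shuffled = list(samples)               # copy, so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


data = list(range(100))
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))
```

Shuffling before splitting is one simple way to keep the three parts close to the same distribution, provided the samples are independent of each other.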

The validation and testing operations evaluate the model on data that was excluded from the training process and produce values of the performance metrics that we chose for this particular model. These values can be used to estimate the model's prediction error trends. The most crucial requirement for validation and testing is that their data should always come from the same distribution as the training data.

Throughout the rest of this chapter, we will use the polynomial regression model to show different prediction behaviors. The polynomial degree will be used as a hyperparameter.
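As a rough sketch of what treating the polynomial degree as a hyperparameter looks like in practice, the following example fits polynomials of several degrees to a training set and compares the training and validation errors. Everything specific here is an assumption made for illustration: the synthetic data (a noisy sine curve), the mean squared error metric, the range of degrees, and the helper names `fit_polynomial`, `predict`, `mse`, and `make_data`. The fit uses the normal equations solved by Gaussian elimination, which is adequate for low degrees but not a numerically robust general-purpose solver.

```python
import math
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients c[0..degree]."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y for the matrix A[i][j] = xs[i] ** j.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(ata[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** j for j, c in enumerate(coeffs))


def mse(coeffs, xs, ys):
    """Mean squared error of the polynomial on the given samples."""
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(42)


def make_data(n):
    """Synthetic samples: a sine curve plus Gaussian noise (an assumption)."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [math.sin(3.0 * x) + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


# Training and validation data come from the same distribution.
train_x, train_y = make_data(40)
val_x, val_y = make_data(20)

results = {}
for degree in range(1, 7):
    coeffs = fit_polynomial(train_x, train_y, degree)
    results[degree] = (mse(coeffs, train_x, train_y), mse(coeffs, val_x, val_y))
    print(f"degree={degree} train_mse={results[degree][0]:.4f} "
          f"val_mse={results[degree][1]:.4f}")

# Choose the degree with the lowest validation error.
best_degree = min(results, key=lambda d: results[d][1])
print("best degree by validation error:", best_degree)
```

The training error can only stay the same or decrease as the degree grows, because each higher-degree model contains the lower-degree ones. The validation error is therefore the more informative quantity for choosing the degree.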
