Bias and Variance

Variance and bias are another way of saying overfitting and underfitting respectively, as discussed in Chapter 2, Deep Learning and Convolutional Neural Networks. We can diagnose the problems of underfitting and overfitting using the train set, dev set, and test set errors.

Consider the following scenario, where we have data coming from two different distributions, named Distribution 1 and Distribution 2. Distribution 2 represents the target application that we care about. The question is, how do we define the train, dev, and test sets on such distributions?

The best way to do so is to split the data according to the preceding figure. Distribution 1 is split into the train set, and part of it is used as a dev set; we call this the train-dev set, because it has the same distribution as the train set. Distribution 1 is used mainly for training, as it is a large dataset. Distribution 2 is split into a test set and a dev set, which are independent of either set from Distribution 1. One thing to emphasize here is that the test and dev sets should come from the same distribution and belong to the application we actually care about, that is, the target application. The dev and test sets are usually small, as their purpose is to give an unbiased performance estimate of the model/algorithm.
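
To make this split concrete, the following is a minimal NumPy sketch of how the four partitions could be created. The array names (dist1_data, dist2_data), their sizes, and the split fractions are placeholder assumptions for illustration, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical data: Distribution 1 is large (for example, easily collected web data),
# Distribution 2 is the smaller data from the target application we care about.
dist1_data = rng.normal(size=(100_000, 64))   # placeholder feature vectors
dist2_data = rng.normal(size=(10_000, 64))

# Shuffle before splitting so every partition is representative of its distribution.
dist1_data = dist1_data[rng.permutation(len(dist1_data))]
dist2_data = dist2_data[rng.permutation(len(dist2_data))]

# Distribution 1 -> train set + train-dev set (same distribution as the training data).
n_train_dev = int(0.05 * len(dist1_data))
train_dev_set = dist1_data[:n_train_dev]
train_set = dist1_data[n_train_dev:]

# Distribution 2 (the target application) -> dev set + test set.
n_dev = len(dist2_data) // 2
dev_set = dist2_data[:n_dev]
test_set = dist2_data[n_dev:]

print(len(train_set), len(train_dev_set), len(dev_set), len(test_set))
```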


Comparing the model's errors on the different dataset partitions, and looking at the human-level error, gives us insight into diagnosing problems of bias and variance.

The following table shows what the diagnosis should be when there is a difference in error between the sets in the left column. Note that human-level error is the benchmark in this analysis; it gives us a baseline to compare our model against.

This can be explained better by the following tables. In these examples, we assume the optimal/human error in all cases to be minimal, that is, 1%. Normally, deep learning models can reach accuracy similar to humans, so having this as a comparison helps guide the search for a good architecture. A small diagnostic sketch summarizing all of these cases follows the tables.

  • High bias/underfitting 

Human-level/optimal error: 1%

Training error: 15%

Having a high training error compared to human-level performance means that the model is not able to fit even the data it is trained on, and is thus underfitting (high bias). However, when we look at the dev error in this case, the model generalizes well to it, so all is not lost.

  • High variance/overfitting

Training error: 1.5%

Train-dev error: 30%

In this case, the model doesn't perform well on unseen data that belongs to the same distribution as the training set but was not part of training. This means that the model is not able to generalize, and hence overfits the training data.

  • High variance and high bias

Training error: 20%

Train-dev error: 40%

This situation is the worst case, as we observe that the model is neither able to fit the training data properly nor able to generalize well. This can be solved by changing the model architecture.

  • Data mismatch

Train-dev error: 2%

Dev error: 15%

Here the model fits well on the train-dev set, which comes from the same distribution as the training set, but performs badly on the dev set, which comes from a different distribution. This is the data mismatch problem, as discussed earlier in the chapter.

  • Overfit dev set

Dev error: 2%

Test error: 15%

Here the model performs much better on the dev set than on the test set, which indicates that repeated evaluation and hyperparameter tuning against the dev set has caused the model to overfit it; a larger dev set is typically needed in this case.
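
The logic of the preceding tables can be condensed into a small helper. The following is a minimal sketch, not a definitive rule: the function name, the 2% gap threshold, and the use of fractional errors are assumptions made here for illustration.

```python
def diagnose(human_err, train_err, train_dev_err, dev_err, test_err, gap=0.02):
    """Rough bias/variance diagnosis from the error gaps described above.

    All errors are fractions (for example, 0.15 for 15%); `gap` is an assumed
    threshold for what counts as a significant difference.
    """
    findings = []
    if train_err - human_err > gap:
        findings.append("high bias / underfitting (train error far above human level)")
    if train_dev_err - train_err > gap:
        findings.append("high variance / overfitting (train-dev error far above train error)")
    if dev_err - train_dev_err > gap:
        findings.append("data mismatch (dev error far above train-dev error)")
    if test_err - dev_err > gap:
        findings.append("overfit dev set (test error far above dev error)")
    return findings or ["errors look consistent; no obvious problem"]

# Example using the data mismatch numbers from the tables above.
print(diagnose(human_err=0.01, train_err=0.015, train_dev_err=0.02,
               dev_err=0.15, test_err=0.16))
```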

The solutions/guidelines to address the problems mentioned above are presented in the form of a flowchart in the following diagram:

ML basic recipe

A useful graph that illustrates how the test and train errors vary with model complexity is given as follows. On the one hand, when the model is too complex, it tends to overfit the training data; hence the train error decreases while the test error increases. On the other hand, a simpler model tends to underfit and fails to generalize. The ideal range for model complexity lies somewhere before the test error starts increasing and while the train error is still approaching zero.
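
The following is a minimal sketch of how such a curve could be produced on a toy problem, using polynomial degree as a stand-in for model complexity; the synthetic data and the scikit-learn pipeline are assumptions chosen for illustration, not an experiment from the text.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(seed=0)

# Synthetic 1-D regression problem: noisy samples of a smooth function.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# Model complexity is controlled here by the polynomial degree:
# low degrees underfit (both errors high), high degrees overfit
# (train error keeps falling while test error rises).
for degree in range(1, 16):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train error={train_err:.3f}  test error={test_err:.3f}")
```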
