The balance between simplicity and accuracy

"Things should be as simple as possible, but not simpler" is a quote often attributed to Einstein, and it is like Occam's razor with a twist. Ideally, we would like to have a model that neither overfits nor underfits the data. In general, we will face a trade-off between the two, and we will have to tune our models to find a balance.

This trade-off is usually discussed in terms of variance and bias:

  • High bias is the result of a model with low ability to accommodate the data. High bias can cause a model to miss the relevant pattern and thus can lead to underfitting.
  • High variance is the result of a model that has high sensitivity to details in the data. High variance can cause a model to capture the noise in the data and thus can lead to overfitting.

In Figure 5.5, the order 0 model is the one with the higher bias (and lower variance), because it is biased to return a flat line at the value of the mean of the dependent variable, irrespective of the value of the independent variable. The order 5 model is the one with the higher variance (and lower bias). The easiest way to see this is to imagine different datasets of six points. You can arrange these six points in very different ways, and this model will adapt to each one of these arrangements. It will fit most of them perfectly (except for some arrangements, like circles). See exercise 6 for more details.
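The following is a minimal sketch of this thought experiment, not the code used to produce Figure 5.5. It assumes six points with distinct x values, in which case the least-squares polynomial of order 5 is the polynomial that passes through every point, so it is written here in Lagrange form using only the standard library. The function names and the random arrangements are illustrative assumptions.

```python
import random
from statistics import mean


def fit_order0(xs, ys):
    """Order 0 model: a flat line at the mean of y, whatever x is."""
    y_bar = mean(ys)
    return lambda x: y_bar


def fit_order5(xs, ys):
    """Polynomial through all the points, written in Lagrange form.

    With six points at distinct x values this is a polynomial of order 5.
    It is undefined if two points share the same x (as in a circle).
    """
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def max_abs_residual(model, xs, ys):
    """Largest absolute difference between the model and the data."""
    return max(abs(model(x) - y) for x, y in zip(xs, ys))


rng = random.Random(0)  # fixed seed, only for reproducibility
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

residuals = []
for _ in range(3):
    # A new, arbitrary arrangement of the same six x positions
    ys = [rng.uniform(-1, 1) for _ in xs]
    m0 = fit_order0(xs, ys)
    m5 = fit_order5(xs, ys)
    r0 = max_abs_residual(m0, xs, ys)
    r5 = max_abs_residual(m5, xs, ys)
    residuals.append((r0, r5))
    print(f"order 0 max residual: {r0:.3f}   order 5 max residual: {r5:.3f}")
```

Whatever arrangement is drawn, the order 5 model reproduces the six points while the order 0 model keeps returning a single value, which is the sense in which the former has high variance and the latter has high bias.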

A model with high bias is a model with more prejudices (if you will excuse the anthropomorphization) or more inertia (if you will excuse the physicalization), while a model with high variance is a more open-minded model. The problem with being too biased is that you are ill-equipped to accommodate new evidence; the problem with being too open-minded is that you end up believing nonsensical stuff, as flat-earthers or anti-vaxxers do. In general, when we decrease one of these terms, we increase the other, leading us to a bias-variance trade-off. Once again, the main idea is that we want a balanced model.
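The trade-off can also be estimated numerically. The sketch below is a hypothetical simulation, not something taken from the text: it assumes a true function sin(x), Gaussian noise with a standard deviation of 0.5, six fixed x positions, and a single evaluation point x0 = 2.5. For each model it generates many noisy datasets, refits, and records the prediction at x0; the squared bias is the squared gap between the average prediction and the true value, and the variance is the spread of the predictions across datasets.

```python
import math
import random
from statistics import mean, pvariance


def true_f(x):
    """Assumed data-generating function (an illustrative choice)."""
    return math.sin(x)


def predict_order0(xs, ys, x0):
    """Order 0 model: predicts the mean of y regardless of x0."""
    return mean(ys)


def predict_order5(xs, ys, x0):
    """Order 5 model for six distinct x values (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x0 - xj) / (xi - xj)
        total += term
    return total


def bias2_and_variance(predict, xs, x0, noise_sd, n_datasets, rng):
    """Estimate squared bias and variance of a model's prediction at x0."""
    preds = []
    for _ in range(n_datasets):
        # Same x positions, fresh noise: a new dataset from the same process
        ys = [true_f(x) + rng.gauss(0, noise_sd) for x in xs]
        preds.append(predict(xs, ys, x0))
    bias2 = (mean(preds) - true_f(x0)) ** 2
    return bias2, pvariance(preds)


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x0 = 2.5
rng = random.Random(42)  # fixed seed, only for reproducibility

b0, v0 = bias2_and_variance(predict_order0, xs, x0, 0.5, 2000, rng)
b5, v5 = bias2_and_variance(predict_order5, xs, x0, 0.5, 2000, rng)

print(f"order 0: bias^2 = {b0:.4f}, variance = {v0:.4f}")
print(f"order 5: bias^2 = {b5:.4f}, variance = {v5:.4f}")
```

Under these assumptions the order 0 model shows the larger squared bias and the order 5 model shows the larger variance. The exact numbers depend on the assumed function, noise level, and evaluation point.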
