Trade-off and complexity

In a perfect world, you would have ML models that, after the learning process, have both low bias and low variance. However, as you may be aware, we live in a far-from-perfect world, so, as with most things, a trade-off has to be made to get the optimal result.

For any given ML model, there is a trade-off to be made between bias and variance in order to find the configuration that generalizes best. Unfortunately, this optimal point is not known in advance; it has to be inferred from the errors measured during the training and validation process. Much of the effort in ML research is therefore aimed not only at finding this optimal trade-off but also at developing new techniques that sacrifice a little on one side in exchange for a significant reduction on the other.

More complex models can fit the data better and reduce error during training. However, they are prone to overfitting, which increases the error on test data and, later, on new real-world data. The following image represents a range of complexity settings for a model. The goal is to find the point where both training and test error are minimized. This is the best compromise, which you would expect to generalize well to future datasets when the trained model is put into production:

Bias-variance trade-off as model complexity increases. Source: http://horicky.blogspot.com/2012/06/
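The behavior in the image can be reproduced with a few lines of code. The sketch below (written in Python for illustration, even though this book's examples use R) fits polynomials of increasing degree to noisy samples of a smooth function; the function, noise level, and degrees chosen are illustrative assumptions, not taken from the text. Training error keeps falling as complexity grows, while test error eventually rises again:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a smooth underlying function (illustrative choice)
f = lambda x: np.sin(2 * np.pi * x)
x_train = np.linspace(0.0, 1.0, 20)
x_test = np.linspace(0.025, 0.975, 20)
y_train = f(x_train) + rng.normal(0, 0.3, x_train.size)
y_test = f(x_test) + rng.normal(0, 0.3, x_test.size)

def mse(degree):
    """Fit a polynomial of the given degree on the training data
    and return (training MSE, test MSE)."""
    coefs = np.polyfit(x_train, y_train, degree)
    err_train = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    err_test = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    return err_train, err_test

# Low degree = high bias; high degree = high variance (overfitting)
for degree in (1, 3, 9, 15):
    tr, te = mse(degree)
    print(f"degree {degree:2d}: train MSE {tr:.3f}  test MSE {te:.3f}")
```

Because the polynomial families are nested, training error can only go down as the degree increases; the test error, by contrast, bottoms out at an intermediate degree and then climbs, which is exactly the U-shaped test curve in the figure.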

Fortunately, there are methods, such as regularization, designed to prevent overfitting and so increase the likelihood that the resulting trained model lands in the optimal trade-off region. These methods are built into the R packages that you will be using to build your ML models.
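As a minimal sketch of how regularization reins in a too-flexible model, the example below (again in Python for illustration; the data, degree, and penalty values are assumptions made for this example) computes closed-form ridge regression on polynomial features. Increasing the penalty lambda shrinks the coefficient vector, which is the mechanism that pulls an overfit model back toward the optimal region:

```python
import numpy as np

rng = np.random.default_rng(1)

# A degree-9 polynomial basis on 15 noisy points: an over-flexible model
x = np.linspace(0.0, 1.0, 15)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, x.size)
X = np.vander(x, 10, increasing=True)  # columns: 1, x, x^2, ..., x^9

def ridge(lam):
    """Closed-form ridge solution: w = (X'X + lam*I)^(-1) X'y.
    lam = 0 reduces to ordinary least squares."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Larger penalties shrink the coefficients toward zero
for lam in (0.0, 0.01, 1.0):
    w = ridge(lam)
    print(f"lambda={lam:<5} ||w|| = {np.linalg.norm(w):.2f}")
```

Note that this sketch penalizes the intercept along with the other coefficients for simplicity; library implementations such as glmnet in R typically leave the intercept unpenalized and standardize the inputs first.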
