We touched on ways to avoid overfitting when discussing the pros and cons of algorithms in the previous practice. We will now formally summarize them:
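- Cross-validation, a good habit we have built up throughout the chapters of this book.
- Regularization, which adds a penalty term to the loss function to discourage models that fit the training data too closely.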
- Simplification, if possible. The more complex the model is, the higher the chance of overfitting. Complex models include a tree or forest with excessive depth, a linear regression with a high-degree polynomial transformation, and an SVM with a complicated kernel.
- Ensemble learning, combining a collection of weak models to form a stronger one.
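
To make the first two items concrete, here is a minimal sketch in plain Python that uses k-fold cross-validation to compare different strengths of L2 regularization. Everything in it is an illustrative assumption rather than code from this book: the helper names (`k_fold_indices`, `fit_ridge_1d`, `cross_val_mse`), the single-feature model without an intercept, and the synthetic data.

```python
import random


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_ridge_1d(xs, ys, alpha):
    """Closed-form weight for y ~ w * x with an L2 penalty alpha * w**2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)


def cross_val_mse(xs, ys, alpha, k=5):
    """Average held-out mean squared error over k folds."""
    errors = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        # Fit only on the samples outside the current fold.
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        w = fit_ridge_1d(train_x, train_y, alpha)
        # Score on the held-out fold, which the model has not seen.
        mse = sum((ys[i] - w * xs[i]) ** 2 for i in test_idx) / len(test_idx)
        errors.append(mse)
    return sum(errors) / len(errors)


# Synthetic data (an assumption for illustration): y = 2x plus Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(-1, 1) for _ in range(50)]
ys = [2 * x + rng.gauss(0, 0.5) for x in xs]

for alpha in (0.0, 1.0, 10.0, 100.0):
    w = fit_ridge_1d(xs, ys, alpha)
    print(f"alpha={alpha:>5}: weight={w:.3f}, CV MSE={cross_val_mse(xs, ys, alpha):.3f}")
```

A larger `alpha` shrinks the learned weight toward zero, and the cross-validated error indicates when the penalty has become too strong for the data. In practice, scikit-learn's `cross_val_score` and `Ridge` cover the same ground with far less code.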