Summary

In this chapter, we discussed how to estimate an ML model's performance and which metrics can be used for that estimation. We considered metrics for regression and classification tasks and the characteristics of each. We also saw how performance metrics reveal a model's behavior, in particular its bias and variance. We examined high-bias (underfitting) and high-variance (overfitting) problems and considered how to solve them. We learned about regularization approaches, which are often used to deal with overfitting. We then studied what validation is and how it is used in the cross-validation technique, which lets us estimate model performance when training data is limited. In the last section, we combined an evaluation metric and cross-validation in the grid search algorithm, which we can use to select the best set of hyperparameters for our model.
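As a compact reminder of how these pieces fit together, the following is a minimal sketch, not the chapter's own implementation. It assumes a one-dimensional ridge regression without an intercept, mean squared error as the evaluation metric, and a simple interleaved k-fold split. All function names (`fit_ridge`, `mse`, `k_fold_indices`, `cross_val_mse`, `grid_search`) and the sample data are hypothetical and chosen only for illustration.

```python
def fit_ridge(xs, ys, lam):
    """Closed-form 1-D ridge regression without intercept: w = sum(xy) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def mse(xs, ys, w):
    """Mean squared error of the predictions w * x against the targets y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_indices(n, k):
    """Yield (train, validation) index lists; fold i holds every k-th sample starting at i."""
    for i in range(k):
        val = list(range(i, n, k))
        val_set = set(val)
        train = [j for j in range(n) if j not in val_set]
        yield train, val


def cross_val_mse(xs, ys, lam, k=3):
    """Average validation MSE over k folds for one regularization strength."""
    scores = []
    for train, val in k_fold_indices(len(xs), k):
        w = fit_ridge([xs[j] for j in train], [ys[j] for j in train], lam)
        scores.append(mse([xs[j] for j in val], [ys[j] for j in val], w))
    return sum(scores) / len(scores)


def grid_search(xs, ys, lams, k=3):
    """Evaluate every candidate hyperparameter and return the one with the lowest CV error."""
    results = {lam: cross_val_mse(xs, ys, lam, k) for lam in lams}
    best = min(results, key=results.get)
    return best, results


# Illustrative data only (roughly y = 2x with small noise).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
best_lam, scores = grid_search(xs, ys, [0.0, 0.1, 1.0, 10.0])
print("best regularization strength:", best_lam)
```

Each candidate value is scored only on samples that were held out during fitting, so the selected hyperparameter reflects estimated generalization error rather than training error.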

In the next chapter, we'll learn about the ML algorithms we can use to solve concrete problems. The next topic we will discuss in depth is clustering, the procedure of splitting the original set of objects into groups according to their properties. We will look at different clustering approaches and their characteristics.
