Supervised learning

A supervised learning application makes predictions from a set of labeled examples; the goal is to learn general rules that map inputs to outputs in a way that matches the real world. For example, a dataset for spam filtering usually contains both spam and non-spam (ham) messages, so we know the label of every message in the training set. We can therefore use this information to train a model that classifies new, unseen messages. Once the algorithm has found the required patterns, they can be used to make predictions for unlabeled test data. Supervised learning is the most popular and useful type of machine learning task, and Spark is no exception: most of its algorithms are supervised learning techniques. The following figure shows a schematic diagram of supervised learning:

Figure 3: Supervised learning in action
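
To make the spam example concrete, here is a minimal sketch in plain Python of learning from labeled messages and then classifying unseen ones. It is not Spark code and not the implementation used in this book: the toy messages, the names TRAIN, train, and predict, and the choice of a word-count Naive Bayes classifier with add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy training set: (message, label) pairs.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting moved to monday", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(examples):
    """Learn per-label word counts and label frequencies from labeled messages."""
    word_counts = {}          # label -> Counter of words seen under that label
    label_counts = Counter()  # label -> number of training messages
    vocab = set()             # every word seen in training
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest Naive Bayes log-score for a message."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior: fraction of training messages carrying this label.
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[label][word] + 1) / (n_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, "claim your free prize"))   # unseen message
print(predict(model, "team meeting on monday"))  # unseen message
```

The train step corresponds to finding patterns in the labeled data, and the predict step corresponds to applying those patterns to unlabeled messages. A real application would use far more data and hold out a separate test set to measure accuracy.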

Classification and regression are the two main kinds of supervised learning problems. We will provide several examples of supervised learning algorithms in this book, such as logistic regression, random forest, decision trees, Naive Bayes, One-vs-the-Rest, and so on. However, to keep the discussion concrete, only logistic regression and random forest will be discussed here; the other algorithms will be discussed in Chapter 12, Advanced Machine Learning Best Practices, with some practical examples. For regression analysis, linear regression will be discussed.
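
Whereas classification predicts a discrete label, regression predicts a continuous value. As a rough illustration of the regression case, the following plain-Python sketch fits a straight line to labeled (x, y) pairs using the closed-form least-squares solution for a single feature. The data points and the names fit_line and predict_line are hypothetical, and this is not how Spark implements linear regression.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_line(params, x):
    """Predict a continuous output for a new input."""
    slope, intercept = params
    return slope * x + intercept


# Hypothetical labeled examples: inputs with known continuous outputs.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

params = fit_line(xs, ys)
print(params)                     # learned slope and intercept
print(predict_line(params, 5.0))  # prediction for an unseen input
```

The structure mirrors the classification case: parameters are learned from labeled examples and then reused to predict outputs for inputs the model has not seen.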
