Advanced analysis – directed methods

Some of the most important directed techniques are classification, estimation, and forecasting. Classification means examining a new case and assigning it to a predefined discrete class, for example, assigning keywords to articles or customers to known segments. Estimation means predicting the value of a continuous variable for a new case, for example, the number of children or the family income. Forecasting is similar to classification and estimation. The main difference is that you can't check the forecast value at the time of the forecast; you can evaluate it only after enough time has passed. Examples include forecasting which customers will leave in the future, which customers will order additional services, and the sales amount in a specific region at a specific time in the future.

After you train models, you use them to make predictions. In most classification and other directed projects, you build multiple models, using different algorithms, different algorithm parameters, different independent variables, additional calculated independent variables, and more. The question then arises of which model is the best, so you need to test the accuracy of the predictions of the different models. To do so, you split the original dataset into a training set and a test set. You use the training set to train the model; a common practice is to use 70% of the data for it. The remaining 30% goes into the test set, which the model did not see during training and on which you make predictions. Because you know the actual value of the predicted variable for the test cases, you can compare it with the predictions and measure their quality.
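The split and the scoring step can be sketched as follows. This is only an illustration under stated assumptions: it uses the built-in iris data as a stand-in for a real dataset, the seed is arbitrary, and the "model" is a hypothetical majority-class baseline rather than one of the algorithms covered in this section.

```r
# Minimal sketch of a 70/30 split (iris is a stand-in dataset, not from the text)
set.seed(42)                                   # arbitrary seed, for reproducibility
n <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n))  # row numbers for the training set
train_set <- iris[train_idx, ]
test_set  <- iris[-train_idx, ]                # all remaining rows

# Hypothetical stand-in "model": always predict the most frequent training class
majority_class <- names(which.max(table(train_set$Species)))
predicted <- rep(majority_class, nrow(test_set))

# The actual values in the test set are known, so the predictions can be scored
accuracy  <- mean(predicted == test_set$Species)
confusion <- table(actual = test_set$Species, predicted = predicted)
print(accuracy)
print(confusion)
```

Any real model should be compared against a simple baseline like this one; a model that does not beat it has learned little from the independent variables.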

R offers a plethora of directed algorithms. This section covers only three: logistic regression from the base installation, decision trees from the base installation, and conditional inference trees from the party package.
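As a rough preview, the three algorithms might be called as shown below. This is a sketch under assumptions, not necessarily the code used later in this section: it assumes glm() for logistic regression, the rpart package for decision trees (the text does not name the function), and ctree() from the party package. The two-class subset of the built-in iris data, the chosen predictors, and the 0.5 cut-off are all illustrative choices.

```r
set.seed(1)                                          # arbitrary seed
# Two-class stand-in dataset: versicolor vs. virginica from built-in iris
d   <- droplevels(subset(iris, Species != "setosa"))
idx <- sample(nrow(d), size = round(0.7 * nrow(d)))  # 70% for training
train_set <- d[idx, ]
test_set  <- d[-idx, ]
f <- Species ~ Sepal.Length + Sepal.Width            # illustrative formula

# Logistic regression: glm() models the probability of the second factor level
logit_fit  <- glm(f, data = train_set, family = binomial)
logit_prob <- predict(logit_fit, newdata = test_set, type = "response")
logit_pred <- ifelse(logit_prob > 0.5,
                     levels(d$Species)[2], levels(d$Species)[1])
acc <- c(logistic = mean(logit_pred == test_set$Species))

# Decision tree (assumption: rpart, if it is installed)
if (requireNamespace("rpart", quietly = TRUE)) {
  tree_fit  <- rpart::rpart(f, data = train_set, method = "class")
  tree_pred <- predict(tree_fit, newdata = test_set, type = "class")
  acc <- c(acc, rpart = mean(tree_pred == test_set$Species))
}

# Conditional inference tree from the party package, if it is installed
if (requireNamespace("party", quietly = TRUE)) {
  ctree_fit  <- party::ctree(f, data = train_set)
  ctree_pred <- predict(ctree_fit, newdata = test_set)
  acc <- c(acc, ctree = mean(ctree_pred == test_set$Species))
}

print(acc)  # share of correctly classified test cases per model
```

Fitting all models on the same training set and scoring them on the same test set keeps the accuracy figures directly comparable.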

In this section, you will learn how to:

  • Prepare training and test datasets
  • Use the logistic regression algorithm
  • Create decision tree models
  • Evaluate predictive models