Model estimation

This section describes how to use R notebooks within the DataScientistWorkbench to estimate our models.

The DataScientistWorkbench for R notebooks

Once our data has been prepared with OpenRefine, we develop an R notebook within the DataScientistWorkbench that applies the code prepared in the section Methods for risk scoring, together with the features prepared in the section Data and feature preparation, to the data.

As seen in the following screenshot, the DataScientistWorkbench allows us to create an interactive R notebook, run it, and share it as well.

RStudio, a favorite with R users, is also integrated with the DataScientistWorkbench:

The DataScientistWorkbench for R notebooks

To start a notebook, click Build Analytics and then Notebook, or click the blue Notebook button directly, as seen in the following screenshot:

The DataScientistWorkbench for R notebooks

Once an R notebook is developed, it can be seen under Recent Notebooks, and you can run it to obtain results as in other environments.

R notebooks implementation

To estimate the models, the main task is to implement R notebooks within the DataScientistWorkbench environment. To do so, we take the R notebook started in the previous section and insert all of the following R code into it:

  • Logistic regression: The following code is needed for the R notebook:
    Model1 <- glm(good_bad ~ ., data = train, family = binomial())
  • Random Forest: The following code is needed for the R notebook:
    library(randomForest)
    randomForest(default ~ ., data = train, na.action = na.fail, importance = TRUE, ntree = 2000)
  • Decision tree: The following code is needed for the R notebook to estimate decision trees:
    library(rpart)
    f.est1 <- rpart(default ~ r1 + … + r21, data = train, method = "class")
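
To see how these estimation calls fit together in a single notebook cell, the following is a minimal sketch rather than the notebook used in this chapter. It assumes a small simulated data set in place of the prepared risk-scoring data: the column names r1, r2, and r3, the simulated default outcome, the 70/30 split, and the 0.5 cutoff are all illustrative assumptions. The random forest step runs only if the randomForest package happens to be installed.

```r
# Minimal sketch on simulated data; the data and all column names are assumptions.
library(rpart)

set.seed(42)
n <- 500
dat <- data.frame(r1 = rnorm(n), r2 = rnorm(n), r3 = runif(n))
# Simulate a binary outcome whose log-odds depend on r1 and r2
p <- plogis(-0.5 + 1.2 * dat$r1 - 0.8 * dat$r2)
dat$default <- factor(rbinom(n, 1, p), levels = c(0, 1))

# Hold out 30% of the rows for testing
idx <- sample(seq_len(n), size = 0.7 * n)
train <- dat[idx, ]
test <- dat[-idx, ]

# Logistic regression on all predictors
logit_fit <- glm(default ~ ., data = train, family = binomial())
logit_prob <- predict(logit_fit, newdata = test, type = "response")
logit_pred <- factor(as.integer(logit_prob > 0.5), levels = c(0, 1))

# Classification tree on the same predictors
tree_fit <- rpart(default ~ ., data = train, method = "class")
tree_pred <- predict(tree_fit, newdata = test, type = "class")

# Hold-out accuracy for each model
accuracy <- c(
  logit = mean(logit_pred == test$default),
  tree = mean(tree_pred == test$default)
)
print(accuracy)

# Random forest, fitted only if the randomForest package is available
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf_fit <- randomForest::randomForest(default ~ ., data = train,
                                       importance = TRUE, ntree = 500)
  rf_pred <- predict(rf_fit, newdata = test)
  print(mean(rf_pred == test$default))
}
```

Fitting every model on the same train data frame and scoring it on the same held-out test rows keeps the comparison between the methods like for like. In a real notebook, the simulated data frame would be replaced by the data prepared in the earlier sections.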