Model estimation

Once the feature sets have been finalized, as described in the last section, the next step is to estimate the parameters of the selected models, for which we can use MLlib on the Zeppelin notebook.

As before, for the best results we need to arrange distributed computing, which is especially important here because models must be estimated separately for various student segments and various study subjects. Readers may refer to previous chapters for the distributed computing setup, as we will not repeat it here.

Spark implementation with the Zeppelin notebook

Using MLlib's Scala API, we can train a random forest with the following code:

import org.apache.spark.mllib.tree.RandomForest
import org.apache.spark.mllib.tree.configuration.Strategy

// Train a RandomForest model with MLlib's RDD-based API.
val treeStrategy = Strategy.defaultStrategy("Classification")
val numTrees = 300
val featureSubsetStrategy = "auto" // Let the algorithm choose.
val model = RandomForest.trainClassifier(trainingData,
  treeStrategy, numTrees, featureSubsetStrategy, seed = 12345)
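The trained forest can then be checked against held-out data. The following is a sketch only, not part of the original text: it assumes an RDD of LabeledPoint named data has already been prepared, and that trainingData and testData come from a random split of it.

```scala
// Hypothetical evaluation sketch; assumes `data: RDD[LabeledPoint]`.
val Array(trainingData, testData) =
  data.randomSplit(Array(0.7, 0.3), seed = 12345)

// Fraction of misclassified test points.
val testErr = testData.map { point =>
  val prediction = model.predict(point.features)
  if (point.label == prediction) 0.0 else 1.0
}.mean()
println(s"Test error = $testErr")
```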

For a decision tree, we will execute the following code:

import org.apache.spark.mllib.tree.DecisionTree

val model = DecisionTree.trainClassifier(trainingData, numClasses,
  categoricalFeaturesInfo, impurity, maxDepth, maxBins)
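The parameters passed to trainClassifier above must be defined beforehand. A plausible setup is shown below; the specific values are illustrative assumptions, not taken from the original text.

```scala
// Illustrative parameter values for DecisionTree.trainClassifier.
val numClasses = 2                            // assumed binary outcome
val categoricalFeaturesInfo = Map[Int, Int]() // empty map: treat all features as continuous
val impurity = "gini"                         // or "entropy" for classification
val maxDepth = 5
val maxBins = 32
```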

In MLlib, for linear regression, we will run the following code:

import org.apache.spark.mllib.regression.LinearRegressionWithSGD

val numIterations = 90
val model = LinearRegressionWithSGD.train(trainingData, numIterations)
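For regression it is common to report the mean squared error of the fitted model. The following evaluation sketch is an addition, assuming trainingData is an RDD of LabeledPoint as above.

```scala
// Compute MSE over the training data; assumes `trainingData: RDD[LabeledPoint]`.
val valuesAndPreds = trainingData.map { point =>
  val prediction = model.predict(point.features)
  (point.label, prediction)
}
val MSE = valuesAndPreds.map { case (v, p) => math.pow(v - p, 2) }.mean()
println(s"Training Mean Squared Error = $MSE")
```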

For logistic regression, we will use the following code:

import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS

// Note: setNumClasses is provided by the LBFGS-based trainer;
// the SGD-based trainer supports binary classification only.
val model = new LogisticRegressionWithLBFGS()
  .setNumClasses(2)
  .run(trainingData)
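A binary classifier like this one is often summarized by the area under its ROC curve. The sketch below is an added illustration; it assumes a held-out testData split that the original text does not define.

```scala
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

// Score the held-out set and compute area under the ROC curve.
val scoreAndLabels = testData.map { point =>
  (model.predict(point.features), point.label)
}
val metrics = new BinaryClassificationMetrics(scoreAndLabels)
println(s"Area under ROC = ${metrics.areaUnderROC()}")
```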

To implement all of these models, we need to input the preceding code into our Zeppelin notebook and run the computation there, as follows:

[Figure: Spark implementation with the Zeppelin notebook]

Then we can press Shift + Enter to run these commands and obtain results similar to the following screenshot:

[Figure: Spark implementation with the Zeppelin notebook]