Deployment

Some users already have deployment systems in place. For them, exporting the developed models in the format those systems expect may be sufficient.

For linear regression and logistic regression, MLlib supports exporting models to Predictive Model Markup Language (PMML).

For more information about exporting to PMML from MLlib, visit https://spark.apache.org/docs/latest/mllib-pmml-model-export.html.
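To give a feel for what a PMML export contains, the following is a minimal sketch that writes a linear regression model as a simplified PMML document using only the Python standard library. It is not MLlib's exporter: the function name linear_model_to_pmml, the default target field name, and the example coefficients are assumptions made for illustration, and the output is not validated against the PMML schema.

```python
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-4_2"


def linear_model_to_pmml(feature_names, coefficients, intercept, target="target"):
    """Build a simplified PMML RegressionModel document and return it as a string.

    Hand-written illustration of the PMML layout (hypothetical helper);
    it is not MLlib's exporter and performs no schema validation.
    """
    if len(feature_names) != len(coefficients):
        raise ValueError("feature_names and coefficients must have the same length")

    pmml = ET.Element("PMML", {"version": "4.2", "xmlns": PMML_NS})
    ET.SubElement(pmml, "Header", {"description": "linear regression (illustrative)"})

    # The DataDictionary declares every field the model reads or predicts.
    fields = list(feature_names) + [target]
    data_dict = ET.SubElement(pmml, "DataDictionary", {"numberOfFields": str(len(fields))})
    for name in fields:
        ET.SubElement(data_dict, "DataField",
                      {"name": name, "optype": "continuous", "dataType": "double"})

    model = ET.SubElement(pmml, "RegressionModel",
                          {"modelName": "linear regression", "functionName": "regression"})

    # The MiningSchema says which fields are inputs and which is the target.
    schema = ET.SubElement(model, "MiningSchema")
    for name in feature_names:
        ET.SubElement(schema, "MiningField", {"name": name, "usageType": "active"})
    ET.SubElement(schema, "MiningField", {"name": target, "usageType": "target"})

    # The RegressionTable holds the intercept and one coefficient per feature.
    table = ET.SubElement(model, "RegressionTable", {"intercept": repr(float(intercept))})
    for name, coef in zip(feature_names, coefficients):
        ET.SubElement(table, "NumericPredictor",
                      {"name": name, "coefficient": repr(float(coef))})

    return ET.tostring(pmml, encoding="unicode")


# Example with made-up coefficients.
print(linear_model_to_pmml(["x1", "x2"], [0.5, -1.25], 2.0))
```

Because the model is reduced to an intercept and a list of named coefficients, any PMML-aware scoring system can in principle reproduce the prediction without access to Spark or to the training code.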

The R notebook can be run directly in another environment. R models can also be exported with the R package pmml.

For more information on the R package pmml, go to http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf.

It is also possible to deploy the models for decision making directly on Apache Spark and make the results easily available to users.

Two commonly used methods of deploying results are (1) dashboards and (2) rule-based decision making. Which one to choose depends on who will consume our results.

Here, we will discuss them only briefly, because a full deployment for decision making requires optimization that is not covered in this chapter. Later chapters spend more time on deployment.

Dashboard

For real-time analytical dashboards, most users use Spark Streaming together with other tools.

For our work here, we will take a simple dashboard approach, which is to use graphs and tables to present our analytical results quickly to consumers. The dashboards are interactive because every plot depends on one or more features. When these features are updated, the algorithms behind each plot can be re-executed automatically and the plot regenerated.
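The update behavior described above can be sketched in a few lines of plain Python. This is only an illustration of the idea that each plot is recomputed when a feature it depends on changes. The Dashboard class, the panel names, and the sample scores are all hypothetical, and the sketch does not use shiny, shinydashboard, or any Databricks API.

```python
class Dashboard:
    """Toy dashboard: each panel is recomputed when one of its features changes."""

    def __init__(self, features):
        self.features = dict(features)
        self.panels = {}   # panel name -> (features it depends on, render function)
        self.outputs = {}  # panel name -> most recently rendered result

    def add_panel(self, name, depends_on, render):
        """Register a panel and render it once with the current features."""
        self.panels[name] = (tuple(depends_on), render)
        self.outputs[name] = render(self.features)

    def update(self, feature, value):
        """Change one feature and re-run only the panels that depend on it."""
        self.features[feature] = value
        refreshed = []
        for name, (deps, render) in self.panels.items():
            if feature in deps:
                self.outputs[name] = render(self.features)
                refreshed.append(name)
        return refreshed


# Hypothetical example: model scores and a decision threshold.
board = Dashboard({"threshold": 0.5, "scores": [0.2, 0.6, 0.9]})
board.add_panel("flagged_count", ["threshold", "scores"],
                lambda f: sum(s >= f["threshold"] for s in f["scores"]))
board.add_panel("mean_score", ["scores"],
                lambda f: sum(f["scores"]) / len(f["scores"]))

# Raising the threshold re-renders flagged_count but leaves mean_score untouched.
print(board.update("threshold", 0.8), board.outputs)
```

In a real dashboard framework the render functions would produce plots or tables rather than numbers, but the dependency tracking between inputs and outputs follows the same pattern.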

Starting from our R notebooks, we can use the shiny and shinydashboard R packages to quickly build a dashboard.

For more information about using the shinydashboard package, go to https://rstudio.github.io/shinydashboard/.

Databricks' new version also has a dashboard builder. To use it, just go to Workspace -> Create -> Dashboard.

This Databricks dashboard builder is powerful and intuitive. Once a dashboard is built, users can publish it with the click of a button to other employees in the organization or to their customers.

Rules

Turning modeling results into rules is straightforward because many tools are available. For R results in particular, several tools help extract rules from developed predictive models.

For decision tree models, we can use the rpart.utils R package, which can extract rules and export them in various forms, including pushing them to a database through RODBC.

For more information about the rpart.utils R package, go to https://cran.r-project.org/web/packages/rpart.utils/rpart.utils.pdf.

For a discussion on extracting rules from MLlib, go to:

http://stackoverflow.com/questions/31782288/how-to-extract-rules-from-decision-tree-spark-mllib.
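The idea behind rule extraction from a decision tree is the same regardless of the tool: every path from the root to a leaf becomes one IF ... THEN rule. The following is a minimal sketch of that traversal in plain Python. The nested-dictionary tree layout, the feature names, and the function names are assumptions for illustration only; this is neither the rpart.utils output format nor the MLlib tree API.

```python
def extract_rules(node, conditions=()):
    """Return one (conditions, prediction) pair for every leaf below node."""
    if "prediction" in node:  # leaf: the path taken so far is the rule's premise
        return [(list(conditions), node["prediction"])]
    feature, threshold = node["feature"], node["threshold"]
    left = extract_rules(node["left"], conditions + (f"{feature} <= {threshold}",))
    right = extract_rules(node["right"], conditions + (f"{feature} > {threshold}",))
    return left + right


def format_rule(conditions, prediction):
    """Render a rule as readable text."""
    premise = " AND ".join(conditions) if conditions else "TRUE"
    return f"IF {premise} THEN {prediction}"


# Hypothetical tree: internal nodes split on a feature, leaves carry a prediction.
tree = {
    "feature": "age", "threshold": 30,
    "left": {"prediction": "low"},
    "right": {
        "feature": "income", "threshold": 50000,
        "left": {"prediction": "medium"},
        "right": {"prediction": "high"},
    },
}

for conditions, prediction in extract_rules(tree):
    print(format_rule(conditions, prediction))
```

Rules in this textual form can then be loaded into whatever rule engine or database the decision-making system uses.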
