SparkR

SparkR is an R package that provides a lightweight front end for using Apache Spark from R. It provides a distributed data frame implementation that supports operations such as selection, filtering, and aggregation. SparkR also supports distributed machine learning through MLlib. SparkR uses an R-based SparkContext and runs R scripts as tasks, and it uses JNI and pipes to the executor processes to communicate between the Java-based Spark cluster and the R scripts.
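The data frame operations mentioned above (selection, filtering, and aggregation) can be illustrated with a minimal sketch. It assumes the SparkR package and a local Spark installation are available; the `local[*]` master, the application name, and the use of R's built-in `faithful` data set are illustrative choices, not something prescribed by the text.

```r
# Minimal sketch; assumes SparkR and a local Spark installation are available.
library(SparkR)

# Start (or connect to) a Spark session; "local[*]" runs Spark on this machine only.
sparkR.session(master = "local[*]", appName = "sparkr-sketch")

# Convert R's built-in `faithful` data set into a distributed SparkDataFrame.
df <- as.DataFrame(faithful)

# Selection: keep only the `eruptions` column.
head(select(df, df$eruptions))

# Filtering: keep rows with a waiting time under 50 minutes.
head(filter(df, df$waiting < 50))

# Aggregation: count how many rows share each waiting time.
counts <- summarize(groupBy(df, df$waiting), count = n(df$waiting))
head(arrange(counts, desc(counts$count)))

# Shut the session down when finished.
sparkR.session.stop()
```

Each call builds on a SparkDataFrame, not a local R data frame, so the work is carried out by Spark; `head` brings only the first few rows back to the R process.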


R must be installed on all worker nodes running the Spark executors.

The following shows how SparkR works by communicating between Java processes and R scripts:
