How Spark 2.0 MLlib works

Starting with Spark 2.0, MLlib is making DataFrames its primary API. This is the way of the future, so let's take a look at how it works. I've opened the SparkLinearRegression.py file in Canopy, as shown in the following figure, so let's walk through it a little bit:

As you can see, for one thing, we're importing from ml instead of mllib, and that's because the new DataFrame-based API lives there.
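To make the ml versus mllib distinction concrete, here is a minimal sketch of fitting a linear regression through the DataFrame-based API. It is not the contents of SparkLinearRegression.py; the 'label,feature' input format, the parse_line and fit_linear_regression helpers, and the application name are all assumptions made up for illustration, and it presumes pyspark is installed.

```python
def parse_line(line):
    """Turn a hypothetical 'label,feature' text line into (label, [feature])."""
    label, feature = line.split(",")
    return float(label), [float(feature)]


def fit_linear_regression(lines):
    """Fit a one-feature linear regression with the DataFrame-based API."""
    # Imports are deferred so parse_line can be used without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.ml.linalg import Vectors
    from pyspark.ml.regression import LinearRegression  # note: ml, not mllib

    spark = SparkSession.builder.appName("LinearRegressionSketch").getOrCreate()
    try:
        rows = [(label, Vectors.dense(features))
                for label, features in map(parse_line, lines)]
        # By default, pyspark.ml estimators read the "label" and
        # "features" columns of the DataFrame they are given.
        df = spark.createDataFrame(rows, ["label", "features"])
        model = LinearRegression(maxIter=10).fit(df)
        return model.coefficients[0], model.intercept
    finally:
        spark.stop()
```

Calling fit_linear_regression with a handful of lines such as "1.0,0.5" and "2.0,1.0" returns the fitted slope and intercept. The point to notice is that the estimator is handed a DataFrame with named columns rather than an RDD.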
