Powerful Exploratory Data Analysis with MLlib

In this chapter, we will explore Spark's capability to perform regression tasks with models such as linear regression and support-vector machines (SVMs). We will learn how to compute summary statistics with MLlib and how to discover correlations in datasets using the Pearson and Spearman methods. We will also test hypotheses on large datasets.

We will cover the following topics:

  • Computing summary statistics with MLlib
  • Using the Pearson and Spearman methods to discover correlations
  • Testing our hypotheses on large datasets
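
Before we turn to MLlib itself, it may help to see why two correlation methods are on the list. The following is a minimal sketch in plain Python, not MLlib's implementation: the helper names `pearson`, `ranks`, and `spearman` and the sample data are our own illustrative assumptions. Pearson measures how linear a relationship is, while Spearman applies the same formula to the ranks of the values, so it captures any monotonic relationship.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def ranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical sample: y grows monotonically with x, but not linearly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x ** 3 for x in xs]

pearson_value = pearson(xs, ys)    # below 1, because the relationship is curved
spearman_value = spearman(xs, ys)  # approximately 1, because the ordering matches exactly
print(pearson_value, spearman_value)
```

On this sample the Spearman value is effectively 1 while the Pearson value is lower, which is the kind of difference to keep in mind when choosing between the two methods on a real dataset.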