Summary

In this chapter, we learned about summary statistics and how to compute them with MLlib. We also learned about Pearson and Spearman correlations, and how to discover these correlations in our datasets using PySpark. Finally, we learned one particular way of performing hypothesis testing, the Pearson chi-square test, and used PySpark's hypothesis-testing functions to test our hypotheses on large datasets.
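As a quick recap of what these statistics compute, here is a minimal plain-Python sketch. It is not the MLlib API (in PySpark these computations are exposed through `pyspark.mllib.stat.Statistics`). The function names and the sample data are made up for illustration, and the ranking helper assumes there are no tied values.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def ranks(values):
    """Rank values from 1 to n (assumes no ties, to keep the sketch short)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: ys grows monotonically with xs, but not linearly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 8.0, 27.0, 64.0, 125.0]

print("Pearson: ", pearson(xs, ys))
print("Spearman:", spearman(xs, ys))

# Hypothetical observed counts against a uniform expectation.
print("Chi-square:", chi_square_statistic([18, 22, 20], [20, 20, 20]))
```

On this data the Spearman value is higher than the Pearson value, because Spearman only measures whether the relationship is monotonic, while Pearson measures how linear it is.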

In the next chapter, we're going to look at adding structure to our big data with Spark SQL.
