Summary

In this chapter, we learned how to load data into Spark RDDs and covered parallelization with Spark RDDs. Before loading the data, we took a brief look at the UCI Machine Learning Repository. We also reviewed the basic RDD operations and looked up the corresponding functions in the official documentation.

In the next chapter, we will cover big data cleaning and data wrangling.
