Summary

In this chapter, we discussed the internals of Apache Spark: what RDDs are, the DAGs and lineages of RDDs, and transformations and actions. We also looked at the deployment modes of Apache Spark: standalone, YARN, and Mesos. Finally, we installed Spark on a local machine and looked at the Spark shell and how it can be used to interact with Spark.

In addition, we looked at loading data into RDDs and saving RDDs to external systems. We also covered the secret sauce of Spark's phenomenal performance, its caching functionality, and how memory and/or disk can be used to optimize performance.

In the next chapter, Chapter 7, Special RDD Operations, we will dig deeper into the RDD API and how it all works.
