Start Working with Spark – REPL and RDDs

"All this modern technology just makes people try to do everything at once."

- Bill Watterson

In this chapter, you will learn how Spark works. You will then be introduced to RDDs (Resilient Distributed Datasets), the basic abstraction behind Apache Spark, and see that they are simply distributed collections exposing Scala-like APIs. Finally, you will see how to download Spark and run it locally via the Spark shell.

In a nutshell, the following topics will be covered in this chapter:

  • Dig deeper into Apache Spark
  • Apache Spark installation
  • Introduction to RDDs
  • Using the Spark shell
  • Actions and transformations
  • Caching
  • Loading and saving data