Part 1. Get acquainted: First steps in PySpark

When working with a new technology, the best way to get familiar with it is to jump right in, building our intuition along the way. This first part succinctly introduces PySpark before going over two distinct use cases.

Chapter 1 introduces the technology and the computing model that power Spark.

Then, in chapters 2 and 3, we build a simple end-to-end program and learn how to structure PySpark code in a readable and intuitive fashion. We go from ingesting text data, to processing it, to presenting the results, and, finally, to submitting the program in a noninteractive fashion.

Chapters 4 and 5 look at working with tabular data, the most frequently used type of data. We build on the foundation from the previous chapters (already!) to bend structured data to our will. At the end of part 1, you should feel comfortable writing your own simple programs from start to finish!
