Transformations and Actions

Transformations and actions are the main building blocks of an Apache Spark program. In this chapter, we will use Spark transformations to defer computations and look at which transformations should be avoided. We will then use the reduce and reduceByKey methods to calculate results from a dataset, and perform actions that trigger the actual computation of the execution graph. By the end of this chapter, we will also have learned how to reuse the same RDD for different actions.

In this chapter, we will cover the following topics:

  • Using Spark transformations to defer computations to a later time
  • Knowing which transformations to avoid
  • Using the reduce and reduceByKey methods to calculate the result
  • Performing actions that trigger actual computations of our Directed Acyclic Graph (DAG)
  • Reusing the same RDD for different actions