SparkContext

A SparkContext is the entry point for all Spark operations and the means by which an application connects to the resources of the Spark cluster. It initializes an instance of Spark and can thereafter be used to create RDDs, perform transformations and actions on them, extract data, and access other Spark functionality. A SparkContext also initializes various properties of the process, such as the application name, number of cores, and memory usage parameters. Collectively, these properties are contained in a SparkConf object, which is passed to the SparkContext as a parameter.
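The relationship between SparkConf and SparkContext described above can be sketched as follows. This is a minimal illustration, assuming PySpark is installed and that running against a local master is acceptable; the application name, core count, and memory value are placeholders chosen for the example, not values taken from the text.

```python
# Illustrative settings only: the app name, master URL, and memory size
# are assumptions made for this sketch.
SETTINGS = {
    "spark.app.name": "example-app",
    "spark.master": "local[2]",        # run locally with 2 cores
    "spark.executor.memory": "1g",
}


def build_conf(settings):
    """Collect the application properties into a SparkConf object."""
    # Imported inside the function so this module loads without Spark present.
    from pyspark import SparkConf

    conf = SparkConf()
    conf.setAll(list(settings.items()))
    return conf


def main():
    """Create a SparkContext from the conf, then run one small RDD job."""
    from pyspark import SparkContext

    sc = SparkContext(conf=build_conf(SETTINGS))
    try:
        rdd = sc.parallelize(range(1, 6))     # create an RDD
        squares = rdd.map(lambda x: x * x)    # transformation (lazy)
        print(squares.collect())              # action (triggers the computation)
    finally:
        sc.stop()                             # release the cluster resources
```

Calling `main()` on a machine with Spark available starts a local context, runs the job, and shuts the context down again. Keeping the settings in a plain dictionary is only one possible design; the same properties could equally be set one at a time with `SparkConf.set`.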

SparkSession is the newer abstraction through which users initiate their connection to Spark, introduced in Spark 2.0.0. It is a superset of the functionality that SparkContext provided before that release. However, practitioners still use the terms SparkSession and SparkContext interchangeably to mean one and the same entity, namely the primary mode of interacting with Spark. SparkSession has essentially combined the functionalities of both SparkContext and HiveContext.
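As a rough sketch of how a SparkSession might be started and how the underlying SparkContext remains reachable from it, consider the following. It assumes PySpark 2.0 or later is installed; the function names `create_session` and `demo`, the application name, and the local master URL are hypothetical choices made for illustration.

```python
def create_session(app_name="example-app", master="local[2]"):
    """Start (or reuse) a SparkSession; both default values are placeholders."""
    # Imported inside the function so this module loads without Spark present.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .master(master)
        .getOrCreate()
    )


def demo():
    """Show that a session exposes both the SparkContext and the DataFrame API."""
    spark = create_session()
    try:
        sc = spark.sparkContext     # the SparkContext wrapped by the session
        print(sc.appName)
        df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
        df.show()
    finally:
        spark.stop()
```

Because `getOrCreate()` returns an already running session when one exists, repeated calls to `create_session` within one process would share the same underlying context.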
