Initializing SparkSession

Let's create a SparkSession object. SparkSession follows the builder design pattern, so it can be initialized in the following way:

SparkSession sparkSession = SparkSession.builder()
    .master("local")
    .appName("Spark Session Example")
    .getOrCreate();

You must have noticed that we have not created any SparkContext or SparkConf object to initialize SparkSession. This is because SparkConf and SparkContext are encapsulated in SparkSession; no explicit initialization of these objects is required.

The SparkConf and SparkContext can be accessed using the SparkSession object as follows:

SparkContext sparkContext = sparkSession.sparkContext(); 
SparkConf conf = sparkSession.sparkContext().getConf(); 
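For instance, the settings supplied on the builder can be read back through these encapsulated objects. A minimal, self-contained sketch (the local master and app name are just illustrative values):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.sql.SparkSession;

public class SparkSessionAccess {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder()
                .master("local")
                .appName("Spark Session Example")
                .getOrCreate();

        // SparkContext and SparkConf were created internally by the builder
        SparkContext sparkContext = sparkSession.sparkContext();
        SparkConf conf = sparkContext.getConf();

        // The builder's settings are visible on the encapsulated SparkConf
        System.out.println(conf.get("spark.master"));   // local
        System.out.println(conf.get("spark.app.name")); // Spark Session Example

        sparkSession.stop();
    }
}
```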

You can run Spark SQL using SparkSession. However, Hive support is not enabled by default. To enable it, initialize the SparkSession as follows:

SparkSession sparkSession = SparkSession.builder()
    .master("local")
    .appName("Spark Session Example")
    .enableHiveSupport()
    .getOrCreate();
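Once a session is available, SQL statements are submitted with sparkSession.sql(), which returns a Dataset&lt;Row&gt;. A minimal sketch that does not require Hive (it uses Spark SQL's built-in range table-valued function, so no table setup is assumed):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkSqlExample {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder()
                .master("local")
                .appName("Spark SQL Example")
                .getOrCreate();

        // sql() parses and executes the statement, returning a Dataset<Row>
        Dataset<Row> df = sparkSession.sql("SELECT * FROM range(5)");
        df.show();
        System.out.println(df.count()); // 5

        sparkSession.stop();
    }
}
```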

As described in Chapter 3, Lets Spark, runtime configurations can be set in the SparkConf object before initializing the SparkContext. Since SparkSession does not require an explicit SparkConf, configuration parameters for the Spark application can be provided on the builder instead:

SparkSession sparkSession = SparkSession.builder()
    .master("local")
    .appName("Spark Session Example")
    .config("spark.driver.memory", "2G")
    .getOrCreate();

After the SparkSession is initialized, runtime configurations can still be altered through its RuntimeConfig object. Note that a setting such as spark.driver.memory only takes effect if it is supplied before the driver JVM starts (for example, on the builder as shown above); on a running session, only runtime-modifiable settings can be changed, for example:

sparkSession.conf().set("spark.sql.shuffle.partitions", "50");
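The same RuntimeConfig object can read values back, with an optional default for keys that were never set. A minimal sketch using spark.sql.shuffle.partitions, one of the configurations that can be changed on a live session (the fallback key name below is purely illustrative):

```java
import org.apache.spark.sql.SparkSession;

public class RuntimeConfExample {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder()
                .master("local")
                .appName("Runtime Conf Example")
                .getOrCreate();

        // Alter a runtime configuration on the live session
        sparkSession.conf().set("spark.sql.shuffle.partitions", "50");
        System.out.println(sparkSession.conf().get("spark.sql.shuffle.partitions")); // 50

        // get() with a default value for keys that were never set
        System.out.println(sparkSession.conf().get("spark.illustrative.key", "fallback"));

        sparkSession.stop();
    }
}
```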