SQL

The SQL tab in the Spark UI provides details of the Spark SQL queries executed by the application. We will learn how the Spark SQL framework works in Chapter 8, Working with Spark SQL. For now, let us run the following commands in the Spark shell. The tab is accessible at http://localhost:4040/SQL.

In the following example, we will read a JSON file in Spark using SparkSession, create a temporary view on the JSON data, and then run a Spark SQL query against that view:

scala> import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.SparkSession

scala> import spark.implicits._
import spark.implicits._

scala> val spark = SparkSession.builder().appName("Spark SQL basic example").getOrCreate()
16/11/13 22:27:51 WARN SparkSession$Builder: Use an existing SparkSession, some configuration may not take effect.
spark: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@53fd59d4

scala> val df = spark.read.json("/usr/local/spark/examples/src/main/resources/people.json")
df: org.apache.spark.sql.DataFrame = [age: bigint, name: string]

scala> df.createOrReplaceTempView("people")

scala> val sqlDF = spark.sql("SELECT * FROM people")
sqlDF: org.apache.spark.sql.DataFrame = [age: bigint, name: string]

scala> sqlDF.show()
+----+-------+
| age|   name|
+----+-------+
|null|Michael|
|  30|   Andy|
|  19| Justin|
+----+-------+

After running the preceding example, the details of the SQL query can be found in the SQL tab. It shows the DAG of the SQL query along with the query plan, which reveals how Spark optimized the executed query.
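
The plan displayed in the SQL tab can also be printed in the shell itself. As a quick sketch using the sqlDF DataFrame created above, explain(true) prints the parsed, analyzed, and optimized logical plans followed by the physical plan; the exact plan text varies with the Spark version, so it is omitted here:

scala> sqlDF.explain(true)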

The SQL tab retrieves all this information using org.apache.spark.sql.execution.ui.SQLListener.
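
SQLListener is an internal class, but it builds on the same public SparkListener mechanism that any application can hook into. The following is a minimal sketch, not the internal SQLListener itself, of registering a custom listener that logs job start and end events; the class name JobLoggingListener is our own, chosen for illustration, and the code can be pasted into the shell using :paste.

// Register a custom listener through the public SparkListener API.
// This only logs job events; it is an illustration of the listener
// mechanism that the SQL tab's SQLListener is built on.
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd, SparkListenerJobStart}

class JobLoggingListener extends SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit =
    println(s"Job ${jobStart.jobId} started with ${jobStart.stageInfos.size} stage(s)")

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit =
    println(s"Job ${jobEnd.jobId} finished with result: ${jobEnd.jobResult}")
}

// Attach the listener to the running SparkContext of the shell's SparkSession.
spark.sparkContext.addSparkListener(new JobLoggingListener)

Once the listener is registered, re-running sqlDF.show() prints a start and end message for each job triggered by the query, mirroring the kind of events the SQL tab records.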
