Chapter 14 – Explains

“A life filled with love may have some thorns, but a life empty of love will have no roses.”

Anonymous

 

EXPLAIN

image

The EXPLAIN command is a nice tool provided with the NPS database. It is designed to provide an English explanation of what steps the SPU must complete to satisfy the SQL request. The EXPLAIN is based on the SMP Host’s optimized execution plan.

The EXPLAIN is an SQL modifier; it modifies the way the SQL operates. When an SQL statement is submitted using the EXPLAIN, the SMP HOST still does the same optimization step as normal. However, instead of building the SPU execution steps, it builds the English explanation and sends it back to the client software, not to the SPU. This gives users the ability to see resource utilization, use of indices, and row and time estimates.

Therefore, it can predict a Cartesian product join in seconds instead of hours later when the user gets suspicious that the request should have been finished. The EXPLAIN should be run every time changes to an object’s structure occur, when a request is first put into production, and other key times during the life of an application. Some companies require that the EXPLAIN always be run before execution of any new queries.

 

EXPLAIN Terms

image

Above are some key terms you will see in the explain results. The first area to check in the output of the EXPLAIN is the estimated cost, expressed in time, to complete the SQL request. Although it is expressed in time, do not confuse it with either wall-clock or CPU time. It is strictly a cost factor calculated by the optimizer for comparison purposes only. It does not take the number of users, the current workload, or other system related factors into account. After looking at the potential execution plans, the plan with the lowest cost value is selected for execution. Once these two values are checked, the question that should be asked is: Are these values reasonable?

 

EXPLAIN Terms Continued

image

Above are some key terms you will see in the explain results. The first are of concern should be in checking the cost. The second area is the estimated number of rows that will be returned. This number is an educated guess that the SMP Host has made based on information available at the time of the EXPLAIN. This number may or may not be accurate. If there are current STATISTICS on the table, the numbers are more accurate. If you expect 100 rows and it says 1,000,000,000, then this is when you recheck your query. You run an EXPLAIN to ensure your query does what you expect it to do.

 

EXPLAIN Syntax

image

You run the EXPLAIN on queries to see the optimizers plan. It will show a wide variety of detail so you can better understand what Netezza will do to retrieve the data. Some companies require a user to always look at the explain plan before executing any query. This chapter will show you how to better understand explains.

 

EXPLAIN Example

image

The EXPLAIN will show you the cost to retrieve the first row and the cost to retrieve all rows. The 0.0 means really fast. This is a small table. It only has 21 rows, and there is not a lot of width to the table. That means the columns are few and small. A Sequential Scan means all SPUs are going to read all rows. This is a full table scan.

 

EXPLAIN Verbose Example

image

Use the word Verbose in the EXPLAIN, and more information is delivered.

 

EXPLAIN Example For A Join

image

We just wrote the SQL to perform a two-table join between the Employee_Table and the Department_Table. The system is going to do a Hash Join. This means it will place all of the joining rows inside the memory of each SPU and quickly join them together. Hash Joins are fast because the table or tables are small enough to fit into memory entirely.

 

EXPLAIN Verbose Example For A Join

image

This query will do a full table scan of the Department_Table and eliminate any rows where the Dept_No is Null. What is important is that joining rows must be joined on the same SPU. To accomplish this, the optimizer is going to broadcast the Department_Table (in its entirety) to every SPU (just in memory for the duration of this query). Now that the Department_Table has been copied to every SPU, the planner will do a full table scan of the Employee_Table (eliminating rows where Dept_No is Null) and then perform the join. Consider using EXPLAIN VERBOSE on queries that join tables.

 

Good Advice - Join Tables by the Same Distribution Key

image

Notice that both tables have a Distribution Key of Dept_No. We will join these two tables together by Dept_No. Turn the page and see the EXPLAIN VERBOSE plan. You will notice that since the tables are both Hash Distributed on Dept_No, and it is the column Dept_No that will be used to join them together, that no data moves. This is because the joining rows are already on the exact same SPU. For example, let’s assume all the employees in Dept_No 400 were hashed to SPU 2. That means the Department_Table row for 400 was also hashed to SPU 2. The joining rows are naturally on the same SPU.

 

EXPLAIN Verbose - Join With Matching Distribution Keys

image

There is no data movement because both tables have a distribution key of Dept_No and they are joined by Dept_No.

 

EXPLAIN DISTRIBUTION

image

The EXPLAIN DISTRIBUTION will show how the result set will be distributed among the SPUs. In the case above, the system will DISTRIBUTE by Dept_No among the SPUs to satisfy this query.

 

EXPLAIN PLANTEXT Example

image

The EXPLAIN PLANTEXT shows more than just using the EXPLAIN by itself, but shows less than EXPLAIN VERBOSE.

 

EXPLAIN PLANGRAPH Example

image

The EXPLAIN PLANGRAPH shows the EXPLAIN in HTML

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.226.88.110