Writing efficient queries

Before finishing this section, I also need to mention that no join, compression algorithm, or any other feature that SQL Server offers can help you if you write inefficient queries. A good example of a typical DW query is one that involves running totals. You can use non-equi self joins for such queries, which is a very good example of an inefficient query. The following code calculates the running total for the profit ordered over the sale key with a self join. The code also measures the IO and time needed to execute the query. Note that the query uses a CTE first to select 12,000 rows from the fact table. A non-equi self join is a quadratic algorithm; with double the number of the rows, the time needed increases by a factor of four. You can play with different number of rows to prove that:

SET STATISTICS IO ON; 
SET STATISTICS TIME ON; 
WITH SalesCTE AS 
( 
SELECT [Sale Key] AS SaleKey, Profit 
FROM Fact.Sale 
WHERE [Sale Key] <= 12000 
) 
SELECT S1.SaleKey, 
 MIN(S1.Profit) AS CurrentProfit,  
 SUM(S2.Profit) AS RunningTotal 
FROM SalesCTE AS S1 
 INNER JOIN SalesCTE AS S2 
  ON S1.SaleKey >= S2.SaleKey 
GROUP BY S1.SaleKey 
ORDER BY S1.SaleKey; 
SET STATISTICS IO OFF; 
SET STATISTICS TIME OFF; 

With 12,000 rows, the query needed 817,584 logical reads in a worktable, which is a temporary representation of the test fact table on the right side of the self join, and on the top of this, more than 3,000 logical reads for the left representation of the fact table. On my computer, it took more than 12 seconds (elapsed time) to execute this query, with more than 72 seconds of CPU time, as the query was executed with a parallel execution plan. With 6,000 rows, the query would need approximately four times less IO and time.

You can calculate running totals very efficiently with window aggregate functions. The following example shows the query rewritten. The new query uses the window aggregate functions:

SET STATISTICS IO ON; 
SET STATISTICS TIME ON; 
WITH SalesCTE AS 
( 
SELECT [Sale Key] AS SaleKey, Profit 
FROM Fact.Sale 
WHERE [Sale Key] <= 12000 
) 
SELECT SaleKey, 
 Profit AS CurrentProfit,  
 SUM(Profit)  
   OVER(ORDER BY SaleKey 
        ROWS BETWEEN UNBOUNDED PRECEDING 
                 AND CURRENT ROW) AS RunningTotal 
FROM SalesCTE 
ORDER BY SaleKey; 
SET STATISTICS IO OFF; 
SET STATISTICS TIME OFF; 

This time, the query used 331 reads in the fact table, 0 (zero) reads in the worktable, 0.15 second elapsed time, and 0.02 second CPU time. SQL Server didn't even bother to find a parallel plan.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.117.100.89