10.7. Optimizing WHERE Clause Processing with Indexes

To get the best possible performance from programs containing SQL procedure code, an index and WHERE clause can be used together (see the list below). Using a WHERE clause restricts processing in a table to a subset of selected rows (see Chapter 2, “Working with Data in PROC SQL” for specific details). When an index exists, the SQL processor determines whether to take advantage of it during WHERE clause processing. Although the SQL processor determines whether using an index will ultimately benefit performance, when it does the result can be an improvement in processing speeds.

  • Comparison operators such as EQ (=), LT (<), GT (>), LE (<=), GE (>=), and NOT

  • Comparison operators with the IN operator

  • Comparison operators with the colon modifier (for example, NOT = :“Ab”)

  • CONTAINS operator

  • IS NULL or IS MISSING operator

  • Pattern-matching operators such as LIKE and NOT LIKE

10.7.1. Constructing Efficient Logic Conditions

When constructing a chain of AND conditions in a WHERE clause, specify the most restrictive conditional values first. This way the SQL processor expends fewer resources by bypassing rows that do not satisfy the first conditional value in the WHERE clause. For example, the first PROC SQL step below may expend more resources because the first condition “SOFTWARE” occurs even for products costing more than $99.00.

SQL Code (Less Efficient)

PROC SQL;
  SELECT *
    FROM PRODUCTS
      WHERE UPCASE(PRODTYPE) = 'SOFTWARE' AND
            PRODCOST < 100.00;
QUIT;


For this data, a more efficient way of producing the same results as the previous example, while reducing CPU resources, is to code the second and more restrictive condition first so it appears as follows.

SQL Code (More Efficient)

PROC SQL;
  SELECT *
    FROM PRODUCTS
      WHERE PRODCOST < 100.00 AND
            UPCASE(PRODTYPE) = 'SOFTWARE';
QUIT;


Another popular construct uses a series of OR conditions equality tests or the IN predicate to select rows that match the multiple conditions. Programmers often order these kinds of lists by order of magnitude or alphabetically to make the lists easier to maintain. A better and more efficient way would be to order the list from the most frequently occurring values to the least frequent.

Note:One way to determine the frequency of values is to submit the following code:
 SELECT COUNT(PRODTYPE) FROM SQL.PRODUCTS GROUP BY
 PRODTYPE;


Once the frequencies are known, they can be specified in that order. This way the SQL processor expends fewer resources locating frequently occurring values because it has to perform fewer steps to return a value of TRUE. For example, the first SQL step below expends more resources because the first condition “LAPTOP” occurs less frequently than the value “SOFTWARE”. Consequently, the SQL processor needs to process the second condition in order to find a match resulting in a value of TRUE being returned.

SQL Code (Less Efficient):

PROC SQL;
  SELECT *
    FROM PRODUCTS
      WHERE UPCASE(PRODTYPE) IN ('LAPTOP',
 'SOFTWARE'),
QUIT;


A more efficient way of processing the same data but generating the same results as the previous code is to place “SOFTWARE” first as follows:

SQL Code (More Efficient):

PROC SQL;
  SELECT *
    FROM PRODUCTS
      WHERE UPCASE(PRODTYPE) IN ('SOFTWARE',
 'LAPTOP'),
QUIT;


10.7.2. Avoiding UNIONs

UNIONs are executed by creating two internal sets, then merge-sorting the results together. Duplicate rows are automatically eliminated from the final results. For example, the SQL procedure code illustrated below first constructs the two result sets from each query, then merges and sorts the two sets together, and eliminates duplicate rows from the final results.

SQL Code (Less Efficient):

PROC SQL;
  SELECT *
    FROM PRODUCTS
      WHERE UPCASE(PRODTYPE) = 'LAPTOP'

  UNION

  SELECT *
    FROM PRODUCTS
      WHERE UPCASE(PRODTYPE) = 'SOFTWARE';
QUIT;


To improve UNION performance, SQL procedure code can be converted to a single query using OR conditions in a WHERE clause. The next example illustrates the previous SQL procedure code being made more efficient by converting the UNION to a single query using an OR operator (or IN predicate) in a WHERE clause.

SQL Code (More Efficient):

 PROC SQL;
   SELECT DISTINCT *
     FROM PRODUCTS
       WHERE UPCASE(PRODTYPE) = 'SOFTWARE' OR
												UPCASE(PRODTYPE) = 'LAPTOP';
 QUIT;

<or>

 PROC SQL;
   SELECT DISTINCT *
     FROM PRODUCTS
       WHERE UPCASE(PRODTYPE) IN ('SOFTWARE', 
'LAPTOP'),
 QUIT;


Another approach that can aid in improving the way a UNION performs is to specify the ALL keyword with the UNION operator, as long as duplicates are not an issue. Because UNION ALL does not remove duplicate rows, CPU resources may be improved. The next example shows the UNION ALL coding construct being used to perform what amounts to an append operation, thereby bypassing the sort altogether because the duplicate rows are not removed.

SQL Code (Less Efficient):

PROC SQL;
  SELECT *
    FROM PRODUCTS
      WHERE UPCASE(PRODTYPE) = 'LAPTOP'

  UNION ALL

  SELECT *
    FROM PRODUCTS
      WHERE UPCASE(PRODTYPE) = 'SOFTWARE';
QUIT;


..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.224.54.136