10.5. Indexes and Performance

Indexes can be used to allow rapid access to table rows. Rather than physically sorting a table (as performed by the ORDER BY clause or PROC SORT), an index is designed to set up a logical arrangement for the data without the need to physically sort it. This has the advantage of reducing CPU and memory requirements. It also reduces data access time when using WHERE clause processing (discussed in Section 10.7).

Indexes are useful, but they do have drawbacks. As data in a table is inserted, modified, or deleted, an index must be updated to address the changes. This automatic feature requires additional CPU resources to process any changes to a table. Also, as a separate structure in its own right, an index can consume considerable storage space. As a consequence, care should be exercised not to create too many indexes but to assign indexes to only those discriminating variables in a table. Here are a few suggestions for creating indexes.

  • Sort data in ascending order on the key column prior to creating the index.

  • Sort the data by the key variable first to achieve the greatest performance improvement.

  • Sort data in ascending order by the key variable before it is appended to the table.

  • Create simple indexes, when possible, to be used by most queries.

  • Avoid creating one single index for all queries.

  • Assign indexes to the most discriminating of variables.

  • Select columns that are frequently the subject of summary functions (COUNT, SUM, AVG, MIN, MAX, etc.).

  • Only create indexes that are actually needed.

  • Avoid taxing CPU resources associated with index maintenance (maintaining an index during inserts, modifications, and deletions) by selecting columns that do not change frequently.

  • On some operating systems indexes are stored as a separate file on disk, using additional memory and disk space to store the structure.

  • To avoid excessive and unnecessary I/O operations, prior to creating an index sort data in ascending order by the most discriminating key column.

  • Attempt to define composite indexes using the most discriminating of the variables as your first variable in the index.

  • Select columns that do not have numerous null values because this results in a large percentage of rows with the same value.

Note:Indexes should only be created on tables where query search time needs to be optimized. Any unnecessary indexes may force the SAS System to expend resources needlessly—updating and reorganizing after insert, update, and delete operations are performed. And even worse, the SQL optimizer may accidentally use an index when it should not.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.12.157