Chapter 5. Intro to Query Tuning

The topic of query optimization and tuning is easily worth another book. Indeed, there are many books available already, and I encourage you to read them and hone your skills. I will not try to duplicate them here; instead, this chapter covers some of the most important concepts you need to understand to tune queries.

You cannot master the process of query optimization without understanding the internal index structure and patterns that SQL Server uses to access data. This chapter thus begins with a high-level overview of B-Tree indexes and seek-and-scan operations.

Next, I discuss statistics and cardinality estimations, along with ways to read and analyze execution plans.

Finally, I cover several common issues you might encounter during the query tuning process, offering advice on how to address them and index the data.

Data Storage and Access Patterns

Modern SQL Server versions support three data storage and processing technologies. The oldest and most commonly used one is row-based storage. With row-based storage, all table columns are combined together into the data rows that reside on 8KB data pages. Logically, those data rows belong to B-Tree indexes or heaps (which we’ll discuss in a moment).

Starting with SQL Server 2012, you can store some indexes or entire tables in columnar format using column-based storage. The data in such indexes is heavily compressed and stored on a per-column basis. This technology is optimized and provides great performance for read-only analytical queries that scan large amounts of data. Unfortunately, it does not scale well in an OLTP workload.

Finally, starting with SQL Server 2014, you can use In-Memory OLTP and store data in memory-optimized tables. The data in such tables resides completely in memory and is great for heavy OLTP workloads.

Note

You can use all three technologies—row-based, column-based, and memory-optimized tables—together, partitioning data between them. This approach is extremely useful when you need to support heavy OLTP and analytical workloads in the same system. I cover that architecture pattern in detail in my book Pro SQL Server Internals.

Row-based storage is the default and by far most common storage technology in SQL Server. The CREATE TABLE and CREATE INDEX statements will store data in a row-based format unless you specify otherwise. It can handle moderate OLTP and analytical workloads and introduces less database administration overhead than columnstore indexes and In-Memory OLTP.

In this chapter, I will focus on row-based storage and queries that work with B-Tree indexes. I will discuss troubleshooting aspects of columnstore indexes and In-Memory OLTP in Chapters 8 and 14.

Let’s look at how SQL Server stores data in row-based storage.

Row-Based Storage Tables

Internally, the structure of a row-based table consists of multiple elements and internal objects, as shown in Figure 5-1.

Figure 5-1. Internal table structure

The data in the tables is stored either completely unsorted (those tables are called heap tables or heaps) or sorted based on the value of a clustered index key, when such an index is defined.

I am not going to dive deep into detail, but as a general rule, it is better to avoid heaps and define clustered indexes on your tables. There are some edge cases when heap tables may outperform tables with clustered indexes; nevertheless, heaps have several shortcomings. In most cases, you’ll get better performance when tables have clustered indexes.

In addition to a single clustered index, every table may have a set of nonclustered indexes: separate data structures that store copies of the data from a table sorted according to index key columns. For example, if a column is included in two nonclustered indexes, SQL Server would store that data three times - once in a clustered index or heap, and once in each nonclustered index.

While SQL Server allows you to create large numbers of nonclustered indexes, doing so is not a good idea. In addition to the storage overhead, SQL Server needs to insert, update, or delete data in each nonclustered index during data modifications, maintaining multiple copies of the data.

Internally, each index (and heap) consists of one or more partitions. You can think of each partition as an internal data structure (index or heap) that is independent from other partitions in the table. You can use a different partition strategy for every index in the table; however, it is usually beneficial to partition all indexes in the same way, aligning them with each other.

As I mentioned above, the actual data is stored in data rows on 8KB data pages, with 8,060 bytes available to users. The data from all columns is stored together, except when the column data does not fit on the data page.

Data pages belong to one of three categories called allocation units.

IN_ROW_DATA allocation unit pages store the main data row objects, which consist of internal attributes and the data from fixed-length columns (such as int, datetime, float, etc.). The in-row part of a data row must fit on a single data page, so it cannot exceed 8,060 bytes. The data from variable-length columns, such as (n)varchar(max), (n)varbinary(max), xml and others, may also be stored in-row in the main row object when it fits into this limit.

When variable-length data does not fit in-row, SQL Server stores it off-row on different data pages, referencing them through in-row pointers. Variable-length data that exceeds 8,000 bytes is stored on LOB_DATA allocation unit data pages (LOB stands for large objects). Otherwise, the data is stored in ROW_OVERFLOW_DATA allocation unit pages.

I'd like to repeat a well-known piece of advice here: do not retrieve unnecessary columns in SELECT statements, especially with the SELECT * pattern. Doing so may lead to additional I/O operations to get data from off-row pages, and it may also prevent the use of covering indexes, as you'll see later in the chapter.

Finally, SQL Server logically groups sets of eight pages into 64KB units called extents. There are two types of extents available. Mixed extents store data that belongs to different objects. Uniform extents store the data for the same object. By default, when a new object is created, SQL Server stores the first eight object pages in mixed extents. After that, all subsequent space allocation for that object is done with uniform extents.

You can disable mixed extents allocation in old versions of SQL Server (prior to 2016) with the server-level trace flag T1118. In SQL Server 2016 and above, you can control it at the database level with the MIXED_PAGE_ALLOCATION database option. Turning mixed extents off reduces the number of modifications in the system tables when a new table is created. In user databases, doing so rarely gives you noticeable benefits; however, it may significantly improve tempdb throughput in busy OLTP systems. Starting with SQL Server 2016, tempdb no longer uses mixed extents, so you don't need to enable the trace flag there.
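
If you want to check or change this behavior, the minimal sketch below shows the relevant commands. MyDatabase is a placeholder name, and T1118 is normally configured as a startup parameter rather than enabled at runtime.

-- SQL Server 2016 and above: check and change the setting per database
SELECT name, is_mixed_page_allocation_on FROM sys.databases;

ALTER DATABASE MyDatabase SET MIXED_PAGE_ALLOCATION OFF;

-- Prior to SQL Server 2016: disable mixed extents allocation server-wide
DBCC TRACEON (1118, -1);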

Next, let’s look at the structure of the indexes.

B-Tree Indexes

Clustered and nonclustered indexes have a very similar internal format called B-Tree. Let's create an example table called Customers, defined in Listing 5-1. The table has a clustered index defined on the CustomerId column and a nonclustered index on the Name column.

Example 5-1. Customers table
CREATE TABLE dbo.Customers
(
    CustomerId INT NOT NULL,
    Name NVARCHAR(64) NOT NULL,
    Phone VARCHAR(32) NULL, 
    /* Other Columns */ 
);
CREATE UNIQUE CLUSTERED INDEX IDX_Customers_CustomerId
ON dbo.Customers(CustomerId);
CREATE NONCLUSTERED INDEX IDX_Customers_Name
ON dbo.Customers(Name);

The logical structure of the clustered index is shown in Figure 5-2.

Figure 5-2. B-Tree index

The bottom level of the index is called the leaf level. It stores the data sorted according to the index key value. If it is a clustered index, the leaf level stores all data from the table sorted based on the clustered key. To be exact, the leaf level includes IN_ROW data only, which may reference off-row column data on the other pages.

If all data in the index fits into a single data page, the index will consist of that single leaf page. Otherwise, SQL Server will start to build intermediate levels of the index. Each row on an intermediate level page references the page from the level below and contains the minimal value of the key in the referenced page, as well as its physical address (FileId:PageId) in the database. The only exception is the very first row, which stores NULL instead of the minimal value of the key.

SQL Server continues to build intermediate levels until it ends with a level with a single page. This level is called the root level; it is the entry point to the index.

The pages on each level of the index are linked into a doubly linked list: each page knows the previous and the next page in the index. This allows SQL Server to scan indexes both forward and backward. (Keep in mind, however, that a backward scan may be less efficient, since SQL Server does not use parallelism during that operation.)

SQL Server can access data in the index through either an index scan or an index seek. There are two ways SQL Server can perform a scan.

The first is an allocation order scan. SQL Server tracks the extents that belong to each index in the database through system pages called Index Allocation Maps (IAM). It reads the data pages from the index in random order according to index allocation data. This method, however, could introduce data consistency problems and is rarely used.

The second, more common method is called an ordered scan. Let’s assume that you want to run the SELECT Name FROM dbo.Customers query. All data rows reside on the leaf level of the index and SQL Server can scan it and return the rows to the client.

SQL Server starts with the root page of the index and reads the first row from there. That row references the intermediate page with the minimum key value from the table. SQL Server reads that page and repeats the process until it finds the first page on the leaf level. Then SQL Server starts to read rows one by one, moving through the linked list of the pages until all rows have been read (Figure 5-3).

Figure 5-3. Index scan

Obviously, in real life, it may become more complicated. In some cases, a query may scan multiple parts of the index simultaneously with a parallel execution plan. In others, SQL Server may combine the index scans of simultaneously running queries into a single physical index scan. Nevertheless, when you see the Index Scan operator in an execution plan, you can assume that this operator will access all data from the index.

There is one exception, however: when an index scan feeds directly into a Top operator. In that case, the scan operator stops after it returns the number of rows requested by TOP and does not access the entire table. Usually, this happens when your query does not have an ORDER BY clause, or when the ORDER BY clause matches the index key.

Figure 5-4 shows part of the execution plan of SELECT TOP 3 Name FROM dbo.Customers ORDER BY CustomerId query. The Number of Rows Read and Actual Number of Rows properties in the Index Scan operator indicate that the scan stopped after it read three rows.

Figure 5-4. Top and Index Scan operators

As you can imagine, reading all data from a large index is an expensive operation. Fortunately, SQL Server can access a subset of the data by using the index seek operation. Say you want to run the following query: SELECT Name FROM dbo.Customers WHERE CustomerId BETWEEN 4 AND 7. Figure 5-5 illustrates how SQL Server might process it.

Figure 5-5. Index seek

In order to read the range of rows from the table, SQL Server needs to find the row with the minimum value of the key from the range, which is 4. SQL Server starts with the root page, where the second row references a page with a minimum key value of 350. That is greater than the key value you’re looking for, so SQL Server reads the intermediate-level data page (1:170) referenced by the first row on the root page.

Similarly, the intermediate page leads SQL Server to the first leaf-level page (1:176). SQL Server reads that page, then it reads the rows with CustomerId equal 4 and 5, and finally, it reads the two remaining rows from the second page.

Technically speaking, there are two kinds of index seek operations:

Point-lookup
The first is called a point-lookup (or, sometimes, a singleton lookup), where SQL Server seeks and returns a single row. For example, the WHERE CustomerId = 2 predicate is a point-lookup operation.
Range scan
The other type is called a range scan. It requires SQL Server to find the lowest or highest value of the key and scan the set of rows (either forward or backward) until it reaches the end of the scan range. The predicate WHERE CustomerId BETWEEN 4 AND 7 leads to the range scan. Both cases are shown as Index Seek operators in the execution plans.

As you can guess, index seek is more efficient than index scan because SQL Server usually processes just a subset of rows and data pages, rather than scanning the entire index. However, the Index Seek operator in the execution plan may be misleading and represent an inefficient range scan that reads a large number of rows, or even the entire index. I will talk about this condition later in the chapter.

There is a concept in relational databases called SARGable predicates, which stands for Search ARGument-able. SARGable predicates allow SQL Server to isolate a subset of the index key to process. In a nutshell, with a SARGable predicate, SQL Server can determine a single key value or a range of index key values to read during predicate evaluation and utilize an Index Seek operation when a suitable index exists.

Obviously, it is beneficial to write queries using SARGable predicates and utilize index seek whenever possible. This is done using operators, which include =, >, >=, <, <=, IN, BETWEEN, and LIKE (for prefix matching). Non-SARGable operators include NOT, <>, LIKE (when not prefix matching), and NOT IN.

Predicates are also non-SARGable when they apply functions (system or non-inlined user-defined) to table columns. SQL Server must call the function for every row it processes to evaluate the predicate, which prevents it from using an index seek.

The same applies to data type conversions where SQL Server uses the CONVERT_IMPLICIT internal function. One common example is using the Unicode nvarchar parameter in the predicate with a varchar column. Another case is when you have different data types in the columns that participate in the join predicate. Both cases could lead to an index scan, even when the predicate operator appears to be SARGable.
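
To illustrate the implicit conversion problem, here is a minimal sketch based on the Customers table from Listing 5-1. It assumes a nonclustered index exists on the Phone column (such an index is not part of Listing 5-1), and whether the mismatched data type actually produces a scan also depends on the column collation, so treat it as an illustration rather than a guarantee.

-- Non-SARGable: the Unicode parameter does not match the VARCHAR column,
-- so SQL Server wraps Phone in CONVERT_IMPLICIT, which typically
-- results in an index scan
DECLARE @UnicodePhone NVARCHAR(32) = N'123-456-7890';
SELECT CustomerId FROM dbo.Customers WHERE Phone = @UnicodePhone;

-- SARGable: the parameter type matches the column, allowing an index seek
DECLARE @Phone VARCHAR(32) = '123-456-7890';
SELECT CustomerId FROM dbo.Customers WHERE Phone = @Phone;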

Composite Indexes

Indexes with multiple key columns are called composite indexes. The data in the composite indexes is sorted per column, from left to right. Figure 5-6 shows the structure of a composite index defined on LastName and FirstName columns in the table. The data is first sorted based on LastName (the leftmost column), then on FirstName within each LastName value.

Figure 5-6. Composite indexes

The SARGability of a composite index depends on the SARGability of the predicates on the leftmost index columns, which allow SQL Server to determine and isolate the range of the index keys to process.

Table 5-1 shows examples of SARGable and non-SARGable predicates, using the index from Figure 5-6.

Table 5-1. Composite index and SARGability
SARGable predicates | Non-SARGable predicates
LastName = 'Clark' AND FirstName = 'Steve' | LastName <> 'Clark' AND FirstName = 'Steve'
LastName = 'Clark' AND FirstName <> 'Steve' | LastName LIKE '%ar%' AND FirstName = 'Steve'
LastName = 'Clark' | FirstName = 'Steve'
LastName LIKE 'Cl%' |

Nonclustered Indexes

While a clustered index specifies how data rows are sorted in a table, nonclustered indexes define a separate sorting order for a column or set of columns, storing them as separate data structures.

Think about a book, for example. Its page numbers represent the book’s clustered index. The term index—that’s the one labeled “Index” at the end of the book—lists terms from the book in alphabetical order. Each term references the numbers of each page where the term is mentioned. It is thus a nonclustered index of the terms.

When you need to find a term in the book, you can look it up in the term index. It is a fast and efficient operation, because terms are sorted in alphabetical order. Next, you can quickly find the pages on which the terms are mentioned using the page numbers specified there. Without the term index, your only choice would be to read the entire book, page by page, until you find all references to the term.

As I have noted, clustered and nonclustered indexes use a similar B-Tree structure. Figure 5-7 shows the structure of a nonclustered index (right) on the Name column we created in Listing 5-1. It also shows the clustered index (left), for reference.

Figure 5-7. Clustered and nonclustered indexes in Customers table

The leaf level of the nonclustered index is sorted based on the value of the index key (Name). Every row on the leaf level includes the key value and row-id value. For tables with a clustered index, row-id represents the value of the clustered index key of the row.

This is a very important thing to remember: nonclustered indexes do not store information about physical row location when a table has a clustered index defined. They store the value of the clustered index key instead. This also means that nonclustered indexes include the data from clustered index key columns even if you don’t explicitly add those columns to the index definition.
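
As an illustration, consider the query below against the Customers table from Listing 5-1. Both columns it references are present in IDX_Customers_Name (Name is the index key and CustomerId is stored as the row-id), so you would typically see a single Index Seek on the nonclustered index and no lookups in the plan.

-- Served entirely by IDX_Customers_Name: Name is the index key and
-- CustomerId (the clustered index key) is stored as the row-id
SELECT CustomerId, Name
FROM dbo.Customers
WHERE Name = 'Boris';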

Like clustered indexes, the intermediate and root levels of nonclustered indexes store one row per page from the level they reference. That row consists of the physical address and the minimum value of the key from the page. In non-unique indexes, it also stores the row-id of such a row.

Let’s look at how SQL Server uses nonclustered indexes. I’ll run the following query: SELECT Name, Phone FROM dbo.Customers WHERE Name = ‘Boris’. Figure 5-8 shows that process.

Figure 5-8. Nonclustered Index Usage: Part 1

Similar to the clustered index, SQL Server starts with the root page of the nonclustered index. The key value Boris is less than Dan, so SQL Server goes to the intermediate page referenced from the first row in the root-level page.

The second row of the intermediate page indicates that the minimum key value on the second leaf page is Boris. However, the index is not defined as unique, so SQL Server does not know whether there are other Boris rows stored on the first leaf page. As a result, it goes to the first leaf page of the index and finds the row with the key value Boris and a row-id of 7 there.

In our case, the nonclustered index does not store any data besides CustomerId and Name, so SQL Server needs to traverse the clustered index tree and obtain the data for the Phone column from there. This operation is called a key lookup (or a RID lookup in heap tables).

In the next step shown in Figure 5-9, SQL Server comes back to the nonclustered index and reads the second page from the leaf level. It finds another row with the key value Boris and row-id of 93712, and it performs key lookup again.

Figure 5-9. Nonclustered Index Usage: Part 2

As you can see from Figure 5-9, SQL Server had to perform 10 reads even though the query returned just two rows. The number of I/O operations can be estimated with the following formula:

(# of levels in nonclustered index) + (number of pages read from the leaf level of nonclustered index) + (number of rows found) * (# of levels in clustered index)

A large number of rows found (key lookup operations) leads to a large number of I/O operations, which makes using a nonclustered index inefficient.

As a result, SQL Server is very conservative in choosing nonclustered indexes when it expects that a large number of key lookup operations will be required. It may choose to scan a clustered or another nonclustered index instead. The threshold at which SQL Server decides not to use a nonclustered index with key lookups varies, but it is very low – often a fraction of a percent of the total number of rows in the table.
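
One common way to eliminate expensive key lookups is to make the nonclustered index covering by adding the missing columns with the INCLUDE clause. The sketch below recreates the index from Listing 5-1 so it covers the SELECT Name, Phone query from this section; whether the extra index size is worth it depends on your workload.

CREATE NONCLUSTERED INDEX IDX_Customers_Name
ON dbo.Customers(Name)
INCLUDE (Phone)
WITH (DROP_EXISTING = ON);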

The same applies to RID lookup operations. Nonclustered indexes in heap tables store the physical address of the row in row-id. Technically, SQL Server can access the row in a heap with a single read operation; however, it is still expensive. Moreover, if a new version of a row does not fit into the old data page during an update, SQL Server moves it to another place, referencing it through a structure called a forwarding pointer, which contains the address of the new (updated) version of the row. Nonclustered indexes continue to reference the forwarding pointer in row-id, and the RID lookup may require multiple read operations to access the row.

Index Fragmentation

SQL Server always maintains the order of the data in the index, inserting new rows on the data pages to which they belong. If the data page does not have enough free space, SQL Server allocates a new page and places the row there, adjusting the pointers in the double-linked page list to maintain logical sorting order in the index. This operation is called page split, and it leads to index fragmentation, as you’ll see in this section.

Figure 5-10 illustrates this condition. When the original page does not have enough space to accommodate the new row, SQL Server performs a page split, moving about half of the data from original page to the new page and adjusting page pointers afterward.

Figure 5-10. Page split

Page splits can also occur during data modifications. When an update cannot be done in place—for example, during a data row size increase—SQL Server performs a page split and moves updated and subsequent rows from that page to another page. It maintains the index sorting order through the page pointers.

There are two kinds of index fragmentation: external and internal.

External fragmentation
External fragmentation means that the logical order of the pages does not match their physical order in the data files, and/or that logically subsequent pages are not located in the same or adjacent extents. External fragmentation forces SQL Server to jump around while reading the data from the disk, which makes read-ahead less efficient and increases the number of physical reads required. The impact is higher with magnetic drives where random I/O is less efficient than sequential I/O.
Internal fragmentation
Internal fragmentation, on the other hand, means that data pages in the index have free space. As a result, the index uses more data pages to store data, which in turn increases the number of logical reads during query execution. In addition, SQL Server uses more memory in the buffer pool to cache index pages.

A small degree of internal fragmentation is not always bad. It reduces page splits during insert and update operations, when data is inserted into or updated in different pages in the index. A large degree of internal fragmentation, however, wastes index space and reduces the performance of the system.

You can analyze index fragmentation in the system with the sys.dm_db_index_physical_stats view (a sample query follows the column descriptions below). The three most important columns in the result set are:

avg_page_space_used_in_percent
avg_page_space_used_in_percent shows the average percentage of the data storage space used on the page. This value shows you the internal index fragmentation.
avg_fragmentation_in_percent
avg_fragmentation_in_percent provides you with information about external index fragmentation. For tables with clustered indexes, it indicates the percent of out-of-order pages, where the next physical page allocated in the index is different from the page referenced by the next-page pointer of the current page. For heap tables, it indicates the percent of out-of-order extents, where extents do not reside contiguously in the data files.
fragment_count
fragment_count indicates how many contiguous data fragments the index has. Every fragment is a group of extents adjacent to each other. Adjacent data increases the chances that SQL Server will use sequential I/O and read-ahead while accessing the data.
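
Here is a minimal sketch of such an analysis for the current database. It uses LIMITED mode, which keeps the scan cheap but does not populate avg_page_space_used_in_percent; switch to SAMPLED or DETAILED mode if you need internal fragmentation numbers.

SELECT
    OBJECT_NAME(ips.object_id) AS [table]
    ,i.name AS [index]
    ,ips.index_type_desc
    ,ips.avg_fragmentation_in_percent
    ,ips.fragment_count
    ,ips.page_count
FROM
    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
        JOIN sys.indexes i ON
            ips.object_id = i.object_id AND ips.index_id = i.index_id
ORDER BY
    ips.avg_fragmentation_in_percent DESC;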

The impact of index fragmentation can be offset by modern hardware, when servers have enough memory to cache the data in the buffer pool and fast flash-based I/O subsystems to read the data. While it is always beneficial to reduce fragmentation in the system, you need to analyze its impact when designing your index maintenance strategy.

To put things in perspective: If your system has low-activity hours during nights or weekends, use them for index maintenance. However, if your system handles thousands of transactions per second around the clock, do the analysis and estimate the benefits and downsides of different index maintenance strategies. Remember that index maintenance is an expensive operation and will add overhead to the system while it’s running.

There are two index maintenance methods that reduce fragmentation: index reorganize and index rebuild. Let’s look at each in turn.

Index reorganize
Index reorganize, often called index defragmentation, reorders leaf-level data pages into their logical order. It also tries to compress pages, reducing their internal fragmentation. This is an online operation that can be interrupted at any time without losing the progress made up to the point of interruption. You reorganize indexes with the ALTER INDEX REORGANIZE command.
Index rebuild
Index rebuild (ALTER INDEX REBUILD), on the other hand, creates another copy of the index in the table. It is an offline operation, which will lock the table in non-Enterprise editions of SQL Server. In the Enterprise edition it can be done online, though it will still require a short table-level lock at the beginning and end of execution.
Microsoft documentation recommends rebuilding indexes when their external fragmentation (avg_fragmentation_in_percent) exceeds 30% and reorganizing indexes when fragmentation is between 5% and 30%. You can use those values as a rule of thumb; however, as I mentioned, it may be better to analyze and tune for your own use cases.

Pay attention to the FILLFACTOR index property, which allows you to reserve some free space during index creation or rebuild, reducing page splits afterwards. Unless you have an ever-increasing append-only index, you should set FILLFACTOR below 100%. I usually start with 85 or 90% and fine-tune the values to get the least internal and external fragmentation in the index.
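
To tie this together, here is a minimal sketch of both maintenance commands against the Customers table from Listing 5-1; the ONLINE = ON option assumes an edition that supports online rebuilds.

-- Reorganize: always online, can be interrupted without losing progress
ALTER INDEX IDX_Customers_Name ON dbo.Customers REORGANIZE;

-- Rebuild, reserving 10% free space on the pages to reduce future page splits
ALTER INDEX IDX_Customers_CustomerId ON dbo.Customers
REBUILD WITH (FILLFACTOR = 90, ONLINE = ON);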

Finally, for heap tables, the sys.dm_db_index_physical_stats view provides information about forwarding pointers in the forwarded_record_count column. Tables with a large number of forwarding pointers are inefficient and need to be rebuilt with the ALTER TABLE REBUILD operation. In most cases, however, the better option is converting them into tables with clustered indexes.

Note

Ola Hallengren has provided a set of scripts that have become the de facto standard for database maintenance tasks. Consider using them in your systems.

Statistics and Cardinality Estimation

SQL Server stores information about data distribution in the index in internal objects called statistics. By default, SQL Server creates statistics for each index in the database and uses them during query optimization. Let's look first at what information is stored in the statistics.

Listing 5-2 creates a table with clustered and nonclustered indexes and populates it with some data. Finally, it provides information about the statistics using the DBCC SHOW_STATISTICS command.

Example 5-2. Examining statistics
CREATE TABLE dbo.DBObjects
(
    ID INT NOT NULL IDENTITY(1,1),
    Name SYSNAME NOT NULL,
    CreateDate DATETIME NOT NULL
); 
CREATE UNIQUE CLUSTERED INDEX IDX_DBObjects_ID 
ON dbo.DBObjects(ID);
 
INSERT INTO dbo.DBObjects(Name,CreateDate)
    SELECT name, create_date FROM sys.objects ORDER BY name;
 
-- Creating some duplicate values
INSERT INTO dbo.DBObjects(Name, CreateDate)
    SELECT t1.Name, t1.CreateDate
    FROM dbo.DBObjects t1 CROSS JOIN dbo.DBObjects t2
    WHERE t1.ID = 5 AND t2.ID between 1 AND 20;
 
CREATE NONCLUSTERED INDEX IDX_DBObjects_Name_CreateDate
ON dbo.DBObjects(Name, CreateDate);
 
DBCC SHOW_STATISTICS('dbo.DBObjects','IDX_DBObjects_Name_CreateDate');

Figure 5-11 shows the output of the listing (you may get different results in your system).

Figure 5-11. Statistics

As you can see, the DBCC SHOW_STATISTICS command returns three result sets. The first contains general metadata information about statistics, such as name, update date, number of rows in the index at the time when the statistics were updated, and so on.

The second result set, called the density vector, contains information about density for the combinations of key values from the statistics (index). Density is calculated as (1 / number of distinct values), and it indicates how many rows, on average, every combination of key values has. It is worth noting that although the IDX_DBObjects_Name_CreateDate index has two index keys, the row-id (the clustered index key column, ID) is also present in the index and is returned in the density vector.

The last and most important result set is called the histogram. It provides information about data distribution in the index. Each record in the histogram, called a histogram step, includes a sample key value from the left-most column of the statistics (index) and information about data distribution in the interval of values from the preceding RANGE_HI_KEY value up to the current one. It also includes the estimated number of rows in the interval (RANGE_ROWS), the number of rows with a key value equal to RANGE_HI_KEY (EQ_ROWS), the number of distinct key values in the interval (DISTINCT_RANGE_ROWS), and the average number of rows per distinct key value (AVG_RANGE_ROWS).

SQL Server uses statistics during query optimization to estimate the number of rows that each operator in the execution plan will process and return to the next operator. That process is called cardinality estimation.

Cardinality estimation greatly affects the execution plan. SQL Server uses it to choose the sequence of operators in the plan, indexes to access the data, type of join operators, and many other things. The efficiency of its execution plans greatly depends on the correctness of its cardinality estimation, and therefore on accurate statistics in the system.

There are three things you need to remember about statistics. First, and most important, SQL Server maintains the histogram, and therefore information about data distribution, only for the left-most column of the index. There is no information about data distribution for other index columns or for combinations of index column values.

The common advice you’ll hear suggests using the most selective column as the left-most column in the composite indexes. While following this advice may sometimes improve the quality of cardinality estimations, don’t follow it blindly. You need to analyze the queries, making sure that the predicates in left-most columns are SARGable and support efficient index seek operations.

The second important thing to remember about statistics is that the histogram stores, at most, 200 steps, regardless of the table size and whether the table is partitioned or not. This can affect cardinality estimations in large tables with uneven data distribution, since each step stores information about larger key intervals.

Finally, you need to know how SQL Server updates statistics. In databases with a compatibility level below 130 (the SQL Server 2016 level), statistics are updated automatically only after 20 percent of the data in the index has changed. For example, in a table with 100 million rows, you would need to insert, delete, or update index key columns in 20 million rows before an automatic update is triggered. This means that in large tables, statistics are rarely updated automatically and tend to become inaccurate over time.

Starting with a database compatibility level of 130, the statistics update threshold becomes dynamic. The percentage of changes that triggers the statistics update becomes smaller as the amount of the data in the table grows. You can force the same behavior for databases with older compatibility levels and in old versions of SQL Server with trace flag T2371. This is one of the trace flags I enable in every system.
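
The sketch below shows how you might check and adjust this behavior. MyDatabase is a placeholder name, and the trace flag is usually configured as the -T2371 startup parameter rather than enabled at runtime.

-- Check compatibility levels; 130 and above use the dynamic threshold
SELECT name, compatibility_level FROM sys.databases;

ALTER DATABASE MyDatabase SET COMPATIBILITY_LEVEL = 130;

-- Older versions/compatibility levels: enable the dynamic threshold globally
DBCC TRACEON (2371, -1);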

Statistics Maintenance

Accurate, up-to-date statistics improve system performance. Analyze your statistics maintenance strategy when you perform system troubleshooting, and validate whether it provides you with accurate information.

You can rely on automatic statistics updates, maintain statistics manually, or combine both approaches. Index maintenance also affects your statistics maintenance strategy, since an index rebuild automatically updates statistics on the index. Index reorganize, on the other hand, does not.

You can control whether SQL Server creates and updates statistics automatically at the database level with the Auto Create Statistics and Auto Update Statistics database options. When these are enabled, SQL Server automatically maintains statistics on all indexes except those that have STATISTICS_NORECOMPUTE enabled (it is disabled by default).

SQL Server may use different methods to update statistics. By default, it just samples the data from the index. This approach is lightweight, but it does not always provide accurate results. Alternatively, you can update statistics using the UPDATE STATISTICS WITH FULLSCAN statement, which will read the entire index.

You can also update statistics by specifying the percent or number of rows to sample with the UPDATE STATISTICS WITH SAMPLE statement. Obviously, the more data you read, the more I/O overhead you'll incur on large indexes.
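
For example, using the dbo.DBObjects table from Listing 5-2, the statements might look like the following sketch; pick the sampling rate based on the index size and the accuracy you need.

-- Read the entire index to build the most accurate statistics
UPDATE STATISTICS dbo.DBObjects IDX_DBObjects_Name_CreateDate WITH FULLSCAN;

-- Sample 25 percent of the rows for all statistics on the table
UPDATE STATISTICS dbo.DBObjects WITH SAMPLE 25 PERCENT;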

During query compilation, SQL Server detects whether statistics are outdated and may update them synchronously or asynchronously, based on the selected Auto Update Statistics Asynchronously database option. With synchronous updates, Query Optimizer defers query compilation until the update is done. With asynchronous updates, the query is optimized using old statistics while statistics are updated in background. You can keep your default synchronous statistics update unless your system requires extremely low response time from the queries.

The default automatic statistics maintenance is acceptable in many cases, as long as the database has a compatibility level of 130 or above, or T2371 is set. However, in some cases, you can also update statistics of the key indexes manually and/or run statistics update with FULLSCAN after hours.

It is usually beneficial to update statistics on the filtered indexes manually. Modifications of filtered columns do not count towards the statistics update threshold, which may make automatic statistics maintenance inefficient.

Note

Filtered indexes allow you to filter subsets of data in the table, reducing index size and index maintenance cost. Read about them in the Microsoft documentation.

Listing 5-3 shows you how to view statistics properties, such as when statistics were last updated and how many changes in the data have occurred since the last update. You can use it as part of your custom statistics maintenance in the system, if needed.

Example 5-3. Analyzing statistics properties
SELECT
    s.stats_id AS [Stat ID]
    ,sc.name + '.' + t.name AS [Table]
    ,s.name AS [Statistics]
    ,p.last_updated
    ,p.rows
    ,p.rows_sampled
    ,p.modification_counter AS [Mod Count]
FROM
    sys.stats s JOIN sys.tables t ON 
        s.object_id = t.object_id
    JOIN sys.schemas sc ON
        t.schema_id = sc.schema_id
    OUTER APPLY
        sys.dm_db_stats_properties(t.object_id,s.stats_id) p
ORDER BY
    p.last_updated;

Note

This section barely scratches the surface of statistics and their maintenance—I strongly recommend reading the Microsoft documentation to learn more about it.

Cardinality Estimation Models

As you already know, the quality of query optimization depends on accurate cardinality estimations. SQL Server must correctly estimate the number of rows in each step of query execution to generate an efficient execution plan. Accurate statistics go a long way in improving the estimations; however, they are just part of the picture.

During the cardinality estimation process, Query Optimizer relies on a set of assumptions that cover, among other things:

  • data distribution in the tables
  • the impact of different operators and predicates on the size of the output
  • relations between multiple predicates in a single table
  • correlation of the data in multiple tables during joins

These assumptions, along with cardinality estimation algorithms, define the cardinality estimation model used during optimization.

The original (legacy) cardinality estimation model was initially developed for SQL Server 7.0 and used exclusively until the release of SQL Server 2014. Aside from some minor improvements across versions, the model remained conceptually the same.

In SQL Server 2014, Microsoft released a new cardinality estimation model, enabled in databases with a compatibility level of 120. That model uses different assumptions, which lead to different cardinality estimations and execution plans.

It is impossible to tell which model is better. Some queries behave better with the new model; others may regress when you upgrade. You can continue to use the legacy cardinality estimation model with new versions of SQL Server; however, it may be beneficial to upgrade at some point. Microsoft says it is not going to remove the legacy model from SQL Server in the future, but it won't be enhanced, either.

Unfortunately, that’s easier said than done, especially in large and complex systems. Changing the model may lead to massive changes in the execution plans, so you need to be prepared to detect and address regressions quickly. Fortunately, Query Store can simplify the process. You can collect the data before the change and force SQL Server to use old execution plans for those queries that regressed under the new model. Obviously, you’ll still need to analyze and optimize them later.

You can control the cardinality estimation model with the database compatibility level. Keep in mind that the new model may behave slightly differently under each compatibility level, starting with 120 (SQL Server 2014). The legacy model, on the other hand, behaves the same in every SQL Server version. Enabling the QUERY_OPTIMIZER_HOTFIXES database setting or setting T4199 may also affect the estimations.

In SQL Server 2014, you can control the model with the database compatibility level or with trace flags: T2312 and T9481 force SQL Server to use the new and legacy models, respectively, ignoring the database compatibility level. In SQL Server 2016 and above, you can choose the model by setting the LEGACY_CARDINALITY_ESTIMATION database option.
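
For reference, here is a minimal sketch of the SQL Server 2016+ options; the query-level hint requires SQL Server 2016 SP1 or above, and the query itself is just an example against the Customers table.

-- Force the legacy model for the whole database
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON;

-- Or control the model for a single query
SELECT c.CustomerId, c.Name
FROM dbo.Customers c
WHERE c.Name = 'Boris'
OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));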

When you perform the SQL Server version upgrade, I recommend doing it in phases to reduce the risk of regression. First, upgrade the server version, keeping the old cardinality estimation model in place. Validate that everything works as expected after the upgrade. Then you can consider changing the model. As mentioned, use Query Store as part of that process.

As a general rule, I do not recommend switching to the new cardinality estimation model in SQL Server 2014. I encountered several bugs in early builds of this version, which led to more regressed queries. If you do switch, install the latest service pack, enable T4199 and carefully test the system.

Analyzing Your Execution Plan

The query optimization process in SQL Server, done by the Query Optimizer, generates a query execution plan. This plan consists of multiple operators that access and manipulate the data, achieving results for the query. The query tuning process, in a nutshell, requires us to analyze and improve execution plans for the queries.

Even though every database engineer is familiar with execution plans, I'd like to discuss several things related to query tuning. First, we need to look at how SQL Server executes the operators in a plan.

Row Mode and Batch Mode Execution

SQL Server has two processing methods for queries. The default, row mode, is traditionally used with row-based storage and B-Tree indexes. In this mode, each operator in the execution plan processes data rows one at a time, requesting them from child operators when needed.

Let’s look at the simple example query shown in Listing 5-4.

Example 5-4. Row mode execution: Sample query
SELECT TOP 10 c.CustomerId, c.Name, a.Street, a.City, a.State, a.ZipCode
FROM
    dbo.Customers c JOIN dbo.Addresses a ON
        c.PrimaryAddressId = a.AddressId
ORDER BY 
    c.Name

This query would produce the execution plan shown in Figure 5-12. SQL Server selects all of the data from the Customers table, sorts it based on the Name column, gets the first 10 rows, joins it with the Addresses data, and returns it to the client.

Figure 5-12. Row mode execution: Getting the first row

Let’s analyze how SQL Server executes a query. The Select operator, which is the parent operator in the execution plan, calls the GetRow() method of the Top operator. The Top operator, in turn, calls the GetRow() method of the Nested Loop Join.

A Join operator gets the data from two different inputs. First, it calls the GetRow() method of the Sort operator. In order to sort, SQL Server needs to read all of the rows first, so the Sort operator calls the GetRow() method of the Clustered Index Scan operator multiple times, accumulating the results. The Scan operator, which is the lowest operator in the execution plan tree, returns one row from the Customers table per call. Figure 5-12 shows just two GetRow() calls, for simplicity's sake.

When all of the data from the Customers table has been read, the Sort operator performs sorting and returns the first row back to the Join operator, which calls the GetRow() method of the Clustered Index Seek operator on the Addresses table after that. If there is a match, the Join operator concatenates data from both inputs and passes the resulting row back to the Top operator, which, in turn, passes it to Select.

The Select operator returns a row to the client and requests the next row by calling the GetRow() method of the Top operator again. The process repeats until the first 10 rows are selected. All operators keep their state and the Sort operator preserves the sorted data. It does not need to access the Clustered Index Scan operator again, as shown in Figure 5-13.

Figure 5-13. Row mode execution: Getting the next row

Each operator in the execution plan has multiple properties, the names of which may vary slightly in different versions of SSMS and in other applications. Let’s look at the most important ones.

Actual Number of Rows and Number of Rows Read
These two properties show how many rows were returned by the operator and how many rows were processed during execution. For example, an Index Scan operator with a predicate may process 1,000 rows, filtering out 950 of them. In that case, the properties would show 50 and 1,000 rows, respectively.
Estimated Number of Rows and Estimated Number of Rows Read
These two properties provide cardinality estimation data and indicate how many rows the Query Optimizer expected the operator to return and process. A large discrepancy between estimated and actual metrics indicates a cardinality estimation error, which could lead to a suboptimal execution plan.
Number of Executions and Estimated Number of Executions
The Number of Executions metric indicates how many times the operator was executed. It does not correspond to the number of GetRow() calls but rather indicates how many times this part of the execution plan was processed. For example, in the plan shown in Figures 5-12 and 5-13, Clustered Index Scan in the Customers table would be executed once, while Clustered Index Seek in the Addresses table would be executed 10 times.
The Estimated Number of Executions metric shows the estimate used by Query Optimizer.
Startup Predicate
In some cases, operators may have a Startup Predicate property, which indicates the condition that needs to be met for the operator to execute. For example, a WHERE @ProvideDetails = 1 clause may generate a Filter operator with the Startup Predicate @ProvideDetails = 1. The execution plan subtree after the Filter operator may or may not be executed, depending on the value of the @ProvideDetails parameter at runtime, as sketched below.
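
Here is a hypothetical sketch of a query that may produce such a Filter, using the Customers and Addresses tables from Listing 5-4; the optimizer may place a Startup Predicate on the join branch, so the Addresses access runs only when @ProvideDetails = 1.

DECLARE @ProvideDetails BIT = 0;

SELECT c.CustomerId, c.Name, a.Street
FROM dbo.Customers c
    LEFT JOIN dbo.Addresses a ON
        c.PrimaryAddressId = a.AddressId AND @ProvideDetails = 1
WHERE c.CustomerId = 7;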

Unfortunately, row mode execution and per-row processing do not scale well with large analytical queries that process millions or even billions of rows. To address this, SQL Server 2012 introduced another execution model, called batch mode execution. This allows operators in the execution plans to process rows in batches. The process is optimized for large amounts of data and parallel execution plans.

Until SQL Server 2019, Query Optimizer did not consider batch mode execution unless at least one of the tables in the query had a columnstore index. This restriction was removed in SQL Server 2019, where batch mode can be used with row-based B-Tree indexes in databases with a compatibility level of 150. This does not mean that all execution plans will use batch mode; however, Query Optimizer will consider batch mode during optimization.

As with any feature that affects execution plans, batch mode can introduce regressions in some cases. You can enable and disable it on the database level with the BATCH_MODE_ON_ROWSTORE database option or on the query level with the ALLOW_BATCH_MODE and DISALLOW_BATCH_MODE query hints.
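
In SQL Server 2019, those controls might look like the following sketch; the query is only an illustration.

-- Disable batch mode on rowstore for the whole database
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = OFF;

-- Or control it for a single query
SELECT COUNT(*)
FROM dbo.Customers
OPTION (USE HINT('DISALLOW_BATCH_MODE'));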

Finally, there is a trick that may enable batch mode execution on B-Tree tables in SQL Server 2016 and 2017: you can create an empty, filtered nonclustered columnstore index on the B-Tree tables that run large analytical queries. For example, if the table T has an ID column that stores only positive values, the following index will allow Query Optimizer to consider batch mode during optimization: CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI ON T(ID) WHERE ID < 0.

You can see an operator's execution mode with the Actual Execution Mode property in the execution plan, and Actual Number of Batches tells you how many batches were processed. The query tuning strategy, however, is the same regardless of the execution mode.

Live Query Statistics and Execution Statistics Profiling

There are several tools that allow you to analyze execution plans. In addition to the well-known SSMS, you can use another freeware tool from Microsoft—Azure Data Studio. Despite the name, it works perfectly well with on-prem instances of SQL Server and can be installed on other operating systems besides Windows.

I consider Azure Data Studio to be targeted at developers rather than database administrators. Nevertheless, it provides basic database administration and tuning features and can be extended with multiple third-party extensions. Some extensions even add support for other database platforms in addition to SQL Server.

I consider SentryOne Plan Explorer a must-have freeware tool for query tuning. This tool focuses on execution plan analysis. I find it more advanced and easier to use than SSMS. I suggest you download and test it if you have not done so already.

SSMS has another very useful feature called Live Query Statistics. This feature allows you to monitor query execution in runtime, detecting possible inefficiencies in the execution plan.

Figure 5-14 shows an example of the Live Query Statistics window in SSMS (screenshot is copied from Microsoft documentation). The operators with solid lines have been completed. The dotted lines represent the tree of the operators that are currently executing. You can also see the estimated progress of each active operator, along with the actual and estimated numbers of rows. All metrics are updating during query execution.

Figure 5-14. Live Query Statistics (Source: Microsoft documentation)

Live Query Statistics is very useful when you need to debug long-running queries. It allows you to pinpoint inefficiencies in the execution plans and speed up further query tuning. You can enable Live Query Statistics for queries you run in SSMS. You can also access it from the Active Expensive Query section of the Activity Monitor window.

Live Query Statistics collects data based on query execution statistics profiling. It uses two different methods. Standard profiling exists in all versions of SQL Server and has historically been used to obtain the actual execution plan for the queries. Unfortunately, this method introduces significant overhead.

Starting with SQL Server 2014 SP2, there is another option called lightweight profiling. With this method, the overhead is significantly smaller; however, it does not collect runtime CPU information.

Table 5-2 illustrates how you can enable profiling in different versions of SQL Server. It also shows xEvents that enable profiling globally in the system. Live Query Statistics integrates with the latest version of profiling supported by the SQL Server instance where it runs.

Table 5-2. Query execution statistics profiling
Version | Type | How to enable | xEvent
Prior to SQL Server 2014 SP2 | Standard | SET STATISTICS XML, SET STATISTICS PROFILE | post_query_execution_showplan
SQL Server 2014 SP2 – SQL Server 2016 RTM | Lightweight v1 | Live Query Statistics | post_query_execution_showplan (less overhead), query_thread_profile
SQL Server 2016 SP1 – SQL Server 2017 | Lightweight v2 | T7412, QUERY_PLAN_PROFILE query hint | query_plan_profile
SQL Server 2019 | Lightweight v3 | Enabled by default, LIGHTWEIGHT_QUERY_PROFILING database option | query_post_execution_plan_profile

The overhead of standard profiling is significant. With lightweight profiling, on the other hand, it is very low. According to Microsoft, starting with SQL Server 2016 SP1, the overhead of continuously running lightweight profiling is about 2 to 4%. Technically speaking, you can run an xEvents session and collect profiling information for all queries in the system if the server is not CPU bound. Nevertheless, be careful, and measure the impact of this monitoring in your system.

There is another useful new function, sys.dm_exec_query_statistics_xml, that utilizes lightweight profiling. It provides an in-flight execution plan for the currently running request. The result looks like the snapshot of Live Query Statistics. You can use this function together with the sys.dm_exec_requests view, as shown in Listing 5-5.

Example 5-5. Using sys.dm_exec_query_statistics_xml
SELECT	
	er.session_id
	,er.request_id
	,DB_NAME(er.database_id) as [database]
	,er.start_time
	,CONVERT(DECIMAL(21,3),er.total_elapsed_time / 1000.) AS [duration]
	,er.cpu_time
	,SUBSTRING(
		qt.text, 
	 	(er.statement_start_offset / 2) + 1,
		((CASE er.statement_end_offset
			WHEN -1 THEN DATALENGTH(qt.text)
			ELSE er.statement_end_offset
		END - er.statement_start_offset) / 2) + 1
	) AS [statement]
	,er.status
	,er.wait_type
	,er.wait_time
	,er.wait_resource
	,er.blocking_session_id
	,er.last_wait_type
	,er.reads
	,er.logical_reads
	,er.writes
	,er.granted_query_memory
	,er.dop
	,er.row_count
	,er.percent_complete
	,es.login_time
	,es.original_login_name
	,es.host_name
	,es.program_name
	,c.client_net_address
	,ib.event_info AS [buffer]
	,qt.text AS [sql]
	,p.query_plan
FROM	
	sys.dm_exec_requests er WITH (NOLOCK)
		OUTER APPLY sys.dm_exec_input_buffer(er.session_id, er.request_id) ib
		OUTER APPLY sys.dm_exec_sql_text(er.sql_handle) qt
		OUTER APPLY sys.dm_exec_query_statistics_xml(er.session_id) p
		LEFT JOIN sys.dm_exec_connections c WITH (NOLOCK) ON 
			er.session_id = c.session_id 
		LEFT JOIN sys.dm_exec_sessions es WITH (NOLOCK) ON 
			er.session_id = es.session_id
WHERE
	er.status <> 'background'
	AND er.session_id > 50
ORDER BY 
	er.cpu_time desc
OPTION (RECOMPILE, MAXDOP 1);

Common Issues and Inefficiencies

The query optimization and tuning process may touch multiple layers in the system. For example, with third-party applications, you might not have access to the source code and must deal with a predefined set of queries. This, perhaps, is the most challenging case, where your optimization options are limited to creating and modifying indexes.

The situation is much better when you are able to change queries and database code. Those changes can be time-consuming and require testing, but they yield better results and significant performance improvements.

In some cases, you need to go beyond the database code. You might need to change the database schema, the application architecture, and sometimes even the technology to scale the system. While this is an extremely complex process, it may provide you the best results long-term.

I will not go that deep in this section, however. Instead, I will cover several common inefficiencies that can be addressed with indexing and code changes. Just remember that your options are not as limited in real life.

Inefficient Code

With the exception of black-box optimization, where you don't have access to the code, you should start tuning by reviewing and, potentially, refactoring the queries. There are several anti-patterns and issues to look for.

Non-SARGable predicates

SARGable predicates allow SQL Server to utilize the Index Seek operator by limiting the range of the key values to process. Analyze the query, detecting and removing non-SARGable predicates when possible.

One very common case that leads to non-SARGable predicates is functions. SQL Server calls the function for every row it is processing in order to evaluate the predicate. This applies to both system and scalar user-defined functions (UDFs).

Note

SQL Server 2019 may inline some scalar UDFs into the query statement. However, there are many limitations that prevent inlining, so it is better to avoid them when possible.

Table 5-3 shows several examples of how you can refactor some predicates to make them SARGable.

Table 5-3. Examples of refactoring non-SARGable predicates into SARGable ones
Operation | Non-SARGable implementation | SARGable implementation
Mathematical calculations | Column - 1 = @Value | Column = @Value + 1
Mathematical calculations | ABS(Column) = 1 | Column IN (-1, 1)
Date manipulation | CONVERT(DATETIME, CONVERT(VARCHAR(10), Column, 121)) = @Date | Column >= @Date AND Column < DATEADD(DAY, 1, @Date)
Date manipulation | DATEPART(YEAR, Column) = @Year | Column >= @Year AND Column < DATEADD(YEAR, 1, @Year)
Date manipulation | DATEADD(DAY, 7, Column) > GETDATE() | Column > DATEADD(DAY, -7, GETDATE())
Prefix search | LEFT(Column, 3) = 'ABC' | Column LIKE 'ABC%'
Substring search | Column LIKE '%ABC%' | Use Full-Text Search or other technologies

Pay attention to the data types used in the predicates. An implicit conversion is, in a nutshell, a call to the system CONVERT_IMPLICIT function, which in many cases prevents an index seek. Remember to analyze JOIN predicates in addition to WHERE clauses; data type mismatches in join columns are a very common source of problems.

User-Defined Functions

I noted just now that system functions and non-inlined scalar UDFs may prevent SQL Server from using the index seek operation. Moreover, scalar UDFs introduce significant performance overhead: SQL Server calls them for every row it processes, and those calls are similar to stored procedure calls (you can confirm this by capturing an rpc_starting xEvent or SP:Starting trace event).

SQL Server optimizes the code in multi-statement UDFs (scalar and table-valued) separately from the caller query. This usually leads to less efficient execution plans. More importantly, depending on the version of SQL Server, database compatibility level and configuration settings, SQL Server estimates that multi-statement table-valued functions return either 1 or 100 rows. This could completely invalidate cardinality estimations and lead to highly inefficient plans.

The latter situation is improved in SQL Server 2017 with Adaptive Query Processing. One of its features, Interleaved Execution, defers final compilation of the query until run-time, when SQL Server can measure the actual number of rows returned by the function and finish optimization using that data.

Nevertheless, it is better to avoid multi-statement functions and use inline table-valued functions when possible. SQL Server embeds and optimizes them together with the caller queries. Fortunately, in many cases, scalar and multi-statement table-valued functions can be converted to inline table-valued functions with very little effort.
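
As an illustration, here is a hypothetical lookup implemented as a multi-statement table-valued function and its inline equivalent. The function names and logic are made up for this sketch, and CREATE OR ALTER requires SQL Server 2016 SP1 or above.

-- Multi-statement TVF: optimized separately, with a fixed row estimate
CREATE OR ALTER FUNCTION dbo.GetCustomersByName_MS(@Name NVARCHAR(64))
RETURNS @Result TABLE (CustomerId INT, Name NVARCHAR(64))
AS
BEGIN
    INSERT INTO @Result(CustomerId, Name)
        SELECT CustomerId, Name FROM dbo.Customers WHERE Name = @Name;
    RETURN;
END;
GO

-- Inline TVF: embedded and optimized together with the caller query
CREATE OR ALTER FUNCTION dbo.GetCustomersByName(@Name NVARCHAR(64))
RETURNS TABLE
AS
RETURN
(
    SELECT CustomerId, Name FROM dbo.Customers WHERE Name = @Name
);
GO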

Temporary Tables and Table Variables

Temporary tables and table variables are very valuable during query optimization. You can use them to persist the intermediate results of queries. This allows you to simplify queries, which may improve cardinality estimations and generate more efficient execution plans.

There are two common mistakes associated with temporary tables and table variables.

First, some people choose table variables over temporary tables because of the common misconception that table variables are in-memory objects that don’t use tempdb and are therefore more efficient than temporary tables.

This is not the case: both objects rely on tempdb. Table variables are indeed slightly more efficient than temporary tables, but that efficiency comes with an important limitation: SQL Server does not maintain statistics on them, not even on the indexes that support primary key and unique constraints.

Similar to multi-statement table-valued functions, the Query Optimizer estimates that a table variable stores either 1 or 100 rows. In SQL Server 2019, table variable deferred compilation postpones compilation of the statement until the table variable has been populated, allowing the Query Optimizer to use the actual number of rows when generating the execution plan. You can achieve similar results in older versions of SQL Server with a statement-level recompile using the OPTION (RECOMPILE) clause. Even then, however, SQL Server does not maintain statistics histograms or information about actual data distribution in table variables.

Temporary tables, on the other hand, behave like regular tables. They maintain index statistics and allow SQL Server to use them during optimization. While there are some edge cases when table variables could be the better choice, in most cases it is safer to use temporary tables. In many cases throughout my career, I’ve achieved great results by replacing table variables with temporary tables, without any further code or indexing changes.

The second common mistake is not indexing temporary tables, which negatively affects cardinality estimations and can lead to inefficient table scans. Treat temporary tables as regular tables and index them to support efficient querying, especially when they store significant amounts of data.

You can use a properly indexed temporary table to persist results from multi-statement table-valued functions. This will improve cardinality estimations, especially in the old versions of SQL Server without Adaptive Query Processing.
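
A minimal sketch of that pattern, reusing the hypothetical dbo.GetRecentOrders function from earlier in this section, might look like this:

-- A minimal sketch; the function and column list are hypothetical
DECLARE @CustomerId INT = 42;

CREATE TABLE #RecentOrders
(
    OrderId INT NOT NULL PRIMARY KEY,
    OrderDate DATETIME NOT NULL
);

INSERT INTO #RecentOrders(OrderId, OrderDate)
    SELECT OrderId, OrderDate FROM dbo.GetRecentOrders(@CustomerId);

-- Index the columns used by subsequent queries; SQL Server maintains statistics on this index
CREATE INDEX IDX_RecentOrders_OrderDate ON #RecentOrders(OrderDate);

-- Subsequent queries join to #RecentOrders and benefit from accurate cardinality estimations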

Obviously, temporary tables and table variables come with a price. There is overhead involved in creating and populating them. When they are used wisely, the benefits may outweigh the downsides, but they are generally not a good choice to store millions of rows.

Stored Procedures and ORM Frameworks

While this topic is not directly related to query tuning, it is impossible to avoid mentioning Object Relational Mapping (ORM) frameworks. They are extremely common nowadays, and saying that all database engineers hate them would not be an exaggeration. The queries generated by ORM frameworks are often extremely complex and hard to optimize.

Unfortunately, we need to accept that those frameworks simplify development and reduce its time and cost. In most cases, it is unrealistic and unreasonable to insist that application developers not use them. More importantly, in many cases, the less efficient queries generated by frameworks are totally acceptable.

Performance-critical queries are different, though. You may not have many of them, but there are always some that will require extensive tuning and optimization. In those cases, autogenerated and/or ad-hoc queries are not the best choice. It is preferable to switch to stored procedures, which provide full flexibility and a larger set of techniques for optimization.

While this switch may require changes in the application code, in many cases it will reduce tuning time and cost. Remember that when you are choosing your tuning approach.

Inefficient Index Seek

As you already know, an index seek operation is usually more efficient than an index scan. This does not mean, however, that every index seek is efficient. SQL Server uses an index seek when query predicates allow it to isolate the range of data rows from the index during query execution. If this range is very large, this can reduce the efficiency of the operation.

Let’s look at a simple example: we’ll create a table and populate it with some data. Then we’ll run two SELECT statements, with and without a WHERE clause, as shown in Listing 5-6.

Example 5-6. Index seek inefficiency
CREATE TABLE dbo.T1
(
    IndexedCol INT NOT NULL,
    NonIndexedCol INT NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX IDX_T1
ON dbo.T1(IndexedCol);
 
;WITH N1(C) AS (SELECT 0 UNION ALL SELECT 0) -- 2 ROWS
,N2(C) AS (SELECT 0 FROM N1 AS T1 CROSS JOIN N1 AS T2) -- 4 ROWS
,N3(C) AS (SELECT 0 FROM N2 AS T1 CROSS JOIN N2 AS T2) -- 16 ROWS
,N4(C) AS (SELECT 0 FROM N3 AS T1 CROSS JOIN N3 AS T2) -- 256 ROWS
,N5(C) AS (SELECT 0 FROM N4 AS T1 CROSS JOIN N4 AS T2) -- 65,536 ROWS
,N6(C) AS (SELECT 0 FROM N3 AS T1 CROSS JOIN N5 AS T2) -- 1,048,576 ROWS
,IDs(ID) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM N6)
INSERT INTO dbo.T1(IndexedCol, NonIndexedCol)
    SELECT ID, ID FROM IDs;
 
SET STATISTICS IO ON
SELECT COUNT(*) FROM dbo.T1;
SELECT COUNT(*) FROM dbo.T1 WHERE IndexedCol > 0;

Figure 5-15 shows the execution plans of both queries along with their I/O statistics. All rows in the table have positive IndexedCol values, so both queries must process the entire index. In this case, the index seek operation is essentially identical to an index scan.

Figure 5-15. Inefficient index seek

It is very common to see inefficient index seeks in multi-tenant systems. Take, for example, order fulfillment software, where data is generally spread across a relatively small number of warehouses. It is common to see the tenant-id (or warehouse_id, in this example) as the left-most column in the index keys.

Queries in those systems usually process data from a single tenant, using it as a predicate in the WHERE clause, which will rightfully lead to index seek operations in the execution plan. But if each tenant (or warehouse) stores a very large amount of data, you may get a perfectly fine-looking yet inefficient execution plan, even without any scans present. You’ll need to use other predicates to make the index seeks selective and improve performance, as the sketch below illustrates.
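
Here is a hedged sketch with a hypothetical dbo.WarehouseOrders table (all names are mine): an index on the tenant column alone produces a seek that still reads every row for that warehouse, while a composite index allows the seek predicate to use the date filter as well.

-- A hedged sketch; dbo.WarehouseOrders and the index names are hypothetical
DECLARE @WarehouseId INT = 1, @StartDate DATETIME = '20240101';

-- Index on the tenant column alone: the seek still reads every row for the warehouse
CREATE INDEX IDX_WarehouseOrders_WarehouseId
ON dbo.WarehouseOrders(WarehouseId);

-- Composite index: the seek predicate can use both WarehouseId and OrderDate
CREATE INDEX IDX_WarehouseOrders_WarehouseId_OrderDate
ON dbo.WarehouseOrders(WarehouseId, OrderDate);

SELECT OrderId
FROM dbo.WarehouseOrders
WHERE WarehouseId = @WarehouseId AND OrderDate >= @StartDate;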

You can analyze the efficiency of index seeks in the execution plan by looking at the operator’s properties. Let’s run the query shown in Listing 5-7 against the table we created in Listing 5-6.

Example 5-7. Sample query
SELECT IndexedCol, NonIndexedCol
FROM dbo.T1
WHERE
    IndexedCol BETWEEN 100 AND 150 AND NonIndexedCol % 2 = 0;

Figure 5-16 illustrates several key properties for the analysis. The screenshot here was captured in SentryOne Plan Explorer; however, you’d see the same data in SSMS.

Figure 5-16. Index Seek operator properties

Let’s look at the most important properties of the operator.

Seek Predicate
This property shows the predicate(s) that SQL Server uses to limit the range of rows during the index seek. The more selective this predicate is, the more efficient the index seek will be.
Predicate
This property shows additional filter criteria that SQL Server applies to every row read by the Index Seek operator. It does not reduce the amount of data the operator must process; however, it may reduce the number of rows the operator returns. In our case, the operator read 51 rows from the index and returned 26 rows to the next operator in the execution plan.
It is always more efficient to reduce the size of the data with an efficient seek predicate. When seek predicates are not selective enough, consider restructuring your index in a way that allows SQL Server to use regular predicates as seek predicates.
Actual Rows and Actual Rows Read
These two properties are called Actual Number of Rows and Number of Rows Read in SSMS. They show how many rows the operator returned and how many rows it processed during execution. A large value in Number of Rows Read indicates that the index seek processed a large amount of data and may require further investigation. A large discrepancy between the two values points to potential index inefficiency, with a significant portion of the data being filtered out by the Predicate rather than the Seek Predicate of the operator.
Estimated Rows and Estimated Rows to be Read
Those two properties are called Estimated Number of Rows and Estimated Number of Rows to be Read in SSMS. As noted, you can compare estimated and actual metrics in the execution plan to estimate the quality of the cardinality estimation data. A noticeable cardinality estimation error may indicate a wrong choice of index and/or join type (more on that later), which you will likely want to address.

Obviously, it is much easier to analyze an execution plan when you have the actual execution metrics available. While estimated metrics can be useful for initial analysis, cardinality estimation errors may provide a very wrong or incomplete picture. Do additional analysis and look at the data distribution in the database when dealing with estimated execution plans.

Incorrect Join Type

SQL Server implements joins with one of three physical join algorithms during query execution: loop, merge, and hash. Each is optimized for specific conditions, and an incorrect choice may have a serious negative impact on query performance. Unfortunately, people often don’t pay attention to the join type chosen by SQL Server, overlooking opportunities for optimization.

Let’s look at all three types in more detail.

Loop Join

A loop join (or nested loop join) is the simplest join algorithm. As with any join type, it accepts two inputs, which are called outer and inner tables. The algorithm for the join is very simple (Listing 5-8). Briefly, SQL Server goes through the outer table looking up rows to join in the inner table for each outer row.

Example 5-8. Loop join algorithm (pseudo code)
/* Inner join */
for each row R1 in outer table
    find row(s) R2 in inner table
        if R1 joins with R2
            return join (R1, R2)
/* Outer join */
for each row R1 in outer table
    find row(s) R2 in inner table
        if R1 joins with R2
            return join (R1, R2)
        else
            return join (R1, NULL)

The cost of the join depends on two factors. The first is the size of the outer table. SQL Server goes through each row there, locating corresponding rows in the inner table to join. The more data it needs to process, the less efficient it will be.

The second factor is the efficiency of the inner table search. When the join column(s) in the inner table are properly indexed, SQL Server can utilize the efficient index seek operation. In that case, the cost of the inner table search on each iteration will be relatively low. Without the index, SQL Server might have to scan the inner table multiple times – once for each row from the outer table. As you can guess, that is extremely inefficient.

The loop join is optimized for conditions in which one of the inputs is small and the other has an index to support an index seek operation for the join. It is impossible to define a hard threshold after which the join becomes inefficient. It may perform well with thousands and sometimes tens of thousands of rows in the outer input; however, it does not scale well to millions of rows. Nevertheless, under the right conditions, this type of join is extremely efficient: it has very little startup cost, does not use tempdb, and does not consume a large amount of memory.

Finally, the loop join is the only join type that does not require an equality predicate; in fact, it does not require a join predicate at all. For example, a CROSS JOIN, where every row from one input is joined with every row from the other, is executed as a nested loop join. Obviously, SQL Server cannot use an index seek when the join predicate is not SARGable, which leads to an extremely inefficient operation with large inputs.

Merge Join

The merge join works with two sorted inputs. It compares rows from the two inputs one pair at a time and returns the joined row when the key values match. When they do not match, it discards the row with the lesser value and moves on to the next row from that input. The algorithm for the join is shown in Listing 5-9.

Example 5-9. Inner merge join algorithm (pseudocode)
/* Prerequisite: inputs I1 and I2 are sorted */
get first row R1 from input I1
get first row R2 from input I2
while not end of either input
begin
    if R1 joins with R2
    begin
        return join (R1, R2)
        get next row R2 from I2
    end
    else if R1 < R2
        get next row R1 from I1
    else /* R1 > R2 */
        get next row R2 from I2
end

The merge join is optimized for medium and large inputs when both of those inputs are sorted, which usually means the inputs are indexed on the join predicate columns. In practice, SQL Server may also decide to sort the inputs during query execution; the cost of that sort may far exceed the cost of the merge join itself. Check whether that is the case and factor the cost of the Sort operator into your analysis.

Hash Join

A hash join is designed to handle large unsorted inputs. Its algorithm consists of two different phases.

During the first, or build, phase, a hash join scans one of the inputs (usually the smaller one), calculates the hash values of the join key, and places them into the hash table. In the second, or probe, phase, it scans the second input, and checks (probes) to see if the hash value of the join key from the second input exists in the hash table. If so, SQL Server evaluates the join predicate for the row from the second input and all rows from the first input that belong to the same hash bucket. The algorithm is shown in Listing 5-10.

Example 5-10. Inner hash join algorithm (pseudocode)
/* Build Phase */
for each row R1 in input I1
begin
    calculate hash value on R1 join key
    insert hash value to appropriate bucket in hash table
end
/* Probe Phase */
for each row R2 in input I2
begin
    calculate hash value on R2 join key
    for each row R1 in hash table bucket
    if R1 joins with R2
        return join (R1, R2) 
end

A hash join requires memory to store the hash table. When there is not enough memory, the join stores some hash table buckets in tempdb. This condition is called a spill and can greatly affect join performance, since tempdb access is significantly slower.

Spills often happen due to incorrect memory grant estimation, which in turn may be triggered by incorrect cardinality estimation. When this is the case, make sure that the statistics are up to date; consider simplifying or refactoring the query if this does not help.

Adaptive Query Processing in SQL Server 2017 introduced a new feature called memory grant feedback, which increases or decreases memory grants for a query based on memory usage in previous executions. In SQL Server 2017, that feature is limited to batch mode execution. Starting with SQL Server 2019, it is also enabled in row mode execution. Read the Microsoft documentation for more information, and consider enabling it. This may reduce tempdb spills in the system.
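
If you want to control the feature explicitly, the database-scoped configuration options below are one way to do so. This is a minimal sketch; both options are enabled by default under the matching database compatibility levels, so verify the option names and behavior against the documentation for your SQL Server version.

-- A minimal sketch; verify against the documentation for your version
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_MEMORY_GRANT_FEEDBACK = ON; -- SQL Server 2017 and later
ALTER DATABASE SCOPED CONFIGURATION SET ROW_MODE_MEMORY_GRANT_FEEDBACK = ON;   -- SQL Server 2019 and later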

Comparing Join Types

Table 5-4 summarizes the behavior of different join types and the use cases for which they are optimized.

Table 5-4. Join comparison

                            | Nested Loop Join                                                             | Merge Join                                      | Hash Join
Best use case               | At least one input is small; index on the join column(s) in the other input | Medium to large inputs, sorted on the index key | Medium to large inputs
Requires sorted input       | No                                                                           | Yes                                             | No
Requires equality predicate | No                                                                           | Yes                                             | Yes
Blocking operator           | No                                                                           | No                                              | Yes (build phase only)
Uses memory                 | No                                                                           | No                                              | Yes
Uses tempdb                 | No                                                                           | No (Sort may spill to tempdb)                   | Yes, in case of spills
Preserves order             | Yes (outer input)                                                            | Yes                                             | No

Adaptive Query Processing in SQL Server 2017 also introduced the concept of the adaptive join. With this join, SQL Server chooses to use either a loop or a hash join based on the size of the inputs in runtime. Unfortunately, in SQL Server 2017 and 2019, this works only in batch mode execution, which, in most cases, is triggered by columnstore indexes. You need to enable Live Query Statistics in SSMS to see adaptive join in the execution plan.

As I just mentioned, each type of join is optimized for specific use cases and may not perform well in others. Let’s look at a simple example and compare the performance of different join types. Listing 5-11 creates another table (similar to the one in Listing 5-6) and populates it with the same data. Both tables have two columns and a clustered index defined on one of them.

Example 5-11. Join performance: Table creation
CREATE TABLE dbo.T2
(
    IndexedCol INT NOT NULL,
    NonIndexedCol INT NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX IDX_T2
ON dbo.T2(IndexedCol);
INSERT INTO dbo.T2(IndexedCol, NonIndexedCol)
    SELECT IndexedCol, NonIndexedCol FROM dbo.T1;

Next, let’s compare the performance of different join types using the code in Listing 5-12. Here, I am forcing different join types with join hints (more on those later). I put the execution time of the statements in my test environment into code comments.

Example 5-12. Join performance: Test cases
-- Loop join with index seek in inner table
-- Elapsed time: 137ms
SELECT COUNT(*)
FROM dbo.T1 INNER LOOP JOIN dbo.T2 ON 
    T1.IndexedCol = T2.IndexedCol
WHERE 
    T1.NonIndexedCol <= 100;
 
-- Loop join with inefficient index scan in inner table
-- Elapsed time: 16,732ms
SELECT COUNT(*)
FROM dbo.T1 INNER LOOP JOIN dbo.T2 ON 
    T1.IndexedCol = T2.NonIndexedCol
WHERE 
    T1.NonIndexedCol <= 100;

-- Hash join. Slower than loop join on small inputs
-- Elapsed time: 411ms.
SELECT COUNT(*)
FROM dbo.T1 INNER HASH JOIN dbo.T2 ON 
    T1.IndexedCol = T2.IndexedCol
WHERE 
    T1.NonIndexedCol <= 100;

-- Loop join with index seek in inner table with large input
-- Elapsed time: 1,514ms
SELECT COUNT(*)
FROM dbo.T1 INNER LOOP JOIN dbo.T2 ON 
    T1.IndexedCol = T2.IndexedCol;
 
-- Hash join using indexed columns to join
-- Faster than loop join on large input
-- Elapsed time: 1,215ms
SELECT COUNT(*)
FROM dbo.T1 INNER HASH JOIN dbo.T2 ON 
    T1.IndexedCol = T2.IndexedCol;
 
-- Hash join using non-indexed columns to join
-- Performance does not depend on whether the join columns are indexed
-- Elapsed time: 1,235ms
SELECT COUNT(*)
FROM dbo.T1 INNER HASH JOIN dbo.T2 ON 
    T1.IndexedCol = T2.NonIndexedCol;
 
-- Merge join with pre-sorted inputs
-- Elapsed time: 440ms
SELECT COUNT(*)
FROM dbo.T1 INNER MERGE JOIN dbo.T2 ON 
    T1.IndexedCol = T2.IndexedCol;

-- Merge join without pre-sorted inputs
-- Elapsed time: 774ms
SELECT COUNT(*)
FROM dbo.T1 INNER MERGE JOIN dbo.T2 ON 
    T1.IndexedCol = T2.NonIndexedCol;

Loop join is faster than hash join on small inputs; however, hash join becomes more efficient as the size of the inputs grows. Merge join, on the other hand, is great when the inputs are already sorted; otherwise, SQL Server adds a Sort operator to the execution plan. While this might work fine with small inputs, sorting very large inputs does not perform well.

As these examples illustrate, an incorrect choice of join type can reduce query performance dramatically. In most cases, this happens due to incorrect cardinality estimations, especially when SQL Server seriously underestimates the size of join inputs.

With hash and merge joins, this may lead to tempdb spills and slower join performance. However, the situation is most dangerous with the loop join, especially when the cost of inner input processing is high. (Recall that SQL Server processes the inner input for each row from the outer input, so costs add up quickly with each iteration.)

You can detect that condition by comparing actual to estimated rows in the outer input, or by looking at the actual versus estimated number of executions in the first operator of the inner input (see Figure 5-17). A large discrepancy indicates a cardinality estimation error. When such an error leads to a high number of executions against the inner input, a loop join may be the wrong choice.

Figure 5-17. Cardinality estimation error in loop join

You have several options for addressing this problem. As a first step, review the query, looking for opportunities to refactor. Remove code patterns that could affect cardinality estimations, like multi-statement functions and table variables. See if there is a possibility of better indexing, which could improve the execution plan.

It’s worth checking if you are being affected by parameter sniffing in parameter-sensitive plans. SQL Server sometimes caches and reuses execution plans compiled for atypical parameter values, especially when data is distributed very unevenly in the table. Think about multi-tenant systems where some tenants may have very little data and others a great deal. Execution plans generated for the former group of tenants would be inefficient for the latter.

When this is the case, consider using statement-level recompilation with OPTION (RECOMPILE) or disabling parameter sniffing in the database. I will discuss those and other options in more detail in the next chapter.
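
As a brief preview, here is a hedged sketch against the hypothetical dbo.Orders table used elsewhere in this chapter: the first statement compiles a plan for the actual parameter value on every execution, and the second disables parameter sniffing for the entire database (a blunt instrument, so evaluate the trade-offs for your workload).

-- A hedged sketch; the query and table are hypothetical
DECLARE @CustomerId INT = 42;

SELECT COUNT(*)
FROM dbo.Orders
WHERE CustomerId = @CustomerId
OPTION (RECOMPILE); -- the plan is compiled for the actual parameter value and not reused

-- Disables parameter sniffing for every query in the database; use with care
ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = OFF;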

You can also try updating the statistics in the tables from the query. Unfortunately, this may not always help. Query Optimizer in SQL Server does not always make the right assumptions when estimating cardinality in complex queries with multiple joins.

I have mentioned query simplification, which may help to address that problem. Consider splitting the query and persisting intermediate results in properly indexed temporary tables. SQL Server will be able to see the data distribution there and estimate cardinality in the queries accurately. Remember, though, the overhead that temporary tables introduce, and do not use them to cache very large datasets.

As a last resort, you can force join types with query hints. This is a dangerous method, and you need to be very careful. Hints force the Query Optimizer to perform optimization in a specific way, which may become inefficient in the future if the data distribution changes. If you do this, document it and re-evaluate periodically whether the hints are still required.

There are two ways to use join hints. The first is specifying a list of allowed join types in query options. The first statement in Listing 5-13 shows how to use this method to prevent Query Optimizer from using loop joins anywhere in the query. The second option is specifying the type of specific join between the tables. The second statement in the listing forces SQL Server to use a hash join between tables A and B.

Example 5-13. Forcing join types
SELECT A.Col1, B.Col2
FROM 
    A JOIN B ON A.ID = B.ID 
    JOIN C ON B.CID = C.ID
OPTION (MERGE JOIN, HASH JOIN);
SELECT A.Col1, B.Col2
FROM 
    A INNER HASH JOIN B ON A.ID = B.ID 
    JOIN C ON B.CID = C.ID;

Unfortunately, the second approach also forces the join order for all joins in the query. SQL Server will always join the tables in the order they are specified in the query, without trying to reorder the joins. In Listing 5-13, for example, table A will always be joined with table B first using a hash join, and the result of that join will then be joined with table C. This makes the approach dangerous for queries with multiple joins, since it prevents possible join-order optimizations. Be careful when you use join hints!

Excessive Key Lookups

As you already know, key and RID lookups[1] become extremely inefficient at scale. SQL Server does not use indexes for seek operations when it estimates that a large number of key or RID lookups will be required. However, you may still encounter excessive lookups in execution plans.

This situation often occurs due to incorrect cardinality estimations. If SQL Server estimates that just a handful of key lookups will be required, it might decide to use a nonclustered index seek; when the estimate is wrong, that choice can lead to a large number of key lookups. Such errors usually happen due to cardinality estimation model limitations or parameter sniffing (in parameter-sensitive plans, about which you’ll learn more in the next chapter).

You can detect this issue by analyzing estimated and actual numbers of rows in the execution plan, as shown in Figure 5-18.

Figure 5-18. Incorrect cardinality estimation and key lookups

SQL Server may also decide to use key lookups, choosing between bad data access strategies and worse ones. Running millions of key lookups is extremely inefficient; however, it could be better than scanning a table with billions of rows.

In many cases, you can remove key lookups with covering indexes. If all columns required by the query are present in a nonclustered index, SQL Server doesn’t need to access the main data row in the clustered index or heap. By definition, a nonclustered index includes all of its key columns along with the clustered index key columns that make up the row-id.

SQL Server allows you to include other columns in the index using the INCLUDE index clause. Data from these columns is stored at the leaf level only; it is not added to the index key and does not affect the sorting order of the index rows. Figure 5-19 illustrates the structure of an index with included columns, defined as CREATE INDEX IDX_Customers_Name ON dbo.Customers(Name) INCLUDE(DateOfBirth) on a table that has CustomerId as the clustered index column.

Figure 5-19. Index with included columns

Now, if the only columns the query references are present in the index, SQL Server can obtain all data from the leaf level of the nonclustered index B-Tree without performing key lookups. It can use the index regardless of how many rows would be selected from there.

Making nonclustered indexes covering is one of the most commonly used query optimization techniques. You can open the properties of the Key Lookup operator in the execution plan (see Figure 5-18) and get the list of columns from the operator’s output and filter predicate. Including those columns in the nonclustered index eliminates the need for the lookups.
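
For example, consider a hedged sketch with a hypothetical dbo.Sales table that has a clustered index on SaleId and a nonclustered index on CustomerId. If the Key Lookup output shows the Amount column, adding it as an included column makes the nonclustered index covering for the query.

-- A hedged sketch; dbo.Sales and the index name are hypothetical
DECLARE @CustomerId INT = 42;

-- With an index on CustomerId alone, fetching Amount requires key lookups
SELECT CustomerId, Amount
FROM dbo.Sales
WHERE CustomerId = @CustomerId;

-- Including Amount eliminates the lookups for this query
CREATE INDEX IDX_Sales_CustomerId_Amount
ON dbo.Sales(CustomerId)
INCLUDE (Amount);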

Although covering indexes are a great tool for optimizing queries, they come at a cost. Every column in the index increases its row size, as well as the number of data pages it uses on disk and in memory. That introduces additional overhead during index maintenance and increases the database size. Moreover, queries need to read more pages when scanning all or part of the index. This doesn’t necessarily introduce a noticeable performance impact during small range scans, when reading a few extra pages is far more efficient than key lookups, but it can degrade the performance of queries that scan a large number of data pages or the entire index.

Covering indexes also add update overhead. By adding a column to nonclustered indexes, you store the data in multiple places. This improves the performance of queries that select the data. However, during updates, SQL Server needs to change the rows in every index where updated columns are present.

It is not a good idea to create very wide nonclustered indexes that include the majority of the columns in the table. Nor do you want a very large number of indexes. Both conditions increase the size of the database and lead to excessive update overhead, especially in OLTP environments. (I will talk more about index analysis and consolidation in Chapter 14.)

Finally, I would like to state something important and perhaps obvious: while excessive key lookups are bad for performance, having key lookups is completely normal. Doing lookups of hundreds or even thousands of rows may be a better option than creating large covering indexes.

Think about it from a different angle: the key lookup is basically a loop join with efficient index seek in the inner table (clustered index). It is very fast and efficient on a small scale.

Remember, you don’t have to eradicate all key lookups from the execution plan. Just analyze their efficiency and take care of the inefficient ones.

Indexing the Data

Designing proper indexes is both an art and a science, and you will master this skill over time. Let me give you a few tips on how to start.

Most queries in the system have some parameters. The most selective SARGable predicates filter out most of the data; the columns in those predicates are the best candidates for the index.

Let’s look at a hypothetical query that returns a list of one customer’s orders (Listing 5-14).

Example 5-14. Selecting a list of a customer’s orders
SELECT c.CustomerName, c.CustomerNumber, o.OrderId, o.OrderDate, o.Amount
FROM 
    dbo.Customers c JOIN dbo.Orders o ON
        c.CustomerId = o.CustomerId
WHERE 
    c.CustomerNumber = @CustNum AND
    c.Active = 1 AND
    o.OrderDate between @StartDate AND @EndDate AND
    o.Fulfilled = 1;

Let’s assume that you run this query against tables that have no indexes except clustered indexes on the CustomerId and OrderId columns. The execution plan for the query consists of two clustered index scans and a hash join between the tables (Figure 5-20). As you can see, it is inefficient.

Figure 5-20. Indexing example: Initial execution plan

The natural place to start is the CustomerNumber column. The data here is likely to be unique, and the index on this column will be highly selective. The predicate Active=1 checks the status of the customer. You can expect it to be commonly used in the queries. It is a good idea to add the Active column to the index as an included column.

I’d also expect CustomerName to often be selected alongside CustomerNumber, so I’d add it to the index as another included column, eliminating the need for a key lookup operation. Still, key lookups may be completely acceptable on a small scale with highly selective indexes.

Figure 5-21 shows the execution plan after you create the following index: CREATE UNIQUE INDEX IDX_Customers_CustomerNumber ON dbo.Customers(CustomerNumber) INCLUDE (Active, CustomerName).

Figure 5-21. Indexing example: Plan with the index on Customers table

There are three predicates with Orders table columns. Two of them—on the CustomerId and OrderDate columns—are selective, and thus good candidates for the index. You can define the index with either (CustomerId, OrderDate) or (OrderDate, CustomerId) column order.

To choose, consider how data will be sorted in the index. With the first option, (CustomerId, OrderDate), SQL Server sorts the data by CustomerId first. Then the orders for each customer are sorted by OrderDate. With the second option, the data will be sorted by OrderDate across all customers.

Both indexes will allow index seek in the Orders table. However, the first index is more efficient for our query. SQL Server will be able to do a range scan for orders that belong to this single customer for the time interval defined by @StartDate and @EndDate. With the second index, SQL Server would have to read all orders in that time interval for all customers, which would force it to scan more data.

It’s good to add the Fulfilled column to the index as an included column, so SQL Server can evaluate that predicate against the index rows without key lookups. In this case, I’d also include the Amount column: it is small enough that it would not increase the size of the index much.

Figure 5-22 shows the final execution plan after the following index has been created: CREATE INDEX IDX_Orders_CustomerId_OrderDate ON dbo.Orders(CustomerId, OrderDate) INCLUDE (Fulfilled, Amount).

Figure 5-22. Indexing example: Final execution plan

Query optimization is never boring. It constantly challenges you and helps you learn. I hope this chapter gives you some tips on where to start and encourages you to practice and learn more. After all, query tuning is the most efficient way to improve the performance of a system.

A few more words of advice: Don’t create separate indexes for each query. Instead, analyze the workload in the system and create indexes that can be useful in multiple queries. When you start optimization, review the least efficient queries and identify common access patterns.

Be careful with covering indexes. Making them very wide is not a good idea. Again, there is nothing wrong with key lookups if they don’t introduce a significant impact on the system.

In the next chapter, we will discuss high CPU load and options for reducing it.

Summary

SQL Server supports three data storage and processing technologies. Row-based storage, which is the most common, stores the data from all columns together in unsorted heaps or sorted B-Tree indexes.

B-Tree indexes sort data based on index keys. Clustered indexes store the data from all table columns. Nonclustered indexes store another copy of the data in separate physical structures; they reference the clustered index through the row-id, which consists of the clustered index key values. When required data is not present in a nonclustered index, SQL Server traverses the clustered index B-Tree using a key lookup operation. This operation is expensive at scale.

SQL Server accesses data in two ways. An index scan usually reads all rows from the index. Index seek isolates and processes a subset of the index rows, which is usually more efficient than an index scan. Write your queries to allow SQL Server to utilize index seeks. You should also analyze the efficiency of your index seeks, making sure that they don’t process large amounts of data.

SQL Server stores information about indexes and data distribution in statistics, which it uses to estimate how many rows each operator will need to process in the execution plan. Accurate cardinality estimation helps SQL Server generate efficient execution plans. Up-to-date statistics are a key element in correct cardinality estimation.

When you analyze execution plans, pay attention to the efficiency of the Index Seek operators, the choices of join type, and the number of key and RID lookups. Check cardinality estimations, too: improper cardinality estimation is one of the most common problems that lead to poor execution plans. Make sure that statistics are up to date, add the required indexes, and simplify and refactor queries to address issues.

Troubleshooting Checklist

Troubleshoot the following:

  • Analyze the statistics maintenance strategy in the system. Enable trace flag T2371 if the system has databases with a compatibility level below 130 (SQL Server 2016).
  • Analyze and improve index maintenance strategies.
  • Make sure that statistics on filtered indexes are frequently updated. Optionally, consider rebuilding them frequently.
  • Identify and optimize inefficient queries.

[1] For the sake of brevity, from this point on, I will stop referring to “key and RID lookups”; everything I say about “key lookups” applies to RID lookups as well.
