Chapter 9
Performance tuning SQL Server

In this chapter, we review the database concepts and objects most commonly associated with performance tuning in Microsoft SQL Server. We begin with a fundamental exploration of isolation levels and their practical effects on queries. We then review delayed durability and delayed durability transactions. Finally, we explore execution plans in detail, including ways to use them with the Query Store feature, what to look for when performance tuning, and how to control when queries go parallel.

Entire books have been written on some of the sections in this chapter—we obviously can’t go into that degree of detail here in a single chapter, but we do provide a deep enough discussion to jumpstart and accelerate your learning toward SQL Server performance tuning, including features added in SQL Server 2016 and 2017.

Understanding isolation levels and concurrency

It is important to have a fundamental understanding of isolation levels. These aren’t just arcane keywords you study only when it is certification test time; they can have a profound effect on application performance, stability, and data integrity.

Understanding the differing impact of isolation levels on locking and blocking, and therefore on  concurrency, is the key to understanding when you should use an isolation level different from the default of READ COMMITTED. Table 9-1 presents all of the isolation levels available in SQL Server.

Table 9-1 Isolation levels

| Transaction isolation level    | Allows dirty reads | Allows nonrepeatable reads | Allows phantom rows | Update conflicts possible |
| READ UNCOMMITTED               | X                  | X                          | X                   |                           |
| READ COMMITTED                 |                    | X                          | X                   |                           |
| REPEATABLE READ                |                    |                            | X                   |                           |
| SERIALIZABLE                   |                    |                            |                     |                           |
| READ COMMITTED SNAPSHOT (RCSI) |                    | X                          | X                   |                           |
| SNAPSHOT                       |                    |                            |                     | X                         |

Inside OUT

What about READPAST?

READPAST is a table hint, not an isolation level, and you cannot set it at the session level. We discuss more about how and where you can set isolation levels later in this chapter.

But, READPAST can be useful in very specific circumstances, limited to SQL Server tables used as a “stack” or “queue” with “first in, first out” architecture. READPAST does not place row-level locks on a table; instead of being blocked by rows that are locked, it skips them. User transactions can fetch the “first” row in the stack that isn’t already being accessed.

In this way, a multithreaded process that is regularly looping through a table can read rows, afford to skip the rows currently being written to, and read them on the “next pass.” Outside of these limited scenarios, READPAST is not appropriate because it will likely return incomplete data.

When you are choosing an isolation level for a transaction in an application, you should consider primarily the transactional safety and business requirements of the transaction in a multiuser environment. The performance of the transaction should be a distant second priority when choosing an isolation level. Locking is not bad; it is the way that every transaction in SQL Server cooperates with others when dealing with disk-based tables. READ COMMITTED is generally a safe isolation level because it allows updates to block reads. In the default READ COMMITTED isolation level, reads cannot access uncommitted data and must wait for a transaction to commit or roll back. In this way, READ COMMITTED prevents a SELECT statement from accessing uncommitted data, a problem known as a dirty read. This is especially important during multistep transactions, in which parent and child records in a foreign key relationship must be created in the same transaction. In that scenario, reads should not access either table until both tables are updated.

READ COMMITTED does not ensure that row data and row count won’t change between two SELECT queries in a multistep transaction. For some application scenarios, this might be acceptable or desired, but not for others. To avoid these two problematic scenarios (which we talk more about soon), you need to increase the transaction’s isolation.

For scenarios in which transactions must have a higher degree of isolation from other transactions, escalating the isolation level of the transaction is appropriate. For example, a transaction might process multistep writes and be unable to allow other transactions to change data it has already read. Here are two examples:

First, REPEATABLE READ blocks other transactions from changing or deleting rows that are needed during a multistep transaction. Without it, the same query could return different or fewer rows of data when reading the same data twice, a phenomenon called a nonrepeatable read, which is problematic for multistep transactions. Nonrepeatable reads can affect transactions in any isolation level less strict than REPEATABLE READ.

Second, if the transaction must also ensure that the same number of rows is returned in a result set throughout the multistep transaction, the SERIALIZABLE isolation level is necessary. It is the only isolation level that prevents other transactions from inserting new rows inside a range of rows already read, a problem known as phantom rows.

The behavior just described applies to transactions affecting only a few rows, where the Database Engine takes row- and page-level locks to provide protection for transaction isolation. If a REPEATABLE READ transaction accesses a large number of rows, its locks could escalate to a table lock, which would then also protect the transaction against phantom rows.

Image For more on monitoring database locking and blocking, see Chapter 13.

Inside OUT

SQL Server doesn’t have a time-out? Really?

That’s correct, by default there is no time-out for a local request that is being blocked in SQL Server, although applications can report a “SQL time-out” if query run time surpasses their own time-out limitations.

By default, SQL Server will not cancel a request that is being blocked, but you can change this behavior for individual sessions. The value of the global variable @@LOCK_TIMEOUT is -1 by default, indicating that there is no time-out. You can change this for the current session by using the following statement:

SET LOCK_TIMEOUT n;

Where n is the number of milliseconds before a request is cancelled by SQL Server, returning error 1222, “Lock request time out period exceeded. The statement has been terminated.” Take caution in implementing this change to SQL Server’s default lock time-out, and try to fully understand the cause of the blocking first. If you change the lock time-out in code, ensure that any applications creating the sessions are prepared to handle the errors gracefully and retry.
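For instance, here is a minimal sketch of setting and then reverting a session-level lock time-out; the five-second value is arbitrary and only for illustration:

-- Cancel any lock wait in this session after 5 seconds (5000 ms)
SET LOCK_TIMEOUT 5000;

-- Verify the current session setting (-1 means wait indefinitely)
SELECT @@LOCK_TIMEOUT;

-- Revert to the default behavior of waiting indefinitely
SET LOCK_TIMEOUT -1;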

SQL Server does have a configuration setting for a lock time-out for outgoing remote connections called Remote Query Timeout (s), which defaults to 600 seconds. This time-out applies only to connections to remote data providers, not to requests run on the SQL Server instance.

Understanding how concurrent sessions become blocked

In this section, we review a series of realistic examples of how concurrency works in a multiuser application interacting with SQL Server tables. First, let’s discuss how to diagnose whether a request is being blocked or blocking another request.

How to observe blocking

It’s easy to find out live whether a request is being blocked. The dynamic management view sys.dm_exec_requests, when combined with sys.dm_exec_sessions on the session_id column, provides similar data plus much more information than the legacy sp_who or sp_who2 commands, including the blocking_session_id column, as demonstrated here:

SELECT * FROM
sys.dm_exec_sessions s
LEFT OUTER JOIN sys.dm_exec_requests r ON r.session_id = s.session_id;
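To narrow the results to only the requests that are currently blocked, you can filter on the blocking_session_id column. Here is a minimal sketch:

SELECT r.session_id, r.blocking_session_id, r.status, r.command,
       r.wait_type, r.wait_time, txt.[text] AS statement_text
FROM sys.dm_exec_requests AS r
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS txt
WHERE r.blocking_session_id <> 0;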

Now, let’s review some example scenarios to detail exactly why and how requests can block one another in the real world. This is the foundation of concurrency in SQL Server and helps you understand why NOLOCK appears to make queries perform faster. The examples that follow behave identically on SQL Server instances and in databases in Microsoft Azure SQL Database.

Understanding concurrency: two requests updating the same rows

Consider the following steps involving two writes, with each transaction coming from a different session. The transactions are explicitly declared by using the BEGIN/COMMIT TRAN syntax. In this example, the transactions are not overriding the default isolation level of READ COMMITTED:

  1. A table contains only rows of Type = 0 and Type = 1. Transaction 1 begins and updates all rows from Type = 1 to Type = 2.

  2. Before Transaction 1 commits, Transaction 2 begins and issues a statement to update Type = 2 to Type = 3. Transaction 2 is blocked and will wait for Transaction 1 to commit.

  3. Transaction 1 commits.

  4. Transaction 2 is no longer blocked and processes its update statement. Transaction 2 then commits.

The result: The resulting table will contain records of Type = 3, because the second transaction did update records. This is because when Transaction 2 started, it too waited for committed data, proceeding only after Transaction 1 committed.
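To reproduce this scenario yourself, here is a minimal sketch using a hypothetical dbo.TypeTest table (not part of any sample database). Run the Session 1 and Session 2 batches in separate query windows:

-- One-time setup (either session)
CREATE TABLE dbo.TypeTest (ID int IDENTITY(1,1) PRIMARY KEY, [Type] int NOT NULL);
INSERT INTO dbo.TypeTest ([Type]) VALUES (0), (1), (1);

-- Session 1
BEGIN TRAN;
UPDATE dbo.TypeTest SET [Type] = 2 WHERE [Type] = 1;
-- do not commit yet

-- Session 2 (this UPDATE is blocked until Session 1 commits)
BEGIN TRAN;
UPDATE dbo.TypeTest SET [Type] = 3 WHERE [Type] = 2;
COMMIT TRAN;

-- Session 1 (committing here unblocks Session 2)
COMMIT TRAN;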

Understanding concurrency: a write blocks a read

Next, consider the following steps involving a write and a read, with each transaction coming from a different session. In this scenario, an uncommitted write in Transaction 1 blocks a read in Transaction 2. The transactions are explicitly declared using the BEGIN/COMMIT TRAN syntax. In this example, the transactions are not overriding the default isolation level of READ COMMITTED:

  1. A table contains only records of Type = 0 and Type = 1. Transaction 1 begins and updates all rows from Type = 1 to Type = 2.

  2. Before Transaction 1 commits, Transaction 2 begins and issues a SELECT statement for records of Type = 2. Transaction 2 is blocked and waits for Transaction 1 to commit.

  3. Transaction 1 commits.

  4. Transaction 2 is no longer blocked, and processes its SELECT statement. Rows are returned. Transaction 2 then commits.

The result: Transaction 2 returns records of Type = 2. This is because when Transaction 2 started, it waited for committed data until after Transaction 1 committed.

Understanding concurrency: a nonrepeatable read

Consider the following steps involving a read and a write, with each Transaction coming from a different session. In this scenario, Transaction 1 suffers a nonrepeatable read, as READ COMMITTED does not offer any protection against phantom rows or nonrepeatable reads. The transactions are explicitly declared using the BEGIN/COMMIT TRAN syntax. In this example, the transactions are not overriding the default isolation level of READ COMMITTED:

  1. A table contains only records of Type = 0 and Type = 1. Transaction 1 starts and selects rows where Type = 1. Rows are returned.

  2. Before Transaction 1 commits, Transaction 2 starts and issues an UPDATE statement, setting records of Type = 1 to Type = 2. Transaction 2 is not blocked, and processes its update immediately.

  3. Transaction 1 again selects rows where Type = 1, and is blocked.

  4. Transaction 2 commits.

  5. Transaction 1 is immediately unblocked. No rows are returned. (No committed rows exist where Type=1.) Transaction 1 commits.

The result: The resulting table contains records of Type = 2, and the second transaction has updated records. This is because when Transaction 2 started, Transaction 1 had not placed any exclusive locks on the data, allowing for writes to happen. Because it is doing only reads, Transaction 1 would never have placed any exclusive locks on the data. Transaction 1 suffered from a nonrepeatable read: the same SELECT statement returned different data during the same multistep transaction.

Understanding concurrency: preventing a nonrepeatable read

Consider the following steps involving a read and a write, with each transaction coming from a different session. This time, we protect Transaction 1 from dirty reads and nonrepeatable reads by using the REPEATABLE READ isolation level. A read in the REPEATABLE READ isolation level will block a write. The transactions are explicitly declared by using the BEGIN/COMMIT TRAN syntax:

  1. A table contains only records of Type = 0 and Type = 1. Transaction 1 starts and selects rows where Type = 1 in the REPEATABLE READ isolation level. Rows are returned.

  2. Before Transaction 1 commits, Transaction 2 starts and issues an UPDATE statement, setting records of Type = 1 to Type = 2. Transaction 2 is blocked by Transaction 1.

  3. Transaction 1 again selects rows where Type = 1. Rows are returned.

  4. Transaction 1 commits.

  5. Transaction 2 is immediately unblocked and processes its update. Transaction 2 commits.

The result: The resulting table will contain records of Type = 2. This is because when Transaction 2 started, Transaction 1 had placed read locks on the data it was selecting, blocking writes from happening until it had committed. Transaction 1 returned the same records each time and did not suffer a nonrepeatable read. Transaction 2 processed its updates only when it could place exclusive locks on the rows it needed.
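Here is a minimal sketch of Transaction 1 for this scenario, reusing the hypothetical dbo.TypeTest table from the earlier sketch. Any UPDATE of the selected rows from another session will be blocked until this transaction commits:

-- Session 1
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRAN;
SELECT ID, [Type] FROM dbo.TypeTest WHERE [Type] = 1;

-- A statement like the following in Session 2 would now be blocked:
-- UPDATE dbo.TypeTest SET [Type] = 2 WHERE [Type] = 1;

-- Session 1, later: returns the same rows as before
SELECT ID, [Type] FROM dbo.TypeTest WHERE [Type] = 1;
COMMIT TRAN;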

Understanding concurrency: experiencing phantom reads

Consider the following steps involving a read and a write, with each transaction coming from a different session. In this scenario, we describe a phantom read:

  1. A table contains only records of Type = 0 and Type = 1. Transaction 1 starts and selects rows where Type = 1 in the REPEATABLE READ isolation level. Rows are returned.

  2. Before Transaction 1 commits, Transaction 2 starts and issues an INSERT statement, adding rows of Type = 1. Transaction 2 is not blocked by Transaction 1.

  3. Transaction 1 again selects rows where Type = 1. More rows are returned compared to the first time the select was run in Transaction 1.

  4. Transaction 1 commits.

  5. Transaction 2 commits.

The result: Transaction 1 experienced a phantom read when it returned a different number of records the second time it selected from the table inside the same transaction. Transaction 1 had not placed any locks on the range of data it needed, allowing for writes in another transaction to happen within the same dataset. The phantom read would have occurred to Transaction 1 in any isolation level, except for SERIALIZABLE. Let’s look at that next.

Understanding concurrency: preventing phantom reads

Consider the following steps involving a read and a write, with each transaction coming from a different session. In this scenario, we protect Transaction 1 from a phantom read.

  1. A table contains only records of Type = 0 and Type = 1. Transaction 1 starts and selects rows where Type = 1 in the SERIALIZABLE isolation level. Rows are returned.

  2. Before Transaction 1 commits, Transaction 2 starts and issues an INSERT statement, adding rows of Type = 1. Transaction 2 is blocked by Transaction 1.

  3. Transaction 1 again selects rows where Type = 1. The same number of rows are returned.

  4. Transaction 1 commits.

  5. Transaction 2 is immediately unblocked and processes its insert. Transaction 2 commits.

The result: Transaction 1 did not suffer from a phantom read the second time it selected from the table, because it had placed a lock on the range of rows it needed. The table now contains additional records of Type = 1, but they were not inserted until after Transaction 1 had committed.

Stating the case against READ UNCOMMITTED (NOLOCK)

Many developers and database administrators consider the NOLOCK table hint and the equivalent READ UNCOMMITTED isolation level nothing more than the turbo button on their 486DX. “We had performance problems, but we’ve been putting NOLOCK in all our stored procedures to fix it.”

The effect of the NOLOCK table hint or the READ UNCOMMITTED isolation level is that no locks are taken inside the database, save for schema stability locks. (A query using NOLOCK could still be blocked by Data Definition Language [DDL] commands.) The resulting removal of the basic integrity mechanisms used to retrieve data can obviously return uncommitted data, but that alone is not usually enough to scare away developers. There are more good reasons to avoid the READ UNCOMMITTED isolation level, however.

The case against using the READ UNCOMMITTED isolation level goes deeper than performance and deeper than “reading data that has yet to be committed.” Developers might counter that data is rarely ever rolled back or that the data is for reporting only. In production environments, these are not sufficient grounds to justify the potential problems. The only situations in which READ UNCOMMITTED is an acceptable performance shortcut involve nonproduction systems, estimate-only counts, or estimate-only aggregations.

A query in READ UNCOMMITTED isolation level could return invalid data in the following real-world, provable ways:

  • Read uncommitted data (dirty reads)

  • Read committed data twice

  • Skip committed data

  • Return corrupted data

  • Or, the query could fail altogether: “Could not continue scan with NOLOCK due to data movement.”

One final caveat: in SQL Server you cannot apply NOLOCK to tables when used in modification statements, and it ignores the declaration of READ UNCOMMITTED isolation level in a batch that includes modification statements; for example:

INSERT INTO dbo.testnolock1 WITH (NOLOCK)
SELECT * FROM dbo.testnolock2;

The preceding code will return the error:

Msg 1065, Level 15, State 1, Line 17
The NOLOCK and READUNCOMMITTED lock hints are not allowed for target tables of INSERT, UPDATE, DELETE or MERGE statements.

However, this protection doesn’t apply to the source of any writes, hence the danger.

This following code is allowed and is dangerous because it could write invalid data:

INSERT INTO testnolock1
SELECT * FROM testnolock2 WITH (NOLOCK);

Changing the isolation level within transactions

In addition to using the SET TRANSACTION ISOLATION LEVEL command, you can use table hints to override previously set behavior. Let’s review the two ways by which you can change the isolation level of queries.

Using the transaction isolation level option

The SET TRANSACTION ISOLATION LEVEL command changes the isolation level for the current session, affecting all future transactions until the connection is closed.

But, you can change the isolation level of an explicit transaction after it is created, as long as you are not changing from or to the SNAPSHOT isolation level.

For example, the following code snippet is technically valid:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRAN  
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
SELECT…

However, this snippet is invalid:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRAN  
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
SELECT…

Doing so results in the following error:

Msg 3951, Level 16, State 1, Line 4

Transaction failed in database 'databasename' because the statement was run under snapshot isolation but the transaction did not start in snapshot isolation. You cannot change the isolation level of the transaction after the transaction has started.

In .NET applications, you should change the isolation level of each transaction when it is created. In Transact-SQL (T-SQL) code and stored procedures, you should change the isolation level of the session before creating an explicit transaction.

Using table hints to change isolation

You also can use isolation level hints to change the isolation level at the individual object level. This is an advanced type of troubleshooting that you shouldn’t use commonly, because it increases the complexity of maintenance and muddies architectural decisions with respect to enterprise concurrency.

For example, you might have seen developers use NOLOCK at the end of a table reference, effectively (and dangerously) dropping access to that table into the READ UNCOMMITTED isolation level:

SELECT col1 FROM dbo.Table (NOLOCK)

Aside from the unadvisable use of NOLOCK in the preceding example, using a table hint without WITH is deprecated syntax (since SQL Server 2008).

Aside from the cautionary NOLOCK, there are 20-plus other table hints that can have utility, including the ability for a query to use a certain index, to force a seek or scan on an index, or to override the query optimizer’s locking strategy. We look at how to use UPDLOCK later in this chapter; other hints, such as HOLDLOCK, force the use of SERIALIZABLE-level locking for a single table (see the sketch that follows).
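For example, here is a minimal sketch combining the UPDLOCK and HOLDLOCK hints on a single, hypothetical table inside an explicit transaction. HOLDLOCK provides SERIALIZABLE-equivalent range locking for that table only, and UPDLOCK takes update locks on the rows read:

BEGIN TRAN;

SELECT col1
FROM dbo.AnyTable WITH (UPDLOCK, HOLDLOCK)
WHERE col1 = 1;

-- other statements in the same multistep transaction...

COMMIT TRAN;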

All table hints should be considered for temporary and/or highly situational troubleshooting. They could make maintenance of these queries problematic in the future. For example, using the INDEX or FORCESEEK table hints could result in poor query performance or even cause the query to fail if the table’s indexes are changed.

Image For detailed information on all possible table hints, see the SQL Server documentation at https://docs.microsoft.com/sql/t-sql/queries/hints-transact-sql-table.

Understanding the enterprise solution to concurrency: SNAPSHOT

In the interest of performance, application developers too often seek to solve concurrency issues (that is, to reduce blocking) by using READ UNCOMMITTED. At first, and at scale, the performance gains seem too vast to consider other alternatives. But there is a far safer option, without the significant drawbacks and potential for invalid data and errors. Using row versioning with READ COMMITTED SNAPSHOT (RCSI) and/or the SNAPSHOT isolation level is the enterprise solution to performance issues related to concurrency.

SNAPSHOT isolation allows queries to read from the same rows that might be locked by other queries by using row versioning. The SQL Server instance’s TempDB keeps a copy of committed data, and this copy can be served to concurrent requests. In this way, SNAPSHOT allows access only to committed data, but without blocking access to data locked by writes. By trading increased utilization and workload in TempDB for disk-based tables, you gain dramatically better performance through increased concurrency, without the dangers of accessing uncommitted data.

Under RCSI, row versioning works silently in the background and is applied at the statement level, not at the transaction or session level. Each statement has access to the most recently committed version of each row. As a result, RCSI is still susceptible to nonrepeatable reads and phantom rows. SNAPSHOT isolation, by contrast, uses the row versions of affected rows throughout a transaction; thus, it is not susceptible to nonrepeatable reads or phantom rows.

As an example of SNAPSHOT in use internally, all queries run against a secondary readable database in an availability group are run in the SNAPSHOT isolation level, by design. The transaction isolation level and any locking table hints are ignored. This removes any concurrency conflicts between a read-heavy workload on the secondary database and the transactions arriving there from the primary database.

Understanding concurrency: accessing SNAPSHOT data

Consider the following steps involving a read and a write, with each transaction coming from a different session. In this scenario, we see that Transaction 2 has access to previously committed row data, even though those rows are being updated concurrently.

  1. A table contains only records of Type = 1. Transaction 1 starts and updates rows where Type = 1 to Type = 2.

  2. Before Transaction 1 commits, Transaction 2 sets its session isolation level to SNAPSHOT.

  3. Transaction 2 issues a SELECT statement WHERE Type = 1. Transaction 2 is not blocked by Transaction 1. Rows where Type = 1 are returned. Transaction 2 commits.

  4. Transaction 1 commits.

  5. Transaction 2 again issues a SELECT statement WHERE Type = 1. No rows are returned.

The result: Transaction 2 was not blocked when it attempted to query rows that Transaction 1 was updating. It had access to previously committed data, thanks to row versioning.
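Here is a minimal sketch of Transaction 2 from this scenario, again reusing the hypothetical dbo.TypeTest table. It assumes ALLOW_SNAPSHOT_ISOLATION has already been turned on for the database (covered in the next section):

-- Session 2
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
-- Not blocked by the concurrent update in Session 1;
-- returns the most recently committed row versions
SELECT ID, [Type] FROM dbo.TypeTest WHERE [Type] = 1;
COMMIT TRAN;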

Implementing SNAPSHOT isolation

You can implement SNAPSHOT isolation level in a database in two different ways. Turning on SNAPSHOT isolation simply allows for the use of SNAPSHOT isolation and begins the process of row versioning. Alternatively, turning on RCSI changes the default isolation level to READ COMMITTED SNAPSHOT. You can implement both or either. It’s important to understand the differences between these two settings, because they are not the same:

  • READ COMMITTED SNAPSHOT configures optimistic concurrency for reads by overriding the default isolation level of the database. When turned on, all queries will use RCSI unless overridden.

  • SNAPSHOT isolation mode configures optimistic concurrency for reads and writes. You must then specify the SNAPSHOT isolation level for any transaction to use SNAPSHOT isolation level. It is possible to have update conflicts with SNAPSHOT isolation mode that will not occur with READ COMMITTED SNAPSHOT.

The statement to implement SNAPSHOT isolation in the database is simple enough, but is not without consequence. Even if no transactions or statements use the SNAPSHOT isolation level, behind the scenes, TempDB begins storing row version data for disk-based tables. (Memory-optimized tables have row-versioning built in and don’t need TempDB.) The Database Engine maintains previous versions for changing data in TempDB regardless of whether that data is currently being accessed by user queries. Here’s how to implement SNAPSHOT isolation:

ALTER DATABASE databasename SET ALLOW_SNAPSHOT_ISOLATION ON;

All transactions will continue to use the default READ COMMITTED isolation level, but you now can specify the use of SNAPSHOT isolation at the session level or in table hints, as shown in the following example:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

Alternatively, or in conjunction with ALLOW_SNAPSHOT_ISOLATION, you can turn on RCSI as the new default isolation level in a database. Here’s how to turn on RCSI:

ALTER DATABASE databasename SET READ_COMMITTED_SNAPSHOT ON;

You can set both of the preceding database settings independently of each other. Setting ALLOW_SNAPSHOT_ISOLATION is not required to turn on READ_COMMITTED_SNAPSHOT, and vice versa. Similarly, these settings are not tied to the MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT database setting to promote memory-optimized table access to SNAPSHOT isolation.
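You can verify how each database is currently configured by querying sys.databases; for example:

SELECT name,
       snapshot_isolation_state_desc,
       is_read_committed_snapshot_on
FROM sys.databases
WHERE name = N'WideWorldImporters';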

Image We discuss memory-optimized tables in greater detail in Chapter 10.

For either of the previous ALTER DATABASE statements to succeed, no other transactions can be open in the database. It might be necessary to close other connections manually or to put the database in SINGLE_USER mode. Either way, we do not recommend that you perform this change during production activity.

Be aware and prepared for the increased utilization in the TempDB, both in the demand and space requirements. To avoid autogrowth events, increase the size of the TempDB data and log files and monitor their size. Although you should try to avoid autogrowth events by growing the TempDB data file(s) yourself, you should also verify that your TempDB file autogrowth settings are appropriate.

Image For more information on file autogrowth settings, see Chapter 4.

Should TempDB exhaust all available space on its drive volume, SQL Server will be unable to create row versions for transactions and will terminate them with error 3958. SQL Server will also issue errors 3967 and 3966 as the oldest row versions are removed from TempDB to make room for the row versions needed by newer transactions.

Understanding updates in SNAPSHOT isolation level

Transactions that read data in SNAPSHOT isolation or RCSI have access to previously committed data, instead of being blocked, when the data they need is being changed. This is important to understand, because an UPDATE statement in these conditions can experience a concurrency error. The potential for update conflicts is real, and you need to understand it. In the next section, we review ways to mitigate the risk.

For example, consider the following steps, with each transaction coming from a different session. In this example, Transaction 2 fails due to a concurrency conflict or “write-write error”:

  1. A table contains many records, each with a unique ID. Transaction 1 begins a transaction in the READ COMMITTED isolation level and performs an update on the row where ID = 1.

  2. Transaction 2 sets its session isolation level to SNAPSHOT and issues a statement to update the row where ID = 1.

  3. Transaction 1 commits first.

  4. Transaction 2 immediately fails with SQL error 3960.

The result: Transaction 1’s update to the row where ID = 1 succeeded. Transaction 2 immediately failed with the following error message:

Msg 3960, Level 16, State 2, Line 8

Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.AnyTable' directly or indirectly in database 'WideWorldImporters' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.

Transaction 2 was rolled back and marked uncommittable. Let’s try to understand why this error occurred, what to do, and how to prevent it.

For disk-based tables, writes in SNAPSHOT isolation still acquire locks and can block other writers, but this blocking does not prevent update conflicts. An UPDATE statement in a SNAPSHOT isolation transaction can fail if another transaction has changed the data it needs since the SNAPSHOT transaction began.

For disk-based tables, the update conflict error will look like the Msg 3960 that we saw a moment ago. For queries on memory-optimized tables, the update conflict error will look like this:

Msg 41302, Level 16, State 110, Line 8

The current transaction attempted to update a record that has been updated since this transaction started. The transaction was aborted.

The preceding error can occur with ALLOW_SNAPSHOT_ISOLATION turned on if transactions are run in SNAPSHOT isolation level.

Even though the optimistic concurrency of the SNAPSHOT isolation level (and of memory-optimized tables) increases the potential for update conflicts, you can mitigate these by doing the following:

  • When running a transaction in SNAPSHOT isolation level, it is crucial to avoid using any statements that place update locks on disk-based tables inside multistep explicit transactions.

    Similarly, always avoid multistep transactions with writes when working with memory-optimized tables, regardless of isolation level.

  • Specifying the UPDLOCK table hint can be useful for preventing update conflict errors on long-running SELECT statements. The UPDLOCK table hint places pessimistic locks on the rows needed for the multistep transaction to complete. The use of UPDLOCK on SELECT statements with the SNAPSHOT isolation level is not a panacea for update conflicts, and it could in fact create them: frequent SELECT statements with UPDLOCK could increase the number of update conflicts with concurrent updates. Regardless, your application should handle errors and initiate retries when appropriate (see the retry sketch after this list).

    If two concurrent statements use UPDLOCK, with one updating and one reading the same data, even in implicit transactions, an update conflict failure is possible if not likely.

  • Avoid writes altogether while in SNAPSHOT isolation mode. Change the transaction isolation level back to READ COMMITTED before running an UPDATE statement, and then back to SNAPSHOT if desired.

Specifying table granularity hints such as ROWLOCK or TABLOCK can prevent update conflicts, although at the cost of concurrency. The second update transaction must be blocked while the first update transaction is running—essentially bypassing SNAPSHOT isolation for the write. If two concurrent statements are both updating the same data in SNAPSHOT isolation level, an update conflict failure is likely for the statement that started second.
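As a starting point for the retry logic mentioned earlier, here is a minimal sketch that retries a SNAPSHOT isolation update a few times when an update conflict (error 3960, or 41302 for memory-optimized tables) is raised. The dbo.AnyTable table and its columns are hypothetical:

DECLARE @retries int = 3;

WHILE @retries > 0
BEGIN
    BEGIN TRY
        SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
        BEGIN TRAN;
        UPDATE dbo.AnyTable SET ColorName = 'Blue' WHERE ID = 1;
        COMMIT TRAN;
        SET @retries = 0; -- success, exit the loop
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRAN;
        IF ERROR_NUMBER() IN (3960, 41302) AND @retries > 1
            SET @retries -= 1; -- update conflict, try again
        ELSE
            THROW; -- out of retries, or some other error
    END CATCH;
END;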

Using memory-optimized tables in SNAPSHOT isolation level

SNAPSHOT isolation is supported for memory-optimized tables, but not with all of the different ways to place a query in SNAPSHOT isolation. There are only two ways to ensure that memory-optimized tables use SNAPSHOT isolation:

  • Turn on the MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT database option. This promotes access to all memory-optimized tables in the database up to SNAPHOT isolation level if the current isolation level is not REPEATABLE READ or SERIALIZABLE. It will promote the isolation level to SNAPSHOT from isolation levels such as READ UNCOMMITTED and READ COMMITTED. This option is off by default, but you should consider it because you otherwise cannot use the READ UNCOMMITTED or SNAPSHOT isolation levels for a session including memory-optimized tables.

  • You can specify SNAPSHOT isolation with table hints (see the section “Using table hints to change isolation” earlier in this chapter). Note that only memory-optimized tables can use the SNAPSHOT table hint; disk-based tables cannot.

You cannot, for example, include memory-optimized tables in a session that begins with SET TRANSACTION ISOLATION LEVEL SNAPSHOT, even if MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON or you specify the SNAPSHOT table hint.
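Here is a minimal sketch of both approaches; the database option targets the WideWorldImporters sample database, and dbo.InMemoryTable stands in for a hypothetical memory-optimized table:

-- Option 1: elevate access to memory-optimized tables to SNAPSHOT by default
ALTER DATABASE WideWorldImporters
    SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON;

-- Option 2: specify the SNAPSHOT table hint per query (memory-optimized tables only)
SELECT ID
FROM dbo.InMemoryTable WITH (SNAPSHOT)
WHERE ID = 1;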

Inside OUT

Which isolation level does my .NET application use?

Be aware that by default the .NET System.Transaction infrastructure uses the SERIALIZABLE isolation level, the safest but least practical choice. SERIALIZABLE provides the most isolation for transactions, so by default .NET transactions do not suffer from dirty reads, nonrepeatable reads, or phantom rows.

You might find, however, that SERIALIZABLE transactions are frequently blocked or are themselves the source of blocking, and that reducing the isolation of certain transactions would result in better performance. Evaluate the potential risk of nonrepeatable reads and phantom rows for each new .NET transaction, and reduce the isolation level to REPEATABLE READ or READ COMMITTED only where appropriate. Following the guidance throughout this chapter, do not use the READ UNCOMMITTED isolation level in any production code.

For applications with high transactional volume, consider also using SNAPSHOT isolation level to increase concurrency.

You can set the isolation level of any transaction when it is begun by setting the IsolationLevel property of the TransactionScope class. You can also default a new database connection’s isolation level upon creation. Remember, however, that you cannot change the isolation level of a transaction after it has begun.

Understanding on-disk versus memory-optimized concurrency

Queries using memory-optimized tables (initially called Project Hekaton prior to the release of SQL Server 2014) can perform significantly faster than queries based on the same data in disk-based tables. Memory-optimized tables can improve the performance of frequently written-to tables by up to 40 times over disk-based tables.

When in the aforementioned scenarios we use the words “prevents” or “protection,” we mean locking, and this applies only to on-disk tables, not memory-optimized tables. When a transaction has rows or a range of rows locked, any other transaction’s writes in that range are blocked and wait patiently, queueing up to proceed as soon as the locks are released. Although SQL Server allows requests to wait and be blocked forever, the applications generating the requests might easily time out after less than a minute of waiting.

In the case of memory-optimized tables, locking isn’t the mechanism that ensures isolation. Instead, the in-memory engine uses row versioning to provide row content to each transaction. In the in-memory engine, update operations create new rows in the in-memory data structure (actually a heap) that supplant older row versions. Similarly, delete operations create rows in a delta file, marking the row as deleted. Periodically, cleanup is done to merge the in-memory data structure and delta files to reduce the space used in memory and, in the case of tables with durable data, on a drive. If you are familiar with the data warehousing concept of a Slowly Changing Dimension (SCD), this is similar to an SCD Type II.

If two transactions attempt to update the same data at the same time, one transaction will immediately fail due to a concurrency error. Only one transaction can be in the process of updating or deleting the same row at a time. The other will fail with a concurrency conflict (SQL error 41302).

This is the key difference between the behavior of pessimistic and optimistic concurrency. Pessimistic concurrency uses locks to prevent write conflict errors, whereas optimistic concurrency uses row versions with acceptable risk of write conflict errors. On-disk tables offer isolation levels that use pessimistic concurrency to block conflicting transactions, forcing them to wait. Memory-optimized tables offer optimistic concurrency that will cause a conflicting transaction to fail.

In the case of a nonrepeatable read, SQL error 41305 will be raised. In the case of a phantom read, a SQL error 41325 will be raised. Because of these errors, applications that write to memory-optimized tables must include logic that gracefully handles and automatically retries transactions. They should already handle and retry in the case of deadlocks or other fatal database errors.

Image For more information on configuring memory-optimized tables, see Chapter 8.

Image We discuss more about indexes for memory-optimized tables in Chapter 10.

Understanding delayed durability

Delayed durability is a set of transaction features first introduced in SQL Server 2014. It allows transactions to avoid synchronously committing to disk; instead, they commit only to memory, and the log records are written to disk asynchronously. If this sounds dangerous to you, opening up the possibility of losing records in the event of a server shutdown, you are correct!

However, unless your SQL Server instance’s databases are running in a synchronous availability group (and even then, chance exists for the databases to drop into asynchronous under pressure), you already face the likelihood in your database of losing recently written records in the event of a sudden server or drive failure.

So perhaps delayed durability’s danger isn’t so unfamiliar after all. Databases in Azure SQL Database also support delayed durability transactions, with the same caveat and expectations for data recovery. Some data loss is possible.

A delayed durable transaction is flushed to disk whenever a threshold of delayed durability transactions builds up, or whenever any other durable transaction commits in the same database. You can also force a flush of the transaction log with the system stored procedure sp_flush_log. Otherwise, the transactions are written to an in-memory buffer and kept away from I/O resources until a log flush event occurs. SQL Server manages the buffer, but makes no guarantee as to the amount of time a transaction can remain in the buffer.

The delayed durability options, implemented either at the database level or at the transaction level, have application in very-high-performance workloads for which the bottleneck to write performance is the transaction log itself. By trading the possibility for new records to be written only to memory and lost in the event of a shutdown, you can gain a significant performance increase, especially with write-heavy workloads.

It’s important to note that delayed durability is simply about reducing the I/O bottleneck of committing a massive quantity of writes to the transaction log. This has no effect on isolation (locking, blocking) or access to any data in the database that must be read to perform the write. Otherwise, delayed durability transactions follow the same rules as other transactions.

Delayed durability database options

At the database level, you can set the DELAYED_DURABILITY option to DISABLED (default), ALLOWED, or FORCED.

The FORCED option obviously has implications on the entirety of the database, and you should consider it carefully with existing applications and databases. The ALLOWED option permits delayed durability transactions but has no effect on other transactions.
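Here is a minimal sketch of allowing delayed durability at the database level, and of forcing a log flush on demand; it assumes the WideWorldImporters sample database:

-- Permit (but do not force) delayed durability transactions in this database
ALTER DATABASE WideWorldImporters SET DELAYED_DURABILITY = ALLOWED;

-- Flush any delayed durability transactions still sitting in the log buffer
EXEC sys.sp_flush_log;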

Delayed durability transactions

In the end, delayed durability is a transaction option with simple syntax. This syntax is necessary only when DELAYED_DURABILITY = ALLOWED in the current database.

It is supported for explicit transactions at the time they are committed by using the following sample syntax:

BEGIN TRAN
COMMIT TRAN WITH (DELAYED_DURABILITY=ON);

In the case of a natively compiled procedure, you can specify DELAYED_DURABILITY in the BEGIN ATOMIC block. Take, for example, this procedure in the WideWorldImporters database:

CREATE PROCEDURE [Website].[RecordColdRoomTemperatures_DD]
@SensorReadings Website.SensorDataList READONLY
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH
(
    TRANSACTION ISOLATION LEVEL = SNAPSHOT,
    LANGUAGE = N'English',
    DELAYED_DURABILITY = ON
)
    BEGIN TRY

Understanding execution plans

Execution plans are a detailed explanation of the query optimizer’s plan for processing any statement. Each time you run a statement, including batches with multiple statements, an execution plan is generated.

Execution plans inform the developer of the steps the Database Engine will take to retrieve data, from the tables, through the various transformation steps to sort, join, and filter data, and finally return or affect data. All statements create execution plans, including Data Manipulation Language (DML) and DDL.

Execution plans contain the cost and other metadata of each piece that it takes to process a query—from the data retrieval steps, joins, sorts, and more, and finally the DML or DDL operation itself. This data can be invaluable to developers and database administrators for tuning query performance.

The Procedure Cache, stored in the memory that SQL Server uses, contains query plans for statements that have been run. The Query Store is a powerful built-in repository in each database to track and trend runtime statistics over time.

Execution plans are generated for a query and reused when that exact same query text is called again. (The query text is first and always subjected to simplification, which removes redundancies, including using a code reduction technique called Constant Folding.) Queries will reuse the same plan only if every character of the query statement matches, including capitalization, whitespace, line breaks, and text in comments. There is one exception to this rule of query reuse, and that is when SQL Server parameterizes a query or stored procedure statement.

SQL Server does a smart job of sniffing for parts of a statement that could be parameterized to make a query’s cached plan reusable. For example, a query that has a WHERE clause on a LastName column should be able to use the same execution plan whether it is searching for “Smith” or “Green.”

Understanding parameterization and “parameter sniffing”

SQL Server parameterization occurs when the query optimizer detects values (such as the search criteria of a WHERE clause statement) that can be parameterized.

With parameterization, it’s possible that two potentially helpful or potentially problematic conditions can occur:

  • You can reuse a query plan for multiple queries for which the query text is exactly the same, except for parameterized values.

  • The same query could use the same execution plan for two different values of a WHERE clause, resulting in vastly different performance.

For example, the following two query statements in the WideWorldImporters database will be parameterized and use the same query plan. (This also means that both queries could be affected by the same Query Store forced plan; more on that later.) The first query returns 13 rows, the second returns 1,055 rows:

SELECT ppo.OrderDate, ppo.PurchaseOrderID, pol.PurchaseOrderLineID, ppo.[SupplierID]
  FROM [Purchasing].[PurchaseOrders] AS ppo
  INNER JOIN [Purchasing].[PurchaseOrderLines] AS pol
     ON ppo.PurchaseOrderID = pol.PurchaseOrderID
  INNER JOIN [Purchasing].[Suppliers] AS s  ON s.SupplierID = ppo.SupplierID
  WHERE ppo.SupplierID = 5

SELECT ppo.OrderDate, ppo.PurchaseOrderID, pol.PurchaseOrderLineID, ppo.[SupplierID]
  FROM [Purchasing].[PurchaseOrders] AS ppo
  INNER JOIN [Purchasing].[PurchaseOrderLines] AS pol
     ON ppo.PurchaseOrderID = pol.PurchaseOrderID
  INNER JOIN [Purchasing].[Suppliers] AS s  ON s.SupplierID = ppo.SupplierID
  WHERE ppo.SupplierID = 4

In the WideWorldImporters database, we might see that the same query plan for both statements results in quick performance for the smaller rowcount SupplierID and horrible performance for the larger rowcount.

If the larger rowcount query (SupplierID = 4) is run first and has its query plan cached, there isn’t likely to be a problem. Both versions of the query will run well enough. If the smaller rowcount query (SupplierID = 5) is run first, its version of the plan will be cached. In this case, the plan is different, less efficient for very large row counts, and will be used for all versions of the parameterized statement.

Here are a few advanced troubleshooting avenues to alleviate this scenario:

  • You can use the OPTIMIZE FOR query hint to instruct the query optimizer to build the execution plan using a provided value for the parameter, as shown in the sketch after this list. You also can use OPTIMIZE FOR UNKNOWN, which instructs the query optimizer to optimize for the statistical average of the underlying data distribution rather than any specific parameter value.

  • The RECOMPILE query hint or procedure option does not allow the reuse of a cached plan, forcing a fresh query plan to be generated each time the query is run.

  • You can use the Plan Guide feature (implemented via stored procedures) to guide the query optimizer to a plan currently in cache. You identify the plan via its plan_handle. For information on identifying and analyzing plans in sys.dm_exec_cached_plans, which contains a plan_handle column, see the upcoming section “Analyzing cached execution plans in aggregate.”

  • You can use the Query Store feature (implemented with a GUI in SQL Server Management Studio, and via stored procedures behind the scenes) to visually look at plan performance and force a query to use a specific plan currently in cache.

Image For more information, see the section “Using the Query Store feature” later in this chapter.

  • You could use the USE PLAN query hint to provide the entire XML query plan for any statement execution. This obviously is the least convenient option, and like other approaches that override the query optimizer, you should consider it an advanced and temporary performance tuning technique.
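To illustrate the first two options, here is a minimal sketch against the earlier WideWorldImporters query, rewritten with a local variable so that the hints have something to act on. Use only one OPTION clause at a time:

DECLARE @SupplierID int = 5;

SELECT ppo.OrderDate, ppo.PurchaseOrderID, pol.PurchaseOrderLineID, ppo.SupplierID
  FROM Purchasing.PurchaseOrders AS ppo
  INNER JOIN Purchasing.PurchaseOrderLines AS pol
     ON ppo.PurchaseOrderID = pol.PurchaseOrderID
  WHERE ppo.SupplierID = @SupplierID
  OPTION (OPTIMIZE FOR (@SupplierID = 4));
  -- or: OPTION (OPTIMIZE FOR UNKNOWN);
  -- or: OPTION (RECOMPILE);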

Understanding the Procedure Cache

New execution plans enter the Procedure Cache only when a new statement is run. If the Procedure Cache already contains a plan matching a previous run of the same statement, the execution plan is reused, saving valuable time and resources.

This is why complex statements can appear to run faster the second time they are run.

The Procedure Cache is empty when the SQL Server service starts and grows from there. SQL Server manages plans in the cache, removing them as necessary under memory pressure. The size of the Procedure Cache is managed by SQL Server and is inside the memory space configured for the server in the Max Server Memory configuration setting. Plans are removed based on their cost and how recently they have been used. Smaller, older plans and single-use plans are the first to be cleared.

Inside OUT

If I run a statement only once, does SQL Server remember its plan?

By default, SQL Server adds an execution plan to the Procedure Cache the first time it is generated. You can view the number and size of cached execution plans with the dynamic management view sys.dm_exec_cached_plans. You might find that a large amount of space in the Procedure Cache is dedicated to storing execution plans that have been used only once. These single-use plans can be referred to as ad hoc execution plans, from the Latin, meaning “for this situation.”

If you find that a SQL Server instance is storing many single-use plans, as many do, selecting the server configuration option Optimize For Ad Hoc Workloads will benefit performance. This option does not optimize ad hoc queries; rather, it optimizes SQL Server memory by caching a full execution plan only after the same query has been detected twice. Queries might then benefit from the cached plan only upon the third time they are run.

The following query provides the number of single-use versus multiuse query plans, and the space used to store both:

SELECT
        PlanUse = CASE WHEN p.usecounts > 1 THEN '>1' ELSE '1' END
,       PlanCount = COUNT(1)
,       SizeInMB = SUM(p.size_in_bytes/1024./1024.)
FROM sys.dm_exec_cached_plans p
GROUP BY CASE WHEN p.usecounts > 1 THEN '>1' ELSE '1' END;

Analyzing cached execution plans in aggregate

You can analyze execution plans in aggregate starting with the dynamic management view sys.dm_exec_cached_plans, which contains a plan_handle.

The plan_handle column contains a system-generated varbinary(64) string that can be joined to a number of other dynamic management views. As seen in the code example that follows, you can use the plan_handle to gather information about aggregate plan usage, plan statement text, and to retrieve the graphical execution plan itself. You might be used to viewing the graphical execution plan only after a statement is run in SQL Server Management Studio, but you can also analyze and retrieve plans by using the following query against a handful of dynamic management views (DMVs). These DMVs return data for all databases in SQL Server instances, and for the current database in Azure SQL Database.

Query cached plan stats

SELECT
   UseCount     = p.usecounts
,  PlanSize_KB  = p.size_in_bytes / 1024
,  CPU_ms       = qs.total_worker_time/1000
,  Duration_ms  = qs.total_elapsed_time/1000
,  ObjectType   = p.cacheobjtype + ' (' + p.objtype + ')'
,  DatabaseName = db_name(convert(int, pa.value))
,  txt.ObjectID
,  qs.total_physical_reads
,  qs.total_logical_writes
,  qs.total_logical_reads
,  qs.last_execution_time
,  StatementText = SUBSTRING (txt.[text], qs.statement_start_offset/2 + 1,
        CASE WHEN qs.statement_end_offset = -1
             THEN LEN(CONVERT(nvarchar(max), txt.[text]))
             ELSE qs.statement_end_offset/2 - qs.statement_start_offset/2 + 1 END)
,  QueryPlan    = qp.query_plan
FROM sys.dm_exec_query_stats AS qs
INNER JOIN sys.dm_exec_cached_plans p ON p.plan_handle = qs.plan_handle
OUTER APPLY sys.dm_exec_plan_attributes (p.plan_handle) AS pa
OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) AS txt
OUTER APPLY sys.dm_exec_query_plan (p.plan_handle) AS qp
WHERE pa.attribute = 'dbid' --retrieve only the database id from sys.dm_exec_plan_attributes
ORDER BY qs.total_worker_time + qs.total_elapsed_time DESC;

Note that the preceding query orders by a sum of the CPU time and duration, descending, returning the longest running queries first. You can adjust the ORDER BY and WHERE clauses in this query to hunt, for example, for the most CPU-intensive or most busy execution plans. Keep in mind that the Query Store feature, as detailed later in this chapter, will help you visualize the process of identifying the most expensive and longest running queries in cache.

As you can see in the previous query, you can retrieve a wealth of information from these five DMVs, including the statement within a batch that generated the query plan. The query plan appears as a blue hyperlink in SQL Server Management Studio’s Results To Grid mode; clicking it opens the plan graphically as a new .sqlplan file. You can save and store the .sqlplan file for later analysis.

Permissions required to access cached plan metadata

The only permission needed to run the previous query in SQL Server is the server-level VIEW SERVER STATE permission, which might be appropriate for developers to have in a production environment because it does not give them access to any data in user databases.

In Azure SQL Database, because of the differences between the Basic/Standard and Premium tiers, different permissions are needed. In the Basic/Standard tier, you must be the server admin or Azure Active Directory Admin to access objects that would usually require VIEW SERVER STATE. In Premium tier, you can grant VIEW DATABASE STATE in the intended database in Azure SQL Database to a user who needs permission to view the above DMVs.

Clearing the Procedure Cache

You might find that manually clearing the Procedure Cache is useful when performance testing or troubleshooting. Typically, you want to reserve this activity for nonproduction systems. There are a few strategies to clearing out cached plans in SQL Server.

To compare two versions of a query or the performance of a query with different indexes, you could clear the cached plan for the statement to allow for a proper comparison. You can manually flush the entire Procedure Cache, or individual plans in cache. The following database-scoped configuration command flushes the cache for the current database context only, as opposed to the entire instance’s Procedure Cache:

ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;

This command was introduced in SQL Server 2016 and is effectively the same as the command DBCC FREEPROCCACHE within the current database context. It works in both SQL Server and Azure SQL Database. DBCC FREEPROCCACHE is not supported in Azure SQL Database.

You can use DBCC FREEPROCCACHE to clear the procedure cache of the SQL Server instance.

You can also remove a single plan from cache by identifying its plan_handle and then providing it as the parameter to the DBCC FREEPROCCACHE function. Perhaps this is a plan you would like to remove for testing or troubleshooting purposes that you have identified with the script in the previous section:

DBCC FREEPROCCACHE (0x06000700CA920912307B867DB701000001000000000000000000000000000000000000000000000000000000);

You could alternatively flush the cache by object type. This command clears cached execution plans that are the result of ad hoc statements and prepared statements (from applications):

DBCC FREESYSTEMCACHE ('SQL Plans');

The advantage of this statement is that it does not wipe the cached plans from “Programmability” database objects such as stored procedures, multistatement table-valued functions, scalar user-defined functions, and triggers. The following command clears the cached plans from those types of objects:

DBCC FREESYSTEMCACHE ('Object Plans');

Note that DBCC FREESYSTEMCACHE is not supported in Azure SQL Database.

You can also use DBCC FREESYSTEMCACHE to clear cached plans associated with a specific Resource Governor resource pool, as follows:

DBCC FREESYSTEMCACHE ('SQL Plans', 'poolname');

Retrieving execution plans in SQL Server Management Studio

There are three basic types of graphical execution plans to retrieve for a statement: Estimated, Actual, and Live. Let’s review the differences, and how you can view them.

Displaying the estimated execution plan

You can generate the estimated execution plan quickly and view it graphically from within SQL Server Management Studio by choosing the Display Estimated Execution Plan option in the Query menu, or pressing Ctrl+L. An estimated execution plan will return for the highlighted region, or for the entire file if no text is selected.

You can also retrieve an estimated graphical execution plan in T-SQL code by running the following statement:

SET SHOWPLAN_XML ON

The estimated execution plan is returned as an XML string. In SQL Server Management Studio, in Grid mode, the results are displayed as a link. Click the link to open the plan graphically in SQL Server Management Studio. You can save the execution plan as a .sqlplan file by right-clicking in the neutral space of the plan window.
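For example, the following minimal sketch (using the WideWorldImporters sample database from earlier in this chapter) wraps a query between the ON and OFF settings. SET SHOWPLAN_XML must be the only statement in its batch, hence the GO separators:

SET SHOWPLAN_XML ON;
GO
SELECT TOP (10) PurchaseOrderID, OrderDate
FROM Purchasing.PurchaseOrders;
GO
SET SHOWPLAN_XML OFF;
GO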

You can also configure the estimated text execution plan in code by running one of the following statements, which return the execution plan in one result set or two, respectively:

SET SHOWPLAN_ALL ON
SET SHOWPLAN_TEXT ON

As expected, the estimated execution plan is not guaranteed to match the actual plan used when you run the statement, but it is a very reliable approximation. The query optimizer uses the same information for the estimate as it does for the actual plan when you run it.

One cause for any differences between the estimate and actual execution plans would be any reason for the plan to be recompiled between the estimate and actual plan generation, including if the plan was removed from the Procedure Cache.

To display information for individual steps, hover over a step in the execution plan. You can also click an object, and then open the Properties window by pressing F4 or, in the View menu, clicking Properties Window. You’ll notice the estimated execution plan is missing some information that the actual plan returns. The missing fields are self-explanatory; for example, Actual Number Of Rows, Actual Number Of Batches, and Number of Executions.

Displaying the actual execution plan

You can generate the actual execution plan along with the statement’s result set from within SQL Server Management Studio by choosing the Include Actual Execution Plan option in the Query menu, or pressing Ctrl+M to turn on the setting. After turning on this setting, when you run a statement, you will see an additional tab along with the execution results.

You’ll notice that returning the actual graphical execution plan adds some additional time to the execution. The actual execution plan will return as an additional tab in Management Studio.

You can also configure the actual graphical execution plan in T-SQL code, returning XML that can be viewed graphically in SQL Server Management Studio, by running the following statement:

SET STATISTICS XML ON

The actual execution plan is returned as an XML string. In SQL Server Management Studio, in Grid mode, the results display as a link.

Remember to turn off the SET STATISTICS option before you reuse the same session, if you don’t want to get back the actual plan for every query you run on this connection.
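For example, a minimal sketch, again assuming the WideWorldImporters sample database:

SET STATISTICS XML ON;
-- The query executes normally; the actual plan XML is returned as an additional result
SELECT i.InvoiceID
FROM Sales.Invoices AS i
WHERE i.CustomerID = 832;
SET STATISTICS XML OFF;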

You can save both the estimated and actual execution plans as a .sqlplan file by right-clicking the neutral space of the plan window.

Displaying live query statistics

You can generate and display a “live” version of the execution plan by using SQL Server Management Studio 2016 or later. Live statistics are available when connected to versions of SQL Server starting with SQL Server 2014. You turn on the Live Execution Statistics option in the Query menu of SQL Server Management Studio, as demonstrated in Figure 9-1.


Figure 9-1 The Query menu in SQL Server Management Studio, with the Include Live Query Statistics option highlighted.

The Live Query Statistics window displays a hybrid version of the Estimated and Actual execution plans while the query is processing. If your query runs too quickly, you’ll miss the dotted, moving lines and the various progress metrics, including duration for each step and overall percentage completion. The percentage is based on the actual rows processed so far versus the total number of rows expected for that step.

The Live Query Statistics contains more information than the Estimated execution plan, such as Actual Number Of Rows and Number Of Executions, but less than the Actual execution plan. Live Query Statistics does not display some data that the Actual execution plan provides, such as Actual Execution Mode, Number Of Rows Read, Actual Rebinds, and Actual Rewinds.

Notice in Figure 9-2 that returning the execution plan slows down the query, so be aware that the individual and overall execution durations measured will often be longer than when the query is run without the option to display Live Query Statistics.


Figure 9-2 Three different screenshots of the Live Query Statistics, moments apart.

You might see that the total rows to be processed does not match the Estimated Number Of Rows for that step; rather, it is the product of that step’s Estimated Number Of Rows and a preceding step’s Estimated Number Of Rows. In Figure 9-2, the number of rows estimated is less than the number of rows actually read.

Inside OUT

What’s the difference between “Number Of Rows Read” and “Actual Number Of Rows”?

This is an important distinction, and it can tip you off to a significant performance issue.

Both are “Actual” values, but Actual Number Of Rows contains the number of rows the operator actually returned, whereas Number Of Rows Read contains the number of rows the operator had to read to produce them. The difference can be significant to performance, and the solution is likely to change the query so that the predicate is narrower and/or better aligned with indexes on the table. Alternatively, you could add indexes to better fit the query predicates and make for more efficient searches.

One of the easiest ways to reproduce this behavior is with a wildcard search, for example in the WideWorldImporters sample database:

SELECT i.InvoiceID
FROM [Sales].[Invoices] as i
WHERE i.InvoiceID like '1%'

In the XML, in the node for the Index Scan, you will see:

<RunTimeInformation>
<RunTimeCountersPerThread Thread="0" ActualRows="11111" ActualRowsRead="70510"

Defined as “ActualRowsRead” in the XML of the plan, this value is displayed as “Number of Rows Read” in SQL Server Management Studio. Similarly, “ActualRows” is displayed as “Actual Number of Rows.”

Permissions necessary to view execution plans

The user must have permissions to actually run the query, even if they are generating only an Estimated execution plan.

Retrieving the Estimated or Actual execution plan requires the SHOWPLAN permission in each database referenced by the query. The Live Query Statistics feature requires SHOWPLAN in each database, plus the VIEW SERVER STATE permission to see live statistics.
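As a sketch, granting these permissions might look like the following. The principal names are hypothetical, and note that VIEW SERVER STATE is a server-level permission granted to a login rather than to a database user:

-- In each user database referenced by the queries
GRANT SHOWPLAN TO [DevUser];

-- At the server level (run in master), required for Live Query Statistics
GRANT VIEW SERVER STATE TO [DevLogin];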

It might be appropriate in your environment to grant SHOWPLAN and VIEW SERVER STATE permissions to developers. However, the permission to execute queries against the production database might not be appropriate in your environment. If that is the case, there are alternatives for providing valuable execution plan data to developers without production access:

  • Consider providing database developers with saved execution plan (.sqlplan) files for offline analysis.

  • Consider also configuring the dynamic data masking feature, which may already be appropriate in your environment for hiding sensitive or personally identifying information for users who are not sysadmins on the server. Do not provide UNMASK permission to developers; assign that only to application users.

For more information on dynamic data masking, see Chapter 7.

Using the Query Store feature

First introduced in SQL Server 2016, the Query Store provides a practical history of execution plan performance. It can be invaluable for the purposes of investigating and troubleshooting sudden negative changes in performance, by allowing the administrator or developer to identify high-cost queries and the quality of their execution plans.

The Query Store is most useful for looking back in time toward the history of statement execution. The Query Store can also assist in identifying and overriding execution plans by using a feature similar to but different from the legacy plan guides feature.

Inside OUT

How should I force a statement to use a certain execution plan?

Your options for forcing a statement to follow a certain execution plan are either the older plan guides stored procedures or the newer Query Store interface (and its underlying stored procedures).

Both options are advanced options for temporary or diagnostic use only. Overriding the query optimizer’s execution plan choice is an advanced performance tuning technique. It is most often necessitated by query parameter sniffing.

It is possible to create competing plan guides or Query Store forced plans. This is certainly not recommended because it could be extremely confusing. If you create competing plan guides and Query Store forced plans, it’s likely you’ll see the Query Store forced plan “win.”

In case you are troubleshooting competing plan guides and Query Store forced plans, you can view any existing plan guides and forced query plans with the following DMV queries:

SELECT * FROM sys.plan_guides

SELECT *
FROM    sys.query_store_query AS qsq
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
WHERE   qsp.is_forced_plan = 1;
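If you decide to force, or later unforce, a plan outside of the SQL Server Management Studio interface, you can use the underlying Query Store stored procedures, as in the following sketch; the query_id and plan_id values are placeholders that you would take from the views above:

EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 7;
-- Later, to stop forcing that plan:
EXEC sys.sp_query_store_unforce_plan @query_id = 42, @plan_id = 7;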

Finally, you could use the USE PLAN query hint to provide the entire XML query plan for any statement execution. This obviously is the least convenient option, and like other approaches that override the query analyzer, should be considered an advanced and temporary performance tuning technique.

Plan guides let you attach query hints to a statement without changing its text, sparing you an otherwise complicated manual scripting exercise.
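As a sketch of the plan guide approach, the following creates a plan guide that attaches a MAXDOP hint to a stand-alone statement; the guide name, statement text, and hint are illustrative, and the statement text must match the submitted batch exactly for the guide to apply:

EXEC sys.sp_create_plan_guide
    @name = N'Guide_Invoices_Maxdop1',
    @stmt = N'SELECT InvoiceID FROM Sales.Invoices WHERE CustomerID = 832;',
    @type = N'SQL',
    @module_or_batch = NULL,
    @params = NULL,
    @hints = N'OPTION (MAXDOP 1)';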

You see live Query Store data as it happens from a combination of both memory-optimized and on-disk sources. Query Store minimizes overhead and performance impact by capturing cached plan information to an in-memory data structure. The data is “flushed” to disk at an interval defined by the Data Flush Interval setting, 15 minutes by default. This interval also defines how much Query Store data could be lost in the event of an unexpected system shutdown.

The Query Store is a feature that Microsoft delivered to the Azure SQL Database platform first, and then to the SQL Server product. In fact, Query Store is at the heart of the Azure SQL Database Advisor feature which provides automatic query tuning. The Query Store feature’s overhead is quite manageable, tuned to avoid performance hits, and is already in place on millions of databases in Azure SQL Database.

The VIEW DATABASE STATE permission is all that is needed to view the Query Store data.

Initially configuring the query store

The Query Store feature is identical between the two platforms, except for its default activation. Query Store is turned on automatically in Azure SQL Database, but it is not automatically on for new databases in SQL Server 2017, and it is not a setting that can be inherited from the model database.

You should turn on the Query Store on new production databases in SQL Server 2017 when you anticipate doing any performance tuning. You can turn on Query Store via the database Properties dialog box, in which Query Store is a page on the menu on the left. Or, you can turn it on via T-SQL by using the following command:

ALTER DATABASE [DatabaseOne] SET QUERY_STORE = ON;

Keep in mind that Query Store begins collecting when you activate it. You will not have any historical data when you first turn on the feature on an existing database, but you will begin to immediately see data for live database activity.

The Query Store Capture Mode default setting of All includes all queries. You might soon realize that this setting does not filter out ad hoc queries, even if you selected the Optimize For Ad Hoc Workloads option in the system configuration. Change this setting to Auto, because the additional data of one-use plans might not be useful and can reduce the amount of historical data that can be retained.

The Query Store retains data up to two limits: a Max Size (500 MB by default), and a “Stale Query Threshold” time limit in days (30 by default). If the Query Store reaches its Max Size, it cleans up the oldest data. Because Query Store data is saved on disk, its historical data is not affected by the commands we looked at earlier in this chapter to clear the procedure cache, such as DBCC FREEPROCCACHE.

You should keep the Size Based Cleanup Mode set to the default Auto. If you don’t, when the Max Size is reached, the Query Store stops collecting data and enters a read-only mode. If you find that the Query Store is not retaining as many days of historical data as your Stale Query Threshold setting specifies, increase the Max Size setting.
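As a sketch, the same settings can also be changed with T-SQL; the values shown are illustrative rather than recommendations:

ALTER DATABASE [DatabaseOne] SET QUERY_STORE (
    QUERY_CAPTURE_MODE = AUTO,
    MAX_STORAGE_SIZE_MB = 1024,
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30),
    SIZE_BASED_CLEANUP_MODE = AUTO,
    DATA_FLUSH_INTERVAL_SECONDS = 900
);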

Using query store data in your troubleshooting

Query Store has several built-in dashboards, shown in Figure 9-3, to help you examine query performance and overall performance over recent history.


Figure 9-3 The SQL Server Object Explorer list of built-in dashboards available for Query Store in SQL Server Management Studio 2017.

With SQL Server Management Studio 2017, you can view more dashboards in SQL Server 2016 databases than you could in SQL Server Management Studio 2016, including Queries With Forced Plans and Queries With High Variation.

You can also write your own reports against the collection of system DMVs that present Query Store data to administrators and developers by using the VIEW DATABASE STATE permission. You can view the six-view schema of well-documented views and their relationships at https://docs.microsoft.com/sql/relational-databases/performance/how-query-store-collects-data#views.
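For example, a hedged starting point for a custom report, aggregating execution counts and average duration (in microseconds) per plan, might look like the following:

SELECT TOP (25)
       qt.query_sql_text,
       qp.plan_id,
       SUM(rs.count_executions) AS total_executions,
       AVG(rs.avg_duration)     AS avg_duration_us
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS qq
     ON qq.query_text_id = qt.query_text_id
JOIN sys.query_store_plan AS qp
     ON qp.query_id = qq.query_id
JOIN sys.query_store_runtime_stats AS rs
     ON rs.plan_id = qp.plan_id
GROUP BY qt.query_sql_text, qp.plan_id
ORDER BY AVG(rs.avg_duration) DESC;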

On many of the dashboards, there is a button with a crosshairs symbol, as depicted in Figure 9-4. If a query seems interesting, expensive, or is of high value to the business, you can click this button to view a new screen that tracks the query when it’s running as well as various plans identified for that query.


Figure 9-4 The Query Store tool bar at the top of the screen on many of the dashboards, in this example, the tool bar for the Regressed Queries report.

You can also review the various plans for the same statement, compare the plans, and if necessary, force your chosen plan into place. Compare the execution of each plan by CPU Time, Duration, Logical Reads, Logical Writes, Memory Consumption, and Physical Reads.

Most of all, the Query Store can be valuable by informing you when a query started using a new plan. You can see when a plan was generated and the nature of the plan; however, the cause of the plan’s creation and replacement is not easily answered, especially when you cannot correlate to a DDL operation. Query plans can become invalidated automatically due to large changes in statistics due to data inserts or deletes, changes made to other statements in the stored procedure, changes to any of the indexes used by the plan, or manual recompilation due to the RECOMPILE option.

Forcing a statement (see Figure 9-5) to use a specific execution plan via the Query Store is not a recommended common activity. You should use this only for specific performance cases, such as problematic queries demanding unusual plans, or as a workaround for otherwise unresolvable index or performance scenarios. Note that if the forced plan becomes invalid, such as when an index it uses is changed or dropped, SQL Server moves on without the forced plan and without warning or error, though the Query Store will still show that the plan is being forced for that statement.


Figure 9-5 The Query Store has recorded the execution results of the query. Note that one plan has been Forced (using the Force Plan button) for this statement and is displayed with a check mark.

Understanding automatic plan correction

SQL Server 2017 introduces a new automatic tuning feature, originally developed for the Azure SQL Database platform, that is capable of detecting and reverting plan regression. This capability is called automatic plan correction.

You could use Query Store in 2016 to identify a query that has regressed in performance, and manually force a past execution plan into use. Now in SQL Server 2017, the database can be configured to detect plan regression and take this action automatically. The sample syntax for enabling automatic plan correction is below:

ALTER DATABASE [WideWorldImporters] SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

Currently, FORCE_LAST_GOOD_PLAN is the only option for automatic plan tuning.

The DMV sys.dm_db_tuning_recommendations captures plan recommendations based on query performance regression. This doesn’t happen immediately – the feature has an algorithm that requires several executions before regression is identified. When a recommendation appears in sys.dm_db_tuning_recommendations, it includes a large amount of diagnostic data, including a plain-language “reason” explanation for the recommendation to be generated, and a block of JSON data containing diagnostic information. A sample query to parse this data is available at https://docs.microsoft.com/sql/relational-databases/automatic-tuning/automatic-tuning.
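Even before parsing the JSON, a quick hedged look at any current recommendations can be as simple as the following:

SELECT name, reason, score, state, details
FROM sys.dm_db_tuning_recommendations;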

Understanding execution plan operators

After you have a graphical execution plan in front of you, you can begin to understand how the statement is processed.

To display information for individual steps, position your pointer over a step in the execution plan. You can also click an object, and then open the Properties window by pressing F4 or, in the View menu, clicking Properties Window. You’ll notice that the information returned includes both estimated and actual values for some metrics, including Number Of Rows and Number Of Executions. Look for differences here; they can indicate an inefficient execution plan and the source of a poorly performing query. Your query might be suffering from a poorly chosen plan because of the impact of parameter sniffing or stale, inaccurate index statistics. (We discussed parameter sniffing earlier in this chapter, and discuss index statistics in Chapter 10.)

However, notice that some values, like Cost information, contain only estimated values, even when you are viewing the Actual execution plan. This is because the operator costs aren’t sourced separately; they are generated the same way for both Estimated and Actual plans, and do not change based on statement execution. Furthermore, cost is not based solely on duration. You might find that some statements far exceed others in terms of duration, but not in cost.

There are even known plan presentation issues (as recent as SQL Server Management Studio 17.1) that might sometimes result in a sum of Operator Costs that do not add up to 100 percent, specifically in the presence of the concatenation operator.

Interpreting graphical execution plans

In the next list, we review some of the most common things to look for as you review execution plans in SQL Server Management Studio. You can also choose to review execution plans with a well-known third-party tool called Plan Explorer, which is a free download from https://www.sentryone.com/.

In this section, it is assumed that you will have access to the Actual execution plan, as not all the information within will exist in the Estimated plan.

Start in the upper left

The upper-left operator will reflect the basic operation that the statement performed. For example, Select, Delete, Update, or Insert for DML statements. This operator might contain warnings or other items that require your immediate attention. These might show up with a small yellow triangle warning icon, with additional detail when you position your pointer on the operator.

Click the upper-left operator, and then press F4 to open the Properties window, or open the Properties window from the View menu in SQL Server Management Studio. In this list are a couple of other things to look for. You’ll see warnings repeated here, along with additional aggregate information.

Look also for the Optimization Level, which ideally says FULL. If the Optimization Level is TRIVIAL, the plan bypassed full cost-based optimization because the statement was too straightforward, perhaps containing only a simple Scan or Seek as the only other operator. If the value is neither FULL nor TRIVIAL, this is something to investigate.

Look next for the presence of a value for Reason For Early Termination, which indicates the query optimizer spent too long attempting to build the perfect execution plan and gave up, sometimes literally returning the self-explanatory value Good Enough Plan Found. If the reason is Time Out, the optimizer tried as many plan alternatives as it could before time ran out and took the best plan available, which might not be “good enough.” If you see this case, consider simplifying the query, especially by reducing the use of functions, and potentially by modifying the underlying indexes. Finally, if you see the reason Memory Limit Exceeded, this is a rare and critical error indicating severe memory pressure on the SQL Server instance.

In the Query Cached Plan Stats script sample shown in the section "Analyzing cached execution plans in aggregate" earlier in this chapter, in which we queried the procedure cache for plan statistics, you can add some code to search only for queries that have a Reason For Early Termination. In the execution plan XML, the Reason For Early Termination will show in a node StatementOptmEarlyAbortReason. Before the WHERE clause, add this line:

CROSS APPLY sys.dm_exec_text_query_plan(p.plan_handle, qs.statement_start_offset,
qs.statement_end_offset) AS tqp

And before the ORDER BY in the script, add this line:

and tqp.query_plan LIKE '%StatementOptmEarlyAbortReason%'

Next, scroll right, then read from right to left

Graphical execution plans build from sources (rightmost objects), and apply operators to join, sort, and filter data from right to left, eventually arriving at the leftmost operator. In the rightmost objects, you’ll see Scans, Seeks, and Lookups of different types. You might find some quick, straightforward insight into how the query is using indexes.

Seek operations are best when you’re looking for a needle or needles in a much larger haystack. They are generally the most efficient operators to see, and can rarely be improved by additional indexes. Keep an eye out for Seeks that are accompanied by Lookups, however. They’ll likely appear one on top of the other in the graphical execution plan. Lookups indicate that although the optimizer used a seek, it needed a second pass at the table in the form of a Lookup on another object, perhaps the clustered index. Key Lookups (on clustered indexes) and RID Lookups (on heaps) are expensive and inefficient, and likely can be eliminated from the execution plan with a modification to an existing nonclustered index. Lookups are very efficient when retrieving a small number of rows, but very inefficient for larger numbers of rows. In high-cost or high-importance queries, Key Lookups can represent a significant cost, one that is easily resolvable with a nonclustered index.

For an example, see the section “Designing nonclustered indexes” in Chapter 10.

Scan operations aren’t great unless your query intentionally returns most of the rows from a table. Scans are in fact the most efficient option when an index does not provide an ordered dataset, but keep in mind that they read all rows from the index. Without a nonclustered index with a well-designed key to enable a seek for the query, a scan might be the query optimizer’s only option. Scans on nonclustered indexes are often better than scans of clustered indexes, in part due to what is likely a smaller key size. Test and compare the performance of a new or updated nonclustered index, created based on the predicates and outputs of Index Scans and Clustered Index Scans.
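For example, a hypothetical covering nonclustered index in the WideWorldImporters sample database might replace a Clustered Index Scan, or an Index Seek plus Key Lookup, with a single Index Seek. The key and included columns here are illustrative and should be derived from the query’s predicates and outputs:

-- Hypothetical covering index: CustomerID is the seek key, other output columns are included
CREATE NONCLUSTERED INDEX IX_Invoices_CustomerID_Covering
ON Sales.Invoices (CustomerID)
INCLUDE (InvoiceDate, DeliveryMethodID);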

Other types of scans include the following:

  • Table Scans. These indicate that the table has no clustered index. We discuss why this is probably not a good idea in Chapter 10.

  • Remote Scans. This includes any object that is preceded by “remote,” which is the same operation but over a linked server connection. Troubleshoot them the same way, but potentially by making changes to the remote server instead.

  • Constant Scans. These appear when the query optimizer deals with scalar values, repeated numbers, and other “constants.” These are necessary operators for certain tasks and generally not actionable from a performance standpoint.

  • Columnstore Index Scans. These are incredibly efficient operators, and likely will outperform a Clustered Index Scan or Index Seek where millions of rows, for example, must be aggregated. No need to create a nonclustered index to replace this operator.

The weight of the lines connecting operators isn’t the full story

SQL Server dynamically changes the thickness of the gray lines to reflect the Actual Number Of Rows. You can get a visual idea of where the bulk of data is coming from by observing the pipes, drawing your attention to the places where performance tuning could have the biggest impact.

The visual weight and the sheer number of rows does not directly translate to cost, however. Look for where the pipe weight changes from light to heavy, or vice versa. Be aware of when thick pipes are joined or sorted.

Operator cost share isn’t the full story either

When you run multiple queries, each query’s cost relative to the batch is displayed in the execution plan header, and within each plan, each operator’s cost relative to the rest of the statement is displayed. SQL Server uses a cost-based process to decide which query plan to use. Deciding to address only the highest-cost single operator in the execution plan might be a dead end, but generally you will find the highest-cost operators on the rightmost side of the execution plan.

In Figure 9-6, we can see that operator cost might not align with the amount of data. You should investigate performance tuning this execution plan using all of the information provided.


Figure 9-6 In this snippet from an execution plan, much of the cost is associated with the top operator, but more rows are moving on from the bottom operator.

Look for Join operators and understand the different algorithms

As you read from right to left, in a query of any complexity, you’ll likely see the paths meet at a join operator. Two tables can be joined, obviously, but different indexes on the same table can also meet in a join operator. If you find that a large portion of the cost of an execution plan is spent in a Hash Match, Hash Join, Merge Join, or Nested Loops operator, take a look at what is being joined.

The Hash operators have the most overhead, with a temporary hash table created to bucketize and match row data. Merge Joins are best for inputs that are already ordered; the operator streams processed data as it receives it. Nested Loops aren’t as bad as they sound, but they are essentially a row-by-row comparison of one rowset against another. This can be very efficient for small, indexed datasets.

Each of the following could reduce the cost of a join operator:

  • There may be an opportunity to improve the indexing on the columns being joined, or perhaps you have a join on a compound key that is incompletely defined. Perhaps you are unintentionally omitting part of the join key in the ON or WHERE clause of the query.

  • In the case of a Merge Join, you may see a preceding Sort operator. This could be an opportunity to present the data already sorted according to how the Merge Join requires the data to be sorted. Perhaps changing the ASC/DESC property or the order of index key columns could remove the Sort operator.

  • Make sure that you are filtering at the lowest level possible. Perhaps a WHERE clause could exist in a subquery instead of at the top level of the query, or in the definition of a common table expression (CTE) instead of in the lower query.

  • Hash Match and Hash Join operators are the most expensive, but are typically the most efficient for joining two large row sets, especially large unsorted datasets. Reducing the row counts going into the Hash Match or Hash Join could allow the query optimizer to use a less memory-intensive and less costly join operator. You could accomplish this perhaps by adding or modifying nonclustered indexes to eliminate Scan operators in favor of Seek operators.

  • Nested Loops are often necessitated by Key Lookups and can sometimes be quite costly. They might no longer be necessary if a new nonclustered index is added to address the Key Lookup and make an accompanying Index Seek more capable.

Look for Parallel icons

The left-pointing pair of arrows in a yellow circle shown in Figure 9-7 indicates that your query has been run with a parallel-processing execution plan. We talk more about parallelism later in this chapter, but the important thing here is to be aware that your query has gone parallel.


Figure 9-7 The parallel indicator on a Clustered Index Scan operator.

This doesn’t mean that multiple sources or pipes are being read in parallel; rather, the work for individual tasks has been broken up behind the scenes. The query optimizer decided it was faster to split up your workload and run it in multiple parallel streams of records.

You might see one of the three different Parallelism operators—the distribute streams, gather streams, and repartition streams operators—each of which appear only for parallel execution plans.

Forcing a parallel execution plan

New to SQL Server 2017 (and also implemented in SQL Server 2016 CU2) is a query hint that can force a statement to compile with a parallel execution plan. This can be valuable in troubleshooting, or to force a behavior in the query optimizer, but is not usually a necessary or recommended option.

Appending the following hint to a query will force a parallel execution plan, which you can see using the Estimate or Actual execution plan output options:

OPTION(USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'));
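For example, appended to a query against the WideWorldImporters sample database:

SELECT COUNT(*)
FROM Sales.Invoices AS i
JOIN Sales.InvoiceLines AS il
     ON il.InvoiceID = i.InvoiceID
OPTION (USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'));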

Understanding parallelism

We mentioned parallelism in execution plans earlier in this chapter. When SQL Server decides to split and stream data needed for requests into multiple threads, it uses more than one processor to get the job done. The number of different parallel threads used for the query is called the degree of parallelism. Because parallelism can never exceed the number of logical processors, naturally the maximum degree of parallelism (MAXDOP) is capped.

The default MAXDOP setting of 0 (allowing all processors to be used in a single statement) allows SQL Server to “go parallel” at will, and, sometimes, to a fault. Although queries may perform fastest in a vacuum going massively parallel, at scale the overuse of parallelism creates a multithreaded bottleneck. Split into too many different parts, queries slow down en masse as CPU utilization rises and SQL Server records increasing values in the CXPACKET wait type.

We talk about CXPACKET here, but for more about wait type statistics, see Chapter 13.

Before SQL Server 2016, MAXDOP was a server-level setting, overridden at the query level with a hint, or enforced selectively for sessions via Resource Governor, an Enterprise edition feature. Starting with SQL Server 2016, the MAXDOP setting is also available as a database-scoped configuration. You can also use the MAXDOP query hint in any statement to override the database-level or server-level MAXDOP setting.
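As a sketch, here are the database-scoped setting and a per-statement override; the values are illustrative:

-- Database-scoped MAXDOP (SQL Server 2016 and later)
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 4;

-- Per-statement override via query hint
SELECT COUNT(*) FROM Sales.Invoices OPTION (MAXDOP 2);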

Another limit to parallelism, called the Cost Threshold for Parallelism (CTFP), enforces a minimum bar for query cost before a query can use a parallel execution plan. The higher the threshold, the fewer queries go parallel. This setting is fairly low by default, but its proper setting in your environment depends heavily on the workload and processor count. More expensive queries usually benefit from parallelism more than simpler queries, so limiting the use of parallelism to the worst queries in your workload can help. Similarly, setting the CTFP too high carries an opportunity cost: performance is limited, queries execute serially, and CPU cores go underutilized. The CTFP is a server-level setting only.
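As a sketch, the CTFP is changed at the server level with sp_configure; the value 50 below is illustrative, not a recommendation for your workload:

EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;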

If large queries are already a problem for performance and multiple large queries regularly run simultaneously, raising the CTFP might not solve the problem. In addition to the obvious solutions of query tuning and index changes, including the introduction of Columnstore indexes, use MAXDOP instead to limit very large queries.

When the CXPACKET wait is the predominant wait type experienced over time by your SQL Server, both MAXDOP and CTFP are dials to turn when performance tuning. You can also view the live and last wait types for a request using sys.dm_exec_requests. Make these changes in small, measured gestures, and don’t overreact to performance problems with a small number of queries. Use the Query Store to benchmark and trend the performance of high-value and high-cost queries as you change configuration settings.
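For example, a quick hedged check of the current and last wait types for active requests:

SELECT session_id, status, command, wait_type, last_wait_type
FROM sys.dm_exec_requests
WHERE session_id > 50; -- filter out most system sessions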

Another flavor of CPU pressure, and in some ways the opposite of the CXPACKET wait type, is the SOS_SCHEDULER_YIELD wait type. SOS_SCHEDULER_YIELD is an indicator of CPU pressure, indicating that SQL Server had to share time with, or “yield” to, other CPU tasks, which may be normal and expected on busy servers. Whereas CXPACKET is SQL Server complaining about too many threads in parallel, SOS_SCHEDULER_YIELD is the acknowledgement that there were more runnable tasks than available threads. In either case, first take a strategy of reducing CPU-intensive queries and rescheduling or optimizing CPU-intensive maintenance operations. This is more economical than simply adding CPU capacity.

Inside OUT

How can I reduce the processor utilization during maintenance operations?

If processor utilization spikes during maintenance operations such as index maintenance or integrity checks, you can force these operations to run serially. Although this can increase the duration of the maintenance, other queries should be less negatively affected.

You can use the MAXDOP option in index maintenance statements to force index rebuild steps to run serially. Combined with the ONLINE option, an Enterprise edition feature, your scripted index maintenance might run longer but have a minimal impact on concurrent queries. You can also specify MAXDOP when creating indexes. You cannot specify MAXDOP for the reorganize step.

ALTER INDEX ALL ON WideWorldImporters.Sales.Invoices REBUILD
WITH (MAXDOP = 1, ONLINE = ON);

You can also turn on trace flag 2528 to disable parallelism server-wide for DBCC CHECKDB, DBCC CHECKFILEGROUP, and DBCC CHECKTABLE operations. Keep in mind these operations can take hours to complete on large databases, and might run longer if single-threaded.
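For example, a sketch of turning on the trace flag globally until the next instance restart:

DBCC TRACEON (2528, -1);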
