Chapter 13. Deadlock Analysis

When a deadlock occurs between two or more transactions, SQL Server allows one transaction to complete and terminates the other transaction. SQL Server then returns an error to the corresponding application, notifying the user that they have been chosen as a deadlock victim. This leaves the application with only two options: resubmit the transaction or apologize to the end user. To successfully complete a transaction and avoid the apologies, it is important to understand what might cause a deadlock and the ways to handle a deadlock.

In this chapter, I cover the following topics:

  • Deadlock fundamentals

  • Error handling to catch a deadlock

  • Ways to analyze the cause of a deadlock

  • Techniques to resolve a deadlock

Deadlock Fundamentals

A deadlock is a special blocking scenario in which two sessions get blocked by each other. Each session, while holding its own resources, attempts to access a resource that is locked by the other session. This will lead to a circular blocking scenario, also known as a deadly embrace, as illustrated in Figure 13-1.

Deadlocks also frequently occur when two processes attempt to escalate their locking mechanisms on the same resource. In this case, each of the two processes has a shared lock on a resource, such as an RID, and they both attempt to promote the lock from shared to exclusive, but neither can do so until the other releases its shared lock. This too leads to one of the processes being chosen as a deadlock victim.

Deadlocks are an especially nasty type of blocking, because a deadlock cannot resolve on its own, even if given an unlimited period of time. A deadlock requires an external process to break the circular blocking.

Circular blocking scenario

Figure 13.1. Circular blocking scenario

SQL Server has a deadlock detection routine, called a lock monitor, that regularly checks for the presence of deadlocks in SQL Server. Once a deadlock condition is detected, SQL Server selects one of the sessions participating in the deadlock as a victim to break the circular blocking. This process involves withdrawing all the resources held by the victim session. SQL Server does so by rolling back the uncommitted transaction of the session picked as a victim.

Choosing the Deadlock Victim

SQL Server determines the session to be a deadlock victim by evaluating the cost of undoing the transaction of the participating sessions, and it selects the one with the least cost. You can exercise some control over the session to be chosen as a victim by setting the deadlock priority of its connection to LOW:

SET DEADLOCK_PRIORITY LOW

This steers SQL Server toward choosing this particular session as a victim in the event of a deadlock. You can reset the deadlock priority of the connection to its normal value by executing the following SET statement:

SET DEADLOCK_PRIORITY NORMAL

The SET statement allows you to mark a session as a HIGH deadlock priority too. This won't prevent deadlocks on a given session, but it will reduce the likelihood of a given session being picked as the victim. You can even set the priority level to a number value from −10 for the lowest priority to 10 for the highest.

In the event of a tie, one of the processes is chosen as a victim and rolled back as if it had the least cost. Some processes are invulnerable to being picked as a deadlock victim. These processes are marked as such and will never be chosen as a deadlock victim. The most common example I've seen are processes already involved in a rollback.

Using Error Handling to Catch a Deadlock

When SQL Server chooses a session as a victim, it raises an error with the error number. You can use the TRY/CATCH construct within T-SQL to handle the error. SQL Server ensures the consistency of the database by automatically rolling back the transaction of the victim session. The rollback ensures that the session is back to the same state it was in before the start of its transaction. On determining a deadlock situation in the error handler, it is possible to attempt to restart the transaction within T-SQL a number of times before returning the error to the application.

Take the following T-SQL statement as an example of one method for handling a deadlock error (trap_sample.sql in the download):

DECLARE @retry AS TINYINT = 1
   ,@retrymax AS TINYINT = 2
   ,@retrycount AS TINYINT = 0 ;

WHILE @retry = 1
    AND @retrycount <= @retrymax
    BEGIN
        SET @retry = 0 ;
        BEGIN TRY
            UPDATE  HumanResources.Employee
            SET     LoginID = '54321'
            WHERE   BusinessEntityID = 100 ;
        END TRY
        BEGIN CATCH
            IF (ERROR_NUMBER() = 1205)
                BEGIN
                    SET @retrycount = @retrycount + 1 ;
                    SET @retry = 1 ;
                END
        END CATCH
    END

The TRY/CATCH methodology allows you to capture errors. You can then check the error number using the ERROR_NUMBER() function to determine whether you have a deadlock. Once a deadlock is established, it's possible to try restarting the transaction a set number of times, in this case, two. Using error trapping will help your application deal with intermittent or occasional deadlocks, but the best approach is to analyze the cause of the deadlock and resolve it, if possible.

Deadlock Analysis

You can sometimes prevent a deadlock from happening by analyzing the causes. You need the following information to do this:

  • The sessions participating in the deadlock

  • The resources involved in the deadlock

  • The queries executed by the sessions

Collecting Deadlock Information

You can collect the deadlock information three ways: using a specific trace event through the Profiler tool, setting trace flag 1222, and setting trace flag 1204. Trace flags are used to customize certain SQL Server behavior such as, in this case, generating the deadlock information.

Profiler provides information on deadlocks through the Locks:Deadlock graph event. The deadlock graph generates XML output in the TextData field on a trace. After the trace has captured the deadlock events, you can view them within the Profiler tool by bringing up that particular event. You also have the option of exporting an individual event or all deadlock events to a file.

You can open the deadlock graph in Management Studio or Profiler. You can search the XML, but the deadlock graph generated from the XML works almost like an execution plan for deadlocks, as shown in Figure 13-2.

A deadlock graph as displayed in the Profiler

Figure 13.2. A deadlock graph as displayed in the Profiler

I'll show you how to use this in the "Analyzing the Deadlock" section.

The two trace flags that generate deadlock information can be used together to generate two sets of information. Usually people will prefer to run one or the other. Trace flag 1222 provides the most detailed information on the deadlock. The trace flags write the information gathered into the log file on the server where the deadlock event occurred.

Trace flag 1204 provides detailed deadlock information that helps you analyze the cause of a deadlock. It sorts the information by each of the nodes involved in the deadlock. Trace flag 1222 also provides detailed deadlock information, but it breaks the information down differently. Trace flag 1222 sorts the information by resource and by processes and provides even more information. Details on both sets of data will be shown in the "Analyzing the Deadlock" section.

The DBCC TRACEON statement is used to turn on (or enable) the trace flags. A trace flag remains enabled until it is disabled using the DBCC TRACEOFF statement. If the server is restarted, this trace flag will be cleared. You can determine the status of a trace flag using the DBCC TRACESTATUS statement. Setting both of the deadlock trace flags looks like this:

DBCC TRACEON (1222, −1);
DBCC TRACEON (1204, −1);

To ensure that the trace flags are always set, it is possible to make them part of the SQL Server startup in the SQL Server Configuration Manager by following these steps:

  1. Open the Properties dialog box of the instance of SQL Server.

  2. Switch to the Advanced tab of the Properties dialog box, and find Startup Parameters, as shown in Figure 13-3.

    SQL Server instance's Properties dialog box showing the server's advanced properties

    Figure 13.3. SQL Server instance's Properties dialog box showing the server's advanced properties

  3. Type ;T-1204; T-1222 in the Startup Parameter text box, and click Add to add trace flag 1204, as shown in Figure 13-4. This will set both. I recommend just using trace flag 1222.

    SQL Server startup parameters settings

    Figure 13.4. SQL Server startup parameters settings

  4. Click the OK button to close all the dialog boxes.

    These trace flag settings will be in effect after you restart your SQL Server instance.

Analyzing the Deadlock

To analyze the cause of a deadlock, let's consider a straightforward little example. First, make sure you've turned on the deadlock trace flags, 1204 and 1222, and created a trace that uses the deadlock graph event. In one connection, execute this script (deadlock_1.sql in the download):

BEGIN TRAN
UPDATE  Purchasing.PurchaseOrderHeader
SET     Freight = Freight * 0.9 -- 10% discount on shipping
WHERE   PurchaseOrderID = 1255 ;

In a second connection, execute this script (deadlock_2.sql in the download):

BEGIN TRANSACTION
UPDATE  Purchasing.PurchaseOrderDetail
SET     OrderQty = 2
WHERE   ProductID = 448
        AND PurchaseOrderID = 1255 ;

Each of these opens a transaction and manipulates data, but neither one commits or rolls back the transaction. Switch back to the first transaction, and run this query:

UPDATE  Purchasing.PurchaseOrderDetail
SET     OrderQty = 4
WHERE   ProductID = 448
        AND PurchaseOrderID = 1255 ;

Unfortunately, after possibly a few seconds, the first connection faces a deadlock:

Msg 1205, Level 13, State 51, Line 1
Transaction (Process ID 54) was deadlocked on lock resources with another process
and has been chosen as the deadlock victim. Rerun the transaction.

Any idea what's wrong here?

Let's analyze the deadlock by first examining the deadlock graph collected through the trace event (see Figure 13-5).

A deadlock graph displayed in the Profiler tool

Figure 13.5. A deadlock graph displayed in the Profiler tool

From the deadlock graph displayed in Figure 13-5, it's fairly clear that two processes were involved, session 54 and session 55. Session 54, the one with the big blue X crossing it out, was chosen as the deadlock victim. Two different keys were in question. The top key was owned by session 55, as demonstrated by the arrow pointing to the session object, named Owner Mode, and marked with an X for exclusive. Session 54 was attempting to request the same key for an update. The other key was owned by session 54 with session 55 requesting an update. You can see the exact HoBt ID, object ID, and index name for the objects in question for the deadlock. For a classic, simple deadlock like this, you have most of the information you need. The last piece would be the queries running from each process. These would need to be captured using a different Profiler event.

This visual representation of the deadlock can do the job, but to really understand exactly where deadlocks occurred, what processes caused them, and which objects were involved, the best tool is still the trace flag. As I mentioned earlier, I prefer using trace flag 1222.

Trace flag 1222 also generates deadlock information and stores it in the error log. Table 13-1 shows each statement from trace flag 1222 and the information it represents.

Table 13.1. Trace Flag 1222 Data

Entry in Log

Description

deadlock-list

The beginning of the deadlock information.

deadlock victim=process502d720

Physical memory address of the process picked to be the deadlock victim.

process-list

Processes that define the deadlock victim.

process id=process502d720 taskpriority=0
logused=392 waitresource=KEY:
14:72057594046578688 (f800e6f77703)
waittime=4907 ownerId=57602
transactionname=user_transaction
lasttranstarted=2008-12-05T23:20:55.960
XDES=0x7eb0c10 lockMode=U   schedulerid=2
kpid=3444 status=suspended spid=54
sbid=0 ecid=0 priority=0 trancount=2
lastbatchstarted=2008-12-05T23:21:15.600
lastbatchcompleted=2008-12-05T23:21:15.600
lastattention=2008-12-05T23:07:00.080
clientapp=Microsoft SQL Server   Management
Studio - Query hostname=FRITCHEYGXP
hostpid=2724 loginname=CORPfritcheyg
isolationlevel=read committed (2) xactid=57602
currentdb=14 lockTimeout=4294967295
clientoption1=671098976 clientoption2=390200

All the information about the session picked as the deadlock victim. Note the highlighted isolation level, which is a key for helping identify the root cause of a deadlock.

executionStack

T-SQL that was being executed.

frame procname=adhoc line=1 stmtstart=64
sqlhandle=0x02000000d0c7f31a30fb1ad425c34357fe
8ef6326793e7aa

The type of query being executed, in this case ad hoc.

UPDATE   [Purchasing].[PurchaseOrderDetail]
set [OrderQty] = @1  WHERE [ProductID]=@2 AND
[PurchaseOrderID]=@3

The T-SQL statement generated for the execution plan.

frame procname=adhoc line=1 sqlhandle=0x020000
001472d506ef788e5357d6da47e2652f0e385d481d

The next statement in the batch.

UPDATE    Purchasing.PurchaseOrderDetail
SET     OrderQty = 4
WHERE   ProductID = 448
AND PurchaseOrderID = 1255 ;

The query with values.

Inputbuf

The statements for the current batch.

UPDATE    Purchasing.PurchaseOrderDetail
SET     OrderQty = 4
WHERE   ProductID = 448
AND PurchaseOrderID = 1255 ;

The text from the execution plan showing the precise values used by the statement defined earlier.

process  id=processb0fab0 taskpriority=0
logused=10124 waitresource=KEY:
14:72057594046644224 (e700bef11a92)
waittime=21088 ownerId=57607
transactionname=user_transaction
lasttranstarted=2008-12-05T23:20:59.410
XDES=0x7eb1640 lockMode=U   schedulerid=2
kpid=4248 status=suspended spid=53
sbid=0 ecid=0 priority=0 trancount=2
lastbatchstarted=2008-12-05T23:20:59.410
lastbatchcompleted=2008-12-05T23:20:59.410
clientapp=Microsoft SQL Server   Management
Studio - Query hostname=FRITCHEYGXP
hostpid=2724 loginname=CORPfritcheyg
isolationlevel=read committed (2) xactid=57607
currentdb=14   lockTimeout=4294967295
clientoption1=673327200 clientoption2=390200

The next process in the deadlock. This is the process that succeeded.

executionStack

 
frame   procname=AdventureWorks2008.Purchasing.
uPurchaseOrderDetail line=39   stmtstart=2732
stmtend=3864   sqlhandle=0x03000e00203bb566d83ad
7004f9b00000000000000000000

Note the procname value, AdventureWorks2008.Purchasing. uPurchaseOrderDetail. This is a trigger fired by the following code.

UPDATE   [Purchasing].[PurchaseOrderHeader]
SET [Purchasing].[PurchaseOrderHeader].
[SubTotal] =
(SELECT SUM([Purchasing].[PurchaseOrderDetail].
[LineTotal])
FROM [Purchasing].[PurchaseOrderDetail]
WHERE [Purchasing].[PurchaseOrderHeader].
[PurchaseOrderID]
= [Purchasing].[PurchaseOrderDetail].
[PurchaseOrderID])
WHERE [Purchasing].[PurchaseOrderHeader].
[PurchaseOrderID]
IN (SELECT inserted.[PurchaseOrderID] FROM
inserted);

The T-SQL executed by the trigger.

frame   procname=adhoc line=2 stmtstart=64
sqlhandle=0x02000000d0c7f31a30fb1ad425c34357fe
8ef6326793e7aa

The ad hoc code defined in the listings being run.

UPDATE   [Purchasing].[PurchaseOrderDetail]
set [OrderQty] = @1  WHERE [ProductID]=@2 AND
[PurchaseOrderID]=@3

Again, this is the code as presented to the optimizer including parameterization.

frame   procname=adhoc line=2 stmtstart=38
sqlhandle=0x020000008a87202a1b402d77071c771ac5
d1b58e896fbf3f

Definition of the T-SQL.

UPDATE    Purchasing.PurchaseOrderDetail
SET     OrderQty = 2
WHERE   ProductID = 448
AND PurchaseOrderID = 1255 ;

T-SQL, including the actual values. This would be the cause of firing of the trigger.

Inputbuf

 
BEGIN   TRANSACTION
UPDATE  Purchasing.PurchaseOrderDetail
SET     OrderQty = 2
WHERE   ProductID = 448
AND PurchaseOrderID = 1255 ;

The query with the actual values passed.

resource-list

The objects that caused the conflict.

keylock   hobtid=72057594046578688 dbid=14
objectname=AdventureWorks2008.Purchasing.
PurchaseOrderDetail indexname=PK_
PurchaseOrderDetail_PurchaseOrderID_
PurchaseOrderDetailID id=lock48b7cc0 mode=X
associatedObjectId=72057594046578688

Definition of the primary key from the Purchasing.PurchaseOrderDetail table.

owner-list

The process that owned the object.

owner id=processb0fab0 mode=X

Process definition.

waiter-list

The process that was waiting for access to the object, which is the deadlock victim.

waiter id=process502d720 mode=U
requestType=wait

Additional detail about the waiting process.

keylock hobtid=72057594046644224 dbid=14
objectname=AdventureWorks2008.
Purchasing.PurchaseOrderHeader indexname=PK_
PurchaseOrderHeader_PurchaseOrderID
id=lock48b9100 mode=X   associatedObjectId=
72057594046644224

The Purchasing.PurchaseOrderHeader clustered index where the conflict occurred.

owner-list

Owner of the object.

owner id=process502d720 mode=X

Owner definition. Note this was the process waiting for access to the Purchasing.PurchaseOrderDetail table.

waiter-list

Process waiting for access.

waiter id=processb0fab0 mode=U
requestType=wait

Process waiting for access. As you can see, this process owned the lock on the Purchasing.PurchaseOrderDetail table.

This information is a bit more difficult to read through than the clean set of data provided by the deadlock graph. However, it is a very similar set of information. You can see, highlighted in bold near the bottom, the definition of one of the keys associated with the deadlock. You can also see, just before it, that the text of the execution plans is available through trace flag 1222, unlike the deadlock graph. In this case, you are much more likely to have everything you need to isolate the cause of the deadlock.

The information collected by trace flag 1204 is completely different from either of the other two sets of data and doesn't provide nearly as much detail as trace flag 1222. Trace flag 1204 is also much more difficult to interpret. For all these reasons, I suggest you stick to using trace flag 1222 to capture deadlock data.

This example demonstrated a classic circular reference. Although not immediately obvious, the deadlock was caused by a trigger on the Purchasing.PurchaseOrderDetail table. When Quantity is updated on the Purchasing.PurchaseOrderDetail table, it attempts to update the Purchasing.PurchaseOrderHeader table. When the first two queries are run, each within an open transaction, it's just a blocking situation. The second query is waiting on the first to clear so that it can also update the Purchasing.PurchaseOrderHeader table. But when the third query, the second within the first transaction, is introduced, now a circular reference is in place, and the only way to resolve it is to kill one of the processes.

Before proceeding, be sure to roll back the open transaction.

The obvious question at this stage is, can you avoid this deadlock? If the answer is yes, then how?

Avoiding Deadlocks

The methods for avoiding a deadlock scenario depend upon the nature of the deadlock. The following are some of the techniques you can use to avoid a deadlock:

  • Accessing resources in the same chronological order

  • Decreasing the locking

  • Minimizing lock contention

Accessing Resources in the Same Chronological Order

One of the most commonly adopted techniques in avoiding a deadlock is to ensure that every transaction accesses the resources in the same physical order. For instance, suppose that two transactions need to access two resources. If each transaction accesses the resources in the same physical order, then the first transaction will successfully acquire locks on the resources without being blocked by the second transaction. The second transaction will be blocked by the first while trying to acquire a lock on the first resource. This will cause a typical blocking scenario without leading to a circular blocking.

If the resources are not accessed in the same physical order, as follows (and as demonstrated in the earlier deadlock analysis example), this can cause a circular blocking between the two transactions:

  • Transaction 1:

    • Access Resource 1

    • Access Resource 2

  • Transaction 2:

    • Access Resource 2

    • Access Resource 1

In the current deadlock scenario, the following resources are involved in the deadlock:

  • Resource 1, hobtid=72057594046578688: This is the index row within index PK_ PurchaseOrderDetail_PurchaseOrderId_PurchaseOrderDetailId on table Purchasing.PurchaseOrderDetail.

  • Resource 2, hobtid=72057594046644224: This is the row within clustered index PK_ PurchaseOrderHeader_PurchaseOrderId on table Purchasing.PurchaseOrderHeader.

Both sessions attempt to access the resource; unfortunately, the order in which they access the key are different.

Decreasing the Number of Resources Accessed

A deadlock involves at least two resources. A session holds the first resource and then requests the second resource. The other session holds the second resource and requests the first resource. If you can prevent the sessions (or at least one of them) from accessing one of the resources involved in the deadlock, then you can prevent the deadlock. You can achieve this by redesigning the application, which is a solution highly resisted by developers late in the project. However, you can consider using the following features of SQL Server without changing the application design:

  • Convert a nonclustered index to a clustered index.

  • Use a covering index for a SELECT statement.

Convert a Nonclustered Index to a Clustered Index

As you know, the leaf pages of a nonclustered index are separate from the data pages of the heap or the clustered index. Therefore, a nonclustered index takes two locks: one for the base (either the cluster or the heap) and one for the nonclustered index. However, in the case of a clustered index, the leaf pages of the index and the data pages of the table are the same; it requires one lock, and that one lock protects both the clustered index and the table, since the leaf pages and the data pages are the same. This decreases the number of resources to be accessed by the same query, compared to a nonclustered index.

Use a Covering Index for a SELECT Statement

You can also use a covering index to decrease the number of resources accessed by a SELECT statement. Since a SELECT statement can get everything from the covering index itself, it doesn't need to access the base table. Otherwise, the SELECT statement needs to access both the index and the base table to retrieve all the required column values. Using a covering index stops the SELECT statement from accessing the base table, leaving the base table free to be locked by another session.

Minimizing Lock Contention

You can also resolve a deadlock by avoiding the lock request on one of the contended resources. You can do this when the resource is accessed only for reading data. Modifying a resource will always acquire an (X) lock on the resource to maintain the consistency of the resource; therefore, in a deadlock situation, identify the resource accesses that are read-only, and try to avoid their corresponding lock requests by using the dirty read feature, if possible. You can use the following techniques to avoid the lock request on a contended resource:

  • Implement row versioning.

  • Decrease the isolation level.

  • Use locking hints.

Implement Row Versioning

Instead of attempting to prevent access to resources using a more stringent locking scheme, you could implement row versioning through the READ_COMMITTED_SNAPSHOT isolation level or through the SNAPSHOT isolation level. The row versioning isolation levels are used to reduce blocking as outlined in Chapter 12. Because they reduce blocking, which is the root cause of deadlocks, they can also help with deadlocking. By introducing READ_COMMITTED_SNAPSHOT with the following T-SQL, you can have a version of the rows available in tempdb, thus possibly eliminating the contention caused by the lock escalation in the preceding deadlock scenario:

ALTER DATABASE AdventureWorks2008
    SET READ_COMMITTED_SNAPSHOT ON;

This will allow any necessary reads without causing lock contention since the reads are on a different version of the data. There is overhead associated with row versioning, especially in tempdb and when marshaling data from multiple resources instead of just the table or indexes used in the query. But that trade-off of increased tempdb overhead vs. the benefit of reduced deadlocking and increased concurrency may be worth the cost.

Decrease the Isolation Level

Sometimes the (S) lock requested by a SELECT statement contributes to the formation of circular blocking. You can avoid this type of circular blocking by reducing the isolation level of the transaction containing the SELECT statement to READ UNCOMMITTED. This will allow the SELECT statement to read the data without requesting an (S) lock and thereby avoid the circular blocking. However, reading uncommitted data carries with it a serious issue by returning bad data to client. You need to be in very dire straights to consider this as a method of eliminating your deadlocks.

Use Locking Hints

You can also resolve the deadlock presented in the preceding technique using the following locking hints:

  • NOLOCK

  • READUNCOMMITTED

Like the READ UNCOMMITTED isolation level, the NOLOCK or READUNCOMMITTED locking hint will avoid the (S) locks requested by a given session, thereby preventing the formation of circular blocking.

The effect of the locking hint is at a query level and is limited to the table (and its indexes) on which it is applied. The NOLOCK and READUNCOMMITTED locking hints are allowed only in SELECT statements and the data selection part of the INSERT, DELETE, and UPDATE statements.

The resolution techniques of minimizing lock contention introduce the side effect of a dirty read, which may not be acceptable in every transaction. Therefore, use these resolution techniques only in situations in which a dirty read is acceptable.

Summary

As you learned in this chapter, a deadlock is the result of circular blocking and is reported to an application with the error number 1205. You can analyze the cause of a deadlock by collecting the deadlock information using the trace flags 1204 and 1222. You can also capture deadlock graphs using a trace.

You have a number of techniques to avoid a deadlock; which technique is applicable depends upon the type of queries executed by the participating sessions, the locks held and requested on the involved resources, and the business rules governing the degree of isolation required. Generally, you can resolve a deadlock by reconfiguring the indexes and the transaction isolation levels. However, at times you may need to redesign the application or automatically reexecute the transaction on a deadlock.

In the next chapter, I cover the performance aspects of cursors and how to optimize the cost overhead of using cursors.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.226.52.203