CHAPTER 24


Designing Transaction Strategies

Now it is time to put everything together. In this chapter, we will summarize, and perhaps repeat, a few rules that can help us reduce blocking and improve concurrency in a system. The most important rule to remember is that most blocking and concurrency issues can be fixed by query optimization. Even leaving general system performance aside, non-optimized queries scan more data than they need to. From a locking standpoint, this means that those queries issue more lock requests, which increases the chances of collisions and blocking with other sessions. This is especially true for queries that change data. While we have some options for dealing with the shared (S) locks acquired by non-optimized readers, there is very little we can do about the blocking introduced by writers and their update (U) locks. The SNAPSHOT transaction isolation level is the only option in which writers do not block other writers, although switching to that isolation level requires code redesign and introduces significant overhead on the server.
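As a rough illustration, you can compare the lock footprint of a statement before and after optimization by looking at sys.dm_tran_locks while the transaction is still open. The snippet below is a minimal sketch that assumes the dbo.Orders table used in the listings later in this chapter; without a supporting index, the update scans the clustered index and requests an update (U) lock on every row it examines, and the locks still held by the open transaction are visible in sys.dm_tran_locks.

begin tran
        update dbo.Orders
        set Cancelled = 1
        where PendingCancellation = 1;   -- no supporting index: clustered index scan

        -- count the locks the session currently holds or waits for
        select request_mode, count(*) as lock_count
        from sys.dm_tran_locks
        where request_session_id = @@spid
        group by request_mode;
rollback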

We need to make transactions as short as possible. Exclusive (X) locks are not released until the end of the transaction; the longer the transaction, the longer the locks are held, which again increases the chances of collisions and blocking. Avoid any long-running operations while a transaction is active, and never interact with users through the UI while a transaction is open. When dealing with external resources, be especially careful with CLR code that performs external access and with linked servers. For example, when a linked server is down, it can take a long time before a connection timeout occurs, and it is not a good idea to hold exclusive (X) locks in place all that time.
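The pattern below sketches the problem and one way around it. The linked server, database, and table names (LinkedSrv, AuditDB, OrderEvents) are hypothetical; the point is simply that the remote call belongs outside the transaction whenever the business logic allows it.

declare @OrderId int = 123;   -- hypothetical key value

-- anti-pattern: the remote insert keeps the X lock on dbo.Orders
-- while waiting for the linked server to respond or time out
begin tran
        update dbo.Orders set Cancelled = 1 where OrderId = @OrderId;
        insert into LinkedSrv.AuditDB.dbo.OrderEvents(OrderId) values (@OrderId);
commit

-- better: commit first, then notify the remote server
begin tran
        update dbo.Orders set Cancelled = 1 where OrderId = @OrderId;
commit;
insert into LinkedSrv.AuditDB.dbo.OrderEvents(OrderId) values (@OrderId);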

Update the data as close to the end of the transaction as possible. This also reduces the time that exclusive (X) locks are held. In some cases, it makes sense to use temporary tables as a preliminary staging area: insert the data there and flush it to the actual tables at the very end of the transaction. Remember that while this approach helps reduce blocking, creating and populating temporary tables is expensive in terms of I/O, so it adds overhead and slows down execution, especially when a lot of data needs to be stored there.

One particular case where this technique is very useful is an update statement that is impossible or impractical to optimize. If the statement scans a lot of rows but updates just a handful of them, it makes sense to refactor the code to collect the clustered index key values of the rows that need to be updated into a temporary table and then run the update statement based on those collected key values. While you still need to scan the data during the select, you do not acquire update (U) locks there, which can dramatically help with concurrency. For example, if you have a statement like the one shown in Listing 24-1, there is a good chance that SQL Server will use a clustered index scan on the Orders table, which leads to an update (U) lock being acquired on every row of the table.

Listing 24-1.  Reducing blocking with a temporary table: Original statement

update dbo.Orders
set
        Cancelled = 1
where
        (PendingCancellation = 1) or
        (Paid = 0 and OrderDate < @MinUnpaidDate) or
        (Status = 'BackOrdered' and EstimatedStockDate > @StockDate)

You can refactor the code as shown in Listing 24-2.

Listing 24-2.  Reducing blocking with a temporary table: Refactored statement

create table #OrdersToBeCancelled
(
        OrderId int not null primary key
)
 
insert into #OrdersToBeCancelled(OrderId)
        select OrderId
        from dbo.Orders
        where
               (PendingCancellation = 1) or
               (Paid = 0 and OrderDate < @MinUnpaidDate) or
               (Status = 'BackOrdered' and EstimatedStockDate > @StockDate)
                
update dbo.Orders
set
        Cancelled = 1
where
        OrderId in (select OrderId from #OrdersToBeCancelled)

In this case, the select statement that populates the temporary table acquires either shared (S) locks or no row-level locks at all, depending on the transaction isolation level. This helps with concurrency at the expense of the additional overhead of creating and populating the temporary table. This method is more or less a last resort, however; creating the right indexes is the better option in most cases. It also makes sense only when you need to update just a handful of rows. Otherwise, you will end up with an execution plan that scans the entire dbo.Orders table during the update anyway, which is even less efficient than the original statement because of the temporary table overhead.
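If you go the indexing route instead, filtered indexes that match the predicates from Listing 24-1 are one possibility. The index names below are hypothetical, and whether each index is worth its maintenance cost, and whether the optimizer combines them for the OR-based predicate, depends on the data distribution and has to be verified against the real workload.

create nonclustered index IDX_Orders_PendingCancellation
on dbo.Orders(OrderId)
where PendingCancellation = 1;

create nonclustered index IDX_Orders_Unpaid
on dbo.Orders(OrderDate)
where Paid = 0;

create nonclustered index IDX_Orders_BackOrdered
on dbo.Orders(EstimatedStockDate)
where Status = 'BackOrdered';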

When you need to both read from and write to the same table in the same transaction, it is usually better to read the data first whenever possible. Also avoid updating the same row multiple times, especially through different indexes. Both of these patterns increase the chances of deadlocks.
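A minimal sketch of both suggestions is shown below. The assumption that a cancellation writes both the Cancelled flag and the Status column is hypothetical; it only serves to illustrate collapsing several updates of the same row into a single statement after the reads are done.

declare @OrderId int = 123;            -- hypothetical key value
declare @OldStatus varchar(32);

begin tran
        -- read first...
        select @OldStatus = Status
        from dbo.Orders
        where OrderId = @OrderId;

        -- ...then write the row once, instead of updating it in several statements
        update dbo.Orders
        set Cancelled = 1, Status = 'Cancelled'
        where OrderId = @OrderId;
commit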

Do not mix DDL and DML statements in one transaction; that also increases the chance of a deadlock. One very common case where doing so is impossible to avoid is a system with partitioned tables in which partition-related information is stored in user tables. For example, you can have an Orders table partitioned on a monthly basis and an OrderPartitionsInfo table that stores the minimum and maximum OrderId from every partition. When we alter the Orders table partition function, we also need to update the OrderPartitionsInfo table in the same transaction. In such cases, when choosing whether the DDL or the DML statement should go first, the goal is to eliminate the possibility that other sessions end up blocked waiting for the lock on the OrderPartitionsInfo table while holding intent locks on the Orders table at the same time.
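One possible shape of such a transaction is sketched below, with the DDL statement placed first. All object names (pfOrderDate, psOrderDate, the filegroup, and dbo.OrderPartitionsInfo) are hypothetical, and the right ordering in a real system depends on how other sessions access the two tables.

begin tran
        -- DDL: requires a schema modification (Sch-M) lock on dbo.Orders
        alter partition scheme psOrderDate next used [FG_2026_02];
        alter partition function pfOrderDate() split range ('2026-02-01');

        -- DML: the exclusive lock on dbo.OrderPartitionsInfo is held only briefly
        insert into dbo.OrderPartitionsInfo(PartitionId, MinOrderId, MaxOrderId)
        values (26, 10000001, null);   -- MaxOrderId not yet known for the new partition
commit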

Remember how locking behaves when lock partitioning is involved, especially when code alters the schema online in busy OLTP systems. Make sure that your test servers have 16 or more logical CPUs, which is what enables lock partitioning. We are talking about logical processors here, so even a dual quad-core system with hyperthreading will do the trick.
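The quick checks below show how to confirm whether a server is a candidate for lock partitioning and, when it is enabled, which lock partition each resource belongs to.

-- lock partitioning is enabled automatically on instances with 16 or more logical CPUs
select cpu_count, scheduler_count
from sys.dm_os_sys_info;

-- when it is enabled, sys.dm_tran_locks exposes the partition of each lock resource
select resource_type, request_mode, resource_lock_partition
from sys.dm_tran_locks
where request_session_id = @@spid;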

Be careful with lock escalation, especially in OLTP systems. When you process a large batch of rows within the same transaction, try to work with batches of fewer than 5,000 rows. If that is impossible, monitor how often lock escalation occurs and, if it becomes a problem, consider disabling it at the table level or switching to partition-level escalation. Do not disable lock escalation "just in case" to guard against possible concurrency issues without first analyzing the root cause of the problem. Also, do not forget about the additional memory requirements and error 1204 when escalation is disabled, especially if lock partitioning is enabled. Keep in mind that smaller batches can be less efficient than larger ones, and that lock escalation can improve system performance because a session needs to acquire and maintain far fewer locks.
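A minimal sketch of both monitoring and changing the setting is shown below; it assumes the dbo.Orders table from the earlier listings, and any change should follow, not replace, root-cause analysis.

-- how often escalation has been attempted and has succeeded for the table
select object_name(object_id) as table_name, index_id,
       index_lock_promotion_attempt_count, index_lock_promotion_count
from sys.dm_db_index_operational_stats(db_id(), object_id('dbo.Orders'), null, null);

-- partition-level escalation (on partitioned tables) or disabling it entirely
alter table dbo.Orders set (lock_escalation = auto);
-- alter table dbo.Orders set (lock_escalation = disable);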

Consider implementing logic that retries the operation in case of deadlocks. This is especially important with statements that acquire full object locks on systems with lock partitioning enabled.
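A minimal retry sketch in T-SQL is shown below. The statement inside the transaction is only a placeholder, and the retry count and error handling details should be adjusted to the application.

declare @OrderId int = 123;                  -- hypothetical key value
declare @attempt int = 1, @maxAttempts int = 3;

while @attempt <= @maxAttempts
begin
        begin try
                begin tran
                        update dbo.Orders set Cancelled = 1 where OrderId = @OrderId;
                commit;
                break;   -- success
        end try
        begin catch
                if @@trancount > 0
                        rollback;

                if error_number() = 1205 and @attempt < @maxAttempts   -- deadlock victim
                        set @attempt += 1;
                else
                        throw;   -- rethrow any other error or the final failed attempt
        end catch
end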

Of course, use the right transaction isolation levels. While a lot of people try to reduce blocking by switching to READ UNCOMMITTED mode, either explicitly or with NOLOCK hints, this is rarely the right choice. First, READ UNCOMMITTED does not help with blocking introduced by writers; they still use update (U) locks during scans. Most importantly, however, by using the READ UNCOMMITTED transaction isolation level we are telling SQL Server that we do not care about data consistency at all, and it is not only about reading uncommitted data. SQL Server can choose execution plans that use allocation order scans on large tables, which can lead to missing rows and duplicated reads due to page splits, especially in busy systems. There are some cases where READ UNCOMMITTED can be used legitimately, but those cases are quite rare, and even then we need to clearly understand all of the side effects it introduces.

The READ COMMITTED transaction isolation level is usually good enough for OLTP systems when queries are optimized. Optimization is the key here: optimized queries acquire a small number of locks and do not introduce much blocking, while keeping server overhead to a minimum.

REPEATABLE READ and SERIALIZABLE isolation levels are not good for OLTP due to excessive blocking, and they should be avoided unless absolutely necessary.

Optimistic isolation levels are a mixed bag when discussing OLTP. READ COMMITTED SNAPSHOT is an option as long as you can live with the extra tempdb load and additional index fragmentation. That isolation level can be a great replacement for READ UNCOMMITTED: writers do not block readers, and at the same time it provides statement-level consistency. Using the SNAPSHOT isolation level, on the other hand, is questionable. While it provides transaction-level consistency and eliminates blocking between writers, in OLTP systems where data is highly volatile it adds excessive tempdb load and introduces update conflicts that lead to error 3960. It is still acceptable in some cases, but you need to consider all of the factors carefully and test the system under load, using hardware as similar to the production environment as possible, when making this decision.
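For reference, both optimistic isolation levels are enabled at the database level. The database name below is hypothetical, and switching READ_COMMITTED_SNAPSHOT in particular requires exclusive access to the database unless you force it with the WITH ROLLBACK clause.

alter database OrderDB set allow_snapshot_isolation on;
alter database OrderDB set read_committed_snapshot on with rollback immediate;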

For Data Warehouse-type systems, much depends on how you update the data and how critical data consistency is during the update window. For static, read-only data, any isolation level will work, because readers do not block other readers; you can even switch the database to read-only mode to reduce the locking overhead. Otherwise, optimistic isolation levels may be your best choice: they provide either transaction- or statement-level consistency for report queries, and they eliminate possible blocking when writers are involved. Obviously, if statement-level consistency is enough, it is better to use the READ COMMITTED SNAPSHOT isolation level and reduce the load on tempdb.
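Switching to read-only mode is a one-line change; the database name below is again hypothetical.

alter database OrderDW set read_only with rollback immediate;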

Last but not least, it is completely normal to use different transaction isolation levels in a system, and even within the same transaction. Different use cases require different approaches, and different problems call for different solutions. You just need to understand the locking behavior in each case and choose the approach that provides the greatest benefit.
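As a final sketch, the snippet below mixes isolation levels within a single transaction, using the table from the earlier listings: the session-level setting, a per-table hint, and a mid-transaction change can all coexist.

set transaction isolation level read committed;

begin tran
        -- runs under READ COMMITTED
        select count(*) from dbo.Orders where Cancelled = 0;

        -- overrides the isolation level for one table only, via a hint
        select count(*) from dbo.Orders with (repeatableread) where Status = 'BackOrdered';

        -- the session-level setting can also change in the middle of the transaction
        set transaction isolation level serializable;
        update dbo.Orders set Cancelled = 1 where PendingCancellation = 1;
commit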
