Concurrency and Transaction Management

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Concurrency and Transaction Management

Although concurrency and transaction management are in fact two separate topics, they go hand in hand when considered in terms of database programming strategies. DB2 must ensure the integrity of your data and query results by restricting access to table data by applying row and table locks. This chapter explains how this is done using isolation levels, as well as how transactions actually define when locks will be held.

When applications must wait for a lock before a query can be executed, this is known as lock-wait. Lock-wait occurs to guarantee accuracy of results and to maintain the integrity of the underlying data. Thus, in many cases, applications may experience valid periods of lock-wait.

Almost all problems applications encounter with lock timeouts or deadlock situations are either due to inherent problems with how a transaction is defined, or misunderstandings about when locks will be acquired and released. Concurrency is managed by DB2 on a connection level, so if you have multiple connections to the same database within an application, they will be considered concurrent applications and compete for database locks.

Unit of Work (UOW)

UOW is synonymous with the concept of a transaction. It is defined as zero or more SQL queries that execute as a single atomic operation. For example, when a customer makes a purchase in the online mall, three steps must be carried out:

The inventory of the mall must be updated.
The customer must be charged for the items purchased.
Each item purchased must be shipped.

What would happen if the inventory was updated and the customer was charged, but a shipping order entry was never created? Not only would you have an angry customer who never received his purchase, you would also wreak havoc with your inventory. Thus, all of the SQL queries for the purchase must be defined as an atomic operation. This is done by issuing a COMMIT only after all of the queries have successfully executed, or issuing a ROLLBACK if any of them failed to do so.

Many programming interfaces also have two types of connections: transactional and nontransactional. Although DB2 supports these concepts, you should be aware that there is really only one type of connection to a DB2 database, and that is a transactional one. Thus, every SQL query will be a part of a transaction. When you are running in a nontransactional mode, however, the programming interface you are using has enabled a feature called autocommit, which issues a COMMIT implicitly after every SQL query. You must ensure that if your UOW has multiple queries, then you do not have autocommit enabled.

Database Locking Strategies and Isolation Levels

DB2 employs locking strategies to ensure that the data in your result set is accurate and that the integrity of data is not damaged. Locks can be applied to individual rows in a table or to an entire table. How DB2 manages locks depends on the locking-specific settings for each database (set in the database configuration file), and the isolation level used by the application. In general, you should be aware of two basic types of locks that DB2 will apply:

Share lock
Exclusive lock

In DB2 there are different kinds of share locks and exclusive locks, but as an application developer, being familiar with these two general types should suffice. As implied, a share lock allows other concurrent applications to read the data that has been locked. Multiple concurrent applications can obtain share locks on a table or a row. A share lock is obtained when reading data, except when the cursor being used is updatable. In contrast, an exclusive lock is obtained on rows that have been inserted, updated, or deleted. Row locks also can get escalated to exclusive table locks, depending on lock resource usage.

Be aware that only a single concurrent application can obtain an exclusive lock on a row or a table, and this can only occur if there are no existing share locks or an exclusive lock already present. Of course, if an application has an exclusive lock on an entire table, then no one will be able to obtain any locks on that table as the name “exclusive” implies. The same reasoning applies to an exclusive lock obtained on a row. Both share and exclusive locks are released when a COMMIT or ROLLBACK is issued. The only exception is a WITH HOLD cursor, which retains share locks, but releases exclusive locks.

The way locks are obtained when data is read depends on the isolation level of the current transaction. There are four isolation levels employed by DB2, which are summarized in Table 8.1. All of the isolation levels, except for UR, only allow you to read data that has been committed. Since uncommitted data will have exclusive locks placed on it, this prevents you from obtaining a share lock to read it. The way UR gets around this is by not placing any lock on the row being read. Conceptually speaking, you also don't want to read uncommitted data because it is unreliable. For example, it might be rolled back by the application holding the lock, in which case the changes will be reversed.

Table 8.1. DB2 Isolation Levels
Isolation Level	Description
Uncommitted Read (UR)	No locks are acquired when reading data. Data that has not been committed can be read (dirty read), and the result set can change with each execution (nonrepeatable read).
Cursor Stability (CS)	Only committed data can be read and a share lock is obtained for the current cursor position. The result set is not guaranteed to be the same if the query is executed again (unrepeatable read).
Read Stability (RS)	A share lock is obtained for all of the rows in a result set and only committed data is read. If the query is executed again, new rows can appear (phantom read), but all of the rows in the previously result set will also be present because the share locks on these rows are not relinquished.
Repeatable Read (RR)	A share lock is obtained for all of the rows processed to obtain the result set. Every time the query is executed, you will get the same results just as with the RS isolation level.

Although the CS isolation level only places a share lock on the current cursor position, you should be aware that to compile the result set, every row in the result set must have share-lock capability. If not, then the query will wait until the exclusive lock is released or until a lock-timeout occurs. A lock-timeout is among several common types of problems that can occur if the correct isolation level for a transaction is not used, or if concurrency conflicts result in your unit of work. We have described each of these in Table 8.2, and discuss the isolation levels where this problem commonly occurs.

Table 8.2. Common Concurrency Problems
Problem	Description
Lost Update	Commonly occurs with an isolation level of CS or lower. Typically, two queries are executed at the same time in two concurrent applications. Each result set has at least one row that was in the result set from the query in the other application. If the first application updates data and commits changes, the second application will not be aware of these changes, and may unknowingly update the same row when it should not. Thus, the update from the first application is lost.
Nonrepeatable Read	This type of problem also occurs with an isolation level of CS or lower. If the same query is issued twice in succession, the results may vary because there are no locks on the whole result set to prevent the rows in it from being deleted or updated.
Phantom Read	A phantom read occurs with all isolation levels except RR. When a phantom read occurs, additional rows may be present in the result set if the same query is executed multiple times. This occurs because either new rows were inserted that match the query's condition, or rows that previously did not meet its conditions now do so.
Lock Timeout	Lock timeouts can occur on any isolation level above UR. Each database can specify whether the application will wait forever for a lock it needs or time out and return an error after a certain number of seconds of being blocked from obtaining that lock. A common example is when one application has inserted a row into a table but has not committed it, and a second application tries to select that row as part of a result set. The second application must wait until the insert is committed or rolled back by the first application.
Deadlock	A deadlock only occurs if two concurrent applications prevent each other from obtaining required locks. For example, if application 1 has an exclusive lock on row 1, and is waiting for a share lock on row 2, while application 2 has an exclusive lock on row 2 and is waiting for a share lock on row 1, a deadlock has occurred. If DB2 detects a deadlock, one of the transactions is rolled back and an error is returned.

One of the most common problems application developers have is having to wait for locks. If you have set the LOCKTIMEOUT value in the database configuration (DB CFG) file, then an application will time out its query after having to wait for the number of seconds that was set. The consequence of this is that the transaction is rolled back because the statement failed to execute. Here are the best ways to avoid lock timeouts from occurring:

Define your UOW to be as short as possible.
Avoid nondatabase processing within a UOW.
Architect your applications to avoid competing for resources whenever possible.

The first rule follows the principle of committing as often as possible. This is the best way to avoid lock contention. An easy way to accomplish this is to use autocommit whenever you don't need to manage a UOW over multiple statements. Sometimes programmers may also include additional code that will considerably lengthen transaction time. For example, we once tackled a problem in which the developer insisted that the transaction only had three statements, and thus should not cause the kind of long-term lock contention that was being observed. We finally determined that the source of the problem was that in between statements they were needlessly parsing text documents, which considerably lengthened their transaction time!

Our third point is something you should always strive for when you develop your applications. Although it may not always be possible to avoid lock contention, you should always seek to minimize it, using the first two methods described.

TIP

Lock contention can often result when table scans are required to identify the qualifying rows in a query. If there are exclusive locks on any rows, then lock contention can occur. This can be avoided by ensuring that your queries favor use of indexes in their access plan. This is done by specifying the members of a WHERE clause that favor index-only lookup.

Transactional Processing Using Embedded SQL

A transaction within embedded SQL begins implicitly with the execution of an SQL statement and ends with either a COMMIT or ROLLBACK statement. An embedded SQL program also can make use of a SAVEPOINT to further control how transactions are processed. After the transaction has ended, a new one is automatically started as soon as a new executable SQL statement has begun. Note that not all SQL are executable. Generally, an executable statement is one that is processed by the database server. BEGIN DECLARE SECTION is an example of SQL that is not processed by the server.

Using the COMMIT statement ends the current transaction and makes the changes that were made during the transaction processing available to other applications. Deciding when to commit a transaction is a decision that the programmer must make. Generally, it is best to keep all transaction work as small as possible. Once transactions become large, more resources are required on the server to manage the process. Log files are used to track all activity against the database. These log files contain enough information so that a rollback can be performed if needed. Large transactions require large log files.

Something else to consider when ending the transaction is the timing. If the application can avoid dependencies such as user input to decide when to end a transaction then it should. If a user walks away for lunch, an application could end up locking data that other processes require. Some servers, such as the mainframe OS1390 have an idle timeout condition that will abruptly end a transaction and disconnect users if they have left their connections idle for too long.

Using the ROLLBACK statement ends the current transaction and restores the data to the state that it was in prior to beginning the transaction. If you use a ROLLBACK statement in a routine that was entered because of an error or warning and you use the SQL WHENEVER statement, then you should specify WHENEVER SQLERROR CONTINUE and WHENEVER SQLWARNING CONTINUE before the ROLLBACK. This avoids a program loop if the ROLLBACK fails with an error or warning.

The transaction also can be ended implicitly within embedded SQL if the application terminates without ending the current transaction. This should be avoided and is considered bad programming. DB2 will ROLLBACK the current transactions when this occurs.

A SAVEPOINT can be used to help manage transactions. After a transaction has started, one or more save points can be named so that a ROLLBACK can be performed to this point in the transaction. Therefore, a transaction that consists of three SQL statements with a save point named between each SQL can be rolled back to the first or second save point and then the SQL previous to the named save point can be committed (or it too can be rolled back).

Transactional Processing Using CLI/ODBC

Transactional processing within CLI/ODBC can behave in one of two different ways. The default behavior is for all transactions to be ended implicitly. This feature is called Autocommit and the CLI and ODBC standards say that the default behavior for all transactions are to be committed or rolled back implicitly by CLI/ODBC. If an application wishes to manage it's own transaction behavior then it is free to do so by using the SQLSetConnectOption() and setting SQL_ATTR_AUTOCOMMIT to SQL_ AUTOCOMMIT_OFF.

DB2 CLI offers a special keyword that can be used in its db2cli.ini configuration file, called AUTOCOMMIT. Setting this value to '0' will override the default behavior of auto commit. This keyword can be used safely within a CLI-based application, but as soon as an ODBC driver manager is used between the application and the CLI driver, then undesired results will occur. It is rare to find a CLI-based application that does not use the ODBC driver manager on Windows and so its use must be cautioned against. This is because the Windows-based driver manager will still assume that the default behavior is for AUTOCOMMIT to be on. Thus, if it sees an explicit request to end a transaction using the SQLEndTran() API, it will indicate that the function has executed successfully without actually passing the request to the CLI driver. Refer to Figure 8.2 to observe the process flow for an ODBC transaction. The ODBC manager is trying to optimize the request based on its assumption that there are no outstanding transactions.

Figure 8.2. Process flow for ODBC transaction.

A transaction is automatically started once an application issues one of the following APIs, SQLPrepare(), SQLExecute(), SQLExecDirect(), SQLGetTypeInfo(), or one of the catalog functions. Even if the API call fails, a transaction is started. If AUTOCOMMIT is on then the transaction will be implicitly completed with a commit if the processing was successful or a roll back if an error occurred. Exactly when the implicit action occurs is dependent on the type of SQL.

SQL statements that open a cursor to manipulate data to/from the database will implicitly end their transaction when the cursor is closed. If a cursor is opened and another SQL statement is executed before it is closed then a COMMIT or ROLLBACK may cause the cursor to be closed on the other statement. This is because the second statement may implicitly end the transaction before the first has finished. If an implicit commit occurs and SQL_ATTR_CURSOR_HOLD is off, then the open cursor will be closed and subsequent fetches will result in a function sequence error. See Figure 8.3 for an example of two threads operating on the same connection. If a roll back occurs then the cursor will be closed regardless of SQL_ATTR_CURSOR_HOLD. Note, that the default setting for SQL_ATTR_CURSOR_HOLD is ON.

Figure 8.3. Implicit transaction behavior for two threads on a single connection.

When in manual commit mode, transactions are explicitly ended using the API SQLEndTran() with either the COMMIT or ROLLBACK option. A new transaction is implicitly started again with the execution of the appropriate API. The SQLEndTran() API operates at the connection level rather than on the statement or environment level. This means that if two connections are established, then each will manage its own transaction. If SQLEndTran() is performed on one connection handle, then all outstanding SQL are committed or rolled back for only that connection. It is possible to encompass more than one connection into a single transaction with the use of a Distributed UOW (covered in greater detail later in this section).

Another effect that SQLEndTran() has other than ending a transaction is to close all open cursors when a rollback is performed. If a commit is issued, then all open cursors are closed if SQL_ATTR_CURSOR_HOLD is set to SQL_CURSOR_HOLD_OFF using the API SQLSetStmtOption(). The default setting for this value is SQL_CURSOR_HOLD_ON for all DB2 platforms except VM/VSE, which does not support cursor-with-hold. Pay special attention to the fact that all open cursors within a transaction are affected by the transaction termination.

Some interesting behavior can develop if a CLI-based stored procedure is called within a transaction from a CLI-based application. The desired behavior is to have the stored procedure execute atomically and either be successfully committed or rolled back in its entirety. This can fail to happen if the stored procedure either changes the autocommit behavior or otherwise attempts to manage its own transactions. Table 8.3 lists various scenarios and their outcome.

Table 8.3. Various Scenarios
CLI Client AUTOCOMMIT Behavior	CLI Store Procedure AUTOCOMMIT Behavior	COMMIT Behavior
On	On	All SQL statements in a stored procedure that succeed are committed, even if other SQL statements in the stored procedure fail and error or warning `SQLSTATES` are returned to the `CALL` statement.
On	Off	If the stored procedure handles errors and returns no error `SQLSTATES` to the `CALL` statement, then all SQL statements in the stored procedure that succeed are committed. Otherwise, all SQL statements in the stored procedure are rolled back.
On	Off with explicit commits	All SQL statements that succeed and are manually committed will not be rolled back, even if error `SQLSTATES` are returned to the `CALL` statement. However, any SQL statements that are executed after the last `COMMIT` or `ROLLBACK` will be committed or rolled back depending on the `SQLSTATES` that is returned to the `CALL` statement.
Off	On	All SQL statements in the stored procedure that succeed are committed, even if other SQL statements in the stored procedure fail and error or warning `SQLSTATES` are returned to the `CALL` statement or the transaction containing the `CALL` statement is rolled back. Note, that turning `AUTOCOMMIT` on will automatically commit any outstanding transaction including that of the client.
Off	Off	If the stored procedure handles errors and returns no error or warning `SQLSTATES` to the `CALL` statement, all SQL statements in the stored procedure that succeed will be committed when the transaction that includes the `CALL` statement is committed. Otherwise, all SQL statements in the stored procedure will be rolled back when the transaction that includes the `CALL` statement is rolled back.
Off	Off with explicit commits	All SQL statements that succeed and are manually committed will not be rolled back, even if error or warning `SQLSTATES` are returned to the `CALL` statement or the transaction containing the `CALL` statement is rolled back. Note, that when the stored procedure issues its first commit, previous SQL statements from the client application are also committed.

Transactional Processing Using JDBC

Many of the transactional methods for JDBC are the same as those for CLI/ODBC. JDBC also operates in an AUTOCOMMIT mode by default. In AUTOCOMMIT mode, an SQL statement is committed when it is complete, provided that no errors occurred, otherwise the effects of the statement are rolled back. For statements that make use of a cursor, the transaction is completed when the result set is closed. For a JDBC application to control its own transactions, it must first turn AUTOCOMMIT off. This can be done using the connection method setAutoCommit() with the value false.

With AUTOCOMMIT off, JDBC implicitly starts a transaction when the first SQL statement is executed. The transaction is then ended explicitly when the connection methods, commit() or rollback() are called. Just as with CLI/ODBC-based applications, an implicit COMMIT will occur if AUTOCOMMIT is turned back on while a transaction is outstanding. The following code snippet demonstrates some transaction processing within JDBC:

// Declare variables
String sql = "UPDATE product " +
             "SET units_in_stock = units_in_stock - ? " +
             "WHERE product_id = ?";
int [] newUnitsInStock = {10, 20, 30};
int [] productIds = {123, 456, 789};

Connection con = null;
int rowCount = 0;

// Obtain connection and turn off autocommit
con = DriverManager.getConnection("jdbc:db2:db2mall");
con.setAutoCommit(false);

// Prepare the statement and process each update
PreparedStatement ps = con.prepareStatement(sql);
for(int i=0; i < newUnitsInStock.length; i++)
{
   ps.setInt(1, newUnitsInStock[i]);
   ps.setInt(2, productIds[i]);
   rowCount = ps.executeUpdate();
}

// Commit the transaction
ps.close();
con.commit();
con.setAutoCommit(true);

It is advisable to turn AUTOCOMMIT on when you do not need to be in a transaction mode. This way you avoid unnecessarily holding on to database locks that could otherwise cause contention for other processes.

When operating in AUTOCOMMIT mode, it is important to understand when the implicit behavior for committing a transaction occurs, especially for those statements that use cursors. Because the commit occurs when the cursor is closed, it is important to close the result set after the application is finished with it. Often times, an application will discard the statement handle and allow the garbage collector to close the result set and free the resources associated with a statement handle. Not only is this bad programming, it may result in unpredictable results.

Java's garbage collector monitors memory usage and autonomously decides when to initiate the finalizer method for discarded objects. The finalizer method closes any open cursors and frees any handles that are still active. Because most of the memory associated with a statement handle is allocated through Java Native Interface (JNI) by DB2, the Java Virtual Machine (JVM) has no way of knowing how much memory is actually associated with a statement. As far as the JVM is concerned, the memory allocated to a statement is small, so it assigns a low priority to cleaning a statement resource. If the finalizer thread closes the cursor, then an implicit commit will be driven and all open transactions for that connection will be committed. This behavior can cause function sequence errors to occur within the application.

Transaction Processing Using Other Interfaces

The functional standards for OLE DB have been driven by ODBC. OLE DB is the next generation interface standard for database access, so it stands to reason that much of the functional behavior has not changed. Because most applications will use a class set of methods to access a database through OLE DB (such as ADO), the documentation for that interface must be consulted to find the syntax for controlling transactions. The behavior, however, will remain the same as CLI/ODBC.

Other interfaces, such as Perl, Net.Data, ADO, and other class sets are all used on top of CLI/ODBC. Each of these interfaces can provide their own methods for controlling transactions. Details for which are found in their own documentation. The general concepts, however, will remain the same as with any CLI/ODBC application.

NOTE

ADO can be used on top of CLI/ODBC or OLE DB.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Concurrency and Transaction Management

Create new playlist

Sign In

Sign Up

Concurrency and Transaction Management

Unit of Work (UOW)

Database Locking Strategies and Isolation Levels

Transactional Processing Using Embedded SQL

Transactional Processing Using CLI/ODBC

Figure 8.2. Process flow for ODBC transaction.

Figure 8.3. Implicit transaction behavior for two threads on a single connection.

Transactional Processing Using JDBC

Transaction Processing Using Other Interfaces

Table of Contents for
Concurrency and Transaction Management