13. Users, Sessions, and Transaction Control
An end user can interact with a database either by typing SQL statements directly or by running an application program in which SQL has been embedded. In either case, the database must recognize the user as an authorized database user, the user must connect to the database to establish a database session, and there must be control of the user's transactions. As an introduction, this chapter begins with a discussion of the environment in which multiple users operate and what a DBMS has to do to preserve data integrity when multiple users attempt to modify the same data. The chapter then turns to SQL specifics as a prelude to the discussion of embedded SQL in Chapter 15.
The Concurrent Use Data Environment
A transaction is a unit of work submitted as a whole to a database for processing. (A database session consists of one or more transactions.) When more than one user of an application program is interacting with the database at one time, we say that their transactions are running concurrently. Concurrent transactions can run in one of two ways:
◊ They may run serially, in which case one transaction completes its work before the second begins.
◊ They may run interleaved, in which case the actions of both transactions alternate.
Ideally, the results of interleaved transaction execution should be the same as those of serial execution (regardless of which transaction went first). If interleaved execution produces such a result, the transactions are said to be serializable.
Unfortunately, some very nasty things can happen if no controls are placed on interleaved execution. As an example, consider what might happen at the rare book store when two customers call at the same time and attempt to order the same volume (see Figure 13-1).
Figure 13-1: A lost update
The staff member handling the first customer retrieves data about the volume and notes that it has not been sold. A short time later, a second customer calls and is handled by a second staff member, who also queries the database and sees that the volume is available. After the second staff member's query, the first customer decides to purchase the volume and the first staff member updates the database to indicate that the volume has been sold.
Moments later, the second customer also decides to purchase the volume. As far as the second staff member and the second customer are concerned, the volume is available. The second staff member updates the database with the second customer's purchase, erasing the first customer's purchase. It is likely that the book will be sent to the second customer because no record of the first customer's purchase remains in the database.
This problem, known as a lost update, occurred because the second staff member's update was based on old data; the second staff member did not see the first customer's purchase and therefore could not know that the book had already been sold.
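To make the interleaving concrete, the following sketch shows the statements the two staff members might issue. It assumes a hypothetical volume table with isbn, sale_id, and selling_price columns; the schema and values are illustrative only, not the book's actual design.

-- Staff member 1: check availability; sale_id IS NULL means "not sold"
SELECT sale_id, selling_price
FROM volume
WHERE isbn = '978-1-11111-1111';

-- Staff member 2: runs the same query moments later and also sees
-- that sale_id is still NULL

-- Staff member 1: record the first customer's purchase
UPDATE volume
SET sale_id = 1001, selling_price = 75.00
WHERE isbn = '978-1-11111-1111';

-- Staff member 2: working from the stale query result, records the
-- second customer's purchase and silently overwrites the first
UPDATE volume
SET sale_id = 1002, selling_price = 80.00
WHERE isbn = '978-1-11111-1111';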
The most common solution is to use locking, where transactions receive control over the database elements they are using, preventing other transactions from updating and/or viewing the same data. Transactions that modify data usually obtain exclusive, or write, locks that prevent both viewing and modification of the data by other transactions while the locks are in place.
To see how locking solves the book purchasing problem, take a look at Figure 13-2. This time, when the first staff member retrieves data about the volume, the transaction receives a lock on the book's data that prevents the second staff member from viewing the data. The second staff member's transaction is placed in a wait state by the DBMS until the transaction holding the lock finishes and releases the lock. At this point, the second staff member's transaction can proceed, but when it retrieves the data about the volume, the second staff member sees that the volume has been sold and does not attempt to sell it again.
Figure 13-2: Solving a lost update problem with locking
The second customer is unhappy, but this is a far better situation than what might occur when the first customer discovers that the book that he or she thought was purchased was actually sold to someone else (especially if the first customer's credit card was charged for the book!).
For locking to be effective, a transaction must hold all its locks for the entire length of the transaction. Part of the process that ends a transaction is therefore to release all of the locks, making the data held by the transaction available for other transactions.
In the preceding example, you saw an exclusive lock used to prevent both viewing and updating a part of the database. DBMSs also place shared, or read, locks that allow many transactions to view a part of the database but allow none to modify it while a shared lock is in place. A DBMS will use a shared lock instead of an exclusive lock whenever it can because a shared lock allows more concurrent use of database resources.
In many cases, the DBMS will place a shared lock on data when a transaction retrieves data and then upgrade that lock to an exclusive lock only when the transaction issues a data modification command. This scheme, known as two-phase locking, helps ensure that exclusive locks are held for as short a time as possible and thus promotes the highest level of concurrent use.
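In addition to the automatic locking just described, many (though not all) DBMSs let a transaction request the exclusive lock at retrieval time with a SELECT ... FOR UPDATE statement, avoiding the upgrade step entirely. The following is a minimal sketch, reusing the hypothetical volume table from the earlier example; check your DBMS's documentation for the exact form it supports.

-- Retrieve the row and take a write lock on it immediately, so that no
-- other transaction can update it (or lock it FOR UPDATE) until this
-- transaction ends
SELECT sale_id, selling_price
FROM volume
WHERE isbn = '978-1-11111-1111'
FOR UPDATE;

UPDATE volume
SET sale_id = 1001, selling_price = 75.00
WHERE isbn = '978-1-11111-1111';

COMMIT;

Whether other transactions can still read the locked row while the lock is held depends on the DBMS's locking implementation.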
The size of the database element on which a lock is placed (the granularity of the lock) varies from one DBMS to another and with the type of actions you are performing. It may be as large as an entire disk page (as in early versions of DB2) or an entire table, or it may be as small as a single row in a single table. The smaller the granularity, the more “pieces” there are to lock and the more concurrent use a database can support. However, the DBMS must spend time maintaining locks and keeping track of which transactions are waiting for locks. Therefore, the smaller the granularity and the more locks in place, the more processing time the DBMS must devote to locks rather than to data manipulation.
Muddying the Waters: Isolation Levels
At first glance, it may seem that concurrency control is straightforward: Either you have serializable transactions or you don't. However, the SQL standard muddies the waters a bit by allowing you to specify that a transaction can read data modified by another, uncommitted transaction. The degree to which a transaction has access to such data is known as its isolation level.
There are four isolation levels:
◊ SERIALIZABLE: A serializable transaction—the default isolation level—is fully isolated from other transactions. It acts exactly as described in the preceding section of this chapter.
◊ REPEATABLE READ: A repeatable read transaction can read the same data more than once, retrieving rows that satisfy a WHERE predicate. If another transaction has inserted or updated rows and committed between the first transaction's reads, then the repeated read of the data may return different rows than the first read did. Depending on the nature of the transaction, such behavior may be desirable. This effect is known as a phantom read.
◊ READ COMMITTED: A read committed transaction can also read the same data more than once, but in this case the read returns the same rows. However, the second read may produce different values if the data have been updated by another transaction that committed between the first and second reads by the transaction in question. Again, depending on the nature of the transaction, this may be something that you want. This effect is known as a nonrepeatable read. Such transactions also permit phantom reads.
◊ READ UNCOMMITTED: A read uncommitted transaction can read the same data more than once and can read updates made to data by other uncommitted transactions. The danger here is that the other, uncommitted transaction may be rolled back, voiding its updates. This effect is known as a dirty read. Such transactions also permit nonrepeatable reads and phantom reads.
As mentioned earlier, the default isolation level is SERIALIZABLE. To set a lower level, you use the SET TRANSACTION command:
SET TRANSACTION
ISOLATION LEVEL isolation_level
Choose the isolation level from one of the four just discussed, as in
SET TRANSACTION
ISOLATION LEVEL REPEATABLE READ;
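To see what a lower isolation level actually permits, here is a minimal two-session sketch of a nonrepeatable read. It assumes the hypothetical volume table also has an asking_price column; the prices shown are illustrative only.

-- Session 1
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT asking_price
FROM volume
WHERE isbn = '978-1-11111-1111';   -- returns 75.00

-- Session 2 (a different connection)
UPDATE volume
SET asking_price = 90.00
WHERE isbn = '978-1-11111-1111';
COMMIT;

-- Session 1, reading the same row again in the same transaction
SELECT asking_price
FROM volume
WHERE isbn = '978-1-11111-1111';   -- now returns 90.00: a nonrepeatable read
-- Under REPEATABLE READ or SERIALIZABLE, the second read would still
-- return 75.00.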
Database Sessions and Connections
To interact with a database, a user connects to it. When the user is finished, he or she disconnects. The time between the connection and disconnection is a database session.
SQL for Connecting and Disconnecting
To establish a connection to a database, you use the keyword CONNECT. For example, to connect under a specific user ID, a user or application would enter
CONNECT TO USER user_id
The SQL standard includes considerable flexibility in the CONNECT command syntax. If the DBMS has its own way of naming connections, then CONNECT may be enough. If the DBMS requires you to name connections, then you might use
CONNECT AS connection_identifier
You should consult your DBMS's documentation for the specific syntax required for a database connection.
Note: The CONNECT command assumes that there is some implementation-specific way to identify the database with which a user will interact once connected and that the database specification occurs before the user attempts to make a connection.
To terminate a connection you use the DISCONNECT command. If you specified a connection name, then the command is written
DISCONNECT connection_identifier
If you took the default connection with a user name, then DISCONNECT by itself is usually enough.
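Putting the pieces together, a short interactive session might look like the following sketch. The user ID bookstore_clerk and the volume table are placeholders, and the exact CONNECT and DISCONNECT syntax depends on your DBMS.

-- establish the session
CONNECT TO USER bookstore_clerk

-- do some work within the session
SELECT isbn, asking_price
FROM volume
WHERE sale_id IS NULL;

COMMIT;

-- end the session (this sketch takes the default connection, so no
-- connection name is needed)
DISCONNECT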
Session Length Considerations
There are two possible strategies governing the length of a database connection, each with its own set of trade-offs:
◊ An end user working with a SQL command processor or an application program can connect to the database at the beginning of work and stay connected until work is completed. This eliminates the overhead of repeated connections and disconnections but prevents another user from taking advantage of the connection when the connected user is idle. This strategy is therefore a problem if the number of concurrent users authorized for your DBMS is considerably smaller than the number of people who need to access the database.
◊ An end user working with a SQL command processor or an application program can connect to the database just before a database interaction occurs and disconnect immediately after completing the interaction. (Don't forget that temporary tables exist only during a single database session.) This creates additional overhead for processing the connection and disconnection. However, it ties up the connection for the smallest amount of time necessary and allows more people to access the database.
The bottom line is this: If your DBMS is authorized for the same number of users as people who need to use the database, then you can connect and stay connected. There's no reason not to. However, if you have more people than your software will allow at any one time, you will get less variance in your response times by connecting and disconnecting for each group of database actions.
Transaction Control
Most interactive SQL command processors consider each individual SQL command as a distinct transaction or give the end user a way to “Save changes” after entering a series of commands. However, when you are writing an embedded SQL program, the length of a transaction is totally under your direct control. You also have control over whether the transaction can read and write data or is read only.
Transaction Read/Write Permissions
By default, transactions can both read and write data. However, read-only transactions never require exclusive (write) locks and therefore in most cases permit higher concurrent use of a database. It therefore can be beneficial to indicate that a retrieval transaction is read only.
If you want a transaction that is read only, you can set that property with the SET TRANSACTION command:
SET TRANSACTION READ ONLY;
If you are also setting the transaction's isolation level, you can do so with the same command by separating the options with commas, as in
SET TRANSACTION ISOLATION LEVEL READ COMMITTED, READ ONLY;
Transaction Termination
Transactions end in one of two ways:
◊ If a transaction is committed, then any changes the transaction made to the database become permanent.
◊ If a transaction is rolled back, then any changes the transaction made are undone, restoring the database to the state it was in before the transaction began.
By definition, a committed transaction is never rolled back. To be able to roll back a transaction, a DBMS needs a log of every action taken by a transaction. This log, known more formally as a before-image file, contains information about all database transactions currently in progress. When a transaction is committed, its records are purged from the log and the vacated space is used for data about subsequent transactions. When a transaction is rolled back, the DBMS starts at the transaction's last record in the log file and replaces each current value with its old value from the log file. The process repeats, working backward through the log file, until the DBMS reaches the log record that indicates the start of the transaction. At this point, the log records can be purged.
Starting Transactions
Although early versions of the SQL standard did not include any statement to start a transaction, the current Core specifications include a START TRANSACTION statement:
START TRANSACTION mode
The mode of a transaction can include its isolation level and whether it is read only or read/write.
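For example, following the syntax just shown, a read-only transaction run at a lower isolation level might be started with

START TRANSACTION
ISOLATION LEVEL READ COMMITTED, READ ONLY;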
Note: For those DBMSs that don't support the START TRANSACTION statement, a new transaction begins automatically whenever there is no current transaction and a user or application program issues a command that requires database action.
Ending Transactions
If you are using interactive SQL, you may request the end of a transaction in any of the following ways:
◊ Using a GUI, clicking a “Save Changes” or “Execute Command” button executes the command and, if the command executes without error (in other words, violates no constraints), automatically commits the transaction.
◊ Working from the command line, adding a semicolon to terminate the command and pressing Enter submits the command for processing. The transaction is committed automatically when it executes successfully.
◊ An application program makes its own decision whether to commit or roll back a transaction based on an error code returned by the DBMS. (You will read more about these error codes in Chapter 15.) Once the decision has been made, a program issues COMMIT or COMMIT WORK to commit the transaction. To undo everything done by a transaction, the program issues either ROLLBACK or ROLLBACK WORK, as in the sketch that follows this list.
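As a minimal sketch of the two outcomes, again using the hypothetical volume table from earlier, a transaction that modifies data is either made permanent with COMMIT or undone with ROLLBACK:

START TRANSACTION;

UPDATE volume
SET selling_price = 80.00
WHERE isbn = '978-1-11111-1111';

-- If the DBMS reported no errors, make the change permanent
COMMIT;

-- Had something gone wrong instead, the change could have been undone with
-- ROLLBACK;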
Transaction Length Considerations
One of the questions that always arises in a discussion of transactions is how long they should be. In general, they should be short, and there are two important reasons why:
◊ A transaction is the unit of recovery. When you perform a rollback, you must undo the entire transaction, not just part of it. You stand to lose a lot of processing if you must roll back a long transaction.
◊ As you read earlier in this chapter, for locking to be effective a transaction must hold all its locks until the transaction ends. Long transactions therefore tie up large portions of the database, cutting down on the amount of concurrent use the database can provide. Because all locks are released when a transaction terminates, shorter transactions maximize the number of users that can share the same database elements.
Note: Locking is essential for data consistency and integrity when multiple transactions are running concurrently. It is therefore not an option to remove the locking mechanism.
Programmers often wonder whether it is necessary to end a transaction quickly if all the transaction is doing is retrieving data. The answer is “yes,” because retrieval transactions lock database elements. Although they typically allow other transactions to view the locked data, they prevent update of that data. Therefore, you should always commit a retrieval transaction immediately after bringing the data from the database into a query result table, freeing the tables or views it used so that other users can modify them.
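A minimal sketch of that pattern, once more using the hypothetical volume table:

-- Declare the transaction read only so it never needs exclusive locks
SET TRANSACTION READ ONLY;

SELECT isbn, asking_price
FROM volume
WHERE sale_id IS NULL;

-- Release the shared locks as soon as the result table has been built
COMMIT;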
Note: There is rarely any reason to roll back a retrieval transaction. The undo process just takes up processing time without affecting the contents of the database.