• The definition of a transaction
• The commands used to control transactions
• The syntax and examples of transaction commands
• When to use transactional commands
• The consequences of poor transactional control
In this hour, you learn the concepts behind the management of database transactions.
A transaction is a unit of work that is performed against a database. Transactions are units or sequences of work accomplished in a logical order, whether in a manual fashion by a user or automatically by some sort of a database program. In a relational database using SQL, transactions are accomplished using the Data Manipulation Language (DML) commands that were discussed during Hour 5, “Manipulating Data,” (INSERT
, UPDATE
, and DELETE
). A transaction is the propagation of one or more changes to the database. For instance, you are performing a transaction if you perform an UPDATE
statement on a table to change an individual’s name.
A transaction can either be one DML statement or a group of statements. When managing transactions, each designated transaction (group of DML statements) must be successful as one entity, or none of them will be successful.
The following list describes the nature of transactions:
• All transactions have a beginning and an end.
• A transaction can be saved or undone.
• If a transaction fails in the middle, no part of the transaction can be saved to the database.
Starting or executing transactions is implementation specific. You must check your particular implementation for how to begin transactions.
Transactional control is the capability to manage various transactions that might occur within a relational database management system (RDBMS). When you speak of transactions, you are referring to the INSERT
, UPDATE
, and DELETE
commands, which were covered during the previous hour.
When a transaction is executed and completes successfully, the target table is not immediately changed, although it might appear so according to the output. When a transaction successfully completes, transactional control commands are used to finalize the transaction, either saving the changes made by the transaction to the database or reversing the changes made by the transaction.
Three commands are used to control transactions:
• COMMIT
• ROLLBACK
• SAVEPOINT
Each of these is discussed in detail in the following sections.
Transactional control commands are only used with the DML commands INSERT
, UPDATE
, and DELETE
. For example, you do not issue a COMMIT
statement after creating a table. When the table is created, it is automatically committed to the database. Likewise, you cannot issue a ROLLBACK
statement to replenish a table that was just dropped. Also, there are other commands such as TRUNCATE
that are not logged and cannot be recovered from. So please make sure you check your RDBMS’s documentation before executing new commands to ensure you understand how they implement their transaction support.
When a transaction has completed, the transactional information is stored either in an allocated area or in a temporary rollback area in the database. All changes are held in this temporary rollback area until a transactional control command is issued. When a transactional control command is issued, changes are either made to the database or discarded; then the temporary rollback area is emptied. Figure 6.1 illustrates how changes are applied to a relational database.
The COMMIT
command is the transactional command used to save changes invoked by a transaction to the database. The COMMIT
command saves all transactions to the database since the last COMMIT
or ROLLBACK
command.
The syntax for this command is
commit [ work ];
The keyword COMMIT
is the only mandatory part of the syntax, along with the character or command that terminates a statement according to each implementation. WORK
is a keyword that is completely optional; its only purpose is to make the command more user-friendly.
In the following example, you begin by selecting all data from the PRODUCT_TMP
table:
SELECT * FROM PRODUCTS_TMP;
PROD_ID PROD_DESC COST
-----------------------------------------------
11235 WITCH COSTUME 29.99
222 PLASTIC PUMPKIN 18 INCH 7.75
13 FALSE PARAFFIN TEETH 1.1
90 LIGHTED LANTERNS 14.5
15 ASSORTED COSTUMES 10
9 CANDY CORN 1.35
6 PUMPKIN CANDY 1.45
87 PLASTIC SPIDERS 1.05
119 ASSORTED MASKS 4.95
1234 KEY CHAIN 5.95
2345 OAK BOOKSHELF 59.99
11 rows selected.
Next, you delete all records from the table where the product cost is less than $14.00.
DELETE FROM PRODUCTS_TMP
WHERE COST < 14;
8 rows deleted.
A COMMIT
statement is issued to save the changes to the database, completing the transaction.
COMMIT;
Commit complete.
Frequent COMMIT
statements in large loads or unloads of the database are highly recommended; however, too many COMMIT
statements cause the job to take a lot of extra time to complete. Remember that all changes are sent to the temporary rollback area first. If this temporary rollback area runs out of space and cannot store information about changes made to the database, the database will probably halt, disallowing further transactional activity.
You should realize that when an UPDATE
, INSERT
, or DELETE
is issued, most RDBMSs are using a form of transaction in the background so that if the query is cancelled or runs into an error, changes are not committed. So issuing a transaction is more of an action to ensure that a set of transactions are run as what is commonly referred to as a unit of work. So in a real-world example, you might be processing a bank transaction at an ATM with a client wanting to withdraw money. In such a situation, you need to both insert a transaction for the money being withdrawn as well as update the client’s balance to reflect the new total. Obviously, we would want either both of these statements to be successful or both of them to fail. Otherwise, our system’s data integrity is compromised. So in this instance, you would wrap your unit of work in a transaction to ensure that you could control the outcome of both statements.
In some implementations, transactions are committed without issuing the COMMIT
command—instead, merely signing out of the database causes a commit to occur. However, in some implementations, such as MySQL, after you perform a SET TRANSACTION
command, the auto-commit functionality does not resume until it has received a COMMIT
or ROLLBACK
statement. Additionally, in other implementations such as Microsoft SQL Server, statements are auto-committed unless a transaction is specifically used. So ensure that you check the documentation of your particular RDBMS to understand exactly how transactions and committing of statements are handled.
The ROLLBACK
command is the transactional control command that undoes transactions that have not already been saved to the database. You can only use the ROLLBACK
command to undo transactions since the last COMMIT
or ROLLBACK
command was issued.
The syntax for the ROLLBACK
command is as follows:
rollback [ work ];
Once again, as in the COMMIT
statement, the WORK
keyword is an optional part of the ROLLBACK
syntax.
In the following example, you begin by selecting all records from the PRODUCTS_TMP
table since the previous deletion of 14 records:
SELECT * FROM PRODUCTS_TMP;
PROD_ID PROD_DESC COST
-----------------------------------------------
11235 WITCH COSTUME 29.99
90 LIGHTED LANTERNS 14.5
2345 OAK BOOKSHELF 59.99
3 rows selected.
Next, you update the table, changing the product cost to $39.99 for the product identification number 11235:
update products_tmp
set cost = 39.99
where prod_id = '11235';
1 row updated.
If you perform a quick query on the table, the change appears to have occurred:
select * from products_tmp;
PROD_ID PROD_DESC COST
-----------------------------------------------
11235 WITCH COSTUME 39.99
90 LIGHTED LANTERNS 14.5
2345 OAK BOOKSHELF 59.99
3 rows selected.
Now issue the ROLLBACK
statement to undo the last change:
rollback;
Rollback complete.
Finally, verify that the change was not committed to the database:
select * from products_tmp;
PROD_ID PROD_DESC COST
-----------------------------------------------
11235 WITCH COSTUME 29.99
90 LIGHTED LANTERNS 14.5
2345 OAK BOOKSHELF 59.99
3 rows selected
A savepoint is a point in a transaction where you can roll the transaction back to this point without rolling back the entire transaction.
The syntax for the SAVEPOINT
command is
savepoint savepoint_name
This command serves only to create a savepoint among transactional statements. The ROLLBACK
command undoes a group of transactions. The savepoint is a way of managing transactions by breaking large numbers of transactions into smaller, more manageable groups.
Microsoft SQL Server uses a slightly different syntax. In SQL Server, you would use the statement SAVE TRANSACTION
instead of SAVEPOINT
, as is shown in the statement that follows:
save transaction savepoint_name
Otherwise, the procedure works exactly as the other implementations.
The syntax for rolling back to a savepoint is as follows:
ROLLBACK TO SAVEPOINT_NAME;
In this example, you are going to delete the remaining three records from the PRODUCTS_TMP
table. You want to issue a SAVEPOINT
command before each delete, so you can issue a ROLLBACK
command to any savepoint at any time to return the appropriate data to its original state:
savepoint sp1;
Savepoint created.
delete from products_tmp where prod_id = '11235';
1 row deleted.
savepoint sp2;
Savepoint created.
delete from products_tmp where prod_id = '90';
1 row deleted.
savepoint sp3;
Savepoint created.
delete from products_tmp where prod_id = '2345';
1 row deleted.
The SAVEPOINT name must be unique to the associated group of transactions. However, it can have the same name as a table or other object. Refer to specific implementation documentation for more details on naming conventions. Otherwise, savepoint names are a matter of personal preference and are used only by the database application developer to manage groups of transactions.
Now that the three deletions have taken place, let’s say you have changed your mind and decided to issue a ROLLBACK
command to the savepoint that you identified as SP2
. Because SP2
was created after the first deletion, the last two deletions are undone:
rollback to sp2;
Rollback complete.
Notice that only the first deletion took place because you rolled back to SP2
:
select * from products_tmp;
PROD_ID PROD_DESC COST
-----------------------------------------------
90 LIGHTED LANTERNS 14.5
2345 OAK BOOKSHELF 59.99
2 rows selected.
Remember, the ROLLBACK
command by itself rolls back to the last COMMIT
or ROLLBACK
statement. You have not yet issued a COMMIT
, so all deletions are undone, as in the following example:
rollback;
Rollback complete.
select * from products_tmp;
PROD_ID PROD_DESC COST
-----------------------------------------------
11235 WITCH COSTUME 29.99
90 LIGHTED LANTERNS 14.5
2345 OAK BOOKSHELF 59.99
3 rows selected.
The RELEASE SAVEPOINT
command removes a savepoint that you have created. After a savepoint has been released, you can no longer use the ROLLBACK
command to undo transactions performed since the savepoint. You might want to issue a RELEASE SAVEPOINT
command to avoid the accidental rollback to a savepoint that is no longer needed.
RELEASE SAVEPOINT savepoint_name;
Microsoft SQL Server does not support the RELEASE SAVEPOINT
syntax; instead, all SAVEPOINTs
are released when the transaction is completed. This is either by the COMMIT
or the ROLLBACK
of the transaction. Remember this point when you are structuring your transactions within your environment.
You can use the SET TRANSACTION
command to initiate a database transaction. This command specifies characteristics for the transaction that follows. For example, you can specify a transaction to be read-only or read/write:
SET TRANSACTION READ WRITE;
SET TRANSACTION READ ONLY;
READ WRITE
is used for transactions that are allowed to query and manipulate data in the database. READ ONLY
is used for transactions that require query-only access. READ ONLY
is useful for report generation and for increasing the speed at which transactions are accomplished. If a transaction is READ WRITE
, the database must create locks on database objects to maintain data integrity if multiple transactions are happening concurrently. If a transaction is READ ONLY
, no locks are established by the database, thereby improving transaction performance.
Other characteristics can be set for a transaction, but these are out of the scope of this book. MySQL supports this syntax for setting an isolation level for the transaction but in slightly different syntax. For more information, see the documentation for your implementation of SQL.
Poor transactional control can hurt database performance and even bring the database to a halt. Repeated poor database performance might be due to a lack of transactional control during large inserts, updates, or deletes. Large batch processes also cause temporary storage for rollback information to grow until either a COMMIT
or a ROLLBACK
command is issued.
When a COMMIT
is issued, rollback transactional information is written to the target table, and the rollback information in temporary storage is cleared. When a ROLLBACK
is issued, no changes are made to the database and the rollback information in the temporary storage is cleared. If neither a COMMIT
nor ROLLBACK
is issued, the temporary storage for rollback information continues to grow until there is no more space left, thus forcing the database to stop all processes until space is freed. Although space usage is ultimately controlled by the database administrator (DBA), a lack of transactional control can still cause database processing to stop, sometimes forcing the DBA to take action that might consist of killing running user processes.
During this hour, you learned the preliminary concepts of transactional management through the use of three transactional control commands: COMMIT
, ROLLBACK
, and SAVEPOINT
. You use COMMIT
to save a transaction to the database. You use ROLLBACK
to undo a transaction you performed. You use SAVEPOINT
to break a transaction or transactions into groups, which allows you to roll back to specific logical points in transaction processing.
Remember that you should frequently use the COMMIT
and ROLLBACK
commands when running large transactional jobs to keep space free in the database. Also, keep in mind that these transactional commands are used only with the three DML commands (INSERT
, UPDATE
, and DELETE
).
Q. Is it necessary to issue a commit after every INSERT
statement?
A. No, absolutely not. If you were inserting a few hundred thousand rows into a table, a COMMIT
would be recommended every 5,000–10,000 rows, depending on the size of the temporary rollback area. (Seek the advice of your database administrator.) Remember that the database might freeze up or not function properly when the rollback area fills up.
Q. How does the ROLLBACK
command undo a transaction?
A. The ROLLBACK
command clears all changes from the rollback area.
Q. If I issue a transaction and 99% of the transaction completes but the other 1% errs, will I be able to redo only the error part?
A. No, the entire transaction must succeed; otherwise, data integrity is compromised.
Q. A transaction is permanent after I issue a COMMIT
, but can’t I change data with an UPDATE
command?
A. The word permanent used in this matter means that it is now a part of the database. You can always use the UPDATE
statement to make modifications or corrections to the data.
The following workshop is composed of a series of quiz questions and practical exercises. The quiz questions are designed to test your overall understanding of the current material. The practical exercises are intended to afford you the opportunity to apply the concepts discussed during the current hour, as well as build upon the knowledge acquired in previous hours of study. Please take time to complete the quiz questions and exercises before continuing. Refer to Appendix C, “Answers to Quizzes and Exercises,” for answers.
1. True or false: If you have committed several transactions, have several more transactions that have not been committed, and issue a ROLLBACK command, all your transactions for the same session are undone.
2. True or false: A SAVEPOINT
command actually saves transactions after a specified number of transactions have executed.
3. Briefly describe the purpose of each one of the following commands: COMMIT
, ROLLBACK, and SAVEPOINT
.
4. What are some differences in the implementation of transactions in Microsoft SQL Server?
5. What are some performance implications when using transactions?
1. Take the following transactions and create a SAVEPOINT
or a SAVE TRANSACTION
command after the first three transactions. Then create a ROLLBACK
statement for your savepoint at the end. Try to determine what the CUSTOMER_TBL
will look like after you are done.
INSERT INTO CUSTOMER_TBL VALUES(615,'FRED WOLF','109 MEMORY LANE','PLAINFIELD','IN',46113,'3175555555',NULL);
INSERT INTO CUSTOMER_TBL VALUES(559,'RITA THOMPSON','125 PEACHTREE','INDIANAPOLIS','IN',46248,'3171111111',NULL);
INSERT INTO CUSTOMER_TBL VALUES(715,'BOB DIGGLER','1102 HUNTINGTON ST','SHELBY','IN',41234,'3172222222',NULL);
UPDATE CUSTOMER_TBL SET CUST_NAME='FRED WOLF' WHERE CUST_ID='559';
UPDATE CUSTOMER_TBL SET CUST_ADDRESS='APT C 4556 WATERWAY' WHERE CUST_ID='615';
UPDATE CUSTOMER_TBL SET CUST_CITY='CHICAGO' WHERE CUST_ID='715';
2. Take the following group of transactions and create a savepoinpt after the first three transactions.
Then place a COMMIT
statement at the end, followed by a ROLLBACK
statement to your savepoint. What do you think should happen?
UPDATE CUSTOMER_TBL SET CUST_NAME='FRED WOLF' WHERE CUST_ID='559';
UPDATE CUSTOMER_TBL SET CUST_ADDRESS='APT C 4556 WATERWAY' WHERE CUST_ID='615';
UPDATE CUSTOMER_TBL SET CUST_CITY='CHICAGO' WHERE CUST_ID='715';
DELETE FROM CUSTOMER_TBL WHERE CUST_ID='615';
DELETE FROM CUSTOMER_TBL WHERE CUST_ID='559';
DELETE FROM CUSTOMER_TBL WHERE CUST_ID='615';
3.137.212.212