Through executing transactions, you can manipulate the data in a database. In this chapter we’ll demonstrate how you can specify a transaction in a formal way. We won’t formally introduce any new concepts; we’ve already presented all the ingredients for specifying transactions.
The section “Formally Specifying Transactions” gradually develops how you can specify a transaction. You’ll learn that you can specify a transaction as a function over the database universe at hand. For every database state, this function returns another database state that reflects the intended changes of the transaction. In this section, we’ll also give two examples to familiarize you with this formal concept of a transaction.
The section “Example Transactions Over DB_UEX” provides example transactions for the database universe that was introduced in Chapter 7. For the second time in this book, you’ll see the use of SQL. Every example transaction will be accompanied by one or more equivalent SQL data manipulation language (DML) statements; that is, INSERT
, UPDATE
, and DELETE
statements. You’ll learn that expressing certain transactions in SQL requires executing more than one DML statement.
Note We assume you have a working knowledge of SQL DML statements. It is not the goal of this book to teach SQL.
In preparation for the next chapter (11) where you’ll be introduced to the challenges of implementing data integrity constraints using an SQL DBMS, we’ll also (somewhat) discuss which data integrity constraints a given transaction might violate. It is up to the DBMS to validate these involved data integrity constraints for the given transaction. The DBMS should reject the intended changes of the transaction if the resulting database state violates any of the involved constraints.
As usual, you’ll find a “Chapter Summary” at the end, followed by an “Exercises” section.
Next to data retrieval, transaction execution is the second most important application of a database. By executing transactions, you can maintain the database and ensure that its state remains a valid representation of the (changing) real world. In this section, we’ll demonstrate how you can formally specify transactions.
Let’s start with a simple example using the database design that was introduced in Chapter 7. Take a look at tuple tcrs1
. It represents a new course than needs to be inserted into the database:
tcrs1 := { (CODE;'AM4DP')
, (DESC;'Applied Mathematics for Database Professionals')
, (CAT;'DSG')
, (DUR;5) }
For your convenience, we repeat the definitions of characterization chr_CRS
, tuple universe tup_CRS
, and table universe tab_CRS
in Listing 10-1. All involved attribute, tuple, and table constraints with regards to the CRS table structure are in this listing.
chr_CRS :=
{ ( CODE; CRSCODE–TYP )
, ( DESCR; varchar(40) )
, ( CAT; { s | s∈varchar(3) ∧
s∈{'DSG','GEN','BLD'} } )
, ( DUR; { n | n∈number(2,0) ∧ 1 ≤ n ≤ 15 } )
}
tup_CRS :=
{ c | c∈Π(chr_CRS) ∧ c(CAT) = 'BLD' ⇒ t(DUR) ≤ 5
}
tab_CRS :=
{ C | C∈ ℘ (tup_CRS) ∧ (∀c1,c2∈C: c1(CODE) = c2(CODE) ⇒ c1 = c2)
}
Note tcrs1
does not violate any attribute or tuple constraints for the CRS
table structure; or put differently, tcrs1
is an element of tuple universe tup_CRS
.
Here is a first attempt to formally specify the transaction of inserting tcrs1
into the CRS
table structure. We could specify this transaction as a function, say Tx1a
, over DB_UEX
. For a given database state in DB_UEX
, say dbs
, function Tx1a
returns a database state that reflects the insertion of tcrs1
into the CRS
table structure.
Tx1a(dbs) := { (EMP; dbs(EMP) )
, (SREP; dbs(SREP))
, (MEMP; dbs(MEMP))
, (TERM; dbs(TERM))
, (DEPT; dbs(DEPT))
, (GRD; dbs(GRD) )
, (CRS; dbs(CRS) ∪ {tcrs1} )
, (OFFR; dbs(OFFR))
, (REG; dbs(REG) )
, (HIST; dbs(HIST)) }
As you can see, function Tx1a
yields another database state that holds the same ten table structures as dbs
. This resulting database state differs from dbs
only for the CRS
table structure; tuple tcrs1
has been added. For all other table structures, the resulting database state holds the corresponding tables that dbs
holds.
Let’s apply Tx1a
to the “empty database state” (that is, the database state in which all table structures hold the empty table). Here is the empty database state, which we’ll name db_empty
. We introduced you to this database state in Exercise 12 in Chapter 7.
db_empty := { (EMP; Ø)
, (SREP; Ø)
, (MEMP; Ø)
, (TERM; Ø)
, (DEPT; Ø)
, (GRD; Ø)
, (CRS; Ø)
, (OFFR; Ø)
, (REG; Ø)
, (HIST; Ø)}
In Exercise 12 in Chapter 7, you established that db_empty
is an element of DB_UEX
. Therefore, you can apply function Tx1a
to it. Applying function Tx1a
to db_empty
, denoted by Tx1a(db_empty)
, results in the following database state:
Tx1a(db_empty) = { (EMP; Ø)
, (SREP; Ø)
, (MEMP; Ø)
, (TERM; Ø)
, (DEPT; Ø)
, (GRD; Ø)
, (CRS; {tcrs1} )
, (OFFR; Ø)
, (REG; Ø)
, (HIST; Ø)}
Note that the application of Tx1a
to db_empty
results in a database state that is (again) an element of DB_UEX
; the resulting database state conforms to all static constraints that were specified as part of the definition of DB_UEX
. However, this isn’t true in general. Let’s take a look at applying Tx1a
to the following database state (dbs1
):
dbs1 :=
{ (EMP; Ø)
, (SREP; Ø)
, (MEMP; Ø)
, (TERM; Ø)
, (DEPT; Ø)
, (GRD; Ø)
, (CRS; { { (CODE;'AM4DP'), (DESC;'AM4DP workshop')
, (CAT;'DSG'), (DUR;1) } })
, (OFFR; Ø)
, (REG; Ø)
, (HIST; Ø)}
Applying transaction Tx1a
to database state dbs1
results in a database state that holds the following CRS
table:
{ { (CODE;'AM4DP'), (DESC;'AM4DP workshop'), (CAT;'DSG'), (DUR;1) }
, { (CODE;'AM4DP'), (DESC;'Applied Mathematics for Database Professionals'),
(CAT;'DSG'), (DUR;5) } }
This table clearly violates the table constraint that is specified as part of the definition of tab_CRS
, and is therefore not an admissible table for the CRS
table structure. Consequently, in this case, Tx1a(dbs1)
is not an element of DB_UEX
. Clearly the current definition of this transaction is flawed, for we don’t want a transaction to have the database end up in a state that violates any of the data integrity constraints.
A more proper definition of the transaction that inserts tcrs1
to the database would be this one, named Tx1b
. Note that in this definition we are reusing function Tx1a
, which was specified earlier in this section.
Tx1b(dbs) := dbs , if Tx1a(dbs)∉DB_UEX
:= Tx1a(dbs), otherwise
Transaction Tx1b
describes an insertion attempt. If the insertion of tcrs1
results in a database state that violates any of the data integrity constraints—that is, the resulting database state is not an element of DB_UEX
—then the insertion is refused (also referred to as rolled back), and the database state remains the same. Otherwise, the insertion of tcrs1
is allowed (the transaction executes successfully, or commits).
In this particular example, given a database state dbs
, the insertion attempt successfully executes only if the CRS
table in dbs
does not already contain a tuple where the code
attribute value equals 'AM4DP'
. However, if that tuple is in fact tcrs1
, the insertion attempt succeeds, but in this case does not add a tuple at all. Adding a tuple (through set union) to a table that already contains this tuple does not introduce a new element (tuple) to the table.
Thus far, we are only focusing on attribute, tuple, and table constraints in this discussion. When checking whether Tx1a(dbs)
is an element of DB_UEX
, you should also consider the database constraints that involve the CRS
table structure.
NoteYou can determine the involved database constraints for a transaction that inserts a tuple in CRS
by simply scanning for the string 'CRS'
in the formal specifications of the database constraints.
Here are the involved database constraints (using the short names introduced in Listing 7-39): PSSR6
, PTIJ9
, PTIJ10
, PTIJ12
, PTIJ14
, PTIJ15
, and PODC3
. In this particular example, inserting a new CRS
tuple into a valid database state—a database state in DB_UEX
—will never violate one of these database constraints (you might want to verify this for yourself). Therefore, we need not consider these involved database constraints in the discussion of this example.
Note The definition of Tx1b
does cover all static constraints: attribute, tuple, table, and database.
Last—this should not surprise you—the state transition constraints should be involved in the formal definition of a transaction.
Take a look at tuple toffr1
. It represents an offering for the AM4DP course that needs to be inserted into the database.
toffr1 := { (COURSE; 'AM4DP')
, (STARTS; '01-FEB-2007')
, (STATUS; 'CONF')
, (MAXCAP; 20)
, (TRAINER; 1126)
, (LOC; 'UTRECHT') }
This tuple conforms to all attribute and tuple constraints that are specified for the OFFR
table structure (see the definitions of chr_OFFR
and tup_OFFR
in Chapter 7).
Let’s assume you’d like to attempt to insert the offr1
tuple into the database whose begin state currently conforms to all static constraints specified in the definition of DB_UEX
. We’ll further assume that the begin database state for this transaction conforms to the following additional characteristics. In parentheses, we list involved constraints that cause us to list the characteristic.
CRS
holds a tuple that represents the AM4DP course (PSSR6
).OFFR
yet; this is the first one to be inserted (tab_OFFR
constraints PTIJ13
, PTIJ14
, PTIJ15
).EMP
holds a tuple that represents a trainer with employee number 1126
, who was hired before February 1, 2006. This trainer is still working for us; that is, he or she has not been terminated (PSSR8
, PTIJ11
, PTIJ12
, PODC7
).DEPT
holds a tuple that represents a department located in Utrecht. The aforementioned trainer is working in this department (PSSR7
, PODC3
).Three other static constraints involve OFFR
: PTIJ10
, PODC4
, and PODC5
. However, these three cannot be violated by a transaction that inserts a new CRS
tuple. They all involve the REG
table structure too. The second characteristic in the preceding bulleted list, together with the fact that the begin state must be a valid one (that is, contained in DB_UEX
), imply that there are no tuples in REG
. This in turn means that the database state conforms (and remains conformed after the insertion of offr1
) to PTIJ10
, PODC4
, and PODC5
.
Given the preceding characteristics, the transaction of inserting tuple toffr1
would successfully execute. However, note that toffr1
represents a confirmed offering. The definition of our state transition universe (ST_UEX
from Chapter 8) contains the specification of a state transition constraint stating that new offerings must start with status scheduled (constraint STC2
, in Listing 8-8). Clearly this transaction violates constraint STC2
.
A correct definition of the transaction that attempts to insert offr1
, say Tx2
, would be as follows.
For a database state dbs
in DB_UEX
, let
Tx2a(dbs) := { (EMP; dbs(EMP) )
, (SREP; dbs(SREP))
, (MEMP; dbs(MEMP))
, (TERM; dbs(TERM))
, (DEPT; dbs(DEPT))
, (GRD; dbs(GRD) )
, (CRS; dbs(CRS) )
, (OFFR; dbs(OFFR) ∪ {toffr1} )
, (REG; dbs(REG) )
, (HIST; dbs(HIST)) }
and let
Tx2b(dbs) := dbs , if Tx2a(dbs)∉DB_UEX
:= Tx2a(dbs), otherwise
then,
Tx2(dbs) := Tx2b(dbs), if (dbs,Tx2b(dbs))∈ST_UEX
:= dbs , otherwise
The definition of Tx2a
describes the insertion of toffr1
. Tx2b
ensures that this conforms to all static constraints. The definition of Tx2
ensures that the transaction conforms to all state transition constraints too. It does so by adding the condition that the ordered pair consisting of the begin state and the end state for the transaction is an element of the state transition universe.
Often a transaction modifies only a few of the table structures involved in a database design; many table structures remain unchanged. An alternative (and somewhat shorter) way of specifying Tx2a
, that more explicitly shows this fact, is as follows:
Tx2a(dbs) := dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,REG,HIST}
∪
{ (OFFR; dbs(OFFR) ∪ {toffr1}) }
Here we use function limitation (↓) to add all unchanged table structures to the end state, and through union (∪) we add the table structure(s) that the transaction modifies.
The next section will present some more example transactions over DB_UEX
to familiarize you further with formally specifying transactions.
This section will provide example transactions. For every transaction, we’ll present you with the following:
Along the way, you’ll see that SQL has limitations that cause you to execute more than one DML statement to implement a given transaction.
Listings 10-1 through 10-7 introduce the example transactions. All transactions will be named ETXi
(example transaction i
), where i
represents a sequence number. In the formal specifications given for each transaction, dbs
will represent the begin state for the transaction (an element of DB_UEX
). The various alternative formal expressions will be named FEc
(formal expression c
), where c
represents a character (a
, b
, c
, and so on).
In the SQL specifications, we’ll use the table structure names introduced by database characterization DB_CHREX
(see Listing 7-38) as the table names available for INSERT
, UPDATE
, and DELETE
statements. Alternative SQL specifications will be named SEr
(SQL expression r
), where r
represents a Roman numeral (I
, II
, III
, and so on).
In the following formal expressions, we’ll specify a transaction as a function similar to the way Tx1a
and Tx2a
(always resulting in a modified database state) were specified in the previous section. However, we intend—as discussed in the previous section—that all static and dynamic constraints should remain satisfied; that is, that the transaction rolls back whenever either the resulting database state is not an element of DB_UEX
, or the ordered pair representing the transition from the begin state to the end state is not an element of ST_UEX
.
In the previous section, you saw two examples of a transaction that inserts a single tuple. Take a look at the example in Listing 10-2. It specifies a transaction that inserts (potentially) multiple tuples. Listing 10-2 registers all administrators that have been hired in the past quarter (91 days) for the offering of course AM4DP that starts on March 1, 2007.
FEa: dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,HIST}
∪
{ (REG; dbs(REG)
∪
{ { (COURSE;'AM4DP' )
,(STARTS;'01-mar-2007')
,(STUD; e(EMPNO) )
,(EVAL; -1 ) }
| e∈dbs(EMP) ∧ e(JOB)='ADMIN' ∧ e(HIRED) ≥ sysdate-91
∧ e(HIRED) ≤ sysdate } ) }
SEI: insert into REG(STUD,EVAL,COURSE,STARTS)
select e.EMPNO, -1, 'AM4DP', '01-mar-2007'
from EMP e
where e.JOB = 'ADMIN'
and e.HIRED between sysdate - 91 and sysdate
This transaction only changes the REG
table structure; the tuples that result from a query on the EMP
table are added to it. In these tuples, attributes COURSE
, STARTS
, and EVAL
are set to values 'AM4DP'
, '01-mar-2007'
, and -1
, respectively. Attribute STUD
ranges over all employee numbers of administrators that were hired in the past 91 days.
In the formal specification, the order of the attribute-value pairs enumerated inside the query part of this specification does not matter. However, in the SQL expression, the order of the value expressions listed in the second line (the one starting with keyword SELECT
) must match the order of the attributes listed in the first line (following keyword INSERT
).
Note In SQL, embedded queries such as the one displayed in SEI
are often referred to as subqueries.
Finally, you should recognize that because the subquery in SEI
selects EMPNO
(the uniquely identifying attribute in the EMP
table structure), there is no need to include the distinct
keyword right after the select
keyword.
Another common transaction is the deletion of tuples from the database. Listing 10-3 gives an example that deletes registrations. Listing 10-3 deletes all registrations of student 3124
for scheduled or confirmed offerings that start in the future.
FEa: dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,HIST}
∪
{ (REG; { r | r∈dbs(REG) ∧
¬ ( r(STUD) = 3124 ∧
r(STARTS) > sysdate ∧
{ o(STATUS) | o∈dbs(OFFR) ∧
o↓{COURSE,STARTS} = r↓{COURSE,STARTS} }
∈{'SCHD','CONF'} )
})}
FEb: dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,HIST}
∪
{ (REG; { r↓{STUD,COURSE,STARTS,EVAL} | r∈dbs(REG)⊗dbs(OFFR) ∧
( r(STUD) ≠ 3124 ∨
r(STARTS) ≤ sysdate ∨
r(STATUS) = 'CANC' )
})}
FEc: dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,HIST}
∪
{ (REG; dbs(REG) -
{r|r∈dbs(REG) ∧ r(STUD) = 3124 ∧ r(STARTS) > sysdate ∧
{ o(STATUS) | o∈dbs(OFFR) ∧
o↓{COURSE,STARTS} = r↓{COURSE,STARTS} }
∈{'SCHD','CONF'}
})}
SEI: delete from REG r
where r.STUD = 3124
and r.STARTS > sysdate
and (select o.STATUS
from OFFR o
where o.COURSE = r.COURSE
and o.STARTS = r.STARTS) in ('SCHD','CONF') )
Specification FEa
reinitializes the REG
table structure with the result from a query that is based on dbs(REG)
. The query retrieves all registrations that should remain in the database state; that is, all tuples that do not represent a registration of student 3124
, for an offering in the future that has status confirmed or scheduled.
FEb
does something similar, only now the query is based on a join between REG
and OFFR
. Note that in this specification, compared to FEa
, we have rewritten the negation according to De Morgan. The query now retrieves all registrations that are either not for student 3124
, or that are for an offering that starts in the past (including today), or that have status canceled.
The way FEc
specifies this transaction might be the most intuitive. It specifies a table that contains all the registrations that need to be deleted, and then subtracts that—using the difference operator (−)—from dbs(REG)
. This specification is much like the way you specify this transaction using SQL (see SEI
).
As you’ll understand by now, in our formalism a transaction is specified by defining, for every possible begin state, what the end state for the transaction will be. This involves specifying a table for every table structure. In contrast, with SQL you only specify the change in data that the transaction achieves; you needn’t specify data that remains unchanged.
Right now you’re probably thinking that formally specifying transactions is rather tedious. However, you should realize that the SQL expression for the preceding transaction ETX2
is actually a shorthand notation. It specifies that the SQL table variable REG
is (re)assigned its begin state value minus the result set of a query that is essentially specified by the WHERE
clause of the DELETE
statement.
Note Using the mathematical methodology presented in this book, you can also develop formal shorthand expressions. Bert De Brock in his book Foundations of Semantic Databases (Prentice Hall, 1995) develops these for common types of transactions (such as deleting tuples from a table structure). However, because this involves introducing a few more somewhat complex mathematical concepts, we won’t develop formal shorthand in this book. We refer the interested reader to Chapter 7 of De Brock’s book.
There is an important point to be made with regards to the ways ETX2
is specified in Listing 10-2. They all involve the OFFR
table structure to determine if the offering has status scheduled or confirmed. In fact, this involvement is not required. Given constraint PODC6
(canceled offerings cannot have registrations), in conjunction with the attribute-value set of the STATUS
attribute in OFFR
(only three values are allowed: scheduled, confirmed, and canceled), you can deduce that if there is a registration for an offering, then that offering will either have status scheduled or confirmed. There is no need for you to specify this in the transaction.
Here are equivalent formal and SQL specifications for transaction ETX2
:
FEa: dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,HIST}
∪
{ (REG; { r | r∈dbs(REG) ∧ ¬ ( r(STUD) = 3124 ∧ r(STARTS) > sysdate ) } ) }
SEII: delete from REG r
where r.STUD = 3124
and r.STARTS > sysdate
You can consider these specifications to be more efficient; a DBMS will likely require fewer resources to execute the transaction if specified in this less complex way.
Rewriting formal specifications—be they transactions, queries, or even constraints—using knowledge of already established data integrity constraints, into less complex specifications, is referred to as semantic optimization.
Note A true relational DBMS should be capable of performing this semantic optimization automatically for us. Unfortunately, DBMSes that perform such sophisticated semantic optimizations are still unavailable. This is primarily because current DBMSes still have poor support for declaratively specified data integrity constraints. This poor declarative support is also the reason why this book has Chapter 11.
Another common transaction is updating tuples in a database. Listing 10-4 gives an example that updates the maximum capacity of certain offerings. It doubles the maximum capacity of all future offerings planned in Amsterdam.
FEa: dbs↓{SREP,MEMP,TERM,EMP,DEPT,GRD,CRS,REG,HIST
∪
{ (OFFR; { o | o∈dbs(OFFR) ∧ ¬(o(STARTS) ∧ sysdate ∧ o(LOC) = 'AMSTERDAM') }
∪
{o↓{COURSE,STARTS,STATUS,TRAINER,LOC}
∪{ (MAXCAP; 2 * o(MAXCAP)) }
|o∈dbs(OFFR) ∧ o(STARTS) > sysdate ∧ o(LOC) = 'AMSTERDAM' }
)}
SEI: update OFFR o
set o.MAXCAP = 2 * o.MAXCAP
where o.STARTS > sysdate
and o.LOC = 'AMSTERDAM'
Specification FEa
clearly demonstrates that updating a subset of tuples can be seen as first deleting these tuples and then reinserting them with updated attribute values. The first operand of the union operator, at the fourth line of the definition of FEa
, represents the OFFR
table from which future offerings in Amsterdam have been deleted. The second operand of this union operator represents the reinsertion of these offerings with a doubled maximum capacity.
Again, as shown by expression SEI
, SQL offers a convenient shortha nd to specify an update transaction.
We continue with another example update transaction. Transaction ETX4
in Listing 10-5 updates the salary of certain employees. Note that state transition constraint STC5
(see Listing 8-8) requires that updates of salary must be logged in the HIST
table structure; more precisely, the MSAL
values of the begin state need to be logged. Listing 10-5 increases the salary of all trainers working in a department at location Denver by 10 percent.
FEa: dbs↓{SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,REG}
∪
{ (HIST; dbs(HIST)
∪
{ { (EMPNO; e(EMPNO) )
,(UNTIL; sysdate )
,(DEPTNO;e(DEPTNO) )
,(MSAL; e(MSAL) ) }
| e∈dbs(EMP)⊗dbs(DEPT) ∧ e(JOB) = 'TRAINER' ∧ e(LOC) = 'DENVER' }
) }
∪
{ (EMP; { e | e∈dbs(EMP) ∧ ¬(e(JOB) = 'TRAINER' ∧
{ d(LOC) | d∈dbs(DEPT) ∧
d(DEPTNO) = e(DEPTNO) }
= 'DENVER') }
∪
{e↓{EMPNO,ENAME,BORN,JOB,HIRED,SGRADE,USERNAME,DEPTNO}
∪ { (MSAL; 1.1 * e(MSAL)) }
|e∈dbs(EMP) ∧ e(JOB) = 'TRAINER' ∧
{ d(LOC) | d∈dbs(DEPT) ∧ d(DEPTNO) = e(DEPTNO) }
= 'DENVER' }
)}
SEI: insert into HIST(EMPNO,UNTIL,DEPTNO,MSAL)
(select e.EMPNO, sysdate, e.DEPTNO, e.MSAL
from EMP e
where e.job ='TRAINER'
and 'DENVER' = (select d.LOC
from DEPT d
where d.DEPTNO = e.DEPTNO));
update EMP e
set e.MSAL = 1.1 * e.MSAL
where e.job ='TRAINER'
and 'DENVER' = (select d.LOC
from DEPT d
where d.DEPTNO = e.DEPTNO)
SEII: update EMP e
set e.MSAL = 1.1 * e.MSAL
where e.job ='TRAINER'
and 'DENVER' = (select d.LOC
from DEPT d
where d.DEPTNO = e.DEPTNO);
insert into HIST(EMPNO,UNTIL,DEPTNO,MSAL)
(select e.EMPNO, sysdate, e.DEPTNO, e.MSAL / 1.1
from EMP e
where e.job ='TRAINER'
and 'DENVER' = (select d.LOC
from DEPT d
where d.DEPTNO = e.DEPTNO))
Specification FEa
for transaction ETX4
assigns new values to both the HIST
and EMP
table structures, at the same time. It inserts into HIST
the results of a query based on a join between EMP
and DEPT
, thereby logging, as required by constraint STC5
, the current MSAL
and DEPTNO
values of the tuples that are about to be updated (loosely speaking). It also updates EMP
in the requested way. The specifications of these tables (the “new” HIST
table and the “new” EMP
table) both refer to dbs
, the begin state of the transaction.
In SQL, you cannot add rows to one table, and change rows of another table, using one statement. You are required to specify (separately) an INSERT
statement and an UPDATE
statement, and choose an order in which these two statements are to be serially executed. We have indicated the serial execution in SEI
and SEII
by separating the two DML statements with a semicolon.
Expressions SEI
and SEII
differ in the chosen order of the two statements. An SQL DBMS will provide an intermediate database state that already reflects the modifications made by the first DML statement to the second DML statement. This statement (and subqueries embedded within it) will “see” this intermediate database state.
Note This is often the default behavior of an SQL DBMS. We’ll have more to say on these intermediate database states in Chapter 11.
This side effect doesn’t occur in our formalism; FEa
references the begin state only. This side effect is why, in SEII
, the subquery retrieving the tuples that need to be logged (inserted) into the HIST
table structure needs to perform a division by 1.1
; it sees the modifications (increased salaries) resulting from the UPDATE
statement that executed first.
Let’s take a look at another update transaction. Listing 10-6 specifies the transaction of canceling a scheduled offering. Note that static constraint PODC6
(“Canceled offerings cannot have registrations;” see Listing 7-43) requires that all registrations for the offering (if any) must be deleted. Listing 10-6 cancels the offering for course J2EE that starts on February 14, 2007.
FEa: dbs↓{SREP,MEMP,TERM,DEPT,GRD,CRS,EMP,HIST}
∪
{ (REG; { r | r∈dbs(REG) ∧ (r(COURSE) ≠ 'J2EE' ∨ r(STARTS) ≠ '14-feb-2007') }
)}
∪
{(OFFR; { o | o∈dbs(OFFR) ∧ ¬ (o(COURSE) = 'J2EE' ∧
o(STARTS) = '14-feb-2007') }
∪
{ o↓{COURSE,STARTS,MAXCAP,TRAINER,LOC}
∪ { (STATUS; 'CANC') }
| o∈dbs(OFFR) ∧ o(COURSE) = 'J2EE' ∧ o(STARTS) = '14-feb-2007' }
)}
SEI: delete from REG r
where r.COURSE = 'J2EE'
and r.STARTS = '14-feb-2007';
update OFFR o
set o.STATUS = 'CANC'
where o.COURSE = 'J2EE'
and o.STARTS = '14-feb-2007'
SEII: update OFFR o
set o.STATUS = 'CANC'
where o.COURSE = 'J2EE'
and o.STARTS = '14-feb-2007';
delete from REG r
where r.COURSE = 'J2EE'
and r.STARTS = '14-feb-2007'
Formal expression FEa
should be straightforward by now; it specifies the deletion of zero or more registrations and an update of a single offering. SQL
expressions SEI
and SEII
again only differ in the order of execution of the required two DML statements to implement this transaction in an SQL DBMS.
In contrast with ETX4
, one of the alternative orders of execution has a prominent disadvantage. SEII
creates an intermediate database state that potentially violates a static constraint. However, the second DML statement will always correct the violation.
Suppose the offering of the J2EE course on Valentine’s Day already had at least one registration. Then, the first DML statement—setting the status of the offering to canceled—will create a database state that violates constraint PODC6
. The second DML statement—the deletion of corresponding registrations—will repair this; it modifies the intermediate database state into one that conforms to PODC6
(it “fixes the violation”).
This state of affairs in alternative SEII
is very risky. If the intermediate database state (that violates a constraint) is queried by other application code, executing within the same transaction, then results of these queries might be incorrect. Or, maybe even worse, subqueries inside the second DML statement can produce false results.
Note We assume here that application code executing within other transactions can never see these intermediate database states. This is indeed the case in Oracle’s SQL DBMS (we’ll come back to this in Chapter 11 when we discuss the transaction isolation mode offered by Oracle).
For instance, say that you want to query the number of scheduled or confirmed offerings that have at least one registration. Let’s call this query EXQ11
. Here is a formal specification for EXQ11
:
EXQ11(dbs) = #{ o | o∈dbs(OFFR) ∧ o(STATUS)∈{'SCHD','CONF'} ∧
(∃r∈dbs(REG): r↓{COURSE,STARTS} = o↓{COURSE,STARTS}) }
In SQL this query might look like this:
select count(*)
from OFFR o
where o.STATUS in ('SCHD','CONF')
and exists (select r.*
from REG r
where r.COURSE = o.COURSE
and r.STARTS = o.STARTS)
The application developer might have been very smart and semantically optimized the query into the following:
EXQ11(dbs) = #{ r↓{COURSE,STARTS} | r∈dbs(REG) }
This translates to the following query in SQL:
select count(*)
from (select distinct r.COURSE, r.STARTS
from REG r)
Here the developer has used constraint PODC6
, which implies that if there is a registration for some offering, then this offering cannot have status canceled. So, by counting distinct (offering) foreign key values in REG
, you are effectively counting offerings that have at least one registration and have status scheduled or confirmed.
If this semantically rewritten query would execute in the middle of SEII
’s version of transaction ETX5
, then obviously the result would be one too many; it also counts the canceled offering whose registrations are about to be—but haven’t been yet—deleted.
We’ll have more to say about this phenomenon in Chapter 11.
Let’s give two more examples of transactions. Listing 10-7 defines an update transaction that involves subqueries to determine new attribute values. Listing 10-7 updates the commission of sales reps. For every sales rep, increase the commission by 2 percent of the average monthly salaries of all sales reps (including the one being updated) that work in the same department as the one being updated.
FEa: dbs↓{MEMP,TERM,DEPT,GRD,REG,OFFR,CRS,EMP,HIST}
∪
{ (SREP; { s↓{EMPNO,TARGET}
∪ { (COMM; 1.02 *
(AVG e1∈{e|e∈dbs(EMP) ∧
e(DEPTNO)∈{ e2(DEPTNO) | e2∈dbs(EMP) ∧
e2(EMPNO) = s(EMPNO) } ∧
e(JOB) = 'SALESREP' }
: e1(MSAL))
)}
|s∈dbs(SREP) }
)}
SEI: update SREP s
set COMM = (select 1.02 * avg(e1.MSAL)
from EMP e1
where e1.DEPTNO = (select e2.DEPTNO
from EMP e2
where e2.EMPNO = s.EMPNO)
and e1.JOB = 'SALESREP')
Note that, in contrast with prior update transaction examples, this is an unrestricted update; all rows of SREP
are modified.
For your convenience, here is the structure of the AVG
operator as we have formally defined it in Chapter 5:
(AVG x∈S: f(x))
For every element x
that can be chosen from set S
, the average operator evaluates expression f(x)
and computes the average of all such evaluations. In expression FEa
, this operator is used as follows:
(AVG e1∈{ all salesreps working in same department as salesrep s(EMPNO) }
: e1(MSAL))
As you can see in expression SEI
, SQL allows you to specify subqueries within the set clause of an UPDATE
statement. However, note that in these cases, the subquery should always retrieve exactly one row with just one column.
Listing 10-8 defines a DELETE
transaction; salary grades that are currently not “used” are deleted from GRD
. The listing deletes salary grades that have no employee assigned to them.
FEa: dbs↓{SREP,MEMP,TERM,DEPT,REG,OFFR,CRS,EMP,HIST}
∪
{ (GRD; { g | g∈dbs(GRD) ∧ ¬(∃e∈dbs(EMP): e(SGRADE) = g(GRADE) } ) }
SEI: delete from GRD g
where not exists (select e.*
from EMP e
where e.SGRADE = g.GRADE)
Despite the “not exists” restriction, transaction ETX7
is still likely to fail, given the last table constraint specified in the definition of table universe tab_GRD
(see Listing 7-31). In fact, it would only successfully execute when either the lowest (and zero or more of its direct successors), or the highest (and zero or more of its direct predecessors) salary grades are the only ones that are deleted. In all other cases, the table constraint would be violated and thus the transaction rolled back.
There is an exercise at the end of this chapter for you to specify an UPDATE
statement that fixes such a violation.
This section provides a summary of this chapter, formatted as a bulleted list. You can use it to check your understanding of the various concepts introduced in this chapter before continuing with the exercises in the next section.
ETX1
. Also discuss the properties that the begin state should have for transaction ETX1
to execute successfully.ETX2
might violate constraint PODC4
; if the removal of a registration for student 3124
brings the number of registrations for a confirmed offering below six, then this constraint is violated. Amend the specification (both formal and SQL) for ETX2
, such that as many as possible registrations for student 3124
are deleted, without violating constraint PODC4
.UPDATE
statement of transaction ETX4
.ETX7
creates a database state that indeed violates the discussed table constraint. Choose a strategy to fix this by updating the remaining salary grades. Formally specify your strategy and supply an SQL UPDATE
statement for it.40
by 5 percent. If the update requires the modification of the salary grade, then modify the salary grade too.18.116.67.70