images

Through executing transactions, you can manipulate the data in a database. In this chapter we’ll demonstrate how you can specify a transaction in a formal way. We won’t formally introduce any new concepts; we’ve already presented all the ingredients for specifying transactions.

The section “Formally Specifying Transactions” gradually develops how you can specify a transaction. You’ll learn that you can specify a transaction as a function over the database universe at hand. For every database state, this function returns another database state that reflects the intended changes of the transaction. In this section, we’ll also give two examples to familiarize you with this formal concept of a transaction.

The section “Example Transactions Over DB_UEX” provides example transactions for the database universe that was introduced in Chapter 7. For the second time in this book, you’ll see the use of SQL. Every example transaction will be accompanied by one or more equivalent SQL data manipulation language (DML) statements; that is, INSERT, UPDATE, and DELETE statements. You’ll learn that expressing certain transactions in SQL requires executing more than one DML statement.

imagesNote   We assume you have a working knowledge of SQL DML statements. It is not the goal of this book to teach SQL.

In preparation for the next chapter (11) where you’ll be introduced to the challenges of implementing data integrity constraints using an SQL DBMS, we’ll also (somewhat) discuss which data integrity constraints a given transaction might violate. It is up to the DBMS to validate these involved data integrity constraints for the given transaction. The DBMS should reject the intended changes of the transaction if the resulting database state violates any of the involved constraints.

As usual, you’ll find a “Chapter Summary” at the end, followed by an “Exercises” section.

Formally Specifying Transactions

Next to data retrieval, transaction execution is the second most important application of a database. By executing transactions, you can maintain the database and ensure that its state remains a valid representation of the (changing) real world. In this section, we’ll demonstrate how you can formally specify transactions.

Let’s start with a simple example using the database design that was introduced in Chapter 7. Take a look at tuple tcrs1. It represents a new course than needs to be inserted into the database:

tcrs1 := { (CODE;'AM4DP')
         , (DESC;'Applied Mathematics for Database Professionals')
         , (CAT;'DSG')
         , (DUR;5) }

For your convenience, we repeat the definitions of characterization chr_CRS, tuple universe tup_CRS, and table universe tab_CRS in Listing 10-1. All involved attribute, tuple, and table constraints with regards to the CRS table structure are in this listing.

Listing 10-1. Definitions of chr_CRS, tup_CRS, and tab_CRS

chr_CRS :=
{ ( CODE;  CRSCODE–TYP                       )
, ( DESCR; varchar(40)                       )
, ( CAT;   { s | s∈varchar(3) ∧
                 s∈{'DSG','GEN','BLD'} }      )
, ( DUR;   { n | n∈number(2,0) ∧ 1 ≤ n ≤ 15 } )
}

tup_CRS :=
{ c | c∈Π(chr_CRS) ∧ c(CAT) = 'BLD' ⇒ t(DUR) ≤ 5
}

tab_CRS :=
{ C | C∈ ℘ (tup_CRS) ∧ (∀c1,c2∈C: c1(CODE) = c2(CODE) ⇒ c1 = c2)
}

imagesNote   tcrs1 does not violate any attribute or tuple constraints for the CRS table structure; or put differently, tcrs1 is an element of tuple universe tup_CRS.

Here is a first attempt to formally specify the transaction of inserting tcrs1 into the CRS table structure. We could specify this transaction as a function, say Tx1a, over DB_UEX. For a given database state in DB_UEX, say dbs, function Tx1a returns a database state that reflects the insertion of tcrs1 into the CRS table structure.

Tx1a(dbs) := { (EMP;  dbs(EMP) )
             , (SREP; dbs(SREP))
             , (MEMP; dbs(MEMP))
             , (TERM; dbs(TERM))
             , (DEPT; dbs(DEPT))
             , (GRD;  dbs(GRD) )
             , (CRS;  dbs(CRS) ∪ {tcrs1} )
             , (OFFR; dbs(OFFR))
             , (REG;  dbs(REG) )
             , (HIST; dbs(HIST)) }

As you can see, function Tx1a yields another database state that holds the same ten table structures as dbs. This resulting database state differs from dbs only for the CRS table structure; tuple tcrs1 has been added. For all other table structures, the resulting database state holds the corresponding tables that dbs holds.

Let’s apply Tx1a to the “empty database state” (that is, the database state in which all table structures hold the empty table). Here is the empty database state, which we’ll name db_empty. We introduced you to this database state in Exercise 12 in Chapter 7.

db_empty := { (EMP;  Ø)
            , (SREP; Ø)
            , (MEMP; Ø)
            , (TERM; Ø)
            , (DEPT; Ø)
            , (GRD;  Ø)
            , (CRS;  Ø)
            , (OFFR; Ø)
            , (REG;  Ø)
            , (HIST; Ø)}

In Exercise 12 in Chapter 7, you established that db_empty is an element of DB_UEX. Therefore, you can apply function Tx1a to it. Applying function Tx1a to db_empty, denoted by Tx1a(db_empty), results in the following database state:

Tx1a(db_empty) = { (EMP;  Ø)
                 , (SREP; Ø)
                 , (MEMP; Ø)
                 , (TERM; Ø)
                 , (DEPT; Ø)
                 , (GRD;  Ø)
                 , (CRS;  {tcrs1} )
                 , (OFFR; Ø)
                 , (REG;  Ø)
                 , (HIST; Ø)}

Note that the application of Tx1a to db_empty results in a database state that is (again) an element of DB_UEX; the resulting database state conforms to all static constraints that were specified as part of the definition of DB_UEX. However, this isn’t true in general. Let’s take a look at applying Tx1a to the following database state (dbs1):

dbs1 :=
 { (EMP;  Ø)
 , (SREP; Ø)
 , (MEMP; Ø)
 , (TERM; Ø)
 , (DEPT; Ø)
 , (GRD;  Ø)
 , (CRS; { { (CODE;'AM4DP'), (DESC;'AM4DP workshop')
           , (CAT;'DSG'), (DUR;1) } })
 , (OFFR; Ø)
 , (REG;  Ø)
 , (HIST; Ø)}

Applying transaction Tx1a to database state dbs1 results in a database state that holds the following CRS table:

{ { (CODE;'AM4DP'), (DESC;'AM4DP workshop'), (CAT;'DSG'), (DUR;1) }
, { (CODE;'AM4DP'), (DESC;'Applied Mathematics for Database Professionals'),
                                             (CAT;'DSG'), (DUR;5) } }

This table clearly violates the table constraint that is specified as part of the definition of tab_CRS, and is therefore not an admissible table for the CRS table structure. Consequently, in this case, Tx1a(dbs1) is not an element of DB_UEX. Clearly the current definition of this transaction is flawed, for we don’t want a transaction to have the database end up in a state that violates any of the data integrity constraints.

A more proper definition of the transaction that inserts tcrs1 to the database would be this one, named Tx1b. Note that in this definition we are reusing function Tx1a, which was specified earlier in this section.

Tx1b(dbs) := dbs      , if Tx1a(dbs)∉DB_UEX
          := Tx1a(dbs), otherwise

Transaction Tx1b describes an insertion attempt. If the insertion of tcrs1 results in a database state that violates any of the data integrity constraints—that is, the resulting database state is not an element of DB_UEX—then the insertion is refused (also referred to as rolled back), and the database state remains the same. Otherwise, the insertion of tcrs1 is allowed (the transaction executes successfully, or commits).

In this particular example, given a database state dbs, the insertion attempt successfully executes only if the CRS table in dbs does not already contain a tuple where the code attribute value equals 'AM4DP'. However, if that tuple is in fact tcrs1, the insertion attempt succeeds, but in this case does not add a tuple at all. Adding a tuple (through set union) to a table that already contains this tuple does not introduce a new element (tuple) to the table.

Thus far, we are only focusing on attribute, tuple, and table constraints in this discussion. When checking whether Tx1a(dbs) is an element of DB_UEX, you should also consider the database constraints that involve the CRS table structure.

imagesNoteYou can determine the involved database constraints for a transaction that inserts a tuple in CRS by simply scanning for the string 'CRS' in the formal specifications of the database constraints.

Here are the involved database constraints (using the short names introduced in Listing 7-39): PSSR6, PTIJ9, PTIJ10, PTIJ12, PTIJ14, PTIJ15, and PODC3. In this particular example, inserting a new CRS tuple into a valid database state—a database state in DB_UEX—will never violate one of these database constraints (you might want to verify this for yourself). Therefore, we need not consider these involved database constraints in the discussion of this example.

imagesNote  The definition of Tx1b does cover all static constraints: attribute, tuple, table, and database.

Last—this should not surprise you—the state transition constraints should be involved in the formal definition of a transaction.

Take a look at tuple toffr1. It represents an offering for the AM4DP course that needs to be inserted into the database.

toffr1 := { (COURSE;  'AM4DP')
          , (STARTS;  '01-FEB-2007')
          , (STATUS;  'CONF')
          , (MAXCAP;  20)
          , (TRAINER; 1126)
          , (LOC;     'UTRECHT') }

This tuple conforms to all attribute and tuple constraints that are specified for the OFFR table structure (see the definitions of chr_OFFR and tup_OFFR in Chapter 7).

Let’s assume you’d like to attempt to insert the offr1 tuple into the database whose begin state currently conforms to all static constraints specified in the definition of DB_UEX. We’ll further assume that the begin database state for this transaction conforms to the following additional characteristics. In parentheses, we list involved constraints that cause us to list the characteristic.

  • CRS holds a tuple that represents the AM4DP course (PSSR6).
  • No offerings are in OFFR yet; this is the first one to be inserted (tab_OFFR constraints PTIJ13, PTIJ14, PTIJ15).
  • EMP holds a tuple that represents a trainer with employee number 1126, who was hired before February 1, 2006. This trainer is still working for us; that is, he or she has not been terminated (PSSR8, PTIJ11, PTIJ12, PODC7).
  • DEPT holds a tuple that represents a department located in Utrecht. The aforementioned trainer is working in this department (PSSR7, PODC3).

Three other static constraints involve OFFR: PTIJ10, PODC4, and PODC5. However, these three cannot be violated by a transaction that inserts a new CRS tuple. They all involve the REG table structure too. The second characteristic in the preceding bulleted list, together with the fact that the begin state must be a valid one (that is, contained in DB_UEX), imply that there are no tuples in REG. This in turn means that the database state conforms (and remains conformed after the insertion of offr1) to PTIJ10, PODC4, and PODC5.

Given the preceding characteristics, the transaction of inserting tuple toffr1 would successfully execute. However, note that toffr1 represents a confirmed offering. The definition of our state transition universe (ST_UEX from Chapter 8) contains the specification of a state transition constraint stating that new offerings must start with status scheduled (constraint STC2, in Listing 8-8). Clearly this transaction violates constraint STC2.

A correct definition of the transaction that attempts to insert offr1, say Tx2, would be as follows.

For a database state dbs in DB_UEX, let

Tx2a(dbs) := { (EMP;  dbs(EMP) )
             , (SREP; dbs(SREP))
             , (MEMP; dbs(MEMP))
             , (TERM; dbs(TERM))
             , (DEPT; dbs(DEPT))
             , (GRD;  dbs(GRD) )
             , (CRS;  dbs(CRS) )
             , (OFFR; dbs(OFFR) ∪ {toffr1} )
             , (REG;  dbs(REG) )
             , (HIST; dbs(HIST)) }

and let

Tx2b(dbs) := dbs      , if Tx2a(dbs)∉DB_UEX
          := Tx2a(dbs), otherwise

then,

Tx2(dbs) := Tx2b(dbs), if (dbs,Tx2b(dbs))∈ST_UEX
         := dbs      , otherwise

The definition of Tx2a describes the insertion of toffr1. Tx2b ensures that this conforms to all static constraints. The definition of Tx2 ensures that the transaction conforms to all state transition constraints too. It does so by adding the condition that the ordered pair consisting of the begin state and the end state for the transaction is an element of the state transition universe.

Often a transaction modifies only a few of the table structures involved in a database design; many table structures remain unchanged. An alternative (and somewhat shorter) way of specifying Tx2a, that more explicitly shows this fact, is as follows:

Tx2a(dbs) := dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,REG,HIST}
             ∪
             { (OFFR; dbs(OFFR) ∪ {toffr1}) }

Here we use function limitation (↓) to add all unchanged table structures to the end state, and through union (∪) we add the table structure(s) that the transaction modifies.

The next section will present some more example transactions over DB_UEX to familiarize you further with formally specifying transactions.

Example Transactions Over DB_UEX

This section will provide example transactions. For every transaction, we’ll present you with the following:

  • An informal description for the transaction
  • One or more formal expressions to specify the transaction
  • One or more SQL specifications that implement the transaction in an SQL DBMS

Along the way, you’ll see that SQL has limitations that cause you to execute more than one DML statement to implement a given transaction.

Listings 10-1 through 10-7 introduce the example transactions. All transactions will be named ETXi (example transaction i), where i represents a sequence number. In the formal specifications given for each transaction, dbs will represent the begin state for the transaction (an element of DB_UEX). The various alternative formal expressions will be named FEc (formal expression c), where c represents a character (a, b, c, and so on).

In the SQL specifications, we’ll use the table structure names introduced by database characterization DB_CHREX (see Listing 7-38) as the table names available for INSERT, UPDATE, and DELETE statements. Alternative SQL specifications will be named SEr (SQL expression r), where r represents a Roman numeral (I, II, III, and so on).

In the following formal expressions, we’ll specify a transaction as a function similar to the way Tx1a and Tx2a (always resulting in a modified database state) were specified in the previous section. However, we intend—as discussed in the previous section—that all static and dynamic constraints should remain satisfied; that is, that the transaction rolls back whenever either the resulting database state is not an element of DB_UEX, or the ordered pair representing the transition from the begin state to the end state is not an element of ST_UEX.

In the previous section, you saw two examples of a transaction that inserts a single tuple. Take a look at the example in Listing 10-2. It specifies a transaction that inserts (potentially) multiple tuples. Listing 10-2 registers all administrators that have been hired in the past quarter (91 days) for the offering of course AM4DP that starts on March 1, 2007.

Listing 10-2. Transaction ETX1(dbs)

FEa: dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,HIST}
     ∪
     { (REG; dbs(REG)
             ∪
             { { (COURSE;'AM4DP'      )
                ,(STARTS;'01-mar-2007')
                ,(STUD; e(EMPNO)      )
                ,(EVAL; -1            ) }
               | e∈dbs(EMP) ∧ e(JOB)='ADMIN' ∧ e(HIRED) ≥ sysdate-91
                                              ∧ e(HIRED) ≤ sysdate } ) }
SEI: insert into REG(STUD,EVAL,COURSE,STARTS)
     select e.EMPNO, -1, 'AM4DP', '01-mar-2007'
     from EMP e
     where e.JOB = 'ADMIN'
       and e.HIRED between sysdate - 91 and sysdate

This transaction only changes the REG table structure; the tuples that result from a query on the EMP table are added to it. In these tuples, attributes COURSE, STARTS, and EVAL are set to values 'AM4DP', '01-mar-2007', and -1, respectively. Attribute STUD ranges over all employee numbers of administrators that were hired in the past 91 days.

In the formal specification, the order of the attribute-value pairs enumerated inside the query part of this specification does not matter. However, in the SQL expression, the order of the value expressions listed in the second line (the one starting with keyword SELECT) must match the order of the attributes listed in the first line (following keyword INSERT).

imagesNote  In SQL, embedded queries such as the one displayed in SEI are often referred to as subqueries.

Finally, you should recognize that because the subquery in SEI selects EMPNO (the uniquely identifying attribute in the EMP table structure), there is no need to include the distinct keyword right after the select keyword.

Another common transaction is the deletion of tuples from the database. Listing 10-3 gives an example that deletes registrations. Listing 10-3 deletes all registrations of student 3124 for scheduled or confirmed offerings that start in the future.

Listing 10-3. Transaction ETX2(dbs)

FEa: dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,HIST}
     ∪
     { (REG; { r | r∈dbs(REG) ∧
                   ¬ ( r(STUD) = 3124 ∧
                        r(STARTS) > sysdate ∧
                       images{ o(STATUS) | o∈dbs(OFFR) ∧
                                       o↓{COURSE,STARTS} = r↓{COURSE,STARTS} }
                       ∈{'SCHD','CONF'} )
             })}
FEb: dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,HIST}
     ∪
     { (REG; { r↓{STUD,COURSE,STARTS,EVAL} | r∈dbs(REG)⊗dbs(OFFR) ∧
                                             ( r(STUD) ≠ 3124 ∨
                                               r(STARTS) ≤ sysdate ∨
                                               r(STATUS) = 'CANC' )
             })}
FEc: dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,HIST}
     ∪
     { (REG; dbs(REG) -
             {r|r∈dbs(REG) ∧ r(STUD) = 3124 ∧ r(STARTS) > sysdate ∧
                images{ o(STATUS) | o∈dbs(OFFR) ∧
                               o↓{COURSE,STARTS} = r↓{COURSE,STARTS} }
                ∈{'SCHD','CONF'}
             })}
SEI: delete from REG r
     where r.STUD = 3124
       and r.STARTS > sysdate
       and (select o.STATUS
            from OFFR o
            where o.COURSE = r.COURSE
              and o.STARTS = r.STARTS) in ('SCHD','CONF') )

Specification FEa reinitializes the REG table structure with the result from a query that is based on dbs(REG). The query retrieves all registrations that should remain in the database state; that is, all tuples that do not represent a registration of student 3124, for an offering in the future that has status confirmed or scheduled.

FEb does something similar, only now the query is based on a join between REG and OFFR. Note that in this specification, compared to FEa, we have rewritten the negation according to De Morgan. The query now retrieves all registrations that are either not for student 3124, or that are for an offering that starts in the past (including today), or that have status canceled.

The way FEc specifies this transaction might be the most intuitive. It specifies a table that contains all the registrations that need to be deleted, and then subtracts that—using the difference operator (−)—from dbs(REG). This specification is much like the way you specify this transaction using SQL (see SEI).

As you’ll understand by now, in our formalism a transaction is specified by defining, for every possible begin state, what the end state for the transaction will be. This involves specifying a table for every table structure. In contrast, with SQL you only specify the change in data that the transaction achieves; you needn’t specify data that remains unchanged.

Right now you’re probably thinking that formally specifying transactions is rather tedious. However, you should realize that the SQL expression for the preceding transaction ETX2 is actually a shorthand notation. It specifies that the SQL table variable REG is (re)assigned its begin state value minus the result set of a query that is essentially specified by the WHERE clause of the DELETE statement.

imagesNote   Using the mathematical methodology presented in this book, you can also develop formal shorthand expressions. Bert De Brock in his book Foundations of Semantic Databases (Prentice Hall, 1995) develops these for common types of transactions (such as deleting tuples from a table structure). However, because this involves introducing a few more somewhat complex mathematical concepts, we won’t develop formal shorthand in this book. We refer the interested reader to Chapter 7 of De Brock’s book.

There is an important point to be made with regards to the ways ETX2 is specified in Listing 10-2. They all involve the OFFR table structure to determine if the offering has status scheduled or confirmed. In fact, this involvement is not required. Given constraint PODC6 (canceled offerings cannot have registrations), in conjunction with the attribute-value set of the STATUS attribute in OFFR (only three values are allowed: scheduled, confirmed, and canceled), you can deduce that if there is a registration for an offering, then that offering will either have status scheduled or confirmed. There is no need for you to specify this in the transaction.

Here are equivalent formal and SQL specifications for transaction ETX2:

FEa: dbs↓{EMP,SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,HIST}
     ∪
     { (REG; { r | r∈dbs(REG) ∧ ¬ ( r(STUD) = 3124 ∧ r(STARTS) > sysdate ) } ) }
SEII: delete from REG r
      where r.STUD = 3124
        and r.STARTS > sysdate

You can consider these specifications to be more efficient; a DBMS will likely require fewer resources to execute the transaction if specified in this less complex way.

Rewriting formal specifications—be they transactions, queries, or even constraints—using knowledge of already established data integrity constraints, into less complex specifications, is referred to as semantic optimization.

imagesNote   A true relational DBMS should be capable of performing this semantic optimization automatically for us. Unfortunately, DBMSes that perform such sophisticated semantic optimizations are still unavailable. This is primarily because current DBMSes still have poor support for declaratively specified data integrity constraints. This poor declarative support is also the reason why this book has Chapter 11.

Another common transaction is updating tuples in a database. Listing 10-4 gives an example that updates the maximum capacity of certain offerings. It doubles the maximum capacity of all future offerings planned in Amsterdam.

Listing 10-4. Transaction ETX3(dbs)

FEa: dbs↓{SREP,MEMP,TERM,EMP,DEPT,GRD,CRS,REG,HIST
     ∪
     { (OFFR; { o | o∈dbs(OFFR) ∧ ¬(o(STARTS) ∧ sysdate ∧ o(LOC) = 'AMSTERDAM') }
              ∪
              {o↓{COURSE,STARTS,STATUS,TRAINER,LOC}
               ∪{ (MAXCAP; 2 * o(MAXCAP)) }
                |o∈dbs(OFFR) ∧ o(STARTS) > sysdate ∧ o(LOC) = 'AMSTERDAM' }
     )}
SEI: update OFFR o
     set o.MAXCAP = 2 * o.MAXCAP
     where o.STARTS > sysdate
       and o.LOC = 'AMSTERDAM'

Specification FEa clearly demonstrates that updating a subset of tuples can be seen as first deleting these tuples and then reinserting them with updated attribute values. The first operand of the union operator, at the fourth line of the definition of FEa, represents the OFFR table from which future offerings in Amsterdam have been deleted. The second operand of this union operator represents the reinsertion of these offerings with a doubled maximum capacity.

Again, as shown by expression SEI, SQL offers a convenient shortha nd to specify an update transaction.

We continue with another example update transaction. Transaction ETX4 in Listing 10-5 updates the salary of certain employees. Note that state transition constraint STC5 (see Listing 8-8) requires that updates of salary must be logged in the HIST table structure; more precisely, the MSAL values of the begin state need to be logged. Listing 10-5 increases the salary of all trainers working in a department at location Denver by 10 percent.

Listing 10-5. Transaction ETX4(dbs)

FEa:   dbs↓{SREP,MEMP,TERM,DEPT,GRD,CRS,OFFR,REG}
       ∪
       { (HIST; dbs(HIST)
                ∪
                { { (EMPNO; e(EMPNO)  )
                   ,(UNTIL; sysdate   )
                   ,(DEPTNO;e(DEPTNO) )
                   ,(MSAL; e(MSAL)    ) }
                 | e∈dbs(EMP)⊗dbs(DEPT) ∧ e(JOB) = 'TRAINER' ∧ e(LOC) = 'DENVER' }
       ) }
       ∪
       { (EMP; { e | e∈dbs(EMP) ∧ ¬(e(JOB) = 'TRAINER' ∧
                                    images{ d(LOC) | d∈dbs(DEPT) ∧
                                                 d(DEPTNO) = e(DEPTNO) }
                                    = 'DENVER') }
               ∪
               {e↓{EMPNO,ENAME,BORN,JOB,HIRED,SGRADE,USERNAME,DEPTNO}
                ∪ { (MSAL; 1.1 * e(MSAL)) }
                  |e∈dbs(EMP) ∧ e(JOB) = 'TRAINER' ∧
                                images{ d(LOC) | d∈dbs(DEPT) ∧ d(DEPTNO) = e(DEPTNO) }
                                = 'DENVER' }
       )}
SEI:   insert into HIST(EMPNO,UNTIL,DEPTNO,MSAL)
       (select e.EMPNO, sysdate, e.DEPTNO, e.MSAL
       from EMP e
       where e.job ='TRAINER'
         and 'DENVER' = (select d.LOC
                         from DEPT d
                         where d.DEPTNO = e.DEPTNO));
       update EMP e
       set e.MSAL = 1.1 * e.MSAL
       where e.job ='TRAINER'
         and 'DENVER' = (select d.LOC
                         from DEPT d
                         where d.DEPTNO = e.DEPTNO)
SEII:  update EMP e
       set e.MSAL = 1.1 * e.MSAL
       where e.job ='TRAINER'
         and 'DENVER' = (select d.LOC
                         from DEPT d
                         where d.DEPTNO = e.DEPTNO);
       insert into HIST(EMPNO,UNTIL,DEPTNO,MSAL)
       (select e.EMPNO, sysdate, e.DEPTNO, e.MSAL / 1.1
        from EMP e
        where e.job ='TRAINER'
          and 'DENVER' = (select d.LOC
                          from DEPT d
                          where d.DEPTNO = e.DEPTNO))

Specification FEa for transaction ETX4 assigns new values to both the HIST and EMP table structures, at the same time. It inserts into HIST the results of a query based on a join between EMP and DEPT, thereby logging, as required by constraint STC5, the current MSAL and DEPTNO values of the tuples that are about to be updated (loosely speaking). It also updates EMP in the requested way. The specifications of these tables (the “new” HIST table and the “new” EMP table) both refer to dbs, the begin state of the transaction.

In SQL, you cannot add rows to one table, and change rows of another table, using one statement. You are required to specify (separately) an INSERT statement and an UPDATE statement, and choose an order in which these two statements are to be serially executed. We have indicated the serial execution in SEI and SEII by separating the two DML statements with a semicolon.

Expressions SEI and SEII differ in the chosen order of the two statements. An SQL DBMS will provide an intermediate database state that already reflects the modifications made by the first DML statement to the second DML statement. This statement (and subqueries embedded within it) will “see” this intermediate database state.

imagesNote   This is often the default behavior of an SQL DBMS. We’ll have more to say on these intermediate database states in Chapter 11.

This side effect doesn’t occur in our formalism; FEa references the begin state only. This side effect is why, in SEII, the subquery retrieving the tuples that need to be logged (inserted) into the HIST table structure needs to perform a division by 1.1; it sees the modifications (increased salaries) resulting from the UPDATE statement that executed first.

Let’s take a look at another update transaction. Listing 10-6 specifies the transaction of canceling a scheduled offering. Note that static constraint PODC6 (“Canceled offerings cannot have registrations;” see Listing 7-43) requires that all registrations for the offering (if any) must be deleted. Listing 10-6 cancels the offering for course J2EE that starts on February 14, 2007.

Listing 10-6. Transaction ETX5(dbs)

FEa:   dbs↓{SREP,MEMP,TERM,DEPT,GRD,CRS,EMP,HIST}
       ∪
       { (REG; { r | r∈dbs(REG) ∧ (r(COURSE) ≠ 'J2EE' ∨ r(STARTS) ≠ '14-feb-2007') }
       )}
       ∪
       {(OFFR; { o | o∈dbs(OFFR) ∧ ¬ (o(COURSE) = 'J2EE' ∧
                                      o(STARTS) = '14-feb-2007') }
               ∪
               { o↓{COURSE,STARTS,MAXCAP,TRAINER,LOC}
                 ∪ { (STATUS; 'CANC') }
                 | o∈dbs(OFFR) ∧ o(COURSE) = 'J2EE' ∧ o(STARTS) = '14-feb-2007' }
       )}
SEI:   delete from REG r
       where r.COURSE = 'J2EE'
         and r.STARTS = '14-feb-2007';
       update OFFR o
       set o.STATUS = 'CANC'
       where o.COURSE = 'J2EE'
         and o.STARTS = '14-feb-2007'
SEII:  update OFFR o
       set o.STATUS = 'CANC'
       where o.COURSE = 'J2EE'
         and o.STARTS = '14-feb-2007';
       delete from REG r
       where r.COURSE = 'J2EE'
         and r.STARTS = '14-feb-2007'

Formal expression FEa should be straightforward by now; it specifies the deletion of zero or more registrations and an update of a single offering. SQL expressions SEI and SEII again only differ in the order of execution of the required two DML statements to implement this transaction in an SQL DBMS.

In contrast with ETX4, one of the alternative orders of execution has a prominent disadvantage. SEII creates an intermediate database state that potentially violates a static constraint. However, the second DML statement will always correct the violation.

Suppose the offering of the J2EE course on Valentine’s Day already had at least one registration. Then, the first DML statement—setting the status of the offering to canceled—will create a database state that violates constraint PODC6. The second DML statement—the deletion of corresponding registrations—will repair this; it modifies the intermediate database state into one that conforms to PODC6 (it “fixes the violation”).

This state of affairs in alternative SEII is very risky. If the intermediate database state (that violates a constraint) is queried by other application code, executing within the same transaction, then results of these queries might be incorrect. Or, maybe even worse, subqueries inside the second DML statement can produce false results.

imagesNote We assume here that application code executing within other transactions can never see these intermediate database states. This is indeed the case in Oracle’s SQL DBMS (we’ll come back to this in Chapter 11 when we discuss the transaction isolation mode offered by Oracle).

For instance, say that you want to query the number of scheduled or confirmed offerings that have at least one registration. Let’s call this query EXQ11. Here is a formal specification for EXQ11:

EXQ11(dbs) = #{ o | o∈dbs(OFFR) ∧ o(STATUS)∈{'SCHD','CONF'} ∧
                    (∃r∈dbs(REG): r↓{COURSE,STARTS} = o↓{COURSE,STARTS}) }

In SQL this query might look like this:

select count(*)
from OFFR o
where o.STATUS in ('SCHD','CONF')
  and exists (select r.*
              from REG r
              where r.COURSE = o.COURSE
                and r.STARTS = o.STARTS)

The application developer might have been very smart and semantically optimized the query into the following:

EXQ11(dbs) = #{ r↓{COURSE,STARTS} | r∈dbs(REG) }

This translates to the following query in SQL:

select count(*)
from (select distinct r.COURSE, r.STARTS
      from REG r)

Here the developer has used constraint PODC6, which implies that if there is a registration for some offering, then this offering cannot have status canceled. So, by counting distinct (offering) foreign key values in REG, you are effectively counting offerings that have at least one registration and have status scheduled or confirmed.

If this semantically rewritten query would execute in the middle of SEII’s version of transaction ETX5, then obviously the result would be one too many; it also counts the canceled offering whose registrations are about to be—but haven’t been yet—deleted.

We’ll have more to say about this phenomenon in Chapter 11.

Let’s give two more examples of transactions. Listing 10-7 defines an update transaction that involves subqueries to determine new attribute values. Listing 10-7 updates the commission of sales reps. For every sales rep, increase the commission by 2 percent of the average monthly salaries of all sales reps (including the one being updated) that work in the same department as the one being updated.

Listing 10-7. Transaction ETX6(dbs)

FEa:   dbs↓{MEMP,TERM,DEPT,GRD,REG,OFFR,CRS,EMP,HIST}
       ∪
       { (SREP; { s↓{EMPNO,TARGET}
                  ∪ { (COMM; 1.02 *
                             (AVG e1∈{e|e∈dbs(EMP) ∧
                                          e(DEPTNO)∈{ e2(DEPTNO) | e2∈dbs(EMP) ∧
                                                          e2(EMPNO) = s(EMPNO) } ∧
                                          e(JOB) = 'SALESREP' }
                             : e1(MSAL))
                    )}
                  |s∈dbs(SREP) }
      )}
SEI:  update SREP s
      set COMM = (select 1.02 * avg(e1.MSAL)
                  from EMP e1
                  where e1.DEPTNO = (select e2.DEPTNO
                                     from EMP e2
                                     where e2.EMPNO = s.EMPNO)
                    and e1.JOB = 'SALESREP')

Note that, in contrast with prior update transaction examples, this is an unrestricted update; all rows of SREP are modified.

For your convenience, here is the structure of the AVG operator as we have formally defined it in Chapter 5:

(AVG x∈S: f(x))

For every element x that can be chosen from set S, the average operator evaluates expression f(x) and computes the average of all such evaluations. In expression FEa, this operator is used as follows:

(AVG e1∈{ all salesreps working in same department as salesrep s(EMPNO) }
 : e1(MSAL))

As you can see in expression SEI, SQL allows you to specify subqueries within the set clause of an UPDATE statement. However, note that in these cases, the subquery should always retrieve exactly one row with just one column.

Listing 10-8 defines a DELETE transaction; salary grades that are currently not “used” are deleted from GRD. The listing deletes salary grades that have no employee assigned to them.

Listing 10-8. Transaction ETX7(dbs)

FEa:   dbs↓{SREP,MEMP,TERM,DEPT,REG,OFFR,CRS,EMP,HIST}
       ∪
       { (GRD; { g | g∈dbs(GRD) ∧ ¬(∃e∈dbs(EMP): e(SGRADE) = g(GRADE) } ) }
SEI:   delete from GRD g
       where not exists (select e.*
                         from EMP e
                         where e.SGRADE = g.GRADE)

Despite the “not exists” restriction, transaction ETX7 is still likely to fail, given the last table constraint specified in the definition of table universe tab_GRD (see Listing 7-31). In fact, it would only successfully execute when either the lowest (and zero or more of its direct successors), or the highest (and zero or more of its direct predecessors) salary grades are the only ones that are deleted. In all other cases, the table constraint would be violated and thus the transaction rolled back.

There is an exercise at the end of this chapter for you to specify an UPDATE statement that fixes such a violation.

Chapter Summary

This section provides a summary of this chapter, formatted as a bulleted list. You can use it to check your understanding of the various concepts introduced in this chapter before continuing with the exercises in the next section.

  • You can formally specify a transaction as a function over a database universe. For every database state, this function returns another database state that reflects the modifications intended by the transaction.

  • You can specify this function using the set theory and logic introduced in Part 1 in conjunction with the various table operators introduced in Part 2.

  • Transactions start out in a valid database state and they should always end up in a valid database state; they must leave the database in a state that conforms to all constraints.

  • Every transaction is in fact a transaction attempt. The DBMS should ensure that a transaction is rolled back when the resulting database state does not conform to all static constraints, or when the state transition is not covered by the state transition universe.

  • In SQL a transaction is implemented as one or more data manipulation language (DML) statements that execute serially, one after the other, in an order determined by you.

  • You cannot choose the order of these DML statements arbitrarily. Subsequent DML statements query a modified (intermediate) database state; the modifications of prior DML statements executed in the transaction are made visible to them by the SQL DBMS.

  • Sometimes the intermediate database state is in violation of one or more data integrity constraints. Such a situation is undesirable, because queries executed against such a database state might deliver wrong results.

  • Semantically optimized (sub) queries run a particularly high risk of producing false results in these circumstances.

Exercises

  1. List all involved data integrity constraints for transaction ETX1. Also discuss the properties that the begin state should have for transaction ETX1 to execute successfully.

  2. Transaction ETX2 might violate constraint PODC4; if the removal of a registration for student 3124 brings the number of registrations for a confirmed offering below six, then this constraint is violated. Amend the specification (both formal and SQL) for ETX2, such that as many as possible registrations for student 3124 are deleted, without violating constraint PODC4.

  3. Determine the static data integrity constraints that might get violated by the SQL UPDATE statement of transaction ETX4.

  4. Suppose transaction ETX7 creates a database state that indeed violates the discussed table constraint. Choose a strategy to fix this by updating the remaining salary grades. Formally specify your strategy and supply an SQL UPDATE statement for it.

  5. Specify, both formally and with SQL, the following transaction: increase the monthly salary of all administrators working in department 40 by 5 percent. If the update requires the modification of the salary grade, then modify the salary grade too.

  6. Specify, both formally and with SQL, the following transaction: remove all data of employees who have been terminated more than five years ago. What issues with regards to constraints will this transaction encounter?

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.116.67.70