
Thus far this book’s main focus has been on formally specifying a database design. Of course, you want to do more than just specify a database design; you usually would like to implement it using some DBMS. In this chapter we turn our attention towards implementing a database design.

Given the background of both authors, this chapter focuses specifically on implementing a database design in Oracle’s SQL DBMS. We’ll assume you’re familiar with PL/SQL—Oracle’s procedural programming language—and with concepts such as database triggers, packages, procedures, and functions.

Note We refer you to the standard Oracle documentation available on the Internet (http://tahiti.oracle.com) if you are unfamiliar with any of the PL/SQL concepts that you’ll encounter in this chapter.

This will be a less formal chapter. We’ll use SQL terminology (such as row and column) when referring to SQL constructs. We’ll still use formal terminology when referring to the formal concepts introduced in this book.

The first few sections (through the section “Implementing Data Integrity Code”) establish some concepts with regard to implementing business applications on top of a database. You’ll also see that three distinctly different strategies exist to implement data integrity constraints.

Sections then follow that deal with implementing table structures, attribute constraints, and tuple constraints (through the section “Implementing Tuple Constraints”).

The section “Table Constraint Implementation Issues” is a rather large section. It introduces the challenges that you’re faced with when implementing multi-tuple constraints (that is, table or database constraints). A big chunk of this section will explore various constraint validation execution models. These execution models range from rather inefficient ones to more sophisticated (efficient) ones.

The sections “Implementing Table Constraints” and “Implementing Database Constraints” cover the implementation of table and database constraints, respectively, using one of the execution models introduced in the section “Table Constraint Implementation Issues.” This is followed by an exploration of the implementation of transition constraints (the section “Implementing Transition Constraints”).

The section “Bringing Deferred Checking into the Picture” deals with deferring the validation of constraints—a nasty phenomenon that cannot be avoided when you implement a database design in an SQL DBMS.

At the end of this chapter we introduce you to the RuleGen framework: a framework that can help you implement data integrity constraints in Oracle’s SQL DBMS, and that applies many of the concepts explained throughout this chapter.

This chapter does not have an “Exercises” section.

If you’re familiar with a different SQL DBMS, then most of the concepts discussed in this chapter will probably also apply. We can’t be sure though, given our background. The discussion that involves serializing transactions is likely Oracle specific. Of course, the actual code examples will have to be written differently for another SQL DBMS.

Introduction

Thus far this book’s main focus has been to show you how a database design can be formally specified. To that end, we’ve introduced you to a bit of set theory and a bit of logic. Set theory and logic are the required mathematical tools that enable you to professionally deal with, talk about, manage, and document a database design. Also, as you have seen, a database design is much more than just table structures; the data integrity constraints are by far the most important part of a database design.

Of course, you want to do more than just specify a database design; you usually would like to implement it using some DBMS, and build business applications on top of it.

Unfortunately there are no true relational DBMSes available to us; we are forced to use an SQL DBMS to implement a database design. Implementing a database design in an SQL DBMS, as you’ll see in the course of this chapter, poses quite a few challenges—not so much in the area of implementing the table structures, but mainly in the area of implementing the involved data integrity constraints.

As you’ll see in the course of this chapter, SQL provides us with constructs to state data integrity constraints declaratively; however, not all of these are supported by SQL DBMSes available today. Most notably, the CREATE ASSERTION command—which has been in the official standard of SQL since 1992—is not supported in most SQL DBMSes (including Oracle’s).
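As an illustration of what such a declaration could look like, the “at most one president allowed” constraint on the EMP table structure (a running example later in this chapter) might be stated with CREATE ASSERTION roughly as follows. The syntax follows the SQL standard, the assertion name is ours, and—as just noted—Oracle’s DBMS will not accept this statement:

```sql
-- Hypothetical sketch; syntax per the SQL standard, not accepted by Oracle.
create assertion at_most_one_president
  check ((select count(*)
          from emp
          where job = 'PRESIDENT') <= 1);
```

Had this been supported, the DBMS itself would have been responsible for validating the condition whenever a transaction changes the EMP table.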

Before we start investigating these constraint implementation challenges, we’ll first broaden our scope by investigating the general software architecture of a business application, or, as we prefer to call such an application from here on, a window-on-data application. Once we’ve established this architecture, we’ll then, within the context of that architecture, discuss the challenges of implementing a database design.

Note This chapter will be quite different from the previous ones. It contains far less mathematics, although, as you will see, mathematics can still be applied when implementing a database design. We think that this chapter adds value to this book, because in our experience, the challenges investigated in this chapter are often overlooked.

Window-on-Data Applications

A window-on-data (WoD) application is—well, just as it says—an application that provides the user with windows on data. In a WoD application, users navigate through various windows of the application (browser pages, nowadays). Every page in the application either

  • Enables the user to compose a data retrieval request, execute it, and then have the retrieved data displayed in the window, or
  • Offers the user the possibility of composing a transaction using already retrieved—or newly entered—data, and then executing it.

Because these applications are all about querying and manipulating data, you should spend a fair amount of effort on designing and implementing the underlying table structures and involved integrity constraints necessary to support the application. The database design constitutes the underlying foundation of a WoD application; the quality of a WoD application can only be as good as the quality of the underlying database design.

Note Here actually lies one of our main motivations for writing this book. The current state of our industry is pretty bad when it comes to focusing on database design capabilities within an IT project. Few IT professionals nowadays are educated in the foundations of the relational model of data. Often, database designs are created that inevitably cause the business application to perform poorly, and/or be difficult to maintain. As mentioned earlier, discussing the quality of database designs is not within the scope of this book (it justifies at least another book in itself). We chose to offer you first the necessary tools to enable you to start dealing with database designs in a clear and professional way.

So, in short, a WoD application is all about managing business data. Needless to say, by far the majority of all business applications built on top of a database are WoD applications. This justifies taking a closer look at the general software architecture of this type of application.

In the next section we’ll introduce you to a classification scheme for the application code of a WoD application. Later on, when we discuss three implementation strategies for the data integrity constraints of a database design, you’ll see that we refer to this code classification scheme.

Classifying Window-on-Data Application Code

In this section, we’ll introduce you to a high-level code classification scheme that is at the core of every WoD application. Before we do this, we need to introduce two terms we’ll use in defining this classification scheme. These are data retrieval code and data manipulation code.

Data retrieval code is all code that queries data in the database. In our day-to-day practice these are typically the SQL queries (SELECT expressions) embedded within application code and interacting with the SQL DBMS.

Data manipulation code is all code that changes data in the database. These are the SQL data manipulation language (DML) statements (INSERT, UPDATE, or DELETE) embedded within application code that maintain data in the database.

Note We’ll refer to these three types of DML statements (INSERT, UPDATE, or DELETE) jointly as update statements.

Having introduced these two terms, we can now introduce the code classification scheme. All code of a WoD application can be classified into one of the following three code classes:

  • Business logic code
  • Data integrity code
  • User interface code

The following three sections will discuss what we mean by the preceding three code classes.

Business Logic Code (BL Code)

Business logic code can be subdivided into two subclasses: first, code that composes and executes queries, and second, code that composes and executes transactions.

  • Query composing and executing code: This is procedural code holding only embedded data retrieval code (that is, query expressions). This code is responsible for composing the actual SQL queries, or conditionally determining which SQL queries should be executed. This code also initiates the execution of these queries and processes the rows returned by the SQL DBMS. We’ll refer to this subclass as read BL code (rBL code for short).
  • Transaction composing and executing code: This is procedural code holding embedded data manipulation code (that is, SQL DML statements). This code is responsible for composing update statements, or conditionally determining which update statements should be executed. This code also initiates the execution of these statements. Depending upon the return code(s) given by the SQL DBMS, this code might also execute the commit or rollback processing for the transaction. Note that data retrieval code will often be part of (the procedural code of) transaction composition. We’ll refer to this class as write BL code (wBL code for short).

Write BL code attempts to change the current database state. When it executes, the SQL DBMS should ensure that none of the data integrity constraints gets violated; the resulting database state should satisfy all data integrity constraints.

Often, you’ll find code embedded within wBL code whose specific purpose is to verify that the transaction that gets executed won’t result in a state that violates any of the involved data integrity constraints. We consider this constraint-verifying code not part of the wBL code class, but rather part of the data integrity code class discussed in the next section.

Data Integrity Code (DI Code)

Data integrity code is all declarative or procedural code that deals with verifying the continued validity of data integrity constraints. Whenever wBL code executes transactions, the data manipulation statements that get executed as part of such transactions can potentially violate data integrity constraints. If this is indeed the case, data integrity code will ensure that such a statement fails and that the changes it made are rolled back.

Note We deliberately consider DI code to be a distinct class by itself and not part of the business logic code class. Most other books and articles dealing with business logic remain fuzzy about whether DI code is considered part of business logic or not.

For the majority of the data integrity constraints—in fact, all but attribute and tuple constraints—DI code will have to execute data retrieval code (queries) to determine if a given data manipulation statement is allowed or not. For multi-tuple constraints, DI code will always need to execute queries (inspecting other involved rows) to verify that the resulting database state still satisfies all data integrity constraints.

For example, consider the “at most one president allowed” table constraint defined in Listing 7-26 (table universe tab_EMP). When a transaction attempts to insert a president, you should run a constraint validation query either before execution of the insert or after execution of the insert, to verify that the insert is allowed given this constraint.

If you decide to validate whether the constraint remains satisfied before the actual execution of the insert, you could execute the following constraint validation query:

select 'there is already a president'
from emp
where job='PRESIDENT'
  and rownum=1

If this query returns a row, then obviously the insert must not be allowed to proceed; inserting another president will clearly create a database state that violates this constraint.

In the other case—executing a query after the insert—you could run the following query:

select count(*)
from emp
where job='PRESIDENT'

If this query returns a count greater than 1, then the constraint is violated and the insert must be rolled back.
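As a sketch of a slightly more efficient variant (our own, not taken from the listings in this chapter), you can stop scanning as soon as a second president is found by adding a rownum predicate:

```sql
select count(*)
from emp
where job = 'PRESIDENT'
  and rownum <= 2
```

If this query returns 2, the constraint is violated; the DBMS never needs to count beyond the second matching row.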

Ideally, given the declaration of all constraints, the DBMS would automatically deduce and execute these constraint validation queries (alongside some conditional code either to prevent execution of a DML statement or to force a rollback of it). A DBMS that provides this service makes the task of the developer who codes business logic a lot easier.

Note If you’re familiar with Oracle, you’ll know that Oracle’s SQL DBMS can perform this service only for a limited number of types of constraints.

You can subdivide the DI code class into five subclasses, one per type of constraint:

  1. Code that verifies attribute constraints
  2. Code that verifies tuple constraints
  3. Code that verifies table constraints
  4. Code that verifies database constraints
  5. Code that verifies transition constraints

As you’ll see shortly, you can state some constraints declaratively in an SQL DBMS (which is then responsible for maintaining them). However, you need to develop the DI code for most constraints manually; that is, you are responsible for the implementation. This means that you need to design and write procedural code to maintain the constraint yourself.

Note There are two high-level implementation strategies for creating DI code procedurally: either you make use of database triggers, or you embed DI code within wBL code. We’ll discuss the former in more detail in the remainder of this chapter.

User Interface Code (UI Code)

UI code is code that determines the look and feel of a business application. It is responsible for the front end of the business application that the user deals with when using the application. UI code either

  • Creates the user interface for the user—the look of the business application—and typically displays retrieved data within it, or
  • Responds to user interface events initiated by the user and then modifies the user interface—the feel of the business application.

Creating and/or modifying the user interface always requires embedded calls to business logic code. This BL code will return the data to the UI code for display, and/or will change data as part of the transaction requested by the user. Depending upon the results of these embedded BL code calls, UI code will modify the user interface accordingly.

As a summary, Figure 11-1 illustrates the correlation between the different WoD application code classes that we’ve introduced in this section.

Note In Figure 11-1, the DI code box is deliberately drawn outside the DBMS box because application developers who develop wBL code need to implement the majority of DI code procedurally (as will become obvious in the course of this chapter).


Figure 11-1. WoD application code classes

This figure illustrates the following relations between the code classes (note that the numbers in this list refer to the numbers inside Figure 11-1):

  1. UI code holds calls to rBL code (queries) or wBL code (transactions).
  2. rBL code holds embedded data retrieval code that reads data from the database.
  3. wBL code holds embedded data manipulation code that writes to the database. wBL code often also holds embedded queries reading the database.
  4. DI code requires the execution of data retrieval code.
  5. DI code is typically called from wBL code.
  6. DI code can also be called by the DBMS via database triggers (see the next section), which fire as a result of the execution of DML statements originating from wBL code.
  7. Sometimes you’ll see that UI code calls DI code directly; this is often done to create a responsive, more user-friendly user interface.

Implementing Data Integrity Code

In this section you’ll find a brief exploration of the aforementioned strategies for implementing DI code in Oracle’s SQL DBMS. We’ll also discuss a few related topics.

Alternative Implementation Strategies

As mentioned in the previous section, when transactions execute, you should run DI code to verify that all data integrity constraints remain satisfied. There are three different strategies to ensure execution of DI code when a WoD application attempts to execute a transaction:

  • Declarative
  • Triggered procedural
  • Embedded procedural

Declarative

In the declarative strategy, you simply declare the constraints that are involved in your database design to the DBMS. The DBMS then automatically ensures—hopefully in an efficient manner—that offending DML statements will be rolled back when they try to create a database state that violates any of the declared constraints.

Note that to be able to declare constraints to the DBMS, you first require (some form of) a formal specification of these constraints that states exactly what the constraint is. The strength of the declarative strategy is that you just tell the DBMS what the constraint is; that is, there is no need for you to write any DI code. In essence, the declaration of the constraint to the DBMS constitutes the DI code.

Once you’ve declared a constraint, the DBMS then has the challenge of computing how best to check the constraint in an efficient manner; it is tasked to generate (hopefully) efficient DI code and ensure this code is automatically executed at the appropriate times when transactions are processed.

Note You’ll see in the course of this chapter that the word “efficient” in the preceding sentence is crucial.

Oracle’s SQL DBMS allows you to specify certain constraints declaratively either as part of the create table command, or separately (after the table has already been created) through the alter table add constraint command.

Starting from the section “Implementing Attribute Constraints” onwards, you’ll find a full treatment of constraints that you can specify declaratively. As a preview, Listing 11-1 contains an example of adding a constraint to an already created table. It shows the SQL syntax for declaring the second table constraint in Listing 7-26 (the table universe for EMP) stating that {USERNAME} is a key.

Listing 11-1. Declaring a Key Using SQL

alter table EMP add constraint EMP_K2 unique (USERNAME);

Through the preceding alter table SQL statement, you declare {USERNAME} as a key for the EMP table structure. This command allows you to give the constraint a name (in the preceding case it is named EMP_K2). You’ll need to use this name if you want to drop the constraint.
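For instance, dropping this key constraint later would reference that name:

```sql
alter table EMP drop constraint EMP_K2;
```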

Note If you don’t supply a name, the DBMS will generate one for you.

We’ve chosen to use only the alter table add constraint statement in our examples throughout this chapter, so that we can gradually build the implementation of the database design. You could have declared every declaratively added constraint instead as part of the create table statement. You can find the SQL expressions that declare constraints of the UEX database design as part of the create table statement online at http://www.rulegen.com/am4dp.
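As a sketch of the inline form, the same key could have been declared as part of the create table statement. The column data types shown here are our own assumption, chosen only to make the fragment self-contained; the full column list of EMP is omitted:

```sql
create table EMP
( EMPNO    number      not null
, USERNAME varchar(15) not null
  -- other columns of EMP omitted in this sketch
, constraint EMP_K2 unique (USERNAME)
);
```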

Triggered Procedural

In the triggered procedural strategy, you don’t declare the constraint (the what) to the DBMS. Instead, you are tasked to write procedural DI code (the how) yourself. And, in conjunction, you also tell the DBMS when to execute your DI code by creating triggers that call your DI code. You can view a trigger as a piece of procedural code that the DBMS will automatically execute when certain events occur. Let’s explain this a bit more.

Note Triggers can serve many purposes. Executing DI code to enforce data integrity is only one such purpose (and the one we’ll be focusing on in this chapter).

Every trigger is associated with a table structure and will automatically be executed (fired) by the DBMS whenever a transaction executes a DML statement against that table structure. The trigger code—which executes as part of the same transaction—can then check whether the new database state still satisfies all constraints.

Oracle’s SQL DBMS offers, for each table structure, the 12 different types of triggers that are defined in the SQL standard:

  • Four insert triggers: One that fires before an INSERT statement starts executing (referred to as the before statement insert trigger), one that fires right before an actual row is inserted (before row insert trigger), one that fires after a row has been inserted (after row insert trigger), and finally one that fires after completion of an INSERT statement (after statement insert trigger).
  • Four update triggers: Similar to the insert triggers, there are four types of update triggers—before statement, before row, after row, and after statement.
  • Four delete triggers: The same four types.

The statement triggers (before and after) will only fire once per (triggering) DML statement. The row triggers will fire as many times as there are rows affected by the DML statement. For instance, an UPDATE statement that increases the monthly salary of all employees in the employee table (say there are currently ten) will cause the before and after row update triggers each to fire ten times. The before and after statement update triggers will only fire once each in this case.

Row-level triggers are able to refer to the column values of the rows that are affected by the triggering DML statement. Before and after row insert triggers can inspect all column values of the new rows (the ones that are inserted). Before and after row delete triggers can inspect all columns of the old rows (the ones that are deleted). Before and after row update triggers can inspect both all column values of the old rows and all column values of the new rows.

Note An UPDATE statement replaces one set of rows (the “old” ones) by another (the “new” ones).
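To sketch how a row trigger accesses these values, here is a hypothetical before row update trigger (not part of the database design used in this book) that compares the old and new monthly salary of every affected employee, using Oracle’s default :old and :new correlation names:

```sql
create trigger EMP_BRU
before update of MSAL on EMP
for each row
begin
  -- :old.msal holds the value before the update, :new.msal the value after.
  if :new.msal < :old.msal
  then
    raise_application_error(-20998,'Monthly salary cannot be decreased.');
  end if;
end;
/
```

For an UPDATE statement affecting ten employees, this body executes ten times, once per affected row.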

If you aren’t already familiar with the various types of table triggers available in Oracle, we refer you to the Oracle manuals for more information on this. In the rest of this chapter, we assume you are familiar with them.

There isn’t an easy way for statement-level triggers to reference the rows that are affected by the triggering DML statement; they either see the (old) table as it existed before the start of the DML statement (before trigger), or the resulting (new) table as it exists after completion of the DML statement (after trigger). You can enable statement-level triggers to see the affected rows, but it requires some sophisticated coding on your part. Later on in this chapter, you’ll discover that this is actually a requirement in order to be able to develop efficient DI code in the triggered procedural strategy. We’ll then demonstrate how this can be done.

To illustrate the use of a trigger as a means to implement DI code, let’s take a look at implementing the “at most one president allowed” table constraint of the EMP table structure. Listing 11-2 shows the SQL syntax for such a trigger, attached to the EMP table structure.

Listing 11-2. Attaching a Trigger to a Table Using SQL

create trigger EMP_AIUS
after insert or update on EMP
declare pl_num_presidents number;
begin
  --
  select count(*) into pl_num_presidents
  from EMP
  where job='PRESIDENT';
  --
  if pl_num_presidents > 1
  then
    raise_application_error(-20999,'Not allowed to have more than one president.');
  end if;
  --
end;
/

The way this trigger is created (in the second line: after insert or update) ensures that the DBMS executes the procedural code of the trigger whenever BL code inserts or updates an employee (the when of this constraint). The trigger will fire immediately after execution of such an insert or update. Therefore, the embedded code will see a database state that reflects the changes made by the insert or update.

The body of the trigger (that is, the procedural code between declare and end), shown in Listing 11-2, represents the how of this constraint. First a query is executed to retrieve the number of presidents in the EMP table. Then, if this number is more than one, the trigger generates an error by calling raise_application_error. This call raises an SQL exception, which in turn forces the triggering DML statement to fail and the changes it made to be undone. In this case, the DBMS will perform this DML statement rollback for you, in contrast to the embedded procedural strategy described hereafter.

Note that it would be inefficient to have this trigger also fire for DELETE statements. This particular constraint cannot be violated when a transaction deletes rows from the EMP table structure; if there is at most one president before deleting a row from EMP, then there will still be at most one president after deletion of the row.

In the triggered procedural strategy, you—the DI code developer—need to think about not only when DI code should be run, but also when it would be unnecessary to run DI code. This is part of ensuring that DI code is efficiently implemented. Of course, in the declarative strategy you needn’t think about either of these things.

Note The trigger in Listing 11-2 now fires for every type of INSERT and every type of UPDATE statement executed against the EMP table structure. This is still rather inefficient. For instance, if you insert a new clerk or update the monthly salary of some employee, the trigger really needn’t be run. In both of these cases (similar to a DELETE statement, discussed earlier) the triggering DML statement can never violate the constraint.
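A partial remedy, sketched below, is to re-create the trigger with the update case restricted to changes of the JOB column by means of the update of clause. Note that the insert case still fires for every INSERT: a statement trigger cannot inspect the inserted rows, which is exactly the limitation discussed earlier.

```sql
create trigger EMP_AIUS
after insert or update of JOB on EMP
declare pl_num_presidents number;
begin
  --
  select count(*) into pl_num_presidents
  from EMP
  where job='PRESIDENT';
  --
  if pl_num_presidents > 1
  then
    raise_application_error(-20999,'Not allowed to have more than one president.');
  end if;
  --
end;
/
```

With this version, updating an employee’s monthly salary no longer fires the trigger, because MSAL is not in the update of column list.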

Apart from being suboptimal in terms of efficiency, we need to address a more serious issue about the trigger in Listing 11-2: it is not complete when it comes to implementing the given constraint. There are still issues, which we will address in more detail in the section “Table Constraint Implementation Issues,” in the area of transactions executing concurrently.

Embedded Procedural

Finally, in the embedded procedural strategy to implement DI code, you again need to write the DI code (the how) yourself. Instead of creating triggers that call your DI code, you now embed the DI code into the wBL code modules that hold the DML statements that might cause constraints to be violated. DI code then executes as part of the execution of wBL code.

This is probably by far the most used procedural strategy for implementing DI code in WoD applications.

Note In this case, “most used” strategy certainly doesn’t imply “best” strategy.

Let’s take a look at an example. Assume the application supplies a page where the user can enter data for a new employee to be inserted. The user supplies attributes ENAME, BORN, JOB, and DEPTNO. The BL code determines the other employee attributes (EMPNO, SGRADE, MSAL, HIRED, and USERNAME) according to rules that were agreed upon with the users. Listing 11-3 lists the procedure the UI code calls to insert a new employee. It holds BL code first to determine the other employee attributes, and then executes an INSERT statement. Immediately following this INSERT statement, the BL code runs embedded DI code to ensure the “at most one president” constraint remains satisfied.

Listing 11-3. Embedding DI Code in wBL Code

create procedure p_new_employee
(p_ename  in varchar
,p_born   in date
,p_job    in varchar
,p_deptno in number) as
--
pl_sgrade   number;
pl_msal     number;
pl_hired    date;
pl_username varchar(15);
--
begin
  -- Determine monthly start salary.
  select grade, llimit into pl_sgrade, pl_msal
  from grd
  where grade = case p_job
                when 'ADMIN'     then 1
                when 'TRAINER'   then 3
                when 'SALESREP'  then 5
                when 'MANAGER'   then 7
                when 'PRESIDENT' then 9
                end;
  -- Determine date hired.
  pl_hired := trunc(sysdate);
  -- Determine username.
  pl_username := upper(substr(p_ename,1,8));
  -- Now insert the new employee row. Set a transaction savepoint first.
  savepoint sp_pre_insert;
  -- Use EMP_SEQ sequence to generate a key.
  insert into emp(empno,ename,job,born,hired,sgrade,msal,username,deptno)
  values(emp_seq.nextval,p_ename,p_job,p_born,pl_hired,pl_sgrade
        ,pl_msal,pl_username,p_deptno);
  -- Verify 'at most one president' constraint.
  if p_job = 'PRESIDENT'
  then
    declare pl_num_presidents number;
    begin
      --
      select count(*) into pl_num_presidents
      from EMP
      where job='PRESIDENT';
      --
      if pl_num_presidents > 1
      then
        -- Constraint is violated, need to rollback the insert.
        rollback to sp_pre_insert;
        raise_application_error(-20999,
                               'Not allowed to have more than one president.');
      end if;
    end;
  end if;
end;

The embedded DI code following the INSERT statement very much resembles the code in the body of the trigger listed in Listing 11-2. It is a more efficient implementation than the INSERT statement trigger; the DI code is executed only when a president is inserted.

Also note that in this procedural strategy—in contrast to the triggered procedural strategy—it is up to the developer to force a statement rollback explicitly. In the preceding code, this is done by setting a transaction savepoint just prior to executing the INSERT statement, and performing a rollback to that savepoint whenever the constraint is found to be violated.

There is another more serious drawback of the embedded procedural strategy. You now have to replicate DI code for a given constraint into every wBL code module that holds DML statements that might potentially violate the constraint. In this strategy, all necessary DI code for a given constraint is typically scattered across various wBL code modules. Often, when wBL code of a WoD application needs to be changed, you’ll forget to change (add or remove) the necessary DI code. Even more so, when an actual constraint of a database design needs to be changed, it is often difficult to locate all wBL code modules that hold embedded DI code for the constraint.

Note This drawback also applies to some extent to the triggered procedural strategy. The example trigger given in Listing 11-2 is actually shorthand for two triggers: an update trigger and an insert trigger. DI code for the constraint can be considered scattered across two triggers. In more complex cases of constraints, you’ll need more triggers to implement the DI code; you could need as many as three times the number of table structures involved in the constraint (an insert, an update, and a delete trigger per involved table structure). However, with the triggered procedural strategy, this drawback is not as bad as it can get in a WoD application that was built using the embedded procedural strategy.

The embedded procedural strategy has the same issues in the area of transactions executing concurrently (we’ll explain these in the section “Table Constraint Implementation Issues”).

Order of Preference

It should not be a surprise that the declarative strategy is highly preferable to the two procedural strategies. Not only will a declaratively stated constraint free you of the burden of developing DI code, it will likely outperform any self-developed procedural implementation. Don’t forget that it also frees you of the burden of maintaining this code during the WoD application’s life cycle!

imagesNote This last remark assumes that the DBMS vendor has done its research with respect to designing and implementing an efficient execution model for DI code. In the section “Table Constraint Implementation Issues,” we’ll introduce you to various execution models, ranging from inefficient ones to more efficient ones.

However, as you’re probably aware, you can state only a limited set of constraint types declaratively in Oracle’s SQL DBMS. Note that this is not due to an inherent deficiency in the SQL standard, but rather due to the poor implementation of this language in an SQL DBMS. For implementing constraints that cannot be declared to the DBMS, we prefer to follow the triggered procedural strategy. Following are the main two reasons why we prefer the triggered over the embedded procedural strategy:

  • Like declared constraints, the triggered procedural strategy cannot be subverted.
  • The triggered procedural strategy is likely to create a more manageable code architecture; DI code isn't replicated in lots of BL code. DI code is fully detached from all BL code, and is therefore easier to manage (the BL code itself, by the way, becomes easier to manage too).

imagesNote The second reason assumes that triggers will only hold DI code. Embedding BL code in triggers is generally considered a bad practice.

The embedded procedural strategy is the worst strategy of all; it can obviously be subverted, simply by using some other application (perhaps a general-purpose one supplied by the DBMS vendor) to update the database.

As you’ll see starting from the section “Table Constraint Implementation Issues” and onwards, when we start dealing with table constraints and database constraints (that is, multi-tuple constraints), implementing efficient DI code for these constraints through triggers is far from being a trivial task.

Obviously, the type of effort involved in implementing different constraints within one class (attribute, tuple, table, database, transition) is the same. Also, the complexity of implementing DI code for a constraint generally increases as the scope of its data increases. The remainder of this chapter offers guidelines for implementing the table structures of a database design and, in order of increasing scope, for implementing the DI code for the involved constraints. Here is an overview of the remaining sections of this chapter:

  • The section “Implementing Table Structures” deals with implementing the table structures in Oracle’s SQL DBMS.
  • The section “Implementing Attribute Constraints” discusses how to implement attribute constraints.
  • The section “Implementing Tuple Constraints” discusses how to implement tuple constraints.
  • The section “Table Constraint Implementation Issues” acts as a preface for the sections that follow it. In this section, we’ll introduce you to various issues that come into play when developing triggers to implement DI code for multi-tuple constraints. This section will also elaborate on different execution models.
  • The section “Implementing Table Constraints” discusses how you can state some table constraints declaratively, and how to design triggers for checking all other table constraints.
  • The section “Implementing Database Constraints” discusses how you can state some database constraints declaratively, and how to design triggers for checking all other database constraints.
  • The section “Implementing Transition Constraints” discusses an approach for implementing DI code for transition constraints.
  • The section “Bringing Deferred Checking into the Picture” explains the need for deferring the execution of DI code for certain types of constraints.
  • The section “The RuleGen Framework” introduces you to RuleGen, a framework that supports implementing DI code using the triggered procedural strategy, and that takes care of many of the issues discussed in this chapter.

Implementing Table Structures

Whenever you implement a database design in an SQL DBMS, you usually start by creating the table structures. Through the create table command, you implement the table structures of your database design within the SQL DBMS. In this chapter, we'll implement the UEX database design one-on-one. Specifically, we'll create one SQL table per table structure.

imagesNote It is not the goal of this book to investigate denormalization (for improving performance) or other reasons that could give rise to deviation from the formally specified database design. Moreover, it’s generally our personal experience that such one-on-one implementation usually provides excellent performance given the optimizer technology available in Oracle’s current SQL DBMS.

Creating a table in an SQL DBMS involves the following:

  • Choosing a name for the table structure (we’ve done this formally in the database characterization)
  • Declaring names for the columns (we’ve done this formally in the characterizations)
  • Declaring data types for the columns (also done in the characterizations)

As for the last bullet, support for creating user-defined types—representing the attribute-value sets—that can be used in a create table statement is unfortunately rather immature. For instance, we cannot do the following:

-- Create a type to represent the value set for the EMPNO attribute.
-- Note this is not valid Oracle syntax.
create type empno_type under number(4,0) check(member > 999);
/
-- Create a type to represent the enumeration value set for the JOB attribute.
-- Invalid syntax.
create type job_type under varchar(9)
            check(member in ('ADMIN','TRAINER','SALESREP','MANAGER','PRESIDENT'));
/
-- Now use the empno_type and job_type types in the create table statement.
create table EMP
(empno   empno_type not null
,job     job_type   not null
,...);

In practice, we're forced to use the built-in data types provided by the SQL DBMS. In the case of Oracle, the most commonly used built-in data types are varchar, number, and date. These will suffice for our example database design.

In the next section (“Implementing Attribute Constraints”), you’ll see that with the use of the SQL alter table add constraint statement, you can still in effect implement the attribute-value sets. The column data types declared in the create table statements act as supersets of the attribute-value sets, and by adding constraints you can narrow down those supersets to exactly the attribute-value sets that were specified in the characterizations for the table structures.

Listings 11-4 through 11-13 show the create table statements for the UEX example database design.

Listing 11-4. Create Table for GRD Table Structure

create table grd
( grade      number(2,0)  not null
, llimit     number(7,2)  not null
, ulimit     number(7,2)  not null
, bonus      number(7,2)  not null);

imagesNote The not null alongside every column indicates that no NULLs are allowed in that column. In such cases, we say that the column is mandatory.

Listing 11-5. Create Table for EMP Table Structure

create table emp
( empno      number(4,0)  not null
, ename      varchar(8)   not null
, job        varchar(9)   not null
, born       date         not null
, hired      date         not null
, sgrade     number(2,0)  not null
, msal       number(7,2)  not null
, username   varchar(15)  not null
, deptno     number(2,0)  not null);

Listing 11-6. Create Table for SREP Table Structure

create table  srep
( empno       number(4,0)  not null
, target      number(6,0)  not null
, comm        number(7,2)  not null);

Listing 11-7. Create Table for MEMP Table Structure

create table  memp
( empno       number(4,0) not null
, mgr         number(4,0) not null);

Listing 11-8. Create Table for TERM Table Structure

create table  term
( empno       number(4,0)  not null
, left        date         not null
, comments    varchar(60));

imagesNote In our example, the user doesn’t always want to supply a value for COMMENTS when an employee has been terminated. Oracle doesn’t allow an empty string in a mandatory column. To prevent the users from entering some random varchar value in these cases (for instance, a space), this column has been made optional (sometimes called nullable), meaning that NULLs are allowed to appear in this column.
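The note above can be demonstrated with a simple insert. The sample values below are illustrative; the point is that Oracle treats the empty string as NULL, which is why COMMENTS must be nullable for such an insert to succeed.

```sql
-- Oracle treats '' as NULL, so this stores NULL in COMMENTS.
-- It succeeds only because COMMENTS was declared nullable.
insert into term (empno, left, comments)
values (1000, date '2021-06-30', '');
```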

Listing 11-9. Create Table for HIST Table Structure

create table  hist
( empno       number(4,0)  not null
, until       date         not null
, deptno      number(2,0)  not null
, msal        number(7,2)  not null);

Listing 11-10. Create Table for DEPT Table Structure

create table  dept
( deptno      number(2,0)  not null
, dname       varchar(12)  not null
, loc         varchar(14)  not null
, mgr         number(4,0)  not null);

Listing 11-11. Create Table for CRS Table Structure

create table  crs
( code        varchar(6)   not null
, descr       varchar(40)  not null
, cat         varchar(3)   not null
, dur         number(2,0)  not null);

Listing 11-12. Create Table for OFFR Table Structure

create table  offr
( course      varchar(6)   not null
, starts      date         not null
, status      varchar(4)   not null
, maxcap      number(2,0)  not null
, trainer     number(4,0)
, loc         varchar(14)  not null);

imagesNote For reasons we’ll discuss in the section “Implementing Database Constraints,” the TRAINER attribute has been defined as nullable; we’ll use NULL instead of the special value -1 that was introduced in this attribute’s attribute-value set.

Listing 11-13. Create Table for REG Table Structure

create table  reg
( stud        number(4,0)  not null
, course      varchar(6)   not null
, starts      date         not null
, eval        number(1,0)  not null);

In the relational model of data, all attributes in a database design are mandatory. It is therefore a disappointment that in the SQL standard a column is nullable by default; SQL requires us to add not null explicitly alongside every column to make it mandatory.

The next section deals with the aforementioned “narrowing down” of the built-in data types that were used in the create table statements.

Implementing Attribute Constraints

We now revisit the term attribute constraint (mentioned in the section “Classification Schema for Constraints” in Chapter 7).

Formally, a characterization just attaches attribute-value sets to attributes. Attaching an attribute-value set to an attribute can be considered an attribute constraint. However, in practice you implement database designs in SQL DBMSes that are notorious for their poor support of user-defined types. User-defined types would have been ideal for implementing attribute-value sets. However, as discussed in the section “Implementing Table Structures,” you can’t use them to do so. Instead, you must use an appropriate superset (some built-in data type, as shown in the previous section) as the attribute-value set of a given attribute. Luckily you can use declarative SQL check constraints to narrow down these supersets to exactly the attribute-value set that was specified in the characterizations. During implementation, we refer to these declarative check constraints as the attribute constraints of an attribute.

All attribute constraints can—and, given our preference in strategies, should—be stated as declarative check constraints. You can declare these constraints using the alter table add constraint statement.

Listing 11-14 shows the declaration of the six check constraints that are required to declaratively implement the attribute-value sets for the EMP table structure, as specified in chr_EMP in Listing 7-2. We'll discuss each of these after the listing.

Listing 11-14. Attribute Constraints for EMP Table Structure

alter table EMP add constraint emp_chk_empno check (empno > 999);
alter table EMP add constraint emp_chk_job
                               check (job in ('PRESIDENT','MANAGER','SALESREP'
                                             ,'TRAINER','ADMIN' ));
alter table EMP add constraint emp_chk_brn   check (trunc(born) = born);
alter table EMP add constraint emp_chk_hrd   check (trunc(hired) = hired);
alter table EMP add constraint emp_chk_msal  check (msal > 0);
alter table EMP add constraint emp_chk_usrnm check (upper(username) = username);

As you can see from this listing, all check constraints are given a name. The name for the first one is emp_chk_empno. It narrows down the declared data type for the empno column, number(4,0), to just numbers consisting of four digits (greater than 999).

Once this constraint is declared and stored in the data dictionary of Oracle’s SQL DBMS, the DBMS will run the necessary DI code whenever a new EMPNO value appears in EMP (through an INSERT statement), or an existing EMPNO value in EMP is changed (through an UPDATE statement). The DBMS will use the constraint name in the error message that you receive, informing you whenever an attempt is made to store an EMPNO value in EMP that does not satisfy this constraint.

Constraint emp_chk_job (the second one in the preceding listing) ensures that only the five listed values are allowed as a value for the JOB column.

Constraints emp_chk_brn and emp_chk_hrd ensure that a date value (which in Oracle's SQL DBMS always holds a time component too) is allowed as a value for the BORN or HIRED columns only if its time component is truncated (that is, set to 0:00 midnight).

Constraint emp_chk_msal ensures that only positive numbers—within the number(7,2) superset—are allowed as values for the MSAL column.

Finally, constraint emp_chk_usrnm ensures that values for the USERNAME column are always in uppercase.

Listings 11-15 through 11-23 supply the attribute constraints for the other table structures of the UEX database design.

Listing 11-15. Attribute Constraints for GRD Table Structure

alter table GRD add constraint grd_chk_grad check (grade > 0);
alter table GRD add constraint grd_chk_llim check (llimit > 0);
alter table GRD add constraint grd_chk_ulim check (ulimit > 0);
alter table GRD add constraint grd_chk_bon1 check (bonus > 0);

Listing 11-16. Attribute Constraints for SREP Table Structure

alter table SREP add constraint srp_chk_empno check (empno > 999);
alter table SREP add constraint srp_chk_targ  check (target > 9999);
alter table SREP add constraint srp_chk_comm  check (comm > 0);

Listing 11-17. Attribute Constraints for MEMP Table Structure

alter table MEMP add constraint mmp_chk_empno check (empno > 999);
alter table MEMP add constraint mmp_chk_mgr   check (mgr > 999);

Listing 11-18. Attribute Constraints for TERM Table Structure

alter table TERM add constraint trm_chk_empno check (empno > 999);
alter table TERM add constraint trm_chk_lft   check (trunc(left) = left);

Listing 11-19. Attribute Constraints for HIST Table Structure

alter table HIST add constraint hst_chk_eno  check (empno > 999);
alter table HIST add constraint hst_chk_unt  check (trunc(until) = until);
alter table HIST add constraint hst_chk_dno  check (deptno > 0);
alter table HIST add constraint hst_chk_msal check (msal > 0);

Listing 11-20. Attribute Constraints for DEPT Table Structure

alter table DEPT add constraint dep_chk_dno check (deptno > 0);
alter table DEPT add constraint dep_chk_dnm check (upper(dname) = dname);
alter table DEPT add constraint dep_chk_loc check (upper(loc) = loc);
alter table DEPT add constraint dep_chk_mgr check (mgr > 999);

Listing 11-21. Attribute Constraints for CRS Table Structure

alter table CRS add constraint crs_chk_code check (code = upper(code));
alter table CRS add constraint crs_chk_cat  check (cat in ('GEN','BLD','DSG'));
alter table CRS add constraint crs_chk_dur1 check (dur between 1 and 15);

Listing 11-22. Attribute Constraints for OFFR Table Structure

alter table OFFR add constraint ofr_chk_crse check (course = upper(course));
alter table OFFR add constraint ofr_chk_strs check (trunc(starts) = starts);
alter table OFFR add constraint ofr_chk_stat
                                check (status in ('SCHD','CONF','CANC'));
alter table OFFR add constraint ofr_chk_trnr check (trainer > 999);
alter table OFFR add constraint ofr_chk_mxcp check (maxcap between 6 and 99);

imagesNote You might be wondering how an SQL DBMS deals with constraint ofr_chk_trnr whenever it encounters NULLs in the TRAINER column. We’ll discuss this at the end of this section.

Listing 11-23. Attribute Constraints for REG Table Structure

alter table REG add constraint reg_chk_stud check (stud > 999);
alter table REG add constraint reg_chk_crse check (course = upper(course));
alter table REG add constraint reg_chk_strs check (trunc(starts) = starts);
alter table REG add constraint reg_chk_eval check (eval between -1 and 5);

If a declarative check constraint evaluates to UNKNOWN (usually arising from the use of NULLs), then the SQL standard considers the constraint satisfied; the check behaves as if it evaluated to TRUE. Beware: you'll observe the opposite behavior in the PL/SQL programming language. There, a Boolean expression evaluating to UNKNOWN is handled as FALSE. To illustrate this, take a look at the following trigger definition; it is not equivalent to check constraint ofr_chk_trnr:

create trigger ofr_chk_trnr
after insert or update on OFFR
for each row
begin
  if not (:new.trainer > 999)
  then
    raise_application_error(-20999,'Value for trainer must be greater than 999.');
  end if;
end;

The declarative check constraint will allow a NULL in the TRAINER column, whereas the preceding trigger won’t allow a NULL in the TRAINER column. You can fix this discrepancy by changing the fifth line in the preceding trigger definition into the following:

if not (:new.trainer > 999 or :new.trainer IS NULL)

The trigger is now equivalent to the declarative check constraint.

We continue by investigating how you can implement tuple constraints (the next level after attribute constraints) in Oracle’s SQL DBMS.

Implementing Tuple Constraints

Before we deal with the implementation of tuple constraints, we need to confess something up front. The formal methodology developed in this book is based on 2-valued logic (2VL). The science of 2VL is sound; we explored propositions and predicates in Chapters 1 and 3 and developed some rewrite rules with it. However, in this chapter we'll make various statements about predicates that are expressed in SQL. As demonstrated by the preceding trigger and attribute constraint ofr_chk_trnr in Listing 11-22, due to the possible presence of NULLs, SQL doesn't apply 2VL; instead it applies 3-valued logic (3VL). The crucial difference in 3VL is that, besides the two truth values TRUE and FALSE, a third truth value represents “possible,” or UNKNOWN.

imagesNote 3VL is counterintuitive, as opposed to the classical 2VL. We won’t provide an in-depth discussion of 3VL here; you can find a brief exploration of 3VL in Appendix D.

We admit up front that we’re taking certain liberties in this chapter. By using NOT NULL on almost all columns in the SQL implementation of the example database design, we’re in effect avoiding 3VL issues. Without the use of NOT NULL, various statements we’re making about logical expressions in this chapter would be open to question.

As you saw in Chapter 1, conjunction, disjunction, and negation are truth-functionally complete. Therefore, you can rewrite every formally specified tuple constraint into an equivalent specification that uses just the three connectives that are available in SQL.

Once transformed in such a way, all tuple constraints can—and therefore should—be stated declaratively as check constraints. You can use the alter table add constraint statement to declare them to the DBMS. Let’s demonstrate this using the tuple constraints of the EMP table structure. For your convenience, we repeat the tuple universe definition tup_EMP here:

tup_EMP :=
{ e | e∈Π(chr_EMP) ⋀
      /* We hire adult employees only */
      e(BORN) + 18 ≤ e(HIRED) ⋀
      /* Presidents earn more than 120K */
      e(JOB) = 'PRESIDENT' ⇒ 12*e(MSAL) > 120000 ⋀
      /* Administrators earn less than 5K */
      e(JOB) = 'ADMIN' ⇒ e(MSAL) < 5000 }

The preceding three tuple constraints can be stated as follows (see Listing 11-24).

imagesNote The preceding three constraints are formally expressed in 2VL, but the three constraints expressed in SQL in Listing 11-24 are in 3VL. In this case, the constraints expressed in 3VL are equivalent to the formally expressed constraints only because we have carefully declared all involved columns to be mandatory (NOT NULL).

Listing 11-24. Tuple Constraints for EMP Table Structure

alter table EMP add constraint emp_chk_adlt
                    check ((born + interval '18' year) <= hired);
alter table EMP add constraint emp_chk_dsal
                    check ((job <> 'PRESIDENT') or (msal > 10000));
alter table EMP add constraint emp_chk_asal
                    check ((job <> 'ADMIN') or (msal < 5000));

The implementation of the first constraint, named emp_chk_adlt, uses date arithmetic (the + interval operator) to add 18 years to a given born date value.

Because SQL only offers three logical connectives (and, or, not), you are forced to transform the second and third tuple constraints—both involving the implication connective—into a disjunction. In case you’ve forgotten the important rewrite rule that enables you to do so, here it is once more:

( P ⇒ Q ) ⇔ ( ( ¬P ) ⋁ Q )

Once again, you should be aware that this transformation might not be safe in general, because when you’re using SQL you’re in the world of 3VL, not the 2VL world from which the rewrite rule is taken. If NULL is permitted in any of the columns involved, you’ll need to think about how these constraints work in SQL’s 3VL logic.

Given that the tuple constraints are declared in the way shown in Listing 11-24, the DBMS will ensure that rows that violate any of them are rejected.

In Listings 11-25 through 11-28, you can find the implementation of the tuple constraints for table structures GRD, MEMP, CRS, and OFFR. The other remaining table structures in the example database design don’t have tuple constraints.

Listing 11-25. Tuple Constraints for GRD Table Structure

alter table GRD add constraint grd_chk_bndw  check (llimit <= (ulimit - 500));
alter table GRD add constraint grd_chk_bon2  check (bonus < llimit);

Listing 11-26. Tuple Constraints for MEMP Table Structure

alter table MEMP add constraint  mmp_chk_cycl  check (empno <> mgr);

Listing 11-27. Tuple Constraints for CRS Table Structure

alter table CRS add constraint  crs_chk_dur2  check ((cat <> 'BLD') or (dur <= 5));

Listing 11-28. Tuple Constraints for OFFR Table Structure

alter table OFFR add constraint ofr_chk_trst
                     check (trainer is not null or status in ('CANC','SCHD'));

The accompanying formal specification for the tuple constraint stated in Listing 11-28 was the following:

tup_OFFR :=
{ o | o∈Π(chr_OFFR) ⋀
      /* Unassigned TRAINER allowed only for certain STATUS values */
      o(TRAINER) = -1 ⇒ o(STATUS)∈{'CANC','SCHD'}
}

After a rewrite of the implication into a disjunction, this changes into the following:

tup_OFFR :=
{ o | o∈Π(chr_OFFR) ⋀
      /* Unassigned TRAINER allowed only for certain STATUS values */
      o(TRAINER) ≠ -1 ⋁ o(STATUS)∈{'CANC','SCHD'}
}

Because we have decided to represent the -1 with a NULL in the implementation of the OFFR table structure (again for reasons that will be explained later on), the first disjunct changes to trainer is not null in the preceding check constraint.

We’ll end this section on implementing tuple constraints with an observation that is also valid for the constraint classes that follow hereafter.

It is good practice to write all tuple constraints in conjunctive normal form (CNF; see the section “Normal Forms” in Chapter 3). This might require you to apply various rewrite rules first. By rewriting a constraint into CNF, you’ll end up with as many conjuncts as possible, where each conjunct represents a separately implementable constraint. For tuple constraints, you would create one declarative check constraint per conjunct. This in turn has the advantage that the DBMS reports violations of tuple constraints in as detailed a way as possible.

Let’s explain this.

imagesNote We again assume that SQL’s 3VL behaves in a 2VL fashion because all columns that are involved in constraints are mandatory.

Suppose you create one check constraint for a tuple constraint that is—when rewritten in CNF—of the form A ⋀ B. When that check constraint gets violated, all you know (in 2VL) is that A ⋀ B is not TRUE. This, using the laws of De Morgan, translates to: either A is not TRUE or B is not TRUE. Wouldn't it be nicer if you knew exactly which one of the two was FALSE? If you had created two separate check constraints (one for A and one for B), the DBMS could report which one of the two was causing the violation (or maybe both are). In other words, by rewriting a constraint specification into CNF and implementing each conjunct separately, you'll get more detailed error messages.
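Using the GRD tuple constraints from Listing 11-25 as an illustration, here is the difference between the two approaches. The combined constraint name grd_chk_all is illustrative; in practice you would create only the separate form, not both.

```sql
-- One combined check constraint: when violated, the error message
-- (naming grd_chk_all) doesn't tell you which conjunct failed.
alter table GRD add constraint grd_chk_all
  check (llimit <= (ulimit - 500) and bonus < llimit);

-- Two separate check constraints (the CNF approach of Listing 11-25):
-- the error message names the specific violated conjunct.
alter table GRD add constraint grd_chk_bndw check (llimit <= (ulimit - 500));
alter table GRD add constraint grd_chk_bon2 check (bonus < llimit);
```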

As mentioned earlier, this observation also applies to the constraint classes that follow (table, database, and transition).

Table Constraint Implementation Issues

Up until now, everything has been straightforward concerning the implementation of data integrity constraints. However, when you increase the scope from tuple constraints to table constraints, and thus start dealing with constraints that span multiple tuples, implementing efficient DI code rapidly becomes much more complex.

The main reason for this complexity is the poor support for declaring these constraints to the DBMS. You can state only two types of table constraints declaratively: uniquely identifying attributes (keys), and subset requirements that reference the same table (a foreign key pointing back to its own table), in which case the subset requirement is a table constraint.

Implementing all other types of table constraints requires you to develop procedural DI code. In practice, this means that you’ll often have to resort to the triggered procedural strategy.

imagesNote We think there’s a reason why DBMS vendors offer us such poor declarative support. We’ll reveal this reason in the course of this section.

We’ll introduce you to the complexity involved in implementing table constraints by illustrating different DI code execution models. In the first (rather large) subsection that follows, we’ll illustrate six different execution models, ranging from very inefficient to more efficient. As you’ll see, implementing more efficient execution models for DI code is also more complex.

To explain every execution model clearly, we’ll be using two example table constraints and show how these constraints are implemented in every execution model. The constraints we’ll use are the last one specified in table universe tab_EMP in Listing 7-26 and the last one specified in table universe tab_DEPT in Listing 7-30. For your convenience, we repeat the formal specifications of these two constraints here (note that in these specifications E represents an employee table and D represents a department table).

/* A department that employs the president or a manager */
/* should also employ at least one administrator     */

(∀d∈E⇓{DEPTNO}:
  ( ∃e2∈E: e2(DEPTNO) = d(DEPTNO) ⋀ e2(JOB) ∈ {'PRESIDENT','MANAGER'} )
    ⇒
  ( ∃e3∈E: e3(DEPTNO) = d(DEPTNO) ⋀ e3(JOB) = 'ADMIN' )
)
/* You cannot manage more than two departments */
( ∀m∈D⇓{MGR}: #{ d | d∈D ⋀ d(MGR) = m(MGR) } ≤ 2 )
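The two formal specifications above translate naturally into SQL queries that return the violating rows; an empty result means the constraint is satisfied. These are sketches for understanding the constraints (assuming the EMP and DEPT tables created earlier), not yet the DI code of any execution model.

```sql
-- Constraint 1: departments employing a PRESIDENT or MANAGER but no ADMIN.
select distinct e1.deptno
from   emp e1
where  e1.job in ('PRESIDENT','MANAGER')
and    not exists (select 'x'
                   from   emp e3
                   where  e3.deptno = e1.deptno
                   and    e3.job    = 'ADMIN');

-- Constraint 2: managers who manage more than two departments.
select mgr
from   dept
group  by mgr
having count(*) > 2;
```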

Besides implementing an efficient execution model, another—rather serious—issue comes into play when implementing DI code for table constraints. This concerns transaction serialization. Given that Oracle's SQL DBMS can execute transactions concurrently, you must ensure that the queries inside DI code for a given constraint are executed in a serializable way; Oracle's SQL DBMS does not guarantee this serializability. We'll explain this issue in detail in the section “DI Code Serialization.”

DI Code Execution Models

This section will discuss various execution models for implementing DI code for table constraints following the triggered procedural strategy. However, before doing so we’ll first provide you with a few preliminary observations with regards to the timing of DI code execution in relation to the DML statement execution.

Some Observations

With an SQL DBMS, you update the database by executing INSERT, UPDATE, or DELETE statements. Each of these statements operates on just one target table structure in just one manner—it’s either an INSERT, or an UPDATE, or a DELETE. Typically, transactions need to change more than one table, or possibly just one table in more than one manner. Therefore, your transactions in general consist of multiple DML statements that are serially executed one after the other.

You implicitly start a transaction when the first DML statement is executed. A transaction is explicitly ended either by executing a COMMIT statement (requesting the DBMS to persistently store the changes made by this transaction), or by executing a ROLLBACK statement (requesting the DBMS to abort and undo the changes made by the current transaction). After ending a transaction, you can start a new one by executing another (first) DML statement. All changes made by a transaction that has not yet committed are visible only to that transaction; other transactions cannot see these changes. Once a transaction commits, the changes it made become visible to other transactions.

imagesNote Here we’re assuming that the DBMS is running in the read-committed isolation mode—the mode most often used within Oracle’s installed base.

Of course, all constraints must be satisfied at the end of a transaction (when it commits). That is to say, you don’t want a transaction to commit successfully while the database state, produced by the serial execution of its DML statements so far, violates one of the constraints. But what about the database states that exist in between the execution of two DML statements inside the same transaction? Should these database states always satisfy all constraints too, or might they be in violation of some constraints, as long as the last database state (the one that the transaction commits) satisfies all constraints?

For the time being, we disallow intermediate database states that violate any database constraint.

imagesNote However, we’ll revisit this question in the section “Bringing Deferred Checking into the Picture,” where you’ll see that to implement certain transactions using Oracle’s SQL DBMS, we must allow certain constraints to be temporarily violated in one of the intermediate database states.

A DML statement that attempts to create a database state that violates a constraint will fail; in the following execution models we’ll ensure that the changes of such a DML statement will be rolled back immediately, while preserving the database state changes made by prior DML statements that executed successfully inside the transaction.

In Table 11-1 you can find an example transaction that executes four DML statements; the table shows the database state transitions that occur within this transaction.

[Table 11-1 is an image in the original and is not reproduced here.]

Our execution model will be based on triggers. As mentioned before, triggers are associated with a table structure and will automatically be executed (“fired”) by the DBMS if a DML statement changes the content of that table. The code inside the trigger body can then check whether the new database state satisfies all constraints. If the state does not satisfy all constraints, then this code will force the triggering DML statement to fail; the DBMS then ensures that its changes are rolled back.

You should be aware of a limitation that row triggers have (the ones that fire for each affected row). These triggers are only allowed to query the state of other table structures; that is, they are not allowed to query the table structure on which the triggering DML statement is currently operating. If you try this, you’ll hit the infamous mutating table error (ORA-04091: table ... is mutating, trigger/function may not see it).

The very valid reason why Oracle’s SQL DBMS disallows this is to prevent nondeterministic behavior. If your row triggers were allowed to query a table structure that a DML statement is currently modifying, then these queries would perform a dirty read within the transaction: they would see intermediate table states that only exist while the triggering DML statement is being executed row by row. Depending upon the order in which the SQL optimizer happens to process the rows, the outcome of these queries could differ. This would cause nondeterministic behavior, which is why Oracle’s DBMS won’t allow you to query the “mutating” table.

Given the essence of a table constraint—that is, it involves multiple rows in a table—the DI code for a table constraint will always require you to execute queries against the table that has been modified; however, the mutating table error prevents you from doing so. Therefore, row triggers are not suitable to be used as containers of DI code for table constraints.

Before statement triggers see the start database state in which a DML statement starts execution. After statement triggers see the end database state created by the execution of a DML statement. Because DI code needs to validate the end state of a DML statement, you are left with no more than three after statement triggers per table structure (insert, update, and delete) on which to base an execution model.

Given these observations, we can now go ahead and illustrate six different execution models for DI code. In discussing the execution models, we’ll sometimes broaden the scope to also include database constraints.

Execution Model 1: Always

In the first execution model (EM1), whenever a DML statement is executed, the corresponding after statement trigger holds code that sequentially executes the DI code for every constraint. In this model, every intermediate database state (including the last one) is validated to satisfy all constraints.

This execution model only serves as a starting point; you would never want to make use of this model, because it’s highly inefficient. For instance, if a DML statement changes the EMP table structure, then this execution model would then also run the DI code to check constraints that do not involve the EMP table structure. Obviously, this is completely unnecessary because these other table structures remain unchanged; constraints that don’t involve the table structure upon which the triggering DML statement operates need not be validated.

Let’s quickly forget this model, and move on to a more efficient one.

Execution Model 2: On-Involved-Table

This execution model (EM2) very much resembles EM1. The only difference is that you now make use of the knowledge of which table structures are involved in each constraint. You only run the DI code for a given constraint if the table structure that is being changed by a DML statement is involved in that constraint (hence the “On-Involved-Table” in the section title).

Let’s take a closer look at how the example table constraint of the EMP table structure is implemented in this execution model. Remember, this was the constraint: “A department that employs the president or a manager should also employ at least one administrator.” You can formally derive the constraint validation query that you need to execute for verifying whether a new database state still satisfies this constraint. The way to do this is by translating the formal specification into an SQL WHERE-clause expression and then executing a query that evaluates the truth value of this expression. You can use the DUAL table to evaluate the expression. Let’s demonstrate this. Here is the formal specification of this table constraint:

( ∀d∈E⇓{DEPTNO}:
  ( ∃e2∈E: e2(DEPTNO) = d(DEPTNO) ⋀ e2(JOB) ∈ {'PRESIDENT','MANAGER'} )
    ⇒
  ( ∃e3∈E: e3(DEPTNO) = d(DEPTNO) ⋀ e3(JOB) = 'ADMIN' )
)

Before you can translate the formal specification into an SQL expression, you’ll need to get rid of the universal quantifier and implication. Following is the rewritten version of the specification.

Tip: Try to rewrite this specification yourself; start by adding a double negation in front of the preceding specification.

¬ ( ∃d∈E⇓{DEPTNO}:
    ( ∃e2∈E: e2(DEPTNO) = d(DEPTNO) ⋀ e2(JOB) ∈ {'PRESIDENT','MANAGER'} )
      ⋀
    ¬ ( ∃e3∈E: e3(DEPTNO) = d(DEPTNO) ⋀ e3(JOB) = 'ADMIN' )
)
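For readers attempting the tip above, the rewrite proceeds in three steps. Writing P(d) for the first existential condition (the department employs a president or manager) and Q(d) for the second (the department employs an administrator):

```
( ∀d∈E⇓{DEPTNO}: P(d) ⇒ Q(d) )
⇔ ¬¬( ∀d∈E⇓{DEPTNO}: P(d) ⇒ Q(d) )       -- double negation
⇔ ¬( ∃d∈E⇓{DEPTNO}: ¬( P(d) ⇒ Q(d) ) )   -- push ¬ through ∀
⇔ ¬( ∃d∈E⇓{DEPTNO}: P(d) ⋀ ¬Q(d) )       -- ¬(P ⇒ Q) ⇔ P ⋀ ¬Q
```

The last line is exactly the rewritten specification shown above, which contains only existential quantifiers, conjunction, and negation, and therefore maps directly onto NOT EXISTS, AND, and NOT in SQL.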

This now easily translates to SQL (we’re naming this constraint EMP_TAB03).

Note: The DUAL table in Oracle is a single-column, single-row system table. It is most often used to have the SQL engine evaluate either a SELECT-clause expression or a WHERE-clause expression. The following code displays the latter usage.

select 'Constraint EMP_TAB03 is satisfied'
from DUAL
where not exists(select d.DEPTNO
                 from EMP d
                 where exists(select e2.*
                              from EMP e2
                              where e2.DEPTNO = d.DEPTNO
                                and e2.JOB in ('PRESIDENT','MANAGER'))
                   and not exists(select e3.*
                                  from EMP e3
                                  where e3.DEPTNO = d.DEPTNO
                                    and e3.JOB = 'ADMIN'))

In EM2, you would create three after statement triggers for this constraint on only the EMP table structure (the one involved in this constraint). These triggers hold the preceding query to verify that the new database state still satisfies this constraint. Listing 11-29 shows these three triggers combined into one create trigger statement.

Listing 11-29. EM2 DI Code for Constraint EMP_TAB03

create trigger EMP_AIUDS_TAB03
after insert or update or delete on EMP
declare pl_dummy varchar(40);
begin
  --
  select 'Constraint EMP_TAB03 is satisfied' into pl_dummy
  from DUAL
  where not exists(select d.DEPTNO
                   from EMP d
                   where exists(select e2.*
                                from EMP e2
                                where e2.DEPTNO = d.DEPTNO
                                  and e2.JOB in ('PRESIDENT','MANAGER'))
                     and not exists(select e3.*
                                    from EMP e3
                                    where e3.DEPTNO = d.DEPTNO
                                      and e3.JOB = 'ADMIN'));
  --
exception when no_data_found then
  --
  raise_application_error(-20999,'Constraint EMP_TAB03 is violated.');
end;
/

Note: A DBMS could, by parsing the declared formal specification of a constraint, compute what the involved tables are. Also, the DBMS could compute the validation query that needs to be run to validate whether a constraint is still satisfied (all this requires is the application of rewrite rules to end up with a specification that can be translated into an SQL expression). Therefore, the DBMS could generate the preceding trigger. In other words, this execution model could be fully supported by a DBMS vendor in a declarative way!

Listing 11-30 shows the DI code for constraint DEPT_TAB01 using this execution model, again with the three triggers combined into one create trigger statement.

Listing 11-30. EM2 DI Code for Constraint DEPT_TAB01

create trigger DEPT_AIUDS_TAB01
after insert or update or delete on DEPT
declare pl_dummy varchar(40);
begin
  --
  select 'Constraint DEPT_TAB01 is satisfied' into pl_dummy
  from DUAL
  where not exists(select m.DEPTNO
                   from DEPT m
                   where 2 < (select count(*)
                              from DEPT d
                              where d.MGR = m.MGR));
  --
exception when no_data_found then
  --
  raise_application_error(-20999,'Constraint DEPT_TAB01 is violated.');
  --
end;
/

This execution model is still inefficient. For instance, when you update the name of an employee, then this execution model will validate constraint EMP_TAB03. Obviously, because the ENAME column is not involved at all in this constraint, DML statements that update ENAME should never give rise to the need to check constraint EMP_TAB03. A similar inefficiency applies to constraint DEPT_TAB01; for instance, EM2 will validate this constraint when you update the location of a department (whereas attribute LOC is not involved in constraint DEPT_TAB01). The next execution model addresses this inefficiency.

Execution Model 3: On-Involved-Column(s)

This execution model (EM3) is the same as EM2 in the cases where the DML statement is an INSERT or a DELETE. However, for updates you now also make use of the knowledge of which attributes (per table structure) are involved in each constraint. A given constraint needs to be validated only if the UPDATE statement modifies one of its involved attributes. INSERT and DELETE statements always involve all attributes, and therefore always give rise to a need to validate the constraints that involve the table structure (irrespective of the attributes involved in the constraint).

Let’s again demonstrate this using the example table constraint EMP_TAB03. In this case, the insert and delete triggers remain the same as in EM2. You only change the update trigger to be more efficient. Simply by scanning the formal specification of constraint EMP_TAB03, you can discover the attributes that are involved; for this constraint these are DEPTNO and JOB. Whenever UPDATE statements occur, you now only execute the DI code that was developed in EM2 when the UPDATE statement changes either the DEPTNO or the JOB (or both) attributes.

Note: You’ll probably easily understand that an update of JOB can violate this constraint. For instance, in a given department, when you promote an administrator to become a trainer, it might well be the case that you just “removed” the single administrator who was required to be in that department, because that department also employs the president or a manager. Also, for instance, when you promote a trainer to become a manager, this might be the first manager in the department. This now would require that the department also employ an administrator. Similar scenarios apply to updates of DEPTNO; if you switch a manager who is currently working in department 10 to start working in department 20, then this manager might be the first manager in department 20, therefore . . . , and so on.

The SQL standard allows you to specify these columns in update triggers. The update trigger will then only fire if one of the involved columns has been changed. Here is how you would code the after statement update trigger in Oracle (see Listing 11-31).

Listing 11-31. EM3’s More Efficient Update Trigger for Constraint EMP_TAB03

create trigger EMP_AUS_TAB03
after update of DEPTNO,JOB on EMP
declare pl_dummy varchar(40);
begin
  --
  select 'Constraint EMP_TAB03 is satisfied' into pl_dummy
  from DUAL
  where not exists(select d.DEPTNO
                   from EMP d
                   where exists(select e2.*
                                from EMP e2
                                where e2.DEPTNO = d.DEPTNO
                                  and e2.JOB in ('PRESIDENT','MANAGER'))
                     and not exists(select e3.*
                                    from EMP e3
                                    where e3.DEPTNO = d.DEPTNO
                                      and e3.JOB = 'ADMIN'));
  --
exception when no_data_found then
  --
  raise_application_error(-20999,'Constraint EMP_TAB03 is violated.');
  --
end;
/

The second line in Listing 11-31 specifies the involved columns.

Note: As was the case with EM2, a DBMS vendor can also easily support this execution model declaratively. The extra work the DBMS needs to do for EM3, compared to EM2, is to parse all constraints. It does this not only to determine the table structure(s) that are involved for EM2, but also to determine per table structure what the involved attribute(s) are. The DBMS can then implement the more sophisticated update trigger automatically.

Listing 11-32 displays the optimized DI code for constraint DEPT_TAB01 in the case of UPDATE statement execution (note that only the MGR attribute is involved in this constraint).

Listing 11-32. DI Code for DEPT_TAB01 in Case of Updates

create trigger DEPT_AUS_TAB01
after update of MGR on DEPT
declare pl_dummy varchar(40);
begin
  --
  select 'Constraint DEPT_TAB01 is satisfied' into pl_dummy
  from DUAL
  where not exists(select m.DEPTNO
                   from DEPT m
                   where 2 < (select count(*)
                              from DEPT d
                              where d.MGR = m.MGR));
  --
exception when no_data_found then
  --
  raise_application_error(-20999,'Constraint DEPT_TAB01 is violated.');
  --
end;
/

There is a way to further improve on the efficiency of this execution model. For a given constraint, you can sometimes deduce that an INSERT statement (into one of the involved table structures) or a DELETE statement (on one of the involved table structures) can never violate the constraint. In the case of table constraint EMP_TAB03, neither can be deduced. You could be inserting a president, in which case the constraint should be validated. Or, you could be deleting an administrator, in which case the constraint should be validated too. However, in the case of table constraint DEPT_TAB01, you can deduce that deleting a department can never violate this constraint.

The next execution model addresses this further optimization with regards to INSERT or DELETE statements.

Execution Model 4: On-Involved-Column(s) Plus Polarity-of-Involved-Tables

This execution model (EM4) is the same as EM3 when the triggering DML statement is an UPDATE statement. However, when inserts or deletes occur, you now also make use of the knowledge of what the polarity of an involved table structure is per constraint.

The polarity of a table structure for a given constraint is defined as follows:

  • Positive: Inserts into the table structure can violate the constraint, and deletes cannot.
  • Negative: Deletes from the table structure can violate the constraint, and inserts cannot.
  • Neutral: Both inserts and deletes give rise to a need for the constraint to be validated.
  • Undefined: The table structure is not involved in the constraint.

If the polarity of a table structure for a given constraint is neutral, then EM4 is equivalent to EM3; there is no opportunity for you to further optimize (in comparison to EM3) the insert or delete trigger. However, if it is positive or negative, then you can further optimize the DELETE or INSERT statement trigger, respectively; in fact, you can drop them.

Let’s demonstrate this by examining the DI code for constraint DEPT_TAB01. As mentioned earlier, only inserting a department can potentially violate this constraint; if all department managers currently manage no more than two departments, then deleting a department can never violate it. We say that the DEPT table structure has a positive polarity with regards to the DEPT_TAB01 constraint. In this case you can optimize the delete trigger not to run any DI code at all for DEPT_TAB01; you simply needn’t create the after delete statement trigger on DEPT that would have been created in EM3 for constraint DEPT_TAB01.

Note: Scientific research has shown that a DBMS can also fairly easily compute the polarity of an involved table structure for a given (formally specified) constraint. This again implies that a DBMS vendor should be capable of supplying us with full declarative multi-tuple constraint support using execution model EM4.

Still, execution model EM4 sometimes runs DI code when there is no need to. For instance, considering constraint EMP_TAB03, when you insert a sales rep, or delete a trainer, EM4 will run the DI code for constraint EMP_TAB03. But obviously, because neither sales reps nor trainers play any role in this constraint, inserts and/or deletes of them should never give rise to the need to check constraint EMP_TAB03. The next execution model addresses this inefficiency.

Execution Model 5: On-Transition-Effect-Property

This execution model (EM5) assumes the availability of a transition effect of a given DML statement that has just been executed. The transition effect describes the actual rows that have been affected by the DML statement, including—in the case of an UPDATE statement—how these rows have been modified.

As you’ll see shortly, the transition effect provides a convenient means for an after statement trigger to see which rows precisely have been affected by the triggering DML statement. By inspecting the transition effect in the after statement trigger, you are able to address the inefficiency mentioned earlier for EM4.

You can implement the transition effect as a view that only holds rows directly after the processing of a DML statement. After statement triggers that fire can access the view to determine exactly which rows have been inserted, deleted, or updated.

In this execution model, we assume the availability of three transition effect (TE) views for each table structure:

  • The insert TE view named v_[table_name]_ite: This view will show the row(s) that a triggering INSERT statement has just inserted. It is empty (holds no rows) if the triggering statement is not an INSERT.
  • The update TE view named v_[table_name]_ute: This view will show the row(s) that a triggering UPDATE statement has just updated; this view shows both old and new values of modified columns. It is empty if the triggering statement is not an UPDATE.
  • The delete TE view named v_[table_name]_dte: This view will show the row(s) that a triggering DELETE statement has just deleted. It is empty if the triggering statement is not a DELETE.

Currently, Oracle’s SQL DBMS does not provide you with a transition effect (there is at least one other DBMS vendor that does supply it). However, you can develop a row and a statement trigger for each table structure that do the necessary bookkeeping to provide these three TE views.

Take a look at Listing 11-33. It shows the necessary DI code required to maintain the transition effect of the EMP table structure. The row triggers use the session temporary table EMP_TE to store the transition effect. On top of this table, the three TE views are defined.

Listing 11-33. DI Code for Maintaining Transition Effect of EMP Table Structure

create global temporary table EMP_TE
(DML      char(1)  not null check(DML in ('I','U','D'))
,ROW_ID   rowid
,EMPNO    number(4,0)
,JOB      varchar(9)
,HIRED    date
,SGRADE   number(2,0)
,MSAL     number(7,2)
,DEPTNO   number(2,0)
,check(DML<>'I' or ROW_ID is not null)
,check(DML<>'U' or ROW_ID is not null)
,check(DML<>'D' or ROW_ID is null)
) on commit delete rows
/
create trigger EMP_BIUDS_TE
before insert or update or delete on EMP
begin
  -- Reset transition effect before every DML statement execution.
  delete from EMP_TE;
  --
end;
/
create trigger EMP_AIUDR_TE
after insert or update or delete on EMP
for each row
begin
  -- Conditionally maintain the transition effect.
  if INSERTING
  then
    -- Only store 'pointer' to affected row.
    insert into EMP_TE(DML,ROW_ID) values('I',:new.rowid);
  elsif UPDATING
  then
    -- Store snapshot of old version of row, plus pointer to new version.
    insert into EMP_TE(DML,ROW_ID,EMPNO,JOB,HIRED,SGRADE,MSAL,DEPTNO)
    values ('U',:new.rowid,:old.empno,:old.job,:old.hired
                          ,:old.sgrade,:old.msal,:old.deptno);
  elsif DELETING
  then
    -- Store snapshot of old version of row.
    insert into EMP_TE(DML,ROW_ID,EMPNO,JOB,HIRED,SGRADE,MSAL,DEPTNO)
    values ('D',null,:old.empno,:old.job,:old.hired
                    ,:old.sgrade,:old.msal,:old.deptno);
  end if;
  --
end;
/
create view V_EMP_ITE as
select e.*
from EMP_TE te
    ,EMP e
where DML='I'
  and te.ROW_ID = e.ROWID
/
create view V_EMP_UTE as
select e.EMPNO   as N_EMPNO  ,e.JOB   as N_JOB  ,e.HIRED   as N_HIRED
      ,e.SGRADE  as N_SGRADE ,e.MSAL  as N_MSAL ,e.DEPTNO  as N_DEPTNO
      ,te.EMPNO  as O_EMPNO  ,te.JOB  as O_JOB  ,te.HIRED  as O_HIRED
      ,te.SGRADE as O_SGRADE ,te.MSAL as O_MSAL ,te.DEPTNO as O_DEPTNO
from EMP_TE te
    ,EMP e
where DML='U'
  and te.ROW_ID = e.ROWID
/
create view V_EMP_DTE as
select EMPNO,JOB, HIRED,SGRADE,MSAL,DEPTNO
from EMP_TE
where DML='D'
/

Note: In the transition effect, you need to maintain only the columns that are involved in any of the (multi-row) constraints that involve the EMP table structure. This is why the preceding code does not maintain columns ENAME, BORN, and USERNAME: these three columns aren’t involved in any of the constraints of DB_UEX.
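As a quick sanity check of this bookkeeping, you could execute a DML statement and then inspect a TE view before the next DML statement on EMP clears the EMP_TE table (a hypothetical session sketch; EMPNO 7566 is an assumed value):

```sql
update EMP set JOB = 'MANAGER' where EMPNO = 7566;

-- Until the next DML statement on EMP fires the before statement trigger
-- (which empties EMP_TE), the update TE view shows one row per updated
-- EMP row, exposing both the old and the new column values.
select O_JOB, N_JOB, O_DEPTNO, N_DEPTNO
from V_EMP_UTE;
```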

Given the preceding code, you can now create more efficient after INSERT and DELETE statement triggers for constraint EMP_TAB03. Remember that in EM4 the DI code for this constraint would needlessly run if a sales rep was inserted or a trainer deleted.

Using the transition effect, you can now precisely code when the constraints need to be validated on execution of INSERT statements or DELETE statements:

  • For inserts: Only when the statement inserts a president or a manager should you need to check whether there is an administrator (in the same department).
  • For deletes: Only when the statement deletes an administrator should you need to check whether there (still) is another administrator, in case the department employs a manager or president.

In all other cases of INSERT or DELETE statements, it is not required to validate constraint EMP_TAB03.

In Listing 11-34 you can find the modified insert and delete triggers. These now first query the transition effect to verify if one of the preceding properties is TRUE, and if so, only then execute the query that validates constraint EMP_TAB03.

Listing 11-34. EM5’s More Efficient Insert and Delete Triggers for Constraint EMP_TAB03

create trigger EMP_AIS_TAB03
after insert on EMP
declare pl_dummy varchar(40);
begin
  -- If this returns no rows, then EMP_TAB03 cannot be violated.
  select 'EMP_TAB03 must be validated' into pl_dummy
  from DUAL
  where exists (select 'A president or manager has just been inserted'
                from v_emp_ite
                where JOB in ('PRESIDENT','MANAGER'));
  --
  begin
    --
    select 'Constraint EMP_TAB03 is satisfied' into pl_dummy
    from DUAL
    where not exists(select d.DEPTNO
                     from EMP d
                     where exists(select e2.*
                                  from EMP e2
                                  where e2.DEPTNO = d.DEPTNO
                                    and e2.JOB in ('PRESIDENT','MANAGER'))
                        and not exists(select e3.*
                                       from EMP e3
                                       where e3.DEPTNO = d.DEPTNO
                                         and e3.JOB = 'ADMIN'));
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,'Constraint EMP_TAB03 is violated.');
    --
  end;
exception when no_data_found then
    -- No need to validate EMP_TAB03.
    null;
    --
end;
/
create trigger EMP_ADS_TAB03
after delete on EMP
declare pl_dummy varchar(40);
begin
  -- If this returns no rows, then EMP_TAB03 cannot be violated.
  select 'EMP_TAB03 must be validated' into pl_dummy
  from DUAL
  where exists (select 'An administrator has just been deleted'
                from v_emp_dte
                where JOB='ADMIN');
  --
  begin
    --
    select 'Constraint EMP_TAB03 is satisfied' into pl_dummy
    from DUAL
    where not exists(select d.DEPTNO
                     from EMP d
                     where exists(select e2.*
                                  from EMP e2
                                  where e2.DEPTNO = d.DEPTNO
                                    and e2.JOB in ('PRESIDENT','MANAGER'))
                       and not exists(select e3.*
                                      from EMP e3
                                      where e3.DEPTNO = d.DEPTNO
                                        and e3.JOB = 'ADMIN'));
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,'Constraint EMP_TAB03 is violated.');
    --
  end;
exception when no_data_found then
    -- No need to validate EMP_TAB03.
    null;
    --
end;
/

You might have noticed that with the availability of the transition effect, you can now also write a more efficient update trigger. For instance, updating a trainer to become a sales rep does not require executing DI code for EMP_TAB03.

Written as a query on DUAL, here is the property to look for, which would require EMP_TAB03 to be validated in case of the execution of an UPDATE statement:

select 'EMP_TAB03 is in need of validation'
from DUAL
where exists(select  'Some department just won a president/manager or
                      just lost an administrator'
             from v_emp_ute
             where (n_job in ('PRESIDENT','MANAGER') and
                    o_job not in ('PRESIDENT','MANAGER'))
                or (o_job='ADMIN' and n_job<>'ADMIN')
                or (o_deptno<>n_deptno and
                    (o_job='ADMIN' or n_job in ('PRESIDENT','MANAGER'))))

You can use the preceding query to create a more efficient update trigger in the same way as was shown for the insert and delete triggers in Listing 11-34.
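To make this concrete, here is one way such an EM5 update trigger could be coded, combining the preceding TE query with the validation query from Listing 11-34 (a sketch following the pattern of that listing, not necessarily the definitive implementation):

```sql
create trigger EMP_AUS_TAB03
after update on EMP
declare pl_dummy varchar(40);
begin
  -- TE query: if this returns no rows, then EMP_TAB03 cannot be violated.
  select 'EMP_TAB03 must be validated' into pl_dummy
  from DUAL
  where exists(select 'Some department just won a president/manager or
                       just lost an administrator'
               from v_emp_ute
               where (n_job in ('PRESIDENT','MANAGER') and
                      o_job not in ('PRESIDENT','MANAGER'))
                  or (o_job='ADMIN' and n_job<>'ADMIN')
                  or (o_deptno<>n_deptno and
                      (o_job='ADMIN' or n_job in ('PRESIDENT','MANAGER'))));
  --
  begin
    --
    select 'Constraint EMP_TAB03 is satisfied' into pl_dummy
    from DUAL
    where not exists(select d.DEPTNO
                     from EMP d
                     where exists(select e2.*
                                  from EMP e2
                                  where e2.DEPTNO = d.DEPTNO
                                    and e2.JOB in ('PRESIDENT','MANAGER'))
                       and not exists(select e3.*
                                      from EMP e3
                                      where e3.DEPTNO = d.DEPTNO
                                        and e3.JOB = 'ADMIN'));
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,'Constraint EMP_TAB03 is violated.');
    --
  end;
exception when no_data_found then
    -- No need to validate EMP_TAB03.
    null;
    --
end;
/
```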

We refer to these queries on the transition effect views hereafter as transition effect queries (TE queries).

Note: We don’t know whether the DBMS can compute the TE queries (looking for the constraint-specific property) used in this execution model. Investigating the scientific research done in this area does not provide us with a clear answer to this question. Therefore, we cannot decisively say whether a DBMS should in principle be capable of supplying us with full declarative multi-tuple constraint support using execution model EM5.

Listing 11-35 supplies the triggers for implementing the second example table constraint using execution model EM5. It assumes the code for maintaining the transition effect views for the DEPT table structure has already been set up in a similar way as the transition effect for EMP was set up in Listing 11-33. Note that constraint DEPT_TAB01 does not require a delete trigger.

Listing 11-35. EM5 Implementation of DI Code for Constraint DEPT_TAB01

create trigger DEPT_AIS_TAB01
after insert on DEPT
declare pl_dummy varchar(40);
begin
  -- If this returns no rows, then DEPT_TAB01 cannot be violated.
  select 'DEPT_TAB01 must be validated' into pl_dummy
  from DUAL
  where exists (select 'A row has just been inserted'
                from v_dept_ite);
  --
  begin
    --
    select 'Constraint DEPT_TAB01 is satisfied' into pl_dummy
    from DUAL
    where not exists(select m.DEPTNO
                     from DEPT m
                     where 2 < (select count(*)
                                from DEPT d
                                where d.MGR = m.MGR));
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,'Constraint DEPT_TAB01 is violated.');
    --
  end;
exception when no_data_found then
    -- No need to validate DEPT_TAB01.
    null;
    --
end;
/
create trigger DEPT_AUS_TAB01
after update on DEPT
declare pl_dummy varchar(40);
begin
  -- If this returns no rows, then DEPT_TAB01 cannot be violated.
  select 'DEPT_TAB01 must be validated' into pl_dummy
  from DUAL
  where exists (select 'A department manager has just been updated'
                from v_dept_ute
                where o_mgr<>n_mgr);
  --
  begin
    --
    select 'Constraint DEPT_TAB01 is satisfied' into pl_dummy
    from DUAL
    where not exists(select m.DEPTNO
                     from DEPT m
                     where 2 < (select count(*)
                                from DEPT d
                                where d.MGR = m.MGR));
    --
  exception when no_data_found then
    --
    raise_application_error(-20999, 'Constraint DEPT_TAB01 is violated.');
    --
  end;
exception when no_data_found then
    -- No need to validate DEPT_TAB01.
    null;
    --
end;
/

Note that in the case of constraint DEPT_TAB01, execution models EM4 and EM5 are equivalent.

The question might arise whether testing for a certain property in the transition effect defeats its purpose: if performing this test is as costly as running the actual constraint validation code, then what is your gain? In answer to this question, we can make the following observations:

  • The number of rows in the transition effect is usually much smaller than the number of rows in the table structures that are involved in the constraint.
  • The transition effect is always in cache, which cannot be said about all rows in the involved table structures.
  • The TE query itself will normally be much simpler than the query that is conditionally run to validate the constraint.

The examples shown have demonstrated this last observation. Together with the first two observations, this should then answer the question; a TE query does not defeat its purpose (efficiency of DI code).

Furthermore, there is a second—more important—purpose for guarding constraint validation with TE queries. Preventing unnecessary execution of queries that validate the constraint will strongly benefit the concurrency of transactions. We’ll deal with this aspect in the next subsection, “DI Code Serialization.”

There is one more way to further optimize the efficiency of execution model EM5. This involves executing a more efficient query to validate the constraint. Up until now you’ve derived these queries directly from the formal specifications of the constraints, and as such they validate the full constraint predicate.

In the case of EMP_TAB03, the query validates whether every department satisfies the predicate that is quantified (if there is a president/manager, then there is an administrator). In the case of DEPT_TAB01, the query validates whether every department manager is not managing more than two departments. Often, given a triggering DML statement, it suffices to execute a query that validates a weaker predicate.

For instance, if an INSERT statement inserts a manager for department 10 into the EMP table structure, then theoretically you would only have to validate whether the quantified predicate is still satisfied for department 10 only. Likewise, if an UPDATE statement changes the manager for some department, then you would only have to validate whether this new department manager isn’t managing more than two departments.

The last execution model addresses this opportunity for further increasing the efficiency of DI code.

Execution Model 6: On-Transition-Effect-Property Plus Optimized-Query

This execution model (EM6) resembles EM5 in that it also requires a transition effect for each table structure; you now use this transition effect in a slightly more sophisticated manner, so that you can determine the weaker predicate.

Instead of just looking for a property in the transition effect to guard execution of a constraint validation query, you now also use the transition effect to supply values that can be used to optimize the validation query. For instance, in the case of EMP_TAB03, you can use the transition effect to determine for which department(s) the weaker predicate must be checked; this should always result in executing a more efficient validation query. In the case of DEPT_TAB01, you can use the transition effect to find out for what MGR value(s) the weaker predicate must be checked.

Listing 11-36 shows you how to do this. It shows the EM6 version of the insert, update, and delete triggers for constraint EMP_TAB03.

Listing 11-36. EM6 Implementation of DI Code for Constraint EMP_TAB03

create trigger EMP_AIS_TAB03
after insert on EMP
declare pl_dummy varchar(40);
begin
  --
  for r in (select distinct deptno
            from v_emp_ite
            where JOB in ('PRESIDENT','MANAGER'))
  loop
  begin
    -- Note: this now uses the r.deptno value from the preceding TE query.
    select 'Constraint EMP_TAB03 is satisfied' into pl_dummy
    from DUAL
    where not exists(select e2.*
                     from EMP e2
                     where e2.DEPTNO = r.DEPTNO
                       and e2.JOB in ('PRESIDENT','MANAGER'))
       or exists(select e3.*
                 from EMP e3
                 where e3.DEPTNO = r.DEPTNO
                    and e3.JOB = 'ADMIN');
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,
     'Constraint EMP_TAB03 is violated for department '||to_char(r.deptno)||'.');
    --
  end;
  end loop;
end;
/
create trigger EMP_ADS_TAB03
after delete on EMP
declare pl_dummy varchar(40);
begin
  --
  for r in (select distinct deptno
            from v_emp_dte
            where JOB='ADMIN')
  loop
  begin
    --
    select 'Constraint EMP_TAB03 is satisfied' into pl_dummy
    from DUAL
    where not exists(select e2.*
                     from EMP e2
                     where e2.DEPTNO = r.DEPTNO
                       and e2.JOB in ('PRESIDENT','MANAGER'))
       or exists(select e3.*
                 from EMP e3
                 where e3.DEPTNO = r.DEPTNO
                    and e3.JOB = 'ADMIN');
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,
     'Constraint EMP_TAB03 is violated for department '||to_char(r.deptno)||'.');
    --
  end;
  end loop;
end;
/
create trigger EMP_AUS_TAB03
after update on EMP
declare pl_dummy varchar(40);
begin
  --
  for r in (select n_deptno as deptno
            from v_emp_ute
            where (o_job not in ('PRESIDENT','MANAGER') or updated_deptno='TRUE')
              and n_job in ('PRESIDENT','MANAGER')
            union
            select o_deptno as deptno
            from v_emp_ute
            where (n_job<>'ADMIN' or updated_deptno='TRUE')
               and o_job = 'ADMIN')
  loop
  begin
    --
    select 'Constraint EMP_TAB03 is satisfied' into pl_dummy
    from DUAL
    where not exists(select e2.*
                     from EMP e2
                     where e2.DEPTNO = r.DEPTNO
                       and e2.JOB in ('PRESIDENT','MANAGER'))
       or exists(select e3.*
                 from EMP e3
                 where e3.DEPTNO = r.DEPTNO
                    and e3.JOB = 'ADMIN');
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,
     'Constraint EMP_TAB03 is violated for department '||to_char(r.deptno)||'.');
    --
  end;
  end loop;
end;
/

imagesNote The distinct keyword inside the preceding TE queries prevents multiple executions of the same constraint validation query in case the triggering DML statement affects more than one row of the same department.
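The triggers in Listing 11-36 rely on the transition-effect views (v_emp_ite, v_emp_ute, and so on) whose maintenance was shown in Listing 11-33. As a reminder, the UPDATE transition-effect view for EMP could be shaped as follows. This is only a sketch: the name of the underlying global temporary table (emp_te) and its dml_type column are assumptions; what matters is the column pattern (old values, new values, and a convenience flag) that the EM6 triggers depend on.

create view v_emp_ute as
select o_empno, n_empno
      ,o_job,   n_job
      ,o_deptno, n_deptno
      -- Convenience flag used by the TE queries.
      ,case when o_deptno <> n_deptno then 'TRUE' else 'FALSE'
       end as updated_deptno
from emp_te -- Global temporary table maintained by the after row triggers.
where dml_type = 'UPDATE';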

Because the example constraints happen to be universal quantifications, finding the weaker predicate in the cases of EMP_TAB03 and DEPT_TAB01 is trivial. However, this is not always the case; the weaker predicate can also be rather complex to discover. This all depends on the specification of the constraint. Again, we don’t know whether it is possible at all for the DBMS to compute the weaker predicate (including the TE query) automatically from a declared formal specification of a constraint, instead of you having to discover it yourself.

DI Code Serialization

You should be aware of another major issue when implementing DI code yourself. This issue applies irrespective of whether you follow the triggered procedural strategy or the embedded procedural strategy.

In Oracle’s SQL DBMS, transactions execute concurrently. At any point in time there can be many open transactions; that is, transactions that have started but that have not yet committed. As mentioned earlier in this chapter, an open transaction cannot see the changes made by other concurrent open transactions. This has serious ramifications in the area of correctness of the DI code. As we’ll demonstrate in a moment, the root cause of the issue lies in the fact that the constraint validation query, which executes as part of the DI code, assumes that all data it reads—or fails to read due to the absence of that data—is not currently being changed (or created) by other transactions. The DI code for a given constraint can only correctly implement the constraint if concurrently running transactions that execute the validation query (embedded in the DI code) of this constraint are serialized.

imagesNote We’ll actually demonstrate some of the classical concurrency problems that arise in a DBMS that does not support serializability of concurrently executing transactions. If you are unfamiliar with the concept of serializability, then check out some of the references that we’ve added in Appendix C.

Let’s demonstrate this serializability issue by describing a scenario of two concurrently executing transactions that both potentially violate the EMP_TAB03 constraint. For the sake of simplicity, assume that the EMP table structure currently contains a single row representing an administrator working for department 20. Transaction TX1 inserts a manager working for department 20 in the EMP table structure. Transaction TX2 deletes the administrator working for department 20 from the EMP table structure. Now take a look at Table 11-2. It describes a possible scenario of these two transactions executing concurrently.

images

At time t=0, transaction TX1 starts. It inserts the manager, which causes the execution of DI code for constraint EMP_TAB03. The constraint validation query that executes as part of this DI code finds the administrator in department 20, which causes the DI code to allow this insert. At time t=1, transaction TX2 starts. It deletes the only row in the EMP table structure representing the administrator. Deleting an administrator also causes the same DI code for constraint EMP_TAB03 to be executed. The constraint validation query that executes as part of this DI code does not find a manager (or president) working in department 20, which causes the DI code to also allow this delete.

imagesNote At t=1, queries executing within transaction TX2 do not see the uncommitted changes made by transaction TX1.

At t=2, transaction TX1 commits. At t=3, transaction TX2 commits. The database now violates constraint EMP_TAB03; the EMP table structure holds a manager working for department 20, but there is no administrator in that department.

Let’s also look at an example within the context of constraint DEPT_TAB01. Assume the DEPT table structure currently holds one department managed by employee 1042. Transaction TX3 inserts a new department that is also managed by employee 1042. Transaction TX4 also inserts a department again managed by employee 1042. Take a look at Table 11-3. It describes a possible scenario of these two transactions executing concurrently.

images

Again, the DI code that executes at t=1 in TX4 does not see the uncommitted department inserted by transaction TX3 at t=0. Both inserts execute successfully. After t=3, the database violates constraint DEPT_TAB01; employee 1042 is managing three departments.

For every multi-tuple constraint, you can devise a scenario of concurrently executing transactions such that upon commit of these transactions the database is in violation of the constraint. Note that this is irrespective of the execution model that is used to implement the DI code.

Now, the question is, can you resolve this issue? The answer is yes. However, it requires that you develop and embed sometimes rather sophisticated serialization code in the DI code to ensure that there can never be two transactions executing at the same time that involve the same constraint (that is, that run the validation query of the same constraint).

imagesNote The following section assumes that you are familiar with Oracle’s dbms_lock package and the concept of an autonomous transaction. If you’re not, then we advise you first to study the Oracle documentation on this package and autonomous transactions.

The trick is to use the supplied dbms_lock package of Oracle’s SQL DBMS. By using this package you can acquire application locks, through which you can effectively serialize concurrently executing transactions that involve the same constraint.

Listing 11-37 shows the code of procedure p_request_lock that’s built on top of the dbms_lock package, and through which you can request an application lock. This procedure needs to call a dbms_lock module that performs an implicit commit. Because you’ll be calling p_request_lock from DI code, which in turn is executed from within a trigger (a context in which you’re not allowed to commit), you need to hide this implicit commit from the current transaction. You can do this by use of an autonomous transaction. Auxiliary function f_allocate_unique (also in Listing 11-37) implements this autonomous transaction.

Listing 11-37. Application Lock Services

create function f_allocate_unique
(p_lockname in varchar) return varchar as
--
pragma autonomous_transaction;
--
pl_lockhandle   varchar(128);
--
begin
  -- This does implicit commit.
  dbms_lock.allocate_unique(upper(p_lockname)
                            ,pl_lockhandle
                            ,60*10); -- Set expiration to 10 minutes.
  --
  return pl_lockhandle;
  --
end;
/
create procedure p_request_lock(p_lockname in varchar) as
--
pl_lockhandle varchar(128);
pl_return number;
--
begin
  --
  -- Go get a unique lockhandle for this lockname.
  --
  pl_lockhandle := f_allocate_unique(p_lockname);
  --
  -- Request the named application lock in exclusive mode.
  -- Allow for a blocking situation that lasts no longer than 60 seconds.
  --
  pl_return :=
  dbms_lock.request(lockhandle        => pl_lockhandle
                   ,lockmode          => dbms_lock.x_mode
                   ,timeout           => 60
                   ,release_on_commit => true);
  --
  if pl_return not in (0,4)
  then
    raise_application_error(-20998,
          'Unable to acquire constraint serialization lock '||p_lockname||'.');
  end if;
  --
end;
/
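Here is a minimal illustration of how p_request_lock serializes two concurrent transactions; the session and statement interleaving shown is an assumed scenario:

-- Session 1:
begin
  p_request_lock('DEPT_TAB01'); -- Acquires the application lock immediately.
end;
/
-- Session 2, while session 1 has not yet committed:
begin
  p_request_lock('DEPT_TAB01'); -- Blocks, for at most 60 seconds.
end;
/
-- Session 1:
commit;                         -- Releases the lock; session 2 now resumes.

Because the lock was requested with release_on_commit => true, no explicit release call is necessary; the lock disappears when the acquiring transaction ends.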

You can now—in this case fairly easily—ensure that two transactions, which both need to run the DI code for constraint DEPT_TAB01, are correctly serialized to prevent the concurrency problem that was demonstrated in Table 11-3. Take a look at Listing 11-38, which shows the DI code for constraint DEPT_TAB01 using execution model EM6, including the necessary calls to p_request_lock to ensure correct serialization.

Listing 11-38. EM6 Implementation of DI Code for DEPT_TAB01 Including Serialization

create trigger DEPT_AIS_TAB01
after insert on DEPT
declare pl_dummy varchar(40);
begin
  --
  for r in (select distinct mgr as mgr
            from v_dept_ite)
  loop
  begin
    -- Acquire serialization lock.
    p_request_lock('DEPT_TAB01');
    --
    select 'Constraint DEPT_TAB01 is satisfied' into pl_dummy
    from DUAL
    where 2 >= (select count(*)
                from DEPT d
                where d.MGR = r.MGR);
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,'Constraint DEPT_TAB01 is violated '||
                            'for department manager '||to_char(r.MGR)||'.');
    --
  end;
  end loop;
  --
end;
/
create trigger DEPT_AUS_TAB01
after update on DEPT
declare pl_dummy varchar(40);
begin
  --
  for r in (select distinct n_mgr as mgr
            from v_dept_ute)
  loop
  begin
    -- Acquire serialization lock.
    p_request_lock('DEPT_TAB01');
    --
    select 'Constraint DEPT_TAB01 is satisfied' into pl_dummy
    from DUAL
    where 2 >= (select count(*)
                from DEPT d
                where d.MGR = r.MGR);
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,'Constraint DEPT_TAB01 is violated '||
                            'for department manager '||to_char(r.MGR)||'.');
  end;
  end loop;
  --
end;
/

The scenario that was described in Table 11-3 now executes as follows (see Table 11-4).

images

At t=1, transaction TX4 will start waiting for transaction TX3 to release the application lock. Right after t=2, this waiting ends and execution of the DI code resumes. It now executes the constraint validation query, and finds that the insert in TX4 is not allowed because it would make employee 1042 the department manager of three departments.

You’ve now fixed the serialization issue: two transactions that both require executing the DEPT_TAB01 validation query can never execute simultaneously. However, the locking scheme that is implemented might be a bit too coarse. For instance, if TX4 were to insert a department that is managed by someone else (that is, not employee 1042), then TX4 will also block, whereas it really needn’t block in this case for the DI code to enforce the constraint correctly.

Given the use of EM6 in Listing 11-38, you can implement a more granular locking scheme for this constraint, by changing the calls to p_request_lock to the following:

p_request_lock('DEPT_TAB01'||to_char(r.mgr));

Instead of always requesting an application lock whose name is constant ('DEPT_TAB01'), you now request an application lock whose name depends on the case that needs to be checked. Now two transactions inserting departments managed by different employees are allowed to execute simultaneously.

imagesNote Only EM6 allows you to implement this more granular locking scheme.

We’ll end this section with a few concluding remarks related to serializability, further clarifying why vendors offer such poor support for the declarative strategy to implement constraints.

Recall the various execution models to which you’ve been introduced. Not only will less efficient execution models execute nonoptimized constraint validation queries more often than necessary, but they’ll also acquire serialization locks more often than necessary. For instance, if you were to add serialization lock calls into DI code built using EM1, then effectively you’d only allow one transaction at a time; there cannot be any concurrent execution of transactions in this execution model. In EM2, this is relaxed to only one transaction at a time per table structure. Going further through all the remaining execution models, the level of concurrency increases each time.

Next to the remarks made earlier about the DBMS possibly not being able to compute either the TE queries or the optimized constraint validation queries, similar remarks apply to the DBMS being able to compute what the most optimal serialization lock calls should be. A lazy strategy that ensures correctness of DI code execution is always to acquire an exclusive application lock for each constraint (as shown in Listing 11-38). However, it is likely that you won’t accept the unnecessarily degraded concurrency of your DBMS that comes along with it.

Having extensively investigated the issues around implementing efficient DI code for table constraints using the triggered procedural strategy in this section, the next section will only offer you a summary of guidelines for implementing DI code for table constraints.

Implementing Table Constraints

As demonstrated by the previous section, it is here at the table constraint level where things rapidly become more complex. You can state only two types of table constraints declaratively: uniquely identifying attributes (keys), and subset requirements referencing back to the same table (in which case a subset requirement is a table constraint).

Here’s an example that demonstrates the use of SQL primary key constraints and unique key constraints (see Listing 11-39).

Listing 11-39. Declarative Implementation of the Keys of the DEPT Table Structure

alter table DEPT add constraint DEPT_key1 primary key(deptno);
alter table DEPT add constraint DEPT_key2 unique(dname,loc);

The example database design doesn’t involve a subset requirement at the table level. In the next section you’ll see how you can state database-level subset requirements declaratively.

Given the section “Table Constraint Implementation Issues,” you might now realize why the DBMS vendor doesn’t offer you the possibility of including queries inside declarative constraints; it’s likely that a DBMS cannot offer a better execution model than EM4, and it’s also likely that you won’t accept the imposed serialization of transactions that comes along with EM4.

For instance, the following SQL statement, which attempts to implement DEPT_TAB01 declaratively, is not allowed:

-- Invalid "alter table add constraint" syntax for Oracle.
alter table DEPT add constraint dept_tab01 check
                     (not exists(select m.DEPTNO
                                 from DEPT m
                                 where 2 < (select count(*)
                                            from DEPT d
                                            where d.MGR = m.MGR)))

You must implement procedural DI code yourself for all other types of table constraints. Developing and implementing efficient DI code is not an easy task. As shown when the various execution models were investigated, it requires the following high-level stepping stones for each constraint:

  1. Translate the formal specification into a constraint validation query.
  2. Develop code to maintain transition effects.
  3. Devise TE queries that ensure the constraint validation query is only run when necessary.
  4. Discover a means to optimize the constraint validation query by having the TE query provide values that can be used in the validation query.
  5. Devise and add a serialization strategy to the DI code.

However, you’ll find that by implementing table constraints regularly, and becoming proficient in doing so, implementing them procedurally is in general quite doable.

Implementing Database Constraints

As you probably know, declarative support for implementing database constraints is only available for subset requirements; in SQL a subset requirement is expressed by declaring a foreign key. Let’s take a look at implementing subset requirement PSSR1 from our example database design:

PSSR1(EMP,DEPT) :=
     /* Employee works for a known department */
     { e(DEPTNO) | e∈EMP } ⊆ { d(DEPTNO) | d∈DEPT }

In SQL, this subset requirement is stated as follows:

alter table emp add constraint emp_fk_dept
                    foreign key(deptno) references dept(deptno);

SQL requires that the set of attributes that are referenced constitute a key. In the preceding example, this is the case; DEPTNO is uniquely identifying in DEPT. However, in the following case—requirement PSSR7—LOC does not constitute a key in DEPT. Therefore, you cannot implement PSSR7 by declaring a foreign key.

PSSR7(OFFR,DEPT) :=
     /* Courses take place in locations where we have a department */
     { o(LOC) | o∈OFFR } ⊆ { d(LOC) | d∈DEPT }

If you try the following alter table command, the SQL DBMS will give you an error:

-- Invalid because dept(loc) is not a key.
alter table offr add constraint offr_fk_dept
                     foreign key(loc) references dept(loc);

There are other scenarios where a foreign key cannot be used to implement a subset requirement declaratively. Some of the subset requirements in the example database design reference a subset of the tuples in the referenced table structure; this is quite common. Here is an example:

PSSR2(DEPT,EMP) :=
     /* Dept mgr is a known employee, excluding admins and president */
     { d(MGR)   | d∈DEPT } ⊆
     { e(EMPNO) | e∈EMP ⋀ e(JOB) ∉ {'ADMIN','PRESIDENT'} }

You can declare a foreign key from DEPT to EMP as follows:

alter table dept add constraint dept_fk_emp
                     foreign key(mgr) references emp(empno);

This foreign key will enforce that only known employees are referenced; however, it will still allow administrators or the president to manage a department. There is no declarative way to specify that only a subset of the employees may be referenced; you’ll need to write additional procedural code for this.

A similar restriction applies to the cases where only a subset of the rows needs to reference the key values in rows of another table. This is the case in subset requirement PSSR8:

PSSR8(OFFR,EMP) :=
    /* Trainer of course offering is a known trainer */
    { o(TRAINER) | o∈OFFR ⋀ o(TRAINER) ≠ -1 } ⊆
    { e(EMPNO) | e∈EMP ⋀ e(JOB) = 'TRAINER' }

It is not possible to declare to the DBMS that each row in the subset of OFFR rows where TRAINER≠-1 references a known trainer.

There is a trick that you can apply in this case, though. By choosing to use a NULL instead of value -1 as a means to represent that no trainer has been assigned yet, you enable yourself to declare a foreign key, at least, to enforce that trainer assignments reference a known employee. Again, you still need to develop additional procedural code to implement the restriction at the other end (only employees where JOB='TRAINER' are allowed to be referenced).
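Under this approach, the declarative half of PSSR8 could be set up as follows. This is a sketch: the constraint name OFFR_FK_EMP is illustrative, and we assume TRAINER was previously declared mandatory, with -1 representing “no trainer assigned yet.”

-- Allow NULL to represent 'no trainer assigned yet'.
alter table OFFR modify (TRAINER null);
update OFFR set TRAINER = null where TRAINER = -1;
-- Enforce that assigned trainers reference known employees.
alter table OFFR add constraint OFFR_FK_EMP
                     foreign key(TRAINER) references EMP(EMPNO);

Note that a foreign key ignores rows in which the referencing column holds a NULL; this is exactly what makes the trick work.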

Now let’s spend a little time investigating this; before you build additional procedural code, it would be wise first to think about what the remaining predicate to be implemented would be here (given that the foreign key already does some work to implement PSSR8). To be more precise, can you rewrite PSSR8 into a conjunction where one conjunct covers exactly what the foreign key already implements declaratively, and the other conjunct covers what remains to be implemented procedurally? Here it is.

(∀o∈OFFR: o(TRAINER)≠-1 ⇒ (∃e∈EMP: e(empno)=o(trainer))) ⋀
(∀t∈(OFFR◊◊{(trainer;empno)})⊗EMP: t(job)='TRAINER')

The preceding first conjunct states that for every offering the assigned trainer should be a known employee; this represents what the foreign key implements. The second conjunct states that for all tuples in the join of OFFR (which requires attribute renaming) and EMP, the JOB attribute should hold value 'TRAINER'; this represents what remains to be implemented procedurally. By the way, do you recognize the predicate pattern of the second conjunct? It is in fact a tuple-in-join predicate.

imagesNote You’ll probably intuitively comprehend that the second predicate is exactly what remains to be implemented procedurally for PSSR8. It is worth mentioning that you could actually formally prove that the preceding conjunction is logically equivalent to the PSSR8 predicate. Doing so requires the development of more rewrite rules with regard to quantifiers, and a few concepts with regard to how formal proofs are set up (neither of which we have covered in this book).

Procedurally implementing database constraints is done in a similar way to implementing table constraints; it just requires developing more triggers, because multiple table structures are involved now. Take a look at Listing 11-40. It lists all necessary triggers using EM6 (including serialization logic) for implementing the remaining tuple-in-join predicate listed earlier to complete the PSSR8 implementation. In this case, you require an after UPDATE statement trigger on the EMP table structure, and an after INSERT statement trigger together with an after UPDATE statement trigger on the OFFR table structure.

Listing 11-40 does not list the necessary global temporary table, pre-statement trigger, after row triggers, and view definitions to maintain the transition effect views; you need to set these up for both the OFFR and EMP table structures in a similar way, as shown in Listing 11-33.

Listing 11-40. EM6 Implementation of DI Code for Remaining PSSR8 Predicate

create trigger EMP_AUS_PSSR8
after update on EMP
declare pl_dummy varchar(40);
begin
  -- Changing a trainer's job requires validation of PSSR8.
  for r in (select n_empno as empno
            from v_emp_ute e
            where (e.o_empno=e.n_empno and e.n_job<>'TRAINER' and e.o_job='TRAINER')
               or (e.o_empno<>e.n_empno and e.n_job<>'TRAINER'))
  loop
  begin
    -- Acquire serialization lock.
    p_request_lock('PSSR8'||to_char(r.empno));
    --
    select 'Constraint PSSR8 is satisfied' into pl_dummy
    from DUAL
    where not exists(select 'This employee is assigned as a trainer to an offering'
                     from OFFR o
                     where o.TRAINER = r.empno);
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,'Constraint PSSR8 is violated '||
                            'for employee '||to_char(r.empno)||'.');
    --
  end;
  end loop;
  --
end;
/
create trigger OFFR_AIS_PSSR8
after insert on OFFR
declare pl_dummy varchar(40);
begin
  -- Inserting an offering requires validation of PSSR8.
  for r in (select distinct trainer
            from v_offr_ite i)
  loop
  begin
    -- Acquire serialization lock.
    p_request_lock('PSSR8'||to_char(r.trainer));
    --
    select 'Constraint PSSR8 is satisfied' into pl_dummy
    from DUAL
    where not exists(select 'This employee is not a trainer'
                     from EMP e
                     where e.EMPNO = r.trainer
                       and e.JOB  <> 'TRAINER');
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,'Constraint PSSR8 is violated '||
                            'for trainer '||to_char(r.trainer)||'.');
  end;
  end loop;
end;
/
create trigger OFFR_AUS_PSSR8
after update on OFFR
declare pl_dummy varchar(40);
begin
  -- Updating the trainer of an offering requires validation of PSSR8.
  for r in (select distinct n_trainer as trainer
            from v_offr_ute u
            where o_trainer<>n_trainer)
  loop
  begin
    -- Acquire serialization lock.
    p_request_lock('PSSR8'||to_char(r.trainer));
    --
    select 'Constraint PSSR8 is satisfied' into pl_dummy
    from DUAL
    where not exists(select 'This employee is not a trainer'
                     from EMP e
                     where e.EMPNO = r.trainer
                        and e.JOB <> 'TRAINER');
    --
  exception when no_data_found then
    --
    raise_application_error(-20999,'Constraint PSSR8 is violated '||
                            'for trainer '||to_char(r.trainer)||'.');
    --
  end;
  end loop;
end;
/

We’ll end this section with an observation about the poor declarative support for multi-tuple constraints offered by DBMS vendors.

Because we believe that it is not possible for a DBMS vendor to program an algorithm that accepts an arbitrarily complex predicate and then computes efficient TE queries, a minimal validation query, and optimal serialization code to implement execution model EM6, we should not expect full support for multi-tuple constraints—in a practical, usable and acceptable way—from these vendors in the future. The best we can hope for is that database researchers first come up with more common classes of constraints and develop convenient shorthands for these. The DBMS vendors, in their turn, should then provide us with new declarative constructs, consistent with these shorthands, to state these common classes of constraints easily to the DBMS. Given such a common class declaration, the DBMS vendor should be able to program an algorithm that provides us with an EM6-like execution model under the covers to implement the constraint.

imagesNote We’ve hinted at a few common classes of constraints in this book for which DBMS vendors should offer us full declarative support: specialization, generalization, and tuple-in-join constraints. Also, in the area of subset requirement constraints—where we currently only have the SQL foreign key construct available to us—more declarative variations should be offered.

This concludes the investigation of developing DI code for database constraints. The next section takes a look at implementing transition constraints.

Implementing Transition Constraints

Please go back to Listing 8-8 and take a moment to recall the transition constraints of our example database universe.

Do you notice that the way the state transition constraints are specified is not different from the way you would specify a database constraint? They are predicates that involve two or more parameters of type table. Of course, in the case of transition constraints, these parameters always represent old and new snapshots of involved table structures. But in essence, a transition constraint is a predicate involving multiple table structures. This means that the complexity involved in implementing DI code for transition constraints is in principle not different from the complexity involved in database constraints.

A transition constraint deals with a transaction’s begin state and end state. As explained before, transactions are implemented in SQL by the consecutive execution of DML statements. Such an implementation not only creates intermediate database states, but—in the context of this section—also creates intermediate state transitions. Formally, you would only need to validate the transition from the begin state to the end state of the transaction. DI code for transition constraints should only execute after the last DML statement of the transaction has created the end state; that is, its execution should be deferred to the end of the transaction.

In the section “Bringing Deferred Checking into the Picture,” you’ll find a brief exploration of deferring execution of DI code. For now we’ll demonstrate how you can develop DI code for transition constraints (in the triggered procedural strategy), such that all intermediate state transitions satisfy them too.

imagesNote As you probably know, there is no support whatsoever in current SQL DBMSes to implement transition constraints declaratively.

If you start thinking about implementing transition constraints, you’ll immediately hit this question: “In an after statement trigger, how do I query the old snapshot of an involved table structure?” The answer happens to be simple. In Oracle’s SQL DBMS you can use flashback queries to do exactly that: query an old snapshot of a table structure.

To be able to use a flashback query on a table structure, you’ll somehow need to administer what the system change number was when the transaction started. You can do this using a packaged variable, or a session temporary table. Note that only the table structures whose old snapshot is involved in any of the state transition constraints will ever need to be queried using a flashback query. So it is safe to administer the system change number only when a DML statement changes one of these table structures; you can use pre-statement triggers for this that determine the current system change number via dbms_flashback.get_system_change_number and store that in the packaged variable or session table. In the following example, we’ll assume these triggers are already in place.
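For instance, you could administer the start SCN in a packaged variable as follows. This is a sketch: the package name tx_scn and the trigger name EMP_BS_SCN are assumptions, and a real implementation would also need to reset the variable at the end of each transaction (packaged variables live for the duration of the session, not of the transaction).

create package tx_scn as
  -- Holds the SCN recorded at the first relevant DML statement.
  pl_start_scn number := null;
end;
/
create function f_get_tx_start_scn return number as
begin
  return tx_scn.pl_start_scn;
end;
/
create trigger EMP_BS_SCN
before insert or update or delete on EMP
begin
  -- Record the SCN only once per transaction.
  if tx_scn.pl_start_scn is null
  then
    tx_scn.pl_start_scn := dbms_flashback.get_system_change_number;
  end if;
end;
/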

We’ll now present you with an exploration of implementing the DI code for state transition constraint STC5.

STC5(HISTB,EMPB,HISTE,EMPE) :=
     /* New history records must accurately reflect employee updates*/
     ( ∀h∈(HISTE⇓{EMPNO,UNTIL} − HISTB⇓{EMPNO,UNTIL})⊗HISTE:
         h(UNTIL) = sysdate ⋀
         ( ∃e1∈EMPB, e2∈EMPE:
           e1↓{EMPNO,MSAL,DEPTNO} = h↓{EMPNO,MSAL,DEPTNO} ⋀
           e2(EMPNO) = h(EMPNO) ⋀
           ( e2(MSAL) ≠ e1(MSAL) ⋁ e2(DEPTNO) ≠ e1(DEPTNO) ) ) )

For maintaining this constraint, you only need to develop an after INSERT statement trigger on the HIST table structure. Listing 11-41 lists this trigger. Note how the formal specification is again translated fairly easily into a constraint validation query. The following code assumes the availability of a transition effect view, and also assumes the availability of function f_get_tx_start_scn to retrieve the previously stored system change number.

Listing 11-41. EM6 Implementation of DI Code for STC5 (Without Serialization)

create or replace trigger HIST_AIS_STC5
after insert on HIST
declare pl_dummy varchar2(40);
begin
  -- Inserting a history record requires validation of STC5.
  for r in (select empno,until
            from v_hist_ite i)
  loop
  begin
    --
    select 'Constraint STC5 is satisfied' into pl_dummy
    from DUAL
    where exists (select 'The history record is OK'
                  from HIST h
                  where h.EMPNO = r.empno
                    and h.UNTIL = r.until
                    and h.UNTIL = sysdate
                    and exists(select 'A corresponding update on EMP'
                               from EMP as of scn f_get_tx_start_scn e1
                                   ,EMP e2
                               where e1.EMPNO  = h.EMPNO
                                 and e1.MSAL   = h.MSAL
                                 and e1.DEPTNO = h.DEPTNO
                                 and e2.EMPNO  = h.EMPNO
                                 and (e2.MSAL<>e1.MSAL or e2.DEPTNO<>e1.DEPTNO)));
  exception when no_data_found then
    --
    raise_application_error(-20999,'Constraint STC5 is violated '||
                            'for history record '||to_char(r.empno)||'/'||
                                                   to_char(r.until)||'.');
    --
  end;
  end loop;
  --
end;

Devising a serialization strategy for DI code of transition constraints is not trivial. For instance, consider the following scenario (in Table 11-5) of two concurrently executing transactions: TX5 and TX6.

Table 11-5. Scenario of Two Concurrently Executing Transactions TX5 and TX6 (table not reproduced)

Given Oracle’s read-committed isolation level, when the DI code for constraint STC5 fires at t=4, it sees the change in salary for employee 1042 established by transaction TX6. The old snapshot from t=0 did not have this change. The new snapshot (that is, the current one at t=4) does have this change because TX6 has already committed at t=4. STC5’s DI code could therefore approve the history record insert performed by TX5 (given, of course, that it correctly reflects the salary update). Adding serialization lock calls won’t help in this case; any locks acquired by the DI code running in TX6 will already have been released at t=4.

Obviously, the reason the current setup of STC5's DI code doesn't work correctly is that it wrongly deduces that the change it sees in the EMP table structure is due to prior DML statements in the same transaction.

We've given you this example to show that implementing DI code for (at least some) state transition constraints is far from trivial and definitely requires further investigation.

imagesNote At the time of the writing of this book we have not yet fully investigated the issues with respect to correctly serializing state transition DI code. In this case, a possible fix could be to have the constraint validation query (the flashback query) run in Oracle’s serializable isolation level. Queries that run in this mode won’t see changes made by other transactions that were committed after the current transaction began; you only see the changes that the current transaction made. Unfortunately you cannot change the isolation mode of a single query; you would have to resort to running the whole transaction in Oracle’s serializable mode.
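For completeness, this is how a transaction would opt in to serializable mode; the statement must be executed before any DML in the transaction:

```sql
-- Must be the first statement of the transaction:
set transaction isolation level serializable;
-- All subsequent queries in this transaction, including the flashback
-- validation query, now see only this transaction's own changes plus
-- the database state as of the start of the transaction.
commit;
```

Bear in mind that in serializable mode a DML statement can fail with a serialization error (ORA-08177) when it touches data changed by a concurrently committed transaction, so the application must be prepared to retry.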

In the next section you’ll find an exploration of another issue that we have ignored up to now.

Bringing Deferred Checking into the Picture

So far we’ve been assuming that during the serial execution of DML statements in a transaction, every intermediate database state should satisfy all constraints; all required DI code is run immediately at statement level using after DML-triggers. However, you’ll sometimes require that the execution of DI code be deferred to the end of the transaction.

Why Deferred Checking?

There is a class of data integrity constraints that will always need to be temporarily violated during the transaction, and rechecked in a deferred way at the end of the transaction. An example of this is a specialization constraint. Take a look at constraint PSPEC1 from our example database design:

PSPEC1(EMP,SREP) :=
    /* Sales reps have a target and a commission */
    { e(EMPNO) | e∈EMP ⋀ e(JOB) = 'SALESREP' } =
    { s(EMPNO) | s∈SREP }

Because in SQL you cannot insert both the EMP row and the SREP row in a single DML statement, a transaction that inserts a new sales representative has to issue two inserts. Irrespective of the order of those inserts, the intermediate database state will violate constraint PSPEC1.
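For example, a transaction introducing a new sales representative might look as follows (column lists abbreviated; employee number 1050 is just an example value):

```sql
insert into EMP (EMPNO, JOB /* , ... other mandatory columns */)
values (1050, 'SALESREP' /* , ... */);
-- Intermediate state: PSPEC1 is violated, because EMP now holds a
-- sales rep for whom no SREP row exists yet.
insert into SREP (EMPNO /* , target and commission columns */)
values (1050 /* , ... */);
-- PSPEC1 is satisfied again.
commit;
```

Reversing the two inserts merely swaps the direction of the intermediate violation: SREP would then hold an employee number not (yet) present among the sales reps in EMP.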

This example gives us a first reason why, in our SQL world, you have to deal with temporary constraint violations.

DML Statements Operate on a Single Table Structure

Given our set of data integrity constraints, certain valid database state transitions might well require a DML statement on more than one table structure. Because constraints in the database constraint class have a data scope of multiple table structures, there is a good possibility that a database constraint violation will be caused by a first DML statement on one of the involved table structures and that a second DML statement will correct this violation on one of the other involved table structures.

Note the use of the word "possibility." Some transactions will always need to violate a certain database constraint temporarily, regardless of the order in which the DML statements within that transaction are executed (the preceding PSPEC1 is an example of this). However, many database constraints never need deferred checking: the data they constrain can be kept satisfying the constraint at all times by issuing the DML statements in a specific order implied by the constraint. For instance, this is the case with a subset requirement. Every intermediate database state can satisfy it as long as the DML statements are executed in the right order: you first insert the new row into the superset table structure, and only then into the subset table structure.
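Sketched with a hypothetical subset requirement between DEPT and EMP (every DEPTNO appearing in EMP must appear in DEPT), the safe ordering is:

```sql
-- First the superset side ...
insert into DEPT (DEPTNO, DNAME /* , ... */)
values (50, 'SUPPORT' /* , ... */);
-- ... then the subset side; no intermediate database state
-- violates the subset requirement.
insert into EMP (EMPNO, DEPTNO /* , ... */)
values (1051, 50 /* , ... */);
commit;
```

A delete runs the other way around: first remove (or reassign) the referencing EMP rows, then the DEPT row.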

In the next section we’ll discuss a second shortcoming of the SQL language, which gives rise to deferred checking.

DML Statements Operate in a Single Manner

A DML statement is an INSERT, an UPDATE, or a DELETE statement. However, a valid state transition of one table structure might require more than one type of DML statement, and thus could also give rise to the need to allow temporary violations of table constraints.

For example, take the following (not very realistic) table constraint: “The number of sales reps plus twice the number of clerks must equal either 100 or zero.” Let’s look at the transaction of introducing a clerk. Assume that the current EMP table holds 100 sales reps (rendering our table constraint TRUE). As soon as we introduce the new clerk, either by updating the JOB of a sales rep or by inserting a new clerk, we’ll always introduce a violation and need a second different type of DML statement to restore the truth of the table constraint.

This shortcoming of the SQL language implies that DI code for certain table constraints can be subject to deferred execution too. We'll call table and database constraints that require temporary violations inside transactions deferrable constraints.

Outline of Execution Model for Deferred Checking

If a DML statement, say DML1, introduces a violation within a transaction, then there must be a subsequent DML statement within the same transaction, say DML2, that corrects the violation introduced by DML1 prior to the end of the transaction. On execution of DML1, you would either

  • Not want to execute the involved DI code at all, but instead schedule it to be executed at the end of the transaction, or
  • Have the involved DI code execute in such a way that only if it detects a violation is it scheduled to be re-executed at the end of the transaction.

In both cases, if on re-execution of the DI code the constraint is still found to be violated, then the transaction should obviously be prohibited from committing.

So, how do you schedule DI code to be executed at the end of a transaction? Well, you don't; there is no way to achieve this in Oracle's DBMS. A concept that would enable you to do this is a commit trigger: a trigger that fires just prior to, and as part of, the system commit procedure, and that can check the end state the transaction is about to commit. By embedding DI code in such a trigger, you could recheck whether subsequent DML statements have resolved all temporary violations of constraints, and allow the commit procedure to succeed only if this is the case. Unfortunately, the DBMS doesn't offer the concept of a commit trigger.

However, there is another way that allows you to re-execute DI code of a temporarily violated constraint. In the remainder of this section, we’ll provide you with an outline of how you could modify execution model EM6 to also cater for deferred checking.

Take a look at Table 11-6. It describes a transaction that executes four DML statements. Statement DML1 involves constraint C1, which has been identified as a deferrable constraint.

Table 11-6. Re-Executing DI Code of Deferred Constraint C1

TX Comment
DML1; Involves constraint C1; DI code fires and finds that DML1 violates it. DI code allows this.
DML2; DI code of other constraints executes. C1 DI code is re-executed; finds C1 is still in violation.
DML3; DI code of other constraints executes. C1 is rechecked; DI code now finds C1 is satisfied.
DML4; DI code of other constraints executes. C1 is no longer rechecked.
COMMIT;  

Statement DML1 introduces a violation of constraint C1. Because C1 has been identified as deferrable, its DI code will not raise an error to force a rollback of DML1. Instead, it stores information about the fact that C1 is currently violated somewhere inside the context of this transaction.

When DML2 is executed, various other (non-deferrable) constraints might be involved, and the DI code for those constraints is executed accordingly. In our modified execution model, this is now followed by a check of the transaction's context to find out if certain constraints are currently violated. If such a constraint is found, then the modified execution model will also re-execute the DI code of this constraint. If on recheck the constraint is found to still be in violation, then the context remains unchanged. If on recheck the constraint is found to be satisfied, then the context is modified to reflect that the constraint is no longer in violation.

In the preceding scenario, statement DML3 repairs the violation of constraint C1. When DML4 is executed, the execution model now no longer re-executes the DI code for C1.

The preceding scenario shows you that you can use the triggers that fire for subsequent DML statements to recheck a deferrable constraint that a preceding DML statement violated.

You have one challenge left now. Can you prevent a commit from successfully executing when the transaction context still holds information stating that one or more deferrable constraints are in violation? The answer is yes, you can.

If the DI code for a given deferrable constraint finds that this constraint is violated, it can store this information in a session temporary table. And if the DI code finds that the constraint is satisfied again, then it deletes the associated record from the session temporary table. You can set up this session temporary table in a way that whenever this table holds a record, a transaction cannot successfully commit. Take a look at Listing 11-42, which defines this session temporary table.

Listing 11-42. Table for Storing Temporary Violations

create global temporary table current_violations
(constraint_name   varchar(30) not null
,constraint all_satisfied_at_commit check(0=1) initially deferred
,constraint curvio_pk primary key(constraint_name))
on commit preserve rows;

Do you see the trick that’s used? The way this session temporary table (on commit preserve rows) and the all_satisfied_at_commit constraint (initially deferred) are set up allows DI code to insert records in the current_violations table. However, at the same time it disables transactions from committing successfully when there are still records in this table, thereby preventing transactions from committing when a deferrable constraint is in violation.
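A sketch of how deferrable DI code would use this table, taking PSPEC1 as the example constraint:

```sql
-- On detecting a (temporary) violation of PSPEC1:
insert into current_violations(constraint_name) values ('PSPEC1');
-- On a later recheck that finds PSPEC1 satisfied again:
delete from current_violations where constraint_name = 'PSPEC1';
-- Had the row still been present at commit time, the deferred
-- all_satisfied_at_commit check constraint (0=1) would have been
-- validated against it and failed, causing the commit itself to
-- fail and the transaction to roll back.
commit;
```

In practice the insert should guard against the row already being present (for instance via a merge, or by tolerating the primary-key violation), because the same constraint can be found violated by several consecutive DML statements.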

As you’ll probably agree, bringing deferred execution into the picture complicates the execution model substantially:

  • DI code for deferrable constraints must now appropriately insert and delete from the current_violations table.
  • You must extend all DI code with procedural code that rechecks all constraints that are currently registered in the current_violations table. To be able to perform the recheck without having to replicate DI code for deferrable constraints, you’ll need to move current DI code from the trigger bodies into stored procedures; this enables you to call this code from other triggers too.
  • You might want a more efficient execution model than the one outlined so far. Currently a deferrable constraint (one that is in violation) is rechecked on every subsequent DML statement. For a given subsequent DML statement, you can deduce whether rechecking a constraint is even sensible or not. For instance, if a subsequent DML statement operates on a table that is not involved in the deferrable constraint, then this DML statement can never restore a violation of the deferrable constraint. To prevent unnecessary rechecks, you’ll only want to run the recheck if the subsequent DML statement is such that it could potentially restore a violation.

imagesNote In fact, you can determine this by using the transition effect to guard such re-execution of DI code for a deferrable constraint. Instead of querying the transition effect to verify if a DML statement can violate a constraint, you now do the inverse: query the transition effect to verify if a DML statement can restore a constraint.
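To sketch how such centralized rechecking might look, assume the DI code of each deferrable constraint has been moved into a stored procedure (p_check_pspec1 and p_recheck_pending_violations are hypothetical names):

```sql
create or replace procedure p_recheck_pending_violations as
begin
  -- Re-execute the DI code of every constraint currently registered
  -- as violated. Each checking procedure deletes its own row from
  -- current_violations when it finds the constraint satisfied again.
  for r in (select constraint_name from current_violations)
  loop
    if r.constraint_name = 'PSPEC1' then
      p_check_pspec1;
    -- elsif ... : one branch per deferrable constraint
    end if;
  end loop;
end;
```

Each after statement trigger would invoke this procedure after running its own DI code; the transition-effect guard described in the note above would then decide, per registered constraint, whether the call is worth making at all.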

This concludes the outline of an execution model for deferred checking. We wrap up this section with one important observation with regards to deferrable constraints.

There is a serious problem with allowing constraints to be temporarily violated inside transactions. You run the risk of getting incorrect results from queries executing in these transactions. For instance, assume that constraint PSPEC1 is currently violated because, in the current transaction, an insert of a new sales rep into the EMP table structure has not yet been followed by the corresponding insert into the SREP table structure. Now suppose you want to determine the number of sales reps. When you write data retrieval statements, you normally assume that all constraints are satisfied. Under that assumption, there are two ways to find the number of sales reps:

select count(*)
from EMP
where JOB='SALESREP';

select count(*)
from SREP;

Note that when PSPEC1 is violated in the way just described, the first SELECT expression returns the correct result and the second one returns an incorrect result. Actually, in the given intermediate database state you might argue that the number of sales reps is not even well defined, because two supposedly equivalent query expressions return different results. Getting incorrect results is a serious problem when you allow constraints to be temporarily violated.

imagesNote The real solution for preventing this problem is to add the concept of a multiple assignment to the SQL language. We refer you to papers written by Chris Date and Hugh Darwen on this subject (see Appendix C).

Having explored various matters concerning the implementation of DI code in a triggered procedural approach, we conclude this chapter with a short introduction to a framework that can assist you in implementing DI code.

The RuleGen Framework

Having seen the various examples of DI code in this chapter, you can imagine that as the number of constraints that you have implemented grows, maintaining them can become quite a challenge. Our example database design only has about 50 multi-tuple constraints. Real-world database designs typically have hundreds—if not over a thousand—multi-tuple constraints, most of which cannot be stated declaratively to the DBMS.

For every constraint that you implement, you are repeating a lot of code over and over again; the parts that differ for each constraint are the TE queries, the validation query, and the serialization code. Wouldn’t it be great if you could just register these three for a given constraint and have some piece of software generate all the required row and statement triggers for you?

Over the past few years one of the authors—Toon Koppelaars—has developed a framework, called RuleGen, that does just this. RuleGen implements execution model EM6, including the outlined enhancements necessary to cater for deferrable constraints.

You register a constraint within RuleGen by populating a few tables of its repository. This involves information about the constraint, its involved tables and involved columns, the TE queries, the validation query, and the serialization code. Given this information, RuleGen will fully generate the necessary row and statement triggers for each involved table. Row triggers will maintain the transition effect that the statement triggers use. Statement triggers will validate the necessary constraints.

By the time this book is available, we expect to have made available more information about the RuleGen framework. If you are interested in this framework, you can find up-to-date information, documentation, and papers at http://www.rulegen.com/am4dp. We’ll also maintain a download on this site of all DI code for the constraints involved in the example database universe described in this book.

Chapter Summary

This section provides a summary of this chapter, formatted as a bulleted list.

  • You usually implement a database design in order to build a business application on top of it. These applications normally are window-on-data (WoD) applications. Users query and transact data by using these applications.
  • All code of a WoD application can be classified into three classes: user interface code (UI code), business logic code (BL code), and data integrity code (DI code). UI code creates the user interface that the user sees, and it responds to events initiated by the user in the user interface. DI code is responsible for the continued validity of all data integrity constraints as users change data in the database. BL code is responsible for composing and executing queries and transactions.
  • This chapter’s main focus has been how to implement DI code in an efficient manner.
  • You can implement DI code using one of the following three strategies: declarative, triggered procedural, or embedded procedural.
  • You can state all attribute and tuple constraints declaratively. You can state only a few table and database constraints declaratively.
  • The majority of (multi-row) data integrity constraints must be implemented procedurally. In this chapter, the triggered procedural strategy is preferred over the embedded procedural strategy.
  • We introduced you to six execution models for implementing DI code for multi-tuple constraints. These range from rather inefficient (every constraint is fully checked for every DML statement), to rather efficient (a constraint is conditionally checked in a minimal way).
  • Given Oracle’s standard read-committed isolation level, you must programmatically serialize DI code. Failure to do so can result in constraint violations when transactions execute concurrently. Serializing DI code of transition constraints is particularly difficult.
  • Certain constraints cannot be validated at the statement level; they require a deferred execution model of the DI code. Extending the execution models to cater for deferred execution of DI code is not easy. You’ve seen an outline of how this could be done, which involved setting up a central table where temporary violations are logged.
  • If you have many data integrity constraints that require a triggered procedural implementation, then the RuleGen framework can help you manage all DI code for these constraints.