DATABASE CONSTRAINTS

A database constraint constrains the values that can appear in a given database. In Tutorial D, such constraints are specified by means of a CONSTRAINT statement (or some logically equivalent shorthand); in SQL, they’re specified by means of a CREATE ASSERTION statement (or, again, some equivalent shorthand). I don’t want to get into details of those shorthands—at least, not yet—because they’re essentially just a matter of syntax; for now, let me stay with the “longhand” forms. Here then are some examples (Tutorial D on the left and SQL on the right as usual):

Example 1:

image with no caption

Constraint CX1 says: Supplier status values must be in the range 1 to 100 inclusive. This constraint involves just a single attribute of a single relvar. Note in particular that it can be checked for a given supplier tuple by examining just that tuple in isolation—there’s no need to look at any other tuples in the relvar or any other relvars in the database. For that reason, such constraints are sometimes referred to, informally, as tuple constraints, or row constraints in SQL—though this latter term is also used in SQL to mean, more specifically, a row constraint that can’t be formulated as a column constraint (see the section DATABASE CONSTRAINTS IN SQL). Now, all such usages ought really to be deprecated, because constraints constrain updates, and as we saw in Chapter 5 there’s no such thing as a tuple or row level update in the relational world. (By the same token, there’s no such thing as a tuple variable, or row variable, in a relational database.) However, the terms can sometimes be convenient, and so—somewhat against my own better judgment—I’ll be using them occasionally myself in what follows.

Recall now that (as mentioned in a footnote in Chapter 7) tuple constraints can alternatively be formulated in terms of the aggregate operator AND. Here by way of example is such a formulation for constraint CX1:

     CONSTRAINT CX1 AND ( S , STATUS ≥ 1 AND STATUS ≤ 100 ) ;

This formulation says, in effect, that the expression STATUS ≥ 1 AND STATUS ≤ 100 must evaluate to TRUE for every tuple in S. As you can see, the desired constraint (“Status values must be greater than or equal to 1 and less than or equal to 100”) is stated a little more directly with this formulation than it was with the IS_EMPTY version, where it had to be stated in the negative (“Status values mustn’t be less than 1 or greater than 100”).

What about SQL? Well, SQL’s analog of AND is called EVERY. Here’s an SQL formulation of constraint CX1 that makes use of that operator:

     CREATE ASSERTION CX1 CHECK
        ( ( SELECT COALESCE ( EVERY ( STATUS >= 1 AND STATUS <= 100 ) ,
                              TRUE )
            FROM   S ) = TRUE ) ;

As you can see, however, this formulation isn’t as user friendly as the Tutorial D version, for at least two reasons:

  • First, EVERY, unlike Tutorial D’s AND, returns null, not TRUE, if its argument is empty; hence the need for that COALESCE.

  • Second, I pointed out in Chapter 7 that SQL doesn’t really support aggregate operators anyway, and the present example brings that point home. To be specific, the parenthesized subexpression SELECT ... FROM S is, of course, a table expression; hence it denotes, not a truth value as such, but rather a one-row, one-column table that contains such a truth value. In fact, that subexpression, parentheses included, is a scalar subquery. As explained in Chapter 2, then, asking for that subquery and the literal value TRUE to be tested for equality causes a double coercion to occur; in other words, the truth value is effectively extracted from the table and then tested to see whether it’s equal to TRUE.

The net of this discussion is that EVERY isn’t nearly as useful for the formulation of row constraints in SQL as AND is for the formulation of tuple constraints in Tutorial D.

Aside: The foregoing might be a little unfair to SQL. To be specific, I think—according to my own reading of the standard—that it would be possible to simplify the example by omitting both the COALESCE and the explicit comparison with TRUE, thereby reducing the CHECK clause portion of the assertion to just:

CHECK ( ( SELECT EVERY ( STATUS >= 1 AND STATUS <= 100 ) FROM S ) ) ;

But these simplifications rely on several aspects of SQL that are, to put matters politely, hardly very respectable. First of all, note that the double enclosing parentheses are necessary—the outer parentheses enclose a subquery, which requires parentheses of its own. Second, the subquery in question is in fact a scalar subquery, and the table it returns gets doubly coerced to the single value—actually a truth value—in the single column of the single row of the table in question (see Chapter 12). Third, if the EVERY invocation in fact returns a null, that null is considered to stand for the truth value UNKNOWN (see Chapter 4). Fourth, if the boolean expression in a CHECK clause evaluates to UNKNOWN, that UNKNOWN gets coerced to TRUE! (See the answer to Exercise 8.20g in Appendix F for further discussion of this last point.) Speaking for myself, therefore, I would far rather include both the COALESCE and the comparison with TRUE, in the interest of explicitness if nothing else. End of aside.

Example 2:

image with no caption

Constraint CX2 says: Suppliers in London must have status 20. This constraint involves two distinct attributes; however, it’s still the case, as it was with constraint CX1, that the constraint can be checked for a given supplier tuple by examining just that tuple in isolation (hence it too is a tuple or row constraint). Here for interest are AND and EVERY formulations:

image with no caption

Example 3:

image with no caption

Constraint CX3 says: Every supplier has a unique supplier number; in other words, {SNO} is a superkey—actually, of course, it’s a key—for relvar S (recall from Chapter 5 that a superkey is a superset of a key, loosely speaking). Like constraints CX1 and CX2, this constraint still involves just one relvar; however, it can’t be checked for a given supplier tuple by examining just that tuple in isolation, and so it isn’t a tuple or row constraint. Points arising:

  • In practice, of course, it’s very unlikely that constraint CX3 would be specified in longhand as shown—some kind of explicit KEY shorthand is almost certainly preferable. I give the longhand form merely to make the point that such shorthands are indeed, in the final analysis, just shorthands.[115]

  • As you can see, the SQL formulation of constraint CX3 involves an invocation of the SQL UNIQUE operator. That operator returns TRUE if and only if every row within its argument table is distinct; in the example, therefore, the UNIQUE invocation returns TRUE if and only if no two rows in table S have the same supplier number. Note, incidentally, that the SELECT expression in that invocation must—for once—definitely not specify DISTINCT! (Why not?) I’ll have more to say about SQL’s UNIQUE operator in Chapter 10.

Here for interest is an SQL formulation of constraint CX3 that more closely resembles the Tutorial D formulation:[116]

     CREATE ASSERTION CX3 CHECK
        ( ( SELECT COUNT ( ALL SNO ) FROM S ) =
          ( SELECT COUNT ( DISTINCT SNO ) FROM S ) ) ;

Example 4:

image with no caption

Constraint CX4 says: Whenever two suppliers have the same supplier number, they also have the same city. In other words, a certain functional dependency (FD) holds in relvar S—namely, an FD from {SNO} to {CITY}. In practice, as I’m sure you know, that FD would more usually be expressed as follows:

{ SNO } → { CITY }

Here’s a precise definition:

Definition: Let A and B be subsets of the heading of relvar R. Then the functional dependency (FD) AB holds in R if and only if, in every relation that’s a legal value for R, whenever two tuples have the same value for A, they also have the same value for B.

The FD AB is read as “B is functionally dependent on A,” or “A functionally determines B,” or, more simply, just “A arrow B.” As the example shows, however, a functional dependency is basically just another integrity constraint (though, like constraint CX3, it isn’t a tuple or row constraint).

Now, as noted in Chapter 5, the fact that relvar S is subject to this particular FD is a logical consequence of the fact that {SNO} is a key for that relvar. For that reason, there’s no need to state it explicitly, just so long as the fact that {SNO} is a key is stated explicitly. But not all FDs are consequences of keys. For example, suppose it’s the case that if two suppliers are in the same city, then they must have the same status. This hypothetical new constraint (which is not satisfied by our usual sample values, please note) is clearly an FD:

{ CITY } → { STATUS }

It can thus be stated in the style of constraint CX4 (see Exercise 8.22 at the end of the chapter).

Now, you might be thinking some shorthand syntax would be desirable for stating FDs, similar to the shorthand we already have for stating keys. Myself, I don’t think so, because although not all FDs are consequences of keys in general, all FDs will almost certainly be consequences of keys if the database is well designed. In other words, the very fact that FDs are hard to state if the database is badly designed might be seen as a small argument in favor of not designing the database badly in the first place! Note: By “well designed” here, I really mean fully normalized. Normalization as such is beyond the scope of this book (it’s covered in depth in the book Normal Forms and All That Jazz, which is a companion to the present book—see Appendix G). Of course, relational (or SQL) statements and expressions will work regardless of whether or not the relvars (or tables) are fully normalized. But I should at least point out that those statements and expressions will often be easier to formulate (and, contrary to popular opinion, will often perform better too) if the tables are fully normalized. However, normalization as such is primarily a database design issue, not a relational model or SQL issue.

Example 5:

image with no caption

Constraint CX5 says: No supplier with status less than 20 can supply part P6. Observe that this constraint involves (better: interrelates) two distinct relvars, S and SP. In general, a database constraint might involve, or interrelate, any number of distinct relvars. Terminology: A constraint that involves just a single relvar is known, informally, as a relvar constraint (sometimes a single relvar constraint, for emphasis); a constraint that involves two or more distinct relvars is known, informally, as a multirelvar constraint. (Thus, constraints CX1-CX4 were single relvar constraints, while constraint CX5 is a multirelvar constraint.) All of these terms are somewhat deprecated, however, for reasons to be discussed in the next chapter in connection with what’s called The Principle of Interchangeability.

Example 6:

image with no caption

Constraint CX6 says: Every supplier number in SP must appear in S. As you can see, the Tutorial D formulation involves a relational comparison; SQL doesn’t support relational comparisons, however, and so we have to indulge in some circumlocution in the SQL formulation. However, given that {SNO} is a key—in fact, the sole key—for relvar S, it’s clear that constraint CX6 is basically just the foreign key constraint from SP to S. The usual FOREIGN KEY syntax can thus be regarded as shorthand for constraints like CX6.



[115] In SQL, that shorthand would involve a specification of the form UNIQUE(SNO) as part of the CREATE TABLE for table S. The semantics of such a specification are explained by the standard as follows (I’ve adapted the standard’s own generic phrasing to apply to the specific case at hand): “The constraint UNIQUE(SNO) is not satisfied if and only if EXISTS(SELECT * FROM S WHERE NOT(UNIQUE(SELECT SNO FROM S))) is true.” I hope that’s perfectly clear.

[116] But is this SQL formulation valid? As you can see, it involves an equality comparison in which the comparands are denoted by subqueries. Since subqueries evaluate to tables, it appears we’re trying to test two tables for equality—yet we saw in Chapter 3 that SQL doesn’t support table comparisons. See Exercise 12.5 in Chapter 12.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.116.10.201