MISCELLANEOUS ISSUES

There are a few further points to do with integrity that I need to cover somewhere but don’t fit very well into any of the preceding sections.

First of all, a constraint, since it’s basically a boolean expression that must evaluate to TRUE, is in fact a proposition (I more or less suggested as much in the previous section, but I never came out and stated it explicitly). To see that this is so, consider constraint CX1 once again from the section DATABASE CONSTRAINTS:

     CONSTRAINT CX1 IS_EMPTY ( S WHERE STATUS < 1 OR STATUS > 100 ) ;

The relvar name “S” here constitutes what logicians call a designator; when the constraint is checked, it designates a specific value—namely, the value of that relvar at the time in question. By definition, that value is a relation (s, say), and so the constraint effectively becomes:

     CONSTRAINT CX1 IS_EMPTY ( s WHERE STATUS < 1 OR STATUS > 100 ) ;

Clearly, the boolean expression here—which is really the constraint as such, “CONSTRAINT CX1” being little more than window dressing—is certainly either true or false, unequivocally, and that’s the definition of what it means to be a proposition (see Chapter 5).

Second, suppose relvar S already contains a tuple that violates constraint CX1 when the CONSTRAINT statement just shown is executed; then that execution must fail. More generally, whenever we try to define a new database constraint, the system must first check to see whether that constraint is satisfied by the database at that time; if it isn’t, the constraint must be rejected, otherwise it can be accepted and enforced from that point forward.

Third, relational databases are supposed to satisfy the referential integrity rule, which says there mustn’t be any unmatched foreign key values. Now, in Chapter 1, I referred to this rule as a “generic integrity constraint.” However, it should be clear by now that it’s somewhat different in kind from the constraints we’ve been examining in this chapter. It’s really a metaconstraint, in a sense; what it says is that every specific database must satisfy the specific referential constraints that apply to that particular database. In the case of the suppliers-and-parts database, for example, it says the referential constraints from SP to S and P must be satisfied—because if they aren’t, then that database will violate the referential integrity metaconstraint. Likewise, in the case of the departments-and-employees database from Chapter 1, the referential constraint from EMP to DEPT must be satisfied, because if it isn’t, then again that database will violate the referential integrity metaconstraint.

Fourth, I remind you from Chapter 5 that update operators are always set level, and hence that constraint checking mustn’t be done until all of the updating has been done; i.e., a set level update mustn’t be treated as a sequence of individual tuple level updates (or row level updates, in SQL). I also said in that chapter that the SQL standard does conform to this requirement, but that products might not. Indeed, the last time I looked, there was at least one major product that didn’t conform but (on foreign key constraints, at least) did “inflight checking” instead. One problem with this state of affairs is that it can lead to undesirable and possibly complex prohibitions against certain operations. For example, suppose there’s a cascade delete rule from suppliers to shipments. Then the product in question won’t allow the following apparently innocuous, and reasonable, DELETE statement:

     DELETE
     FROM   S
     WHERE  SNO NOT IN
          ( SELECT SNO
            FROM   SP ) ;

(an attempt to delete suppliers with no shipments).

Another issue I didn’t mention previously is the possibility of supporting what are called transition constraints. A transition constraint is a constraint on the legal transitions that variables of some kind—relvars in particular—can make from one value to another (by contrast, a constraint that isn’t a transition constraint is sometimes said to be a state constraint). For example, a person’s marital status can change from “never married” to “married” but not the other way around. Here’s a database example (“No supplier’s status must ever decrease”):

     CONSTRAINT CX9 IS_EMPTY
          ( ( ( S′ { SNO , STATUS } RENAME { STATUS AS OLD } )
                JOIN
              ( S  { SNO , STATUS } RENAME { STATUS AS NEW } ) )
            WHERE OLD > NEW ) ;

Explanation: I’m adopting the convention that a primed relvar name such as S′ refers to the pertinent relvar as it was prior to the update under consideration. Constraint CX9 thus says: If we join the old value of S and the new one on {SNO} and restrict the result to just those tuples where the old status is greater than the new one, the final result must be empty. (Since the join is on {SNO}, any tuple in the join for which the old status is greater than the new one would represent a supplier whose status had decreased.)

Transition constraints aren’t currently supported in either Tutorial D or SQL (other than procedurally).

Last, I hope you agree from everything we’ve covered in this chapter that constraints are vital—and yet they seem to be very poorly supported in current products; indeed, they seem to be underappreciated at best, if not completely misunderstood. The emphasis in practice always seems to be on performance, performance, performance; other objectives, such as ease of use, physical data independence, and in particular integrity, seem so often to be sacrificed to—or at best to take a back seat to—that overriding goal.[125]

Now, I don’t want you to misunderstand me here. Of course performance is important too. Functionally speaking, a system that doesn’t deliver at least adequate performance isn’t a system (not a usable one, at any rate). But what’s the point of a system performing well if we can’t be sure the results we’re getting from it are correct? Frankly, I don’t care how fast a system runs if I don’t feel I can trust it to give me the right answers to my queries.



[125] I don’t mean to suggest here that system enforcement of constraints implies bad performance; in fact, I think it ought to improve performance. (Not to mention the fact that user enforcement is highly nontrivial, and very likely to be incorrect! The book by Lex de Haan and Toon Koppelaars, Applied Mathematics for Database Professionals, mentioned in an earlier footnote, gives a good idea of what’s involved in such enforcement.) All I mean is, there tends to be a huge emphasis in vendor development effort on performance issues, to the exclusion of other matters such as data integrity.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.138.102.178