UPDATE OPERATIONS

I claimed earlier that The Principle of Interchangeability implies that views must be updatable (i.e., assignable to). Now, I can hear some readers objecting right away: Surely some views just can’t be updated, can they? For example, consider a view defined as the join—a many to many join, observe—of relvars S and P on {CITY}; surely we can’t insert a tuple into, or delete a tuple from, that view, can we? Note: I apologize for the sloppy manner of speaking here; as we know from Chapter 5, there’s no such thing as “inserting or deleting a tuple” in the relational model. But to be too pedantic about such matters in the present discussion would get in the way of understanding, probably.

Well, even if it’s true—which it might or might not be—that we can’t insert a tuple into or delete a tuple from S JOIN P, let me point out that certain updates on certain base relvars can’t be done, either. For example, inserting a tuple into relvar SP will fail if the SNO value in that tuple doesn’t currently exist in relvar S. Thus, updates on base relvars can always fail on integrity constraint violations—and the same is true for updates on views. In other words, it isn’t that some views are inherently nonupdatable; rather, it’s that some updates on some views will fail on integrity constraint violations (i.e., violations of The Golden Rule). Note: Actually, updates, on both base relvars and views, can fail on violations of The Assignment Principle too, as we’ll quickly see.

To illustrate the point (albeit briefly), let’s take a slightly closer look at the foregoing many to many join example (S JOIN P). Here’s a Tutorial D definition:

     VAR SCP VIRTUAL ( S JOIN P ) KEY { SNO , PNO } ;

Now, as a basis for discussing updates on this or any other view, it’s helpful to think of the view in question as if it were another base relvar, living alongside (as it were) the base relvars in terms of which it’s defined. Let’s see what happens if we adopt this mode of thinking in the case at hand. First of all, note that—by definition—relvar SCP is subject to the constraint that it’s equal to the join of S and P on {CITY}:

     CONSTRAINT VCX SCP = S JOIN P ;

Given our usual sample values, then, the following INSERT will succeed:

     INSERT SCP RELATION { TUPLE { SNO 'S6' , ... , CITY 'Madrid' ,
                                   PNO 'P7' , ... } } ;

(I’m making the obvious assumption here that inserting a tuple into SCP causes appropriate subtuples of that tuple to be inserted into S and P.) By contrast, consider the following INSERT:

     INSERT SCP RELATION { TUPLE { SNO 'S6' , ... , CITY 'London' ,
                                   PNO 'P7' , ... } } ;

Here there are two possibilities:

  • Appropriate subtuples are inserted into S and P, but the only tuple inserted into SCP is the one specified in the INSERT statement. Net effect: The Golden Rule is violated (to be specific, constraint VCX is violated), so the INSERT fails.

  • Appropriate subtuples are inserted into S and P and certain additional tuples, over and above the one specified, are inserted into SCP in order to ensure constraint VCX remains satisfied. Net effect: The Assignment Principle is violated, so the INSERT fails. Note: We could—and in my opinion we should—avoid this failure, however, by specifying some appropriate compensatory actions. See the subsection London vs. Non London Suppliers Revisited, later.

From these examples, I hope you can see that it’s not that INSERT operations are intrinsically impossible on view SCP; rather, it’s that some INSERTs (not all) on that view fail on a violation of either The Golden Rule or The Assignment Principle. I’ll go into more detail regarding such matters later; for now, let me just say that a much more detailed discussion of the foregoing example and others like it can be found in the paper “How to Update Views” (see Appendix G).

So let V be a view; in order to support updates on V properly, then, the system needs to know the total constraint, VC say, for V. In other words, it needs to be able to perform constraint inference, so that, given the constraints that apply to the relvars in terms of which V is defined, it can determine VC. As I’m sure you realize, however, SQL products today do, or are capable of doing, very little in the way of such constraint inference. As a result, SQL’s support for view updating is quite weak (and this is true of the standard as well as the major products). Typically, in fact, SQL products don’t allow updating on views any more complex than simple restrictions and/or projections of a single underlying base table (and even here there are problems). By way of example, consider view LS once again. That view is just a restriction of base table S, and we can certainly perform the following DELETE on it:

     DELETE
     FROM   LS
     WHERE  STATUS > 15 ;

This DELETE maps to:

     DELETE
     FROM   S
     WHERE  CITY = 'London'
     AND    STATUS > 15 ;

But what about INSERTs on this view? Here there are at least two problems:

  1. The new row might violate the requirement that the city must be London.

  2. The new row might have the same supplier number as some non London supplier.

I’ll address the first of these issues in the subsection The CHECK Option below and the second in the subsection London vs. Non London Suppliers Revisited later. As for SQL view updatability in general, I’ll have a little more to say on that topic in the next subsection but one (“More on SQL”).

The CHECK Option

Consider the following SQL INSERT on view LS:

     INSERT INTO LS ( SNO , SNAME , STATUS , CITY )
            VALUES  ( 'S6', .............. , 'Madrid' ) ;

This INSERT maps to:

     INSERT INTO S  ( SNO , SNAME , STATUS , CITY )
            VALUES  ( 'S6', .............. , 'Madrid' ) ;

(The change is just in the target table name.) Observe now that the new row violates the constraint for view LS, because the city isn’t London. So what happens? By default, SQL will insert that row into base table S; precisely because it doesn’t satisfy the defining expression for view LS, however, it won’t be visible through that view. From the perspective of that view, in other words, the new row just drops out of sight (alternatively, we might say the INSERT is a “no op”—again, from the perspective of the view). Actually, however, what’s happened from the perspective of view LS is that The Assignment Principle has been violated.

Now, I hope it goes without saying that the foregoing behavior is logically incorrect. It wouldn’t be tolerated in Tutorial D. As for SQL, the CHECK option is provided to address the problem: If (but only if) WITH CASCADED CHECK OPTION is specified for a given view, then updates to that view are required to conform to the defining expression for that view. Recommendation: Specify WITH CASCADED CHECK OPTION on view definitions whenever possible. Be aware, however, that SQL permits such a specification only if it regards the view as updatable,[132] and (as previously noted) not all logically updatable views are regarded as such in SQL. Note: The alternative to CASCADED is LOCAL, but don’t use it. (The reason I say this is that the semantics of LOCAL are bizarre in the extreme—so bizarre, in fact, that (a) I don’t want to waste time and space and energy attempting to explain them here, and in any case (b) it’s hard to see why anyone would ever want such semantics. Indeed, it’s hard to resist the suspicion that LOCAL was included in the standard originally for no other reason than to allow certain flawed implementations, extant at the time, to be able to claim conformance.) It’s all right to specify neither CASCADED nor LOCAL, however, because CASCADED is the default.

More on SQL

As we’ve seen, SQL’s support for view updating is limited. It’s also extremely hard to understand!—in fact, the standard is even more impenetrable in this area than it usually is. The following extract from the 2003 version of the standard gives some idea of the complexities involved:

[The] <query expression> QE1 is updatable if and only if for every <query expression> or <query specification> QE2 that is simply contained in QE1:

  1. QE1 contains QE2 without an intervening <query expression body> that specifies UNION DISTINCT, EXCEPT ALL, or EXCEPT DISTINCT.

  2. If QE1 simply contains a <query expression body> QEB that specifies UNION ALL, then:

    1. QEB immediately contains a <query expression> LO and a <query term> RO such that no leaf generally underlying table of LO is also a leaf generally underlying table of RO.

    2. For every column of QEB, the underlying columns in the tables identified by LO and RO, respectively, are either both updatable or not updatable.

  3. QE1 contains QE2 without an intervening <query term> that specifies INTERSECT.

  4. QE2 is updatable.

Here’s my own gloss on the foregoing extract:

  • The extract doesn’t seem to make sense, at least on the face of it. For one thing, the opening sentence says, in effect, that four conditions a), b), c), and d) have to be satisfied “for every ... QE2 that is simply contained in QE1”—yet item b) in particular doesn’t even mention QE2 (?). For another, a careful reading appears to indicate that the following <query expression> is updatable, which surely can’t be correct (I’m assuming here, not unreasonably, that columns X and Y are “both ... not updatable”):

    SELECT 2 * STATUS AS X
    FROM   S
    UNION  ALL
    SELECT 3 * QTY AS Y
    FROM   SP
  • Next, even if I’m wrong and the extract does make sense, note that it (i.e., the extract) states just one of the many rules that have to be taken in combination in order to determine whether a given view is updatable in SQL.

  • The rules in question aren’t given all in one place but are scattered over many different portions of the standard.

  • All of those rules rely on a variety of additional concepts and constructs—e.g., updatable columns, leaf generally underlying tables, <query term>s—that are in turn defined in still further portions of the standard.

Because of such considerations, I won’t even attempt a precise characterization here of just which views SQL regards as updatable. Loosely speaking, however, they do at least include the following:

  1. Views defined as a restriction and/or projection of a single base table

  2. Views defined as a one to one or one to many join of two base tables (in the one to many case, only the many side is updatable)

  3. Views defined as a UNION ALL or INTERSECT of two distinct base tables

  4. Certain combinations of Cases 1-3 above

But even these limited cases are treated incorrectly, thanks to SQL’s lack of understanding of (a) constraint inference, (b) The Golden Rule, and (c) The Assignment Principle, and thanks also to the fact that SQL permits nulls and duplicate rows. And the picture is complicated still further by the fact that SQL identifies four distinct cases: A view in SQL can be updatable, potentially updatable, simply updatable, or insertable into. Now, the standard defines these terms formally but gives no insight into their intuitive meaning or why they were given those names. However, I can at least say that “updatable” refers to UPDATE and DELETE and “insertable into” refers to INSERT, and a view can’t be insertable into unless it’s updatable.[133] But note the suggestion that some views might permit some updates but not others (e.g., DELETEs but not INSERTs), and the further suggestion that it’s therefore possible that DELETE and INSERT might not be inverses of each other. Both of these facts, if facts they are, I regard as further violations of The Principle of Interchangeability.

Regarding Case 1 above, however, I can be a little more precise. To be specific, an SQL view is certainly updatable if all of the following conditions are satisfied:

  • The defining expression is either (a) a simple SELECT expression (not a UNION, INTERSECT, or EXCEPT involving two such expressions) or (b) an “explicit table” (see Chapter 12) that’s logically equivalent to such an expression. Note: I’ll assume for simplicity in what follows that Case (b) is automatically converted to Case (a).

  • The SELECT clause in that SELECT expression specifies ALL, not DISTINCT.

  • After expansion of any “asterisk style” items, every item in the SELECT item commalist is a simple column name (possibly qualified, and possibly with an AS specification), and no such item appears more than once.

  • The FROM clause in that SELECT expression takes the form FROM T [AS ...], where T is the name of an updatable table (either a base table or an updatable view).

  • The WHERE clause, if any, in that SELECT expression contains no subquery in which the FROM clause references T.

  • The SELECT expression has no GROUP BY or HAVING clause.

    Recommendation: Lobby the SQL vendors to improve their support for view updating as soon as possible.

London vs. Non London Suppliers Revisited

Consider the case of London vs. non London suppliers once again (relvars LS and NLS). By The Principle of Interchangeability, the behavior of these relvars, and indeed that of relvar S also, shouldn’t depend on which relvars if any are base ones and which if any are views. Until further notice, therefore, let’s suppose all three are base relvars:

     VAR S   BASE RELATION { ... } KEY { SNO } ;
     VAR LS  BASE RELATION { ... } KEY { SNO } ;
     VAR NLS BASE RELATION { ... } KEY { SNO } ;

As the definitions show, {SNO} is a key for each of these relvars. The relvars are also clearly subject to the following constraints:

     CONSTRAINT ... LS  = S WHERE CITY = 'London' ;
     CONSTRAINT ... NLS = S WHERE CITY ≠ 'London' ;

What’s more, these constraints taken singly or together imply certain additional ones, as follows:

     CONSTRAINT ... IS_EMPTY ( LS  WHERE CITY ≠ 'London' ) ;
     CONSTRAINT ... IS_EMPTY ( NLS WHERE CITY = 'London' ) ;

     CONSTRAINT ... S = UNION { LS , NLS } ;
     CONSTRAINT ... IS_EMPTY ( JOIN { LS { SNO } , NLS { SNO } } ) ;

The first two of these additional constraints are self-explanatory; the third says every supplier is represented in either LS or NLS, and the fourth says no supplier is represented in both. (In other words, the union in the third constraint is actually a disjoint union, and the join in the fourth constraint is actually an intersection.)

Now, in order to ensure these constraints remain satisfied when updates are done, certain compensatory actions need to be in effect. In general, a compensatory action is an additional update (over and above some update that’s requested by the user) that’s performed automatically by the DBMS, precisely in order to avoid some integrity violation that might otherwise occur. Cascade delete is a typical example (see Chapter 5).[134] In the case at hand, in fact, it should be clear that “cascading” is exactly what we need to deal with DELETE operations in particular. To be specific, deleting a tuple from either LS or NLS clearly needs to “cascade” to cause that same tuple to be deleted from S. So we might imagine a couple of compensatory actions—actually cascade delete rules—that look something like this (hypothetical syntax):

     ON DELETE ls FROM LS : DELETE ls FROM S ;

     ON DELETE nls FROM LS : DELETE nls FROM S ;

Likewise, deleting a tuple from S clearly needs to “cascade” to cause that same tuple to be deleted from whichever of LS or NLS it appears in:

     ON DELETE s FROM S : DELETE ( s WHERE CITY = 'London' ) FROM LS ,
                          DELETE ( s WHERE CITY ≠ 'London' ) FROM NLS ;

As an aside, I remark that, given that an attempt (via DELETE, as opposed to I_DELETE) to delete a nonexistent tuple has no effect—see Chapter 5—we could replace each of the expressions in parentheses here by just s. However, the expressions in parentheses are perhaps preferable, inasmuch as they’re clearly more specific.

Analogously, we’ll need some compensatory actions (“cascade insert rules”) for INSERT operations:

     ON INSERT ls INTO LS : INSERT ls INTO S ;
     ON INSERT nls INTO LS : INSERT nls INTO S ;
     ON INSERT s INTO S : INSERT ( s WHERE CITY = 'London' ) INTO LS ,
                          INSERT ( s WHERE CITY ≠ 'London' ) INTO NLS ;

As for UPDATE operations, they can be regarded, at least in the case at hand, as a DELETE followed by an INSERT; in other words, the necessary compensatory actions are just a combination of the corresponding delete and insert rules, loosely speaking. For example, consider the following UPDATE on relvar S:

     UPDATE S WHERE SNO = 'S1' : { CITY := 'Oslo' } ;

What happens here is this:

  1. The existing tuple for supplier S1 is deleted from relvar S and (thanks to the cascade delete rule from S to LS) from relvar LS also.

  2. Another tuple for supplier S1, with CITY value Oslo, is inserted into relvar S and (thanks to the cascade insert rule from S to NLS) into relvar NLS also. In other words, the tuple for supplier S1 has moved from relvar LS to relvar NLS!—now speaking very loosely, of course.

    Suppose now that the original UPDATE had been directed at relvar LS rather than relvar S:

         UPDATE LS WHERE SNO = 'S1' : { CITY := 'Oslo' } ;

Now what happens is this:

  1. The existing tuple for supplier S1 is deleted from relvar LS and (thanks to the cascade delete rule from LS to S) from relvar S also.

  2. An attempt is made to insert another tuple for supplier S1, with CITY value Oslo, into relvar LS. This attempt fails, however, because it violates the constraint on that relvar that the CITY value must always be London. So the update fails overall; the first step (viz., deleting the original tuple for supplier S1 from LS and S) is undone, and the net effect is that the database remains unchanged.

The foregoing examples notwithstanding, sometimes an UPDATE compensatory action will have to be specified explicitly. In the case of our usual suppliers and shipments relvars, for example, there might be a cascade delete rule from S to SP but (for obvious reasons) no corresponding cascade insert rule, and so we might want to specify an explicit update rule along the following lines:

     ON UPDATE s { SNO := sno } IN S :
        UPDATE ( SP MATCHING s ) { SNO := sno } IN SP ;

Anyway, now I come to my real point: Everything I’ve said in this subsection so far applies pretty much unchanged if some or all of the relvars concerned are views. For example, suppose as we originally did that S is a base relvar and LS and NLS are views:

     VAR S   BASE RELATION { ................. } KEY { SNO } ;
     VAR LS  VIRTUAL ( S WHERE CITY = 'London' ) KEY { SNO } ;
     VAR NLS VIRTUAL ( S WHERE CITY ≠ 'London' ) KEY { SNO } ;

Now consider a user who sees only views LS and NLS, but wants to be able to behave as if those views were actually base relvars. Of course, that user will be aware of the corresponding relvar predicates, which as we saw earlier are essentially as follows:

LS: Supplier SNO is under contract, is named SNAME, has status STATUS, and is located in city CITY (which is London).

NLS:Supplier SNO is under contract, is named SNAME, has status STATUS, and is located in city CITY (which isn’t London).

That same user will also be aware of the following constraints (as well as the fact that {SNO} is a key for both relvars):

     CONSTRAINT ... IS_EMPTY ( LS  WHERE CITY ≠ 'London' ) ;
     CONSTRAINT ... IS_EMPTY ( NLS WHERE CITY = 'London' ) ;
     CONSTRAINT ... IS_EMPTY ( JOIN { LS { SNO } , NLS { SNO } } ) ;

However, the user won’t be aware of any of the compensatory actions as such, precisely because that user isn’t aware that LS and NLS are actually views of relvar S; in fact, the user won’t even be aware of the existence of relvar S (which is why the user is also unaware of the constraint that the union of LS and NLS is equal to S). But updates by that user on relvars LS and NLS will all work correctly, just as if LS and NLS really were base relvars.

What about a user who sees only view LS, say (i.e., not view NLS and not base relvar S), but still wants to behave as if LS were a base relvar? Well, that user will certainly be aware of the pertinent relvar predicate and the following constraint:

     CONSTRAINT ... IS_EMPTY ( LS WHERE CITY ≠ 'London' ) ;

Clearly, this user mustn’t be allowed to insert tuples into this relvar, nor to update supplier numbers within this relvar, because such operations have the potential to violate constraints of which this user is unaware (and must be unaware). Again, however, there are parallels with base relvars as such: With base relvars in general, it’ll be the case that certain users will be prohibited from performing certain updates on the relvar in question (e.g., consider a user who sees only base relvar SP and not base relvar S). So this state of affairs doesn’t in and of itself constitute a violation of The Principle of Interchangeability, either.

One last point: Please understand that I’m not suggesting that the DBA should have to specify, explicitly, all of the various constraints and compensatory actions that apply in connection with any given view. Au contraire: In many cases if not all, I believe the DBMS should be able to determine those constraints and actions for itself, automatically, from the pertinent view definitions. Again, see the paper “How to Update Views” for further explanation.



[132] And then only if the view defining expression isn’t possibly nondeterministic (see Chapter 12). Incidentally, note the implication here that SQL allows updates on “possibly nondeterministic views,” and the further implication that SQL is apparently quite willing to allow certain updates to have unpredictable results! This state of affairs strikes me as odd, given that (as far as I know) the rationale for not allowing possibly nondeterministic expressions in constraints was precisely to avoid updates having unpredictable results.

[133] The asymmetry here is intuitively odd. An argument might be made—not by me!—that you can do DELETEs but not INSERTs on a union view, because the delete rule is obvious (delete from both operands) but the insert rule isn’t (do we insert into both operands or just one—and if just one, which?). But an exactly analogous argument would surely say you can do INSERTs but not DELETEs on an intersection view (?). In other words, if some views are “deletable from but not insertable into,” then surely others must be “insertable into but not deletable from.”

[134] Cascade delete in particular is usually thought of as applying to foreign key constraints specifically, but the concept of compensatory actions in general is applicable to constraints of all kinds. Also, don’t get the idea that such actions must always take the form of simple “cascades”; while all of the examples examined in the present subsection do happen to take that form, more complicated cases are likely to require actions of some less straightforward form.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.222.117.35