Chapter 10. Union Views

Listen to me—go spread the news—

It’s easy to update union views!

Anon.:

Now let’s move on to union. You won’t be surprised to learn that the structure of this chapter is very similar to that of the previous one, on intersection. Just one preliminary remark: A UNION B resembles A INTERSECT B in that it’s not very interesting if A = B; unlike A INTERSECT B, however, it certainly is interesting—well, somewhat interesting—if A and B are disjoint. So my first example will be exactly that, an example involving a disjoint union.

Example 1: Disjoint Union

This example is essentially the inverse of the motivating example from Chapters Chapter 1 and Chapter 4. Thus, we’re given two relvars, LS (“London suppliers”) and NLS (“non London suppliers”), that look like this:

LS  { SNO , SNAME , STATUS , CITY } KEY { SNO }
NLS { SNO , SNAME , STATUS , CITY } KEY { SNO }

Of course, in terms of our usual suppliers-and-parts database, these relvars are probably views—views of relvar S, to be specific—but you can think of them as base relvars if you like. More to the point, observe that these relvars are indeed disjoint, inasmuch as it’s impossible for the very same tuple to appear in both. Here are the predicates:

LS: Supplier SNO is under contract, is named SNAME, has status STATUS, and is located in city CITY (and CITY is London).
NLS: Supplier SNO is under contract, is named SNAME, has status STATUS, and is located in city CITY (and CITY isn’t London).

Now let’s define the (disjoint) union of LS and NLS as a view ULN:

ULN { SNO , SNAME , STATUS , CITY } KEY { SNO }

The predicate for this relvar is:

ULN: Supplier SNO is under contract, is named SNAME, has status STATUS, and is located in city CITY (and CITY is either London or not London).

(Of course, if LS and NLS are indeed views of relvar S, then ULN is identical to that relvar S.) Sample values are shown in Figure 10-1 below.

Relvars LS, NLS, and ULN—sample values
Figure 10-1. Relvars LS, NLS, and ULN—sample values

The following constraints, edited versions of the ones from the discussion of the corresponding example in Chapter 4, clearly apply:

CONSTRAINT ... ULN = LS UNION NLS ;
CONSTRAINT ... DISJOINT { LS { SNO } , NLS { SNO } } ;
CONSTRAINT ... IS_EMPTY ( LS  WHERE CITY ≠ 'London' ) ;
CONSTRAINT ... IS_EMPTY ( NLS WHERE CITY = 'London' ) ;
CONSTRAINT ... LS  = ( ULN WHERE CITY = 'London' ) ;
CONSTRAINT ... NLS = ( ULN WHERE CITY ≠ 'London' ) ;

Also, each of the three relvars has {SNO} as its sole key, as already indicated, and {SNO} is a foreign key, referencing ULN, in each of LS and NLS. Note that (in contrast to the intersection examples discussed in the previous chapter) here we do have information equivalence—the design consisting of ULN by itself is information equivalent to the design consisting of LS and NLS taken in combination.

The following compensatory actions are also taken, lightly edited, from Chapter 4:

ON DELETE d FROM LS : DELETE d FROM ULN ;
ON DELETE d FROM NLS : DELETE d FROM ULN ;

ON INSERT i INTO LS : INSERT i INTO ULN ;
ON INSERT i INTO NLS : INSERT i INTO ULN ;

ON DELETE d FROM ULN : DELETE ( d WHERE CITY = 'London' ) FROM LS ,
                       DELETE ( d WHERE CITY ≠ 'London' ) FROM NLS ;

ON INSERT i INTO ULN : INSERT ( i WHERE CITY = 'London' ) INTO LS ,
                       INSERT ( i WHERE CITY ≠ 'London' ) INTO NLS ;

Note that (again in contrast to the intersection examples in the previous chapter) there’s no question here of inserts or deletes on either of LS and NLS cascading to the other.

As for a user who sees just relvar ULN, I hope it’s obvious without going into detail that such a user can behave in all respects exactly as if that relvar is a base relvar.

Let me now point out that if LS and NLS are indeed views of relvar S, then ULN could alternatively have been defined directly as a restriction of S, thus:

S WHERE CITY = 'London' OR CITY ≠ 'London'

Let’s call this restriction ULN’. Clearly, then, updates on ULN’ should have the same effect as updates on ULN—and so they do. To be specific:

  • INSERT: Tuples to be inserted into either ULN or ULN’ must have CITY value either London or not London (!) and so must be inserted into the other as well.

  • DELETE: Tuples to be deleted from either ULN or ULN’ certainly do have CITY value either London or not London and so must be deleted from the other as well.

(As noted in the previous chapter, I’ll have a lot more to say about the general question of updates on relvars with different but equivalent definitions in Chapter 14.)

Well, as you can see, this first union example is essentially identical to the motivating example from Chapters Chapter 1 and Chapter 4, mutatis mutandis, and there’s really not much more to say about it. So let’s hurry on to another example, this one involving an overlapping union instead of a disjoint one.

Example 2: Explicit Overlap

My second example, like Example 1 in the previous chapter, is based on the one previously discussed in the section Overlapping Restrictions in Chapter 4. It involves two relvars, NLS (“non London suppliers”) and NPS (“non Paris suppliers”), that look like this:

NLS { SNO , SNAME , STATUS , CITY } KEY { SNO }
NPS { SNO , SNAME , STATUS , CITY } KEY { SNO }

As pointed out in the previous chapter, these relvars do indeed overlap, in the sense that it’s possible for the very same tuple to appear in both; moreover, the overlap is explicit, in the sense explained in that same chapter. Here once again are the predicates:

NLS: Supplier SNO is under contract, is named SNAME, has status STATUS, and is located in city CITY (and CITY isn’t London).
NPS: Supplier SNO is under contract, is named SNAME, has status STATUS, and is located in city CITY (and CITY isn’t Paris).

Now let’s define the union of these two relvars as a view ULP:

ULP { SNO , SNAME , STATUS , CITY } KEY { SNO }

The predicate for this relvar is:

ULP: Supplier SNO is under contract, is named SNAME, has status STATUS, and is located in city CITY (and CITY either isn’t London or isn’t Paris).

(Of course, if NLS and NPS are views of relvar S, as they well might be, then ULP, like ULN in the previous section, is identical to that relvar S.) Sample values are shown in Figure 10-2.

Relvars NLS, NPS, and ULP—sample values
Figure 10-2. Relvars NLS, NPS, and ULP—sample values

The following constraints clearly apply:

CONSTRAINT ... ULP = NLS UNION NPS ;
CONSTRAINT ... IS_EMPTY ( NLS WHERE CITY = 'London' ) ;
CONSTRAINT ... IS_EMPTY ( NPS WHERE CITY = 'Paris' ) ;
CONSTRAINT ... ( NLS WHERE CITY ≠ 'Paris' ) =
                         ( NPS WHERE CITY ≠ 'London' ) ;

Also, each of the three relvars has {SNO} as its sole key, as already indicated, and {SNO} is a foreign key, referencing ULP, in each of NLS and NPS. Note that here again we have information equivalence—the design consisting of ULP by itself is information equivalent to the design consisting of NLS and NPS in combination. Note: The fact that {SNO} is a key for ULP in particular guarantees that if some supplier is represented in both NLS and NPS, then the two tuples representing that supplier are in fact one and the same. Of course, this constraint will be enforced automatically if NLS and NPS are in fact views of the suppliers relvar S.

Here now are the compensatory actions—first, the rules for cascading updates on NLS to NPS and vice versa (which are the same as their counterparts in Chapter 9, of course):

ON INSERT i INTO NLS : INSERT ( i WHERE CITY ≠ 'Paris' ) INTO NPS ;
ON INSERT i INTO NPS : INSERT ( i WHERE CITY ≠ 'London' ) INTO NLS ;

ON DELETE d FROM NLS : DELETE ( d WHERE CITY ≠ 'Paris' ) FROM NPS ;
ON DELETE d FROM NPS : DELETE ( d WHERE CITY ≠ 'London' ) FROM NLS ;

And here are the rules for cascading updates on NLS and NPS to the union view ULP:

ON INSERT i INTO NLS : INSERT i INTO ULP ;
ON INSERT i INTO NPS : INSERT i INTO ULP ;

ON DELETE d FROM NLS : DELETE ( d WHERE CITY ≠ 'Paris' ) FROM ULP ;
ON DELETE d FROM NPS : DELETE ( d WHERE CITY ≠ 'London' ) FROM ULP ;

Finally, here are the rules for updates on the union relvar as such:

ON DELETE d FROM ULP :
   DELETE ( d WHERE CITY ≠ 'London' ) FROM NLS ,
   DELETE ( d WHERE CITY ≠ 'Paris' ) FROM NPS ;

ON INSERT i INTO ULP :
   INSERT ( i WHERE CITY ≠ 'London' ) INTO NLS ,
   INSERT ( i WHERE CITY ≠ 'Paris' ) INTO NPS ;

Points arising:

  • Let’s focus on the insert rule for a moment. As far as relvar ULP is concerned, it would of course be sufficient to cascade to just one of the relvars NLS and NPS. But cascading to either NLS or NPS will (if logically necessary) cause an additional cascade to the other relvar anyway; and since there’s no good reason to favor one over the other, at least in concrete syntax (assuming, of course, that both relvars are visible)—it’s a difference that makes no difference, so to speak—I’ve specified cascades to both, for reasons of symmetry and explicitness.

  • As a matter of fact, a similar remark applies to the delete rule also (in general, deleting from a union needs to cascade to both of the relvars involved; in the case at hand, however, cascading to either will cause an additional cascade to the other one anyway, if logically necessary).

  • In the case of the delete rule, the restriction conditions could be dropped without significant loss.

As for a user who sees just relvar ULP, I hope it’s obvious without going into detail that such a user can behave in all respects exactly as if that relvar is a base relvar.

Let me now point out that if NLS and NPS are indeed views of relvar S, then ULP could alternatively have been defined directly as a restriction of S, thus:

S WHERE CITY ≠ 'London' OR CITY ≠ 'Paris'

Let’s call this restriction ULP’. Clearly, then, updates on ULP’ should have the same effect as updates on ULP—and so they do. To be specific:

  • INSERT: Tuples to be inserted into either ULP or ULP’ must have CITY value either not London or not Paris and so must be inserted into the other as well.

  • DELETE: Tuples to be deleted from either ULP or ULP’ certainly do have CITY value either not London or not Paris and so must be deleted from the other as well.

(Once again, I’ll have more to say about the general question of updates on relvars with different but equivalent definitions in Chapter 14.)

As you can see, then, this second example isn’t seriously different in kind from the one discussed in the previous section (essentially because both preserve information equivalence). So let’s move on to look at an example where information equivalence is lost.

Example 3: Implicit Overlap

This example is an edited version of Example 2 from the previous chapter. Just to remind you, the basic situation is as follows: At any given time, some parts are on sale, some parts are in stock, and some are both. Furthermore, this state of affairs is represented by means of two relvars, each with just a single attribute PNO—one relvar (PL) giving part numbers for parts on sale, the other (PK) giving part numbers for parts in stock. Of course, the union of these two relvars—call it ULK—gives part numbers for parts that are either on sale or in stock or both. The point is, however, we can’t tell just by looking at a given tuple in the original parts relvar P whether that part ought to be represented in PL or PK or both or neither. So the predicates are as follows:

PL: Part PNO is on sale.
PK: Part PNO is in stock.
ULK: Part PNO is on sale or in stock.

Sample values are given in Figure 10-3.

Relvars PL, PK, and ULK—sample values
Figure 10-3. Relvars PL, PK, and ULK—sample values

As for constraints, {PNO} is obviously the sole key for each of these three relvars, and of course ULK is equal to the union of the other two:

CONSTRAINT ... ULK = PL UNION PK ;

Updates on PL and/or PK are noncontroversial:

ON INSERT i INTO PL : INSERT i INTO ULK ;
ON INSERT i INTO PK : INSERT i INTO ULK ;

ON DELETE d FROM PL : DELETE ( d MINUS PK ) FROM ULK ;
ON DELETE d FROM PK : DELETE ( d MINUS PL ) FROM ULK ;

But what about updates on ULK? Well, the delete rule is obvious:

ON DELETE d FROM ULK :
   DELETE d FROM PL , DELETE d FROM PK ;

As for the insert rule, clearly it would be sufficient for inserts on ULK to cascade to just one of PL and PK (it must cascade to at least one, of course). However, there’s no good reason for choosing either of these possibilities over the other; in fact, what we have here is the ambiguity (?) issue once again, albeit in a slightly different guise, and I’ll have a little more to say about it in the final section of this chapter. For now, however, I propose the following as an appropriate insert rule (and observe that, not incidentally, it has the same general form as its counterparts in Examples 1 and 2 in the previous two sections):

ON INSERT i INTO ULK :
   INSERT i INTO PL , INSERT i INTO PK ;

Of course, as indicated earlier, we’re dealing here once again with a situation in which information equivalence is lost. To be specific, any information that can be represented by relvar ULK alone can certainly be represented by relvars PL and PK taken in combination, but the converse is false. (Here’s an example of a query on the latter that has no exact counterpart on the former: “Get part numbers for parts that are on sale and not in stock.”) As a consequence, it should be obvious that there’ll be certain updates that can be done on PL and/or PK that have no exact counterpart on ULK. An example of such an update is “Delete some existing tuple from PL without simultaneously deleting that same tuple from PK” (i.e., update the database to say some part is no longer on sale but, assuming it was previously in stock, still is).

I’ll leave it as an exercise to determine the implications of all of the foregoing for a user who sees just relvar ULK. However, let me now point out that (as in the case of the intersection analog of this example in the previous chapter) there’s unfortunately another issue here. Suppose we start off with just relvars PL and PK, without the union view. Suppose too, just to be definite, that part P4 is represented in neither PL nor PK. Then the following INSERT operation is clearly legitimate:

INSERT ( P4 ) INTO PL ;

Observe in particular that there’s no question of this insert cascading to relvar PK. But now suppose we introduce the union view ULK. Given the rules defined above, then, this insert will now cascade to relvar PK!—and it’ll do so, moreover, even if view ULK isn’t visible to the user issuing the insert on relvar PL. Or to put the point another way: Introducing that view apparently requires the simultaneous introduction of a cascade insert rule from PL to PK and vice versa.[108]

Now, one possible fix for this problem, in the particular case at hand, is to execute an appropriate delete immediately after the insert, thus:

INSERT ( P4 ) INTO PL ;
DELETE ( P4 ) FROM PK ;

However, the solution to be discussed in the subsection immediately following is in my opinion greatly to be preferred (of course, it’s essentially the same as the solution to the corresponding problem in the intersection case, as discussed in the previous chapter).

A Better Design

As you’ll recall from the previous chapter, the solution I have in mind here involves replacing relvars PL and PK in their entirety by a single relvar, POI, with attributes PNO, ON_SALE, and IN_STOCK (where attributes ON_SALE and IN_STOCK are of type BOOLEAN and have the obvious interpretations, and {PNO} is the sole key). A possible value for such a relvar is shown in Figure 10-4 (a repeat of Figure 9-3 from the previous chapter).

Relvar POI—sample value
Figure 10-4. Relvar POI—sample value

Now we define restriction views PL and PK’ and union view ULK’ as indicated by the following constraints:

CONSTRAINT ... PL'  = POI WHERE ON_SALE ;
CONSTRAINT ... PK'  = POI WHERE IN_STOCK ;
CONSTRAINT ... ULK' = PL' UNION PK' ;

Figure 10-5 shows sample values corresponding to those in Figure 10-4.

Relvars PL’, PK’, and ULK’—sample values
Figure 10-5. Relvars PL’, PK’, and ULK’—sample values

The predicates are as follows:

PL’: Part PNO is on sale if and only if ON_SALE is TRUE and in stock if and only if IN_STOCK is TRUE (and ON_SALE is TRUE).
PK’: Part PNO is on sale if and only if ON_SALE is TRUE and in stock if and only if IN_STOCK is TRUE (and IN_STOCK is TRUE).
ULK’: Part PNO is on sale if and only if ON_SALE is TRUE and in stock if and only if IN_STOCK is TRUE (and ON_SALE is TRUE or IN_STOCK is TRUE).

As for constraints, each of these relvars has {PNO} as sole key, and {PNO} is also a foreign key, referencing ULK’, in each of PL’ and PK’. The following constraints also hold:[109]

CONSTRAINT ... IS_EMPTY ( PL' WHERE NOT ( ON_SALE ) ) ;
CONSTRAINT ... IS_EMPTY ( PK' WHERE NOT ( IN_STOCK ) ) ;
CONSTRAINT ... IS_EMPTY ( ULK' WHERE NOT ( ON_SALE OR IN_STOCK ) ) ;

Here now repeated from Chapter 9 are the update rules connecting relvar POI and relvars PL’ and PK’:

ON INSERT i INTO POI :
   INSERT ( i WHERE ON_SALE ) INTO PL' ,
   INSERT ( i WHERE IN_STOCK ) INTO PK' ;

ON DELETE d FROM POI :
   DELETE ( d WHERE ON SALE ) FROM PL' ,
   DELETE ( d WHERE IN_STOCK ) FROM PK' ;

ON INSERT i INTO PL' : INSERT i INTO POI ;

ON INSERT i INTO PK' : INSERT i INTO POI ;

ON DELETE d FROM PL' : DELETE d FROM POI ;

ON DELETE d FROM PK' : DELETE d FROM POI ;

And here are the rules connecting relvars PL’ and PK’ and relvar ULK’:

ON INSERT i INTO PL' : INSERT i INTO ULK' ;
ON INSERT i INTO PK' : INSERT i INTO ULK' ;

ON DELETE d FROM PL' : DELETE ( d WHERE NOT ( IN_STOCK ) ) FROM ULK' ;
ON DELETE d FROM PK' : DELETE ( d WHERE NOT ( ON_SALE ) ) FROM ULK' ;

ON INSERT i INTO ULK' :
   INSERT ( i WHERE ON_SALE ) INTO PL' ,
   INSERT ( i WHERE IN_STOCK ) INTO PK' ;

ON DELETE d FROM ULK' :
   DELETE d FROM PL' , DELETE d FROM PK' ;

And these rules are all, as I hope you’ll agree, completely noncontroversial (though as in Chapter 9 I think it’s instructive to compare them with the rules I gave in connection with the previous version of the example). To repeat from that previous chapter: In effect, what I’ve done here is redesign the database in such a way that information that was previously represented only implicitly, by the relvar names PL and PK, is now represented explicitly by values of the attributes ON_SALE and IN_STOCK instead. And the effect of that redesign is to convert the previous version of the example—which suffered from update behavior that was at least arguably unacceptable—into a version that behaves just like Example 1 as discussed earlier in the chapter. And, of course, this latter example didn’t suffer from the ambiguities and consequent update problems that arose with the previous version of the example currently under discussion.

I’ll leave it as an exercise to determine the implications of the foregoing for a user who sees just relvar ULK’. As for defining ULK’ directly as a restriction of POI—which we obviously could have done if we’d wanted to—I’ll leave it as another exercise to show that the behavior of such a restriction with respect to updates would be essentially identical to that of the union version as described above. Finally, I’ll also leave it as an exercise to show that (assuming it’s supposed to be capable of supporting update operations properly) a “double prime” version of the example—analogous to the “double prime” version of the counterpart example, Example 2, in Chapter 9—is effectively a nonstarter.

A Remark on Disjoint Union

I’d like to consider, briefly, a slightly different example. Suppose some parts are manufactured abroad, others are manufactured domestically, and no parts are both. Suppose further that we represent this situation by means of two relvars, both with a single attribute PNO, one (PA) giving part numbers for parts manufactured abroad, the other (PD) giving part numbers for parts manufactured domestically. Note that the union of these two relvars—call it UAD—is a disjoint union and simply gives part numbers for all parts. But the point is, of course, we can’t tell just by looking at a tuple in that union whether the tuple in question derives from PA or PD (contrast the situation with Example 1 earlier in this chapter, which also involved a disjoint union). The predicates are as follows:

PA: Part PNO is manufactured abroad.
PD: Part PNO is manufactured domestically.
UAD: Part PNO is manufactured either abroad or domestically (and not both).

Now, the significant point about this example, compared to the example involving relvars PL and PK as discussed earlier in this section (where the union wasn’t disjoint), is that the following constraint clearly holds:

CONSTRAINT ... DISJOINT { PA , PD } ;

As a consequence of this constraint, first, deletes on UAD will succeed (in general), though deleting any given tuple will effectively cascade to just one of PA and PD; second, and perhaps more to the point, inserts on UAD will always fail on a Golden Rule violation (unless they’re “no ops”).

Concluding Remarks

In closing, let me abstract from the various examples discussed in earlier sections and summarize the update rules when there’s a union involved. Let V be defined as A UNION B. Further, assume for simplicity that tuples to be inserted into A and B are required to satisfy boolean expressions ax and bx, respectively, where ax and bx denote restriction conditions on the pertinent relvars (default simply TRUE). Then we have:

ON INSERT i INTO A : INSERT i INTO V ;
ON INSERT i INTO B : INSERT i INTO V ;

ON DELETE d FROM A : DELETE ( d WHERE bx ) FROM V ;
ON DELETE d FROM B : DELETE ( d WHERE ax ) FROM V ;

ON INSERT i INTO V :
   INSERT ( i WHERE ax ) INTO A ,
   INSERT ( i WHERE bx ) INTO B ;

ON DELETE d FROM V :
   DELETE ( d WHERE ax ) FROM A ,
   DELETE ( d WHERE bx ) FROM B ;

Note: I’ll leave it as another exercise to show that the specific rules given in connection with the examples earlier in the chapter are all special cases—sometimes slightly “optimized” special cases—of the foregoing general rules.

Now, we saw that the rule for inserting through a union can sometimes lead to results that might be unacceptable, or at least undesirable, in practice. Let me elaborate on this point for a moment. Let the predicates for A and B be PA and PB, respectively; then the predicate for V = A UNION B is PA OR PB. Thus, if we insert t into V, we mean the proposition PA(t) OR PB(t) is true. By contrast, inserting t into both A and B means PA(t) AND PB(t) is true. So if inserting t into V causes t be inserted into both A and B, we’re effectively treating logical OR as logical AND once again.[110]

We also saw that the foregoing criticism applied to the first version of the “parts on sale” vs. “parts in stock” example. However, the real problem with that example was that, given a particular tuple to be inserted into the union ULK, the DBMS was unable to tell whether that tuple logically belonged in relvar PL or relvar PK or both. In effect, the pertinent restriction condition for each of PL and PK in that example was simply TRUE (contrast the situation with Example 2, the NLS vs. NPS example, as discussed earlier in the chapter). And my proposed solution to this problem was, in effect, to redesign the database in such a way that the DBMS could tell which relvar(s) a given tuple logically belonged in after all—a solution that I venture to suggest will often work in practice, and indeed is likely to offer benefits in other areas as well, in addition to view updating as such.



[108] A related issue is the following. Under the stated rules for updates on ULK, inserting a tuple into ULK and then deleting it again will preserve the status quo, but deleting a tuple from ULK and then inserting it again might not. The reason is that deleting a tuple from ULK might actually cause a tuple to be deleted from just one of PL and PK, whereas inserting a tuple into ULK will always cause a tuple to be inserted into both. Of course, the status quo with respect to relvar ULK as such is always preserved.

[109] As in Chapter 9, there’s also a constraint to the effect that the restrictions PL’ WHERE IN_STOCK and PK’ WHERE ON_SALE must be equal, as a consequence of which relvars PL’ and PK’ violate The Principle of Orthogonal Design.

[110] I remind you again that David McGoveran has a proposal for resolving this ambiguity, which I’ll be discussing in Chapter 15.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.141.47.25