EXAMPLE 5: NAMING SUBEXPRESSIONS

Another query: “Get full supplier details for suppliers who supply all purple parts.” Note: This query, or one very like it, is often used to demonstrate a flaw in the relational divide operator as originally defined. See the further remarks on this topic at the end of the present section.

Here first is a logical formulation:

     { SX } WHERE FORALL PX ( IF PX.COLOR = 'Purple' THEN
                  EXISTS SPX ( SPX.SNO = SX.SNO AND SPX.PNO = PX.PNO ) )

(“names of suppliers SX such that, for all parts PX, if PX is purple, there exists a shipment SPX with SNO equal to the supplier number for supplier SX and PNO equal to the part number for part PX”). First we apply the implication law:

     { SX } WHERE FORALL PX ( NOT ( PX.COLOR = 'Purple' ) OR
                  EXISTS SPX ( SPX.SNO = SX.SNO AND SPX.PNO = PX.PNO ) )

Next De Morgan:

     { SX } WHERE
            FORALL PX ( NOT ( ( PX.COLOR = 'Purple' ) AND
            NOT EXISTS SPX ( SPX.SNO = SX.SNO AND SPX.PNO = PX.PNO ) ) )

Apply the quantification law:

     { SX } WHERE
            NOT EXISTS PX ( NOT ( NOT ( ( PX.COLOR = 'Purple' ) AND
            NOT EXISTS SPX ( SPX.SNO = SX.SNO AND SPX.PNO = PX.PNO ) ) ) )

Double negation:

     { SX } WHERE
            NOT EXISTS PX ( ( PX.COLOR = 'Purple' ) AND
            NOT EXISTS SPX ( SPX.SNO = SX.SNO AND SPX.PNO = PX.PNO ) )

Drop some parentheses and map to SQL:

     SELECT *
     FROM   S AS SX
     WHERE  NOT EXISTS
          ( SELECT *
            FROM   P AS PX
            WHERE  PX.COLOR = 'Purple'
            AND    NOT EXISTS
                 ( SELECT *
                   FROM   SP AS SPX
                   WHERE  SPX.SNO = SX.SNO
                   AND    SPX.PNO = PX.PNO ) )

Recall now from Chapter 7 that if there aren’t any purple parts, every supplier supplies all of them—even supplier S5, who supplies no parts at all (see the discussion of empty ranges in Chapter 10 for further explanation). So the result is the entire suppliers relation:

SNO

SNAME

STATUS

CITY

S1

Smith

20

London

S2

Jones

10

Paris

S3

Blake

30

Paris

S4

Clark

20

London

S5

Adams

30

Athens

Now, you might have had some difficulty in following the transformations in the foregoing example, and you might also be having some difficulty in understanding the final SQL formulation. Well, a useful technique, when the expressions start getting a little complicated as in this example, is to abstract a little by introducing symbolic names for subexpressions (I did briefly mention this point in the previous chapter, but now I want to get more specific). Let’s use exp1 to denote the subexpression

     PX.COLOR = 'Purple'

and exp2 to denote the subexpression

     EXISTS SPX ( SPX.SNO = SX.SNO AND SPX.PNO = PX.PNO )

(note that both of these subexpressions can be directly represented, more or less, in SQL). Then the original relational calculus expression becomes:

     {SX } WHERE FORALL PX ( IF exp1 THEN exp2 )

As I said in the previous chapter, now we can see the forest as well as the trees (as it were), and we can start to apply our usual transformations—though now it seems to make more sense to apply them in a different sequence, precisely because we do now have a better grasp of the big picture. First, then, the quantification law:

     {SX } WHERE NOT EXISTS PX ( NOT ( IF exp1 THEN exp2 ) )

Implication law:

     {SX } WHERE NOT EXISTS PX ( NOT ( NOT ( exp1 ) OR exp2 ) )

De Morgan:

     {SX } WHERE NOT EXISTS PX ( NOT ( NOT ( exp1 AND NOT ( exp2 ) ) ) )

Double negation:

     {SX } WHERE NOT EXISTS PX ( exp1 AND NOT ( exp2 ) )

Finally, expand exp1 and exp2 and map to SQL:

     SELECT *
     FROM   S AS SX
     WHERE  NOT EXISTS
          ( SELECT *
            FROM   P AS PX
            WHERE  PX.COLOR = 'Purple'
            AND    NOT EXISTS
                 ( SELECT *
                   FROM   SP AS SPX
                   WHERE  SPX.SNO = SX.SNO
                   AND    SPX.PNO = PX.PNO ) )

As I think this example demonstrates, SQL expressions obtained by the techniques under discussion are often quite hard to understand directly; as I said earlier, however, we know they’re correct, because of the systematic manner in which they’ve been derived.[158]

As an aside, I can’t resist showing a Tutorial D version of the example by way of comparison:

     S WHERE ( !!SP ) { PNO } ⊇ ( P WHERE COLOR = 'Purple' ) { PNO }

Now let me explain the remark I made at the beginning of this section, regarding divide. Let’s denote the restriction P WHERE COLOR = ‘Purple’ by the symbol PP. Also, let’s simplify the query at hand—“Get full supplier details for suppliers who supply all purple parts”—such that it asks for supplier numbers only, instead of full supplier details. Then it might be thought that the query could be represented by the following algebraic expression:

     SP { SNO , PNO } DIVIDEBY PP { PNO }

Note: DIVIDEBY here represents the divide operator as originally defined. See Chapter 7 if you need an explanation of this point.

With our usual sample data values, however, relation PP, and hence the projection of relation PP on {PNO}, are both empty (because there aren’t any purple parts), and the foregoing expression therefore returns the supplier numbers S1, S2, S3, and S4. But if there aren’t any purple parts, then every supplier supplies all of them (see the discussion of empty ranges in the previous chapter)—even supplier S5, who supplies no parts at all. And the foregoing division can’t possibly return supplier number S5, because it extracts supplier numbers from SP instead of S, and supplier S5 isn’t currently represented in SP. So the informal characterization of that division as “Get supplier numbers for suppliers who supply all purple parts” is incorrect; it should be, rather, “Get supplier numbers for suppliers who supply at least one part and also supply all purple parts.” As this example demonstrates, therefore (and to repeat something I said in Chapter 7), the divide operator doesn’t really solve the problem it was originally, and explicitly, intended to solve.



[158] It’s worth pointing out in passing that the tactic of introducing names for subexpressions is reminiscent, somewhat, of the use of WITH in simplifying complex expressions as discussed in Chapter 6. But there’s a difference: For WITH, the subexpressions in question are required to be closed, whereas no such requirement applies in the present context. Indeed, all we’re doing in the present context is, in effect, simple text substitution, which is not what happens with WITH.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.148.144.139