EXTEND

You might have noticed that the algebra as I’ve described it so far in this book doesn’t have any conventional computational capabilities. Now, SQL does; for example, we can write queries in SQL along the lines of SELECT A + B AS C ... (for example). However, as soon as we write that “+” sign, we’ve gone beyond the bounds of the algebra as originally defined. So we need to add something to the algebra in order to provide this kind of functionality, and that’s what EXTEND is for. By way of example, suppose part weights (in relvar P) are given in pounds, and we want to see those weights in grams. There are 454 grams to a pound, and so we can write:

image with no caption

Given our usual sample values, the result looks like this:

PNO

PNAME

COLOR

WEIGHT

CITY

GMWT

P1

Nut

Red

12.0

London

5448.0

P2

Bolt

Green

17.0

Paris

7718.0

P3

Screw

Blue

17.0

Oslo

7718.0

P4

Screw

Red

14.0

London

6356.0

P5

Cam

Blue

12.0

Paris

5448.0

P6

Cog

Red

19.0

London

8626.0

Important: Relvar P is not changed in the database! EXTEND is not an SQL-style ALTER TABLE; the EXTEND expression is just an expression, and like any expression it simply denotes a value. In particular, therefore, it can be nested inside other expressions. Here’s an example (the query is “Get part number and gram weight for parts with gram weight greater than 7000 grams”):

image with no caption

As you can see, there’s an interesting difference between the Tutorial D and SQL versions of this example. To be specific, the (sub)expression WEIGHT * 454 appears once in the Tutorial D version but twice in the SQL version. In the SQL version, therefore, we have to hope the implementation will be smart enough to recognize that it need evaluate that subexpression just once per tuple (or row, rather) instead of twice.

The problem this example illustrates is that SQL’s SELECT - FROM - WHERE template is too rigid. What we need to do, as the Tutorial D formulation makes clear, is form a restriction of an extension; in SQL terms, we need to apply the WHERE clause to the result of the SELECT clause, as it were. But the SELECT - FROM - WHERE template forces the WHERE clause to apply to the result of the FROM clause, not the SELECT clause (see the section EVALUATING SQL TABLE EXPRESSIONS in Chapter 6). To put it another way: In many respects, it’s the whole point of the algebra that (thanks to closure) relational operations can be combined and nested in arbitrary ways; but SQL’s SELECT - FROM - WHERE template effectively means that queries must be expressed as a product, followed by a restrict, followed by some combination of project and/or extend and/or rename[93]—and many queries just don’t fit this pattern.

Incidentally, you might be wondering why I didn’t formulate the SQL version like this:

     SELECT PNO , WEIGHT * 454 AS GMWT
     FROM   P
     WHERE  GMWT > 7000.0

(The change is in the last line.) The reason is that GMWT is the name of a column of the final result; table P has no such column, the WHERE clause thus makes no sense, and the expression fails at compile time.

Actually, the SQL standard does allow the query under discussion to be formulated in a style that’s a little closer to that of Tutorial D (and now I’ll make all of the otherwise implicit dot qualifications explicit, for clarity):

     SELECT TEMP.PNO , TEMP.GMWT
     FROM ( SELECT PNO , ( WEIGHT * 454 ) AS GMWT
            FROM P ) AS TEMP
     WHERE  TEMP.GMWT > 7000.0

But I’m not sure all SQL products allow nested subqueries to appear in the FROM clause in this manner. Note too that this kind of formulation inevitably leads to a need to reference certain variables (TEMP, in the example) before they’re defined—quite possibly a long way before they’re defined, in fact, in real SQL queries.

Note: I need to say a little more about the FROM clause in the foregoing example. As you can see, it takes the form

     FROM ( ... ) AS TEMP

Formally speaking, it’s the parenthesized portion of this FROM clause that constitutes the nested subquery (see Chapter 12). And—here comes the point—SQL has a syntax rule to the effect that a nested subquery in the FROM clause must be accompanied by an explicit AS clause that defines a name for the table denoted by that subquery,[94] even if that name is never explicitly referenced elsewhere in the overall expression. In fact, in the example at hand, we could omit all of the explicit references to the name TEMP (i.e., all of the explicit “TEMP.” dot qualifications) if we wanted to, thus:

     SELECT PNO , GMWT
     FROM ( SELECT PNO , ( WEIGHT * 454 ) AS GMWT
            FROM P ) AS TEMP
     WHERE  GMWT > 7000.0

But the TEMP definition (i.e., that AS TEMP specification) is still needed nonetheless.

I’ll close this section with a formal definition of the EXTEND operator:

Definition: Let r be a relation, and let r not have an attribute named X. Then the extension EXTEND r : {X := exp} is a relation with (a) heading the heading of r extended with attribute X and (b) body the set of all tuples t such that t is a tuple of r extended with a value for attribute X that’s computed by evaluating exp on that tuple of r. Observe that the result has cardinality equal to that of r and degree equal to that of r plus one. The type of X in that result is the type of exp.



[93] And/or ungroup (see later in this chapter).

[94] More accurately, it defines a corresponding range variable. See Chapter 12 for further explanation.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.216.37.123