RANGE VARIABLES

As we saw in Chapter 10, a range variable in the relational model is a variable—a variable in the sense of logic, that is, not the usual programming language sense—that ranges over the set of tuples in some relation (or the set of rows in some table, in SQL terms). In SQL, such variables are defined by means of AS specifications in the context of either FROM or explicit JOIN. Here’s a simple example of the FROM case:

     SELECT SX.SNO
     FROM   S AS SX
     WHERE  SX.STATUS > 15

SX here is a range variable that ranges over table S; in other words, its permitted values are rows of table S. You can think of the SELECT expression overall as being evaluated as follows. First, the range variable takes on one of its permitted values, say the row for supplier S1. Is the status value in that row greater than 15? If it is, then supplier number S1 appears in the result. Next, the range variable moves on to another row of table S, say the row for supplier S2; again, if the status value in that row is greater than 15, then the relevant supplier number appears in the result. And so on, exhaustively, until variable SX has taken on all of its permitted values.

Note: SQL calls a name such as SX in the example a correlation name. However, it doesn’t seem to have a term for the thing that such a name names; certainly there’s no such thing in SQL as a “correlation.” (Note in particular that the term doesn’t necessarily have anything to do with correlated subqueries, which are discussed in the next section.) I prefer the term range variable.

Incidentally, it’s worth noting that SQL requires SELECT expressions always to be formulated in terms of range variables; if no such variables are specified explicitly, it assumes the existence of implicit ones with the same names as the corresponding tables. For example, the SELECT expression

     SELECT SNO
     FROM   S
     WHERE  STATUS > 15

—arguably a more “natural” SQL formulation of the example discussed above—is treated as shorthand for this expression (note the text in bold):

     SELECT S.SNO
     FROM   S AS S
     WHERE  S.STATUS > 15

In this latter formulation, the “S” dot qualifiers and the “S” in the specification “AS S” do not denote table S; rather, they denote a range variable called S that ranges over the table with the same name.[165]

Now, the BNF grammar defined later in this chapter refers to the items in the commalist in a FROM clause—i.e., the items following the keyword FROM itself—as table specifications.[166] The operand expressions in an explicit JOIN are also table specifications. Let ts be such a table specification. Then, if the table expression portion of ts is a table subquery (see the next section), then ts must also include an AS clause—even if the range variable introduced by that AS clause is never explicitly mentioned anywhere else in the overall expression. Here’s a JOIN example:

     ( SELECT SNO , CITY FROM S ) AS TEMP1
       NATURAL JOIN
     ( SELECT PNO , CITY FROM P ) AS TEMP2

Here’s another example (repeated from Chapter 7):

     SELECT PNO , GMWT
     FROM ( SELECT PNO , WEIGHT * 454 AS GMWT
            FROM   P ) AS TEMP
     WHERE  GMWT > 7000.0

For interest, here’s the same example with all implicit qualifiers made explicit:

     SELECT TEMP.PNO , TEMP.GMWT
     FROM ( SELECT P.PNO , P.WEIGHT * 454 AS GMWT
            FROM   P ) AS TEMP
     WHERE  TEMP.GMWT > 7000.0

Note: A range variable definition in SQL can always optionally include a column name commalist that defines column names for the table the range variable ranges over, as in this example (see the last two lines):

     SELECT TEMP.SNO , TEMP.SNAME , TEMP.STATUS, TEMP.SCITY ,
            TEMP.PNO , TEMP.PNAME , TEMP.COLOR , TEMP.WEIGHT , TEMP.PCITY
     FROM ( SELECT * FROM S JOIN P ON S.CITY > P.CITY ) AS TEMP
          ( SNO , SNAME , STATUS , SCITY ,
            PNO , PNAME , COLOR , WEIGHT , PCITY )

The introduced column names here (SNO, SNAME, STATUS, SCITY, PNO, PNAME, COLOR, WEIGHT, and PCITY) effectively rename columns SNO, SNAME, STATUS, S.CITY, PNO, PNAME, COLOR, WEIGHT, and P.CITY, respectively (see the explanation of JOIN ... ON in Chapter 6).[167] However, it shouldn’t be necessary to introduce column names in this way very often if other recommendations in this book are followed.

Recommendation: Favor the use of explicit range variables, especially in “complex” expressions—they can aid clarity, and sometimes they can save keystrokes.[168] Be aware, however, that SQL’s name scoping rules for such variables can be quite hard to understand (but this is true regardless of whether the variables in question are explicit or implicit).

Caveat: Many SQL texts refer to range variable names (or correlation names) as aliases, and describe them as if they were just alternative names for the tables they range over. But such a characterization seriously misrepresents the true state of affairs—indeed, it betrays a serious lack of understanding of what’s really going on—and is strongly deprecated on that account. Be on your guard against this sloppy manner of speaking.



[165] Here I might admit if pressed to a sneaking sympathy with a remark an old friend once made to me in connection with this very point: “You mathematicians are all alike—you spend hours agonizing over things that are perfectly obvious to everybody else.”

[166] The standard term is table reference, but that term is hardly very apt. In most languages, a variable reference is a special case of an expression; syntactically, it’s just a variable name, used to denote the value of the variable in question. But an SQL “table reference” isn’t a special case of a table expression—not in the sense in which the latter term is used in this book, and (perhaps more to the point) not in the sense in which it’s used in SQL, either.

[167] As the example suggests, the column name commalist in a range variable definition is required, somewhat annoyingly, to be exhaustive—there’s no way to rename just some of the columns concerned and not others. Also, note the need here to be fully cognizant of SQL’s rules regarding left to right column ordering in the result of the explicit JOIN!

[168] I’ll omit them from most of my own examples in the remainder of this chapter, however, because (a) using explicit range variables might distract from the main point I’m trying to make with those examples and (b) those examples are all fairly simple, anyway.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.22.61.179