SOME PRELIMINARIES

Let me begin by reviewing a few points from Chapter 1. First, recall that each algebraic operator takes at least one relation as input and produces another relation as output. Second, recall too that the fact that the output is the same kind of thing as the input(s)—they’re all relations—constitutes the closure property of the algebra, and it’s that property that lets us write nested relational expressions. Third, I gave outline descriptions in Chapter 1 of what I called “the original operators” (restrict, project, product, intersect, union, difference, and join); now I’m in a position to define those operators, and others, much more carefully. Before I can do that, however, I need to make a few more general points:

  • The operators of the algebra are generic: They apply, in effect, to all possible relations. For example, we don’t need one specific join operator to join departments and employees and another, different, join operator to join suppliers and shipments. (Incidentally, do you think an analogous remark applies to object systems?).

  • The operators are also read-only: They “read” their operands and they return a result, but they don’t update anything. In other words, they operate on relations, not relvars.

  • Of course, the previous point doesn’t mean that relational expressions can’t refer to relvars. For example, if R1 and R2 are relvar names, then R1 UNION R2 is certainly a valid relational expression in Tutorial D (so long as the relvars denoted by those names are of the same type, that is). In that expression, however, R1 and R2 don’t denote those relvars as such; rather, they denote the relations that happen to be the current values of those relvars at that time. In other words, we can certainly use a relvar name to denote a relation operand—and such a relvar reference in itself thus constitutes a valid relational expression[70]—but in principle we could equally well denote the very same operand by means of an appropriate relation literal instead.[71]

    An analogy might help clarify this latter point. Suppose N is a variable of type INTEGER, and at time t it has the value 3. Then N + 2 is certainly a valid expression, but at time t it means exactly the same thing as 3 + 2, no more and no less.

  • Finally, given that the operators of the algebra are indeed all read-only, it follows that INSERT, DELETE, and UPDATE (and relational assignment), though they’re certainly relational operators, aren’t relational algebra operators as such—though, regrettably, you’ll often come across statements to the contrary in the literature.

I also need to say something here about the design of Tutorial D, because its support for the algebra in particular is significantly different from that of SQL. The overriding point is that, in operations like UNION or JOIN that need some correspondence to be established between operand attributes, Tutorial D does so by requiring the attributes in question to be, formally, the very same attribute (i.e., to have the same name and same type). For example, here’s a Tutorial D expression for the join of parts and suppliers on cities:

     P JOIN S

The join operation here is performed, by definition, on the basis of part and supplier cities, CITY being the sole attribute that P and S have in common (i.e., the sole common attribute).

To repeat, Tutorial D establishes the correspondence between operand attributes, when such a correspondence is required, by insisting that the attributes in question in fact be the very same attribute. And it applies this same technique uniformly and consistently across the board, in all pertinent contexts. By contrast, SQL uses different techniques in different contexts. Sometimes it uses ordinal position (we’ve already seen an example of this case in connection with foreign keys, as discussed in the previous chapter). Sometimes it uses explicit specification. Sometimes it requires the attributes in question (or columns, rather) to have the same name—and then the correspondence is sometimes established explicitly, sometimes implicitly. And regardless of whether it requires the columns in question to have the same name, sometimes it requires those columns to be of the same type, and sometimes it doesn’t. In order to illustrate some but not all of these possibilities, let’s consider the P JOIN S example again. Here’s one possible formulation of that join in SQL:

     SELECT P.PNO , P.PNAME , P.COLOR , P.WEIGHT , P.CITY
                                             /* or S.CITY */ ,
            S.SNO , S.SNAME , S.STATUS
     FROM   P , S
     WHERE  P.CITY = S.CITY

In this formulation, the required column correspondence is specified explicitly in the WHERE clause. As you probably know, however, examples like this one can in fact be formulated in several different ways in SQL. Here are three more formulations for the case at hand (as you can see, the second and third are a little closer to the spirit of Tutorial D):[72]

     SELECT P.PNO , P.PNAME , P.COLOR , P.WEIGHT , P.CITY
                                             /* or S.CITY */ ,
            S.SNO , S.SNAME , S.STATUS
     FROM   P JOIN S
     ON     P.CITY = S.CITY

     SELECT P.PNO , P.PNAME , P.COLOR , P.WEIGHT , CITY ,
            S.SNO , S.SNAME , S.STATUS
     FROM   P JOIN S
     USING  ( CITY )

     SELECT P.PNO , P.PNAME , P.COLOR , P.WEIGHT , CITY ,
            S.SNO , S.SNAME , S.STATUS
     FROM   P NATURAL JOIN S

Observe now that:

  • In the first of these three formulations, the column correspondence is again specified explicitly, but this time by means of an ON clause instead of a WHERE clause.

  • In the second formulation, the correspondence is based on common column names, but it’s still specified explicitly, by means of the USING clause.

  • In the third formulation, the correspondence is again based on common column names, but this time it’s implicit.

Now I’d like to go back to the SQL formulation I gave first of all, partly because it was the only one supported in SQL as originally defined and partly, and more importantly, because it allows me to make a number of additional points concerning differences between SQL and Tutorial D:

  • SQL permits, and sometimes requires, dot qualified names. Tutorial D doesn’t. Note: I’ll have more to say about SQL’s dot qualified names in Chapter 12.

  • Tutorial D sometimes needs to rename attributes in order to avoid what would otherwise be naming clashes or mismatches. SQL usually doesn’t (though it does support an analog of the RENAME operator that Tutorial D uses for the purpose, as we’ll see in the next section).

  • Partly as a consequence of the previous point, Tutorial D has no need for SQL’s “correlation name” concept; in effect, it replaces that concept by the idea that attributes sometimes need to be renamed, as previously explained. Note: I’ll be discussing SQL’s correlation names in detail in Chapter 12.

  • As well as either explicitly or implicitly supporting certain features of the relational algebra, SQL also explicitly supports certain features of the relational calculus (correlation names are a case in point, and EXISTS is another). Tutorial D doesn’t. One consequence of this difference is that SQL is a highly redundant language, in that it typically provides numerous different ways of formulating the same query, a fact that can have serious negative implications for both the user and the optimizer. (I once wrote a paper on this topic called “Fifty Ways to Quote Your Query”—see Appendix G—in which I showed that even a query as simple as “Get names of suppliers who supply part P2” can be expressed in well over 50 different ways in SQL.)

  • SQL requires most queries to conform to its SELECT - FROM - WHERE template. Tutorial D has no analogous requirement. Note: I’ll have more to say on this particular issue in the next chapter.

In what follows, I’ll show examples in both Tutorial D and SQL.



[70] This is true in the algebra but not necessarily true in SQL. For example, if T1 and T2 are SQL table names, we typically can’t write things like T1 UNION T2—we have to write something like SELECT * FROM T1 UNION SELECT * FROM T2 instead.

[71] Again, this is true in the algebra but not necessarily true in SQL. See the BNF grammar for SQL table expressions in Chapter 12.

[72] Here’s a test of your SQL knowledge: For which of these formulations do corresponding columns have to be of the same type?

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.140.195.28