TYPE CHECKING AND COERCION IN SQL

SQL supports only a weak form of strong typing (if you see what I mean). To be specific:

  • BOOLEAN values can be assigned only to BOOLEAN variables and compared only with BOOLEAN values.

  • Numeric values can be assigned only to numeric variables and compared only with numeric values (where “numeric” means INTEGER, SMALLINT, BIGINT, NUMERIC, DECIMAL, or FLOAT).

  • Character string values can be assigned only to character string variables and compared only with character string values (where “character string” means CHAR, VARCHAR, or CLOB).

  • Bit string values can be assigned only to bit string variables and compared only with bit string values (where “bit string” means BINARY, BINARY VARYING, or BLOB).

Thus, for example, an attempt to compare a number and a character string is illegal. However, an attempt to compare (say) two numbers is legal, even if those numbers are of different types—say INTEGER and FLOAT, respectively (in this example, the INTEGER value will be coerced to type FLOAT before the comparison is done). Which brings me to the question of type coercion ... It’s a widely accepted principle in computing in general that coercions are best avoided, because they’re error prone. In SQL in particular, one bizarre consequence of permitting coercions is that certain unions, intersections, and differences can yield a result with rows that don’t appear in either operand! For example, consider the SQL tables T1 and T2 shown in Figure 2-3 below. Let column X be of type INTEGER in table T1 but NUMERIC(5,1) in table T2, and let column Y be of type NUMERIC(5,1) in table T1 but INTEGER in table T2. Now consider the SQL query:

     SELECT X , Y FROM T1
     UNION
     SELECT X , Y FROM T2

The result is shown as the rightmost table in Figure 2-3. As the figure suggests, columns X and Y in that result are both of type NUMERIC(5,1), and all values in those columns are obtained, in effect, by coercing some INTEGER value to type NUMERIC(5,1). Thus, the result consists exclusively of rows that appear in neither T1 nor T2!—a very strange kind of union, you might be forgiven for thinking.[34]

A very strange “union”

Figure 2-3. A very strange “union”

Strong recommendation: Do your best to avoid coercions wherever possible. (My own clear preference would be to do away with them entirely, regardless of whether we’re in the SQL context or any other context.) In the SQL context in particular, I recommend that you ensure that columns with the same name are always of the same type; this discipline, along with others recommended elsewhere in this book, will go a long way toward ensuring that type conversions in general are avoided. And when they can’t be avoided, I recommend doing them explicitly, using CAST or some CAST equivalent. For example (with reference to the foregoing UNION query):

     SELECT CAST ( X AS NUMERIC(5,1) ) AS X , Y FROM T1
     UNION
     SELECT X , CAST ( Y AS NUMERIC(5,1) ) AS Y FROM T2

For completeness, however, I need to add that certain coercions are unfortunately built into the very fabric of SQL and so can’t be avoided. (I realize the following remarks might not make much sense at this point in the book, but I don’t want to lose them.) To be specific:

  • If a table expression tx is used as a row subquery, then the table t denoted by tx is supposed to have just one row r, and that table t is coerced to that row r. Note: The term subquery occurs ubiquitously in SQL contexts. I’ll explain it in detail in Chapter 12; prior to that point, you can take it to mean, albeit rather loosely, just a SELECT expression enclosed in parentheses.

  • If a table expression tx is used as a scalar subquery, then the table t denoted by tx is supposed to have just one column and just one row and hence to contain just one value v, and that table t is doubly coerced to that value v. Note: This case occurs in connection with SQL-style aggregation in particular (see Chapter 7).

  • In practice, the row expression rx in the ALL or ANY comparison rx θ sq—where (a) θ is a comparison operator such as “<” or “>” followed by the keyword ALL or ANY and (b) sq is a subquery—often consists of a simple scalar expression, in which case the scalar value denoted by that expression is effectively coerced to a row that contains just that scalar value. Note: Throughout this book, I use the term row expression to mean either a row subquery or a row selector invocation (where row selector in turn is my preferred term for what SQL calls a row value constructor—see Chapter 3); in other words, I use row expression to mean any expression that denotes a row, just as I use table expression to mean any expression that denotes a table. As for ALL or ANY comparisons, they’re discussed in Chapter 11.

    Finally, SQL also uses the term coercion in a very special sense in connection with character strings. The details are beyond the scope of this book.



[34] In connection with this example, one reviewer suggested that the “strangeness” of the union might not matter in practice, since at least no information has been lost in the result. Well, that observation might be valid, in this particular example. But if the SQL language designers want to define an operator that manifestly doesn’t behave like the union operator of the relational model (or set theory, come that), then it seems to me that, first, it doesn’t help the cause of understanding to call that operator “union”; second (and rather more important), it isn’t incumbent on me to show such a “union” can sometimes cause problems—rather, it’s incumbent on those language designers to show it can’t.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.116.10.201