MORE ON PREDICATES

In this section,[192] I show how it’s possible to get “don’t know” answers out of a database without nulls, even if there aren’t any tables like table SUC (suppliers with an unknown city) that explicitly represent the fact that something is unknown. For simplicity, suppose our database consists in its entirety of just table SC (suppliers with a known city), as shown in Figure C-9 below.

Table SC (suppliers with a known city)

Figure C-9. Table SC (suppliers with a known city)

Now consider the following query on table SC:

Is supplier S1 in London?

In Tutorial D:[193]

( SC WHERE SNO = 'S1' AND CITY = 'London' ) { }

Clearly, this expression evaluates to either TABLE_DEE or TABLE_DUM: TABLE_DEE if supplier S1 is in London, TABLE_DUM otherwise. Note, therefore, that—as I mentioned in Chapter 3—TABLE_DEE and TABLE_DUM can be interpreted as yes and no, respectively. Note too the implicit appeal to The Closed World Assumption!—in effect, we’re saying that if the row (S1,London) fails to appear in table SC, we’re allowed to conclude that it’s not the case that supplier S1 is in London.

Now, I said previously that the predicate for table SC was Supplier SNO is located in city CITY. But it isn’t—not really. To see why not, consider what happens if some user tries to introduce a new row into the table, perhaps as follows:

INSERT SC RELATION { TUPLE { SNO 'S6' , CITY 'Madrid' } } ;

In effect, the user here is telling the system there’s a new supplier, S6, with city Madrid. Now, the system obviously has no way of knowing whether the user’s assertion is true; as explained in Chapter 8, all it can do (and does do) is check that the requested insertion, if performed, doesn’t cause any integrity constraints to be violated. If it doesn’t, then the system accepts the row, and interprets it as representing a true fact from this point forward.

We see, therefore, that rows in table SC don’t necessarily represent actual states of affairs in the real world; rather, they represent what the user tells the system about the real world, or in other words the user’s knowledge of the real world. Thus, the predicate for relvar S isn’t really just Supplier SNO is located in city CITY; rather, it’s We know that supplier SNO is located in city CITY. And the effect of a successful INSERT is to make the system aware of something the user already knows. Thus, the database doesn’t contain the real world (of course not); what it contains is, rather, the system’s knowledge of the real world. And the system’s knowledge is derived in turn from the user’s knowledge (of course!—there’s no magic here).[194]

So when we pose a query to the system, by definition that query can’t be a query about the real world; instead, it is—it has to be—a query about the system’s knowledge of the real world. For example, consider again the query discussed above: Is supplier S1 in London? This rather imprecise natural language formulation has to be understood as shorthand for the following more accurate one:

Do we know that supplier S1 is in London?

In practice, of course, we almost never talk in such precise terms; we usually elide qualifiers like “Do we know that” (or “According to the system’s knowledge, is it true that,” or “Does the database say that,” and so on). But even if we do elide them, we certainly need to understand that, conceptually, they’re there—for otherwise we’ll be really confused. (Though perhaps I should add that such confusions aren’t exactly unknown in practice.)

It follows from the foregoing discussion that the Tutorial D expression I showed earlier—

( S WHERE SNO = 'S1' AND CITY = 'London' ) { }

—doesn’t really represent the query Is supplier S1 in London? after all. Rather, it represents the query Do we know that supplier S1 is in London? And, appealing again to The Closed World Assumption, it follows further that:

  • If the result is TABLE_DEE (yes), it means we do know supplier S1 is in London.

  • If the result is TABLE_DUM (no), it means we don’t know whether supplier S1 is in London. And that’s a “don’t know” answer if ever you saw one.

Of course, if a row for supplier S1 does appear in the table but the CITY value in that row isn’t London, we know supplier S1 isn’t in London (I’m appealing here to the fact that {SNO} is a key for table SC). Putting it all together, then, we have the following:

  • If a row for supplier S1 appears in table SC and the CITY value in that row is London, it means yes, we know supplier S1 is in London.

  • If a row for supplier S1 appears in table SC but the CITY value in that row is something other than London, it means no, we know supplier S1 isn’t in London.

  • And if no row for supplier S1 appears in table SC at all, it means we don’t know whether supplier S1 is in London.

Given The Closed World Assumption, then, we can formulate queries that return a true / false / don’t know answer, and we don’t need nulls or 3VL to do so. Here’s a Tutorial D formulation for the example under discussion:

     ( EXTEND ( S WHERE SNO = 'S1' AND CITY = 'London' ) { } :
                                   { RESULT := 'true   ' } ) { RESULT }
       UNION
     ( EXTEND ( S WHERE SNO = 'S1' AND CITY ≠ 'London' ) { } :
                                   { RESULT := 'false  ' } ) { RESULT }
       UNION
     ( EXTEND ( RELATION { TUPLE { SNO 'S1' } } MINUS S { SNO } ) { } :
                                   { RESULT := 'unknown' } ) { RESULT }

As you can see, this expression takes the form a UNION b UNION c, where each of a, b, and c is a table of just one column. Moreover, it should be clear that exactly one of a, b, and c contains just one row and the other two contain no rows at all. The overall result is thus a one-column, one-row table; the single column, RESULT, is of type character string, and the single row contains the appropriate RESULT value. And the trick—though it isn’t really a trick at all—is that the RESULT value is a character string, not a truth value. As a consequence, there’s no need to get into the 3VL quagmire in order to formulate queries that can yield true, false, or unknown answers, if that’s what we really want.

For completeness, here’s an SQL analog of the foregoing Tutorial D expression:

     SELECT 'true    ' AS RESULT
     FROM ( SELECT S.*
            FROM   S
            WHERE  SNO = 'S1'
            AND    CITY = 'London' ) AS POINTLESS1
     UNION  CORRESPONDING
     SELECT 'false   ' AS RESULT
     FROM ( SELECT S.*
            FROM   S
            WHERE  SNO = 'S1'
            AND    CITY <> 'London' ) AS POINTLESS2
     UNION  CORRESPONDING
     SELECT 'unknown' AS RESULT
     FROM ( VALUES ( 'S1' )
            EXCEPT
            SELECT S.SNO
            FROM   S ) AS POINTLESS3

Incidentally, if you’re wondering about those AS POINTLESS specifications in this SQL expression, I remind you that SQL has a syntax rule to the effect that a subquery in the FROM clause must be accompanied by an explicit AS clause that defines an associated range variable, even if that range variable is never explicitly referenced anywhere in the overall expression. Note also that specifying CORRESPONDING on the EXCEPT in the final portion of this expression would actually be incorrect! It could be made correct by replacing the specification VALUES (‘S1’) by an expression of the form SELECT DISTINCT ‘S1’ AS SNO FROM T where T is some arbitrary—but necessarily nonempty—named table.



[192] The section is based on material from Chapter 4 (“The Closed World Assumption”) from my book Logic and Databases: The Roots of Relational Theory (see Appendix G).

[193] I have to use Tutorial D here, not SQL, because the example under discussion is a yes/no query; as we’ll see in a moment, therefore, it relies on the special relations TABLE_DEE and TABLE_DUM (the only relations of degree zero—see Chapter 3), and SQL doesn’t support those relations.

[194] Even the terms know and knowledge might be a little strong in contexts such as those at hand (the terms believe and beliefs might be better)—but I’ll stay with know and knowledge for the purposes of the present discussion.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.142.98.186