Perhaps you can begin to see now why it’s my opinion that (to repeat something I said in Chapter 5) the relational model is rock solid, and “right,” and will endure. A hundred years from now, I fully expect database systems still to be based on Codd’s relational model. Why? Because the foundations of that model—namely, set theory and predicate logic—are themselves rock solid in turn. Elements of predicate logic in particular go back well over 2,000 years, at least as far as Aristotle (384-322 BCE).

So what about other data models?—the “object oriented model,” for example, or the “hierarchic model,” or the CODASYL “network model,” or the “semistructured model”? In my view, these other models are just not in the same ballpark. Indeed, I seriously question whether they deserve to be called models at all.[174] The hierarchic and network models in particular never really existed in the first place!—as abstract models, I mean, preceding any implementations. Instead, they were invented after the fact; that is, hierarchic and network products were built first, and the corresponding models were defined afterward, by a process of induction—here just a polite term for guesswork—from those products. As for the object oriented and semistructured models, it’s entirely possible that the same criticism applies; I suspect it does, but it’s hard to be sure. One problem is that there doesn’t seem to be any consensus on what those models might consist of.[175] It certainly can’t be claimed, for example, that there’s a unique, clearly defined, and universally accepted object oriented model, and similar remarks apply to the semistructured model also. (Actually, some people have claimed there isn’t a unique relational model, either. I’ll deal with that argument in a few moments.)

Aside: The following quote from “The Object Oriented Database System Manifesto,” by Malcolm Atkinson, François Bancilhon, David DeWitt, Klaus Dittrich, David Maier, and Stanley Zdonik (Proc. 1st International Conference on Deductive and Object Oriented Databases, Kyoto, Japan, 1989) lends weight to my suggestion that in the case of the object oriented model, at least, implementations came first and the model itself was—or indeed, the quote rather strongly suggests, should be (?)—defined afterward:

With respect to the specification of the system, we are taking a Darwinian approach: We hope that, out of the set of experimental prototypes being built, a fit model will emerge. We also hope that viable implementation technology for that model will evolve simultaneously.

In other words, the authors are suggesting that the code should be written first, and that a model might possibly be developed later by abstracting from that code. End of aside.

Another important reason why I don’t believe those other models really deserve to be called models at all is the following. First, I hope you agree it’s undeniable that the relational model is indeed a model and thus not, by definition, concerned with implementation issues. By contrast, those other models all fail, much of the time, to make a clear distinction between issues that truly are model issues and issues that have to do with matters of implementation; at the very best, they muddy that distinction considerably (they’re all much “closer to the metal,” as it were).[176] As a consequence, they’re harder to use and understand, and they give implementers far less freedom—far less than the relational model does, I mean—to adopt inventive or creative approaches to questions of implementation.

So what of those claims to the effect that there are several relational models, too? One example of such a claim can be found in the book Joe Celko’s Data and Databases: Concepts in Practice (Morgan Kaufmann, 1999), where the author, Joe Celko, says this:

There is no such thing as the relational model for databases anymore [sic] than there is just one geometry.

And to bolster his argument, he goes on to identify what he says are six “different relational models.”

Now, I wrote an immediate response to these claims when I first encountered them. Here’s a lightly edited version of what I said at the time:

It’s true there are several different geometries (euclidean, elliptic, hyperbolic, and so forth). But is the analogy a valid one? That is, do those “different relational models” differ in the same way those different geometries differ? It seems to me the answer to this question is no. Elliptic and hyperbolic geometries are often referred to, quite explicitly, as noneuclidean geometries;[177] for the analogy to be valid, therefore, it would seem that at least five of those “six different relational models” would have to be nonrelational models, and hence, by definition, not “relational models” at all. (Actually, I would agree that several of those “six different relational models” are indeed not relational. But then it can hardly be claimed—at least, it can’t be claimed consistently—that they’re different relational models.)

And I went on to say this (again somewhat edited here):

But I have to admit that Codd did revise his own definitions of what the relational model was, somewhat, throughout the 1970s and 1980s. One consequence of this fact is that critics have been able to accuse Codd in particular, and relational advocates in general, of “moving the goalposts” far too much. For example, Mike Stonebraker has written (in his introduction to Readings in Database Systems, 2nd edition, Morgan Kaufmann, 1994) that “one can think of four different versions” of the model:

  • Version 1: Defined by the 1970 CACM paper

  • Version 2: Defined by the 1981 Turing Award paper

  • Version 3: Defined by Codd’s 12 rules and scoring system

  • Version 4: Defined by Codd’s book

Let me interrupt myself briefly to explain the references here. They’re all by Codd. The 1970 CACM paper is “A Relational Model of Data for Large Shared Data Banks,” CACM 13, No. 6 (June 1970), and it’s discussed in a little more detail in Appendix G of the present book. The 1981 Turing Award paper is “Relational Database: A Practical Foundation for Productivity,” CACM 25, No. 2 (February 1982). The 12 rules and the accompanying scoring system are described in Codd’s Computerworld articles “Is Your DBMS Really Relational?” and “Does Your DBMS Run By The Rules?” (October 14th and October 21st, 1985). Finally, Codd’s book is The Relational Model for Database Management Version 2 (Addison-Wesley, 1990). Now back to my response:

Perhaps because we’re a trifle sensitive to such criticisms, Hugh Darwen and I have tried to provide, in our book Databases, Types, and the Relational Model: The Third Manifesto, our own careful statement of what we believe the relational model is (or ought to be!). Indeed, we’d like our Manifesto to be seen in part as a definitive statement in this regard. I refer you to the book itself for the details; here just let me say that we see our contribution in this area as primarily one of dotting a few i’s and crossing a few t’s that Codd himself left undotted or uncrossed in his own work. We most certainly don’t want to be thought of as departing in any major respect from Codd’s original vision; indeed, the whole of the Manifesto is very much in the spirit of Codd’s ideas and continues along the path that he originally laid down.

To all of the above I’d now like to add another point, which I think clearly refutes Celko’s original argument. I agree there are several different geometries. But the reason why those geometries are all different is: They start from different axioms. By contrast, we’ve never changed the axioms for the relational model. We have made a number of changes over the years to the model itself—for example, we’ve added relational comparisons—but the axioms (which are basically those of classical set theory and classical predicate logic) have remained unchanged ever since Codd’s first papers. Moreover, what changes have occurred have all been, in my view, evolutionary, not revolutionary, in nature. Thus, I really do claim there’s only one relational model, even though it has evolved over time and will presumably continue to do so. As I said in Chapter 1, it can be seen as a small branch of mathematics; as such, it grows over time as new theorems are proved and new results discovered. What’s more—as with mathematics in general—those new theorems and results can be proved and discovered by anyone who’s competent to do so. The relational model began as the brainchild of one man, but now belongs to the world.[178]

So what are those evolutionary changes? Here are some of them:

  • As already mentioned, we’ve added relational comparisons.

  • We’ve clarified the logical difference between relations and relvars.

  • We’ve clarified the concept of first normal form; as a consequence, we’ve embraced the concept of relation valued attributes in particular.

  • We have a better understanding of the nature of relational algebra, including the relative significance of various operators and an appreciation of the importance of relations of degree zero, and we’ve identified certain useful new operators (for example, extend and semijoin).

  • We’ve added the concept of image relations.

  • We have a better understanding of updating, including view updating in particular.

  • We have a better understanding of the fundamental significance of integrity constraints in general, and we have many good theoretical results regarding certain important special cases.

  • We’ve clarified the nature of the relationship between the model and predicate logic.

  • Finally, we have a clearer understanding of the relationship between the relational model and type theory (more specifically, we’ve clarified the nature of domains).
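
To make a few of the items in this list concrete, here is a minimal sketch in Python, which is not a language used in this discussion. It rests on loudly stated assumptions: a tuple is modeled as a frozenset of (attribute, value) pairs, a relation as a frozenset of such tuples, and headings are left implicit, which a real relational language would not do. The helper names relation, semijoin, and extend, and the sample data, are purely illustrative. TABLE_DEE and TABLE_DUM are the names usually given to the two relations of degree zero.

```python
from typing import Any, Callable, Dict, FrozenSet, Tuple

# Assumed representation (illustrative only): a tuple is a frozenset of
# (attribute, value) pairs; a relation is a frozenset of such tuples.
# Headings are implicit here, which is a simplification.
Row = FrozenSet[Tuple[str, Any]]
Relation = FrozenSet[Row]


def relation(*rows: Dict[str, Any]) -> Relation:
    """Build a relation from dicts; duplicate tuples collapse automatically."""
    return frozenset(frozenset(r.items()) for r in rows)


def semijoin(r: Relation, s: Relation) -> Relation:
    """Tuples of r that match at least one tuple of s on the common attributes."""
    def matches(t: Row, u: Row) -> bool:
        dt, du = dict(t), dict(u)
        return all(dt[a] == du[a] for a in dt.keys() & du.keys())
    return frozenset(t for t in r if any(matches(t, u) for u in s))


def extend(r: Relation, name: str, fn: Callable[[Dict[str, Any]], Any]) -> Relation:
    """Add attribute `name` to every tuple, computed from that tuple by fn."""
    return frozenset(frozenset({**dict(t), name: fn(dict(t))}.items()) for t in r)


# The two relations of degree zero: no attributes at all.
TABLE_DEE = relation({})  # exactly one tuple, the empty tuple
TABLE_DUM = relation()    # no tuples

suppliers = relation(
    {"sno": "S1", "city": "London"},
    {"sno": "S2", "city": "Paris"},
    {"sno": "S3", "city": "Paris"},
)
shipments = relation(
    {"sno": "S1", "pno": "P1"},
    {"sno": "S2", "pno": "P1"},
)

# Semijoin: suppliers that have at least one shipment.
with_shipments = semijoin(suppliers, shipments)

# Relational comparisons: equality and inclusion between relations.
is_included = with_shipments <= suppliers
is_equal = with_shipments == suppliers

# Extend: add a computed attribute to every supplier tuple.
tagged = extend(suppliers, "tag", lambda t: t["sno"] + "@" + t["city"])
```

In this sketch, semijoin(suppliers, TABLE_DEE) returns suppliers unchanged and semijoin(suppliers, TABLE_DUM) returns an empty relation, which gives some feel for why the relations of degree zero behave like "true" and "false." The relational comparisons come for free from Python's set operators, but only because of the representation assumed above.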

[174] Which is why I set them all in quotation marks. I’ll drop those quotation marks from this point forward because I know how annoying they can be, but you should think of them as still being there in some virtual kind of sense.

[175] My own opinion (for what it’s worth) is that the semistructured model and the object model are, respectively, just the old hierarchic model warmed over and the old network model warmed over.

[176] Actually I think these remarks are rather charitable; in my opinion, those other models are really little more than slightly abstract, but otherwise ad hoc, storage structures that have been elevated above their station and will not stand the test of time.

[177] I’ll have a little more to say about those noneuclidean geometries in the next section.

[178] “I see relational theory as simply a body of theory to which many people are contributing in different ways” (E. F. Codd, in an interview in Data Base Newsletter 10, No. 2, March 1982).

