Preface

This book is a major revision—in fact, a total rewrite—of an earlier one by the same authors. That earlier book was published by Morgan Kaufmann in 2003, under the title Temporal Data and the Relational Model. But the present book is so different from the previous one (except, to some extent, in its overall structure) that it doesn’t seem reasonable to refer to it as just a second edition. (We’ve even given it a different title, partly for that reason.) As a consequence, we won’t make any attempt in the text to call out specific points of difference with respect to that earlier book. Nor will we do what’s usually done with the preface to a new edition, which is first to repeat the preface from the previous edition and then to explain what’s changed in the new one. In other words, what you’re reading right now is, to all intents and purposes, the preface to a brand new book.

So what exactly is “temporal data”? Well, a temporal database system is one that includes special support for the time dimension; in other words, it’s a system that provides special facilities for storing, querying, and updating historical and/or future data. Database management systems (DBMSs for short) as classically understood aren’t temporal in this sense—they provide little or no special support for temporal data at all. However, this situation is beginning to change, for the following reasons among others:

■ Disks and other secondary storage media have become cheap enough that keeping large volumes of temporal data is now a practical possibility.

■ As a consequence, data warehouses have become increasingly widespread.

■ Hence, users of those warehouses have begun to find themselves faced with temporal data problems, and they’ve begun to feel the need for solutions to those problems.

■ In order to address those problems, certain temporal features have been incorporated into the most recent version of the SQL standard (“SQL:2011,” so called because it was ratified in late 2011).

■ Accordingly, vendors of conventional DBMS products have also begun to add temporal support to those products (there’s a huge market opportunity here).

Here are some examples of application scenarios where such temporal support is very definitely needed. The examples are taken, with permission, from “A Matter of Time: Temporal Data Management in DB2 10,” by Cynthia Saracco, Matthias Nicola, and Lenisha Gandhi (see Appendix F for further details):

1. An internal audit requires a financial institution to report on changes made to a client’s records over the past five years.

2. A lawsuit prompts a hospital to reassess its knowledge of a patient’s medical condition just before a new treatment was ordered.

3. A client challenges an insurance agency’s resolution of a claim involving a car accident. The agency needs to determine the policy terms that were in effect when the accident occurred.

4. An online travel agency wants to detect inconsistencies in itineraries. For example, if someone books a hotel in Rome for eight days and reserves a car in New York for three of those days, the agency would like to flag the situation for review.

5. A retailer needs to ensure that no more than one discount is offered for a given product during a given period of time.

6. A client inquiry reveals a data entry error involving the introductory interest rate on a credit card. The bank needs to correct the error, retroactively, and compute a new balance if necessary.

That same paper by Saracco, Nicola, and Gandhi also gives some indication of the development benefits to be obtained from the availability of “native” temporal support in the DBMS (IBM’s DB2 product, in the case at hand):

The DB2 temporal support reduced coding requirements by more than 90% over homegrown implementations. Implementing just the core logic in SQL stored procedures or Java required 16 and 45 times as many lines of code, respectively, as the equivalent SQL statements using the DB2 temporal features. Also, it took less than an hour to develop and test those DB2 statements. By contrast, the homegrown approaches required 4-5 weeks to code and test, and they provided only a subset of the temporal support built into DB2. Thus, providing truly equivalent support through a homegrown implementation would likely take months.

Now, research into temporal databases isn’t new—technical papers on the subject have been appearing in the literature ever since the beginning of the 1980s, if not earlier. However, much of that research ultimately proved unproductive: It turned out to be excessively complicated, or it led to logical inconsistencies, or it failed to solve certain aspects of the problem, or it was unsatisfactory for some other reason. So we’ll have little to say about that research in this book (apart from a few remarks in Chapter 19 and in the annotation to some of the references in Appendix F). Instead, we’ll focus on what we regard as a much more promising approach, one that’s firmly rooted in the relational model of data, which those others mostly aren’t, or weren’t. Of course, it’s precisely because of its strong relational foundations that we believe the approach we favor will stand the test of time (as it were!). And since the approach in question is directly and primarily due to one of the present authors (Lorentzos), the book can be regarded as authoritative.

The book also includes much original material resulting from continuing investigations by all three authors, material that’s currently documented nowhere else at all. Examples include new database design techniques; a new normal form; new relational operators; new update operators; a new approach to the problem of temporal “granularity”; and support for “cyclic point types.”¹ Overall, therefore, the book can be seen as, among other things, an abstract blueprint for the design of a temporal DBMS and the language interface to such a DBMS. In other words, it’s forward looking, in the sense that it describes not only how temporal DBMSs might and do work today, but also, and more importantly, how we think they should and will work in the future.

One further point: Although this book concentrates on temporal data as such, many of the concepts are actually of much wider applicability. To be specific, the basic data construct involved is the interval, and intervals don’t necessarily have to be temporal in nature. (On the other hand, certain of the ideas discussed are indeed specifically temporal ones: for example, the notion sometimes referred to, informally, as “the moving point now.”)
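To make the point about intervals concrete, here is a minimal sketch in Python (not in Tutorial D, and not taken from the body of the book). It assumes closed intervals over integer points; the names Interval, overlaps, and overlapping_pairs, and all of the sample data, are hypothetical and chosen purely for illustration. The same overlap test serves both a temporal case (days on which discounts apply to a product) and a nontemporal one (ranges of voltages).

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Interval:
    """Closed interval [begin, end] over integer points (an assumption of this sketch)."""
    begin: int
    end: int

    def __post_init__(self):
        if self.begin > self.end:
            raise ValueError("begin must not exceed end")

    def overlaps(self, other: "Interval") -> bool:
        # Two closed intervals share at least one point exactly when
        # each one begins no later than the other one ends.
        return self.begin <= other.end and other.begin <= self.end


def overlapping_pairs(intervals):
    """Return every pair of intervals that share at least one point."""
    return [(a, b) for a, b in combinations(intervals, 2) if a.overlaps(b)]


# Hypothetical temporal data: day numbers on which discounts apply to one product.
discount_days = [Interval(1, 10), Interval(8, 15), Interval(20, 25)]

# Hypothetical nontemporal data: voltage ranges for two device settings.
voltage_ranges = [Interval(0, 5), Interval(6, 12)]

if __name__ == "__main__":
    print(overlapping_pairs(discount_days))
    print(overlapping_pairs(voltage_ranges))
```

Nothing in the overlap test depends on the points being times, which is the sense in which the interval, rather than time as such, is the basic construct.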

Structure of the Book

The body of the book is divided into four major parts:

I. A Review of Relational Concepts

II. Laying the Foundations

III. Building on the Foundations

IV. SQL Support

To elaborate:

■ Part I (three chapters) provides a refresher course on the relational model, with the emphasis on aspects that don’t seem to be as widely appreciated as they might be. It also introduces the language Tutorial D—note the boldface name—which we’ll be using in coding examples throughout the book.

■ Part II (eight chapters) covers basic temporal concepts and principles. It explains some of the problems that temporal data seems to give rise to, with reference to queries and integrity constraints in particular, and it describes some important new operators that can help in formulating those queries and constraints. Note: We should immediately explain that those new operators are all, in the last analysis, just shorthand for certain combinations of operators that can already be expressed using the traditional relational algebra. However, the shorthands in question turn out to be extremely useful ones—not just because they simplify the formulation of queries and constraints (a laudable goal in itself, of course), but also, more importantly, because they serve to raise the level of abstraction, and hence the overall level of discourse, regarding temporal issues in general.

■ Part III (seven chapters) covers a range of more advanced temporal concepts and principles. In effect, it shows how the ideas introduced in Part II can be applied to such matters as temporal database design, temporal database updates, the formulation of temporal database constraints, and a variety of more specialized topics.

■ Part IV (one long chapter) describes the temporal features of the SQL standard.

In addition to the foregoing, there are six appendixes, covering (as appendixes are wont to do) a somewhat mixed bag of topics. Appendixes A-C discuss possible extensions to certain of the notions introduced in the body of the book. Appendix D provides an abbreviated grammar, for reference purposes, for the language Tutorial D. Appendix E discusses implementation and optimization issues. Finally, as already indicated, Appendix F gives an annotated and consolidated list of references for the entire book. Note: Talking of references, we should explain that throughout the book such references take the form of numbers in square brackets. For example, “[2]” refers to the second publication mentioned in that appendix: namely, the paper “Maintaining Knowledge about Temporal Intervals,” by James F. Allen, which was published in CACM 26, No. 11, in November 1983.²

Note: Parts II, III, and IV, at least, are definitely not meant for “dipping.” Rather, they’re meant to be read in sequence as written—if you skip a section or a chapter, you’re likely to have difficulty with later material. While this state of affairs might seem a little undesirable in general, the fact is that temporal data does seem to suffer from certain innate complexities, and the book necessarily reflects some of those complexities. What’s more, it’s only fair to warn you that, beginning with Chapter 12 (the first chapter in Part III), you might notice a definite increase in complexity. At the same time, you should be aware that the full picture doesn’t really begin to emerge until that same Chapter 12. Please note too that this isn’t a closed subject!—several interesting research issues remain. Such issues are touched on and appropriately flagged at pertinent points in the book.

Last, a couple of remarks regarding our use of terminology:

1. Other than in Part IV, we adhere almost exclusively to the formal relational terms relation, tuple, attribute, etc., instead of using their SQL counterparts table, row, column, etc. In our opinion, those SQL terms have done the cause of genuine understanding a serious disservice over the years. Besides, the constructs referred to by those terms include many deviations from relational theory, such as duplicate rows, left to right column ordering, and nulls, and we certainly don’t want to give the impression that we might condone such deviations.

2. We’ve been forced to introduce our own terminology for several concepts—“packed form,” “U_ operators,” “U_keys,” “sixth normal form,” and others—precisely because the concepts themselves are new (for the most part, at least). However, we’ve tried in every case to choose terms that are appropriate and make good intuitive sense, and we haven’t intentionally used familiar terms in unfamiliar ways. We apologize if our choice of terms causes you any unnecessary difficulties.

Intended Readership

Who should read this book? Well, in at least one sense the book is quite definitely not self-contained—it does assume you’re professionally interested in database technology and are reasonably well acquainted with conventional database theory and practice. However, we’ve tried to define and explain, as carefully as possible, any concepts that might be thought novel. In fact, we’ve done the same for several concepts that really shouldn’t be novel at all but don’t seem to be as widely understood as they might be (relation variables, also known as relvars, are an obvious case in point here). We’ve also included a set of exercises, with answers, at the end of each chapter. In other words, we’ve tried to make the book suitable for both reference and tutorial purposes. Our intended audience is thus just about anyone with a serious interest in database technology, including but not limited to the following:

■ Database language designers and standardizers

■ DBMS product implementers and other vendor personnel

■ Data and database administrators

■ “Information modelers” and database designers

■ Application designers and developers

■ Computer science professors specializing in database issues

■ Database students, both graduate and undergraduate

■ People responsible for DBMS product evaluation and acquisition

■ Technically aware end users

The only background knowledge required is a general understanding of data management concepts and issues (including in particular a basic familiarity with the relational model) and, for Part IV, a general knowledge of SQL.

Note: There are currently few college courses, if any, that include coverage of temporal data. Because of what we see as the growing demand for proper temporal support, however, we can expect such courses to appear in the near future. We believe this book can serve as the text for such a course. For academic readers in particular, therefore (students as well as teachers), we should make it clear that we’ve tried to present the foundations of the temporal database field in a way that’s clear, precise, correct, and uncluttered by the baggage—not to say mistakes—that usually, and regrettably, seem to accompany commercial implementations. Thus, we believe the book provides an opportunity to acquire a firm understanding of that crucial foundation material, without being distracted by irrelevancies.

Acknowledgments

First of all, we’re pleased to be able to acknowledge the many friends and colleagues, too numerous to mention individually, who gave encouragement, participated in discussions and research, offered comments (both written and oral) on various drafts of this book and other publications, or helped in a variety of other ways. We’d also like to acknowledge the many conference and seminar attendees, again too numerous to mention individually, who have expressed support for the ideas contained herein. A special vote of thanks goes to our early reviewers Georgia Garani, Erwin Smout, Dimitri Souflis, Rios Viqueira, and Dave Voorhis. We’d also like to thank Cynthia Saracco, Matthias Nicola, and Lenisha Gandhi for permission to quote from their paper “A Matter of Time: Temporal Data Management in DB2 10” in this preface, and Krishna Kulkarni and Jan-Eike Michels for permission to quote from their paper “Temporal Features in SQL:2011” in Chapter 19 (also for their help in answering various technical questions in connection with—and in Krishna’s case reviewing—that chapter). Finally, we are grateful to Andrea Dierna, Steve Elliott, and Kaitlin Herbert, and to all of the other staff at Morgan Kaufmann, for their assistance and their high standards of professionalism. It has been a pleasure to work with them.

Nikos Lorentzos adds: I would like to thank my mother Efrosini, who knows better than anyone how endlessly busy my work keeps me, and Aliki Galati who has always encouraged me in my research. I would also like to express a strong debt of gratitude to Mike Sykes, the first person to express an interest in my work on temporal databases. Thanks to Mike, I came in contact with Hugh Darwen, with whom I have had many fruitful discussions on this topic. And thanks to Hugh, I finally met Chris Date, who has done a great job on this book. I am most grateful to my coauthors, Hugh and Chris, for our fruitful collaboration.

Hugh Darwen adds: I began my study of temporal database issues in the mid 1990s in connection with my work in UKDBL, the working group that formulates U.K. contributions to the development of the international SQL standard. I was joined in that study by my UKDBL colleague Mike Sykes, and together we began to search for alternatives to the temporal proposals then being considered by the SQL standards committee. It was Mike who first discovered the work of Nikos Lorentzos, and I am profoundly grateful to him for realizing that it was not only exactly what he had been looking for but also what he knew I had been looking for. (Mike had been looking for an approach based on intervals in general rather than just time intervals in particular. I had been looking for an approach that did not depart from the relational model.)

Various university contacts in the U.K. were helpful during my initial study period. I would especially like to thank Babis Theodoulidis of the University of Manchester Institute of Science and Technology for setting up a meeting between academics (including Nikos Lorentzos) and members of UKDBL. My subsequent education in the temporal field benefited greatly from discussions with people at IBM’s Almaden Research Center, especially Cliff Leung and later Bob Lyle. Finally, I must mention the participants at the June 1997 workshop on temporal databases in Dagstuhl, Germany, whose output was reference [55]. There were too many for me to name them individually, but I am grateful to them all for what struck me as a most informative, productive, stimulating, lively, and friendly event.

Chris Date adds: Once again I’d like to thank my wife Lindy for her support throughout the production of this book, as well as all of its predecessors. I’d also like to acknowledge the debt I owe my coauthors—Nikos, for doing such a good job of laying the theoretical foundations for the approach described in this book; Hugh, for his persistence in trying to persuade me that I ought to take an interest in temporal database matters in general and Nikos’s work in particular; and both Nikos and Hugh for their efforts in reviewing the numerous iterations this manuscript went through and their patience in correcting my many early errors and misconceptions.

C.J. Date

Healdsburg, California

Hugh Darwen

Shrewley, England

Nikos A. Lorentzos

Athens, Greece

2014


¹ All of the technical features described in this preface as “new” might more accurately be described as “new, except possibly (and to some partial extent) to readers of this book’s 2003 predecessor Temporal Data and the Relational Model.” Note: In this connection, we’d like to mention a couple of prototypes, MighTyD and SIRA_PRISE, that have implemented many of the features described in that earlier book. Details of those prototypes and much related material can be found on the website www.thethirdmanifesto.com.

² CACM = Communications of the ACM.
