Chapter 10. Contract Programming

Ensuring program correctness is a topic of increasing importance in a world where we trust various computing systems, large and small, with ever more bits of our existence. This chapter introduces program correctness mechanisms that kick in at runtime (as opposed to typechecking and other semantic checks, which enforce certain correctness constraints during compilation). Runtime checks for program correctness are only partially related to error handling and should not be confused with it. More specifically, there are three intertwined but distinct areas lying under the generous umbrella of “when things go wrong”:

  • Error handling (the topic of Chapter 9) deals with techniques and idioms for managing expected runtime errors.
  • Reliability engineering is a field that studies the ability of entire systems (e.g., hardware plus software) to perform to specification. (This book does not discuss reliability engineering.)
  • Program correctness is a field of programming language research dedicated to proving with static and dynamic means that a program is correct according to a given specification. Type systems are one of the best-known means for proving program correctness (a read of Wadler’s fascinating monograph “Proofs are programs” [59] is highly recommended). This chapter discusses Contract Programming, a paradigm for enforcing program correctness.

The major aspect that distinguishes program correctness from error handling is that the latter is concerned with errors that fall within the specification of the program (such as dealing with a corrupt data file or invalid user input), whereas the former is concerned with programming errors that put the program’s behavior outside the specification (such as miscalculating a percentage value that is outside the 0 through 100 range or unexpectedly obtaining a negative day of the week in a Date object). Ignoring this important distinction leads to unpardonable but, alas, still common misunderstandings such as checking file and network input with assert.

Contract Programming is an approach to defining software components introduced by Parnas [45], then further popularized by Meyer [40] along with the Eiffel programming language. Today Contract Programming has matured into a popular software development paradigm. Although most mainstream programming languages do not offer explicit support for Contract Programming, many shops have standards and conventions enforcing its underlying principles. Contracts are also an active area of research; recent work includes advanced topics such as contracts for higher-order functions [24] and static verification of contracts [61]. For the time being, D sticks with the simpler, traditional model of Contract Programming, which we’ll discuss in this chapter.

10.1 Contracts

Contract Programming uses a real-life metaphor to improve the definition and verification of modular interfaces. The metaphor is that of a binding contract: when entity A (person, company) commits to perform a certain service for the benefit of entity B, a contract between A and B describes what B is expected to provide to A in exchange for the service, and exactly what A commits to provide once B fulfills its part of the contract.

Similarly, the Contract Programming paradigm defines a function’s specification as a contract between the function (the supplier) and its caller (the client). One part of the contract specifies what requirements the caller must fulfill in order for the function call to proceed. The other part of the contract specifies the guarantees that the function makes upon return in terms of returned value and/or side effects.

The central notions of Contract Programming are as follows:

  • Assertion: Not tied to a particular function, an assertion is a runtime check against an if-testable condition. If the condition is nonzero, assert has no effect. Otherwise, assert throws an AssertError object. AssertError is an unrecoverable exception—it does not inherit Exception but instead inherits Error directly, which means that it shouldn’t normally be caught.
  • Precondition: The precondition of a function is the totality of conditions that a caller must fulfill in order to invoke the function. The conditions may be directly related to the call site (such as parameter values) but also related to the system state (such as availability of memory).
  • Postcondition: The postcondition of a function is the totality of guarantees that the function makes upon normal return, assuming its precondition was satisfied.
  • Invariant: An invariant is a condition that stays unmodified throughout a portion of a computation. In D, invariants always refer to the state of an object before and after a method invocation.

Contract Programming generalizes very nicely some time-tested notions that today we take for granted. For example, a function signature is a contract all right. Consider a function found in the standard library, module std.math:



double sqrt(double x);


The sheer signature imposes a contract: the caller must provide exactly one value of type double, and the function’s return is one double value as well. You can’t call sqrt("hello") or assign the result of sqrt to a string. More interestingly, you can call sqrt(2) even though 2 is an int and not a double: the signature gives the compiler enough information to help the caller fulfill the input requirement by converting 2 to a double. The function may have side effects, but if it doesn’t, the pure attribute may be used to specify that:



// No side effects
pure double sqrt(double x);


This is a stronger, more binding contract for sqrt because it forces sqrt to not have any side effects. Finally, there is the nothrow attribute that allows us to specify an even more detailed (and restrictive) contract:



// No side effects, never throws
// (Actual declaration found in std.math)
pure nothrow double sqrt(double x);


Now we know for sure that the function either returns a double, terminates the program, or enters an infinite loop. There’s nothing else in the world it can ever do. So we were using contracts with functions by just writing down signatures.
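The claims about what a signature accepts and rejects can be checked mechanically. The sketch below uses a hypothetical function half (not part of any library) that carries the same attributes and parameter type as the declaration above; the compile-time tests rely on static assert and __traits(compiles, ...).

```d
// Hypothetical stand-in: only its signature matters for this illustration.
pure nothrow double half(double x) {
    return x / 2;
}

unittest {
    // An int argument is converted to double to satisfy the signature.
    static assert(is(typeof(half(2)) == double));
    assert(half(2) == 1.0);
    // A string argument violates the signature and is rejected during compilation.
    static assert(!__traits(compiles, half("hello")));
    // The result cannot be assigned to a string either.
    static assert(!__traits(compiles, { string s = half(2.0); }));
}
```

One way to run such a block is to compile the file with the -unittest and -main flags of the reference compiler.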

To appreciate the contractual power of function signatures, consider a little piece of historical evidence. The early, pre-standard version of the C language (known as “K&R C” in honor of its creators, Kernighan and Ritchie) had a quirk. If you didn’t declare a function at all, K&R C would consider it a function with this signature:



// If you don't declare sqrt but call it, it's as if
//    you declared it as
int sqrt(...);


In other words, if you forgot to #include the header math.h (which provides the correct signature for sqrt), you could have called sqrt("hello") without the compiler minding it one bit. (The ellipsis introduces varargs, one of the most unsafe features of C.) One more subtle error was that invoking sqrt(2) compiled with or without including math.h but did very different things. With the #include, the compiler converted 2 to 2.0 before calling sqrt; without it, a terrible misunderstanding between parties occurred: the caller sent the integer 2 and sqrt picked up its binary representation as if it were a floating-point number, which in 32-bit IEEE is 2.8026e-45. ANSI C recognized the gravity of this problem and fixed it by requiring prototypes for all functions.
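The value 2.8026e-45 can be reproduced by reinterpreting the bits of the integer 2 as a 32-bit IEEE float. The helper bitsAsFloat below is a hypothetical name introduced only for this illustration; it mimics the misunderstanding described above rather than any real calling convention.

```d
// Hypothetical helper: reads the four bytes of an int as if they were a float.
float bitsAsFloat(int i) {
    return *cast(float*) &i;
}

unittest {
    // Bit pattern 0x00000002 is the subnormal number 2 * 2^-149, about 2.8026e-45.
    float f = bitsAsFloat(2);
    assert(f > 2.80e-45 && f < 2.81e-45);
    // With a proper signature in sight, the value is converted, not reinterpreted.
    float g = 2;
    assert(g == 2.0f);
}
```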

Function attributes and types can be used to specify simple contracts. Attributes are in fixed supply, but types are easy to define whenever needed. How far can types go in describing contracts? The answer is, sadly, that (at least with the current technology) types are not an adequate vehicle for expressing even moderately complex contracts.

A designer could specify a function’s contract in the documentation associated with the function, but I’m sure we all agree that setup is far from satisfactory. Users of a component don’t always peruse its documentation with due care, and even when they do it’s easy to make honest mistakes. Besides, documentation has a way of getting out of sync with design and implementation, particularly when specifications are nontrivial and change frequently (as often happens).

Contract Programming takes a simpler approach of specifying contractual requirements as executable predicates: snippets of code that describe the contract as pass/fail conditions. Let’s take a look at each of the notions introduced above in turn.

10.2 Assertions

This book has defined (§ 2.3.4.1 on page 46) and already used assert in many places—an implied acknowledgment of the notion’s usefulness. In addition, most languages include a sort of assertion mechanism, either as a primitive or as a library construct.

To recap, use the assert expression to state that an expression must be nonzero by design, in all runs of the program regardless of input:



int a, b;
...
assert(a == b);
assert(a == b, "a and b are different");


The asserted expression is often Boolean but may have any if-testable type: numeric, array, class reference, or pointer. If the expression is zero, assert throws an object of type AssertError; otherwise, nothing happens. An optional string parameter is made part of the error message carried by the AssertError object, if thrown. The string is evaluated only if the assertion does fail, which saves some potentially expensive computation:



import std.conv;

void fun() {
   int a, b;
   ...
   assert(a == b);
   assert(a == b, text(a, " and ", b, " are different"));
}


The std.conv.text function converts and concatenates all of its arguments into a string. That entails quite a bit of work—memory allocation, conversions, the works. It would be wasteful to do all that work if the assertion succeeds, so assert evaluates its second argument only if the first is zero.
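A small experiment makes the lazy evaluation of the message visible. The counter messagesBuilt and the function describe are hypothetical names used only for this sketch; the side effect inside describe exists purely to observe whether the message was computed.

```d
import std.conv : text;

int messagesBuilt;   // hypothetical counter, for illustration only

// Builds the failure message and records that it did so.
string describe(int a, int b) {
    ++messagesBuilt;
    return text(a, " and ", b, " are different");
}

unittest {
    int a = 5, b = 5;
    assert(a == b, describe(a, b));   // the condition holds
    assert(messagesBuilt == 0);       // so the message was never built
}
```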

What should assert do in case of a failure? Forcefully terminating the application is an option (and is what C’s homonym macro does), but D’s assert throws an exception. It’s not an ordinary exception, however; it’s an AssertError object, which inherits Error—the über-exception discussed in § 9.2 on page 302.

The AssertError object thrown by assert goes through the catch(Exception) handlers like a hot knife through butter. That’s a good thing because assert failures represent logic errors in your program, and usually you want logic errors to just terminate the application as soon and as orderly as possible.

To catch an AssertError exception, use Error or AssertError directly in a catch handler instead of Exception or its descendants. But then again: you should seldom be in a place in life where catching Errors would help.
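The following sketch shows which handler sees an assertion failure. It assumes a non-release build and imports AssertError from core.exception; the function whoCatches is a hypothetical name, and catching AssertError here serves only the demonstration.

```d
import core.exception : AssertError;

// Hypothetical helper: reports which handler, if any, saw the failure.
string whoCatches(int a, int b) {
    try {
        assert(a == b, "a and b are different");
        return "nothing thrown";
    } catch (Exception e) {
        return "caught as Exception";     // never selected for assert failures
    } catch (AssertError e) {
        return "caught as AssertError";
    }
}

unittest {
    assert(whoCatches(1, 1) == "nothing thrown");
    assert(whoCatches(1, 2) == "caught as AssertError");
}
```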

10.3 Preconditions

Preconditions are contractual obligations that must be satisfied upon a function’s entry. For example, say we want to write a contract enforcing non-negative inputs for a function fun. That would be a precondition imposed by fun on its callers. In D, you write a precondition as follows:



double fun(double x)
in {
   assert(x >= 0);
}
body {
   // Implementation of fun
   ...
}


The in contract is automatically executed before the function’s body. That’s virtually the same as the simpler version:



double fun(double x) {
   assert(x >= 0);
   // Implementation of fun
   ...
}


but we’ll see that it is important to distinguish the precondition from the function’s body when objects and inheritance enter the picture.

Some languages restrict contracts to Boolean expressions and automatically throw an exception if the Boolean is false, for example:



// Not D
double fun(double x)
in (x >= 0)
body {
   ...
}


D is more flexible in that it allows you to check for preconditions that don’t easily lend themselves to single Boolean expressions. Also, you have the freedom of throwing any exception you want, not only an AssertError exception. For example, fun might want to throw some exception type that records the faulty input:



import std.conv, std.contracts;

class CustomException : Exception {
   private string origin;
   private double val;
   this(string msg, string origin, double val) {
      super(msg);
      this.origin = origin;
      this.val = val;
   }
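   override string toString() {
      return text(origin, ": ", super.toString(), " (value: ", val, ")");
   }
}

double fun(double x)
in {
   // !(x >= 0) is also true for NaN, so NaN inputs are rejected too
   if (!(x >= 0)) {
      throw new CustomException("input must be non-negative", "fun", x);
   }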
}
body {
   double y;
   // Implementation of fun
   ...
   return y;
}


But don’t abuse that flexibility. As discussed above, assert throws an AssertError object, which is different from regular exceptions. It is best to use AssertError or other exceptions that inherit Error but not Exception when signaling a precondition failure. This is because precondition failure indicates a serious logic error in your program that is not supposed to get caught casually.

The compiler actually takes explicit steps to disallow contract misuse. First, inside the in clause you cannot execute the return statement, meaning that you can’t use a contract to entirely skip the function’s body. Second, D explicitly disallows changing parameters in a contract. For example, the following code is in error:



double fun(double x)
in {
   if (x <= 0) x = 0; // Error!
      // Cannot modify parameter 'x' inside contract!
}
body {
   double y;
   ...
   return y;
}


Yet, although the compiler could enforce that a contract is pure (which would be a logical decision), it doesn’t. This means you can still alter global variables or generate output from within a contract. This freedom was granted with a purpose: impure uses are useful during debugging sessions, and it would be too restrictive to disallow them. Nevertheless, remember that generally it’s not good style to alter the state of the world from within a contract. Contract code is only supposed to verify observance of the contract and throw an exception if the contract has been violated—nothing else.

10.4 Postconditions

With the in contract in tow, fun is asymmetric and in a certain way unfair. fun specifies its requirements to the caller but provides no guarantee. Why should the caller work hard to provide a non-negative number to fun? To check postconditions, use an out contract. Let’s assume that fun guarantees a result between 0 and 1:



double fun(double x)
// As before
in {
   assert(x >= 0);
}
// added
out(result) {
   assert(result >= 0 && result <= 1);
}
body {
   // Implementation of fun
   double y;
   ...
   return y;
}


If the in contract or the function’s body throws an exception, out does not execute at all. If the in contract passes and body returns normally, the out contract is executed. The parameter result passed to out is whatever the function is about to return. The result parameter is optional; out { ... } is also a valid out contract that doesn’t need the result or applies to a void-returning function. In the example above, result will be a copy of y.
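To see the two contracts at work on a complete function, consider the sketch below. The function squash and its formula are hypothetical, chosen only because the result provably stays between 0 and 1 for non-negative finite inputs. Recent compilers spell the body keyword used in this chapter as do, which is what the sketch uses; the failure test assumes a non-release build.

```d
import core.exception : AssertError;
import std.exception : assertThrown;

// Hypothetical function: maps a non-negative finite x into the range [0, 1).
double squash(double x)
in {
    assert(x >= 0);
}
out (result) {
    assert(result >= 0 && result <= 1);
}
do {   // spelled `body` in the text
    return x / (1 + x);
}

unittest {
    assert(squash(0) == 0);
    assert(squash(1) == 0.5);
    assert(squash(3) == 0.75);
    // A caller that breaks the precondition gets an AssertError.
    assertThrown!AssertError(squash(-1));
}
```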

Just like the in contract, the out contract should only verify without modifying. The only interaction of out contracts with the outer world should be either doing nothing at all (pass) or throwing an exception (fail). In particular, out is not a good place for last-minute result adjustments. Compute the result in body, and check it with out. The following code does not compile for two reasons: the out contract attempts to rebind result and also attempts to (harmlessly but suspiciously) rebind an argument:



int fun(int x)
out(result) {
   x = 42;                     // Error!
      // Cannot modify parameter 'x' in a contract!
   if (result < 0) result = 0; // Error!
      // Cannot modify the result in a contract!
}
body {
   ...
}


10.5 Invariants

An invariant is a condition that remains satisfied at certain milestones during a computation. For example, a pure function ensures that the entire state of the program remains unchanged throughout the execution of the function. Such a guarantee is very strong but often too coarse to be used intensively.

A more granular invariance guarantee may be applied to an individual object, and this is the model D works with. Consider, for example, a simple Date class that stores the day, month, and year as individual integers:



class Date {
   private uint year, month, day;
   ...
}


It is reasonable to posit that at no point in the lifetime of a Date object should the year, month, and day members take nonsensical values. To express such an assumption, use an invariant:


import std.algorithm, std.range;

class Date {
private:
   uint year, month, day;
   invariant() {
      assert(1 <= month && month <= 12);
      switch (day) {
         case 29:
            assert(month != 2 || leapYear(year));
            break;
         case 30:
            assert(month != 2);
            break;
         case 31:
            assert(longMonth(month));
            break;
         default:
            assert(1 <= day && day <= 28);
            break;
      }
      // No restriction on year
   }
   // Helper functions
   static pure bool leapYear(uint y) {
      return (y % 4) == 0 && (y % 100 || (y % 400) == 0);
   }
   static pure bool longMonth(uint m) {
      return !(m & 1) == (m > 7);
   }
public:
   ...
}


The three tests for days 30, 31, and 29 handle the customary verifications for month February and leap year. The test in longMonth returns true if a month has 31 days and works by claiming, “A long month is an even number if and only if it is greater than July,” which makes sense (months 1, 3, 5, 7, 8, 10, and 12 are long).
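The two helpers are compact enough to deserve a quick check. The sketch below copies their bodies into free functions (an assumption made so the block is self-contained) and tests them against the month list given above and a few well-known years.

```d
// Copies of the two Date helpers, written as free functions for this sketch.
pure bool leapYear(uint y) {
    return (y % 4) == 0 && (y % 100 || (y % 400) == 0);
}
pure bool longMonth(uint m) {
    return !(m & 1) == (m > 7);
}

unittest {
    // Divisible by 4, except centuries not divisible by 400.
    assert(leapYear(2012) && leapYear(2000));
    assert(!leapYear(2015) && !leapYear(1900));
    // Months with 31 days.
    foreach (m; [1, 3, 5, 7, 8, 10, 12]) assert(longMonth(m));
    // All the others.
    foreach (m; [2, 4, 6, 9, 11]) assert(!longMonth(m));
}
```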

The invariant must pass for any valid Date object at all times. In theory the compiler could emit calls to the invariant whenever it wants. However, things are not that simple. Consider, for example, that the compiler makes the executive decision to insert a call to invariant at the end of each statement. That would be not only inefficient, but also incorrect. Consider setting a Date from another Date:



// Inside class Date
void copy(Date another) {
  year = another.year;
  __call_invariant();    // Inserted by the compiler
  month = another.month;
  __call_invariant();    // Inserted by the compiler
  day = another.day;
  __call_invariant();    // Inserted by the compiler
}


Between these statements it’s quite possible that the Date is temporarily out of sync, so inserting an invariant evaluation per statement is not correct. (For example, assigning date 1 August 2015 to a date currently containing 29 February 2012 would temporarily make the date be 29 February 2015, which is an invalid date.)

How about inserting an invariant call at the beginning and end of each method? Negative again. Consider, for example, that you write a function that advances a date by one month. Such a function is useful, for example, for tracking events that happen once a month. The function must pay attention only to adjusting the day around the end of the month such that the date goes, for example, from August 31 to September 30.



// Inside class Date
void nextMonth() {
   __call_invariant();              // Inserted by the compiler
   scope(exit) __call_invariant(); // Inserted by the compiler
   if (month == 12) {
      ++year;
      month = 1;
   } else {
      ++month;
      adjustDay();
   }
}
// Ancillary function
private void adjustDay() {
   __call_invariant();             // Inserted by the compiler
                                   // (PROBLEMATIC)
   scope(exit) __call_invariant(); // Inserted by the compiler
                                   // (PROBLEMATIC)
   switch (day) {
      case 29:
         if (month == 2 && !leapYear(year)) day = 28;
         break;
      case 30:
         if (month == 2) day = 28 + leapYear(year);
         break;
      case 31:
         if (month == 2) day = 28 + leapYear(year);
         else if (!longMonth(month)) day = 30;
         break;
      default:
         // Nothing to do
         break;
   }
}


Function nextMonth takes care of year rollover and uses an ancillary private function adjustDay to ensure that the day remains inside a valid date. Here’s exactly where the problem is: upon entrance in adjustDay the invariant may be broken. Of course it might—the sole purpose of adjustDay was to fix the Date object!

What makes adjustDay special? Its protection level: it’s a private function, accessible only to other functions that have the right to modify the Date object. Upon entrance in and exit from a private function, in general, it’s acceptable to have a broken invariant. The places where the invariant must definitely be satisfied are at public method boundaries: an object doesn’t want to allow a client operation to find or leave this in an invalid state.

How about protected functions? According to the discussion in § 6.7.6 on page 201, protected is just one little notch better than public. However, it was deemed that requiring invariant satisfaction at the boundaries of protected functions was too restrictive.

If a class defines an invariant, the compiler automatically inserts calls to the invariant in the following places:

  1. At the end of all constructors
  2. At the beginning of the destructor
  3. At the beginning and end of all public non-static methods

Say we put on X-ray vision goggles that allow us to see the code inserted by the compiler in the Date class. We’d then see this:



class Date {
   private uint day, month, year;
   invariant() { ... }
   this(uint day, uint month, uint year) {
      scope(exit) __call_invariant();
      ...
   }
   ~this () {
      __call_invariant();
      ...
   }
   void somePublicMethod() {
      __call_invariant();
      scope(exit) __call_invariant();
      ...
   }
}


A detail about the constructor and destructor is worth noting. Recall from the discussion of an object’s lifetime (§ 6.3 on page 181) that once allocated, an object is considered valid. Therefore, even if a constructor throws, it must leave the object in an invariant-abiding state.
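The placement rules can be observed on a deliberately tiny class. Month below is a hypothetical, pared-down relative of Date that tracks only the month; the sketch assumes a non-release build, in which a broken invariant surfaces as an AssertError.

```d
import core.exception : AssertError;
import std.exception : assertThrown;

// Hypothetical class used only to observe where the invariant is checked.
class Month {
    private uint month = 1;
    invariant() {
        assert(1 <= month && month <= 12);
    }
    // Public: the invariant is checked on entry and on exit.
    void set(uint m) { month = m; }
    uint get() { return month; }
    // Public: composed of private steps; the invariant is checked only
    // around the whole call, not around each step.
    void advance() {
        bump();
        wrap();
    }
    private void bump() { ++month; }                     // may leave month == 13
    private void wrap() { if (month == 13) month = 1; }  // repairs the state
}

unittest {
    auto m = new Month;
    m.set(12);
    m.advance();             // passes through month == 13 internally
    assert(m.get() == 1);
    // Leaving a public method with a broken invariant is detected.
    assertThrown!AssertError(m.set(13));
}
```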

10.6 Skipping Contract Checks. Release Builds

Contracts are concerned exclusively with verifying the internal logic of an application. In keeping with that charter, most, if not all, programming systems that support contracts also allow a mode in which all contract checking is ignored. That mode is supposed to be activated only with programs that have been thoroughly reviewed, verified, and tested.

Any D compiler provides a flag (-release in the reference implementation) that ignores contracts altogether, that is, parses and typechecks all contract code but leaves no trace of it in the executable binary. A release build runs without contract checking (which is riskier) but also at full speed (which is, well, faster). If the application has its ducks in a row, the added risk of skipping contract checks is very low and the increase in speed is well worth that risk. The possibility of running without contracts reinforces the warning that code should not use contracts for routine checks that could reasonably fail. Contracts must be reserved for never-expected errors that reflect a logic bug in your program. Again, you should never use contracts to make sure that user input is correct. Also, remember the repeated warnings against doing any significant work (such as side effects) inside assert, in, and out? Now it’s painfully obvious why: a program that does such unsavory acts would oddly behave differently in non-release and release mode.

One commonly encountered error is asserting expressions with side effects, for example, assert(++x < y), which is bound to cause much head scratching. It’s the worst of all worlds: the bug manifests itself in release mode, when by definition you have fewer means at your disposal to find the source of the problem.
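One way to stay clear of that trap is to perform the side effect in its own statement and assert only on the resulting value. The function bumpChecked below is a hypothetical example of that discipline.

```d
// Hypothetical function: increments x and checks it against a limit.
int bumpChecked(int x, int limit) {
    // Fragile alternative: assert(++x < limit);
    // in a release build the increment would vanish with the check.
    ++x;                  // the side effect happens in every build
    assert(x < limit);    // only the check may be compiled out
    return x;
}

unittest {
    assert(bumpChecked(1, 10) == 2);
}
```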

10.6.1 enforce Is Not (Quite) assert

It’s a pity that assert disappears from release builds, because using it is very convenient. Instead of writing



if (!expr1) throw new SomeException;
...
if (!expr2) throw new SomeException;
...
if (!expr3) throw new SomeException;


you get to write only



assert(expr1);
...
assert(expr2);
...
assert(expr3);


Given that assert is so concise, many libraries provide an “always assert” feature that checks a condition and throws an exception if the condition is zero, whether you compile in release mode or not. Such checkers go in C++ by names such as VERIFY, ASSERT_ALWAYS, or ENFORCE. D defines such a function in module std.contracts (named std.exception in current releases of the standard library) under the name enforce. Use enforce with the same syntax as assert:



enforce(expr1);
enforce(expr2, "That isn't quite true");


If the passed-in expression is zero, enforce throws an object of type Exception regardless of whether you compiled the program in release or non-release mode. If you want to throw a different type, you may specify it as follows:


import std.contracts;
bool something = true;

enforce(something, new Error("Something isn't right"));


If something is zero, the second argument is thrown; enforce evaluates it lazily such that no object creation occurs if something is nonzero.

Although assert and enforce look and feel very much alike, they serve fundamentally different purposes. Don’t forget the differences between the two:

  • assert checks your application logic, whereas enforce checks error conditions that don’t threaten the integrity of your application.
  • assert throws only the unrecoverable AssertError exception, whereas enforce throws by default a recoverable exception (and may throw any exception with an extra argument).
  • assert may disappear, so don’t take it into consideration when figuring the flow of your function; enforce never disappears, so after you call enforce(e) you can assume that e is nonzero.
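The differences listed above can be seen in a short sketch. It assumes a current standard library, where enforce is imported from std.exception (the module this chapter calls std.contracts); the function checkedPercent and its message are hypothetical.

```d
import std.exception : enforce;   // assumption: current name of std.contracts

// Hypothetical input check: an out-of-range percentage is an expected error.
double checkedPercent(double p) {
    enforce(p >= 0 && p <= 100, "percentage out of range");
    return p;   // p is known to be in range here, in release builds too
}

unittest {
    assert(checkedPercent(42) == 42);
    try {
        checkedPercent(150);
        assert(false, "enforce should have thrown");
    } catch (Exception e) {
        // A recoverable exception: ordinary handlers see it.
        assert(e.msg == "percentage out of range");
    }
}
```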

10.6.2 assert(false)

An assertion against a constant that is known to be zero during compilation, such as assert(false), assert(0), or assert(null), behaves a tad differently from a regular assert.

In non-release mode, assert(false) does not do anything special: it just throws an AssertError exception.

In release mode, however, assert(false) is not compiled out of existence; it will always cause a program to stop. This time, however, there would be no exception and no chance of continuing to run after an assert(false) was hit. The program will crash. This is achieved on Intel machines by executing the HLT (“halt”) instruction, which causes the program to abort immediately.

Many of us tend to think of a crash as a highly dangerous event that indicates a program gone out of control. This disposition is prevalent most likely because many programs that do go out of control terminate, sooner or later, via a crash. But assert(false) is a very controlled way to terminate a program. In fact, on some operating systems, HLT automatically loads your debugger and positions it on the very assert that triggered the crash.

What’s the purpose of this particular behavior of assert(false)? One obvious use has to do with system-level programs. There had to be a portable way to issue HLT, and assert(false) integrates well with the rest of the language. In addition, the compiler is aware of the semantics of assert(false), so, for example, it disallows dead code following an assert(false) expression:



int fun(int x) {
   ++x;
   assert(false);
   return x;          // Error!
                      // Statement is not reachable!
}


Conversely, in other situations you may need to add assert(false) to suppress a compiler error. Consider, for example, calling the standard library function std.contracts.enforce(false) discussed just above:



import std.contracts;

string fun() {
   ...
   enforce(false, "can't continue"); // Always throws
   assert(false);                    // Unreachable
}


The call enforce(false) always throws an exception, but the compiler doesn’t know that. To make the compiler understand that that point cannot possibly be reached, insert an assert(false). Finishing fun with return ""; also works, but in that case, if someone comments out the enforce call later on, fun would start returning bogus values. The assert(false) is a veritable deus ex machina that saves your code from such situations.
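The same technique applies whenever the programmer knows more about control flow than the compiler does. In the hypothetical function sign below, the three tests are exhaustive, yet the compiler would still ask for a return statement at the end of the function; assert(false) states that the point is unreachable.

```d
// Hypothetical function: classifies an integer by sign.
string sign(int x) {
    if (x < 0) return "negative";
    if (x == 0) return "zero";
    if (x > 0) return "positive";
    // The tests above cover every int, but the compiler cannot tell.
    assert(false, "unreachable");
}

unittest {
    assert(sign(-3) == "negative");
    assert(sign(0) == "zero");
    assert(sign(7) == "positive");
}
```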

10.7 Contracts: Not for Scrubbing Input

This section discusses a controversial matter related to contracts that is the source of continuous debate. The matter essentially boils down to this question: If a function must make some check, where should the check go—in a contract or in the function’s body?

When first getting accustomed to Contract Programming, many of us are tempted to move most checks inside contracts. Consider, for example, a function called readText that loads a text file in its entirety as a string. Armed with contracts, we might define it as follows:



import std.file, std.utf;

string readText(in char[] filename)
out(result) {
   std.utf.validate(result);
}
body {
    return cast(string) read(filename);
}


(readText is actually a function in the standard library; you may want to look it up in module std.file.)

readText relies on two other library functions. First, it uses read to load an entire file into a memory buffer. The memory buffer has type void[], which readText casts to string. But it would be incorrect to leave things at that: what if the file contains malformed UTF characters? Second, to validate the cast, the out contract verifies the result by calling std.utf.validate, which throws a UtfException object if the buffer contains an invalid UTF character.

That would be fine, were it not for a fundamental issue: contracts must validate the logic of an application, not the validity of its inputs. Anything that’s not considered an endemic problem of the application does not belong inside contracts. Also, contracts are not supposed to change the semantics of the application—hence D’s intentional curbing of what can be modified inside a contract.

Assuming no contracts fail, an application must run with the same behavior and results with or without actually executing contracts. This is a very simple and memorable litmus test for deciding what’s a contract and what isn’t. Contracts are specification checks, and if the checks go away for a correct implementation, that doesn’t stop the implementation from working! That’s how contracts are meant to work. Expecting that a file is always valid may reveal a positive attitude but should not be part of readText’s specification. A correct definition of readText makes the check an integral part of the function:



import std.file, std.utf;

string readText(in char[] filename) {
   auto result = cast(string) read(filename);
   std.utf.validate(result);
   return result;
}


In light of the discussion so far, the answer to the question regarding check placement is: If the check concerns the application logic, it should go in a contract; otherwise, the check should go in the body of the function and never get skipped.

That sounds great, but how to define “application logic” in applications built out of separate, generic libraries written by independent entities? Consider a large general-purpose library, such as the Microsoft Windows API or the K Desktop Environment. Many applications use APIs like these, and it is inevitable that library functions receive arguments that do not conform to the spec. (In fact, an operating system API may count on receiving all sorts of malformed arguments.) If an application does not fulfill the precondition of a library function call, where does the blame go? It was clearly the fault of the application, but it’s the library that takes the hit—in terms of instability, undefined behavior, corrupted state inside the library, crashes, all those bad things. As unfair as it may seem, such problems would reflect poorly on the library (“Library Xyz is prone to instability and surprising quirks”) more than on the bug-ridden applications using it.

A general-purpose and large-distribution API should verify all inputs to all of its functions as a matter of course—not in contracts. Failure to verify an argument is unequivocally a library bug. No spokesperson would ever wave a copy of a book or paper and say, “We were using Contract Programming throughout, so we’re not at fault.”

Does that invalidate the argument that functions should use preconditions to specify, for example, argument ranges? Not at all. It’s all a matter of defining and distinguishing “application logic” from “user input.” To a function that’s an integral part of an application, receiving valid arguments is part of the application logic. To a general-purpose function belonging to an independently delivered library, arguments are nothing but user input.

On the other hand, it is perfectly fine for a library to use contracts in its private functions. Those functions relate to the internal workings of the library and cannot be accessed by user code, so it is sensible to have them use contracts to express adherence to specification.
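The split between the two kinds of checks can be sketched as follows. This is only an illustration under stated assumptions: percentOf and scale are hypothetical names, not part of any real library. The public function treats its arguments as user input and checks them with enforce from std.exception, while the private helper states its precondition as an in contract. The sketch writes do where this chapter writes body; newer D compilers accept do in that position.

```d
import std.exception : enforce;

// Hypothetical library code; percentOf and scale are illustrative names.

/// Public entry point: arguments count as user input, so the checks
/// live in the body and remain active in release builds.
double percentOf(double part, double whole) {
   enforce(whole > 0, "whole must be positive");
   enforce(part >= 0 && part <= whole, "part must lie in [0, whole]");
   return scale(part / whole);
}

/// Private helper: only library code can call it, so a violated
/// precondition here would be a bug in the library itself.
private double scale(double ratio)
in {
   assert(ratio >= 0 && ratio <= 1);
}
do {
   return ratio * 100;
}

unittest {
   import std.exception : assertThrown;
   assert(percentOf(1, 4) == 25);
   // Invalid input surfaces as an ordinary exception
   assertThrown(percentOf(5, 4));
}
```

Compiling out the contract of scale changes nothing for a correct library, whereas the enforce calls in percentOf keep guarding the public boundary.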

10.8 Contracts and Inheritance

The often-quoted Liskov Substitution Principle [38] states that inheritance is substitutability: an object of the derived class must be substitutable wherever an object of the base class is expected. This insight essentially determines the interaction of contracts with inheritance.

In the real world, the relationship between contracts and substitutability is as follows: once a contract is established, a substitute contractor must be at least as qualified to perform the job, deliver the job within at least the specified tolerance, and require at most the same compensation that was established in the contract. There is some flexibility, but never in the direction of tightening the preconditions of the contract or loosening the postconditions. If either of these happens, the contract becomes invalid and must be rewritten. The flexibility concerns only variations that don’t negatively affect the understanding in the contract: a substitute is allowed to require less and offer more.

10.8.1 Inheritance and in Contracts

Consider the Date example again. Let’s say we define a very simple, lightweight BasicDate class that offers only minimal support and leaves enhancements to derived classes. BasicDate offers a function format that takes a string representing a format specification and returns a string with the date formatted appropriately:


import std.conv;

class BasicDate {
   private uint day, month, year;
   string format(string spec)
   in {
      // Require spec to be equal to "%Y/%m/%d"
      assert(spec == "%Y/%m/%d");
   }
   body {
      // Simplistic implementation
      return text(year, '/', month, '/', day);
   }
   ...
}


The contract imposed by BasicDate.format requires that the format specification be exactly "%Y/%m/%d", which we assume means “year in long format followed by a slash followed by month followed by a slash followed by day.” That’s the only format BasicDate worries about supporting. Derived classes may add localization, internationalization, the works.

A class Date that inherits BasicDate wants to offer a better format primitive—for example, say Date wants to allow the specifiers %Y, %m, and %d in any positions and mixed with arbitrary characters. Also, %% should be allowed because it represents the actual character %. Repeated occurrences of the same specifiers should also be allowed. To enforce all that, Date writes its own contract:


import std.regex;

class Date : BasicDate {
   override string format(string spec)
   in {
      auto pattern = regex("^(%[mdY%]|[^%])*$");
      assert(!match(spec, pattern).empty);
   }
   body {
      string result;
      ...
      return result;
   }
   ...
}


Date enforces its constraints on spec with the help of a regular expression. Regular expressions are an invaluable aid in string manipulation; Friedl’s classic Mastering Regular Expressions [26] is warmly recommended. This is not the place to discuss regular expressions in depth, but suffice it to say that "^(%[mdY%]|[^%])*$" means “the beginning of the string; then a % followed by any of m, d, Y, or %, or anything other than a %, repeated zero or more times; then the end of the string.” The anchors ^ and $ matter: without them the pattern would match the empty prefix of any string, and the check would always pass. The equivalent code that would match such a pattern by hand would be considerably more verbose. The assert makes sure that matching the string against the pattern returns a non-empty match, that is, it worked. (For more on using regular expressions with D, you may want to peruse the online documentation of the standard module std.regex.)
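To see which specifications such a check accepts, here is a minimal sketch that wraps an anchored form of the pattern in a standalone predicate. The name isValidSpec is hypothetical, and the sketch uses matchFirst from std.regex; the ^ and $ anchors force the whole string, not just a prefix, to conform.

```d
import std.regex : matchFirst, regex;

// Hypothetical helper, not part of the text: reports whether spec
// consists only of %Y, %m, %d, %%, and characters other than %.
bool isValidSpec(string spec) {
   // ^ and $ anchor the match to the entire string
   auto pattern = regex(`^(%[mdY%]|[^%])*$`);
   return !matchFirst(spec, pattern).empty;
}

unittest {
   assert(isValidSpec("%Y/%m/%d"));
   assert(isValidSpec("%d.%m.%Y, 100%%"));
   assert(isValidSpec(""));       // zero repetitions are allowed
   assert(!isValidSpec("%x"));    // unknown specifier
   assert(!isValidSpec("50%"));   // dangling % at the end
}
```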

What is the aggregate contract of Date.format? It should mind BasicDate.format’s contract but also relax it. It’s fine if the base in contract fails, as long as the derived in contract passes. Also, Date.format’s contract should never strengthen BasicDate.format’s in contract. The emerging rule is as follows: In an overridden method, first execute the base class contract. If that succeeds, transfer control to the body. Otherwise, execute the derived class contract. If that succeeds, transfer control to the body. Otherwise, report contract failure.

Put another way, the in contracts are combined by using disjunction with short-circuit: at least one must pass, and the base class contract is tried first. That way there is no possibility that the derived contract is more difficult to satisfy than the base class contract. On the contrary, the derived class offers a second chance for failed preconditions.

The rule above works very well for Date and BasicDate. First, the composite contract checks against the exact pattern "%Y/%m/%d". If that succeeds, formatting proceeds. Failing that, conformance to the derived, more permissive, contract is checked. If that passes, again formatting may proceed.

The code generated for the combined contract looks like this:


void __in_contract_Date_format(string spec) {
   try {
      // Try the base contract
      this.BasicDate.__in_contract_format(spec);
   } catch (Throwable) {
      // Base contract failed, try derived contract
      this.Date.__in_contract_format(spec);
   }
   // Success, can invoke body
}
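The effect of the disjunction can be observed with a pair of small classes. This is a sketch for illustration only: Halver and LenientHalver are hypothetical names, and do stands where this chapter writes body (newer D compilers accept it in that position).

```d
// Hypothetical classes, for illustration only.
class Halver {
   int half(int n)
   in {
      // Base precondition: positive even numbers only
      assert(n > 0 && n % 2 == 0);
   }
   do {
      return n / 2;
   }
}

class LenientHalver : Halver {
   override int half(int n)
   in {
      // Relaxed precondition: any non-negative number
      assert(n >= 0);
   }
   do {
      return n / 2;
   }
}

unittest {
   Halver h = new LenientHalver;
   assert(h.half(8) == 4);   // the base contract already passes
   assert(h.half(7) == 3);   // the base contract fails, the derived one passes
}
```

A call such as h.half(-2) would fail both contracts and therefore be reported as a contract failure when contracts are compiled in.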


10.8.2 Inheritance and out Contracts

With out contracts the situation is exactly the opposite: when substituting a derived object for a base object, the overridden function must offer at least what the contract promised. So right off the bat, the out guarantee of the base must always be fulfilled by the overriding method (unlike the case for the in contract).

Consequently, a base class should set the contract as loose as is useful, to avoid the risk of over-constraining derived classes. For example, if BasicDate.format imposes that the returned string has the format year/month/day, it would effectively prevent any derived class from performing any other formatting. Perhaps BasicDate.format could impose a weaker contract: for example, if the formatting string is not empty, an empty string is not allowed as output:


import std.range, std.string;

class BasicDate {
   private uint day, month, year;
   string format(string spec)
   out(result) {
      assert(!result.empty || spec.empty);
   }
   body {
      return std.string.format("%04s/%02s/%02s", year, month, day);
   }
   ...
}


Date sets its ambitions a bit higher: it computes the expected result length from the format specification and then compares the length of the actual result to the expected length:


import std.range;

class Date : BasicDate {
   override string format(string spec)
   out(result) {
      bool escaping;
      size_t expectedLength;
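      // Iterate by code point so the count agrees with walkLength
      foreach (dchar c; spec) {
         switch (c) {
            case '%':
               if (escaping) {
                  // "%%" produces a single literal %
                  ++expectedLength;
                  escaping = false;
               } else {
                  escaping = true;
               }
               break;
            case 'Y':
               // %Y expands to four digits; a plain Y is one literal character
               expectedLength += escaping ? 4 : 1;
               escaping = false;
               break;
            case 'm': case 'd':
               // %m and %d expand to two digits each
               expectedLength += escaping ? 2 : 1;
               escaping = false;
               break;
            default:
               assert(!escaping);
               ++expectedLength;
               break;
         }
      }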
      assert(walkLength(result) == expectedLength);
   }
   body {
      string result;
      ...
      return result;
   }
   ...
}


(Why walkLength(result) instead of result.length? The number of characters in a UTF-encoded string may be smaller than its length in chars.) Given these two out contracts, what is the correct combined out contract? The answer is simple: the contract of the base class must also be verified. Then, if the derived class promises additional contractual obligations, those must be fulfilled as well. It’s a simple conjunction. The code below is what the compiler might generate for composing the base and derived contracts:


void __out_contract_Date_format(string spec) {
   this.BasicDate.__out_contract_format(spec);
   this.Date.__out_contract_format(spec);
   // Success
}
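A runnable counterpart of this conjunction might look like the sketch below. The classes Source and EvenSource are hypothetical and exist only for illustration; do stands where this chapter writes body. Every value returned through a Source reference has to satisfy both promises.

```d
// Hypothetical classes, for illustration only.
class Source {
   int next()
   out (result) {
      // Base promise: the result is never negative
      assert(result >= 0);
   }
   do {
      return 0;
   }
}

class EvenSource : Source {
   private int current;

   override int next()
   out (result) {
      // Additional promise: the result is even
      assert(result % 2 == 0);
   }
   do {
      current += 2;
      return current;
   }
}

unittest {
   Source s = new EvenSource;
   assert(s.next() == 2);   // non-negative and even
   assert(s.next() == 4);
}
```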


10.8.3 Inheritance and invariant Contracts

Just as in the case of out contracts, we’re looking at a conjunction, an “and” relation: a class must fulfill the invariant of all of its base classes in addition to its own invariant. There is no way for a class to weaken the invariant of its base class. The current compiler calls invariant() clauses from the top of the hierarchy down, but that should not matter at all for the implementor of an invariant; as discussed, invariants should have no side effects.
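The following sketch illustrates the conjunction of invariants under stated assumptions: Account and CappedAccount are hypothetical classes, and the cap of 1,000 is an arbitrary value chosen for the example. The invariants read the field directly rather than calling a public method, because public methods themselves trigger invariant checks.

```d
// Hypothetical classes, for illustration only.
class Account {
   protected long balance;

   invariant() {
      // Base invariant: the balance never goes negative
      assert(balance >= 0);
   }

   void deposit(long amount) {
      balance += amount;
   }

   long total() const {
      return balance;
   }
}

class CappedAccount : Account {
   invariant() {
      // Additional invariant; the base invariant still applies
      assert(balance <= 1_000);
   }
}

unittest {
   Account a = new CappedAccount;
   a.deposit(400);
   a.deposit(600);
   assert(a.total == 1_000);   // within both invariants
}
```

With contracts compiled in, a further deposit would violate the derived invariant, and a negative deposit on an empty account would violate the base one.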

10.9 Contracts in Interfaces

Possibly the most interesting application of contracts is in conjunction with interfaces. An interface is a complex contract, so it is fitting that each of an interface’s methods should describe an abstract contract—a contract without a body. The contract is enforced in terms of the not-yet-implemented primitives defined by the interface.

Consider, for example, that we want to enhance the Stack interface defined in § 6.14 on page 233. Here it is for reference:


interface Stack(T) {
   @property bool empty();
   @property ref T top();
   void push(T value);
   void pop();
}


Let’s attach contracts to the interface that reveal the interplay of these primitives. Interface contracts look just like regular contracts without a body.


interface Stack(T) {
   @property bool empty();
   @property ref T top()
   in {
      assert(!empty);
   }

   void push(T value)
   out {
      // Pushing is always allowed, so push has no in contract
      assert(value == top);
   }

   void pop()
   in {
      assert(!empty);
   }
}


For an interface method with a contract, the trailing semicolon is not needed anymore. With the new definition of Stack, implementations are constrained to work within the confines defined by Stack’s contracts. One nice thing is that the contract-enhanced Stack is a good specification of a stack that is at the same time easy to read by a programmer and verified dynamically.
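As an illustration, here is a minimal sketch of an implementation that lives under such contracts. ArrayStack is a hypothetical name, and the sketch assumes that push carries no emptiness precondition (otherwise an empty stack could never be filled). The interface is repeated so the example is self-contained.

```d
interface Stack(T) {
   @property bool empty();

   @property ref T top()
   in {
      assert(!empty);
   }

   void push(T value)
   out {
      assert(value == top);
   }

   void pop()
   in {
      assert(!empty);
   }
}

// Hypothetical implementation backed by a dynamic array.
class ArrayStack(T) : Stack!T {
   private T[] items;

   @property bool empty() { return items.length == 0; }
   @property ref T top() { return items[$ - 1]; }
   void push(T value) { items ~= value; }
   void pop() { items = items[0 .. $ - 1]; }
}

unittest {
   Stack!int s = new ArrayStack!int;
   assert(s.empty);
   s.push(3);
   s.push(5);
   assert(s.top == 5);
   s.pop();
   assert(s.top == 3);
}
```

ArrayStack contains no checks of its own; the interface's contracts cover every implementation when contracts are compiled in.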

As discussed in § 10.7 on page 327, Stack’s contracts may be compiled out. If you define a container library for large and general use, it may be a good idea to treat method calls as user input. In that case, the NVI idiom (§ 6.9.1 on page 213) may be better suited. A stack interface that uses NVI to always check for valid calls would look like this:


import std.exception;

interface NVIStack(T) {
protected:
   ref T topImpl();
   void pushImpl(T value);
   void popImpl();

public:
   @property bool empty();

   final @property ref T top() {
      enforce(!empty);
      return topImpl();
   }

   final void push(T value) {
      // Pushing onto an empty stack is legal, so only the result is checked
      pushImpl(value);
      enforce(value == topImpl());
   }

   final void pop() {
      enforce(!empty);
      popImpl();
   }
}


NVIStack uses enforce throughout, a test that cannot be compiled out of existence, and also makes push, pop, and top final and hence impossible for implementations to hijack. One nice effect is that all major error handling has been hoisted out of the individual implementations and into the interface, which is a good form of reuse and of dividing responsibilities. NVIStack implementations can assume without fear that pushImpl, popImpl, and topImpl are always called in valid states and optimize them accordingly.
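The division of labor can be sketched with a variant of the same idea. As a loudly stated assumption, the sketch uses an abstract class (the hypothetical CheckedStack) instead of an interface, so that it relies only on protected abstract methods; ArrayStack is likewise a hypothetical implementation. The final methods perform every check, and the implementation performs none.

```d
import std.exception : enforce;

// Hypothetical abstract-class rendition of the NVI stack.
abstract class CheckedStack(T) {
   protected abstract ref T topImpl();
   protected abstract void pushImpl(T value);
   protected abstract void popImpl();

   abstract @property bool empty();

   final @property ref T top() {
      enforce(!empty, "top called on an empty stack");
      return topImpl();
   }

   final void push(T value) {
      pushImpl(value);
      enforce(value == topImpl(), "push did not store the value");
   }

   final void pop() {
      enforce(!empty, "pop called on an empty stack");
      popImpl();
   }
}

// The implementation never validates its state: the final methods
// above guarantee that the Impl functions run only in valid states.
class ArrayStack(T) : CheckedStack!T {
   private T[] items;

   override @property bool empty() { return items.length == 0; }
   protected override ref T topImpl() { return items[$ - 1]; }
   protected override void pushImpl(T value) { items ~= value; }
   protected override void popImpl() { items = items[0 .. $ - 1]; }
}

unittest {
   import std.exception : assertThrown;
   auto s = new ArrayStack!int;
   s.push(1);
   assert(s.top == 1);
   s.pop();
   // Misuse is reported as an exception even in release builds
   assertThrown(s.pop());
   assertThrown(s.top);
}
```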
