Ensuring program correctness is a topic of increasing importance in a world where we trust various computing systems, large and small, with ever more bits of our existence. This chapter introduces program correctness mechanisms that kick in at runtime (as opposed to typechecking and other semantic checks, which enforce certain correctness constraints during compilation). Runtime checks for program correctness are only partially related to error handling and should not be confused with it. More specifically, there are three intertwined but distinct areas lying under the generous umbrella of "when things go wrong."
The major aspect that distinguishes program correctness from error handling is that the latter is concerned with errors that fall within the specification of the program (such as dealing with a corrupt data file or invalid user input), whereas the former is concerned with programming errors that put the program's behavior outside the specification (such as miscalculating a percentage value that is outside the 0 through 100 range or unexpectedly obtaining a negative day of the week in a Date object). Ignoring this important distinction leads to unpardonable but, alas, still common misunderstandings, such as checking file and network input with assert.
Contract Programming is an approach to defining software components introduced by Parnas [45], then further popularized by Meyer [40] along with the Eiffel programming language. Today Contract Programming has matured into a popular software development paradigm. Although most mainstream programming languages do not offer explicit support for Contract Programming, many shops have standards and conventions enforcing its underlying principles. Contracts are also an active area of research; recent work includes advanced topics such as contracts for higher-order functions [24] and static verification of contracts [61]. For the time being, D sticks with the simpler, traditional model of Contract Programming, which we’ll discuss in this chapter.
Contract Programming uses a real-life metaphor to improve the definition and verification of modular interfaces. The metaphor is that of a binding contract: when entity A (person, company) commits to perform a certain service for the benefit of entity B, a contract between A and B describes what B is expected to provide to A in exchange for the service, and exactly what A commits to provide once B fulfills its part of the contract.
Similarly, the Contract Programming paradigm defines a function’s specification as a contract between the function (the supplier) and its caller (the client). One part of the contract specifies what requirements the caller must fulfill in order for the function call to proceed. The other part of the contract specifies the guarantees that the function makes upon return in terms of returned value and/or side effects.
The central notions of Contract Programming are the assert expression, the precondition, the postcondition, and the invariant, each discussed in this chapter.

The assert expression takes an if-testable condition. If the condition is nonzero, assert has no effect. Otherwise, assert throws an AssertError object. AssertError is an unrecoverable exception—it does not inherit Exception but instead inherits Error directly, which means that it shouldn't normally be caught.

Contract Programming generalizes very nicely some time-tested notions that today we take for granted. For example, a function signature is a contract all right. Consider a function found in the standard library, module std.math:

// Computes the square root of x
double sqrt(double x);
The sheer signature imposes a contract: the caller must provide exactly one value of type double, and the function's return is one double value as well. You can't call sqrt("hello") or assign the result of sqrt to a string. More interestingly, you can call sqrt(2) even though 2 is an int and not a double: the signature gives the compiler enough information to help the caller fulfill the input requirement by converting 2 to a double. The function may have side effects, but if it doesn't, the pure attribute may be used to specify that:

// No side effects
pure double sqrt(double x);

This is a stronger, more binding contract for sqrt because it forces sqrt to not have any side effects. Finally, there is the nothrow attribute that allows us to specify an even more detailed (and restrictive) contract:
// No side effects, never throws
// (Actual declaration found in std.math)
pure nothrow double sqrt(double x);
Now we know for sure that the function either returns a double, terminates the program, or enters an infinite loop. There's nothing else in the world it can ever do. So we were already using contracts with functions just by writing down signatures.
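As a quick illustration (my own sketch, not from the original text), the attributes compose naturally with ordinary function definitions, and the compiler enforces the contract they express:

```d
// Sketch: a function carrying the strongest kind of contract discussed above.
// pure: no side effects; nothrow: never throws.
pure nothrow int square(int x) {
    return x * x;
}

void main() {
    // The compiler would reject square's body if it wrote to a global
    // variable or called something that may throw.
    assert(square(7) == 49);
}
```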
To appreciate the contractual power of function signatures, consider a little piece of historical evidence. The early, pre-standard version of the C language (known as “K&R C” in honor of its creators, Kernighan and Ritchie) had a quirk. If you didn’t declare a function at all, K&R C would consider it a function with this signature:
// If you don't declare sqrt but call it, it's as if
// you declared it as
int sqrt(...);
In other words, if you forgot to #include the header math.h (which provides the correct signature for sqrt), you could have called sqrt("hello") without the compiler minding it one bit. (The ellipsis introduces varargs, one of the most unsafe features of C.) A more subtle error was that invoking sqrt(2) compiled with or without including math.h but did very different things. With the #include, the compiler converted 2 to 2.0 before calling sqrt; without it, a terrible misunderstanding between parties occurred: the caller sent the integer 2, and sqrt picked up its binary representation as if it were a floating-point number, which in 32-bit IEEE format is about 2.8026e-45. ANSI C recognized the gravity of this problem and fixed it by requiring prototypes for all functions.
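The misunderstanding described above can be reproduced deliberately. The following sketch (mine, not from the original text) reinterprets the bits of the integer 2 as a 32-bit IEEE float, yielding the tiny denormal value mentioned:

```d
import std.stdio;

void main() {
    int i = 2;
    // Reinterpret the integer's bit pattern as a float, much as a K&R-era
    // sqrt effectively did when handed an int without a prototype in scope.
    float f = *cast(float*) &i;
    writefln("%g", f); // approximately 2.8026e-45 (a denormal)
    assert(f > 0 && f < 1e-44);
}
```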
Function attributes and types can be used to specify simple contracts. Attributes are in fixed supply, but types are easy to define whenever needed. How far can types go in describing contracts? The answer is, sadly, that (at least with the current technology) types are not an adequate vehicle for expressing even moderately complex contracts.
A designer could specify a function’s contract in the documentation associated with the function, but I’m sure we all agree that setup is far from satisfactory. Users of a component don’t always peruse its documentation with due care, and even when they do it’s easy to make honest mistakes. Besides, documentation has a way of getting out of sync with design and implementation, particularly when specifications are nontrivial and change frequently (as often happens).
Contract Programming takes a simpler approach of specifying contractual requirements as executable predicates—snippets of code that describe the contract as pass/fail conditions. Let’s take a look at each in turn.
This book has defined (§ 2.3.4.1 on page 46) and already used assert in many places—an implied acknowledgment of the notion's usefulness. In addition, most languages include some sort of assertion mechanism, either as a primitive or as a library construct.

To recap, use the assert expression to ensure that an expression is supposed to be nonzero by design, in all runs of the program regardless of input:

assert(expression);
The asserted expression is often Boolean but may have any if-testable type: numeric, array, class reference, or pointer. If the expression is zero, assert throws an object of type AssertError; otherwise, nothing happens. An optional string parameter is made part of the error message carried by the AssertError object, if thrown. The string is evaluated only if the assertion does fail, which saves some potentially expensive computation:
import std.conv;

void fun() {
    int a, b;
    ...
    assert(a == b);
    assert(a == b, text(a, " and ", b, " are different"));
}
The std.conv.text function converts and concatenates all of its arguments into a string. That entails quite a bit of work—memory allocation, conversions, the works. It would be wasteful to do all that work if the assertion succeeds, so assert evaluates its second argument only if the first is zero.
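The lazy evaluation of the message is easy to observe. In the sketch below (my own illustration), a counter tracks whether the message-building helper ever runs:

```d
import std.conv;

int messageBuilds; // counts how many times the message is constructed

string report(int a, int b) {
    ++messageBuilds;
    return text(a, " and ", b, " are different");
}

void main() {
    int a = 3, b = 3;
    assert(a == b, report(a, b)); // passes, so report is never called
    assert(messageBuilds == 0);
}
```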
What should assert do in case of a failure? Forcefully terminating the application is an option (and is what C's homonym macro does), but D's assert throws an exception. It's not an ordinary exception, however; it's an AssertError object, which inherits Error—the über-exception discussed in § 9.2 on page 302.
The AssertError object thrown by assert goes through catch(Exception) handlers like a hot knife through butter. That's a good thing, because assert failures represent logic errors in your program, and usually you want logic errors to terminate the application as soon and as orderly as possible.
To catch an AssertError exception, use Error or AssertError directly in a catch handler instead of Exception or its descendants. But then again: you should seldom be in a place in life where catching Errors would help.
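To see the "hot knife" behavior in action, the following sketch (my own; it assumes AssertError is importable from core.exception, as in current D runtimes, and a non-release build) shows an assertion failure sailing past a catch(Exception) handler:

```d
import core.exception : AssertError;

void main() {
    bool caughtAsException, caughtAsError;
    try {
        assert(false, "logic error");
    } catch (Exception e) {
        caughtAsException = true; // never reached: AssertError is not an Exception
    } catch (AssertError e) {
        caughtAsError = true;     // reached: Errors must be named explicitly
    }
    assert(!caughtAsException && caughtAsError);
}
```

Catching AssertError like this is shown only for demonstration; as the text notes, you should rarely do it in real code.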
Preconditions are contractual obligations that must be satisfied upon a function's entry. For example, say we want to write a contract enforcing non-negative inputs for a function fun. That would be a precondition imposed by fun on its callers. In D, you write a precondition as follows:
double fun(double x)
in {
    assert(x >= 0);
}
body {
    // Implementation of fun
    ...
}
The in contract is automatically executed before the function's body. That's virtually the same as the simpler version

double fun(double x) {
    assert(x >= 0);
    // Implementation of fun
    ...
}

but we'll see that it is important to distinguish the precondition from the function's body when objects and inheritance enter the picture.
Some languages restrict contracts to Boolean expressions and automatically throw an exception if the Boolean is false.
D is more flexible in that it allows you to check for preconditions that don't easily lend themselves to single Boolean expressions. Also, you have the freedom of throwing any exception you want, not only an AssertError exception. For example, fun might want to throw some exception type that records the faulty input:
import std.conv, std.contracts;

class CustomException : Exception {
    private string origin;
    private double val;
    this(string origin, double val) {
        super("Invalid input");
        this.origin = origin;
        this.val = val;
    }
    override string toString() {
        return text(origin, ": ", super.toString(), " (", val, ")");
    }
}

double fun(double x)
in {
    if (x !>= 0) { // !>= means "not greater than or equal"; also true for NaN
        throw new CustomException("fun", x);
    }
}
body {
    double y;
    // Implementation of fun
    ...
    return y;
}
But don’t abuse that flexibility. As discussed above, assert
throws an AssertError
object, which is different from regular exceptions. It is best to use AssertError
or other exceptions that inherit Error
but not Exception
when signaling a precondition failure. This is because precondition failure indicates a serious logic error in your program that is not supposed to get caught casually.
The compiler actually takes explicit steps to disallow contract misuse. First, inside the in clause you cannot execute the return statement, meaning that you can't use a contract to entirely skip the function's body. Second, D explicitly disallows changing parameters in a contract. For example, the following code is in error:
double fun(double x)
in {
    if (x <= 0) x = 0; // Error!
                       // Cannot modify parameter 'x' inside contract!
}
body {
    double y;
    ...
    return y;
}
Yet, although the compiler could enforce that a contract is pure (which would be a logical decision), it doesn't. This means you can still alter global variables or generate output from within a contract. This freedom was granted with a purpose: impure uses are useful during debugging sessions, and it would be too restrictive to disallow them. Nevertheless, remember that generally it's not good style to alter the state of the world from within a contract. Contract code is only supposed to verify observance of the contract and throw an exception if the contract has been violated—nothing else.
With the in contract in tow, fun is asymmetric and in a certain way unfair. fun specifies its requirements to the caller but provides no guarantee. Why should the caller work hard to provide a non-negative number to fun? To check postconditions, use an out contract. Let's assume that fun guarantees a result between 0 and 1:
double fun(double x)
in { // As before
    assert(x >= 0);
}
out(result) { // Added
    assert(result >= 0 && result <= 1);
}
body {
    // Implementation of fun
    double y;
    ...
    return y;
}
If the in contract or the function's body throws an exception, out does not execute at all. If the in contract passes and the body returns normally, the out contract is executed. The parameter result passed to out is whatever the function is about to return. The result parameter is optional; out { ... } is also a valid out contract that doesn't need the result or applies to a void-returning function. In the example above, result will be a copy of y.
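Putting in and out together, here is a small self-contained sketch (my own example, using a hypothetical clamping function; newer compilers spell body as do):

```d
// Sketch: clamp01 promises a result in [0, 1] for any non-negative input.
double clamp01(double x)
in {
    assert(x >= 0);                       // precondition on the caller
}
out(result) {
    assert(result >= 0 && result <= 1);   // postcondition on the function
}
body {
    return x > 1 ? 1 : x;
}

void main() {
    assert(clamp01(0.5) == 0.5);
    assert(clamp01(42.0) == 1.0);
}
```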
Just like the in contract, the out contract should only verify without modifying. The only interaction of out contracts with the outer world should be either doing nothing at all (pass) or throwing an exception (fail). In particular, out is not a good place for last-minute result adjustments. Compute the result in the body, and check it with out. The following code does not compile, for two reasons: the out contract attempts to rebind the result and also attempts to (harmlessly but suspiciously) rebind an argument:
int fun(int x)
out(result) {
    x = 42;                     // Error!
                                // Cannot modify parameter 'x' in a contract!
    if (result < 0) result = 0; // Error!
                                // Cannot modify the result in a contract!
}
body {
    ...
}
An invariant is a condition that remains satisfied at certain milestones during a computation. For example, a pure function ensures that the entire state of the program remains unchanged throughout the execution of the function. Such a guarantee is very strong but often too coarse to be used intensively.
A more granular invariance guarantee may be applied to an individual object, and this is the model D works with. Consider, for example, a simple Date class that stores the day, month, and year as individual integers:

class Date {
    private uint year, month, day;
    ...
}

It is reasonable to posit that at no point in the lifetime of a Date object should the year, month, and day members take nonsensical values. To express such an assumption, use an invariant:
class Date {
private:
    uint year, month, day;

    invariant() {
        assert(1 <= month && month <= 12);
        switch (day) {
        case 29:
            assert(month != 2 || leapYear(year));
            break;
        case 30:
            assert(month != 2);
            break;
        case 31:
            assert(longMonth(month));
            break;
        default:
            assert(1 <= day && day <= 28);
            break;
        }
        // No restriction on year
    }

    // Helper functions
    static pure bool leapYear(uint y) {
        return (y % 4) == 0 && (y % 100 || (y % 400) == 0);
    }
    static pure bool longMonth(uint m) {
        return !(m & 1) == (m > 7);
    }

public:
    ...
}
The three tests for days 29, 30, and 31 handle the customary verifications for February and leap years. The test in longMonth returns true if a month has 31 days; it works by claiming, "A long month's number is even if and only if it is greater than July," which checks out (months 1, 3, 5, 7, 8, 10, and 12 are long).
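A quick sanity check of the two helpers (a sketch of mine, with the helpers lifted out of the class for easy testing) confirms the claim:

```d
pure bool leapYear(uint y) {
    // Leap if divisible by 4 and either not divisible by 100 or divisible by 400
    return (y % 4) == 0 && (y % 100 || (y % 400) == 0);
}

pure bool longMonth(uint m) {
    // Even month number if and only if past July
    return !(m & 1) == (m > 7);
}

void main() {
    assert(leapYear(2012) && leapYear(2000));
    assert(!leapYear(2011) && !leapYear(1900));
    uint[] longOnes;
    foreach (m; 1u .. 13u)
        if (longMonth(m)) longOnes ~= m;
    assert(longOnes == [1u, 3, 5, 7, 8, 10, 12]);
}
```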
The invariant must pass for any valid Date object at all times. In theory the compiler could emit calls to the invariant whenever it wants. However, things are not that simple. Consider, for example, that the compiler makes the executive decision to insert a call to the invariant at the end of each statement. That would be not only inefficient, but also incorrect. Consider setting a Date from another Date:
// Inside class Date
void copy(Date another) {
    year = another.year;
    __call_invariant(); // Inserted by the compiler
    month = another.month;
    __call_invariant(); // Inserted by the compiler
    day = another.day;
    __call_invariant(); // Inserted by the compiler
}
Between these statements it's quite possible that the Date is temporarily out of sync, so inserting an invariant evaluation per statement is not correct. (For example, assigning the date 1 August 2015 to a date currently containing 29 February 2012 would temporarily make the date 29 February 2015, which is an invalid date.)
How about inserting an invariant call at the beginning and end of each method? Negative again. Consider, for example, that you write a function that advances a date by one month. Such a function is useful, for example, for tracking events that happen once a month. The function must pay attention only to adjusting the day around the end of the month such that the date goes, for example, from August 31 to September 30.
// Inside class Date
void nextMonth() {
    __call_invariant(); // Inserted by the compiler
    scope(exit) __call_invariant(); // Inserted by the compiler
    if (month == 12) {
        ++year;
        month = 1;
    } else {
        ++month;
        adjustDay();
    }
}
// Ancillary function
private void adjustDay() {
    __call_invariant(); // Inserted by the compiler (PROBLEMATIC)
    scope(exit) __call_invariant(); // Inserted by the compiler (PROBLEMATIC)
    switch (day) {
    case 29:
        if (month == 2 && !leapYear(year)) day = 28;
        break;
    case 30:
        if (month == 2) day = 28 + leapYear(year);
        break;
    case 31:
        if (month == 2) day = 28 + leapYear(year);
        else if (!longMonth(month)) day = 30;
        break;
    default:
        // Nothing to do
        break;
    }
}
Function nextMonth takes care of year rollover and uses an ancillary private function adjustDay to ensure that the day remains inside a valid date. Here's exactly where the problem is: upon entrance to adjustDay, the invariant may be broken. Of course it might—the sole purpose of adjustDay is to fix the Date object!
What makes adjustDay special? Its protection level: it's a private function, accessible only to other functions that have the right to modify the Date object. Upon entrance to and exit from a private function, in general, it's acceptable to have a broken invariant. The places where the invariant must definitely hold are public method boundaries: an object doesn't want to allow a client operation to find or leave this in an invalid state.
How about protected functions? According to the discussion in § 6.7.6 on page 201, protected is just one little notch better than public. However, it was deemed that requiring invariant satisfaction at the boundaries of protected functions was too restrictive.
If a class defines an invariant, the compiler automatically inserts calls to the invariant in the following places: at the end of each constructor, at the beginning of the destructor, and at the beginning and end of each public non-static method. Say we put on X-ray vision goggles that allow us to see the code inserted by the compiler in the Date class. We'd then see this:
class Date {
    private uint day, month, year;
    invariant() { ... }

    this(uint day, uint month, uint year) {
        scope(exit) __call_invariant();
        ...
    }
    ~this() {
        __call_invariant();
        ...
    }
    void somePublicMethod() {
        __call_invariant();
        scope(exit) __call_invariant();
        ...
    }
}
A detail about the constructor and destructor is worth noting. Recall from the discussion of an object’s lifetime (§ 6.3 on page 181) that once allocated, an object is considered valid. Therefore, even if a constructor throws, it must leave the object in an invariant-abiding state.
Contracts are concerned exclusively with verifying the internal logic of an application. In keeping with that charter, most, if not all, programming systems that support contracts also allow a mode in which all contract checking is ignored. That mode is supposed to be activated only with programs that have been thoroughly reviewed, verified, and tested.
Any D compiler provides a flag (-release in the reference implementation) that ignores contracts altogether; that is, it parses and typechecks all contract code but leaves no trace of it in the executable binary. A release build runs without contract checking (which is riskier) but also at full speed (which is, well, faster). If the application has its ducks in a row, the added risk of skipping contract checks is very low and the increase in speed is well worth that risk. The possibility of running without contracts reinforces the warning that code should not use contracts for routine checks that could reasonably fail. Contracts must be reserved for never-expected errors that reflect a logic bug in your program. Again, you should never use contracts to make sure that user input is correct. Also, remember the repeated warnings against doing any significant work (such as side effects) inside assert, in, and out? Now it's painfully obvious why: a program that does such unsavory acts would oddly behave differently in non-release and release mode.
One commonly encountered error is asserting expressions with side effects, for example, assert(++x < y), which is bound to cause much head scratching. It's the worst of all worlds: the bug manifests itself only in release mode, when by definition you have fewer means at your disposal to find the source of the problem.
It’s a pity that assert
disappears from release builds, because using it is very convenient. Instead of writing
if (!expr1) throw new SomeException;
...
if (!expr2) throw new SomeException;
...
if (!expr3) throw new SomeException;
you get to write only

assert(expr1);
...
assert(expr2);
...
assert(expr3);
Given that assert is so concise, many libraries provide an "always assert" feature that checks a condition and throws an exception if the condition is zero, whether you compile in release mode or not. Such checkers go in C++ by names such as VERIFY, ASSERT_ALWAYS, or ENFORCE. D defines such a function in module std.contracts under the name enforce. Use enforce with the same syntax as assert:

enforce(expr1);
...
enforce(expr2);
...
enforce(expr3);
If the passed-in expression is zero, enforce throws an object of type Exception, regardless of whether you compiled the program in release or non-release mode. If you want to throw a different exception type, you may specify it as follows:

enforce(something, new SomeException);

If something is zero, the second argument is thrown; enforce evaluates it lazily, such that no object creation occurs if something is nonzero.
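As a usage sketch (my own example; note that in current Phobos, enforce lives in std.exception rather than std.contracts), here is enforce guarding bad input while assert guards internal logic:

```d
import std.exception : enforce;

int percentOf(int part, int whole) {
    // Bad input is an expected, recoverable condition: use enforce.
    enforce(whole > 0, "whole must be positive");
    auto result = part * 100 / whole;
    // A negative result for a non-negative part would be a logic bug: use assert.
    assert(part < 0 || result >= 0);
    return result;
}

void main() {
    assert(percentOf(1, 4) == 25);
    bool threw;
    try { percentOf(1, 0); } catch (Exception e) { threw = true; }
    assert(threw); // enforce fires even in release builds
}
```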
Although assert and enforce look and feel very much alike, they serve fundamentally different purposes. Don't forget the differences between the two:

- assert checks your application logic, whereas enforce checks error conditions that don't threaten the integrity of your application.
- assert throws only the unrecoverable AssertError exception, whereas enforce throws by default a recoverable exception (and may throw any exception with an extra argument).
- assert may disappear, so don't take it into consideration when figuring the flow of your function; enforce never disappears, so after you call enforce(e) you can assume that e is nonzero.

An assertion against a constant that is known to be zero during compilation, such as assert(false), assert(0), or assert(null), behaves a tad differently from a regular assert.
In non-release mode, assert(false) does not do anything special: it just throws an AssertError exception.

In release mode, however, assert(false) is not compiled out of existence; it will always cause the program to stop. This time, however, there is no exception and no chance of continuing to run after an assert(false) has been hit. The program will crash. This is achieved on Intel machines by executing the HLT ("halt") instruction, which causes the program to abort immediately.
Many of us tend to think of a crash as a highly dangerous event that indicates a program gone out of control. This disposition is prevalent most likely because many programs that do go out of control terminate, sooner or later, via a crash. But assert(false) is a very controlled way to terminate a program. In fact, on some operating systems, HLT automatically loads your debugger and positions it on the very assert that triggered the crash.
What’s the purpose of this particular behavior of assert(false)
? One obvious use has to do with system-level programs. There had to be a portable way to issue HLT
, and assert(false)
integrates well with the rest of the language. In addition, the compiler is aware of the semantics of assert(false)
, so, for example, it disallows dead code following an assert(false)
expression:
int fun(int x) {
    ++x;
    assert(false);
    return x; // Error!
              // Statement is not reachable!
}
Conversely, in other situations you may need to add assert(false) to suppress a compiler error. Consider, for example, calling the standard library function std.contracts.enforce discussed just above:
import std.contracts;

string fun() {
    ...
    enforce(false, "can't continue"); // Always throws
    assert(false); // Unreachable
}
The call enforce(false) always throws an exception, but the compiler doesn't know that. To make the compiler understand that that point cannot possibly be reached, insert an assert(false). Finishing fun with return ""; also works, but in that case, if someone comments out the enforce call later on, fun would start returning bogus values. The assert(false) is a veritable deus ex machina that saves your code from such situations.
This section discusses a controversial matter related to contracts that is the source of continuous debate. The matter essentially boils down to this question: If a function must make some check, where should the check go—in a contract or in the function’s body?
When first getting accustomed to Contract Programming, many of us are tempted to move most checks inside contracts. Consider, for example, a function called readText that loads a text file in its entirety as a string. Armed with contracts, we might define it as follows:
import std.file, std.utf;

string readText(in char[] filename)
out(result) {
    std.utf.validate(result);
}
body {
    return cast(string) read(filename);
}
(readText is actually a function in the standard library; you may want to look it up in module std.file.)

readText relies on two other file functions. First, it uses read to load an entire file into a memory buffer. The memory buffer has type void[], which readText casts to string. But it would be incorrect to leave things at that: what if the file contains malformed UTF characters? To validate the cast, the out contract verifies the result by calling std.utf.validate, which throws a UtfException object if the buffer contains an invalid UTF character.
That would be fine, were it not for a fundamental issue: contracts must validate the logic of an application, not the validity of its inputs. Anything that’s not considered an endemic problem of the application does not belong inside contracts. Also, contracts are not supposed to change the semantics of the application—hence D’s intentional curbing of what can be modified inside a contract.
Assuming no contracts fail, an application must run with the same behavior and results with or without actually executing contracts. This is a very simple and memorable litmus test for deciding what's a contract and what isn't. Contracts are specification checks, and if the checks go away for a correct implementation, that doesn't stop the implementation from working. That's how contracts are meant to work. Expecting that a file is always valid may reveal a positive attitude but should not be part of readText's specification. A correct definition of readText makes the check an integral part of the function:
import std.file, std.utf;

string readText(in char[] filename) {
    auto result = cast(string) read(filename);
    std.utf.validate(result);
    return result;
}
In light of the discussion so far, the answer to the question regarding check placement is: If the check concerns the application logic, it should go in a contract; otherwise, the check should go in the body of the function and never get skipped.
That sounds great, but how to define “application logic” in applications built out of separate, generic libraries written by independent entities? Consider a large general-purpose library, such as the Microsoft Windows API or the K Desktop Environment. Many applications use APIs like these, and it is inevitable that library functions receive arguments that do not conform to the spec. (In fact, an operating system API may count on receiving all sorts of malformed arguments.) If an application does not fulfill the precondition of a library function call, where does the blame go? It was clearly the fault of the application, but it’s the library that takes the hit—in terms of instability, undefined behavior, corrupted state inside the library, crashes, all those bad things. As unfair as it may seem, such problems would reflect poorly on the library (“Library Xyz is prone to instability and surprising quirks”) more than on the bug-ridden applications using it.
A general-purpose and large-distribution API should verify all inputs to all of its functions as a matter of course—not in contracts. Failure to verify an argument is unequivocally a library bug. No spokesperson would ever wave a copy of a book or paper and say, “We were using Contract Programming throughout, so we’re not at fault.”
Does that invalidate the argument that functions should use preconditions to specify, for example, argument ranges? Not at all. It’s all a matter of defining and distinguishing “application logic” from “user input.” To a function that’s an integral part of an application, receiving valid arguments is part of the application logic. To a general-purpose function belonging to an independently delivered library, arguments are nothing but user input.
On the other hand, it is perfectly fine for a library to use contracts in its private functions. Those functions relate to the internal workings of the library and cannot be accessed by user code, so it is sensible to have them use contracts to express adherence to specification.
The often-quoted Liskov Substitution Principle [38] states that inheritance is substitutability: an object of the derived class must be substitutable wherever an object of the base class is expected. This insight essentially determines the interaction of contracts with inheritance.
In the real world, the relationship between contracts and substitutability is as follows: once a contract is established, a substitute contractor must be at least as qualified to perform the job, deliver the job within at least the specified tolerance, and require at most the same compensation that was established in the contract. There is some flexibility, but never in the direction of tightening the preconditions of the contract or loosening the postconditions. If either of these happens, the contract becomes invalid and must be rewritten. The flexibility concerns only variations that don’t negatively affect the understanding in the contract: a substitute is allowed to require less and offer more.
Consider the Date example again. Let's say we define a very simple, lightweight BasicDate class that offers only minimal support and leaves enhancements to derived classes. BasicDate offers a function format that takes a string representing a format specification and returns a string with the date formatted appropriately:
import std.conv;

class BasicDate {
    private uint day, month, year;

    string format(string spec)
    in {
        // Require spec to be exactly "%Y/%m/%d"
        assert(spec == "%Y/%m/%d");
    }
    body {
        // Simplistic implementation
        return text(year, '/', month, '/', day);
    }
    ...
}
The contract imposed by BasicDate.format requires that the format specification be exactly "%Y/%m/%d", which we take to mean "year in long format, followed by a slash, followed by the month, followed by a slash, followed by the day." That's the only format BasicDate worries about supporting. Derived classes may add localization, internationalization, the works.
A class Date that inherits BasicDate wants to offer a better format primitive—for example, say Date wants to allow the specifiers %Y, %m, and %d in any position, mixed with arbitrary characters. Also, %% should be allowed because it represents the actual character %. Repeated occurrences of the same specifier should also be allowed. To enforce all that, Date writes its own contract:
import std.regex;

class Date : BasicDate {
    override string format(string spec)
    in {
        auto pattern = regex("^(%[mdY%]|[^%])*$");
        assert(!match(spec, pattern).empty);
    }
    body {
        string result;
        ...
        return result;
    }
    ...
}
Date enforces its constraints on spec with the help of a regular expression. Regular expressions are an invaluable aid in string manipulation; Friedl’s classic Mastering Regular Expressions [26] is warmly recommended. This is not the place to discuss regular expressions in depth, but suffice it to say that "(%[mdY%]|[^%])*" means “a % followed by any of m, d, Y, or %; or anything other than a %—repeated zero or more times.” The equivalent code that would match such a pattern by hand would be considerably more verbose. The assert makes sure that matching the string against the pattern returns a non-empty match, that is, it worked. (For more on using regular expressions with D, you may want to peruse the online documentation of the standard module std.regex.)
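To get a feel for what the pattern accepts and rejects, here is a small self-contained sketch (the test strings are hypothetical, and the pattern is anchored with ^ and $ so the entire specification must conform):

```d
import std.regex;

void main() {
   auto pattern = regex("^(%[mdY%]|[^%])*$");
   // Accepted: specifiers in any position, mixed with other text
   assert(!match("%Y/%m/%d", pattern).empty);
   assert(!match("%d.%m.%Y (100%%)", pattern).empty);
   // Rejected: %q is not a recognized specifier
   assert(match("%q", pattern).empty);
}
```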
What is the aggregate contract of Date.format? It should mind BasicDate.format’s contract but also relax it. It’s fine if the base in contract fails, as long as the derived in contract passes. Also, Date.format’s contract should never strengthen BasicDate.format’s in contract. The emerging rule is as follows: In an overridden method, first execute the base class contract. If that succeeds, transfer control to the body. Otherwise, execute the derived class contract. If that succeeds, transfer control to the body. Otherwise, report contract failure.
Put another way, the in contracts are combined by using disjunction with short-circuit: at least one must pass, and the base class contract is tried first. That way there is no possibility that the derived contract is more difficult to satisfy than the base class contract. On the contrary, the derived class offers a second chance for failed preconditions.
The rule above works very well for Date and BasicDate. First, the composite contract checks spec against the exact pattern "%Y/%m/%d". If that succeeds, formatting proceeds. Failing that, conformance to the derived, more permissive contract is checked. If that passes, again formatting may proceed.
The code generated for the combined contract looks like this:
void __in_contract_Date_format(string spec) {
   try {
      // Try the base contract
      this.BasicDate.__in_contract_format(spec);
   } catch (Throwable) {
      // Base contract failed, try derived contract
      this.Date.__in_contract_format(spec);
   }
   // Success, can invoke body
}
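The disjunction rule can be seen in a self-contained sketch (Base and Derived are hypothetical classes, not part of the Date example):

```d
// A failing base in contract is given a second chance by the
// derived in contract.
class Base {
   void f(int x)
   in {
      assert(x == 0);     // base accepts only 0
   }
   body {
   }
}

class Derived : Base {
   override void f(int x)
   in {
      assert(x >= 0);     // derived relaxes: any non-negative value
   }
   body {
   }
}

void main() {
   auto d = new Derived;
   d.f(0);   // passes the base contract outright
   d.f(5);   // fails the base contract, passes the derived one
   // d.f(-1); would fail both contracts and abort in debug builds
}
```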
With out contracts the situation is exactly the opposite: when substituting a derived object for a base object, the overridden function must offer no less than what the base contract promised. So right off the bat, the out guarantee of the base must always be fulfilled by the overriding method (unlike the case for the in contract).
Conversely, this means that a base class should set the contract as loose as is useful, to avoid the risk of over-constraining derived classes. For example, if BasicDate.format imposed that the returned string have the format year/month/day, it would effectively prevent any derived class from performing any other formatting. Perhaps BasicDate.format could impose a weaker contract—for example, that an empty string is not allowed as output unless the formatting string itself is empty:
import std.range, std.string;

class BasicDate {
   private uint day, month, year;
   string format(string spec)
   out(result) {
      assert(!result.empty || spec.empty);
   }
   body {
      return std.string.format("%04s/%02s/%02s", year, month, day);
   }
   ...
}
Date sets its ambitions a bit higher: it computes the expected result length from the format specification and then compares the length of the actual result against the expected length:
import std.algorithm, std.range, std.regex;

class Date : BasicDate {
   override string format(string spec)
   out(result) {
      bool escaping;
      size_t expectedLength;
      foreach (c; spec) {
         switch (c) {
         case '%':
            if (escaping) {
               ++expectedLength;   // "%%" yields one '%'
               escaping = false;
            } else {
               escaping = true;
            }
            break;
         case 'Y':
            if (escaping) {
               expectedLength += 4;
               escaping = false;
            } else {
               ++expectedLength;   // a literal 'Y'
            }
            break;
         case 'm': case 'd':
            if (escaping) {
               expectedLength += 2;
               escaping = false;
            } else {
               ++expectedLength;   // a literal 'm' or 'd'
            }
            break;
         default:
            assert(!escaping);
            ++expectedLength;
            break;
         }
      }
      assert(walkLength(result) == expectedLength);
   }
   body {
      string result;
      ...
      return result;
   }
   ...
}
(Why walkLength(result) instead of result.length? The number of characters in a UTF-encoded string may be smaller than its length in chars.) Given these two out contracts, what is the correct combined out contract? The answer is simple: The contract of the base class must also be verified. Then, if the derived class promises additional contractual obligations, those must be fulfilled as well. It’s a simple conjunction. The code below is what the compiler might generate for composing the base and derived contracts:
void __out_contract_Date_format(string spec) {
   this.BasicDate.__out_contract_format(spec);
   this.Date.__out_contract_format(spec);
   // Success
}
For class invariants in a hierarchy, just as in the case of out contracts, we’re looking at a conjunction, an “and” relation: a class must fulfill the invariants of all of its base classes in addition to its own invariant. There is no way for a class to weaken the invariant of its base class. The current compiler calls invariant() clauses from the top of the hierarchy down, but that should not matter at all for the implementor of an invariant; as discussed, invariants should have no side effects.
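A short sketch (with hypothetical classes, not drawn from the text) shows the conjunction at work: any public call on a derived object must leave both invariants satisfied.

```d
// Both invariants are evaluated at the boundaries of public methods
// in non-release builds; the derived class can only add constraints,
// never remove them.
class Temperature {
   protected double degrees = 0;
   invariant() {
      assert(degrees >= -273.15);   // never below absolute zero
   }
   void set(double d) { degrees = d; }
}

class FreezerTemperature : Temperature {
   invariant() {
      assert(degrees <= 0);         // additionally: at or below freezing
   }
}
```

Calling set(5) on a FreezerTemperature satisfies the base invariant but violates the derived one, so the call fails in debug builds.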
Possibly the most interesting application of contracts is in conjunction with interfaces. An interface is a complex contract, so it is fitting that each of an interface’s methods should describe an abstract contract—a contract without a body. The contract is enforced in terms of the not-yet-implemented primitives defined by the interface.
Consider, for example, that we want to enhance the Stack interface defined in § 6.14 on page 233. Here it is for reference:
interface Stack(T) {
   @property bool empty();
   @property ref T top();
   void push(T value);
   void pop();
}
Let’s attach contracts to the interface that reveal the interplay of these primitives. Interface contracts look just like regular contracts without a body.
interface Stack(T) {
   @property bool empty();
   @property ref T top()
   in {
      assert(!empty);
   }
   void push(T value)
   out {
      assert(!empty);
      assert(value == top);
   }
   void pop()
   in {
      assert(!empty);
   }
}
For an interface method with a contract, the trailing semicolon is no longer needed. With the new definition of Stack, implementations are constrained to work within the confines defined by Stack’s contracts. One nice thing is that the contract-enhanced Stack is a good specification of a stack, one that is at the same time easy for a programmer to read and verified dynamically.
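For instance, a hypothetical array-backed implementation (assuming the contract-enhanced Stack interface above) needs no checking code of its own; the interface’s in and out contracts are evaluated automatically in non-release builds:

```d
// Sketch of an implementation of the Stack(T) interface above; the
// interface contracts supply all of the validity checking.
class ArrayStack(T) : Stack!T {
   private T[] store;
   @property bool empty() { return store.length == 0; }
   @property ref T top() { return store[$ - 1]; }
   void push(T value) { store ~= value; }
   void pop() { store = store[0 .. $ - 1]; }
}
```

Calling pop or top on an empty ArrayStack through a Stack!T reference trips the assert(!empty) precondition in debug builds.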
As discussed in § 10.7 on page 327, Stack’s contracts may be compiled out. If you define a container library for large-scale, general use, it may be a good idea to treat method calls as user input. In that case, the NVI idiom (§ 6.9.1 on page 213) may be better suited. A stack interface that uses NVI to always check for valid calls would look like this:
import std.exception;

interface NVIStack(T) {
protected:
   ref T topImpl();
   void pushImpl(T value);
   void popImpl();
public:
   @property bool empty();
   final @property ref T top() {
      enforce(!empty);
      return topImpl();
   }
   final void push(T value) {
      pushImpl(value);
      enforce(value == topImpl());
   }
   final void pop() {
      enforce(!empty);
      popImpl();
   }
}
NVIStack uses enforce throughout—a test that’s impossible to compile out of existence. It also makes push, pop, and top final and hence impossible to hijack by implementations. One nice effect is that all major error handling has been hoisted out of each implementation and into the interface, a good form of reuse and of dividing responsibilities. NVIStack implementations can assume without fear that pushImpl, popImpl, and topImpl are always called in valid states and can optimize them accordingly.
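An implementation might then look like this (ArrayNVIStack is a hypothetical name, assuming the NVIStack interface above); note the complete absence of validity checks in the Impl primitives:

```d
// The public final methods of NVIStack validate every call, so the
// primitives below may safely assume a valid state.
class ArrayNVIStack(T) : NVIStack!T {
   private T[] store;
   @property bool empty() { return store.length == 0; }
protected:
   ref T topImpl() { return store[$ - 1]; }
   void pushImpl(T value) { store ~= value; }
   void popImpl() { store = store[0 .. $ - 1]; }
}
```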