CHAPTER 8        Software Fault Avoidance in Specification

Learning objectives of this chapter are to understand:

•  The role and importance of software specification.

•  The principles of formal specification in a declarative language.

•  The mathematics of formal specification.

•  How to read and understand formal specifications.

•  How to write simple formal specifications.

•  Why formal specifications are useful and when.

•  How a formal specification is built.

8.1 The Role of Specification

The specification is the first statement of what a software system has to do. In essence, the specification is a statement about how the software will solve the problem posed by the requirements, and the specification is written in terms of the effect that the software has to have on its inputs and outputs.

The primary role of a software specification is to act as a reference for those who have to develop the associated software system, i.e., the role of a specification is communication. The specification is a repository for all the important information about what a software system is to do. As a repository, the goal is for the specification to convey to those who read it all the information they need. Those who write the specification possess information, and those who read the specification need that information. This communication must work correctly, and a lot of effort has been expended on specification technology precisely because of the importance of this communication.

Many different engineers use a software specification:

•  Obviously, the engineers who actually implement the software will refer to it.

•  The engineers who test the software will refer to the specification, because the specification tells them what the content of functional tests should be.

•  Customers benefit from referring to the specification because the specification tells them what the system they are acquiring will do.

•  For systems that are regulated, engineers working for agencies such as the U.S. Federal Aviation Administration (FAA), Food and Drug Administration (FDA), and Nuclear Regulatory Commission (NRC), and the U.K.’s Civil Aviation Authority (CAA) and Health and Safety Executive will require inspection of the software specification.

Having a specification from which to work is an important part of any software development process. Without an accurate specification, engineers have to proceed with at best an informal idea of what the software is to do. That informal idea is likely to be incomplete, inaccurate, and subject to change. We can achieve a great deal in the area of fault avoidance by paying careful attention to the specification of systems requiring high levels of dependability.

Defects in specification are the most expensive to repair. If a defect is present in the specification of a software system and that defect is not detected until after the system is built, the repair cost is as high as it can be, because all the software artifacts have to be changed. After the specification is corrected, the various designs have to be corrected and checked, the implementation has to be corrected, and the verification has to be repeated. There might also be various documents such as user manuals and maintenance manuals that have to be repaired.

Real specifications are large, complex, and difficult to write correctly and clearly. Developing complete and accurate specifications is a challenge in practice, but, as noted in Section 7.11, there are ways of dealing with the challenge. In this chapter we look at one of the most powerful ways to conquer the challenge, namely, formal specification.

In this chapter, we will look at formal specification with the goal of understanding how formal specifications are built. To do so, we will use one particular formal language, Z1. Z is a powerful language, and there are several books on just that language [134, 108, 71, 154].

8.2 Difficulties with Natural Languages

Natural language is what we use every day when we communicate, but the use of natural language in software engineering is problematic for several reasons:

Ambiguity. Natural language is inherently ambiguous, because the meanings of words and phrases are determined by our experience. What is clear to the author of a statement might not be clear to the reader, and so the author and reader have different meanings, often without realizing that they have this difference.

Imprecision. Natural language is imprecise. In order for an author and a reader to communicate with a reasonable amount of text, many things that are expected to be common are assumed and omitted from the communication.

Informality. Natural language is not formally defined, and so there is no way to check the syntax or any other property mechanically in order to help locate deficiencies.

Dialects. There is no single definition of any natural language. English is not the same in any two countries and frequently is not the same in two regions of the same country.

Many attempts have been made to deal with these issues, including defining language subsets, developing natural-language analysis systems, developing glossaries of terms, relying on dictionaries, using special document formats, and training engineers.

None of these approaches nor any combination deals fully with the difficulties that arise with natural language. The only comprehensive solution is to turn to formal languages and formal specification. Yet practical circumstances often dictate the use of natural language in engineering. Thus, although efforts to deal with the various difficulties with natural language do not solve the problems, they can and should be used if one has to use natural language for specification, in order to comply with a standard, for example.

8.3 Specification Difficulties

8.3.1 Specification Defects

As we noted in Section 7.11, the majority of defects in a software system are introduced in the specification but usually not found until testing [108]. The situation is illustrated in Figure 8.1. What goes wrong in software specification? There are six major categories of defects that tend to arise:

Incorrect functionality. The stated functionality is not what the system is really supposed to do.

Missing functionality. The stated functionality omits something that the software is supposed to do.

Incompleteness. No functionality is stated for certain inputs or certain conditions that might occur.

Ambiguity. The stated functionality is open to more than one interpretation.

Inconsistency. Functionality is stated so that one part is inconsistent with another.

Naïveté. The specification documents a desired system that cannot be built because of technical limitations.

Dealing with these issues is obviously important. But, as long as we restrict our specification technology to natural language, we have no framework within which to attack the problems. Natural language has no precise syntax or semantics from which to derive helpful analysis.

8.3.2 Specification Evolution

Frequently, software engineers are told that no software specification is available, because the desired system is still being designed and so the associated software specification cannot be determined. Despite this, these engineers are asked to begin building the software for that part of the system that is specified even though the specification that is available is probably incomplete, poorly stated, and out of date.

Making things more difficult are the inevitable changes that occur as the desired system becomes more clearly defined. Early system design decisions change as more information becomes available, and this leads to changes in the specification of the software.

FIGURE 8.1 Software lifecycle phases where faults are introduced vs. located.

The ideal situation is to have a complete and accurate software specification before software design begins and for that specification to remain unchanged during development. In practice, this ideal is rarely, if ever, possible. Thus, specification techniques must take these realities into account and manage this development risk carefully.

Many techniques exist for dealing with the issue of incomplete specifications, including:

Throw-away prototypes. A throw-away prototype, as the name suggests, is a prototype that is thrown away after use [61]. The idea is to deal with a question of some sort by quickly building a prototype designed just to answer the question. Since the prototype will not be used as part of the system to be delivered, no attention need be paid to issues such as maintainability or documentation. Specification questions that might be answered using a throw-away prototype include user-interface features, performance measures, and detailed aspects of functionality. A partially complete but high-quality specification can define necessary throw-away prototypes based on incomplete specification details.

Evolutionary prototypes. An evolutionary prototype, again as the name suggests, is a prototype that evolves into part of the system under development. In effect, a system acts as its own prototype [61]. Provided care is exercised to ensure that prototypical elements do not remain incomplete or redundant in the delivered system, the resulting system is not affected negatively. That part of the system which is specified can be constructed carefully and used to help define the remainder of the system. This definition can take the form of a prototype implementation built using the completed system fragment to ascertain what the desired system functionality is. The benefit of having a partial but high-quality specification at this point is to permit the incremental development of the system to have a well-defined set of interfaces and functionality. Note that this process is designed to inform the specification, not replace the need for specification.

User interviews. Interviews with users and engineers in the application domain can be used to target questions about aspects of the specification that are uncertain.

System models. Development of system models such as data-flow models or models based on natural laws for physical systems can provide insight into how software systems need to operate.

Use cases. Use cases are a simple yet powerful way to investigate systematically the interaction between systems and their operating environment.

Specification reviews. Preliminary specifications can be reviewed with the final system’s users to gather information even though the documented specification is known to be incomplete.

The use of formal specification is not precluded either by the problem of incomplete specifications or the use of these various mitigation techniques. No matter how little is established for a given specification and no matter how much of a specification is likely to change, the use of a formal language to document the specification is almost always beneficial. A formal specification facilitates the resolution of many of the issues with specification by providing a means of effective communication and a means of identifying specification weaknesses.

8.4 Formal Languages

8.4.1 Formal Syntax and Semantics

A formal language is any language that has a formal syntax and formal semantics. This idea is actually quite familiar to most people because any programming language is a formal language. In fact, almost all programming languages, and certainly languages like C++ and Java, are actually specification languages. Programmers think they are writing programs in Java, for example, but the compiler is really writing the program, i.e., the sequence of bits that executes on the computer.

Both syntax and semantics are important. A formal syntax means that there can be no doubt about what constitutes a statement in the language. If a language has a formal syntax, then text in the language can be processed mechanically to determine whether the text is a statement in the language. Again, this idea is familiar from the use of compilers that detect syntax errors in programs. With one common definition of what constitutes a statement in the language, there is no need for interpretation by the reader as occurs frequently in natural languages.

Formal semantics permit dependable communication between all parties who have to write or read statements in the language. Whereas natural language relies upon our experience to provide us with meaning, formal language semantics are defined in term of mathematics, and that provides completeness and freedom from ambiguity.

An important caveat has to be noted about formal semantics. The formal languages that we use typically in specification are not connected to any real-world concepts. Technically, formal languages are uninterpreted. Thus, using a term from the real world in a formal-language statement does not mean that the human concept is in any sense part of the semantics of the formal language. As far as the language is concerned, the term is just a symbol.

Formal semantics for a formal language define how the various constructs in the language manipulate the symbols being used. A simple example is the logical operator . Most people are familiar with the semantics of an expression such as a b. A more complex example is the semantics of an if statement in a high-level language program. Again, most people are familiar with the semantics of a statement such as:

FIGURE 8.2 A simple formal specification of sorting written in Z.

For both examples, however, note that the semantics refer just to how the formal language operates. Nothing is said about the real-world semantics in either example. What are a and b in the expression? What is x in the if statement? There is no real-world interpretation of any variable.

Formal language semantics do not include answers to questions such as these. The real-world semantics must be stated separately and have to be stated in natural language. This is why the notions of meaningful identifiers and comprehensive comments are stressed in efforts to make formal statements clear to their readers.

Except for programming languages, formal languages tend to be unfamiliar to software engineers. In fact, the languages are far easier to read and write than they appear. An example of a simple formal specification written in Z2 is shown in Figure 8.2. Z is a convenient language to use, because much of the notation is identical to traditional discrete mathematics. So anybody who has studied things like sets and the predicate calculus in a course on discrete mathematics will find Z familiar and pretty much as easy to read as the original mathematical symbols.

Even without having studied Z, one can infer quite a lot about the meaning of this simple example. The name at the top seems likely to describe what the specification is about. Using that assumption, we can infer that this specification probably describes integer sorting. The two labels, “pre-condition” and “post-condition”, are not part of the specification. These labels identify (a) the condition that must hold before the specification can be applied, the pre-condition, and (b) the condition that must hold after the specification has be applied, the post-condition.

FIGURE 8.3 The benefits of formal specification include enabling improved accuracy and completeness by the author of the specification, improved understanding by readers of the specification, and the opportunity to analyze the specification by checking the syntax, type rules, and various properties.

With these intuitive notions and a little background in discrete mathematics, the meaning of much of the specification is clear. For example, the first term:

states that the cardinality (#) of the set Input? has to be less than 100. In other words, this specification is stating that the associated program need only deal with files containing less than 100 numbers to sort.

You can probably make good guesses about what most of the specification states. This insight is important, because understanding something like this example without much explanation illustrates the idea that using discrete mathematics in a formal specification is not especially difficult. At least, understanding something like this example is not very difficult. We will study this example in more depth in Section 8.7.

8.4.2 Benefits of Formal Languages

The formality of formal specification languages brings many advantages in the areas of understanding and analysis. These advantages, shown in Figure 8.3, are:

Improved understanding. Formal languages are much more precise than natural languages. A formal specification informs the reader much more effectively than a natural-language specification.

Improved accuracy and completeness. By writing specifications in a formal notation and using a rigorous structure, developers tend to create better artifacts. In part, this effect seems to be because of the relatively rigid structure of formal languages. Statements have to be made with a specific syntax, and the semantics available tend to constrain the way that statements can be made.

Syntax checking. The syntax of the language is completely formal, and that means that the syntax can be checked. Any deviation from the defined syntax of the language can be detected (by a program that is much like a compiler) and corrected by the author. Without syntax errors, a formal specification is much more likely to be correct.

Type checking. Just as strongly typed programming languages can help increase dependability so can strongly typed specification languages. Types are one of the basic building blocks of any formal language, and violations of the type rules in a specification are a strong indication of errors. The types can be checked to make sure that we have not violated the type rules, and some simple forms of completeness can be checked, case analysis, for example, to make sure that nothing has been left out.

The first two of these advantages accrue merely from the use of a formal specification language. No tools are needed and so they can be applied immediately. Importantly, these two advantages are also of great value. Understanding, accuracy, and completeness tend to be much higher with formal specification. The second two advantages require tools, but the ones needed are among the most commonly available.

A fifth advantage that arises with the use of formal specification languages is the possibility of analysis with putative or speculative theorems. With a specification written in a formal language, a determination can be made about whether the specification possesses certain expected properties. In general, a putative theorem is expected to be true, taking the specification as a set of axioms. For example, the engineer writing the specification might expect that (a) certain states can never be reached, (b) certain variables maintain a particular relationship, (c) certain events cannot occur, and so on. Such expectations can be stated as theorems, and, if the expectation follows from the specification, then the theorem can be proved from the specification. If the implementation implements the specification properly, then the implementation will have the same property.

The overall effect of using formal languages for specification is a significant reduction in specification defects. Some types of defect, such as those that come from the incorrect use of types, can be eliminated completely. Others, such as failing to state how certain input values are to be handled, can also be eliminated if the type of the input is properly stated. With the type properly defined, checking to see whether each element of the type has been included in the specification of input processing is possible. Finally, the clarity and structure of formal languages permit effective human review of specifications so that larger numbers of defects can be found by human inspections.

Many case studies on the use of formal specification languages have been conducted, and they have generally supported the notion that formal languages provide a variety of advantages. In particular, case studies have shown that formal languages tend to promote specifications of far higher quality than typically are produced when using natural language [27, 31, 39, 40, 72].

8.4.3 Presentation of Formal Languages

The benefit of using any formal language can be lost with sloppy presentation. This idea should be familiar from programming in high-level languages where engineers are admonished to indent carefully, to use meaningful identifiers, to leave white space between statements and subprograms to indicate structure, etc. Programs in high-level languages are often difficult to understand, and careful layout helps readers significantly.

The general rule to follow in specification is to do as much as possible to make the formal text as clear as possible. Simple rules to follow to promote clarity are:

•  Use whatever structuring mechanisms are available in the language to structure the text. In Z, the primary structuring mechanism is the schema.

•  Use vertical white space to delimit and emphasize major parts of the formal text. More space can be used before and after more significant parts.

•  Use spacing within individual lines to separate operators and operands so as to delimit and emphasize the structure within the line.

•  Use meaningful identifiers to help explain the formal statement to the reader. Long identifiers are not wrong and can add immense value. They take a little longer to type, but they can greatly reduce the time needed to read the formal statement.

•  Where more than one adjacent line has the same overall layout, tab the contents so that the major constituent parts are in columns. Declarations, for example, often occur in groups with one per line, and declarations have a unique format in essentially all formal languages. The syntactic structure of a declaration is often something like this:

   By spacing within the lines to have the variable names begin in the same column, the colons to appear in the same column, and the type names begin in the same column, location of all the items is much easier for the reader. Here is an example in Z — understanding the syntax does not matter for the moment; appreciating the readability does:

A good rule of thumb is that if the appearance of the text you are writing in a language like Z is not clear and attractive to you, then the text will probably quickly become impenetrable to others. In that case, rework the appearance. Some of the details of this topic are discussed in the example in Section 8.8.

8.4.4 Types of Formal Languages

High-level programming languages like Java are called procedural languages, because the program defines a procedure that the engineer wants the computer to follow. The program is essentially a sequence of statements saying to the computer:

“Do this,

now do this,

now do this,

now do this,

…”

By stating the desired actions of the computer this way, the engineer is defining the algorithm and the data structures that are needed to achieve the desired computation. The actual desired computation is not stated explicitly, and determining what the desired computation is by looking at a procedural statement is usually quite difficult. Procedural languages say “how” something is to be done, not “what” is to be done.

Two different types of language are used for formal specification:

•  Declarative languages. Declarative languages state the desired computation explicitly by defining the state in which the computer is required to be after the computation. Nothing is said about how the computation is to be achieved.

•  Functional languages. Functional languages state the desired computation by defining a function that will achieve the computation. The computation is frequently stated in a declarative way, and so functional languages are in many ways special cases of declarative languages.

In practice, declarative languages are the most common and most important formal specification languages. Some example languages are Z, VDM, Larch, and RSML. PVS is an example of a functional specification language. In contrast to procedural languages, declarative languages say “what” is to be done, not “how” something is to be done.

8.4.5 Discrete Mathematics and Formal Specification

All general-purpose formal specification languages are based on discrete mathematics. In particular, the languages make heavy use of topics such as the propositional calculus, the predicate calculus including the quantifiers, set theory, relations, functions, and sequences.

An important difference between familiar procedural languages and declarative formal specification languages is the meaning of the equal sign, “=“. In many programming languages, such as C, C++, and Java, the equal sign has been given the meaning of assignment. In C, for example, the statement:

means fetch the values of a and b, add them together, and make the result the new value of c.

Assignment is not the meaning that the equal sign has in traditional mathematics, including discrete mathematics. In most cases, mathematics does not include the notion of assignment.

In formal specification, the meaning of the equal sign is equality in the sense of the predicate calculus. In a mathematical context, the statement above is a predicate that is true if c equal the sum of a and b, and the statement is false otherwise. Thinking of assignment rather than equality in specification (which is both tempting and easy to do) can be very misleading.

8.4.6 The Before and After State

In declarative languages, a mechanism is needed that permits reference to pieces of the state both before an operation has been applied and after the operation has been applied. For example, if an integer variable is part of the state and the integer’s value is updated by an operation, reference to the variable both before the operation and afterwards is necessary in order to be able to describe the effect of the needed operation. If the variable is named i, then referring to i in two different ways is necessary, and so we need a notation that will identify the two circumstances.

In formal specification languages, a prime character is often used to indicate a reference to a variable, either before or after an operation. Which of the two uses the prime has depends on the particular language, although the use of the prime itself is almost universal.

As an example, suppose that an operation has to ensure that a variable named i in the system state has a value after the operation which is one more than its value before the operation. What that means is that the value of i in the state after the operation must be equal to the value of i in the before state plus one. Using the prime to designate the after state and the equal sign to mean equality, such a predicate might be expressed as:

i´ = i + 1

Note carefully that this is a predicate and that the symbol “=” means equality.

8.4.7 A Simple Specification Example

An example that illustrates the contrast between procedural, declarative, and natural languages is the following simple specification of integer square root. For an integer square root of a positive integer input operand, the computation has to determine the greatest integer that, when squared, in less than or equal to the input operand. Admittedly, this is not a very useful computation.

The desired calculation in this example is the integer square root of a quantity x that is assumed to have a value by some means. The output is computed in the variable y:

•  Procedural specification:

   This piece of program states how the computation should be carried out.

•  Declarative specification:

   This logic statement states what properties the input parameter has to have in order for this operation to be applicable and what we want the final state of the computation to be. Note the use of the prime on y to indicate the after state for the variable y

•  Natural language specification:

   The program shall read one positive input and write one output that is the integer square root of the input. The output squared shall be less than or equal to the input and the output+ 1 squared shall be greater than the input.

   The word “shall” is emphasized here because that has become common practice in the development of specifications in English. There is an informal and incomplete relationship between the use of “shall” and individual actions that the software has to perform. With the word “shall” emphasized, human review of specifications can be guided by the occurrence of the word.

8.5 Model-Based Specification

One of the problems with writing specifications in any notation is determining what the structure of the resulting specification should be. When writing a program in a high-level language, the language itself dictates what the overall form of a program has to be. Formal specification languages do the same thing for specifications.

In practice, the primary structure used with formal specification using declarative languages is model-based specification3. The name, model-based specification, derives from the fact that this structure is essentially a model of the desired system. This structure has two major parts:

FIGURE 8.4 The basic structure of a model-based specification.

A definition of the state upon which the desired computation will operate.
This definition uses the various mathematical structures from discrete mathematics, such as sets, relations, and functions.
The state definition is itself in two parts: (a) the definitions of a set of variables and their associated types and (b) a set of invariants on the variables in the state. The invariants are predicates that must always be true for any implementation that meets the specification. A common use of invariants is to limit the range of values that a variable can take where the type of the variable does not permit the limitation.

Definitions of the operations in the desired computation.
The computations are expressed using pre- and post-conditions on the state. The pre-condition for an operation is a predicate that states what has to be true about the state before the operation can be applied, and the post-condition for an operation is a predicate that states what has to be true about the state after the operation is applied.
The structure of an operation is, therefore, quite simple. The structure will consist of the declaration of variables that the operation either reads or modifies together with a conjunction of the pre-condition and the post-condition. The precondition and the post-condition might consist of many terms and so could be quite long and complex.

The general form of a model-based specification is shown in Figure 8.4. There is a lot of similarity between this structure and programs written in high-level languages. A high-level language program declares a collection of variables that will be manipulated and subprograms or methods that will operate on those variables. For many people, developing specifications in formal languages turns out to be fun in the same sense as programming, despite the use of discrete mathematical structures!

The assumption with a model-based specification is that the various operations that have been defined could be applied to the state at any point. Control of operation sequences is specified using the various pre-conditions.

8.5.1 Using a Model-Based Specification

A model-based specification is fundamentally a collection of predicates, one for each invariant and one for each operation. This is quite different from what one might see in an ad hoc natural-language specification. A question that then arises is: “How is the specification used?” The answer is twofold:

•  The specification defines for the developer precisely what the software has to do.

•  The specification can be analyzed to provide all of the stakeholders with increased confidence in the specification.

The basic principle of model-based specification is that the specification is a statement of what the software has to make true during execution. There is an implied statement associated with the specification that says:

“Build an implementation that makes the logic statements in the specification true.”

Thus, the software developer’s task is to write software to achieve this. In the case of the integer square root example (see Section 8.4.7), the output of a program could be anything, but, if the program is to implement the specification, then the output must make the predicate true. So if the input is 10, the output must be 3 because 32 is less than 10 and 42 is not. Similarly, the output could not be 2 because, although 22 is less than 10, 32 is also and the specification precludes that.

In developing a formal specification, we take advantage of the opportunity for analysis throughout the development. Thus, even when just a small part of the specification has been written, we can test what we have written to make sure that the embryonic specification obeys the language’s syntax rules. We can do the same with the types. Developing the specification is best undertaken iteratively, so we can repeat these analyses periodically.

After syntax and type checking, the third form of analysis that we have available is to establish properties of the specification. There are many properties that can be established, but a simple example is to make sure that the invariants cannot be violated. Recall that an invariant is a predicate defined on the variables of the state that must not be violated by the various operations.

As an example, suppose that there is an integer, i, in the state, and that there is also an invariant which requires the value of i to always be less than 10. In other words, the specification states something like this (in no specific language):

  If there were an operation that contained the predicate:

then an implementation of the operation should check that the invariant was maintained properly, but that check would require that the developer examine every operation and every invariant. If the implementation did not check, the invariant might be maintained during operation, or it might not. Maintenance of the invariant would depend on the actual sequence of operations that were executed.

The operation should have been specified to include either a pre-condition on the value of i or a check to make sure that the invariant would not be violated. A modification of the operation might then be:

In complex specification, making sure that such pre-conditions and checks are included in specifications cannot be completed properly and reliably by human inspection. The check can be completed mechanically, however. The approach is to state a theorem that the invariant will be maintained no matter what sequence of operations occurs, and then to try to prove that theorem using the statements in the specification as axioms. In the example above in its original form, attempts to prove such a theorem would fail.

8.6 The Declarative Language Z

Z (pronounced “zed”) is a formal language that was originally proposed by Jean-Raymond Abrial in 1977. Z was developed subsequently at various locations but primarily the University of Oxford in the United Kingdom. Z has been used widely in industrial applications. There are several textbooks available for Z (for example [108, 71, 154, 134]), and a few tools that can check the syntax and types4.

Z includes much of the discrete mathematics with which people are familiar, and the notation that Z uses is, for the most part, the same notation that is used in traditional discrete mathematics. Before discussing the use of Z, we review some of this mathematics using the Z syntax. This review is not a comprehensive introduction to the discrete mathematics of specification; the review is intended to be a refresher.

After the review of discrete mathematics, we look at the use of Z using two examples. This review of Z is also brief and covers only the major parts of the language. Z is a large language, and the details can be found in the texts mentioned above. The purpose of this survey is to look at specification in Z with sufficient depth that (a) you can see how a declarative language is used in specification, (b) become familiar enough with Z that you will be able to read and understand simple Z specifications, and (c) be ready to study Z further if necessary so as to be able to write Z specifications. With this background, you can easily acquire the details of the language using other sources.

8.6.1 Sets

A set is an unordered collection of items without replication. Sets can be defined either (a) by enumeration of all the elements as a list or (b) by comprehension in which a set is defined by selecting elements from an existing set. Here are some examples of each:

The first and second examples are sets defined by enumeration. The third example is a set comprehension using the Z notation with the following parts:

Sometimes we need to talk about a set for which we do not know the contents. In Z these are known as given sets, and their declaration just needs a name:

[files, users, cities]

In this example, three given sets are defined. The basic set operators are shown in Table 8.1:

TABLE 8.1. Basic set operators.

8.6.2 Propositions and Predicates

A proposition is a statement that is either true or false. Here are some examples in English:

Valid user ID entered.
Correct password entered.
The file system is full.
The printer is out of paper.
The backup file system is not operational.

Propositional variables are associated with propositions to allow expressions to be defined conveniently. Natural language text is not part of Z. The first proposition above might be represented by the symbol “ValidID” and the second by the symbol “CorrectPW“.

Separate propositions can be combined with the propositional operators:

The meaning of each of the propositional operators is defined by the associated truth tables shown in Table 8.2:

TABLE 8.2. Truth tables for the propositional operators.

Here are some examples of useful propositional expressions whose meanings can be inferred immediately:

A predicate is an expression that is either true or false when the variables in the expression are given values. Predicates are familiar from high-level-language programming, because predicates are used in Boolean expressions. Here are some examples:

Simple predicates like this can be combined with the propositional operators to form more complex and useful propositions such as this:

Z includes “decoration” of variable names in the form of special characters appended to the name, to indicate something of their purpose. Of particular importance are the “?” that indicates an input variable, and “!” that indicates an output variable. Here are some examples:

8.6.3 Quantifiers

Propositions and predicates are useful tools, and they allow us to define conditions that are important in the specification of software. But propositions that need to be defined over sets of values can become too large to consider writing down. The quantifiers solve the problem.

There are two common quantifiers:

The quantifiers allow us to define predicates that cover a complete set without having to enumerate all the elements of the set. The syntax of the quantifiers is:

with the following meanings:

declaration_of_bound_variables :

The declaration of variables whose scope is just the quantifier in which the declaration appears — the variables are “bound” to the quantifier.

predicate1 :

This predicate restricts the values of the bound variables.

predicate2 :

The predicate defines the truth value of the quantifier.

The symbols in the quantifiers can be read conveniently as English phrases:

Here is an example of an existential quantifier:

In this example, a bound variable file of type system is declared. The quantifier reads:

“There exists a file of type system such that the file is modified for which the size of the file is greater than 1,000 units.”

Here is an example of a universal quantifier:

In this example, the quantifier reads:

“For all files of type system such that the file has been modified it is the case that the file is backed up.”

Once you gain confidence with the quantifiers and have some practice, you will be able to take these literal translations into English and rearrange the spoken or textual form to be less jaded. For example, you might read the first quantifier as:

“There is a modified system file that is bigger than 1,000 units.”

Making this switch is fine as long as this informal version is not used in any engineering context other than you gaining insight into the formalism.

8.6.4 Cross Products

A cross product, denoted by the symbol x, is a set of 2-tuples, i.e., a set of items that are ordered pairs. The two elements of each ordered pair come from sets. The elements of the product take on the value of every element in the associated source sets. Here is an example:

Interesting cross products can be defined even if the two sets are the same. Consider, for example:

This cross product defines all the points in the upper right quadrant of the plane that have integer coordinates.

8.6.5 Relations, Sequences, and Functions

A relation is a cross product (and therefore a set of 2-tuples) in which the 2-tuples are restricted to those that are defined by a binary relation. The binary relation can be anything useful to us in which we need to define ordered pairs.

The definition of the relation can be by enumeration or be algebraic. The set from which the first element of each 2-tuple of a relation is selected is the source set, and the subset of the source set that actually is used in the relation is the domain of the relation. The set from which the second element of each 2-tuple is selected is the target set, and the subset of the target set that actually is used is the range of the relation. Here is an example of a relation defined by enumeration along with its domain and range:

A sequence is a relation whose first element set is the integers in order beginning with one. Z has a special syntax to allow sequences to be defined conveniently. Here is an example of a sequence and what it means as a relation:

Sequences are a convenient structure for modeling information that we might implement with an array.

A function is a relation in which each element of the domain maps to at most one element of the range. The concept of a function leads to a set of special types of function shown in Table 8.3:

TABLE 8.3. The various types of function and their representation symbol in Z.

8.6.6 Schemas

The basic structures of discrete mathematics provide much of the machinery needed for specification. What is needed next is a mechanism for organizing these structures into a form that can be used to describe large specifications. The mechanism used is Z is the schema.

A schema is a named graphic with two mathematical parts: the signature in which variables are declared and the predicate that is just a predicate on the variables. Both the signature and the predicate can refer to both before-state and after-state (primed) variables, thereby providing the facilities needed for model-based specification (see Section 8.5). An example of a simple schema is shown in Figure 8.5. The name of the schema is Counter. The schema declares a single variable called CounterValue that is an integer. From the type of CounterValue, 1, we know that the variable has to be positive. The predicate in the schema tells us that the value of the variable has to be less than or equal to 10.

Schemas whose predicates do not include after-state variables are used to define elements of the state in a model-based specification. Schemas whose predicates do include after-state variables are used to define operations. The predicate in a state schema defines an invariant on the state. Since the invariant might include several conjuncts, each involving a different variable, the predicate in a state schema is commonly thought of as a collection of invariants.

8.6.7 The Schema Calculus

Schemas have a precise syntax and semantics, and they can be manipulated as mathematical structures. The mechanism of this manipulation is the schema calculus. The schema calculus is a sophisticated and complex theory that we can only mention here. The schema calculus includes schema composition, schema expressions, and schema types. The calculus allows elaborate elements of specification to be defined simply and elegantly.

Decoration (such as the ? and ! decoration of variables described in Section 8.6.2) is also used for Z schemas in the form of prefix characters. The Delta symbol, ∆, when prepended to a state schema, defines a new schema consisting of the original together with the same text but with all variables primed. The invariant is included also with primed variables. This meaning is the result of using the ∆ decoration, and the specifier can use the resulting schema freely without defining it. An example of the definition of the schema ∆Counter is shown in Figure 8.6. The Xi symbol, , when prepended to a schema name, creates a schema that consists of the ∆ schema with an additional invariant that the state will not be changed. Including such decorated state schemas in operation schemas provides convenient access to the state and the invariant predicate.

FIGURE 8.5 A simple Z schema.

FIGURE 8.6 The definition of a schema including the ∆ decoration.

8.7 A Simple Example

Figure 8.7 is the same simple example that specifies integer sort which we saw in Figure 8.2 on page 227. The specification consists of a single schema. In the signature part, two variables are declared. The input variable, Input!, is a set of natural numbers. The output variable, Output], is a sequence of natural numbers.

The predicate is a single expression that is eight lines long and is in two major parts. The first two and the seventh lines of the predicate are the pre-condition for the operation, and the third, fourth, fifth, and eighth lines are the post-condition. Line six is just an operator.

The meaning of these lines in the predicate is:

FIGURE 8.7 Simple example of a declarative specification in Z.

•  Line 1 states that the size of the input set must be less than 100.

•  Line 2 states that the value of every element of the input set must be less than 1,000.
These two lines ensure that the program which implements the specification will have to deal with less than 100 numbers, each of which must be less than 1,000. If either line 1 or line 2 were false, then this part of the pre-condition would be false.

•  Line 3 states that every number that is in the input set must also be in the output sequence.

•  Line 4 states that every number in the output sequence must appear in the input set.

•  Line 5 states that all but the last number in the output sequence must be less than or equal to the next number in the sequence, i.e., the numbers in the sequence must be in sorted order.

•  Line 7 is the opposite of lines 1 and 2, and so line 7 will be true if either of lines 1 and 2 were false. This expression ensures that the total pre-condition for the specification is true. In other words, the resulting program should be able to deal with any input. This does not mean that the program can sort any file. A pre-condition that is a tautology means that the program will never fail to do something that was specified, no matter what the input is.

•  Line 8 states that a message is to be printed if the data is not suitable for sorting.

This example is reasonably complete, but makes a simplifying assumption that then makes the specification quite a bit simpler. The simplifying assumption is that there will be no repeated values in the input. This assumption is inherent in the fact that the input is stated to be a set. Uniqueness of the elements is a characteristic of sets.

The way that this simplifying assumption makes the specification simpler lies in the fact that the output of a sort has to be a permutation of the input. If there are no duplicate elements in the input, then the post-condition as stated in the example is sufficient. But if duplicate elements have to be included, the input structure can no longer be a set, and the post-condition stating that the output is a permutation of the input is correspondingly more difficult. Dealing with this is left as an exercise for the reader.

8.8 A Detailed Example

Suppose that a Z specification is required for a system to maintain a catalog of the books kept in a small library. To develop such a specification, just as with programming, things go much more smoothly if the specification is developed iteratively. Iterative development means developing a small part of the specification and checking it over carefully, and then repeating this process until the specification is complete. By proceeding iteratively, maintaining intellectual control of the specification is easier, and defects are much easier to locate.

FIGURE 8.8 The first iteration of the specification for a simple library catalog system.

Checking during each iteration includes visual inspection and analysis such as syntax and type checking. If the evolving specification passes visual inspection, syntax checking, and type checking, then many defects are likely to be absent, and the engineers building the specification will develop substantial confidence in the specification as they proceed.

To illustrate this iterative development approach, four versions of the library example are presented in the rest of this chapter. Each one either adds additional material to the specification or fixes a problem that comes up.

8.8.1 Version 1 of the Example

The first version of the specification is shown in Figure 8.8. At the top of the figure is a schema defining a given set, ISBN, that designates the complete set of unique numbers that are used to designate books, International Standard Book Number.

The second schema is the single state schema for the specification. In large specifications, the state might be defined by several schemas with various dependencies. In this first version of the example, a set of ISBNs will be used to model the catalog. The state schema includes a predicate which states that the cardinality of the set that models the catalog has to be less than or equal to 1,000. This invariant is to make sure that the catalog does not model more books than there is room for.

The third and fourth schemas are the two operation schemas. One describes how a book is added to the catalog; in the model, the book’s ISBN is added to the set of ISBNs. The second operation schema describes how removing a book from the catalog is modeled; the book’s ISBN is removed from the set of ISBNs. In both the third and fourth schemas, the post-condition of the operation is the complete predicate part of the schema.

Syntax and type checking of the specification reveal no errors. However, inspection of the operation schemas reveals three major functional mistakes:

•  Modeling the addition and removal of books is basically satisfactory, but the addition schema does not necessarily maintain the invariant. The implementation could allow more than 1,000 books to be added.

•  Adding a book to the library does not check to make sure that the book is not already in the library. Having multiple copies is something we might want to consider for an advanced version of the library software later, but in this specification, the library is modeled as a set of ISBNs, and so there are no duplicates.

•  Removal of a book does not check to see that the book to be removed is actually in the library. If the book is not in the library, there is a chance that any implementation of this specification will not check and will try to remove the book from the catalog anyway. The result could easily be a damaged data structure.

These issues can be seen clearly in this preliminary version of the specification, and you might think that we have not achieved very much. But we have started down a path of precision and clarity in specification. The benefits of this path include our ability to see issues such as those we have just discussed. A little experience with the concepts of formal specification and a growing familiarity with notations like Z allow engineers to detect defects in specification with considerable speed and accuracy.

8.8.2 Version 2 of the Example

The second version of the specification is shown in Figure 8.9. All three of the problems noted with the first version have been corrected. In order to maintain the invariant correctly, the pre-condition for the add operation has been extended to include a clause which is false if the number of books in the catalog is not less than 1,000. A clause has been added to the pre-condition of the add operation to check to see whether the book is already in the library. In a similar way, the pre-condition for the remove operation has been extended to include a clause that will only be true if the book to be removed is actually in the library.

Unfortunately, visual inspection of this second version reveals that there is a subtle but important mistake. The mistake is in the post-condition for the operation schema RemoveBook. The mistake derives from the fact that the role of the specification is to say:

“Develop an implementation to make the specification logic true.”

A careful examination of the post-condition for the RemoveBook schema reveals that the following implementation would satisfy the specification:

The reason that this implementation satisfies the specification is that this implementation makes the post-condition true even though this behavior is not what is desired. The post-condition is a disjunction, and the specification will be met by any implementation that makes either side of the disjunction true (remember, this is a logical “or”). By printing a message saying that the book is not in the library even when the book actually is, the program meets the specification by making line 5 of RemoveBook true. We need go no further with the implementation!

This might seem like a hypothetical case that would never arise in practice. Recognizing the mistake is easy with a specification of this size, but a mistake like this could be very difficult to find in a large specification. Thus, care with completeness of the pre- and post-conditions for operations is important.

8.8.3 Version 3 of the Example

The third version of the specification is shown in Figure 8.10. In this version, the error message about the remove operation failing will only be generated when indeed the book that is the subject of the removal is absent from the library.

Now that we have been through three iterations of the specification, we have a specification that passes visual inspection, and both syntax and type checking. The next thing to do is to add functionality to the specification so as to get closer to the final specification that we need.

FIGURE 8.9 The second iteration of the specification for a simple library catalog system.

FIGURE 8.10 The third iteration of the specification for a simple library catalog system.

FIGURE 8.11 The fourth iteration of the specification for a simple library catalog system.

8.8.4 Version 4 of the Example

The functionality to be added is a mechanism to lend books from the catalog to friends. To do this, we do not need to modify the operation schemas for adding and removing books. We will need additional state information, because we will have to keep track of loans and the state schema will need to be extended. We also need to add operation schemas for lending and for returning a book.

The first question that needs to be addressed in this iteration is the approach we will use to modeling books on loan. We will only lend books to friends, but a friend could borrow several books. A suitable way to model this is to add a set of friends and a function that maps each friend to a set of books that the friend has borrowed. The function is partial because only friends who have actually borrowed books are included, and that is unlikely to be all of them.

Version 4 of the specification is much longer than version 3, so only part of the specification is shown in Figure 8.11. A new given set, PeopleIKnow, has been added, and a new set, Friends, has been added to the state to model the friends of the system user. Finally, a function, Loans, has been added to the state schema. Loans is a function whose domain is the set Friends, and whose range is a set of sets with one for each element of the domain. Thus, each friend who is in the domain of Loans maps to a set of books that the friend has borrowed.

The only operation schema shown is the Borrow operation. The pre-condition for this operation checks to see whether the request is from a friend and whether the requested book is in the catalog. If either condition is false, then the specification says to print an error message. If the pre-condition is true, then the book requested is added to the set of books in the possession of the person who requested it.

8.9 Overview of Formal Specification Development

Developing a formal specification is a significant challenge, but the challenge is manageable if we proceed carefully. And the benefits are considerable, so facing the challenge is worthwhile.

Using a declarative language that was designed for the task, such as Z, the specification is developed using the model-based specification structure. That structure consists of a set of declarations that define the state and a set of operations that define changes on the state.

The state and the operations are a model of the system being specified. Just like a model car looks like a car, so the model of the system being specified “looks” like the system being specified. Model cars are often used by automobile manufacturers to define what the real car should look like before the car is built. Similarly, architects often use models to describe what a planned building will look like. Sometimes models are graphical, but they still play the role of guiding the creative process and facilitating analysis.

In principle, a model car could guide the builders of the actual car. Always keep in mind that, just as a model car is usually not drivable and not intended to be, a specification is not executable and is not intended to be. There are exceptions to this general rule called executable specifications. Such specifications are executable so that the model can be tested in various ways.

The state in a model-based specification is a set of declarations, each of which is a mathematical structure. Some are simple scalar quantities such as integers, and others are more complex like sets and functions.

Operations in a model-based specification are defined by describing the state as it must be before the operation is applied, the pre-condition, and the state after the operation is applied, the post-condition. The pre-condition limits applicability of the operation, and the post-condition defines the desired state change.

The keys to the successful development of a formal specification are iteration and analysis. The specification should be built in a series of steps where a partial specification is constructed at each step. The first step will result in a simple specification in which only part of the state is defined, and in which only a small number of operations are defined. The operations might even be incomplete. At each step the evolving specification is subject to human inspection, and syntax and type checking.

Finally, we return to the topic of presentation introduced in Section 8.4.3. The use of techniques to make a specification more readable is illustrated in the example discussed in Section 8.8.1 to Section 8.8.4. The primary Boolean operators in each expression are carefully isolated so that they stand out. Boolean operators on adjacent lines with similar roles in the specification are placed in the same column. Finally, white space is used to structure and organize the mathematical expressions for ease of understanding.

Key points in this chapter:

♦   Specification is a critical aspect of software development.

♦   Natural language has been shown to be problematic as a notation for specification.

♦   Several formal languages have been developed to permit the development of formal specifications.

♦   Formal specification languages are typically based on discrete mathematics and are often declarative.

♦   Z is a specification language in fairly common use.

♦   Formal specifications enable clearer communication between engineers.

♦   Z specifications can be analyzed to show various properties.

 

Exercises

1.  Write a predicate that is true for a year that is a leap year and false otherwise.

2.  Define the set of leap years.

3.  Write a pre-condition and a post-condition suitable for documenting a function that computes the maximum element of an array. Use any convenient array notation in the expressions.

4.  Assuming the Z given set [Senators] is the set of all members of the U.S. Senate, defining any other items that you need, and using a mathematical notation:

(a) Define the set consisting of all the Republican senators.

(b) Define a predicate that is true if and only if there are 50 Democratic senators.

5.  A computerized control system is used to control a building’s hot water system. A hot-water tank has digital temperature sensors at the top and at the bottom and two heating elements, one at the top and one at the bottom. The control system has access to a simple clock that provides time of day as a count of the number of minutes since midnight. To ensure an adequate supply of hot water, the system uses the following algorithm (all temperatures are Fahrenheit):

-   From 6:00 a.m. to 9:00 a.m. if the temperature at the top of the tank is below 180, the upper heater is turned on and if the temperature at the bottom of the tank is below 120, the bottom heater is turned on.

-   From 1:00 a.m. to 6:00 a.m. if either temperature is below 160, the top heater is turned on.

-   At all other times, if any temperature is below 180, the top heater is turned on, but if the top temperature is below 140, both heaters are turned on.
Write a formal specification for the water-tank control system in Z.

6.  The specification for the water heater in Exercise 5 is quite simple but realistic. If you had this specification and an implementation in C, how would you go about testing the implementation? Pay attention to the real-time elements of the problem and the other basic testing issues that make testing difficult.

7.  Define a suitable state and the pre- and post-conditions for each operation of a stack, i.e., develop a formal specification for a stack data structure.

8.  Define a formal specification for a simple program that records the MAC addresses of the computers owned by an organization. Assume each computer only has one address, and identify the computers using a serial number assigned by the organization. Include operations to add and delete computers, and to change the MAC address of an existing computer.

9.  Below is a possible Z state schema for the cruise-control application in Chapter 5, Exercise 13. The names of the variables indicate their purpose:

      Braking effect is linearly proportional to the brake pedal pressure, and brake pedal pressure is set as an integer in the range 0..255. If vehicle speed is above the set speed or the speed of the vehicle ahead, the brakes are to be applied with braking set to be proportional to the deviation from desired speed. If the radar detects an object at a distance less than 50% more than the vehicle’s stopping distance, then the brakes are to be applied at the maximum level. The vehicle stopping distance in feet is (current speed in mph * 1.18). Write a Z operation schema that specifies brake behavior.

10.  After testing the cruise-control system, developers learned that maximum braking to avoid a collision might injure the vehicle’s occupants. A linear accelerometer was added to the system, and the specification of emergency braking changed. The accelerometer returns an integer that is 100 times the acceleration in “g”s. Deceleration yields a negative integer from the accelerometer. If the absolute value of acceleration exceeds 1g, then emergency braking has to be reduced. The reduction has to be proportional to the measured deceleration so as to try to make deceleration -1g. Write a Z schema for emergency stopping.

11.  The tracking system in the robotic surgery system described in Chapter 3, Exercise 9 monitors infra-red LEDs attached to the patient and the cutting equipment to determine their locations. Each has several LEDs attached. Sometimes the tracking system cannot “see” some of the LEDs because they are blocked by people or equipment. Accuracy requires the following: (a) no more than one LED is blocked on the patient and no more than one on the cutting head, or (b) no more than three LEDs are blocked on the patient and none on the cutting head, or (c) no more than two LEDs are blocked on the cutting head and none on the patient. Write a Z specification that will signal “off” to the drive motors if any of these conditions arises.

12.  The software controlling the cutting mechanism in the robotic surgery system receives from the system’s guidance software a three dimensional absolute coordinate to which the cutting head is to be moved from its current location. Each coordinate is a vector of three integers measuring millimeters within an established frame of reference. Define suitable variables and write a pre-condition for the move operation which ensures that the requested move is (a) no further than delta millimeters from the current position, (b) no more than gamma millimeters travel distance in any dimension, and (c) that no coordinate value is greater than a maximum value of omega millimeters.

1. Details of Z are introduced in Section 8.6.

2. Details of Z are introduced in Section 8.6.

3. Model-based specification is not related to model-based development discussed in Chapter 7.

4. A useful tool developed by Anthony Hall for working with Z using Word on Microsoft Windows can be downloaded from: http://sourceforge.net/projects/zwordtools/

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.253.223