Preface

Programmers are inundated with information about application programming interfaces, or APIs. Yet, while most programmers use APIs and the libraries that implement them in almost every application they write, relatively few create and disseminate new, widely applicable APIs. Indeed, programmers seem to prefer to “roll their own” instead of searching for a library that might meet their needs, perhaps because it is easier to write application-specific code than to craft well-designed APIs.

I’m as guilty as the next programmer: lcc, a compiler for ANSI/ISO C written by Chris Fraser and myself, was built from the ground up. (lcc is described in A Retargetable C Compiler: Design and Implementation, Addison-Wesley, 1995.) A compiler exemplifies the kind of application for which it is possible to use standard interfaces and to create interfaces that are useful elsewhere. Examples include interfaces for memory management, string and symbol tables, and list manipulation. But lcc uses only a few routines from the standard C library, and almost none of its code can be used directly in other applications.

This book advocates a design methodology based on interfaces and their implementations, and it illustrates this methodology by describing 24 interfaces and their implementations in detail. These interfaces span a large part of the computing spectrum and include data structures, arithmetic, string processing, and concurrent programming. The implementations aren’t toys — they’re designed for use in production code. As described below, the source code is freely available.

There’s little support in the C programming language for the interface-based design methodology. Object-oriented languages, like C++ and Modula-3, have language features that encourage the separation of an interface from its implementation. Interface-based design is independent of any particular language, but it does require more programmer willpower and vigilance in languages like C, because it’s too easy to pollute an interface with implicit knowledge of its implementation and vice versa.

Once mastered, however, interface-based design can speed development time by building upon a foundation of general-purpose interfaces that can serve many applications. The foundation class libraries in some C++ environments are examples of this effect. Increased reuse of existing software — libraries of interface implementations — reduces initial development costs. It also reduces maintenance costs, because more of an application rests on well-tested implementations of general-purpose interfaces.

The 24 interfaces come from several sources, and all have been revised for this book. Some of the interfaces for data structures — abstract data types — originated in lcc code, and in implementations of the Icon programming language done in the late 1970s and early 1980s (see R. E. Griswold and M. T. Griswold, The Icon Programming Language, Prentice Hall, 1990). Others come from the published work of other programmers; the “Further Reading” sections at the end of each chapter give the details.

Some of the interfaces are for data structures, but this is not a data structures book, per se. The emphasis is more on algorithm engineering — packaging data structures for general use in applications — than on data-structure algorithms. Good interface design does rely on appropriate data structures and efficient algorithms, however, so this book complements traditional data structure and algorithms texts like Robert Sedgewick’s Algorithms in C (Addison-Wesley, 1990).

Most chapters describe one interface and its implementation; a few describe related interfaces. The “Interface” section in each chapter gives a concise, detailed description of the interface alone. For programmers interested only in the interfaces, these sections form a reference manual. A few chapters include “Example” sections, which illustrate the use of one or more interfaces in simple applications.

The “Implementation” section in each chapter is a detailed tour of the code that implements the chapter’s interface. In a few cases, more than one implementation for the same interface is described, which illustrates an advantage of interface-based design. These sections are most useful for those modifying or extending an interface or designing related interfaces. Many of the exercises explore design and implementation alternatives. It should not be necessary to read an “Implementation” section in order to understand how to use an interface.

The interfaces, examples, and implementations are presented as literate programs; that is, the source code is interleaved with its explanation in an order that best suits understanding the code. The code is extracted automatically from the text files for this book and assembled into the order dictated by the C programming language. Other book-length examples of literate programming in C include A Retargetable C Compiler and The Stanford GraphBase: A Platform for Combinatorial Computing by D. E. Knuth (Addison-Wesley, 1993).

Organization

The material in this book falls into the following broad categories:

Foundations
    1.  Introduction
    2.  Interfaces and Implementations
    4.  Exceptions and Assertions
    5.  Memory Management
    6.  More Memory Management

Data Structures
    7.  Lists
    8.  Tables
    9.  Sets
    10. Dynamic Arrays
    11. Sequences
    12. Rings
    13. Bit Vectors

Strings
    3.  Atoms
    14. Formatting
    15. Low-Level Strings
    16. High-Level Strings

Arithmetic
    17. Extended-Precision Arithmetic
    18. Arbitrary-Precision Arithmetic
    19. Multiple-Precision Arithmetic

Threads
    20. Threads

Most readers will benefit from reading all of Chapters 1 through 4, because these chapters form the framework for the rest of the book. The remaining chapters can be read in any order, although some of the later chapters refer to their predecessors.

Chapter 1 covers literate programming and issues of programming style and efficiency. Chapter 2 motivates and describes the interface-based design methodology, defines the relevant terminology, and tours two simple interfaces and their implementations. Chapter 3 describes the prototypical Atom interface, which is the simplest production-quality interface in this book. Chapter 4 introduces exceptions and assertions, which are used in every interface. Chapters 5 and 6 describe the memory management interfaces used by almost all the implementations. The rest of the chapters each describe an interface and its implementation.

Instructional Use

I assume that readers understand C at the level covered in undergraduate introductory programming courses, and have a working understanding of fundamental data structures at the level presented in texts like Algorithms in C. At Princeton, the material in this book is used in systems programming courses from the sophomore to first-year graduate levels. Many of the interfaces use advanced C programming techniques, such as opaque pointers and pointers to pointers, and thus serve as nontrivial examples of those techniques, which are useful in systems programming and data structure courses.
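As a rough illustration of the two techniques just mentioned, the sketch below shows an opaque pointer and a pointer to a pointer in one small example. The Counter interface and all of its names are hypothetical and are not one of the book's 24 interfaces; the two halves would normally live in separate files (a header and a .c file) and are shown together here only so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Interface (hypothetical; would live in a header such as counter.h) ---- */

/* Opaque pointer: clients see only a pointer to an incomplete struct type,
   so they cannot depend on the representation. */
typedef struct Counter_T *Counter_T;

Counter_T Counter_new(void);
void      Counter_free(Counter_T *c);  /* pointer to pointer: clears the caller's handle */
int       Counter_next(Counter_T c);

/* ---- Implementation (would live in counter.c) ---- */

/* Only the implementation knows the fields. */
struct Counter_T {
    int value;
};

Counter_T Counter_new(void) {
    Counter_T c = malloc(sizeof *c);
    assert(c != NULL);          /* allocation failure is announced, not ignored */
    c->value = 0;
    return c;
}

void Counter_free(Counter_T *c) {
    assert(c != NULL && *c != NULL);
    free(*c);
    *c = NULL;                  /* the caller's variable no longer dangles */
}

int Counter_next(Counter_T c) {
    assert(c != NULL);          /* passing a null handle is a checked runtime error */
    return ++c->value;
}
```

A client would write Counter_T c = Counter_new(); call Counter_next(c) as needed; and finish with Counter_free(&c), after which c is NULL. Because the struct is declared only in the implementation, the representation can change without recompiling clients that use the handle.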

This book can be used for courses in several ways, the simplest being in project-oriented courses. In a compiler course, for example, students often build a compiler for a toy language. Substantial projects are common in graphics courses as well. Many of the interfaces can simplify the projects in these kinds of courses by eliminating some of the grunt programming needed to get such projects off the ground. This usage helps students realize the enormous savings that reuse can bring to a project, and it often induces them to try interface-based design for their own parts of the project. This latter effect is particularly valuable in team projects, because working in teams is a way of life in the “real world.”

Interfaces and implementations are the focus of Princeton’s sophomore-level systems programming course. Assignments require students to be interface clients, implementors, and designers. In one assignment, for example, I distribute Section 8.1’s Table interface, the object code for its implementation, and the specifications for Section 8.2’s word frequency program, wf. The students must implement wf using only my object code for Table. In the next assignment, they get the object code for wf, and they must implement Table. Sometimes, I reverse these assignments, but both orders are eye-openers for most students. They are unaccustomed to having only object code for major parts of their program, and these assignments are usually their first exposure to the semiformal notation used in interfaces and program specification.

Initial assignments also introduce checked runtime errors and assertions as integral parts of interface specifications. Again, it takes a few assignments before students begin to appreciate the value of these concepts. I forbid “unannounced” crashes; that is, crashes that are not announced by an assertion failure diagnostic. Programs that crash get a grade of zero. This penalty may seem unduly harsh, but it gets the students’ attention. They also gain an appreciation of the advantages of safe languages, like ML and Modula-3, in which unannounced crashes are impossible. (This grading policy is less harsh than it sounds, because in multipart assignments, only the offending part is penalized, and different assignments have different weights. I’ve given many zeros, but none has ever caused a course grade to shift by a whole point.)

Once students have a few interfaces under their belts, later assignments ask them to design new interfaces and to live with their design choices. For example, one of Andrew Appel’s favorite assignments is a primality testing program. Students work in groups to design the interfaces for the arbitrary-precision arithmetic that is needed for this assignment. The results are similar to the interfaces described in Chapters 17 through 19. Different groups design interfaces, and a postassignment comparison of these interfaces, in which the groups critique one another’s work, is always quite revealing. Kai Li accomplishes similar goals with a semester-long project that builds an X-based editor using the Tcl/Tk system (J. K. Ousterhout, Tcl and the Tk Toolkit, Addison-Wesley, 1994) and editor-specific interfaces designed and implemented by the students. Tk itself provides another good example of interface-based design.

In advanced courses, I usually package assignments as interfaces and give the students free rein to revise and improve on them, and even to change the goals of the assignment. Giving them a starting point reduces the time required for assignments, and allowing substantial changes encourages creative students to explore alternatives. The unsuccessful alternatives are often more educational than the successful ones. Students invariably go down the wrong road, and they pay for it with greatly increased development time. When, in hindsight, they understand their mistakes, they come to appreciate that designing good interfaces is hard, but worth the effort, and they almost always become converts to interface-based design.

How to Get the Software

The software in this book has been tested on the following platforms:

Processor     Operating Systems              Compilers
SPARC         SunOS 4.1                      lcc 3.5, gcc 2.7.2
Alpha         OSF/1 3.2A                     lcc 4.0, gcc 2.6.3, cc
MIPS R3000    IRIX 5.3                       lcc 3.5, gcc 2.6.3, cc
MIPS R3000    Ultrix 4.3                     lcc 3.5, gcc 2.5.7
Pentium       Windows 95, Windows NT 3.51    Microsoft Visual C/C++ 4.0

A few of the implementations are machine-specific; they assume that the machine has two’s-complement integer and IEEE floating-point arithmetic, and that unsigned longs can hold object pointers.
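The assumptions above can be probed on a given machine. The following is a minimal sketch, not code from the distribution; the function names are hypothetical, and the floating-point test is only a heuristic based on the parameters in <float.h>, not a proof of IEEE conformance.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>

/* In two's complement, -1 has every bit set, so its low two bits are 11.
   Ones' complement would give 10 and sign-magnitude would give 01. */
int is_twos_complement(void) {
    return (-1 & 3) == 3;
}

/* An unsigned long can hold an object pointer only if it is at least as
   wide as void *. This is false on some platforms, for example those
   with 32-bit longs and 64-bit pointers. */
int ulong_holds_pointers(void) {
    return sizeof(unsigned long) >= sizeof(void *);
}

/* Heuristic: a double with radix 2, a 53-bit significand, and a maximum
   exponent of 1024 has the shape of an IEEE 754 double. */
int looks_like_ieee_double(void) {
    return FLT_RADIX == 2 && DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024;
}

/* Announce a violated assumption with an assertion failure instead of
   letting the program misbehave silently later. */
void require_twos_complement(void) {
    assert(is_twos_complement());
}
```

A program that depends on these properties could call such checks once at startup, so that a port to an unusual machine fails with a diagnostic rather than with wrong answers.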

The source code for everything in this book is available for anonymous ftp at ftp.cs.princeton.edu in pub/packages/cii. Use an ftp client to connect to ftp.cs.princeton.edu, change to the directory pub/packages/cii, and download the file README, which describes the contents of the directory and how to download the distribution.

The most recent distributions are usually in files with names like ciixy.tar.gz or ciixy.zip, where xy is the version number; for example, 10 is version 1.0. ciixy.tar.gz is a UNIX tar file compressed with gzip, and ciixy.zip is a ZIP file compatible with PKZIP version 2.04g. The files in ciixy.zip are DOS/Windows text files; that is, their lines end with carriage returns and linefeeds. ciixy.zip may also be available on America Online, CompuServe, and other online services.

Information is also available on the World Wide Web at the URL http://www.cs.princeton.edu/software/cii/. This page includes instructions on reporting bugs.
