Chapter 3: Program Inspections, Walkthroughs, and Reviews

For many years, most of us in the programming community worked under the assumptions that programs are written solely for machine execution, and are not intended for people to read, and that the only way to test a program is to execute it on a machine. This attitude began to change in the early 1970s through the efforts of program developers who first saw the value in reading code as part of a comprehensive testing and debugging regimen.

Today, not all testers of software applications read code, but the concept of studying program code as part of a testing effort certainly is widely accepted. Several factors may affect the likelihood that a given testing and debugging effort will include people actually reading program code: the size or complexity of the application, the size of the development team, the timeline for application development (whether the schedule is relaxed or intense, for example), and, of course, the background and culture of the programming team.

For these reasons, we will discuss the process of noncomputer-based testing (“human testing”) before we delve into the more traditional computer-based testing techniques. Human testing techniques are quite effective in finding errors—so much so that every programming project should use one or more of these techniques. You should apply these methods between the time the program is coded and when computer-based testing begins. You also can develop and apply analogous methods at earlier stages in the programming process (such as at the end of each design stage), but these are outside the scope of this book.

Before we begin the discussion of human testing techniques, take note of this important point: Because the involvement of humans results in less formal methods than mathematical proofs conducted by a computer, you may feel skeptical that something so simple and informal can be useful. Just the opposite is true. These informal techniques don't get in the way of successful testing; rather, they contribute substantially to productivity and reliability in two major ways.

First, it is generally recognized that the earlier errors are found, the lower the cost of correcting them and the higher the probability that the corrections themselves are right. Second, programmers seem to experience a psychological shift when computer-based testing commences. Internally induced pressures seem to build rapidly and there is a tendency to want to “fix this darn bug as soon as possible.” Because of these pressures, programmers tend to make more mistakes when correcting an error found during computer-based testing than they make when correcting an error found earlier.

Inspections and Walkthroughs

The three primary human testing methods are code inspections, walkthroughs, and user (or usability) testing. We cover the first two of these, which are code-oriented methods, in this chapter. These methods can be used at virtually any stage of software development, either after an application is deemed to be complete or as each module or unit is completed (see Chapter 5 for more information on module testing). We discuss user testing in detail in Chapter 7.

The two code inspection methods have a lot in common, so we will discuss their similarities together. Their differences are enumerated in subsequent sections.

Inspections and walkthroughs involve a team of people reading or visually inspecting a program. With either method, participants must conduct some preparatory work. The climax is a “meeting of the minds” at a participant conference. The objective of the meeting is to find errors but not to find solutions to the errors; that is, to test, not debug.

Code inspections and walkthroughs have been widely used for some time. In our opinion, the reason for their success is related to some of the principles identified in Chapter 2.

In a walkthrough, a group of developers—with three or four being an optimal number—performs the review. Only one of the participants is the author of the program. Therefore, the majority of program testing is conducted by people other than the author, which follows testing principle 2, which states that an individual is usually ineffective in testing his or her own program. (Refer to Chapter 2, Table 2.1, and the subsequent discussion for all 10 program testing principles.)

An inspection or walkthrough is an improvement over the older desk-checking process (whereby a programmer reads his or her own program before testing it). Inspections and walkthroughs are more effective, again because people other than the program's author are involved in the process.

Another advantage of walkthroughs, resulting in lower debugging (error-correction) costs, is that when an error is found, it usually is located precisely in the code, as opposed to black box testing, where you receive only an unexpected result. Moreover, this process frequently exposes a batch of errors, allowing the errors to be corrected later en masse. Computer-based testing, on the other hand, normally exposes only a symptom of the error (e.g., the program does not terminate or the program prints a meaningless result), and errors are usually detected and corrected one by one.

These human testing methods generally are effective in finding from 30 to 70 percent of the logic-design and coding errors in typical programs. They are not effective, however, in detecting high-level design errors, such as errors made in the requirements analysis process. Note that a success rate of 30 to 70 percent doesn't mean that up to 70 percent of all errors might be found. Recall from Chapter 2 that we can never know the total number of errors in a program. Thus, what this means is that these methods are effective in finding up to 70 percent of all errors found by the end of the testing process.

Of course, a possible criticism of these statistics is that the human processes find only the “easy” errors (those that would be trivial to find with computer-based testing) and that the difficult, obscure, or tricky errors can be found only by computer-based testing. However, some testers using these techniques have found that the human processes tend to be more effective than the computer-based testing processes in finding certain types of errors, while the opposite is true for other types of errors (e.g., uninitialized variables versus divide by zero errors). The implication is that inspections/walkthroughs and computer-based testing are complementary; error-detection efficiency will suffer if one or the other is not present.

Finally, although these processes are invaluable for testing new programs, they are of equal, or even higher, value in testing modifications to programs. In our experience, modifying an existing program is a process that is more error prone (in terms of errors per statement written) than writing a new program. Therefore, program modifications also should be subjected to these testing processes as well as regression testing techniques.

Code Inspections

A code inspection is a set of procedures and error-detection techniques for group code reading. Most discussions of code inspections focus on the procedures, forms to be filled out, and so on. Here, after a short summary of the general procedure, we will focus on the actual error-detection techniques.

Inspection Team

An inspection team usually consists of four people. The first of the four plays the role of moderator, which in this context is tantamount to that of a quality-control engineer. The moderator is expected to be a competent programmer, but he or she is not the author of the program and need not be acquainted with the details of the program. Moderator duties include:

Distributing materials for, and scheduling, the inspection session.
Leading the session.
Recording all errors found.
Ensuring that the errors are subsequently corrected.

The second team member is the programmer. The remaining team members usually are the program's designer (if different from the programmer) and a test specialist. The specialist should be well versed in software testing and familiar with the most common programming errors, which we discuss later in this chapter.

Inspection Agenda

Several days in advance of the inspection session, the moderator distributes the program's listing and design specification to the other participants. The participants are expected to familiarize themselves with the material prior to the session. During the session, two activities occur:

1. The programmer narrates, statement by statement, the logic of the program. During the discourse, other participants should raise questions, which should be pursued to determine whether errors exist. It is likely that the programmer, rather than the other team members, will find many of the errors identified during this narration. In other words, the simple act of reading aloud a program to an audience seems to be a remarkably effective error-detection technique.

2. The program is analyzed with respect to checklists of historically common programming errors (such a checklist is discussed in the next section).

The moderator is responsible for ensuring that the discussions proceed along productive lines and that the participants focus their attention on finding errors, not correcting them. (The programmer corrects errors after the inspection session.)

Upon the conclusion of the inspection session, the programmer is given a list of the errors uncovered. If more than a few errors were found, or if any of the errors require a substantial correction, the moderator might make arrangements to reinspect the program after those errors have been corrected. This subsequent list of errors is also analyzed, categorized, and used to refine the error checklist to improve the effectiveness of future inspections.

As stated, this inspection process usually concentrates on discovering errors, not correcting them. That said, some teams may find that when a minor problem is discovered, two or three people, including the programmer responsible for the code, may propose design changes to handle this special case. The discussion of this minor problem may, in turn, focus the group's attention on that particular area of the design. During the discussion of the best way to alter the design to handle this minor problem, someone may notice a second problem. Now that the group has seen two problems related to the same aspect of the design, comments likely will come thick and fast, with interruptions every few sentences. In a few minutes, this whole area of the design could be thoroughly explored, and any problems made obvious.

The time and location of the inspection should be planned to prevent all outside interruptions. The optimal amount of time for the inspection session appears to be from 90 to 120 minutes. The session is a mentally taxing experience; thus, longer sessions tend to be less productive. Most inspections proceed at a rate of approximately 150 program statements per hour. For that reason, large programs should be examined over multiple inspections, each dealing with one or several modules or subroutines.
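
The pacing figures above imply a simple planning calculation. The following Java sketch estimates how many sessions a program might need. The class and method names are hypothetical, and both the rate and the session length are parameters because the 150-statements-per-hour figure is only an approximate average.

```java
// Hypothetical helper: estimates how many inspection sessions a program needs.
final class InspectionPlanner {
    // statementsPerHour and sessionMinutes are planning assumptions, not fixed rules.
    static int sessionsNeeded(int statements, int statementsPerHour, int sessionMinutes) {
        if (statements < 0 || statementsPerHour <= 0 || sessionMinutes <= 0) {
            throw new IllegalArgumentException(
                "statements must be non-negative; rate and session length must be positive");
        }
        // Statements the team can cover in one session at the given rate.
        double perSession = statementsPerHour * (sessionMinutes / 60.0);
        // Round up: a partially filled session is still a session.
        return (int) Math.ceil(statements / perSession);
    }

    public static void main(String[] args) {
        // 1,200 statements at 150 per hour in 120-minute sessions.
        System.out.println(sessionsNeeded(1200, 150, 120)); // prints 4
        // The same program in 90-minute sessions.
        System.out.println(sessionsNeeded(1200, 150, 90));  // prints 6
    }
}
```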

Human Agenda

Note that for the inspection process to be effective, the testing group must adopt an appropriate attitude. If, for example, the programmer views the inspection as an attack on his or her character and adopts a defensive posture, the process will be ineffective. Rather, the programmer must leave his or her ego at the door and place the process in a positive and constructive light, keeping in mind that the objective of the inspection is to find errors in the program and, thus, improve the quality of the work. For this reason, most people recommend that the results of an inspection be a confidential matter, shared only among the participants. In particular, if managers somehow make use of the inspection results (to assume or imply that the programmer is inefficient or incompetent, for example), the purpose of the process may be defeated.

Side Benefits of the Inspection Process

The inspection process has several beneficial side effects, in addition to its main effect of finding errors. For one, the programmer usually receives valuable feedback concerning programming style, choice of algorithms, and programming techniques. The other participants gain in a similar way by being exposed to another programmer's errors and programming style. In general, this type of software testing helps reinforce a team approach to this particular project and to projects that involve these participants in general. Reducing the potential for the evolution of an adversarial relationship, in favor of a cooperative, team approach to projects, can lead to more efficient and reliable program development.

Finally, the inspection process is a way of identifying early the most error-prone sections of the program, helping to focus attention more directly on these sections during the computer-based testing processes (number 9 of the testing principles given in Chapter 2).

An Error Checklist for Inspections

An important part of the inspection process is the use of a checklist to examine the program for common errors. Unfortunately, some checklists concentrate more on issues of style than on errors (e.g., “Are comments accurate and meaningful?” and “Are if-else code blocks, and do-while groups aligned?”), and the error checks are too nebulous to be useful (such as, “Does the code meet the design requirements?”). The checklist in this section, divided into eight categories, was compiled after many years of study of software errors. It is largely language-independent, meaning that most of the errors can occur with any programming language. You may wish to supplement this list with errors peculiar to your programming language and with errors detected after completing the inspection process.

Data Reference Errors

Does a referenced variable have a value that is unset or uninitialized? This probably is the most frequent programming error, occurring in a wide variety of circumstances. For each reference to a data item (variable, array element, field in a structure), attempt to “prove” informally that the item has a value at that point.
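
As a hedged illustration in Java: the compiler rejects reads of unassigned local variables, but array elements and fields silently receive default values (null or zero), so the unset-value error can still reach a use. The class and method names below are hypothetical.

```java
final class UnsetValueDemo {
    // Returns the total length of all names; fails if any element was never set.
    static int totalLength(String[] names) {
        int total = 0;
        for (String name : names) {
            total += name.length(); // NullPointerException if name is still null
        }
        return total;
    }

    // Defensive variant: reports an unset (null) element as an explicit error.
    static int totalLengthChecked(String[] names) {
        int total = 0;
        for (int i = 0; i < names.length; i++) {
            if (names[i] == null) {
                throw new IllegalStateException("names[" + i + "] was never assigned");
            }
            total += names[i].length();
        }
        return total;
    }

    public static void main(String[] args) {
        String[] names = new String[3]; // all three elements default to null
        names[0] = "Ada";
        names[1] = "Grace";
        // names[2] is never assigned: the unset item the checklist asks about.
        try {
            totalLength(names);
        } catch (NullPointerException e) {
            System.out.println("unset element reached a use");
        }
    }
}
```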

For all array references, is each subscript value within the defined bounds of the corresponding dimension?

For all array references, does each subscript have an integer value? This is not necessarily an error in all languages, but, in general, working with noninteger array references is a dangerous practice.

For all references through pointer or reference variables, is the referenced memory currently allocated? This is known as the “dangling reference” problem. It occurs in situations where the lifetime of a pointer is greater than the lifetime of the referenced memory. One instance occurs where a pointer references a local variable within a procedure, the pointer value is assigned to an output parameter or a global variable, the procedure returns (freeing the referenced location), and later the program attempts to use the pointer value. In a manner similar to checking for the prior errors, try to prove informally that, in each reference using a pointer variable, the referenced memory exists.

When a memory area has alias names with differing attributes, does the data value in this area have the correct attributes when referenced via one of these names? Situations to look for are the use of the EQUIVALENCE statement in Fortran and the REDEFINES clause in COBOL. As an example, a Fortran program contains a real variable A and an integer variable B; both are made aliases for the same memory area by using an EQUIVALENCE statement. If the program stores a value into A and then references variable B, an error is likely present since the machine would use the floating-point bit representation in the memory area as an integer.
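
Java has no EQUIVALENCE statement, but the same effect can be sketched with Float.floatToIntBits, which exposes the bits of a floating-point value as an integer. The example below is only an analogy to the Fortran case, and its class and method names are hypothetical.

```java
final class AliasDemo {
    // Views the 32 bits of a float as an int, as an aliased integer name would.
    static int reinterpretBits(float value) {
        return Float.floatToIntBits(value);
    }

    // A proper conversion, by contrast, converts the numeric value.
    static int convertValue(float value) {
        return (int) value;
    }

    public static void main(String[] args) {
        float a = 1.0f;
        System.out.println(reinterpretBits(a)); // prints 1065353216 (0x3F800000)
        System.out.println(convertValue(a));    // prints 1
    }
}
```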

Sidebar 3.1: History of COBOL and Fortran

COBOL and Fortran are older programming languages that have fueled business and scientific software development for generations of computer hardware, operating systems and programmers.

COBOL (an acronym for COmmon Business Oriented Language) first was defined about 1959 or 1960, and was designed to support business application development on mainframe class computers. The original specification included aspects of other existing languages at the time. Big-name computer manufacturers and representatives of the federal government participated in this effort to create a business-oriented programming language that could run on a variety of hardware and operating system platforms.

COBOL language standards have been reviewed and updated over the years. By 2002, COBOL was available for most current operating platforms, including object-oriented versions supporting the .NET development environment.

As of the time of this writing, the latest version of COBOL is Visual COBOL 2010.

Fortran (originally FORTRAN, but modern references generally follow the uppercase/lowercase syntax) is a little older than COBOL, with early specifications defined in the early to middle 1950s. Like COBOL, Fortran was designed for specific types of mainframe application development, but in the scientific and numerical management arenas. The name derives from an existing IBM system at the time, the Mathematical FORmula TRANslating System. Although the original Fortran contained only 32 statements, it marked a significant improvement over assembly-level programming that preceded it.

The current version as of the publication date of this book is Fortran 2008, formally approved by the appropriate standard committees in 2010. Like COBOL, the evolution of Fortran added support for a broad range of hardware and operating system platforms. However, Fortran is probably used more in current development—as well as older system maintenance—than COBOL.

Does a variable's value have a type or attribute other than what the compiler expects? This situation might occur where a C or C++ program reads a record into memory and references it by using a structure, but the physical representation of the record differs from the structure definition.

Are there any explicit or implicit addressing problems if, on the computer being used, the units of memory allocation are smaller than the units of addressable memory? For instance, in some environments, fixed-length bit strings do not necessarily begin on byte boundaries, but addresses point only to byte boundaries. If a program computes the address of a bit string and later refers to the string through this address, the wrong memory location may be referenced. This situation also could occur when passing a bit-string argument to a subroutine.

If pointer or reference variables are used, does the referenced memory location have the attributes the compiler expects? An example of such an error is where a C++ pointer upon which a data structure is based is assigned the address of a different data structure.

If a data structure is referenced in multiple procedures or subroutines, is the structure defined identically in each procedure?

When indexing into a string, are the limits of the string off by one in indexing operations or in subscript references to arrays?
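
A minimal Java sketch of the string case: valid indexes run from 0 to length() - 1, so using length() itself as an index is off by one. The helper name lastChar is hypothetical.

```java
final class StringBoundsDemo {
    // Returns the last character; valid indexes run from 0 to length() - 1.
    static char lastChar(String s) {
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty string has no last character");
        }
        return s.charAt(s.length() - 1);
    }

    public static void main(String[] args) {
        String s = "test";
        System.out.println(lastChar(s)); // prints t
        try {
            s.charAt(s.length()); // off by one: index 4 is past the end
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("index " + s.length() + " is out of range");
        }
    }
}
```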

For object-oriented languages, are all inheritance requirements met in the implementing class?

Data Declaration Errors

Have all variables been explicitly declared? A failure to do so is not necessarily an error, but is, nevertheless, a common source of trouble. For instance, if a program subroutine receives an array parameter, and fails to define the parameter as an array (as in a DIMENSION statement), a reference to the array (such as C=A(I)) is interpreted as a function call, leading to the machine's attempting to execute the array as a program. Also, if a variable is not explicitly declared in an inner procedure or block, is it understood that the variable is shared with the enclosing block?

If all attributes of a variable are not explicitly stated in the declaration, are the defaults well understood? For instance, the default attributes that a variable receives in Java are often a source of surprise when they are not stated explicitly.

Where a variable is initialized in a declarative statement, is it properly initialized? In many languages, initialization of arrays and strings is somewhat complicated and, hence, error prone.

Is each variable assigned the correct length and data type?

Is the initialization of a variable consistent with its memory type? For instance, if a variable in a Fortran subroutine needs to be reinitialized each time the subroutine is called, it must be initialized with an assignment statement rather than a DATA statement.

Are there any variables with similar names (e.g., VOLT and VOLTS)? This is not necessarily an error, but it should be seen as a warning that the names may have been confused somewhere within the program.

Computation Errors

Are there any computations using variables having inconsistent (such as nonarithmetic) data types?

Are there any mixed-mode computations? An example is when working with floating-point and integer variables. Such occurrences are not necessarily errors, but they should be explored carefully to ensure that the conversion rules of the language are understood. Consider the following Java snippet showing the truncation that can occur when dividing integers:

int x = 1;
int y = 2;
int z = 0;
z = x / y;   // integer division discards the fractional part, so 0.5 becomes 0
System.out.println("z = " + z);

OUTPUT:

z = 0

Are there any computations using variables having the same data type but of different lengths?

Is the data type of the target variable of an assignment smaller than the data type or result of the right-hand expression?
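
In Java, such an assignment requires an explicit cast, which then silently discards the bits or the fractional part that do not fit. The sketch below shows both effects plus a hypothetical range-checked alternative, toByteChecked.

```java
final class NarrowingDemo {
    // Narrows with an explicit range check instead of silently discarding bits.
    static byte toByteChecked(int value) {
        if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
            throw new ArithmeticException(value + " does not fit in a byte");
        }
        return (byte) value;
    }

    public static void main(String[] args) {
        int wide = 200;
        byte narrow = (byte) wide;     // keeps only the low 8 bits
        System.out.println(narrow);    // prints -56
        int truncated = (int) 3.99;    // fractional part is discarded
        System.out.println(truncated); // prints 3
    }
}
```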

Is an overflow or underflow possible during the computation of an expression? That is, the end result may appear to have a valid value, but an intermediate result might be too big or too small for the programming language's data types.
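
A common instance is the midpoint calculation, sketched here in Java with hypothetical method names. The final answer fits comfortably in an int, but the intermediate sum does not. The safe variant assumes 0 <= low <= high.

```java
final class OverflowDemo {
    // Naive midpoint: low + high can exceed Integer.MAX_VALUE and wrap around.
    static int midpointNaive(int low, int high) {
        return (low + high) / 2;
    }

    // Safer midpoint, assuming 0 <= low <= high: no intermediate value exceeds high.
    static int midpointSafe(int low, int high) {
        return low + (high - low) / 2;
    }

    public static void main(String[] args) {
        int low = 2_000_000_000;
        int high = 2_100_000_000;
        System.out.println(midpointNaive(low, high)); // prints -97483648
        System.out.println(midpointSafe(low, high));  // prints 2050000000
    }
}
```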

Is it possible for the divisor in a division operation to be zero?
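
In Java, the consequences differ by type: integer division by zero throws an exception, whereas floating-point division by zero quietly yields Infinity. The sketch below shows both, along with a hypothetical average method that makes the zero case an explicit decision.

```java
final class DivisionDemo {
    // Returns the integer average, treating an empty count as an explicit error.
    static int average(int sum, int count) {
        if (count == 0) {
            throw new IllegalArgumentException("cannot average zero items");
        }
        return sum / count;
    }

    public static void main(String[] args) {
        int zero = 0;
        try {
            System.out.println(10 / zero);
        } catch (ArithmeticException e) {
            System.out.println("integer division by zero throws");
        }
        System.out.println(10.0 / zero); // prints Infinity
    }
}
```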

If the underlying machine represents variables in base-2 form, are there any consequences of the resulting inaccuracy? That is, 10 * 0.1 is rarely equal to 1.0 on a binary machine.
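
The effect is easiest to see when the error accumulates. In the Java sketch below (method names are hypothetical), adding 0.1 ten times in binary floating point does not produce 1.0, while the same sum in decimal arithmetic does.

```java
import java.math.BigDecimal;

final class BinaryFractionDemo {
    // Adds 0.1 to itself n times using binary floating point.
    static double sumTenths(int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += 0.1;
        }
        return sum;
    }

    // The same sum in decimal arithmetic, which represents 0.1 exactly.
    static BigDecimal sumTenthsExact(int n) {
        BigDecimal sum = BigDecimal.ZERO;
        BigDecimal tenth = new BigDecimal("0.1");
        for (int i = 0; i < n; i++) {
            sum = sum.add(tenth);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumTenths(10));        // prints 0.9999999999999999
        System.out.println(sumTenths(10) == 1.0); // prints false
        System.out.println(sumTenthsExact(10));   // prints 1.0
    }
}
```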

Where applicable, can the value of a variable go outside the meaningful range? For example, statements assigning a value to the variable PROBABILITY might be checked to ensure that the assigned value will always be positive and not greater than 1.0.
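
One way to make such a range assumption checkable is a small guard at the point of assignment. The Java sketch below uses a hypothetical requireProbability helper and follows the range stated above (positive and at most 1.0). Some applications also allow 0.0, which would need a different lower bound.

```java
final class ProbabilityCheck {
    // Rejects values outside the range given in the text: positive and at most 1.0.
    static double requireProbability(double p) {
        // The negated form also rejects NaN, which fails every comparison.
        if (!(p > 0.0 && p <= 1.0)) {
            throw new IllegalArgumentException("probability out of range: " + p);
        }
        return p;
    }

    public static void main(String[] args) {
        double probability = requireProbability(0.25);
        System.out.println(probability); // prints 0.25
        try {
            requireProbability(1.5);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints probability out of range: 1.5
        }
    }
}
```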

For expressions containing more than one operator, are the assumptions about the order of evaluation and precedence of operators correct?

Are there any invalid uses of integer arithmetic, particularly divisions? For instance, if i is an integer variable, whether the expression 2*i/2 == i holds depends on whether i has an odd or an even value and whether the multiplication or division is performed first.
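
The two evaluation orders can be compared directly. In this Java sketch (hypothetical method names), the results agree for even values and differ for odd ones.

```java
final class IntegerOrderDemo {
    // Multiplication first: (2 * i) / 2
    static int multiplyFirst(int i) {
        return 2 * i / 2;
    }

    // Division first: (i / 2) * 2, which loses the remainder for odd i.
    static int divideFirst(int i) {
        return i / 2 * 2;
    }

    public static void main(String[] args) {
        System.out.println(multiplyFirst(7)); // prints 7
        System.out.println(divideFirst(7));   // prints 6
        System.out.println(divideFirst(8));   // prints 8
    }
}
```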

Comparison Errors

Are there any comparisons between variables having different data types, such as comparing a character string to an address, date, or number?
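
A frequent variant of this problem is comparing numbers that are stored as character strings. The Java sketch below, with hypothetical method names, shows that the textual and numeric orderings of "10" and "9" disagree.

```java
final class MixedComparisonDemo {
    // Compares as text, character by character.
    static boolean lessAsText(String a, String b) {
        return a.compareTo(b) < 0;
    }

    // Compares as numbers: parses first, then compares the values.
    static boolean lessAsNumber(String a, String b) {
        return Integer.parseInt(a) < Integer.parseInt(b);
    }

    public static void main(String[] args) {
        System.out.println(lessAsText("10", "9"));   // prints true, since '1' sorts before '9'
        System.out.println(lessAsNumber("10", "9")); // prints false
    }
}
```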

Are there any mixed-mode comparisons or comparisons between variables of different lengths? If so, ensure that the conversion rules are well understood.

Are the comparison operators correct? Programmers frequently confuse such relations as at most, at least, greater than, not less than, and less than or equal.

Does each Boolean expression state what it is supposed to state? Programmers often make mistakes when writing logical expressions involving and, or, and not.

Are the operands of a Boolean operator Boolean? Have comparison and Boolean operators been erroneously mixed together? This represents another frequent class of mistakes. Examples of a few typical mistakes are illustrated here:

If you want to determine whether i is between 2 and 10, the expression 2<i<10 is incorrect. Instead, it should be (2<i)&&(i<10).
If you want to determine whether i is greater than x or y, i>x||y is incorrect. Instead, it should be (i>x)||(i>y).
If you want to compare three numbers for equality, if(a==b==c) does something quite different.
If you want to test the mathematical relation x>y>z, the correct expression is (x>y)&&(y>z).
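
The corrected forms above can be sketched as small Java methods (all names hypothetical). Java rejects 2<i<10 on integers at compile time, but a chained == on boolean operands compiles and quietly computes something else, as the last method shows.

```java
final class ChainedComparisonDemo {
    static boolean isBetween(int i, int low, int high) {
        return (low < i) && (i < high);
    }

    static boolean greaterThanEither(int i, int x, int y) {
        return (i > x) || (i > y);
    }

    static boolean allEqual(boolean a, boolean b, boolean c) {
        return (a == b) && (b == c);
    }

    // The tempting chained form: evaluates (a == b), then compares that result to c.
    static boolean chainedEquals(boolean a, boolean b, boolean c) {
        return a == b == c;
    }

    public static void main(String[] args) {
        System.out.println(isBetween(5, 2, 10));               // prints true
        System.out.println(allEqual(false, false, true));      // prints false
        System.out.println(chainedEquals(false, false, true)); // prints true
    }
}
```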

Are there any comparisons between fractional or floating-point numbers that are represented in base-2 by the underlying machine? This is an occasional source of errors because of truncation and base-2 approximations of base-10 numbers.
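
One common remedy is to compare within a tolerance rather than for exact equality. The Java sketch below uses a hypothetical nearlyEqual helper with an absolute tolerance chosen by the caller. That choice is an assumption and may not suit values of very large or very small magnitude.

```java
final class FloatCompareDemo {
    // Hypothetical helper: equal if the values differ by at most the given tolerance.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum);                         // prints 0.30000000000000004
        System.out.println(sum == 0.3);                  // prints false
        System.out.println(nearlyEqual(sum, 0.3, 1e-9)); // prints true
    }
}
```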

For expressions containing more than one Boolean operator, are the assumptions about the order of evaluation and the precedence of operators correct? That is, if you see an expression such as if((a==2)&&(b==2)||(c==3)), is it well understood whether the and or the or is performed first?
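
In Java, && binds more tightly than ||, so the unparenthesized form groups the and first. The sketch below (hypothetical method names) evaluates the expression as written and with each explicit grouping, using values for which the two groupings disagree.

```java
final class PrecedenceDemo {
    // As written, without grouping parentheses.
    static boolean asWritten(int a, int b, int c) {
        return a == 2 && b == 2 || c == 3;
    }

    // Explicit grouping with the and performed first.
    static boolean andFirst(int a, int b, int c) {
        return ((a == 2) && (b == 2)) || (c == 3);
    }

    // Explicit grouping with the or performed first.
    static boolean orFirst(int a, int b, int c) {
        return (a == 2) && ((b == 2) || (c == 3));
    }

    public static void main(String[] args) {
        System.out.println(asWritten(1, 1, 3)); // prints true
        System.out.println(andFirst(1, 1, 3));  // prints true
        System.out.println(orFirst(1, 1, 3));   // prints false
    }
}
```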

Does the way in which the compiler evaluates Boolean expressions affect the program? For instance, the statement

if((y!=0)&&((x/y)>z))

may be acceptable for compilers that end the test as soon as one side of an and is false, but may cause a division-by-zero error with other compilers.
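
Java offers both behaviors in one language, which makes the difference easy to demonstrate: && stops evaluating as soon as the left operand is false, whereas & on booleans always evaluates both operands. The sketch below uses hypothetical method names and guards a division with a test on the divisor.

```java
final class ShortCircuitDemo {
    // && stops when the left side is false, so x / y is never evaluated when y == 0.
    static boolean guarded(int x, int y, int z) {
        return (y != 0) && ((x / y) > z);
    }

    // & always evaluates both sides, so the division runs even when y == 0.
    static boolean unguarded(int x, int y, int z) {
        return (y != 0) & ((x / y) > z);
    }

    public static void main(String[] args) {
        System.out.println(guarded(10, 0, 1)); // prints false
        try {
            unguarded(10, 0, 1);
        } catch (ArithmeticException e) {
            System.out.println("division by zero without short-circuit evaluation");
        }
    }
}
```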

Control-Flow Errors

If the program contains a multipath branch such as a computed GOTO, can the index variable ever exceed the number of branch possibilities? For example, in the statement

GOTO(200,300,400),i

will i always have the value of 1, 2, or 3?

Will every loop eventually terminate? Devise an informal proof or argument showing that each loop will terminate.
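
An informal termination argument usually identifies a quantity that strictly decreases toward a bound. As one illustration, the Java sketch below implements Euclid's algorithm and records the argument as a comment. The choice of algorithm is ours, not the text's.

```java
final class TerminationDemo {
    // Greatest common divisor of two non-negative integers (Euclid's algorithm).
    // Termination argument: b is never negative, and each iteration replaces b
    // with a % b, which is strictly less than b, so b must eventually reach 0.
    static int gcd(int a, int b) {
        if (a < 0 || b < 0) {
            throw new IllegalArgumentException("inputs must be non-negative");
        }
        while (b != 0) {
            int remainder = a % b; // 0 <= remainder < b
            a = b;
            b = remainder;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(gcd(48, 18)); // prints 6
    }
}
```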

Will the program, module, or subroutine eventually terminate?

Is it possible that, because of the conditions upon entry, a loop will never execute? If so, does this represent an oversight? For instance, suppose you had a for loop and a while loop headed by the following statements:

for (i=x;i<=z;i++){
...

or . . .

while (NOTFOUND){
...

what happens if x is greater than z or if NOTFOUND is initially false?
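
The for-loop case can be checked with a small Java sketch that counts how often the body runs. The method name iterations is hypothetical.

```java
final class ZeroTripDemo {
    // Counts how many times the loop body runs for the inclusive range x..z.
    static int iterations(int x, int z) {
        int count = 0;
        for (int i = x; i <= z; i++) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(iterations(1, 3)); // prints 3
        System.out.println(iterations(5, 3)); // prints 0: the body never executes
    }
}
```

Whether zero executions is acceptable depends on what the code after the loop assumes, for example that a result variable was set inside the body.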

For a loop controlled by both iteration and a Boolean condition (e.g., a searching loop), what are the consequences of loop fall-through? For example, for the pseudocode loop headed by

DO I=1 to TABLESIZE WHILE (NOTFOUND)

what happens if NOTFOUND never becomes false?
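
A Java rendering of such a search loop makes the fall-through case explicit. This is a sketch with hypothetical names. Returning -1 for "not found" is a convention we assume here, not something the pseudocode specifies.

```java
final class SearchDemo {
    // Returns the index of target in table, or -1 if the loop falls through.
    static int indexOf(int[] table, int target) {
        boolean notFound = true;
        int i = 0;
        // Controlled by both iteration (i < table.length) and a Boolean condition.
        while (i < table.length && notFound) {
            if (table[i] == target) {
                notFound = false;
            } else {
                i++;
            }
        }
        // Fall-through: the table was exhausted and notFound is still true.
        return notFound ? -1 : i;
    }

    public static void main(String[] args) {
        int[] table = {4, 8, 15};
        System.out.println(indexOf(table, 8));  // prints 1
        System.out.println(indexOf(table, 99)); // prints -1
    }
}
```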

Are there any off-by-one errors, such as one too many or too few iterations? This is a common error in zero-based loops, where it is easy to forget to count 0 as an iteration. For example, if you want to create Java code for a loop that iterates 10 times, the following would be wrong, as it performs 11 iterations:

for (int i=0;i<=10;i++){
    System.out.println(i);
}

Correct, the loop is iterated 10 times:

for (int i=0; i<10;i++) {
    System.out.println(i);
}

If the language contains a concept of statement groups or code blocks (e.g., do-while or {...}), is there an explicit while for each group, and do the instances of do correspond to their appropriate groups? Is there a closing bracket for each open bracket? Most modern compilers will complain of such mismatches.

Are there any nonexhaustive decisions? For instance, if an input parameter's expected values are 1, 2, or 3, does the logic assume that it must be 3 if it is not 1 or 2? If so, is the assumption valid?
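
The difference between assuming and checking can be sketched in Java as follows. The method names and the labels "low", "medium", and "high" are hypothetical placeholders for whatever the three values mean.

```java
final class ExhaustiveDecisionDemo {
    // Risky form: anything that is not 1 or 2 is silently treated as 3.
    static String describeAssuming(int code) {
        if (code == 1) {
            return "low";
        } else if (code == 2) {
            return "medium";
        } else {
            return "high";
        }
    }

    // Exhaustive form: each expected value is named and the rest are rejected.
    static String describe(int code) {
        switch (code) {
            case 1: return "low";
            case 2: return "medium";
            case 3: return "high";
            default:
                throw new IllegalArgumentException("unexpected code: " + code);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeAssuming(7)); // prints high, although 7 is not a valid code
        System.out.println(describe(3));         // prints high
    }
}
```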

Interface Errors

Does the number of parameters received by this module equal the number of arguments sent by each of the calling modules? Also, is the order correct?

Do the attributes (e.g., data type and size) of each parameter match the attributes of each corresponding argument?

Does the units system of each parameter match the units system of each corresponding argument? For example, is the parameter expressed in degrees but the argument expressed in radians?
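
Java's Math.sin expects radians, so the degrees-versus-radians mismatch is easy to reproduce. In the sketch below, the hypothetical sineOfDegrees method names its unit and performs the conversion in one place.

```java
final class UnitsDemo {
    // The parameter is in degrees, as its name says; conversion happens here only.
    static double sineOfDegrees(double degrees) {
        return Math.sin(Math.toRadians(degrees));
    }

    public static void main(String[] args) {
        // Mismatch: 90 is meant as degrees but is interpreted as 90 radians.
        System.out.println(Math.sin(90));      // about 0.894, not the intended value
        // Units agree: 90 degrees is converted before the call.
        System.out.println(sineOfDegrees(90)); // approximately 1.0
    }
}
```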

Does the number of arguments passed by this module to another module equal the number of parameters expected by that module?

Do the attributes of each argument passed to another module match the attributes of the corresponding parameter in that module?

Does the units system of each argument passed to another module match the units system of the corresponding parameter in that module?

If built-in functions are invoked, are the number, attributes, and order of the arguments correct?

If a module or class has multiple entry points, is a parameter ever referenced that is not associated with the current point of entry? Such an error exists in the second assignment statement in the following PL/1 program:

A: PROCEDURE (W,X);
W=X+1;
RETURN;
B: ENTRY (Y,Z);
Y=X+Z;
END;

Does a subroutine alter a parameter that is intended to be only an input value?
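
In Java, arrays and other objects are passed by reference value, so a method can change the caller's data as a side effect. The sketch below (hypothetical names, assuming a non-empty array) contrasts a method that sorts its argument in place with one that works on a copy.

```java
import java.util.Arrays;

final class InputParameterDemo {
    // Alters its argument: the caller's array is sorted as a side effect.
    static int middleInPlace(int[] values) {
        Arrays.sort(values);
        return values[values.length / 2];
    }

    // Treats the parameter as input only by sorting a private copy.
    static int middle(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy);
        return copy[copy.length / 2];
    }

    public static void main(String[] args) {
        int[] readings = {9, 1, 5};
        System.out.println(middle(readings));        // prints 5
        System.out.println(readings[0]);             // prints 9: caller's order preserved
        System.out.println(middleInPlace(readings)); // prints 5
        System.out.println(readings[0]);             // prints 1: caller's array was reordered
    }
}
```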

If global variables are present, do they have the same definition and attributes in all modules that reference them?

Are constants ever passed as arguments? In some Fortran implementations a statement such as

CALL SUBX(J,3)

is dangerous, because if the subroutine SUBX assigns a value to its second parameter, the value of the constant 3 will be altered.

Input/Output Errors

If files are explicitly declared, are their attributes correct?

Are the attributes on the file's OPEN statement correct?

Does the format specification agree with the information in the I/O statement? For instance, in Fortran, does each FORMAT statement agree (in terms of the number and attributes of the items) with the corresponding READ or WRITE statement?

Is sufficient memory available to hold the file your program will read?

Have all files been opened before use?

Have all files been closed after use?

Are end-of-file conditions detected and handled correctly?

Are I/O error conditions handled correctly?

Are there spelling or grammatical errors in any text that is printed or displayed by the program?

Does the program properly handle “File not Found” errors?
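
Several of these checks (files closed after use, end-of-file detected, missing files handled) can be illustrated together. The following Java sketch uses a hypothetical countLines method. Returning -1 for a missing file is an assumed convention, and other I/O errors are left to propagate to the caller.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

final class FileReadDemo {
    // Counts the lines in a file; returns -1 if the file does not exist.
    static int countLines(Path file) throws IOException {
        // try-with-resources closes the reader on every exit path.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            int lines = 0;
            // readLine() returns null at end of file.
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (NoSuchFileException e) {
            return -1; // the "file not found" case is handled explicitly
        }
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("inspection", ".txt");
        try {
            Files.write(temp, "a\nb\nc\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(countLines(temp)); // prints 3
        } finally {
            Files.deleteIfExists(temp);
        }
        System.out.println(countLines(temp)); // prints -1 now that the file is gone
    }
}
```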

Other Checks

If the compiler produces a cross-reference listing of identifiers, examine it for variables that are never referenced or are referenced only once.

If the compiler produces an attribute listing, check the attributes of each variable to ensure that no unexpected default attributes have been assigned.

If the program compiled successfully, but the computer produced one or more “warning” or “informational” messages, check each one carefully. Warning messages are indications that the compiler suspects you are doing something of questionable validity: Review all of these suspicions. Informational messages may list undeclared variables or language uses that impede code optimization.

Is the program or module sufficiently robust? That is, does it check its input for validity?

Is a function missing from the program?

This checklist is summarized in Tables 3.1 and 3.2.

Table 3.1 Inspection Error Checklist Summary, Part I

Data Reference	Computation
1. Unset variable used?	1. Computations on nonarithmetic variables?
2. Subscripts within bounds?	2. Mixed-mode computations?
3. Noninteger subscripts?	3. Computations on variables of different lengths?
4. Dangling references?	4. Target size less than size of assigned value?
5. Correct attributes when aliasing?	5. Intermediate result overflow or underflow?
6. Record and structure attributes match?	6. Division by zero?
7. Computing addresses of bit strings? Passing bit-string arguments?	7. Base-2 inaccuracies?
8. Based storage attributes correct?	8. Variable's value outside of meaningful range?
9. Structure definitions match across procedures?	9. Operator precedence understood?
10. Off-by-one errors in indexing or subscripting operations?	10. Integer divisions correct?
11. Inheritance requirements met?

Data Declaration	Comparison
1. All variables declared?	1. Comparisons between inconsistent variables?
2. Default attributes understood?	2. Mixed-mode comparisons?
3. Arrays and strings initialized properly?	3. Comparison relationships correct?
4. Correct lengths, types, and storage classes assigned?	4. Boolean expressions correct?
5. Initialization consistent with storage class?	5. Comparison and Boolean expressions mixed?
6. Any variables with similar names?	6. Comparisons of base-2 fractional values?
	7. Operator precedence understood?
	8. Compiler evaluation of Boolean expressions understood?
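
Two of the computation and comparison items in Table 3.1 (base-2 inaccuracies and integer division) are easy to demonstrate. The snippet below is only an illustration in Python; the variable names are arbitrary, and other languages may differ in detail.

```python
import math

# Base-2 inaccuracy: 0.1 and 0.2 have no exact binary representation.
total = 0.1 + 0.2
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True: compare within a tolerance instead

# Integer division: halving and doubling does not round-trip for odd values.
i = 7
print(2 * (i // 2) == i)         # False, because 7 // 2 is 3
```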

Table 3.2 Inspection Error Checklist Summary, Part II

Control Flow	Input/Output
1. Multiway branches exceeded?	1. File attributes correct?
2. Will each loop terminate?	2. OPEN statements correct?
3. Will program terminate?	3. Format specification matches I/O statement?
4. Any loop bypasses because of entry conditions?	4. Buffer size matches record size?
5. Possible loop fall-throughs correct?	5. Files opened before use?
6. Off-by-one iteration errors?	6. Files closed after use?
7. DO/END statements match?	7. End-of-file conditions handled?
8. Any nonexhaustive decisions?	8. I/O errors handled?
	9. Any textual or grammatical errors in output information?
Interfaces	Other Checks
1. Number of input parameters equal to number of arguments?	1. Any unreferenced variables in cross-reference listing?
2. Parameter and argument attributes match?	2. Attribute list what was expected?
3. Parameter and argument units system match?	3. Any warning or informational messages?
4. Number of arguments transmitted to called modules equal to number of parameters?	4. Input checked for validity?
5. Attributes of arguments transmitted to called modules equal to attributes of parameters?	5. Missing function?
6. Units system of arguments transmitted to called modules equal to units system of parameters?
7. Number, attributes, and order of arguments to built-in functions correct?
8. Any references to parameters not associated with current point of entry?
9. Input-only arguments altered?
10. Global variable definitions consistent across modules?
11. Constants passed as arguments?
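
As an illustration of the interface item "Input-only arguments altered?", the following sketch contrasts a routine that silently modifies its argument with one that does not. Both function names are hypothetical and exist only for this example.

```python
def middle_value_in_place(values):
    """Return the middle element after sorting; alters the caller's list."""
    values.sort()  # side effect: the caller's list is reordered
    return values[len(values) // 2]


def middle_value(values):
    """Return the middle element after sorting; leaves the input untouched."""
    ordered = sorted(values)  # sorted() builds a new list
    return ordered[len(ordered) // 2]


readings = [3, 1, 2]
print(middle_value(readings), readings)           # 2 [3, 1, 2]
print(middle_value_in_place(readings), readings)  # 2 [1, 2, 3]
```

Both versions return the same result, so only someone reading the code (or a test that checks the argument afterward) would notice the difference.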

Walkthroughs

The code walkthrough, like the inspection, is a set of procedures and error-detection techniques for group code reading. It shares much in common with the inspection process, but the procedures are slightly different, and a different error-detection technique is employed.

Like the inspection, the walkthrough is an uninterrupted meeting of one to two hours in duration. The walkthrough team consists of three to five people. One of these people plays a role similar to that of the moderator in the inspection process; another person plays the role of a secretary (a person who records all errors found); and a third person plays the role of a tester. Recommendations vary as to who these three to five people should be. Of course, the programmer is one of them. Suggestions for the other participants include:

A highly experienced programmer
A programming-language expert
A new programmer (to give a fresh, unbiased outlook)
The person who will eventually maintain the program
Someone from a different project
Someone from the same programming team as the programmer

The initial procedure is identical to that of the inspection process: The participants are given the materials several days in advance, to allow them time to bone up on the program. However, the procedure in the meeting is different. Rather than simply reading the program or using error checklists, the participants “play computer.” The person designated as the tester comes to the meeting armed with a small set of paper test cases—representative sets of inputs (and expected outputs) for the program or module. During the meeting, each test case is mentally executed; that is, the test data are “walked through” the logic of the program. The state of the program (i.e., the values of the variables) is monitored on paper or a whiteboard.

Of course, the test cases must be simple in nature and few in number, because people execute programs at a rate that is many orders of magnitude slower than a machine. Hence, the test cases themselves do not play a critical role; rather, they serve as a vehicle for getting started and for questioning the programmer about his or her logic and assumptions. In most walkthroughs, more errors are found during the process of questioning the programmer than are found directly by the test cases themselves.
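
The following sketch suggests what "playing computer" might look like for a very small routine. The function `largest_with_trace` and the test case are invented for illustration; the recorded rows stand in for what the team would write on paper or a whiteboard as it steps through the loop.

```python
def largest_with_trace(values):
    """Return the largest element plus a row of state for each iteration."""
    best = values[0]
    rows = []  # each row: (i, values[i], best after the comparison)
    for i in range(1, len(values)):
        if values[i] > best:
            best = values[i]
        rows.append((i, values[i], best))
    return best, rows


# Paper test case: input [4, 9, 2], expected output 9.
best, rows = largest_with_trace([4, 9, 2])
print(best)  # 9
print(rows)  # [(1, 9, 9), (2, 2, 9)]
```

The test case passes, but stepping through it invites the questions that matter: What happens if the list is empty? What if it has one element? What if it holds values that cannot be compared? The first of these exposes an unhandled case (`values[0]` fails on an empty list) that the single test case never exercised.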

As in the inspection, the attitude of the participants is critical. Comments should be directed toward the program rather than the programmer. In other words, errors are not regarded as weaknesses in the person who committed them. Rather, they are viewed as inherent to the difficulty of the program development.

The walkthrough should have a follow-up process similar to that described for the inspection process. The side effects observed from inspections (identification of error-prone sections and education in errors, style, and techniques) also apply to the walkthrough process.

Desk Checking

A third human error-detection process is the older practice of desk checking. A desk check can be viewed as a one-person inspection or walkthrough: A person reads a program, checks it with respect to an error list, and/or walks test data through it.

For most people, desk checking is relatively unproductive. One reason is that it is a completely undisciplined process. A second, and more important, reason is that it runs counter to testing principle 2 (see Chapter 2), which states that people are generally ineffective in testing their own programs. For this reason, you could deduce that desk checking is best performed by a person other than the author of the program (e.g., two programmers might swap programs rather than desk check their own), but even this is less effective than the walkthrough or inspection process. The reason is the synergistic effect of the walkthrough or inspection team. The team session fosters a healthy environment of competition; people like to show off by finding errors. In a desk-checking process, there is no one to whom you can show off, thereby precluding this apparently valuable effect. In short, desk checking may be more valuable than doing nothing at all, but it is much less effective than the inspection or walkthrough.

Peer Ratings

The last human review process is not associated with program testing (i.e., its objective is not to find errors). Nevertheless, we include this process here because it is related to the idea of code reading.

Peer rating is a technique of evaluating anonymous programs in terms of their overall quality, maintainability, extensibility, usability, and clarity. The purpose of the technique is to provide programmer self-evaluation.

A programmer is selected to serve as an administrator of the process. The administrator, in turn, selects approximately 6 to 20 participants (6 is the minimum to preserve anonymity). The participants are expected to have similar backgrounds (e.g., don't group Java application programmers with assembly language system programmers). Each participant is asked to select two of his or her own programs to be reviewed. One program should be representative of what the participant considers to be his or her finest work; the other should be a program that the programmer considers to be poorer in quality.

Once the programs have been collected, they are randomly distributed to the participants. Each participant is given four programs to review. Two of the programs are the “finest” programs and two are “poorer” programs, but the reviewer is not told which is which. Each participant spends 30 minutes reviewing each program and then completes an evaluation form. After reviewing all four programs, each participant rates the relative quality of the four programs. The evaluation form asks the reviewer to answer, on a scale from 1 to 10 (1 meaning definitely yes and 10 meaning definitely no), such questions as:

Was the program easy to understand?
Was the high-level design visible and reasonable?
Was the low-level design visible and reasonable?
Would it be easy for you to modify this program?
Would you be proud to have written this program?

The reviewer also is asked for general comments and suggested improvements.
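
One way an administrator might carry out the random distribution described above is sketched here. This is not a prescribed procedure: the function name `assign_reviews`, the rotation scheme, and the rule that nobody reviews his or her own program are assumptions made for this example.

```python
import random


def assign_reviews(participants, seed=None):
    """Give each reviewer two "finest" and two "poorer" programs by others."""
    if len(participants) < 6:
        raise ValueError("at least 6 participants are needed for anonymity")
    rng = random.Random(seed)
    order = list(participants)
    rng.shuffle(order)  # random seating hides who reviews whom
    n = len(order)
    assignments = {}
    for i, reviewer in enumerate(order):
        # Offsets 1..4 around the shuffled circle never reach the reviewer.
        finest = [(order[(i + k) % n], "finest") for k in (1, 2)]
        poorer = [(order[(i + k) % n], "poorer") for k in (3, 4)]
        batch = finest + poorer
        rng.shuffle(batch)  # do not reveal the category by position
        assignments[reviewer] = batch
    return assignments
```

With this scheme every submitted program is reviewed exactly twice. The "finest" and "poorer" labels stay with the administrator; reviewers would see only the anonymous programs.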

After the review, the participants are given the anonymous evaluation forms for their two contributed programs. They also are given a statistical summary showing the overall and detailed ranking of their original programs across the entire set of programs, as well as an analysis of how their ratings of other programs compared with the ratings given by other reviewers of the same programs. The purpose of the process is to allow programmers to self-assess their programming skills. As such, the process appears to be useful in both industrial and classroom environments.
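
A very small version of the statistical summary could be computed as follows. The data layout (a mapping from an anonymous program identifier to the scores it received) and the function name `rank_programs` are assumptions for this sketch; a real summary would also break the results down by question.

```python
from statistics import mean


def rank_programs(scores):
    """Rank programs by mean rating; 1 is best and 10 is worst."""
    averages = {program: mean(values) for program, values in scores.items()}
    # A lower mean is a better rating on this scale, so sort ascending.
    return sorted(averages.items(), key=lambda item: item[1])


collected = {"P1": [2, 3], "P2": [7, 8], "P3": [4, 4]}
for program, average in rank_programs(collected):
    print(program, average)
```

In this made-up data set, P1 ranks first with a mean of 2.5 and P2 ranks last with a mean of 7.5.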

Summary

This chapter discussed a form of testing that developers do not often consider: human code testing. Most people assume that because programs are written for machine execution, machines should test programs as well. This assumption is invalid. Human testing techniques are very effective at revealing errors. In fact, most programming projects should include the following human testing techniques:

Code inspections using checklists
Group walkthroughs
Desk checking
Peer ratings

Another form of human testing is user or usability testing, a black-box technique that evaluates software from a hands-on, end-user perspective. We cover this topic in detail in Chapter 7.
