7.4  Adequacy Criteria Based on Data Flow

To be able to construct test adequacy criteria based on sets dpu and dcu, we first define some notions related to coverage. In the definitions below we assume that the data flow graph of P contains k nodes where nodes n1 and nk represent the start and end nodes, respectively. Some node s in the data flow graph of program P is considered covered by test case t when the following complete path is traversed upon the execution of P against t,

(ni1 , ni2 , . . . , nim−1 , nim)

and s = nij for some 1 ≤ j m and m k, i.e. node s lies on the path traversed. Similarly, an edge (r, s) in the data flow graph of P is considered covered by t upon the execution of P against t, when the path listed above is traversed and r = nij and s = nij+1 for 1 ≤ j ≤ (m − 1).

 

Test adequacy criteria based on data flows can be defined using a data flow graph.

Let CU and PU denote, respectively, the total number of c-uses and p-uses for all variable definitions in P. Let v = {v1, v2, · · · , vn} be the set of variables in P. di is the number of definitions of variable vi for 1≤i n. CU and PU are computed as follows.

Equation

Recall that |S| denotes the number of elements in set S. For the program in Example 7.31, we get CU=17 and PU=10.

7.4.1  c-use coverage

Let z be a node in dcu(x, q), i.e. node z contains a c-use of variable x defined at node q (see Figure 7.9(a)). Suppose that program Ρ is executed against test case t and the (complete) path traversed is:

p=(n1, ni1 , . . . , nil , nil+1, . . . , nim, nim+1 . . . nk), where 2 ≤ ij < k for 1 ≤ j k

 

This c-use of variable x is considered covered if q = nil and s = nim and (nil , nil+1 , . . . , nim) is a def-clear path from node q to node z. All c-uses of variable x are considered covered if each node in dcu(x, q) is covered during one or more executions of P. All c-uses in the program are considered covered if all c-uses of all variables in P are covered. We now define test adequacy criterion based on c-use coverage.

Figure 7.9

Figure 7.9 Paths for c-use and p-use coverage. Dotted arrows connecting two nodes indicate a path along which there might be additional nodes not shown. (a) Path (Start,… ,q,k,… ,z,… ,End) covers the c-use at node z of defined at node q given that (k,… ,z) is def-clear with respect to x. (b) The p-use of x at node is covered when paths (Start,… ,q,k,… ,z,r,… ,End) and (Start,… ,q,k,… ,z,s,… ,End), shown by dotted lines, have been traversed, and (k,… ,z) are def-clear with respect to x

A c-use of a variable x defined at location 11 and used at location 12 at is considered covered when control flows along a path from 11 to 12 without any redefinition of x.

C-use coverage:

The c-use coverage of Τ with respect to (P,R) is computed as

Equation

where CUc is the number of c-uses covered and CUƒ the number of infeasible c-uses. Τ is considered adequate with respect to the c-use coverage criterion if its c-use coverage is 1.

 

A test set is considered adequate with respect to the c-use criterion when it covers all feasible c-uses in the program under test.

Example 7.36   Suppose that Program P7.16 is under test. The data flow graph of this program is shown in Figure 7.7. We want to devise a test tc which covers node 6 in dcu(z, 5), i.e. the desired test must cover the c-use of the definition of x at node 1. Consider the following test case.

tc :<x = 5, y = −1, count = 1>

 

The complete path traversed when Program P7.16 is executed against tc is: (1, 2, 5, 6, 7). Clearly, this path contains node 5 at which z is defined, node 6 where it has a c-use, and subpath (5, 6) is a def-clear path with respect to this definition of z. Hence tc covers node 6, an element of dcu(z, 6 ).

Note that tc also covers node 7 but not node 4, which also contains a c-use of this definition of z. Considering that there are 12 c-uses in all, and tc covers only 2 of them, the c-use coverage for Τ = {tc} is 2/12 = 0.167. Clearly, Τ is not adequate with respect to the c-use criterion for Program P7.16.

7.4.2  p-use coverage

Let (z, r) and (z, s) be two edges in dpu(x, q), i.e. node z contains a p-use of variable x defined at node q (see Figure 7.9(b)). Suppose that the following complete path is traversed when Ρ is executed against some test case tp.

 

(n1, ni2 , . . . , ni1 , nil+1 , . . . , nim, nim+1 , . . . , nk) where 2≤ij< k for 1≤jk

 

The following condition must hold for edge (z, r) for the p-use at node z of x defined at node q, to be covered.

 

q = nil, z = nim, r = nim+1 and (nil , nil+1 , . . . , nim, nim+1 ) is a def-clear path with respect to x.

 

Similarly, the following condition must hold for edge (z, s).

 

q = nil, z = nim, s = nim+1, and (nil, nil+1, … ,nim, nim+1) is a def-clear path with respect to x.

 

A p-use of a variable x defined at location 11 and used at location 12 in a predicate at location 12 is considered covered when control flows along a path from 11 to 12 and then to each of the destinations of the predicate. Of course, each p-use coverage requires the traversal of at least two paths.

The p-use of x at node z is considered covered when the two conditions mentioned above are satisfied in the same or different executions of P. We can now define test adequacy criterion based on p-use coverage.

P-use coverage:

The p-use coverage of Τ with respect to (P, R) is computed as

Equation

where PUc is the number of p-uses covered and PUf the number of infeasible p-uses. Τ is considered adequate with respect to the p-use coverage criterion if its p-use coverage is 1.

 

A test set is considered adequate with respect to the p-use criterion when it covers all feasible p-uses in the program under test.

Example 7.37   Once again, suppose that Program P7.16 is under test. Consider the definition of y at node 6. Our objective now is to cover the p-use of this definition at node 3. This requires a test that must cause control to reach node 6 where y is defined. Without redefining y, the control must now arrive at node 3, where y is used in the if condition.

 

The edge labeled y ≥ 0 is covered if taken, though the p-use is yet to be covered. Subsequently, control should take the edge labeled y < 0 during the same or in some other execution. Consider the following test case designed to achieve the coverage objective.

 

tp:<x=−2, y=−1, count = 3 >

 

A careful examination of Program P7.16 and its data flow graph reveals that the following path is traversed upon the execution of the program against tp.

 

p=< 1, 2, 3, 61, 2, 31, 41, 62, 2, 32, 63, 7>

 

The first two instances of subscripted node 6 correspond to the definitions of y in the first and the second iterations, respectively, of the loop. The remaining subscripted nodes indicate edges along which the values of y are propagated. An explanation follows.

Variable y is defined at node 6 when subpath <1, 2, 3, 61> is traversed. This value of y is used in the decision at node 3 when <61, 2, 31 , 41> is traversed. Note that edge (3, 4) is now covered as there is a def-clear path <2, 3> from node 6 to edge (3, 4).

Next, path <62, 2, 32, 63, 7> is traversed where y is defined once again at node 6. This definition of y propagates to edge (3, 6) via the def-clear path <2, 32, 62>. The control now moves to node 7 and the program execution terminates. Test tp has successfully covered dpu(y, 6). In fact all the p-uses in Program P7.16 are covered by the test set Τ = {tc, tp}. Thus the p-use coverage of T is 1.0 and hence this test set is adequate with respect to the p-use coverage criterion (see Exercise 7.35).

7.4.3  All-uses coverage

The all-uses coverage criterion is obtained by combining the c-use and p-use criteria. The all-uses criterion is satisfied when all c-uses and all p-uses have been covered. The definition can be stated as follows in terms of a coverage formula.

 

All-uses coverage:

The all-uses coverage of Τ with respect to (P,R) is computed as

Equation

where CU is the total c-uses, CUc is the number of c-uses covered, PUc is the number of p-uses covered, CUf the number of infeasible c-uses, and PUf the number of infeasible p-uses. Τ is considered adequate with respect to the all-uses coverage criterion if its c-use coverage is 1.

 

A test set is considered adequate with respect to the all-uses criterion when it covers all feasible c- and p-uses in the program under test.

Example 7.38   For Program P7.16 and the test cases taken from Examples 7.36 and 7.37, the test set Τ = {tc, tp} is adequate with respect to the all-uses criterion.

7.4.4  k-dr chain coverage

Consider a k-dr chain C=(n1, n2, ... , nk) in program Ρ such that dni(xi) and uni+1(xi) for 1 ≤ i < k. We consider two cases depending on whether or not node nk ends in a predicate. Given that node nk does not end in a predicate, chain C is considered covered by test set Τ only if upon the execution of P against T, the following path is traversed:

 

(Start, . . . , n1, p1, n2, . . ., pk−1nk, . . . , End)

 

where each pi for 1 ≤ i < k is a definition clear subpath from node ni to node ni+1 with respect to variable Xi. If node nk ends in a predicate, then C is considered covered only if upon the execution of Ρ against T, the following paths are traversed:

 

(Start, . . . ,n1, p1, n2, . . . , pk−1nk, r, . . . , End)

(Start, . . . , n1, p′1, n2, . . . , p′k−1nk, s, . . . , End)

 

where pi and p′is denote definition clear subpaths for variable xi and nodes r and s are immediate successors of node nk. The above condition ensures that the decision containing the last use of a variable along the chain is covered, i.e. both branches out of the decision are taken. As shown in Example 7.40, it is possible to traverse both paths in one execution.

While determining k-dr chains, it is assumed that each branch in a data flow graph corresponds to a simple predicate. In the event a decision in the program is a compound predicate, such as (x < y) and z < 0), it is split into two simple predicates. Splitting ensures that each decision node in the corresponding data flow graph will contain an atomic predicate (see Exercise 7.39).

When the first definition or the last reference of a variable is inside a loop, at least two iteration counts must be considered for the loop. We consider these two iteration counts to be 1 and 2, i.e. the loop is iterated once and exited, and in the same or a different execution, iterated twice and exited.

 

Example 7.39   Consider the 3-dr chain C = (1, 5, 7) from Example 7.32 for Program P7.16. Chain C corresponds to the alternating defuse sequence (d1(x), u5(x), d5(z), d7(z)). The following test covers C.

Τ = {t :<x=3, y=2, count=1>}

 

Example 7.40   Consider the 3-dr chain C = (1, 4, 6) from Example 7.32 for Program P7.16. We note that C ends at node 6 that ends in a predicate. Nodes 2 and 7 are immediate successors of node 6. Hence, to cover C, the following subpaths must be traversed:

Equation

where p1, p2, p′1, and p′2 are definition clear subpaths with respect to variable z. In addition, because the last reference of variable z is inside a loop, the loop must be iterated a minimum and a larger number of times. The following test set covers C:

Equation

The following path is traversed when Program P7.16 is executed against t1:

(1, 2, 3, 4, 6, 7)

Clearly, the above path is of the form λ2 where p1 = (2, 3) and p2 is empty. Here the loop containing u6(z) is traversed once, i.e. minimum number of times.

The loop containing the following (complete) path is traversed when Program P7.16 is executed against t2:

(1, 2, 3, 4, 6, 2, 3, 6, 7)

This path can be written as

(1, p1, 4, p2, 6, 2, p3)

and also as

(1, p′1, 4, p′2, 6, 7)

where p1 = (2, 3), p2 = (), p′1 = (2, 3), and p′2 = (6, 2, 3). Note that p2 is an empty subpath. p1, p2, p′1, and p′ 2 are all def-clear with respect to z. Note also the condition that subpaths λ1 and λ2 be traversed is satisfied. Thus, the test case {t} covers the k-dr chain C.

7.4.5  Using the k-dr chain coverage

To determine the adequacy of a test set Τ for program P subject to requirements R, k is set to a suitable value. The larger the value of k, the more difficult it is to cover all the k-dr interactions. Given k, the next step is to determine all l-dr interactions for 1 ≤ lk. We denote this set of all l-dr interactions by k-dr(k). The program is then executed against tests in Τ and the coverage determined. A formal definition of test adequacy with respect to k-dr coverage follows.

 

k-dr coverage:

For a given k ≥ 2, the k-dr (k) coverage of Τ with respect to (P,R) is computed as:

Equation

where Equation is the number of k-dr interactions covered, Ck is the number of elements in k-dr(k), and Cf the number of infeasible interactions in k-dr(k). Τ is considered adequate with respect to the k-dr(k) coverage criterion if its k-dr(k) coverage is 1.

 

A test set is considered adequate with respect to the k-dr coverage criterion, for a given k, if all feasible k-dr chains are covered. Note that higher values of k may require more tests for coverage. However, at some point increasing the value of k will not lead to any new def-use chains unless one increases the number of times a loop body along a chain is executed.

7.4.6  Infeasible c- and p-uses

Coverage of a c- or a p-use requires a path to be traversed through the program. However, if this path is infeasible, then some c- and p-uses that require this path to be traversed might also be infeasible. Infeasible uses are often difficult to determine without some hint from a test tool. The next example illustrates why one c-use in Program P7.16 is infeasible.

 

An infeasible path in a program may lead to infeasible c-or p-uses.

Example 7.41   Consider the c-use at node 4 of z defined at node 5. For this c-use to be covered, control must first arrive at node 5 and then move to node 4 via a path that is def-clear with respect to z.

 

Further analysis reveals that for the control to first arrive at node 5, x > 0. For the control to then move to node 4, it must pass through nodes 6, 2, and 3. However, this is not possible because edge (2, 3) can be taken only if x ≤ 0.

Note that x is not defined anywhere in the program except at node 1. Hence, if x > 0 when control first arrives at node 2, then it cannot be x ≤ 0 in any subsequent iteration of the loop. This implies that the condition required for the coverage of c-use at node 4 of z defined at node 5 can never be true and hence this c-use is infeasible.

 

Example 7.42   A for loop that iterates for a fixed number of times can cause infeasible paths. Consider the following program that defines x prior to entering the loop and uses x inside the loop body and soon after loop exit. Thus there is a c-use of x in lines 3 and 7. A def-use graph for this program is shown in Figure 7.10.

Figure 7.10

Figure 7.10 Data flow graph of P7.18.

Program P7.18

Equation

Path (Start, 1, 2, 4, End) must be traversed to cover the c-use of x at line 7. However, this path, and hence the c-use, is infeasible because the for loop body will be executed exactly five times and not zero times as required for this path to be traversed.

7.4.7  Context coverage

Let Ρ be the program under test, F its data flow graph, t a test case against which P is executed, X(k) the set of input variables for node k in F, and EDC(k) and OEDC(k) be, respectively, the elementary data and ordered elementary contexts for node k. EDC(k) is considered covered by t if the following two conditions are satisfied during the execution of P.

  1. Node k is along the path traversed during the execution of Ρ against t.
  2. When control arrives at k along this path, all definitions in EDC(k) are live.

A data context DC(k) for node k is considered covered when all elementary data contexts in DC(k) are covered. Similarly, an ordered elementary data context OEDC(k) is considered covered by t if the following conditions are satisfied during the execution of P.

  1. Node k is along the path p traversed during the execution of Ρ against t,
  2. All definitions in OEDC(k) are live when control arrives at n along p.
  3. The sequence in which variables in X[k] are defined is the same as that in OEDC(k). 

A test set is considered adequate with respect to the data context criterion if it covers all the feasible data contexts.

An ordered data context ODC(k) for node n is considered covered when all data contexts in ODC(k) are covered.

Given a test set set T for program Ρ subject to requirements R, formal definitions of test adequacy with respect to elementary data context and ordered elementary data context coverage follow.

 

Elementary data context coverage:

Given a program Ρ subject to requirements R, and having a data flow graph containing n nodes, the ordered elementary data context coverage of Τ with respect to (P,R) is computed as:

Equation

where EDC is the number of elementary data contexts in P, EDCc is the number of elementary data contexts covered, and EDCi is the number of infeasible elementary data contexts. Τ is considered adequate with respect to the data context coverage criterion if the data context coverage is 1.

Ordered elementary data context coverage:

Given a program Ρ subject to requirements R, and having a data flow graph containing n nodes, the ordered data context coverage of T with respect to (P,R) is computed as:

Equation

where OEDC is the number of ordered elementary data contexts in P, OEDCc is the number of ordered elementary data conexts covered, and OEDCi is the number of infeasible ordered elementary data contexts. Τ is considered adequate with respect to the ordered elementary data context coverage criterion if the data context coverage is 1.

 

In the two definitions above, we have computed the coverage with respect to the elementary and ordered elementary data contexts, respectively. Another alternative is to compute the coverage with respect to data contexts and ordered data contexts (see Exercise 7.43). Such definitions would offer coarser coverage criteria than the ones above. For example, in such a coarse definition, a data context will be considered uncovered if any one of its constituent elementary data contexts is uncovered. 

Example 7.43   Let us compute the adquacy of test Τ given below against the elementary data context coverage criteria.

Equation

We assume that Program P7.16 is executed against the four test cases listed above. For each test case, the table below shows the path traversed, the elementary data context(s) covered, and the cumulative elementary data context coverage (EDC). An asterisk against a node indicates that the corresponding elementary data context has already been covered and hence not counted in computing the cumulative EDC.

From Figure 7.7 and the table in Example 7.34, we obtain the total number of elementary data contexts as 12. Hence the cumulative EDC for test case tj, 1≤j≤4, is computed as the ratio EDCc/12 where EDCc is the total number of elementary data contexts covered after executing the program against tj. Note that we could also compute data context coverage using the four data contexts for nodes 4 through 7 in Example 7.34. In that case, we will consider a data context as covered when all of its constituent elementary data contexts are covered.

Equation

It is clear from the rightmost column in the table above that Τ is not adequate with respect to the elementary data context coverage criterion. Recall that Τ is adequate with respect to the MC/DC coverage criterion.

7.5  Control Flow Versus Data Flow

Adequacy criteria based on the flow of control aims at testing only a few of the many, sometimes infinitely large, paths through a program. For example, the block coverage criterion is satisfied when the paths traversed have touched each block in the program under test. Data flow based criteria has the same objective, i.e. these criteria assist in the selection of a few of the many paths through a program. For example, the c-use coverage criterion is satisfied when the paths traversed have covered all c-uses. However, sometimes data flow based criteria turn out to be more powerful than criteria based on the flow of control, including the MC/DC based criteria. As evidence in support of the previous statement, let us reconsider Program P7.16. Consider the following test case devised based on the program’s requirements (which we have not listed explicitly).

Equation

Data flow based adequacy criteria are generally stronger and hence more difficult to satisfy than control flow based adequacy criteria, including the MC/DC criterion.

Surprisingly, test set Τ = {t1, t2} is adequate for P7.16 with respect to all the control flow criteria described earlier except for the LCSAJ criterion. In terms of coverage numbers, block, condition, multiple condition, and MC/DC coverages are all 100%. However, the c-use coverage of Τ is only 58.3% and the p-use coverage is 75%. Translated in terms of error detection, the example illustrates that tests adequate with respect to data flow coverage criteria are more likely to expose program errors than those adequate with respect to criteria based exclusively on the flow of control.

 

Example 7.44   Suppose that line 14 in Program P7.16 has an error. The correct code at this line is:

14 y=x+y+z;

The following test set is adequate with respect to the MC/DC criterion.

Equation

Execution of P7.16 against Τ produces the following values of z:

Equation

It is easy to verify that the correct program also generates exactly the same values of z when executed against test cases in Τ. Thus Τ does not reveal the simple error in P7.16. The c- and p-use coverage of Τ is, respectively, 0.75 and 0.5 (these numbers include the infeasible c-use). One of the c-uses that is not covered by Τ is that of y at line 8 corresponding to its definition at line 14. To cover this c-use, we need a test that ensures the execution of the following event sequence.

  1. Control must arrive at line 14; this will happen for any test case.
  2. Control must then get to line 8. For this to happen, count > 1, so that the loop does not terminate and conditions x0 and y0 must be true.

The following test case satisfies the conditions enumerated above.

 

te :< x = 2, y = 2, count = 2 >

 

Execution of Program P7.16 against te causes the c-use at line 8 to be covered. However, the value of z is 1.0 while the correct version of this program outputs 2.0 when executed against te thus revealing the error.

Though coverage of a c-use in the example above revealed the error, a key question one must ask is: “Will every test case that covers the c-use mentioned in Example 7.44 also reveal the error? This question is left as an exercise (see Exercise 7.38). It is also important to note that even though Τ in the above example does not reveal the error, some other MC/DC adequate test might.

7.6  The “Subsumes” Relation

We have discussed a variety of control flow and data flow based criteria to assess the adequacy of tests. These criteria assist in the selection of tests from a large, potentially infinite, set of inputs to the program under test. They also assist in the selection of a finite, and relatively small, number of paths through the program. All control and data flow based adequacy criteria are similar in that their objective is to assist a tester find a suitable subset of paths to be tested from the set of all paths in a given program. Given this similarity of objectives of the various adequacy criteria, the following questions are of interest.

 

A coverage criterion C1 is considered to subsume coverage criterion C2 if every test adequate with respect to C1 is also adequate with respect to C2.

Subsumes:

Given a test set Τ that is adequate with respect to criterion C1, what can we conclude about the adequacy of Τ with respect to another criterion C2?

Effectiveness:

Given a test set Τ that is adequate with respect to criterion C, what can we expect regarding its effectiveness in revealing errors?

 

In this chapter, we deal briefly with the first of the above two questions; an in-depth theoretical discussion of the subsumes relationship amongst various adequacy criteria is found in Volume 2. The effectiveness question is also dealt with in the research literature. The next example illustrates the meaning of the subsumes relationship.

 

Example 7.45   Program P7.19 takes three inputs x, y, and z, and computes p.

Program P7.19

Equation

The following test set is adequate with respect to the p-use criterion for P7.19:

Equation

However, Τ does not cover the c-use (d3(x), u11(x)). This example illustrates that a test set that is adequate with respect to the p-use criterion is not necessarily adequate with respect to the c-use criterion.

Notice that for Program P7.19, a test that covers (d3(x), u11(x)) must cause the conditions at line 4 and 10 to evaluate to true. Such “coupling” of conditions is not required to obtain p-use adequacy (see Exercise 7.46). The next example illustrates that c-use adequacy does not imply p-use adequacy.

 

The subsumes relationship indicates the strength of an adequacy criterion. Generally, a stronger criterion is likely to be more effective in revealing program errors than a weaker criterion.

Example 7.46   Program P7.20 takes two inputs x and y, and computes z.

Figure 7.11

Figure 7.11 The subsumes relationship amongst various control and data flow based test adequacy criteria, XYindicates that Xsubsumes Y. See the bibliography section for citations to research that points to the assumptions that need to be satisfied for the relationship to hold. 

While the all-uses criterion is empirically more powerful than the MC/DC criterion, it does subsume the MC/DC criterion.

Program P7.20

Equation

The following test is adequate with respect to the c-use criterion but not with respect to the p-use criterion.

 

T = { t: < x = 5 y = −2 > }

7.7  Structural and Functional Testing

It is commonly believed that when testers measure code coverage, they are performing “structural testing” also known as “white-box” or “glass-box” testing. In addition, it is also said that while structural testing compares test program behavior against the apparent intention of the programmer as expressed in the source code, functional testing compares test program behavior against a requirements specification. The difference between structural and functional testing is presented below in a different light.

 

Measuring code coverage while testing, and enhancing tests based on such measurements, is often referred to as “white box” or “structural” testing.

As has been explained in Sections 7.1.2 and 7.1.3, measurement of code coverage is a way to assess the “goodness” or “completeness” of a test set Τ derived with the intention of checking whether or not the program Ρ under test behaves in accordance with the requirements. Thus, when Ρ is executed against a test case t that belongs to T, a tester indeed compares the behavior of Ρ against its requirements as t itself is derived from the requirements.

Having successfully executed all tests in T, a tester might choose one of at least two options. One option is to assess the goodness of Τ using criteria not based on code coverage while the other is to use code coverage. If the tester chooses the first option, we say that the tester is performing functional testing without the aid of any code based test adequacy measurement. Certainly, the tester could use non-code based adequacy criteria, for example, requirements coverage.

However, when the tester chooses the second option, then some form of code coverage is measured and perhaps used to enhance Τ. If Τ is found adequate with respect to the code based adequacy criteria, say C, then the tester may decide to either further evaluate Τ using a more powerful adequacy criteria or be satisfied with Τ.

When Τ is not adequate with respect to C, the tester must continue to test Ρ against the newly derived tests in the same way as Ρ was tested against tests in Τ. For example, suppose that the coverage measurement indicates that decision d at, say, line 136 inside function myfunc has not been covered. We now expect the tester to determine, with respect to the requirements, why this decision was not covered. A few possible causes, using arbitrary labels for requirement, are given below.

  • There was no test case in Τ to test for equirement R39.3.
  • Requirement R4 was tested only with default parameters, testing it against any other value of an input parameter will cover d.
  • Decision d can only be covered by a test that tests requirement R2.3 followed by requirement R1.7.

Once the cause for lack of coverage is determined, the tester proceeds to construct a test case t′, not in Τ. Ρ is then executed against t′ to ensure that it behaves correctly in accordance with the requirements.

There are several possible outcomes of executing Ρ against t′. One outcome is that decision d remains uncovered implying improper construction of t′. Another possibility is that Ρ behaves incorrectly on t′. In either case, testing is performed with respect to the requirements while the coverage measurement is used as a means to probe the inadequacy of Τ.

The discussion above leads us to conclude that structural testing is functional testing with the addition of code based adequacy measurement. Thus, rather than treat functional and structural testing as two independent test methods, we treat one as supplementing the other. In this case, structural testing is supplementary to functional testing. One performs structural testing as part of functional testing by measuring code coverage and using this measure to enhance tests generated using functional testing.

7.8  Scalability of Coverage Measurement

One might argue that measurement of code coverage, and test enhancement based on such measurement, is practical only during unit testing. One implication of such an argument is that it is impractical to perform code coverage measurements for integration or system tests. In the remainder of this section we discuss the practicality aspect of code coverage measurement. We offer suggestions on how testers could scale the coverage measurement and test enhancement process to large systems.

 

Measuring code coverage during unit tests is generally easy and can be done using tools such s JUnit. However, doing so for very large an complex applications is difficult for many reasons, for example, due to increase in time for execution, or the increased memory requirements.

Suppose that testers have access to the source code of the application under test. While it is possible to measure code coverage on binaries, lack of such access makes coverage measurement a difficult, if not an impossible, task. Several tools such as JBlanket developed by Joy Augustin, and Quilt available as Apache License, measure certain forms of code coverage by instrumenting the object code (byte code in Java). However, several other tools for coverage measurement often rely on instrumenting the source code and hence would fail to be of any use when the source is not available.

Open systems: First, let us consider open systems, i.e. applications that are not embedded in any specific hardware and can be executed in open desktop environments. Examples of such applications abound and include products such as office tools, web services, and database systems.

 

It is often easier to determine code coverage for non-embedded versus embedded systems.

Access to the code of any such application offers testers an opportunity to evaluate the tests using code based coverage criteria. However, the availability of a coverage measurement tool is almost a necessity when testing large systems. Tools for coverage measurement can be used to instrument the code. The coverage is monitored during execution and displayed after any number of executions. All coverage measurement tools display code coverage values in some form or another and can be used to determine test adequacy.

Incremental and selective coverage measurement: Coverage data for large applications can be overwhelming. For example, an application that contains 10,000 modules with a total of 45,000 conditions spread all over will lead to an enormous amount of coverage data. The display and storage of such data might create problems for the tester and the coverage measurement tool. For example, the test process might exhibit unacceptable slowdown, there might not be sufficient disk space to save all the coverage data, and the coverage measurement tool might breakdown as it might not have been tested to handle such large amounts of coverage data. Such problems can be overcome through incremental measurement and enhancement.

Incremental coverage measurement is performed in several ways. One way is to create a hierarchy of code elements. Elements in the hierarchy are then prioritized using criteria such as to their frequency of use or their criticality to the acceptable operation of the application. Thus, in a large application with say 10,000 modules, one might create three sets of modules M1, M2, and M3, with decreasing priority. The code measurement tool can then be asked to instrument and measure code coverage only in modules that belong to set M1. Tests could then be enhanced based on the coverage data so obtained. The process could then be repeated for modules in M2, and so on.

 

Even for large systems, coverage measurement can be made possible and effective through the use of an incremental approach.

Reducing the number of modules for which coverage measurements are taken will reduce the burden on the coverage measurement tool, as well as the tester who must deal with the amount of coverage data and enhance the tests to improve coverage. However, such a reduction might not be sufficient. In that case, a tester could further subdivide elements of a module set, say M1, into code elements and prioritize them. This further subdivision could be based on individual classes, or even methods. Such a subdivision further reduces the amount of code to be instrumented and the data to be analyzed by the tester for test enhancement.

In addition to constraining the portions of the code subject to coverage measurement, one could also constrain the coverage criteria. For example, one might begin with the measurement of method, or function, coverage in the selected modules. Once this coverage has been measured, tests enhanced, and adequate coverage obtained, one could move to the other portions of the code and repeat the process.

Embedded systems: Embedded systems pose a significant challenge in the measurement of code coverage. The challenge is primarily due to limited memory space and, in some cases, hard timing constraints. Instrumenting code while testing it in its embedded environment will certainly increase code size. The instrumented code might not fit in the available memory. Even when it fits, it might affect the real-time constraints and impact the application behavior.

Memory constraints can be overcome in at least two ways. One way is to use the incremental coverage idea discussed above. Thus instrument only a limited portion of the code at a time and test. This would require the tests to be executed in several passes and hence increase the time to complete the coverage measurement.

Another approach to overcome memory constraints is hardware-based. One could use a tool that profiles program execution to obtain branch points in the machine code. Profiling is done in a non-intrusive manner by tapping the microprocessor-memory bus. These branch points are then mapped to the source program and coverage measures such as block and branch coverage computed and displayed.

Coverage measurement for object oriented programs: Object oriented (OO) programs allow a programmer to structure a program in terms of classes and methods. A class encapsulate one or more methods and serves as template for the creation of objects. Thus, while all code coverage criteria discussed in this chapter applies to OO programs, some additional coverage measures can also be devised. These include, method coverage, context coverage, object coverage, and others.

7.9  Tools

A number of open source and commercially available tools exist for measuring code coverage for a given test suite. Below is a sample of such tools.

 

Tool: PaRTeG

Link: http://parteg.sourceforge.net/

Description

 

This is a tool that fits several chapters in this book. It generate tests from UML state machines and class diagrams. Test sets generated from code satisfy MC/DC, state coverage, transition coverage, decision coverage, multiple condition coverage, and boundary coverage. It is also able to generate mutants for Java classes. PaRTeG is available for download as an Eclipse plugin.

 

Tool: Cobertura

Link: http://cobertura.sourceforge.net/

Description

 

This is a freely available tool for measuring code coverage of Java code. It instruments byte code, not Java source, to measure coverage. Statement and branch coverages are measured. Measurement statistics, e.g., number of times a statement has been exercised, are displayed in a graphical form. Lines and branches not covered are also mapped to source code and highlighted.

 

Tool: EclEmma

Link: http://cobertura.sourceforge.net/

Description

 

This is another tool for measuring coverage of Java programs. It is available as an Eclipse as well as JUnit plugin. Statement coverage is supported. Coverage is displayed graphically for each method.

 

Tool: Bullseye

Link: http://www.bullseye.com/

Description

 

This is a commercial tool for measuring coverage of C++ programs. It instruments the source code. It is able to measure function coverage, condition coverage, and decision coverage.

 

Link: http://www.microsoft.com/visualstudio/en-us

Tool: Visual Studio

Description

 

This is an integrated development environment for .Net and C# programs. It measures statement, block and condition coverage.

 

Tool: LDRA Testbed

 

A summary of this tool appears in Chapter 2.

 

SUMMARY

This chapter covers the foundations of test assessment and enhancement using structural coverage criteria. We have divided these criteria into control flow based and data flow based adequacy criteria. Each criterion is defined and illustrated with examples. Examples also show how faults are detected, or missed, when a criterion is satisfied. The simplest of all control flow based criteria such as statement coverage and decision coverage have found their way in to a large number of commercially available testing tools.

We have introduced some of the several data flow based adequacy criteria, namely the all-uses, p-uses, and the c-uses. There are others not covered here and referred to in the bibliographic notes section. While adequacy with respect to data flow based criteria is stronger than that based on control flow based criteria, it has not found much use in commercial software development. However, the availability of good tools such as χSUDS from Telcordia Technologies, and education, promises to change this state of affairs.

This chapter is certainly not a complete treatment of all the different control and data flow based adequacy criteria. There are many more, e.g. data context coverage, and others. Some of the uncovered material is pointed to in the section on bibliographic notes.

Exercises

7.1 Consider the following often found statement in research publications: “Complete testing is not possible.” What completeness criterion forms the basis of this statement? In light of the completeness criteria discussed in this chapter, is this statement correct?

7.2 Enumerate all paths in P7.5. Assume that each loop is required to be traversed zero times and once.

7.3 Why did we decide to input all tests in Τ during a single execution of P7.5? Will the error be revealed if we were to input each test in a separate execution?

7.4 Draw the control flow graph for P7.5. Enumerate all paths through this program that traverse the loop zero times and once.

7.5 Let A, B, and C denote three Booleans. Let Cond = (A and B) or (A and C) or (B and C) be a condition. What is the minimum number of tests required to (a) cover a decision based on Cond and (b) cover all atomic conditions in Cond?

7.6 Construct a program in your favorite language that inputs three integers x, y, and z and computes output O using the specification in Table 7.12. Assume that f1(x, y, z) = x + y + z and f2(x, y, z) = x * y * z.

 

Table 7.12 Computing Ο for the program in Exercise 7.6.

Table 7.12

Introduce an error in your program. Now construct a test set Τ that is decision-adequate, but is not multiple condition-adequate, and does not reveal the error you introduced. Enhance Τ so that it is multiple condition-adequate and does reveal the error. Will all tests that are multiple condition-adequate for your program reveal the error?

7.7 Suppose that test set Τ for a given program and the corresponding requirements is adequate with respect to the statement coverage criterion. Will Τ be always adequate with respect to the block coverage criterion?

7.8 (a) How many decisions are introduced by an if statement? (b) By a switch statement? (c) Does each decision in a program lead to a new path?

7.9 Consider the following claim: If the number of simple conditions in a decision is n, then the number of tests required to cover all conditions is 2n. If possible, construct a programs to show that in general this claim is not true.

7.10 Let A, B, and C denote three Booleans. Let C′ = (A and B) or (A and C) or (B and C) be a condition. What is the minimum number of tests required to (a) cover a decision based on C′ and (b) cover all conditions in C′?

7.11 Modify Τ in Example 7.15 to render it adequate with respect to the condition/decision criterion for P7.9.

7.12 Given simple conditions A, B, and C, derive minimal MC/DC adequate tests for the following conditions: (a) A and Β and C, (b) A or Β or C, and (c) A xor B. Is the test set unique in each case?

7.13 While removing redundant tests from Table 7.6, we selected tests (3, 4) to cover condition C3. What will be the minimal MC-adequate test set if, instead, tests (5, 6) or (7, 8) are selected?

7.14 Are the tests, given for each of the three conditions in (a) Table 7.9 and (b) Table 7.10, unique?

7.15 In Section 7.2.9 we list a simple procedure to derive MC/DC adequate tests for condition C = C1 and C2 and C3 from MC/DC adequate tests for condition C = C1 and C2. Using this procedure as a guide, construct another procedure to (a) derive MC/DC adequate tests for C = C1 or C2 or C3, given MC/DC adequate tests for C = C1 or C2 and (b) derive MC/DC adequate tests for C = C1 xor C2 xor C3, given MC/DC adequate tests for C = C1 xor C2.

7.16 Extend the procedure in Section 7.2.9 to derive MC/DC adequate tests for C = C1 and C2and Cn−1 and Cn, given MC/DC adequate tests for C = C1 and C2and Cn−1.

7.17 Use the procedures derived in Exercises 7.15 and 7.16 to construct an MC/DC adequate test for (a) (C1 and C2) or C3, (b) (C1 or C2) and C3, and (c) (C1 or C2) xor C3.

7.18 Suppose test Τ is found adequate with respect to the condition coverage for some program P. Is it also adequate with respect to decision coverage? Is the opposite true?

7.19 Consider program Ρ containing exactly one compound condition C and no other conditions of any kind. C comprises n simple conditions. Let Τ be the test set adequate for Ρ with respect to the MC/DC criterion. Show that Τ contains at least n and at most 2n test cases.

7.20 Consider a test set Τ adequate for program Ρ with respect to some coverage criterion C. Now suppose that T1T2 = T and that both T1 and T2 are adequate with respect to C. Let P′ be a modified version of P. To test P′, under what conditions would you execute P′ against all of T? Only T1? Only T2?

7.21 Is T3 of Example 7.23 minimal with respect to MC/DC coverage? If not then remove all redundant test cases from T3 and generate a minimal test set.

7.22 (a) In Example 7.23, suppose that condition C2 at line 12 is incorrectly coded as:

12 if((z * z > y) and (prev==“East”))

Will the execution of P7.14 against t5 reveal the error?

(b) Now suppose that in Example 7.23, condition C3 at line 14 is also incorrectly coded as:

12 else if (z * z ≤ y) or (current==“South”))

Note that there are two erroneous statements in the program. Will the execution of the erroneous P7.14 against t5 reveal the error?

(c) Construct a test case that will cause the erroneous P7.14 to produce an incorrect output thereby revealing the error.

(d)What errors in the requirements will likely be revealed by each of the tests t5, t6, … , t9 in Example 7.23?

7.23 Why does the MC/DC adequate test set in Example 7.24 contain four tests when MC/DC adequacy for a conjunction of two simple conditions can be achieved with only three tests?

7.24 (a) Given C = C1 and C2 and . . . and Cn, suppose that C has been coded as C′ = C1 and C2 and . . . and Ci−1 and Ci+1 . . . and Cn; note one missing simple condition in C′. Show that, in the general case, an MC/DC adequate test set is not likely to detect the error in C′. (b) Can we say that an MC/DC adequate test set will not detect this error? (c) Does the error detection capability of the MC/DC criteria change as the number of missing simple conditions increases? (d) Answer parts (a) and (c) assuming that C = C1 or C2 or . . . or Cn and C′ = C1 or C2 or . . . or Ci−1 or Ci+1 · · · or Cn.

7.25 (a) Suppose that P7.14 contains one error in the coding of condition C2. The error is a missing simple condition. The incorrect condition is:

12 if ((z * z > y) and (prev==“East”))

Develop an MC/DC adequate test set for the incorrect version of P7.14. How many of the test cases derived reveal the error? Does the answer to this question remain unchanged for every minimal MC/DC adequate test for this incorrect program?

(b) Answer (a) for the following error in coding C3:

14 else if ((x < y) or (z * zy) and (current==“South”))

7.26 Let A, Β, and C denote Boolean expressions. Consider a compound condition D defined as follows:

D = (A and B) or (A and C) or (B and C)

Assuming that D is used in a decision, construct a minimal set of tests for D that are adequate with respect to (a) decision coverage, (b) condition coverage, (c) multiple condition coverage, and (d) MC/DC coverage.

7.27 Show that a test set adequate with respect to the LCSAJ(MC/DC) criterion may not be adequate with respect to the LCSAJ (MC/DC) criterion.

7.28 There are at least two options to choose from when considering data flow coverage of arrays. One option is to treat the entire array as a single variable. Thus whenever this variable name appears in the context of an assignment, the entire array is considered defined; when it appears in the context of a use, the entire array is considered used. The other option is to treat each element of the array as a distinct variable.

Discuss the impact of the two options on (a) the cost of deriving data flow adequate tests and (b) the cost of keeping track of the coverage values in an automated data flow coverage tool.

7.29 (a) Prove that basis path coverage implies decision coverage. (b) Does decision coverage imply basis path coverage?

7.30 Consider the following statement: “If a graph contains a loop, it has an infinite number of paths.” Is this statement correct?

7.31 Paul Amman and Jeff Offutt have defined simple and prime paths as follows. A simple path from node p to node q in a CFG is a subpath in which no node appears more than once, except that the first and the last nodes might possibly be p. A prime path is a simple path that does not appear as a subpath of any other simple path. A round trip path is a prime path that starts and ends at the same node.

  1. Show that coverage of all prime paths implies coverage of all decisions.
  2. Show that prime path coverage does not imply MC/DC coverage unless each condition in the program is simple.
  3. Show that prime path coverage implies def-use coverage when all path conditions are simple.
  4. Construct all simple, prime, and round trip paths for the CFG in Figure 2.5 (b) and (c).

7.32 (a) In Example 7.30, (1, 2, 5, 6) is considered a def-clear path with respect to count defined at node 1 and used at node 5. Considering that count is defined at node 6, why is this redefinition of count not masking the definition of count at node 1? (b) Construct two complete paths through the data flow graph in Figure7.7 that are def-clear with respect to the definition of count at node 6 and its use in the same node.

7.33 In Example 7.37, suppose that a test set Τ contains only tp. Is Τ adequate with respect to c-use coverage? p-use coverage? Generate additional tests to obtain complete c- and p-use coverages if Τ is inadequate with respect to these criteria.

7.34 In Section 7.3.8 a minimal set of c- and p- uses is listed that when covered by a test set Τ ensure that all feasible c- and p-uses are also covered by T. Show that the given set is minimal.

7.35 (a) Show that indeed set Τ = {tc, tp}, consisting of tests derived in Examples 7.36 and 7.37, is adequate with respect to the p-use criterion, (b) Enhance Τ to obtain T′ adequate with respect to the c-use criterion, (c) Is T′ minimal, i.e. can you reduce T′ and obtain complete c- and p-use coverage from the reduced set?

7.36 Consider a path s through some program P. Variable v is defined along this path at some node nd and used subsequently at some node nu in an assignment, i.e. there is a c-use of v along s. Suppose now that path s is infeasible. Does this imply that the c-use of v at node nu is also infeasible? (b) Suppose now that there is a p-use of v at node np along path s. Given that s is infeasible, is this p-use also infeasible? Explain your answer.

7.37 Consider the following program, due to Boyer, Elapas, and Levitt, and analyzed by Rapps and Weyuker in the context of data flow testing in their seminal publication. The program finds the square root of input p, 0 ≤ p < 1 to accuracy e, 0 < e ≤ 1.

 

Program P7.21

Equation

For P7.21, (a) draw a data flow graph, (b) develop an MC/DC adequate test set Tmcdc, and (c) enhance Tmcdc to generate Τ that is adequate with respect to the all-uses criterion, (d) There is an error in P7.37. Lines 12 and 13 must be interchanged to remove this error. Does Τ reveal the error? (e) Does Tmcdc reveal the error? (f) Will every T, adequate with respect to the all-uses criterion, reveal the error?

7.38 (a) In Example 7.44, what condition must be satisfied by a test case for Program P7.16 to produce an incorrect output? Let us refer to the condition you derive as C1. (b) Is C1 the same as the condition, say C2, required to cover the c-use of yat line 8 corresponding to its definition at line 14? (c) For Program P7.21, what would you conclude regarding the detection if C1 does not imply C2, or vice versa?

7.39 (a) While doing k-dr analysis, it is assumed that each node in the data flow graph corresponds to a simple predicate. Modify the data flow graph shown in Figure 7.6 so that each node in the modified graph corresponds to a simple predicate. (b) Does k-dr coverage for k = 2 imply condition coverage? Multiple condition coverage? MC/DC coverage?

7.40 (a) Draw a data flow graph F for Program P7.22. (b) Construct data context sets for all nodes in F. (c) Construct a test set adequate with respect to data context coverage. Assume that functions foo1(), foo2(), foo3(), foo4(), Pl(), and P2() are defined elsewhere.

 

Program P7.22

Equation

7.41 (a) The table in Example 7.34 lists only the feasible elementary data contexts for various nodes for Program P7.16. Enumerate all elementary data contexts for nodes 4, 5, 6, and 7. Amongst these, identify the infeasible elementary data contexts. (b) How will the answer to (a) change if the initialization z = 0 is removed and the statement at line 4 is replaced by:

4 input (x, y, z, count):

7.42 Construct ODC(n) for all nodes in the data flow graph for Program P7.22. Is the test set constructed in Exercise 7.40 adequate with respect to the ordered data context coverage criterion? If not then enhance it to a set that is adequate with respect to the ordered data context coverage criterion.

7.43 (a) Define data context coverage and ordered data context coverage in a manner analogous to the definitions of elementary and ordered elementary data contexts, respectively. (b) Which of the two sets of definitions of context coverage would you recommend, and why, for the measurement of test adequacy and enhancement?

7.44 Find all infeasible elementary data contexts for node 6 in Figure 7.7.

7.45 For Program P7.23, construct a test set adequate with respect to the p-use criterion but not with respect to the c-use criterion.

 

Program P7.23

Equation

7.46 Show that for programs containing only one conditional statement, and no loops, the p-use criterion subsumes the c-use criterion.

7.47 The statement “Structural testing compares test program behavior against the apparent intention of the source code.” can also be restated as “Functional testing compares test program behavior against the apparent intention of the source code.” Do you agree with these statements? Explain your answer.

7.48 Solve parts (a) and (b) of Exercise 7.28 in Chapter 8.

7.49 Boundary-interior testing proposed by William Howden requires a loop to be iterated zero times and at least once. Construct a program with an error in the loop that can be revealed only if the loop is iterated at least twice. You may use nested loops. Note that it is easy to construct such an example by placing the construction of a function f inside the loop and executing f only in the second or subsequent iteration. Try constructing a more sophisticated program.

7.50 Suppose that a program contains two loops, one nested inside the other. What is the minimum number of boundary-interior tests required to satisfy Howden’s boundary-interior adequacy criterion?

7.51 Critique the following statement obtained from a web site: “White Box tests suffer from the following shortcomings: (a) Creating a false sense of stability and quality. An application can often pass all unit tests and still appear unavailable to users due to a non-functioning component (a database, connection pool, etc.). This is a critical problem, since it makes the IT application team look incompetent, (b) Inability to validate use-case completeness because they use a different interface to the application.”

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.14.252.56