336 Simple Statistical Methods for Software Engineering
Both B and C control the inflection point. At the inflection point, growth
attains 0.36788 of the plateau value A (this number is $1/e$, where e is
Euler's number). This means that 36.788% of the total defects have been found
at the inflection point. This property characterizes the Gompertzian way of
testing and finding defects. The inflection point, as we have already seen,
also represents the peak value of the growth rate (here, the defect discovery
rate).
The time at which the inflection occurs has been defined by Kececioglu et al. [4].
For the different values of B and C used in Figure 21.1, the following inflection
times have been computed:
B                 0.1     0.01    0.001   0.1     0.01    0.001
C                 0.3     0.3     0.3     0.6     0.6     0.6
Inflection time   0.693   1.268   1.605   1.633   2.990   3.783
From these results, we may see the influence of B and C on inflection time: a
smaller B or a larger C delays the inflection. A large inflection time indicates
delayed defect discovery.
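The tabulated inflection times can be cross-checked with a short calculation. The sketch below assumes the three-parameter form G(t) = A·B^(C^t) and solves G(t) = A/e for t. The closed form in the comment is derived here from the 1/e property, not quoted from [4], and the function name is only illustrative.

```python
import math


def inflection_time(b: float, c: float) -> float:
    """Time at which G(t) = A * b**(c**t) reaches A/e, for 0 < b < 1 and 0 < c < 1."""
    if not (0.0 < b < 1.0 and 0.0 < c < 1.0):
        raise ValueError("b and c must lie strictly between 0 and 1")
    # b**(c**t) = 1/e  =>  c**t = -1/ln(b)  =>  t = ln(-1/ln(b)) / ln(c)
    return math.log(-1.0 / math.log(b)) / math.log(c)


# Same (B, C) combinations as in the table above
for c in (0.3, 0.6):
    for b in (0.1, 0.01, 0.001):
        print(f"B = {b:<5}  C = {c}  inflection time = {inflection_time(b, c):.3f}")
```

Rounded to three decimals, the six values agree with the table. The plateau A does not appear in the formula, so under this form the inflection time depends only on B and C.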
Dimitri Shift
Kececioglu et al. [4] worked out a modified Gompertz curve by adding a constant
D to displace the curve vertically by a distance D. This shift results in a
four-parameter Gompertz model defined in the following equation:
$$G(t) = D + A\left(B^{C^{t}}\right) \tag{21.2}$$
This shift of the Gompertz curve along the y axis is analogous to the familiar
location shift along the x axis we have seen in Chapter 20. The shift factor D
acts as a kind of location parameter. A plot of the shifted Gompertz curve is
shown in Figure 21.3.
The shifted curve suggests a significant amount of defect discovery immediately
after the start, which does not agree with intuition. However, Kececioglu
et al. [4] claimed that the modified model fits data better. They observe from
several data sets that
Reliability growth data could not be adequately portrayed by the conventional
Gompertz model. They point out that the reason is the model's fixed value of
reliability at its inflection point. As a result, only a small fraction of
reliability growth datasets following an S-shaped pattern could be fitted.
Gompertz Software Reliability Growth Model 337
In any case, the four-parameter model offers more options during curve fitting, and
this could be an advantage.
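To make the effect of the shift concrete, the following minimal sketch evaluates Equation 21.2 with and without D. Only A = 100 is taken from Figure 21.3. The values of B, C, and D are hypothetical, chosen purely for illustration, and the function name is an assumption.

```python
def shifted_gompertz(t: float, a: float, b: float, c: float, d: float = 0.0) -> float:
    """Four-parameter Gompertz curve G(t) = d + a * b**(c**t) (Equation 21.2)."""
    return d + a * b ** (c ** t)


A = 100.0                    # plateau of the unshifted curve, as in Figure 21.3
B, C, D = 0.01, 0.5, 10.0    # hypothetical values, for illustration only

for t in range(0, 11, 2):
    original = shifted_gompertz(t, A, B, C)
    shifted = shifted_gompertz(t, A, B, C, D)
    print(f"t = {t:2d}  original = {original:7.3f}  shifted = {shifted:7.3f}")
```

Under this form the shifted curve starts at D + A·B when t = 0 and levels off at D + A, so D raises both the starting value and the plateau by the same amount.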
Predicting with the Gompertz Model
Once test data arrive, we may wish to begin asking the following questions:
How many more defects remain in the application?
How long would it take to detect those defects?
These are the prediction questions the Gompertz model strives to answer. There
are two prediction scenarios. The first is when we use an auxiliary model to
estimate defects in the application. For example, we might predict defects based
on the size and complexity of the application using regression equations based on
historical data. In this case, we know the constant A. All we have to do is to
derive the remaining parameters B, C, and D from data and predict time.
In the second scenario, all four parameters are unknown. We do not have any
estimate of the defects in the application. In this case, one shortcut is to wait
for the defect discovery rate to go through a peak and start showing signs of
steady decline. The peak is the Gompertzian inflection point. If the number of
defects found until the inflection point is known, then the overall defects A can
be obtained by using the following relationship:
Defects found until inflection = total defects in the application × 0.36788
This calculation completes the prediction of total defects in the application.
Figure 21.3 Modified Gompertz reliability growth curve with shift. (Plot of
cumulative defects fixed, G(t), against time, with A = 100, showing the original
curve and the curve shifted vertically by D.)
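The inflection-point shortcut for the second scenario can be sketched in a few lines. The weekly counts below are hypothetical, and treating the week with the highest count as the inflection point is a discrete approximation. The 0.36788 rule is applied as stated for the unshifted curve (D = 0), since a vertical shift would move the value at the inflection to D + A/e.

```python
import math


def total_defects_from_inflection(found_at_inflection: float) -> float:
    """Estimate the plateau A from defects found up to the inflection point."""
    return found_at_inflection * math.e   # same as dividing by 0.36788


# Hypothetical weekly defect counts; the peak week is treated as the inflection.
weekly_found = [3, 8, 15, 24, 30, 27, 21, 16]
peak_week = max(range(len(weekly_found)), key=weekly_found.__getitem__)
found_until_peak = sum(weekly_found[: peak_week + 1])

total = total_defects_from_inflection(found_until_peak)
remaining = total - sum(weekly_found)

print(f"Defects found until the peak week: {found_until_peak}")
print(f"Estimated total defects A:         {total:.1f}")
print(f"Estimated defects still remaining: {remaining:.1f}")
```

The estimate is only as good as the identification of the peak, so it is worth recomputing once a few more weeks confirm that the discovery rate is really declining.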
We can now estimate the constants B, C, and D by iterating to the least-squares
error or by any other curve-fitting technique. Once these three constants are
known, we can predict the time taken to achieve a given percentage of
reliability. Curve-fitting techniques do not assume Gompertzian behavior but
force-fit the Gompertz curve to data. The coefficient of determination $R^2$ or
any other assessment of error in the fitted curve should therefore be used to
determine the quality of predictions.
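As one possible illustration of the fitting and prediction step, the sketch below estimates B and C by ordinary least squares on a linearized form of the curve, with A known and D held fixed (zero by default). This is a simplification swapped in for the iterative analysis mentioned above: it minimizes error in the transformed space, not in the original defect counts. The data are synthetic, generated from the constants A = 433, B = 0.001, and C = 0.74, and all function names are hypothetical.

```python
import math


def gompertz(t, a, b, c, d=0.0):
    """G(t) = d + a * b**(c**t)."""
    return d + a * b ** (c ** t)


def fit_b_c(times, cumulative, a, d=0.0):
    """Fit B and C with a and d fixed, using
    ln(-ln((G - d) / a)) = ln(-ln B) + t * ln C and a straight-line fit."""
    xs, ys = [], []
    for t, g in zip(times, cumulative):
        ratio = (g - d) / a
        if 0.0 < ratio < 1.0:              # the double log is defined only here
            xs.append(t)
            ys.append(math.log(-math.log(ratio)))
    if len(xs) < 2:
        raise ValueError("need at least two usable data points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(-math.exp(intercept)), math.exp(slope)   # (B, C)


def r_squared(times, cumulative, a, b, c, d=0.0):
    """Coefficient of determination of the fitted curve on the original scale."""
    mean_g = sum(cumulative) / len(cumulative)
    ss_res = sum((g - gompertz(t, a, b, c, d)) ** 2 for t, g in zip(times, cumulative))
    ss_tot = sum((g - mean_g) ** 2 for g in cumulative)
    return 1.0 - ss_res / ss_tot


def time_to_reach(fraction, b, c):
    """Time at which the unshifted curve reaches the given fraction of A."""
    return math.log(math.log(fraction) / math.log(b)) / math.log(c)


# Synthetic, noise-free data, for illustration only
weeks = list(range(1, 21))
observed = [gompertz(t, 433.0, 0.001, 0.74) for t in weeks]

b_hat, c_hat = fit_b_c(weeks, observed, a=433.0)
print(f"B = {b_hat:.4f}, C = {c_hat:.4f}")
print(f"R^2 = {r_squared(weeks, observed, 433.0, b_hat, c_hat):.4f}")
print(f"Weeks to reach 95% of A: {time_to_reach(0.95, b_hat, c_hat):.1f}")
```

With real, noisy data the fit will not be exact, and a low $R^2$ is a signal that the Gompertz shape is being forced onto data that do not follow it.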
Kececioglu et al. [4] have proposed a way to extract the parameters. The data are
divided into three groups with an equal number of values, and formulas are given
to determine the constants.
Box 21.2 Gompertz Curve for Growth of Sharks
Sharks are the top predators and play important roles in marine ecosystems.
Annual yields of small sharks in Taiwan declined dramatically from 5699
tons in 1993 to 510 tons in 2008, which implies that these stocks, mainly
caught by trawlers and long-liners in coastal waters off Taiwan, have
experienced heavy exploitation in recent years.
The blacktip sawtail catshark is a small species that inhabits tropical and
subtropical coastal waters of the western Pacific region. In Taiwan, this
species is found in coastal waters of western and northern Taiwan and is one of
the most important small shark species. The growth pattern of this species
has been studied by Liu et al. [5].
The growth data of 275 female sharks have been fitted to growth equations
such as Gompertz, as shown in Figure 21.4.
$$L_t = A\,e^{B e^{Ct}}$$
where the parameters have been estimated as
$L_t$ = length in cm at age t years
A = asymptotic length, 52.8 cm
B = −2.28
C = −0.232
The above Gompertz equation is an adaptation by Kwang-Ming Liu et al. The
constants are constrained differently: here B and C are negative numbers.
Figure 21.4 Gompertz curve for shark growth. (Plot of length in cm against age
in years.)
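The shark growth equation in Box 21.2 can be evaluated directly, and it can also be related to the form A·B^(C^t) used elsewhere in this chapter. The sketch below uses the estimates quoted in the box. The conversion b = e^B and c = e^C is an algebraic identity worked out here, not something stated by Liu et al., and the function and variable names are illustrative.

```python
import math

A_CM, B, C = 52.8, -2.28, -0.232   # estimates quoted in Box 21.2


def shark_length(age_years: float) -> float:
    """Length in cm from L_t = A * exp(B * exp(C * t))."""
    return A_CM * math.exp(B * math.exp(C * age_years))


# Equivalent constants for the form A * b**(c**t)
b_equiv = math.exp(B)
c_equiv = math.exp(C)

for age in (0, 2, 5, 10, 20):
    same_curve = A_CM * b_equiv ** (c_equiv ** age)
    print(f"age {age:2d} y: {shark_length(age):5.1f} cm (alternative form: {same_curve:5.1f} cm)")
```

Both columns agree because A·e^(B·e^(Ct)) = A·(e^B)^((e^C)^t). Negative B and C in this adaptation therefore correspond to constants between 0 and 1 in the other form.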
More Attempts on Gompertzian Software
Reliability Growth Model (SRGM)
The Gompertz curve is easily one of the most widely used models. There are
several examples of the application of Gompertz to predict software reliability.
There have been different adaptations and different interpretations, each adding
to the insight into the Gompertz curve.
The following interpretations of the Gompertz equation by its different users
and the values of model constants are noteworthy.
Stringfellow and Andrews
Stringfellow and Andrews [3] built a Gompertzian SRGM shown in Figure 21.5. They
fitted a model with B = 0.001 and C = 0.74. The model was built from failure data
collected from a large medical record system, consisting of 188 software components.
The low value of B suggests delayed discovery and initially slow progress in testing.
The criteria used by Stringfellow and Andrews in model evaluation are simple
and effective:
Curve fit: How well a curve fits is given by a goodness-of-fit (GOF) test: the
$R^2$ value.
Prediction stability: A prediction is stable if the prediction in a given week is
within 10% of the prediction in the previous week.
Predictive ability: Error of predictive ability is measured in terms of error
(estimate − actual) and relative error (error/actual).
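The stability and predictive-ability criteria translate into simple checks. The sketch below is one reading of them, assuming that "within 10%" is measured relative to the previous week's prediction. The weekly predictions and the actual count are hypothetical numbers, and the function names are not taken from [3].

```python
def is_stable(previous: float, current: float, tolerance: float = 0.10) -> bool:
    """True if this week's prediction is within `tolerance` of last week's."""
    return abs(current - previous) <= tolerance * abs(previous)


def prediction_error(estimate: float, actual: float) -> float:
    """Error = estimate - actual."""
    return estimate - actual


def relative_error(estimate: float, actual: float) -> float:
    """Relative error = error / actual."""
    return prediction_error(estimate, actual) / actual


# Hypothetical weekly predictions of total defects, and a hypothetical final count
weekly_predictions = [600, 520, 470, 445, 438]
actual_total = 430

for last, this in zip(weekly_predictions, weekly_predictions[1:]):
    print(f"{last} -> {this}: stable = {is_stable(last, this)}, "
          f"relative error = {relative_error(this, actual_total):+.3f}")
```

In practice a model would be trusted for release decisions only after its predictions have stayed stable for some weeks and its fit to the data is acceptable.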
Stringfellow and Andrews noted that “Gompertz performed better for Release 1
but not for Release 2 and Release 3.”
Figure 21.5 Stringfellow’s version of Gompertz reliability growth model. (Plot of
cumulative defects found and fixed against test week for $y = A(B^{C^{t}})$ with
A = 433, B = 0.001, and C = 0.74.)
Zeide
Boris Zeide [6] observed,
Another characteristic feature of the Gompertz equation is that
the position of the inflection point is controlled by only one
parameter, asymptotic size, A.
Swamydoss and Kadhar Nawaz
Swamydoss and Kadhar Nawaz [7] provided a physical interpretation to the param-
eters, as follows:
A: initially, A is taken as the total defects detected to date
B: the rate at which the defect rate decreases, or test case efficiency (0.27, in their study)
C: a constant, a shape parameter
Swamydoss and Kadhar Nawaz report that they have collected cumulative fail-
ures found every week and constructed the Gompertz model by curve fitting and
used the model to predict reliability. If the predicted reliability values were less than
the threshold, they would continue system testing.
Arif et al.
In the adaptation of Arif et al. [8], the coefficients B and C are negative numbers,
not fractions.
Anjum et al.
Anjum et al. [9] published a Gompertz model with A = 191.787, B = 0.242, and
C = 0.05972.
Bäumer and Seidler
Bäumer and Seidler [10] reported poor performance of Gompertz.
Ohishi et al.
Ohishi et al. [11] used the Gompertz distribution with a different mathematical
structure. The equation, fitted to data, and a plot of the equation are shown in
Figure 21.6.