196 Simple Statistical Methods for Software Engineering
irregular; defects seem to be triggered at different rates in different periods. There
seem to be short test cycles within the main testing process. The curve also has several linear climbs. These details mark a clear departure from the simple exponential,
making a case for building an NHPP.
The equation for an NHPP is given as follows:
\[ P(x, m(t)) = \frac{e^{-m(t)}\,[m(t)]^{x}}{x!} \tag{12.13} \]
where m(t) is the mean value function that takes the place of the traditional failure
rate constant λ in the HPP model presented in Equation 12.5.
It may be noted that Equation 12.13 is exactly Equation 12.5, except for a redefinition of the rate constant λ. An NHPP is completely defined by its mean value function m(t). Building an NHPP model reduces to identifying the right function
for m(t) and deriving the parameters of that function from failure data.
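To make the role of m(t) concrete, here is a minimal sketch of Equation 12.13 in Python. The function names and the linear mean value function are illustrative assumptions, not part of the source; a linear m(t) simply reduces the NHPP to the homogeneous case.

```python
import math

def nhpp_pmf(x, t, mean_value_fn):
    """P(x, m(t)) from Equation 12.13: probability of x failures by time t."""
    m = mean_value_fn(t)
    return math.exp(-m) * m ** x / math.factorial(x)

# Hypothetical mean value function for illustration: m(t) = 0.5 * t,
# i.e. a constant failure rate of 0.5 per unit time (the HPP special case).
def linear_mvf(t):
    return 0.5 * t

# Probabilities of observing 0, 1, 2, ... failures by t = 4 (so m(t) = 2).
probs = [nhpp_pmf(x, 4.0, linear_mvf) for x in range(60)]
total = sum(probs)  # should be very close to 1

print([round(p, 4) for p in probs[:4]])
print(round(total, 6))
```

Swapping in a different `mean_value_fn` is the only change needed to move from one NHPP model to another, which is the sense in which the mean value function defines the model.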
There are many options available for choosing a function for m(t). Researchers have
used different functions to suit different situations. The list includes exponential,
logarithmic, Gaussian, Weibull, and logistic functions. Even mixtures of functions have been used to deal with complex events.
It is now customary to think of an NHPP in terms of two equations. The bigger Poisson
equation in Equation 12.13 defines the structure, and the mean value function in
Equations 12.10–12.13 defines a central component. In fitting an NHPP to data, we
derive the coefficients of the mean value function m(t) from data. There is no need
to consult the Poisson equation for this purpose. The Poisson is in the background,
as an abstraction of the model.
Think of NHPP, think of mean value function.
An early application of the NHPP power law is by Duane [3], who in 1964 observed,
When he plotted cumulative MTBF estimates versus the times
of failure on log-log paper, the points tended to line up follow-
ing a straight line. This was true for many different sets of reli-
ability improvement data and many other engineers have seen
similar results over the last three decades. This type of plot is
called a Duane Plot and the slope beta of the best line through
the points is called the reliability growth slope or Duane plot
slope. A straight line on a Duane plot is equivalent to the NHPP
Power Law Model.
The NHPP power law has been used as a model for “reliability improvement.”
Law of Rare Events 197
Goel–Okumoto (GO-NHPP) Model
Many NHPP-based models have been proposed in software reliability studies.
The main reason for selecting the NHPP technique is that it provides
testers with an analytical framework that helps in identifying the faults derived from the
software during the testing process, as noted by Vamsidhar et al. [4].
One of the most widely used models is the Goel–Okumoto NHPP model.
On the basis of their study of actual failure data from many systems, Goel and
Okumoto proposed the following exponential mean value function for their
NHPP model [5]:
\[ m(t) = a\left(1 - e^{-bt}\right) \tag{12.14} \]
where m(t) is the expected cumulative number of defects function, a is the expected
total number of defects in the system, and b is the defect detection rate per defect.
The model assumes exponential behavior of failure and perfect debugging, so
that failure intensity reduces with time. Its form replicates the cumulative
exponential distribution given in Equation 12.4. The familiar λ, the failure rate
constant, is now called b, the detection rate constant, supporting the paradigm
that failure events in software are detection events. b could also stand for test case
efficiency. The constant a is a scaling factor, introduced to represent the number
of defects in the product. The Goel–Okumoto model treats a as a parameter to be
estimated from data.
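The statement that failure intensity reduces with time can be checked directly from Equation 12.14. In the short derivation below, the symbol λ(t) for the failure intensity is our own notation, introduced only for illustration:

```latex
\begin{align*}
\lambda(t) &= \frac{d\,m(t)}{dt} = a\,b\,e^{-bt}, \\
a - m(t)   &= a\,e^{-bt}, \\
\lambda(t) &= b\,\bigl[a - m(t)\bigr], \qquad \lim_{t \to \infty} m(t) = a .
\end{align*}
```

The intensity decays exponentially, and at any time it equals b times the expected number of defects still undetected, which is consistent with reading b as a detection rate per defect and a as the expected total.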
It may be noted that the Goel–Okumoto model fits data to the exponential
distribution. This strengthens the application of the exponential law in software
reliability engineering. What is interesting is that even in NHPP, the exponential law
prevails as a fundamental principle. This upholds a universal view: “the exponential
function is used to generate several other functions.”
By substituting the Goel–Okumoto mean value function in the NHPP equation, we get the following detailed expression of NHPP:
\[ P(x, m(t)) = \frac{e^{-a\left(1 - e^{-bt}\right)}\,\left[a\left(1 - e^{-bt}\right)\right]^{x}}{x!} \tag{12.15} \]
where x takes discrete integer values 0, 1, 2, and so on.
The detailed expression still has only two parameters, a and b. Applying Equation
12.9 for any given time t, we can create the Poisson probabilities of finding x
defects. Plotting an NHPP is a complex thing to do. We have plotted the
mean value function in Figure 12.13. Alongside, we have also plotted cumulative
NHPP probabilities for a few discrete x values, namely x = 0, x = 1,
x = 2, and x = 3. This creates a family of curves for Equation 12.9.
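A family of curves of this kind can be tabulated with a few lines of Python. This is a sketch under assumptions: it evaluates the probability in Equation 12.15 directly, uses the parameter values a = 1 and b = 0.07 shown in Figure 12.13, and the function names and the 20-day sampling grid are our own choices.

```python
import math

A, B = 1.0, 0.07  # parameter values shown in Figure 12.13

def go_mvf(t, a=A, b=B):
    """Goel-Okumoto mean value function, Equation 12.14."""
    return a * (1.0 - math.exp(-b * t))

def go_nhpp_pmf(x, t, a=A, b=B):
    """Probability of exactly x defects by time t, Equation 12.15."""
    m = go_mvf(t, a, b)
    return math.exp(-m) * m ** x / math.factorial(x)

# One row per sampled day: the mean value function and P(x, m(t)) for x = 0..3.
print("day    m(t)  " + "  ".join(f"P(x={x})" for x in range(4)))
for t in range(0, 101, 20):
    row = "  ".join(f"{go_nhpp_pmf(x, t):6.4f}" for x in range(4))
    print(f"{t:3d}  {go_mvf(t):6.4f}  {row}")
```

Each column traces one member of the family: as t grows, m(t) approaches a, so every curve settles at the Poisson probability for a mean of a.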
Fitting the mean value function to data is the real job in building an NHPP—a
curve-fitting job, an empirical task.
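As an illustration of that curve-fitting job, the sketch below fits Equation 12.14 to cumulative defect counts by least squares. Several things here are assumptions rather than the source's method: least squares is chosen for simplicity, the grid search over b is a deliberately naive optimizer, and the data are synthetic, generated from a = 120 and b = 0.05. For a fixed b, the best a has a closed form, so only b needs to be searched.

```python
import math

def fit_go(times, cum_defects, b_grid):
    """Least-squares fit of m(t) = a * (1 - exp(-b * t)).

    For each candidate b, the optimal a is sum(y*g) / sum(g*g) with
    g = 1 - exp(-b*t); the (a, b) pair with the smallest squared error wins.
    """
    best = None
    for b in b_grid:
        g = [1.0 - math.exp(-b * t) for t in times]
        denom = sum(gi * gi for gi in g)
        if denom == 0.0:
            continue
        a = sum(y * gi for y, gi in zip(cum_defects, g)) / denom
        sse = sum((y - a * gi) ** 2 for y, gi in zip(cum_defects, g))
        if best is None or sse < best[2]:
            best = (a, b, sse)
    return best

# Synthetic, noise-free data (illustrative only): true a = 120, b = 0.05.
times = list(range(5, 101, 5))
observed = [120.0 * (1.0 - math.exp(-0.05 * t)) for t in times]

b_grid = [k / 1000.0 for k in range(1, 301)]  # candidate b from 0.001 to 0.300
a_hat, b_hat, sse = fit_go(times, observed, b_grid)
print(f"a = {a_hat:.2f}, b = {b_hat:.3f}, SSE = {sse:.3e}")
```

With real failure data the fit will not be exact, and the estimate of a can move noticeably as data points are added, so it is worth refitting as testing progresses.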
God does not care about our mathematical difficulties.
He integrates empirically.
Albert Einstein
There are several references to the use of the Goel–Okumoto model, often
called the GO model. A few are mentioned in the following section.
Different Applications of Goel–Okumoto (GO) Model
The law of rare events is fully realized in the structure of the GO-NHPP model, as we
have seen. The GO-NHPP model has been extensively researched and used as a
Software Reliability Growth Model (SRGM). A few attempts are listed below.
The wide variety of applications of the GO model explores the model's features
and identifies its limits.
[Figure: mean value function (MVF) m(t) = a(1 − e^(−bt)) with a = 1, b = 0.07, plotted together with NHPP probability curves P(x, m(t)) for several discrete x values. x-axis: Time (days), 0 to 100; y-axis: Cumulative defects (%).]
Figure 12.13 Goel–Okumoto model.
1. Nagar and ankachan [6] have used the GO model to “decide the amount
of more testing required and for the correct estimation of the remaining
errors.ey call b as a roundness factor, similar to a shape factor that tends
to zero when irregularity increases.
2. Wood [7] considers nine SRGMs and has included the GO model in his
study. He draws special attention to the collection of time data, the argument
in SRGMs. He also proposes a two-stage NHPP if a significant amount of
new code is added during the test period.
3. In a survey of software reliability models, Pai [8] considers the GO model.
He makes a salient observation regarding the GO model usage, “The model
requires failure counts in the testing intervals and completion time for each
test period for parameter estimation.”
In our opinion, this practice gives extra credibility to the mean value func-
tion and invests it with more decision making power. An exponential model
that uses sums of the defects found in test intervals and the completion time
of the test interval as the argument does not depend so much on Poisson
abstraction.
4. Liu et al. [9] propose a generalized NHPP that uses a bell curve for fault detection rate. The bell curve handles variations due to fluctuations in debugging,
learning, and fault removal efficiency. The results show that the proposed
model fits failure data better than some selected NHPP models, including
the GO model.
5. Anjum et al. [10] have evaluated 16 SRGMs proposed during the past 30 years
using a set of 12 comparison criteria. They find the GO model in position 6
from the top. Surprisingly, they find the generalized Goel model in the 14th
position.
The generalized Goel model does not use the simple exponential law
for its mean value function but adds a third parameter to generate desired
shape changes. A graph of the generalized Goel model is available in
Chapter 21.
6. Mohd and Nazir [11] have studied different reliability models and find an
interesting characteristic in the GO model.
It should be noted that here the number of faults to be detected is treated
as a random variable whose observed value depends on the test and other
environmental factors. is is a fundamental departure from the other mod-
els which treat the number of faults to be a fixed unknown constant.
7. Kim et al. [12] find that the GO model can be applied to safety-critical software, although
it is generally known that software reliability growth models such as the
Jelinski-Moranda model and the Goel–Okumoto's non-homogeneous
Poisson process (NHPP) model cannot be applied to safety-critical
software due to a lack of software failure data.
Their analysis confirms the fear: the estimated total number of inherent
software faults varies from 27.32 to 34.83 for the GO model as the software
failure numbers change from 24 to 34. Results are sensitive to the number of
failure data points.
8. Lin and Huang [13] find the Weibull model better than the GO model in a
special application but choose to refer to the GO model as a benchmark.
9. Gokhale and Trivedi [14] propose an enhanced NHPP in which the mean value
function is treated as a coverage function, and use the log-logistic function instead of
the exponential to get better results than the GO model.
Box 12.5 Rare Baseball Events
Huber and Glen [15] have studied three sets of rare baseball events—pitching
a no-hit game, hitting for the cycle, and turning a triple play—which offer
excellent examples of events whose occurrence may be modeled as Poisson
processes. From 1901 to 2004, there have been 206 no-hitters, 225 cycles,
and 511 triple plays. The associated mean values per year have been calculated
as follows:
No-hitter = 1.98.
Cycle = 2.16.
Triple plays = 4.91.
The above mean values characterize the respective Poisson distributions.
The researchers have also calculated mean interarrival times as follows:
No-hitter = 772 games.
Cycle = 720 games.
Triple play = 316 games.
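The yearly means quoted above follow from dividing each count by the 104 seasons from 1901 to 2004. The sketch below reproduces them and, assuming each event type is Poisson-distributed with that yearly mean, computes the chance of a season with no such event. The function name and the choice of the "none in a year" probability are ours, added for illustration.

```python
import math

def poisson_pmf(x, mean):
    """Poisson probability of exactly x events given the mean count."""
    return math.exp(-mean) * mean ** x / math.factorial(x)

years = 2004 - 1901 + 1  # 104 seasons, inclusive
counts = {"No-hitter": 206, "Cycle": 225, "Triple play": 511}

for event, n in counts.items():
    mean = n / years                # mean events per year
    p_none = poisson_pmf(0, mean)   # probability of a season with no such event
    print(f"{event}: mean/yr = {mean:.2f}, P(none in a year) = {p_none:.3f}")
```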