Chapter 12
Law of Rare Events
Science pursues the study of rare events with fervor. The probability distribution
of rare events is a skewed one. In this chapter, we discuss one continuous distribution
and one discrete distribution to represent rare events.
Box 12.1 Age Determination: Carbon Dating
In the mid-1940s, Willard Libby, then at the University of Chicago, realized
that the decay of carbon 14 might lead to a method of dating organic
matter. Wood samples taken from the tombs of two Egyptian kings, Zoser
and Sneferu, were dated by radiocarbon measurement to an average of 2800 BC
plus or minus 250 years. These measurements, published in Science in 1949,
launched the radiocarbon revolution in archaeology and soon led to dramatic
changes in scholarly chronologies. In 1960, Libby was awarded the
Nobel Prize in chemistry for this work. The equation governing the decay of
a radioactive isotope is
$N = N_0 e^{-\lambda t}$
where $N_0$ is the number of atoms of the isotope at time t = 0, N is the number
of atoms left after time t, and λ is a constant that depends on the particular
isotope. It is an exponential decay. Using this equation, the age of the sample
can be determined. (http://en.wikipedia.org/wiki/Radiocarbon_dating)
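Solving the decay equation for t gives t = ln(N_0/N)/λ, which is how an age follows from a measured fraction of remaining carbon 14. The sketch below illustrates that rearrangement only. The half-life of 5730 years is an assumed value that the text does not state, and the function names are hypothetical.

```python
import math

# Assumed value, not given in the text: carbon-14 half-life in years.
C14_HALF_LIFE_YEARS = 5730.0

def decay_constant(half_life):
    """Return lambda from the half-life, using half_life = ln(2) / lambda."""
    return math.log(2) / half_life

def sample_age(fraction_remaining, half_life=C14_HALF_LIFE_YEARS):
    """Solve N = N0 * exp(-lambda * t) for t, given the ratio N / N0."""
    if not 0.0 < fraction_remaining <= 1.0:
        raise ValueError("fraction_remaining must be in (0, 1]")
    return -math.log(fraction_remaining) / decay_constant(half_life)

# A sample that retains half of its carbon 14 is one half-life old.
age_at_half = sample_age(0.5)
print(round(age_at_half))
```

A real laboratory dating also involves calibration steps that this sketch ignores.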
182 Simple Statistical Methods for Software Engineering
Exponential Distribution
It is common knowledge in science that decay is defined as an exponential form,
a simple and beautiful mathematical structure, as given in the following equation:
$f(t) = e^{-\lambda t}$ (12.1)
where λ is a constant describing the rate of decay and t is the time variable.
Nobel laureate Ernest Rutherford used this equation to describe the radioactive
decay of thorium in 1907 [1].
It is the basis for the Nobel Prize in Chemistry he was awarded in 1908
for his investigations into the disintegration of the elements, and the
chemistry of radioactive substances.
If we use a Geiger counter to count the radiated particles, the data will
fit a discrete Poisson distribution. If we measure the loss of weight of the parent or the
interarrival time of particles, the data will fit a continuous exponential distribution.
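The link between the two views can be checked with a small simulation: if interarrival times are drawn from an exponential distribution, the number of arrivals per unit interval should behave like a Poisson count, whose mean and variance are both equal to the rate. This is a minimal sketch. The rate of 3 particles per unit time, the seed, and the function name are illustrative assumptions, not measurements.

```python
import random

def counts_per_interval(rate, n_intervals, seed=0):
    """Simulate arrivals with exponential interarrival times and
    count how many fall in each unit-length interval."""
    rng = random.Random(seed)
    counts = [0] * n_intervals
    t = rng.expovariate(rate)
    while t < n_intervals:
        counts[int(t)] += 1
        t += rng.expovariate(rate)
    return counts

rate = 3.0  # hypothetical mean number of particles per unit time
counts = counts_per_interval(rate, n_intervals=20000)
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / len(counts)

# For Poisson counts, both values should be close to `rate`.
print(mean_count, var_count)
```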
When we fit a curve to Rutherford data, we will obtain the following equation:
$y = 144.03\,e^{-0.172x}, \quad R^2 = 0.9991$ (12.2)
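One common way to fit such a curve is ordinary least squares on ln(y) against x. The text does not say which fitting method was used, so this is offered only as an illustration. The data points below are synthetic, generated from the fitted curve itself. They are not Rutherford's measurements, and the function name is hypothetical.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on ln(y) versus x."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx                      # slope of the log-linear fit
    a = math.exp(mean_l - b * mean_x)  # intercept, transformed back
    return a, b

# Synthetic, noise-free points generated from the fitted curve in the text;
# these are NOT Rutherford's measurements.
days = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
activity = [144.03 * math.exp(-0.172 * d) for d in days]

a, b = fit_exponential(days, activity)
print(round(a, 2), round(b, 3))
```

Because the synthetic points lie exactly on the curve, the fit simply recovers the coefficients it was built from. Real data would leave residuals and an R^2 below 1.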
Figure 12.1 shows the Rutherford data and the exponential plot. The exponential
form fits like a glove to the decay data. Rutherford also defined a parameter
called half-life, the time taken for the parent matter to lose half its weight. On the
exponential graph, half-life represents the median. The half-life point is marked in
Figure 12.2 (4.03 days), where thorium activity becomes half of the start value. The
start value is 144, and the half value is 72. The time required for this loss of activity
is 4.03 days.
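The 4.03-day figure follows from setting the activity to half its start value: 72 = 144e^(−λt) gives t = ln 2/λ. A minimal check, using the decay constant 0.172 per day from the fitted curve (the variable names are illustrative):

```python
import math

lam = 0.172  # decay constant per day, from the fitted curve

half_life = math.log(2) / lam   # median of the exponential distribution
mean_life = 1.0 / lam           # mean of the exponential distribution

print(f"half-life = {half_life:.3f} days")  # half-life = 4.030 days
print(f"mean life = {mean_life:.3f} days")  # mean life = 5.814 days
```

These match the half-life and mean lifetime listed in Figure 12.2.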
The exponential distribution is memoryless.
This can be demonstrated using Figure 12.2. For thorium activity to drop from
72 to a half of 72, that is, 36, it will take another 4.03 days. This is exactly the time
taken for thorium activity to drop from 144 to 72. The second drop takes the same
time as the first drop because the exponential curve has no memory of the first
drop. Each time, decay starts afresh with a new account and a fresh experience of
the same half-life. Half-life is the property of the decaying matter represented in the
exponential form. For thorium activity, it is 4.03 days.
The exponential nature of radioactive decay is exploited in carbon dating (see
Box 12.1).
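The memoryless property can also be stated in one line. As a sketch, let T be the random lifetime with P(T > t) = e^(−λt). The symbols T and s are introduced here for illustration and do not appear in the text.

```latex
\begin{align*}
P(T > s + t \mid T > s)
  &= \frac{P(T > s + t)}{P(T > s)}
   = \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}}
   = e^{-\lambda t}
   = P(T > t).
\end{align*}
```

The elapsed time s cancels, which is why the drop from 72 to 36 takes the same 4.03 days as the drop from 144 to 72.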
Figure 12.1 Exponential distribution of radioactive decay. (Axes: Days versus Activity of X; fitted curve y = 144.03e^(−0.172x), R^2 = 0.9991.)
Figure 12.2 Half-life analysis using exponential distribution. (Axes: Days versus Activity of X; fitted curve y = 144e^(−0.172x), R^2 = 1; decay constant λ = 0.172; mean lifetime = 5.814; half-life, the median, = 4.030.)
It can be seen that Equation 12.1 is just a variant of the proper exponential
probability density function (PDF) shown as follows:
$f(t) = \lambda e^{-\lambda t}$ (12.3)
The cumulative distribution function (CDF) is as follows:
$F(t) = 1 - e^{-\lambda t}$ (12.4)
where t is time and λ is the rate constant. The mean of this distribution is 1/λ. The
standard deviation is also equal to 1/λ.
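The PDF and CDF translate directly into code. The sketch below uses hypothetical function names and reuses λ = 0.172 from the thorium fit purely as an example value. It also shows that the CDF evaluated at the mean 1/λ is 1 − e^(−1), so roughly 63% of the probability lies below the mean.

```python
import math

def exp_pdf(t, lam):
    """PDF f(t) = lam * exp(-lam * t) for t >= 0, else 0."""
    return lam * math.exp(-lam * t) if t >= 0 else 0.0

def exp_cdf(t, lam):
    """CDF F(t) = 1 - exp(-lam * t) for t >= 0, else 0."""
    return 1.0 - math.exp(-lam * t) if t >= 0 else 0.0

lam = 0.172            # example rate constant, reused from the thorium fit
mean = 1.0 / lam       # mean and standard deviation are both 1/lam
cdf_at_mean = exp_cdf(mean, lam)
print(round(cdf_at_mean, 3))  # 1 - e^-1, about 0.632
```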
In engineering, exponential distribution is primarily used in reliability applica-
tions. In the context of reliability, λ is known as the failure rate or hazard rate. In
a chemical engineering example, corrosion rate is represented in exponential form.
In an electrical engineering example, electrical charge stored in a capacitor decays
exponentially. In a geophysics example, atmospheric pressure decreases exponen-
tially with height.
Equation 12.3 shows that a single parameter completely specifies the PDF, a
unique aspect responsible for the simplicity of the equation.
The other model statistics are as follows:
The median is $\ln 2 / \lambda$.
The mode is 0.
The skewness is 2.
The kurtosis is 9.
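These statistics can be checked approximately by simulation. The sketch below draws samples with Python's `random.expovariate` and compares sample values with the theoretical ones. The seed, sample size, and the reuse of λ = 0.172 are illustrative assumptions, and the results are estimates that will differ slightly from theory.

```python
import math
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
lam = 0.172               # example rate constant, reused from the thorium fit
samples = [rng.expovariate(lam) for _ in range(200_000)]

mean = statistics.fmean(samples)
median = statistics.median(samples)
stdev = statistics.pstdev(samples)

# Sample skewness: third central moment divided by the cube of the
# standard deviation.
skew = sum((x - mean) ** 3 for x in samples) / len(samples) / stdev ** 3

print(f"mean     {mean:.3f}  (theory {1 / lam:.3f})")
print(f"median   {median:.3f}  (theory {math.log(2) / lam:.3f})")
print(f"stdev    {stdev:.3f}  (theory {1 / lam:.3f})")
print(f"skewness {skew:.3f}  (theory 2)")
```

Kurtosis is left out because its sample estimate converges slowly for a distribution with a long tail.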
The metric "% software defects discovered during system testing" decreases
exponentially with time, as shown in Figure 12.3. Initial test effort discovers more
defects, and subsequent tests begin to show fewer results, a common experience
in software testing. We assume that risky modules are tested first, as per a
well-designed test strategy. Representing defect metrics is a classic application of the
exponential model.
Defects found in a testing day are counted and summed up to obtain Figure
12.3. The x-axis of the plot could be test day or even calendar day. We can plot total
defects found every week and establish the exponential nature.
In reliability analysis, the median value $\ln 2 / \lambda$ is called half-life. The mean 1/λ
is known as mean time to fail (MTTF). Also, $f(t) = e^{-\lambda t}$ is known as the survival
function or reliability function. If the MTTF of a bulb is 400 hours, the corresponding
f(t) would define the reliability of the bulb. As time goes on, the reliability would
decrease, notably after 400 hours, and the reliability of the bulb can be calculated
directly from the following expression:
$\text{Bulb reliability} = e^{-t/400}$
where t is elapsed time in hours (see Box 12.2 for a note on a Super Bulb burning
for 113 years).
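The bulb expression is easy to evaluate. This short sketch uses a hypothetical function name and elapsed times chosen only for illustration, with λ = 1/MTTF = 1/400 per hour.

```python
import math

MTTF_HOURS = 400.0  # mean time to fail from the bulb example

def bulb_reliability(t_hours, mttf=MTTF_HOURS):
    """Probability that the bulb is still working after t_hours."""
    return math.exp(-t_hours / mttf)

for t in (100, 400, 800):
    print(t, round(bulb_reliability(t), 3))
# 100 0.779
# 400 0.368
# 800 0.135
```

Note that at t equal to the MTTF the reliability is e^(−1), about 0.37, not 0.5. The 50% point is the half-life, ln 2 × 400, or about 277 hours.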
In software applications, there is a reversal of thinking: the CDF $F(t) = 1 - e^{-\lambda t}$
is known as the reliability function.
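As an illustration of this use of the CDF, the sketch below computes the fraction of defects found by a given test day and inverts the CDF to estimate how many days are needed to reach a target fraction. The discovery rate of 0.2 per day and the function names are hypothetical and are not taken from Figure 12.3.

```python
import math

def fraction_found(t_days, lam):
    """Fraction of defects discovered by day t: F(t) = 1 - exp(-lam * t)."""
    return 1.0 - math.exp(-lam * t_days)

def days_to_find(fraction, lam):
    """Invert the CDF: t = -ln(1 - fraction) / lam."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return -math.log(1.0 - fraction) / lam

lam = 0.2  # hypothetical defect discovery rate per test day

found_by_day_10 = fraction_found(10, lam)
days_for_95_percent = days_to_find(0.95, lam)
print(round(found_by_day_10, 3), round(days_for_95_percent, 1))
```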
As time progresses, more defects are removed, and the product becomes more
reliable, contrasting with the bulb. It is rather easy to calculate the fraction found
Box 12.2 Super Bulb
The Centennial Light is the world's longest-lasting light bulb. It is at 4550 East
Avenue, Livermore, California, and maintained by the Livermore-Pleasanton
Fire Department. The fire department says that the bulb is at least 113 years
old and has been turned off only a handful of times. It is a 4-watt, hand-blown,
carbon filament, common light bulb manufactured by the Shelby Electric
Company in Shelby, Ohio, in the late 1890s. The Livermore-Pleasanton Fire
Department plans to house and maintain the bulb for the rest of its life.
(http://en.wikipedia.org/wiki/Livermore-Pleasanton_Fire_Department)
Figure 12.3 Exponential distribution of defect discovery. (Axes: System test day, 0 to 20, versus Relative defects discovered, 0.000 to 0.180.)