Chapter 19
Weibull Distribution:
A Tool for Engineers
The genesis of the Weibull distribution goes back to the Rayleigh distribution. The
Rayleigh distribution was invented in 1905 (see Box 19.1). The Weibull distribution
was invented much later by Waloddi Weibull in Sweden in 1939 (see a brief
biographical note in Box 19.2). Since its discovery, the Weibull distribution
has had great influence in engineering, especially in reliability analysis. The scope
of its application has widened to several areas of physics and management (see Box
19.3 for a few instances).
Box 19.1 Rayleigh Flight
It so happens that the Rayleigh distribution, invented in 1905, is a special
case of the Weibull distribution invented in 1939. The Weibull distribution
type II with location constant 0 and shape factor 2 is the Rayleigh curve.
Both are skewed to the right.
The discovery follows research on the historic random walk problem
posed by Pearson.
Pearson posed his problem in Nature (July 27, 1905).
A man starts from a point O and walks l yards in a straight line; he then
turns through any angle whatever and walks another l yards in a second
300 Simple Statistical Methods for Software Engineering
IBM’s Peter Norden (see Box 19.4) favored the Weibull equation over the logis-
tic curve to model software development project cost. Lawrence Putnam promoted
the Weibull curve with shape factor = 2 as a Rayleigh distribution. Putnam used
this in his estimation model SLIM to estimate cost and defects. He went further to
call this distribution the software equation.
The Weibull distribution is by far the world’s most popular
statistical model for life data. It is also used in many other
straight line. He repeats this process n times. I require the probability that
after n of these stretches he is at a distance between r and r + δr from his
starting point.
Rayleigh pointed out that, for large values of n, the answer is
$$\frac{2}{n l^{2}}\, e^{-r^{2}/(n l^{2})}\, r\, \delta r \tag{19.1}$$
This actually has the shape of a normal distribution, centered at the origin.
In this equation Rayleigh assumed that the drunkard walks in one dimension.
The model suggests that the drunkard will return to the origin after a
random walk. If we allow two additional dimensions and solve the problem,
a new phenomenon called Rayleigh Flight occurs. The distribution now is
Rayleigh. The drunkard will not return to the origin.
Rayleigh missed Smoluchowski's 1906 paper on the motion of colloidal
particles, in which Smoluchowski introduced the random flight idea.
A one-dimensional walk is Gaussian. A multidimensional walk is Rayleigh.
Brownian motion in one dimension is Gaussian. The vector sum of Brownian
movements in several dimensions is Rayleigh. Simple processes follow
Gaussian. A combination of several simple processes is Rayleigh.
Software development is due to the combined work of several people and
several processes. Even if the individual processes are Gaussian, the combined
result can be the skewed Rayleigh. This being the essential case, can we
expect team results to be normally distributed? Relentless pursuit of normality
in software engineering data is futile.
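A quick simulation can make the large-n result concrete. The sketch below is an illustration only, not part of the original box: it walks n equal steps in the plane with uniformly random turning angles, as in Pearson's statement, and compares the fraction of walks that end within a distance r of the start with 1 − exp(−r²/(n l²)), which is what the expression in Equation 19.1 integrates to from 0 to r. The function names, the seed, and the choices n = 50 and 4000 trials are assumptions made for the example.

```python
import math
import random

def walk_distance(n_steps, step_length=1.0):
    """Distance from the start after n_steps steps in random planar directions."""
    x = y = 0.0
    for _ in range(n_steps):
        angle = random.uniform(0.0, 2.0 * math.pi)
        x += step_length * math.cos(angle)
        y += step_length * math.sin(angle)
    return math.hypot(x, y)

def fraction_within(r, n_steps, trials, step_length=1.0):
    """Simulated fraction of walks that end within distance r of the start."""
    hits = sum(walk_distance(n_steps, step_length) <= r for _ in range(trials))
    return hits / trials

def rayleigh_cdf(r, n_steps, step_length=1.0):
    """Integral of Equation 19.1 from 0 to r (large-n approximation)."""
    return 1.0 - math.exp(-r * r / (n_steps * step_length ** 2))

random.seed(0)          # fixed seed so the run is repeatable
n = 50                  # hypothetical number of steps
r = math.sqrt(n)        # a distance of sqrt(n) step lengths
print(fraction_within(r, n, trials=4000), rayleigh_cdf(r, n))
```

The two printed numbers should be close to each other, and the agreement generally improves as n grows.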
Weibull Distribution 301
applications, such as weather forecasting and fitting data of all
kinds. Among all statistical techniques it may be employed for
engineering analysis with smaller sample sizes than any other
method.
Robert B. Abernethy
Weibull Curves
We will start with Weibull plots with three standard values for the shape parameter:
1, 2, and 3. These three plots are shown in Figure 19.1. We have chosen a scale
factor of 20 and have kept it the same for the three plots. The shapes of these three
curves resemble the shapes of the exponential distribution, the Rayleigh distribution,
and the normal distribution, respectively. The three figures represent a family
of curves known as the Weibull family.
The curves shown are possible models for service time in a software maintenance
project with a median value of 20 days and hypothetically equating the scale
parameter to the median value.
The Weibull equation, which has been used in creating the curves, is given as
follows:
$$W(x) = \frac{\alpha}{\beta}\left(\frac{x}{\beta}\right)^{\alpha-1} e^{-\left(x/\beta\right)^{\alpha}}, \qquad x > 0 \tag{19.2}$$
where α is the shape parameter and β is the scale parameter.
The value of the shape factor is not limited to the three integers shown in Figure
19.1. It can be any positive number and can be used to generate an infinite number
of Weibull curves.
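As an illustration, Equation 19.2 can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function name `weibull_pdf` and the sample points are assumptions made for the example, while the shapes 1, 2, and 3 and the scale of 20 are the values used in Figure 19.1.

```python
import math

def weibull_pdf(x, shape, scale):
    """Two-parameter Weibull density of Equation 19.2 (alpha = shape, beta = scale)."""
    if x <= 0:
        return 0.0  # the density is defined for x > 0 only
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

# Evaluate the three curves of the Weibull family at a few service times.
for shape in (1, 2, 3):
    values = [weibull_pdf(x, shape, 20.0) for x in (10, 20, 40)]
    print(shape, [round(v, 4) for v in values])
```

Any positive shape value, including non-integers such as 0.5 or 1.7, can be passed to the same function.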
Parameter Extraction
Rules of Thumb
The shape factor can be judged by looking at data histograms. To start the iteration
of curve fitting, it is good to begin with the nearest of the three shapes 1, 2, and 3
and then converge using the least squares error method. The scale is substantially
influenced by the median, as shown in the following rule of thumb:
$$\text{Scale} = C_{1} \times \text{median}, \qquad C_{1} = 0.09 \times (\text{shape factor}) + 0.636$$
Figure 19.1 Family of Weibull curves. Each panel plots probability against service time (h) with β = 20: the curve with α = 1 resembles the exponential distribution, the curve with α = 2 resembles the Rayleigh distribution, and the curve with α = 3 resembles the normal distribution.
It is a good idea to fit the Weibull by equating the median of the data to the median
value of the distribution. The Weibull, being a skewed curve, is best represented by
its median. This formula is simple.
Moments Method
The scale factor can be estimated by the method of moments. The mean value of the
distribution is equated to the mean value of the data. The variance of the distribution
is equated to the variance of the data.
The mean of the distribution is given as follows:
$$\mu = \beta\, \Gamma\!\left(1 + \frac{1}{\alpha}\right) \tag{19.3}$$
The variance of the distribution is given as follows:
$$\sigma^{2} = \beta^{2}\left[\Gamma\!\left(1 + \frac{2}{\alpha}\right) - \Gamma^{2}\!\left(1 + \frac{1}{\alpha}\right)\right] \tag{19.4}$$
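One possible way to carry out the moments method numerically is sketched below. Dividing Equation 19.4 by the square of Equation 19.3 removes β, leaving the squared coefficient of variation as a function of α alone. That function decreases as α grows, so α can be found by bisection and β then follows from Equation 19.3. The function names, the search interval from 0.1 to 100, and the iteration count are assumptions of this sketch, not part of the original text.

```python
import math

def weibull_cv_squared(shape):
    """Variance / mean**2 of a Weibull, from Equations 19.3 and 19.4 (beta cancels)."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return g2 / (g1 * g1) - 1.0

def fit_weibull_moments(mean, variance, lo=0.1, hi=100.0, iterations=60):
    """Return (shape, scale) matching the given mean and variance."""
    target = variance / (mean * mean)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if weibull_cv_squared(mid) > target:
            lo = mid  # spread too large at this shape, so the shape must be larger
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = mean / math.gamma(1.0 + 1.0 / shape)  # invert Equation 19.3
    return shape, scale

# Hypothetical check: moments computed for shape 2 and scale 20 should be recovered.
mean = 20.0 * math.gamma(1.5)
variance = 400.0 * (math.gamma(2.0) - math.gamma(1.5) ** 2)
print(fit_weibull_moments(mean, variance))
```

In practice the sample mean and sample variance of the data would be passed in place of the synthetic moments used here.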
MLE
A commonly accepted approach to the general problem of parameter estimation is
based on the principle of maximum likelihood estimation (MLE). Moments-based
estimators have been popular because of their ease of calculation, but MLEs enjoy
more properties desirable for estimators. For a first-order judgment, the
moments-based approach is good enough.
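For readers who want to try MLE on complete (uncensored) data, a minimal sketch follows. It uses the standard likelihood equation for the Weibull shape, Σ xᵅ ln x / Σ xᵅ − 1/α − mean(ln x) = 0, solves it by bisection, and then obtains the scale as β = (Σ xᵅ / n)^(1/α). The function name, the search interval, and the synthetic sample are assumptions of this example. Note that Python's `random.weibullvariate(alpha, beta)` takes the scale first and the shape second, the reverse of the α, β convention in Equation 19.2.

```python
import math
import random

def fit_weibull_mle(data, lo=0.01, hi=100.0, iterations=60):
    """Return (shape, scale) maximizing the Weibull likelihood of positive data."""
    if min(data) <= 0:
        raise ValueError("all observations must be positive")
    m = max(data)
    y = [x / m for x in data]          # rescale so powers cannot overflow
    logs = [math.log(v) for v in y]
    mean_log = sum(logs) / len(y)

    def score(shape):
        # Likelihood equation for the shape; it increases with the shape.
        powers = [v ** shape for v in y]
        weighted = sum(p * l for p, l in zip(powers, logs))
        return weighted / sum(powers) - 1.0 / shape - mean_log

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = m * (sum(v ** shape for v in y) / len(y)) ** (1.0 / shape)
    return shape, scale

# Synthetic sample with scale 20 and shape 2 (hypothetical data for illustration).
random.seed(1)
sample = [random.weibullvariate(20.0, 2.0) for _ in range(2000)]
print(fit_weibull_mle(sample))
```

The estimates should land near the generating values, with the usual sampling error for a sample of this size.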
Parameters for Machine Availability Modeling
Nurmi and Brevik [1] studied the problem of machine availability in enterprise-area
and wide-area distributed computing settings using the Weibull distribution. In one of
the models, they fit data to a Weibull with a shape factor of 0.49 and a scale factor
of 2403. Note that the shape factor is less than 1, making the curve sharper than the
exponential function. The scale value is large, on par with the median machine availability value.
Box 19.2 Weibull, the Scientist
Ernst Hjalmar Waloddi Weibull (1887–1979) was a Swedish engineer, scien-
tist, and mathematician. In 1914, while on expeditions to the Mediterranean,
the Caribbean, and the Pacific Ocean on the research ship Albatross, Weibull
wrote his first paper on the propagation of explosive waves. He developed the