244 Simple Statistical Methods for Software Engineering
$$= b - \sqrt{\frac{(b - a)(b - c)}{2}} \quad \text{when } c \le \frac{a + b}{2} \tag{15.5}$$
The calculation is illustrated below by solving a problem.
Question
Given the following inputs,
Minimum % SLA compliance = 50
Maximum % SLA compliance = 100
The median is 80%
Build a triangular PDF for service-level agreement (SLA) compliance and find
where the peak occurs.
Answer
Let us rearrange the input information to suit our formulas.
a = 50
b = 100
Therefore, (a + b)/2 = 75
Median = 80
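Substituting all these in the median relationship (Equation 15.5), and taking the
branch for $c \ge (a + b)/2$ because the median (80) exceeds the midpoint (75),
$\text{Median} = a + \sqrt{(b - a)(c - a)/2}$, we have $80 = 50 + \sqrt{25(c - 50)}$, so that
$900 = 25(c - 50)$. We obtain
c = 86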
Thus, all three corners of the triangle are known. The model is plotted in
Figure 15.4.
[Figure 15.4 Triangular model of SLA compliance. Horizontal axis: x = % SLA
compliance, from 50 to 100, with the given median (80) and the solved peak marked.]
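The step of recovering the peak from a given median can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard two-branch median formula of the triangular distribution; the function names `triangular_median` and `mode_from_median` are illustrative and do not come from the text. The sketch solves for c and then recomputes the median as a cross-check.

```python
import math


def triangular_median(a: float, b: float, c: float) -> float:
    """Median of a triangular PDF with lower bound a, upper bound b, peak c."""
    if c >= (a + b) / 2:
        # Peak at or above the midpoint: the median lies on the rising side.
        return a + math.sqrt((b - a) * (c - a) / 2)
    # Peak below the midpoint: the median lies on the falling side.
    return b - math.sqrt((b - a) * (b - c) / 2)


def mode_from_median(a: float, b: float, median: float) -> float:
    """Invert the median formula to recover the peak c."""
    if median >= (a + b) / 2:
        # A median at or above the midpoint implies c >= (a + b) / 2.
        return a + 2 * (median - a) ** 2 / (b - a)
    return b - 2 * (b - median) ** 2 / (b - a)


# SLA compliance inputs from the question: a = 50, b = 100, median = 80.
c = mode_from_median(50, 100, 80)
print("peak c =", c)
print("median recomputed from c =", triangular_median(50, 100, c))
```

Recomputing the median from the solved peak is a cheap way to confirm that the correct branch of the formula was used.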
Law for Estimation 245
The triangular SLA compliance model is far superior to a plausible Gaussian
model. Typically, the Gaussian tail would exceed 100%, distort calculations, and
force us to take countermeasures such as a messy truncation. The triangular PDF is
compact and does not outstep empirical experience.
Other Statistics
The mode is obviously the peak, and hence, the following relationship is true:
Mode = c
Dispersion is strongly indicated by the base width, b − a. However, a proper
calculation of variance is according to the following equation:
$$\text{Variance} = \frac{a^2 + b^2 + c^2 - ab - ac - bc}{18} \tag{15.6}$$
An example is as follows:
Given
a = 0,
b = 10,
c = 5 (for a symmetrical triangular model),
We get
Variance = 4.17,
Standard deviation = 2.0412.
It may be noted that as c changes, the variance slightly changes.
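The variance formula of Equation 15.6 is easy to tabulate. The following is a minimal sketch, with `triangular_variance` as an illustrative name, that reproduces the symmetrical example and shows how the variance moves as c slides between the bounds a = 0 and b = 10.

```python
import math


def triangular_variance(a: float, b: float, c: float) -> float:
    """Variance of a triangular PDF, following Equation 15.6."""
    return (a * a + b * b + c * c - a * b - a * c - b * c) / 18


# Slide the peak c across the base while holding a = 0 and b = 10 fixed.
for c in (0, 2.5, 5, 7.5, 10):
    var = triangular_variance(0, 10, c)
    print(f"c = {c:>4}: variance = {var:.2f}, std dev = {math.sqrt(var):.4f}")
```

For this base, the variance is smallest at the symmetrical position c = 5 and largest when the peak sits on either bound.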
Skew
Although the process boundaries constitute a firm base, the apex c can be moved
from the left extreme to the right extreme, as shown in the three examples in
Figure 15.5.
The first example has its peak at the lower limit and gives a triangle skewed to
the right. The second example has its peak in the middle position between the
limits, providing symmetry. The peak in the third example coincides with the upper
limit, giving a negative skew. These three peaks demonstrate how the triangular
PDF can be made to be symmetrical or skewed. The peak can take an infinite
number of positions within these extremes.
The formula for skew is as follows:
$$\text{Skew} = \frac{\sqrt{2}\,(a + b - 2c)(2a - b - c)(a - 2b + c)}{5\,(a^2 + b^2 + c^2 - ab - ac - bc)^{3/2}} \tag{15.7}$$
Equation 15.7 is used to construct a relationship between skew and mode c for a
given a = 0 and b = 10. A graph of the relationship is plotted in Figure 15.6.
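The relationship between skew and mode can be tabulated directly from Equation 15.7. The sketch below is illustrative only (`triangular_skew` is an assumed name); it evaluates the skew for several positions of c with a = 0 and b = 10.

```python
import math


def triangular_skew(a: float, b: float, c: float) -> float:
    """Skew of a triangular PDF, following Equation 15.7."""
    numerator = math.sqrt(2) * (a + b - 2 * c) * (2 * a - b - c) * (a - 2 * b + c)
    denominator = 5 * (a * a + b * b + c * c - a * b - a * c - b * c) ** 1.5
    return numerator / denominator


# Move the mode c from the lower bound to the upper bound in steps of 2.
for c in range(0, 11, 2):
    print(f"c = {c:>2}: skew = {triangular_skew(0, 10, c):+.4f}")
```

The skew is positive when the peak sits near the lower bound, zero at the symmetrical position, and negative when the peak sits near the upper bound.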
We can generate a wide range of skews using the relationship. In software development
projects, the challenge arises in the form of skew. In Gaussian-dominated statistical
thinking, skew does not even exist. The triangular model provides a simple model
to represent skew. Hence, the inherent advantages of the triangular model are threefold:
[Figure 15.5 Triangular distribution: three examples. Each panel plots probability
against the variable, where a is the lower boundary of process, b is the upper
boundary of process, and c is the process mode. First panel (peak at a): a model of
process aligned to some convenient lower bound. Second panel (peak midway): a model
of a centered process. Third panel (peak at b): a model of a process stretching to
the maximum tolerance.]
[Figure 15.6 Relationship between skew and mode c, for a = 0 and b = 10.
Horizontal axis: c; vertical axis: skew.]
It shows a prominent central tendency.
It is capable of representing symmetry.
It is capable of showing skew, both left and right.
All these representations can be achieved with great agility.
Three-Point Schedule Estimation
Let us consider an example of schedule estimation by expert judgment for a
software component development:
Optimistic value = 25 days
Pessimistic value = 50 days
Most likely value = 30 days
Applying the triangular PDF, the expected value of schedule is as follows:
$$\text{Mean} = \frac{25 + 50 + 30}{3} = 35 \text{ days}$$
This may be compared with the conventional estimation technique using the
program evaluation and review technique (PERT) formula. The PERT formula will
place the estimate as follows:
$$\text{PERT} = \frac{t_o + 4t_m + t_p}{6} = \frac{25 + 4 \times 30 + 50}{6} = 32.5 \text{ days}$$
It is seen that the triangular PDF gives a safer and more conservative estimate.
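The two estimates can be compared side by side in code. This is a minimal sketch using the three expert values from the example; the function names `triangular_mean` and `pert_estimate` are illustrative.

```python
def triangular_mean(optimistic: float, most_likely: float, pessimistic: float) -> float:
    """Expected value of a triangular PDF: the plain average of the three points."""
    return (optimistic + most_likely + pessimistic) / 3


def pert_estimate(optimistic: float, most_likely: float, pessimistic: float) -> float:
    """PERT estimate: the most likely value carries a weight of 4 out of 6."""
    return (optimistic + 4 * most_likely + pessimistic) / 6


# Expert judgment for the software component, in days.
o, m, p = 25, 30, 50
print("Triangular mean:", triangular_mean(o, m, p), "days")
print("PERT estimate:  ", pert_estimate(o, m, p), "days")
```

Because PERT weights the most likely value more heavily, it pulls the estimate toward 30 days, whereas the triangular mean gives the pessimistic value equal weight.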
Beta Option
There has been interest in generalized triangles with curvature added. Wahed
published “The Family of Curvi-Triangular Distributions” [2]. Brizz [3] has
constructed two-faced triangles with one face a straight line and the second face
exponential. However, the classic beta distribution provides smoothly curved bounded
functions, and in the opinion of the authors, the good old beta distribution must
be exploited first before experimenting with curvilinear versions of the triangle. A
typical beta distribution model is shown in Figure 15.7. The problem of
productivity is revisited with beta.
For fixed upper and lower bounds 30 and 15, three curves have been drawn for
three sets of shape parameters. The beta distribution bounded between 0 and 1 is
defined as follows:
$$\text{Beta}(x, \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1} \tag{15.8}$$
where Γ represents the gamma function, and α and β are the shape parameters.
Figure 15.7 looks more appealing than triangles, but the equation intimidates
users. Beta distribution can offer an impressive array of bounded shapes. However,
while implementing bounded functions in real-life projects, beta distribution was
less acceptable despite its inherent power, and the “intuitive” triangular model was
considered.
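Equation 15.8 is less intimidating once written as code. The sketch below evaluates the beta density with the standard library gamma function and, as an assumption not spelled out in the text, rescales it linearly from the interval 0 to 1 onto the productivity bounds 15 and 30. The function names are illustrative, and shape parameters of at least 1 are assumed.

```python
import math


def beta_pdf(x: float, alpha: float, beta: float) -> float:
    """Beta density on [0, 1], following Equation 15.8 (alpha, beta >= 1 assumed)."""
    if not 0 <= x <= 1:
        return 0.0
    coefficient = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return coefficient * x ** (alpha - 1) * (1 - x) ** (beta - 1)


def scaled_beta_pdf(y: float, alpha: float, beta: float,
                    lower: float, upper: float) -> float:
    """Beta density stretched onto [lower, upper] by a linear change of variable."""
    width = upper - lower
    return beta_pdf((y - lower) / width, alpha, beta) / width


# Productivity in FP/person-month, bounded between 15 and 30.
for alpha in (2, 3, 4):
    values = [scaled_beta_pdf(y, alpha, 3, 15, 30) for y in (18, 22.5, 27)]
    print(f"alpha = {alpha}, beta = 3:", [round(v, 4) for v in values])
```

Dividing by the width keeps the area under the rescaled curve equal to 1, so the result remains a valid PDF on the new interval.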
Triangular Risk Estimation
As with any PDF, the triangle can be used for risk analysis. Figure 15.8 shows risk
measurement in the triangular way.
Risk computation based on triangular function should be taken with far more
seriousness and treated more urgently than risk measured based on tailed distribu-
tions. Tails in bell curves and other tailed distributions are mere extrapolations into
extremes, whereas the triangle means business. Risk measured by typical triangular
models should be specially treated because we do not generally anticipate risks in
the triangular side of the world. Measuring risk with the body of a probabilistic
distribution is very different from measuring risk with tails.
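One way to put a number on such a risk is to take the area of the triangle beyond a limit. The following is a sketch under stated assumptions: risk is taken here to mean the probability of exceeding a limit, the schedule estimate from the three-point example (25, 30, and 50 days) is reused, and the 40-day commitment is hypothetical. The function names are illustrative.

```python
def triangular_cdf(x: float, a: float, b: float, c: float) -> float:
    """Cumulative probability of a triangular PDF with bounds a, b and peak c."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= c:
        # Area under the rising side of the triangle.
        return (x - a) ** 2 / ((b - a) * (c - a))
    # One minus the area remaining under the falling side.
    return 1 - (b - x) ** 2 / ((b - a) * (b - c))


def risk_of_exceeding(limit: float, a: float, b: float, c: float) -> float:
    """Probability that the variable exceeds the given limit."""
    return 1 - triangular_cdf(limit, a, b, c)


# Hypothetical commitment of 40 days against the 25/30/50-day estimate.
print("Risk of exceeding 40 days:", risk_of_exceeding(40, 25, 50, 30))
```

Unlike a tail probability in a bell curve, this area sits inside the stated bounds of the process.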
[Figure 15.7 Beta distribution of productivity (FP/person-month). Horizontal axis:
FP/person-month, from 14 to 30; vertical axis from 0 to 0.12. Lower bound A = 15,
upper bound B = 30. Three curves are drawn: alpha = 2, beta = 3; alpha = 3,
beta = 3; and alpha = 4, beta = 3.]