190 ◾ Simple Statistical Methods for Software Engineering
Using the above Poisson distribution, Russian mathematician von Bortkiewicz
predicted that “over the 200 years observed 109 years would be with zero deaths.”
It turned out that 109 is exactly the number of years in which the Prussian data
recorded no deaths from horse kicks. The match between expected and actual
values is not merely good; it is perfect.
Analysis of Module Defects Based on Poisson Distribution
Before release, software defects are triggered by tests according to the Poisson
distribution. Defect counts per module in User Acceptance Tests are an example
of rare events. If the average is 0.3 defects per module and there are 100
modules in a release, the defects are distributed across the modules according to
the Poisson distribution. The modules are unlikely to have equal numbers of
defects. A few may have more, and the count tapers off among the remaining
modules. The distribution follows Equation 12.5. The plot of the Poisson
distribution is shown in Figure 12.8.
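As a rough illustration of the example above, the following sketch evaluates the standard Poisson probability mass function, P(k) = e^(-λ) λ^k / k!, which is assumed here to be the form of Equation 12.5, for a mean of 0.3 defects per module and 100 modules. The function and variable names are illustrative and do not come from the text.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the mean count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


MEAN_DEFECTS_PER_MODULE = 0.3  # from the example in the text
MODULE_COUNT = 100             # from the example in the text

# Expected number of modules with exactly k defects, for k = 0..4
expected_modules = {
    k: MODULE_COUNT * poisson_pmf(k, MEAN_DEFECTS_PER_MODULE)
    for k in range(5)
}

for k, n in expected_modules.items():
    print(f"{k} defects: about {n:.1f} modules")
```

Under these assumptions, roughly 74 of the 100 modules are expected to be free of defects, about 22 to have one defect, and only about 4 to have two or more, which is the tapering pattern described above.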
The mean of the distribution is also known as the rate parameter. The mean is
the only parameter of the equation. The variance of the distribution is equal to
the mean. Hence, the statistical limits are given by simple formulas:
$\mathrm{UCL} = \lambda + 3\sqrt{\lambda}$ (12.7)
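A minimal sketch of this limit, reading Equation 12.7 as UCL = λ + 3√λ (the mean plus three standard deviations, where the standard deviation is the square root of the mean). The function name `poisson_ucl` is hypothetical, and the value 0.3 is taken from the module example above.

```python
import math


def poisson_ucl(lam: float) -> float:
    """Upper control limit: mean plus three standard deviations."""
    # For a Poisson distribution the variance equals the mean,
    # so the standard deviation is sqrt(lam).
    return lam + 3.0 * math.sqrt(lam)


ucl = poisson_ucl(0.3)  # 0.3 defects per module, as in the example above
print(f"UCL = {ucl:.2f} defects per module")
```

With a mean of 0.3, the limit works out to about 1.94, so under this reading a module with two or more defects would sit above the upper limit.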
Box 12.4 Analogy: Bad Apples
A truck delivering apples unloads at a warehouse. Most cartons have apples in
good condition, but some apples are damaged. Typically, “damaged apples”
is a rare event; only cartons in some part of the truck might be damaged. The
occurrence of damaged apples is a Poisson process in which the defects are
distributed in the spatial domain. The number of bad apples per unit volume is
the Poisson parameter.
Likewise, a software product is shipped to the customer. When usage
begins, some part of the product is found to have defects. Such defects are rare
events. Across the code structure, defects are spatially distributed. However,
software usage and defect discovery are rare events in the temporal domain.
Hence, people use the term defect arrival rate. The number of defects arriving
in unit time (e.g., a week) can be measured from defect counts over time.
The count of defects arriving per unit time follows the Poisson distribution,
with the arrival rate as its mean.
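The defect arrival rate described in the box can be estimated from counts per unit time. The sketch below uses a set of weekly counts invented purely for illustration (they are not data from the text), estimates the rate as the mean count per week, and computes the ratio of sample variance to mean, which should be close to 1 if the counts behave like a Poisson process.

```python
from statistics import mean, variance

# Hypothetical weekly defect counts, invented purely for illustration
weekly_defects = [2, 0, 1, 3, 1, 0, 2, 1, 0, 2]


def arrival_rate(counts):
    """Estimate the defect arrival rate as the mean count per unit time."""
    return mean(counts)


def dispersion_index(counts):
    """Sample variance divided by mean; near 1 for Poisson-like counts."""
    return variance(counts) / mean(counts)


rate = arrival_rate(weekly_defects)
dispersion = dispersion_index(weekly_defects)

print(f"Estimated arrival rate: {rate:.2f} defects per week")
print(f"Variance-to-mean ratio: {dispersion:.2f}")
```

For these made-up counts the estimated rate is 1.2 defects per week and the ratio is about 0.89. A ratio far from 1 would suggest that the simple Poisson model does not describe the arrivals well.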
Tests prior to release also discover defects in a similar manner. Defects “arrive”
according to the Poisson distribution, in a broad sense. Change requests follow
suit. Each development project has unique styles of managing defect discovery;
accordingly, the Poisson distribution varies in structure and departs from the
simple classic Poisson equation. There are several variants of the Poisson
distribution to accommodate the different styles in defect management.