314 Simple Statistical Methods for Software Engineering
This change reflects changes in life cycle models and practices. The modern
projects follow new paradigms, self-designed and custom-tailored.
Weibull Model for Defect Prediction: Success Factors
The Weibull model, be it for cost prediction or for defect prediction, is holistic in nature
and can be applied early in the project. Weibull holistic models should be used by
leaders to predict the future and to manage projects. With a few data points and
indicators, one can foresee what lies ahead.
However, there is some reluctance in applying holistic models when a lot more
details have accumulated. People refuse to look at the larger picture. Weibull
distribution presents a larger picture. The predictive use of Weibull cost models that
resemble earned value graphs or Weibull defect curves that resemble reliability
growth graphs is easily forgotten.
Alagappan [13] presented three patterns of details that overstep the holistic
Rayleigh curve:
1. Fluctuating defect trend
2. Lower actual defect density
3. Effective defect management (early defect removal)
These details come after project closure. A Weibull curve can be constructed reliably
enough halfway through the project. This advantage is untapped by the industry.
Stoddard and Goldenson [14] mentioned the Rayleigh model as a process performance
model in an SEI Technical Report (2010). The report presents a successful
case study on a Rayleigh model from Lockheed Martin in which the authors of
the case study described,
The use of Rayleigh curve fitting to predict defect discovery (depicted
as defect densities by phase) across the life cycle and to predict latent or
escaping defects. Lockheed Martin's goal was to develop a model that
could help predict whether a program could achieve the targets set for
later phases using the results to date.
Research indicates that defects across phases tend to have a Rayleigh
curve pattern. Lockheed Martin verified the same phenomenon even
for incremental development approaches, and therefore decided to
model historical data against a Rayleigh curve.
Two parameters were chosen: the life-cycle defect density (LDD)
total across phases and the location of the peak of the curve (PL).
Outputs included planned defect densities by phase with a performance
range based on the uncertainty interval for LDD and PL and estimated
latent defect density.
Weibull Distribution 315
Inputs to the model during program execution included the actual
defect densities by phase. Outputs during program execution included
fitted values for phases to date, predicted values for future phases and
estimated LDD, PL, and latent defect density.
The presentation suggested that those who want to be successful
using defect data by phase for managing a program need to be believers
in the Rayleigh curve phenomenon for defect detection.
The belief in the Rayleigh curve, mentioned as an ingredient for success, is the
point we wish to highlight.
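The two-parameter fit described in the case study can be sketched roughly as follows. This is an illustration only, not Lockheed Martin's model. It assumes the standard Rayleigh cumulative form F(t) = LDD(1 - exp(-t^2 / (2 PL^2))), whose discovery rate peaks at t = PL, and it treats each phase as one unit of time. The function names, the grid search, and the defect densities are hypothetical and were chosen to keep the example free of external libraries.

```python
import math

def rayleigh_cumulative(t, ldd, pl):
    """Cumulative defect density found by time t; the discovery rate peaks at t = pl."""
    return ldd * (1.0 - math.exp(-(t * t) / (2.0 * pl * pl)))

def phase_densities(n_phases, ldd, pl):
    """Defect density expected in each phase, with phase i spanning (i-1, i]."""
    return [rayleigh_cumulative(i, ldd, pl) - rayleigh_cumulative(i - 1, ldd, pl)
            for i in range(1, n_phases + 1)]

def fit_rayleigh(actuals):
    """Least-squares estimate of (LDD, PL) from the phase densities seen so far.

    PL is found by a coarse grid search (0.5 to 10.0 phases in steps of 0.1).
    For each candidate PL the best LDD has a closed form, because the model
    is linear in LDD.
    """
    n = len(actuals)
    best = None
    for step in range(5, 101):
        pl = step / 10.0
        unit = phase_densities(n, 1.0, pl)          # curve for LDD = 1
        ldd = sum(a * u for a, u in zip(actuals, unit)) / sum(u * u for u in unit)
        sse = sum((a - ldd * u) ** 2 for a, u in zip(actuals, unit))
        if best is None or sse < best[0]:
            best = (sse, ldd, pl)
    return best[1], best[2]

# Hypothetical program with six phases, four of them completed. The "actuals"
# are generated from a known curve so that the fit can be checked.
observed = phase_densities(4, ldd=20.0, pl=2.5)
ldd_hat, pl_hat = fit_rayleigh(observed)

predicted = phase_densities(6, ldd_hat, pl_hat)[4:]          # phases 5 and 6
latent = ldd_hat - rayleigh_cumulative(6, ldd_hat, pl_hat)   # escapes after phase 6
print(f"LDD={ldd_hat:.2f}  PL={pl_hat:.1f}  latent={latent:.2f}")
```

With real data the fit will not be exact, and the uncertainty interval on LDD and PL mentioned in the case study would need a proper statistical treatment that this sketch does not attempt.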
Box 19.4 Views of Norden Who First Applied
Weibull to Software Engineering
Peter V. Norden has been a consultant with IBM’s Management Technologies
practice, specializing in the application of quantitative techniques to
management problems and project management systems. He was a member of
the team that developed IBM’s worldwide PROFS communication system,
which eventually became the Internet.
Norden [15] created history by applying the Weibull distribution to
software development. He was building quantitative models when he noticed,
It turned out, however, that time series and other models
built on these data had relatively poor predictive value. It
was only when we noticed that the manpower build-up
and phase-out patterns related to why the work was being
done (i.e., the purpose of the effort, such as requirements
planning, early design, detail design, prototyping, release
to production) that useful patterns began to emerge. The
shapes were related to problem-solving practices of
engineering groups and explained by Weibull distributions.
Subsequent researchers (notably Colonel L. H. Putnam, originally of
the U.S. Army Computer Systems Command) referred to them as Rayleigh
curves but were dealing with the same phenomenon.
The life cycle equation computes the level of effort (labor-hours, labor-months,
etc.; the scale is arbitrary) required in the next work period (day,
week, month, etc.) as a function of the time elapsed from the start of this
particular cycle, the total effort forecast for the cycle, and a scaleless “trashiness”
parameter that could represent the urgency of the job.
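The life cycle equation described here is usually written in the Norden-Rayleigh form y(t) = 2 K a t exp(-a t^2), where K is the total effort forecast for the cycle and a is the shape parameter. The sketch below assumes that form. The function names and the sample values of K and a are illustrative and are not taken from Norden's paper.

```python
import math

def effort_rate(t, total_effort, a):
    """Effort needed per work period at elapsed time t (assumed Norden-Rayleigh form)."""
    return 2.0 * total_effort * a * t * math.exp(-a * t * t)

def cumulative_effort(t, total_effort, a):
    """Effort spent from the start of the cycle up to time t."""
    return total_effort * (1.0 - math.exp(-a * t * t))

def peak_time(a):
    """Elapsed time at which the effort rate is highest."""
    return 1.0 / math.sqrt(2.0 * a)

# Hypothetical cycle: 400 labor-months in total, shape parameter 0.02 per month^2.
K, a = 400.0, 0.02
print(f"peak staffing at month {peak_time(a):.1f}")
for month in range(0, 13, 2):
    print(month,
          round(effort_rate(month, K, a), 1),
          round(cumulative_effort(month, K, a), 1))
```

A larger value of a moves the peak earlier and makes it sharper, which is one way to read the parameter as the urgency of the job.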
Review Questions
1. What settings will make a Weibull curve behave like a Rayleigh distribution?
2. What is the role played by the location parameter?
3. What is the formula connecting the scale shape and scale factor of the Weibull curve?
4. Who invented the Weibull distribution?
5. Who applied the Weibull distribution to software projects for the first time?
Exercises
1. The median value of a certain data set is 4.5. The data are suspected to have
standard Weibull distribution. Calculate the scale shape.
2. Plot a Weibull curve with a shape factor of 3 and a scale factor of 15,
making use of the Excel function WEIBULL.DIST. (Clue: set the cumulative
value = 0.)
3. In software review, defect discovery follows the Weibull model with a shape
of 2 and a scale of 15 days. Find the remaining defects in the code if review is
terminated on day 20.
4. Software productivity data (lines of code [LOC] per person day) is fitted to
Weibull with the following parameters: location = 30, shape = 3, and scale =
50. Find the probability that productivity will go above 70 LOC/person-day.
5. Fit Putnam’s software reliability model to BSPIN data (graphs in Figure 19.8)
and predict the percentage of postrelease defects.
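Readers who prefer to check their work in code rather than in Excel may find the following sketch useful. It writes out the three-parameter Weibull density, survival function, and median from their textbook definitions. The function names are illustrative. With the location left at 0, the density corresponds to WEIBULL.DIST(x, shape, scale, FALSE).

```python
import math

def weibull_pdf(x, shape, scale, location=0.0):
    """Weibull density; returned as 0 at or below the location (exact for shape > 1)."""
    z = (x - location) / scale
    if z <= 0.0:
        return 0.0
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))

def weibull_sf(x, shape, scale, location=0.0):
    """Survival function: the probability that the variable exceeds x."""
    z = max((x - location) / scale, 0.0)
    return math.exp(-(z ** shape))

def weibull_median(shape, scale, location=0.0):
    """Value of x at which the survival function equals 0.5."""
    return location + scale * math.log(2.0) ** (1.0 / shape)

# Points for a curve with shape 3 and scale 15, ready to plot.
curve = [(x, weibull_pdf(x, 3.0, 15.0)) for x in range(0, 41, 5)]
for x, density in curve:
    print(x, round(density, 4))

# Sanity check: half of the probability lies beyond the median.
m = weibull_median(3.0, 15.0)
print(round(weibull_sf(m, 3.0, 15.0), 3))
```

The survival function gives the fraction of the population still beyond x, which is the quantity the remaining-defects and productivity exercises ask for once the parameters are substituted.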
References
1. D. Nurmi and J. Brevik, Modeling Machine Availability in Enterprise and Wide-Area
Distributed Computing Environments, UCSB Computer Science Technical Report
Number CS2003-28.
2. F. Shi and J. Hu, The effect of test length on strength of cotton yarns, Research Journal
of Textile and Apparel, 5(1), 18–25, 2001.
3. F. Liu, C. Kashyap and C. J. Alper, A Delay Metric for RC Circuits Based on the
Weibull Distribution, DAC ‘98 Proceedings of the 35th Annual Design Automation
Conference, pp. 463–468, ACM New York, 1998.
4. H. D. Grissino-Mayer, Modeling fire interval data from the American southwest with
the Weibull distribution, International Journal of Wildland Fire, 9(1), 37–50, 1999.
5. R. Sakin and I. Ay, Statistical analysis of bending fatigue life data using Weibull
distribution in glass-fiber reinforced polyester composites, Materials and Design, 29,
1170–1181, 2008.
6. L. Robert, T. Bailey and R. Dell, Quantifying diameter distributions with the Weibull
function, Forest Science, 19(2), 97–104, 1973.
7. W. Weibull, A statistical distribution function of wide applicability, Journal of Applied
Mechanics, 18, 293–297, 1951.
8. M. A. Vouk, Using Reliability Models During Testing with Non-Operational Profiles,
North Carolina State University, 1992.
9. H. C. Joh, J. Kim and Y. K. Malaiya, Vulnerability discovery modeling using Weibull
distribution, 19th International Symposium on Software Reliability Engineering, IEEE,
2008.
10. A. T. Tai, L. Alkalai and S. N. Chau, On-Board Preventive Maintenance: A Design-
Oriented Analytic Study for Long-Life Applications, Jet Propulsion Laboratory, California
Institute of Technology.
11. L. H. Putnam and W. Myers, Familiar Metric Management—Reliability. Available at
http://www.qsm.com/fmm_03.pdf.
12. Larry Putnam's Interest in Software Estimating, Version 3, Copyright Quantitative
Software Management, Inc., 1996.
13. V. Alagappan, Leveraging defect prediction metrics in software program management,
International Journal of Computer Applications, 50(20), 23–26, 2012.
14. R. W. Stoddard II and D. R. Goldenson, Approaches to Process Performance Modeling:
A Summary from the SEI Series of Workshops on CMMI High Maturity Measurement and
Analysis, SEI Technical Report, CMU/SEI-2009-TR-021 ESC-TR-2009-021, January
2010.
15. P. V. Norden, Quantitative techniques in strategic alignment, IBM Systems Journal,
32(1), 180–197, 1993.