Truth and Central Tendency 31
2. Geometric mean: central tendency for skewed positive data
3. Harmonic mean: central tendency for skewed positive data
In the second category, we obtain participation from only strategically selected
data points. We have seen seven such measures.
Category 2
1. Mode: a better indicator of central tendency in human performance
2. Median: a better indicator of central tendency in nonnormal data
3. Trimmed mean: straight removal of extreme values
4. Winsorized mean: robust calculation
5. Trimean: a weighted average of quartiles
6. Midhinge: an average of the first and the third quartiles
7. Midrange: average of lowest and largest values
We can estimate a pertinent set of means before judging the central tendency. The
choice would depend on the skew, the presence of outliers, and the degree of protection
we need from outliers. Such a choice would make the analysis robust and safe.
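The measures listed above can be computed side by side before choosing one. The following is a minimal sketch, not the authors' procedure: the function names, the quartile convention (linear interpolation), the trimming depth of one value per end, and the sample data are all assumptions made for illustration.

```python
import statistics


def quantile(sorted_data, q):
    """Linear-interpolation quantile on an already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_data) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_data) - 1)
    fraction = position - lower
    return sorted_data[lower] + fraction * (sorted_data[upper] - sorted_data[lower])


def central_tendency_profile(data, cut=1):
    """Return several central tendency measures for positive numeric data.

    `cut` (an assumed parameter) is the number of values trimmed, or
    clamped for the Winsorized mean, at each end of the sorted data.
    """
    ordered = sorted(data)
    n = len(ordered)
    if n <= 2 * cut:
        raise ValueError("not enough data for the requested cut")

    q1 = quantile(ordered, 0.25)
    q2 = quantile(ordered, 0.50)
    q3 = quantile(ordered, 0.75)

    # Trimmed mean drops the extremes; Winsorized mean replaces them
    # with the nearest retained value.
    trimmed = ordered[cut:n - cut]
    winsorized = [ordered[cut]] * cut + trimmed + [ordered[n - cut - 1]] * cut

    return {
        "arithmetic_mean": statistics.mean(ordered),
        "geometric_mean": statistics.geometric_mean(ordered),
        "harmonic_mean": statistics.harmonic_mean(ordered),
        "mode": statistics.mode(ordered),
        "median": q2,
        "trimmed_mean": statistics.mean(trimmed),
        "winsorized_mean": statistics.mean(winsorized),
        "trimean": (q1 + 2 * q2 + q3) / 4,
        "midhinge": (q1 + q3) / 2,
        "midrange": (ordered[0] + ordered[-1]) / 2,
    }


# Hypothetical skewed sample with one outlier (not data from the text).
sample = [2, 3, 3, 4, 5, 6, 40]
for name, value in central_tendency_profile(sample).items():
    print(f"{name}: {value:.3f}")
```

With an outlier present, the arithmetic mean and the midrange are pulled upward, while the median, trimean, and trimmed mean stay close to the bulk of the data. Other quartile conventions would give slightly different trimean and midhinge values.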
Truth
Truth, expressed as central tendency, has many variants. We can narrow down our
options depending on the type of data and on what we wish to do with
the finding. The message is not in the mean, nor in the median. The message is to
be seen in the many expressions. To the mathematically inclined, geometric mean
and harmonic mean are alternatives to the arithmetic mean. The differences must
be reconciled with practical reasoning.
Statistical judgment is never the ultimate end.
Further reasoning alone can discover truth.
Statistical calculations need not be the ultimate truth. At best, they can guide
us toward truth. The moment of truth occurs only with reasoning.
For empirical researchers, a series of trimmed means offers alternatives
to the median.
The impact of multiple definitions of central tendency is rather heavy when
evaluating shifts in process means. It is safer to work out all the definitions, obtain
multiple numbers for central tendency, and treat them as a small universe of values.
We will have to compare one universe of central tendency values with
another. We can no longer pitch one mean against another or engage in
such misleading exercises.
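Comparing one universe of values with another can be sketched as follows. This is only an illustration under assumptions: the four measures chosen, the function names, and the before/after samples are hypothetical and do not come from the text.

```python
import statistics


def profile(data):
    """A small 'universe' of central tendency values (needs at least 3 points)."""
    ordered = sorted(data)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        # Trimmed mean with one value removed from each end.
        "trimmed_mean": statistics.mean(ordered[1:-1]),
        "midrange": (ordered[0] + ordered[-1]) / 2,
    }


def compare_profiles(before, after):
    """Shift of each measure (after minus before)."""
    p_before, p_after = profile(before), profile(after)
    return {name: p_after[name] - p_before[name] for name in p_before}


# Hypothetical process data before and after a change.
before = [10, 12, 13, 15, 50]
after = [11, 12, 12, 14, 21]
for name, shift in compare_profiles(before, after).items():
    print(f"{name}: {shift:+.2f}")
```

In this made-up case the mean and the midrange report a large shift because one outlier disappeared, while the median and the trimmed mean report a small one. Judging the process by a single measure would hide that disagreement.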
32 Simple Statistical Methods for Software Engineering
Application Notes
Managing Software Projects Using Central Tendency Values
After collecting all the data, software projects are more commonly managed with
values of central tendency. Managers prefer to make decisions with summary truths.
Goal tracking is done using mean values while risk management is done
using variances.
Weekly and monthly reports make liberal use of mean values of data collected.
Performance dashboards make wide use of mean values. Means are compared to
evaluate performance changes.
Making Predictions
Basic forecasts address mean values. Most prediction models present mean values.
In forecasting business volumes and resource requirements, central tendencies are
predicted and used as a rule. The prediction of variance is performed as a special
case to estimate certainty and risk.
Box 2.3 A Golden Rule to Obtain Truth
Estimating a software project using the "expert judgment" method is knowledge
driven. However, selective memory could taint human judgment because knowledge
is embedded in the human mind. A golden rule to extract truth from expert
judgment is to make the expert recall extreme values as well as the central value
from previous experience and use a weighted average using the 1:4:1 ratio. This
estimate is respected as a golden estimate, and the rule is hailed as the golden rule.
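The 1:4:1 rule can be written as a short function. This is a minimal sketch: the function name and the sample values are hypothetical, and the weights follow the box, with 1 for each extreme value and 4 for the central value.

```python
def golden_estimate(optimistic, most_likely, pessimistic):
    """Weighted average of three recalled values using 1:4:1 weights."""
    return (optimistic + 4 * most_likely + pessimistic) / 6


# Hypothetical recalled efforts in person-days (not data from the text).
print(golden_estimate(10, 14, 30))  # (10 + 56 + 30) / 6 = 16.0
```

The estimate (16.0) sits above the central value (14) because the pessimistic extreme lies farther from the center than the optimistic one. In this way the extremes the expert recalls adjust the central value instead of being ignored.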
Box 2.4 Aligning the Mean
Aligning the mean of results with the target is a great capability. Process alignment
with the target is measured by the distance between the mean and the target. The
smaller the distance, the greater the alignment. Aligned processes synchronize with goals,
harmonize work flow, and multiply benefits. The mean of results is particularly
important in this context. The quality guru Taguchi mentions that the loss to society
is proportional to the square of the drift of the process mean from the target, that is,
Loss = (target − mean)^2
Drift favorable to the consumer creates loss to the supplier; drift in the
opposite direction creates loss to the consumer. Either way, drift causes loss
to someone in society.
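The quadratic loss in this box can be sketched numerically. The function name, the proportionality constant `k`, and the sample numbers are assumptions for illustration. With `k = 1` the function reproduces the formula as printed in the box.

```python
def quadratic_loss(mean, target, k=1.0):
    """Loss proportional to the squared drift of the mean from the target.

    `k` is an assumed proportionality constant; the box does not give one.
    """
    return k * (target - mean) ** 2


target = 40  # hypothetical target value
for process_mean in (35, 40, 45, 50):
    print(process_mean, quadratic_loss(process_mean, target))
```

Drift of 5 in either direction gives the same loss (25.0), which matches the remark that drift causes loss whichever way it goes. Doubling the drift to 10 quadruples the loss to 100.0.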
Case Study: Shifting the Mean
Performance is often measured by the mean. This is true even in the case of engineering
performance. Code complexity is an engineering challenge. Left to themselves,
programmers tend to write complex code, a phenomenon known as software
entropy. Code complexity is measured by the McCabe number. Shifting the mean
value of the McCabe number in code requires drive from leaders and motivation
from programmers. To make a shift in the mean complexity is a breakthrough in
software engineering. Lower complexity results in modules that are testable and
reliable. Nevertheless, achieving lower complexity requires innovation in software
structure design and programming approaches.
This case study is about a software development project that faced this challenge
and overcame it by setting a visible goal on mean complexity. The current state is
defined by the mean McCabe number, and the goal state is defined by the desired
McCabe number. The testing team suggested an upper limit of 70, beyond which
code becomes difficult to comprehend in terms of test paths; test coverage also suffers.
Data for the current state show huge variation in complexity, from 60 to 123 and
occasionally even 150. The project manager has two thoughts: fix an upper limit on the
complexity of individual objects, or fix an upper limit on the mean complexity number for
a set of objects that define a delivery package. Although these two options look similar,
the practical implications are hugely different. Under the first option, setting an
upper limit on individual objects, the limit contemplated by testers is 70. The second
option is about setting a limit on the central tendency; this really is setting an optimum
target for software development. The number chosen for this target is 40. This is a
stretch goal, making the intention of the project manager very clear. Figure 2.3 shows
the chart used by the project team to deploy this stretch goal for shifting the mean.
In reality, the team is gradually moving toward the targeted mean value. The
direction of shift in the mean is very satisfying.
[Figure: chart on the McCabe number axis showing the current state (current mean 78), the goal state (target mean 40), and the desired breakthrough between them.]
Figure 2.3 Reducing the code complexity.
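The difference between the two options the project manager considers, a limit on individual objects versus a target for the package mean, can be checked with a short sketch. The function name, the result keys, and both sample packages are hypothetical; only the limit of 70 and the target of 40 come from the case study.

```python
from statistics import mean


def check_package(complexities, individual_limit=70, mean_target=40):
    """Evaluate a delivery package against both complexity policies."""
    over_limit = [c for c in complexities if c > individual_limit]
    package_mean = mean(complexities)
    return {
        "over_individual_limit": over_limit,
        "package_mean": package_mean,
        "meets_individual_limit": not over_limit,
        "meets_mean_target": package_mean <= mean_target,
    }


# Hypothetical McCabe numbers for two delivery packages.
package_a = [30, 45, 68, 66, 69, 70]  # every object within 70, mean is 58
package_b = [20, 25, 30, 35, 90]      # mean is 40, one object above 70

print(check_package(package_a))
print(check_package(package_b))
```

Package A passes the individual limit yet misses the mean target, while package B meets the mean target yet contains one object above the limit. The two policies constrain different things, which is why their practical implications differ.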
Review Questions
1. What are the strengths and weaknesses of arithmetic mean?
2. What is the most significant purpose of using trimmed means?
3. Why is median considered more robust?
4. How will you find central tendency in ordinal data?
5. What are the uses of weighted average?
Exercises
1. Customer satisfaction data in a software development project are obtained on
a Likert scale ranging from 1 to 5. The 1-year customer satisfaction scores are
given below. Find the central tendency in the data (data: 2, 4, 3, 5, 1, 3, 3, 4,
5, 3, 2, 1).
2. A test project duration is estimated by an expert who has made the following
judgments:
Optimistic duration: 45 days
Pessimistic duration: 65 days
Most likely duration: 50 days
What do you think is the final and fair estimate of the test project duration?
Suggested Reading
Aczel, A. D. and J. Sounderpandian, Complete Business Statistics, McGraw-Hill, London,
2008.