The full range, 97.13, is uncomplicated. The other two ranges have been trimmed.
The 3rd percentile range (obtained by cutting off 3% of data on either side) is 87.139.
The IQR is 15.001. Choosing the trimming rule has an inherent difficulty: the choice
can be arbitrary unless we exercise caution. Trimming the range is a practical
requirement because we do not want extreme values to misrepresent the process. The
untrimmed range is so large that it is impractical. There could be data outliers; one
would suspect wrong entry, or wrong computation of the percentage of design effort.
It is obvious that the very high values, such as 100% design effort, are impractical.
Because the data have not been validated by the team, we can cautiously trim and
“clean” the data. The 3rd percentile range seems to be clean, but it still admits
a high value of 91.616%, again an impractical value. Perhaps we can tighten the
trimming rule, say a cutoff of 20% on either end of data. If we want such tight trimmings,
we might as well use the IQR, which has trimmed off 25% of data on either end.
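The three ranges discussed above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce the figures in the text: the data values are hypothetical, the function names `percentile` and `trimmed_range` are our own, and we assume linear interpolation between ordered values when computing a percentile (other conventions give slightly different results).

```python
def percentile(sorted_values, p):
    """Return the p-th percentile (0-100) of already sorted data.

    Assumption: linear interpolation between neighboring ordered values.
    """
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = (len(sorted_values) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def trimmed_range(values, cut_percent):
    """Range that remains after cutting cut_percent of the data from each end."""
    ordered = sorted(values)
    return percentile(ordered, 100 - cut_percent) - percentile(ordered, cut_percent)


# Hypothetical design-effort percentages (not the data set used in the text).
design_effort = [2.0, 8.0, 10.0, 12.0, 14.0, 15.0, 18.0, 20.0, 25.0, 100.0]

full_range = trimmed_range(design_effort, 0)         # no trimming
percentile_range = trimmed_range(design_effort, 3)   # 3% cut on either side
iqr = trimmed_range(design_effort, 25)               # 25% cut on either side

print(full_range, percentile_range, iqr)
```

With a single suspicious entry of 100% in the sample, the full range is dominated by that one value, while the IQR is barely affected by it. Changing `cut_percent` shows how sensitive the result is to the trimming rule.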
Example 3.2: Analyzing Effort Variance Data from Two Processes
Let us take a look at effort variance data from two types of projects, following two
different types of estimation processes. The data are shown in Data 3.2.
A summary of range analysis is shown as follows:

                    Estimation A    Estimation B
Full range          86.482          63.640
Percentile range    69.466          45.181
IQR                 19.995          8.075
Each of the three ranges confirms that the second set of data, from projects
using a different estimation model, has less dispersion.
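To make the comparison concrete, we can express each range of Estimation B as a fraction of the corresponding range of Estimation A. The short sketch below uses only the summary values reported for this example; the variable names are our own.

```python
# Range values as reported in the summary for Example 3.2.
ranges_a = {"Full range": 86.482, "Percentile range": 69.466, "IQR": 19.995}
ranges_b = {"Full range": 63.640, "Percentile range": 45.181, "IQR": 8.075}

# Ratio of B to A for each measure; a ratio below 1 means B is less dispersed.
ratios = {name: ranges_b[name] / ranges_a[name] for name in ranges_a}

for name, ratio in ratios.items():
    print(f"{name}: B is {ratio:.0%} of A")

b_is_tighter = all(ratio < 1 for ratio in ratios.values())
print("B less dispersed on every measure:", b_is_tighter)
```

The gap widens as the trimming tightens: the IQR of Estimation B is well under half that of Estimation A, whereas the full ranges are much closer to each other.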
In both of the previously mentioned examples, we have studied dispersion from
three different angles. We have not looked into the messages derived from the extreme
values. Extreme values are used elsewhere in risk management and hazard analysis.
Range calculations approach dispersion from the extreme values (the ends)
of the data. These calculations do not use the central tendencies. In fact, they are
independent of central tendencies.
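One way to see this independence is to shift every value by the same constant: the mean moves, but the range stays where it was. The following is a small sketch with hypothetical effort variance figures; `full_range` is our own helper name.

```python
from statistics import mean


def full_range(values):
    """Distance between the largest and smallest value."""
    return max(values) - min(values)


# Hypothetical effort variance figures (illustrative only).
sample = [4.0, 7.5, 9.0, 12.0, 21.5]

# Shift every value by the same amount; the center moves, the spread does not.
shifted = [x + 10.0 for x in sample]

print("mean: ", mean(sample), "->", mean(shifted))
print("range:", full_range(sample), "->", full_range(shifted))
```

The same holds for the percentile range and the IQR, since every percentile moves by the same amount as the data.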
Next, we look at expressions of dispersion that take the central tendency
of the data into account.
BOX 3.1 CROSSING A RIVER
There was a man who believed in averages. He had to cross a river but did
not know how to swim. He obtained statistics about the depth and figured
out that the average depth was just 4 feet. This information comforted him
because he was 6 feet tall and thought he could cross the river. Midway in the
river, he encountered a 9-foot-deep pit and never came out. This story is often
cited to caution about averages. This story also reminds us that we should
register the extreme values in data for survival.