Sometimes the data you are using
is measured in hard-to-interpret ways. For example, you may ask twenty
questions in a survey all designed to assess stress, and each question
may be measured on a 1-5 scale. You may then combine all these stress
answers for each person into an overall stress index. What does this
index really mean though? You can tell if a person is more or less
stressed by their position on the scale, but if I tell you that average
stress is a score of 67, what does this really mean? If someone improves
from 67 to 62 how good is this really? Such a scale, measured as it
is in arbitrary units, is hard to interpret directly.
In other cases, statistical
analysis gives us an estimate that, by its very nature,
is difficult to interpret directly.
In such situations, statistical
packages often give us standardized values for the size of a statistic.
Standardized (sometimes called normalized) statistics are very useful
here, because they translate something that may be hard
to interpret into a standard scale with well-understood properties
and cut-offs.
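
To make the idea concrete, here is a minimal sketch of one common form of
standardization, the z-score, which re-expresses each value as a number of
standard deviations from the mean. The text above does not name a specific
method, so the choice of z-scores is an assumption made for illustration.
The function name `z_scores` and the stress index values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Re-express each value as standard deviations from the mean.

    Uses the population standard deviation (divide by n); this is an
    illustrative choice, and a package may divide by n - 1 instead.
    """
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(value - center) / spread for value in values]


if __name__ == "__main__":
    # Hypothetical stress index scores in arbitrary units.
    stress_index = [52, 61, 67, 70, 85]
    for raw, z in zip(stress_index, z_scores(stress_index)):
        print(f"raw score {raw} -> standardized {z:+.2f}")
```

On the standardized scale, a score near zero is close to average for the
group, while a score of about +2 or -2 is unusually high or low, whatever the
original units were.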
For instance,
Figure 12.2, “Example of a standardized (normalized) statistic”, shows
a statistical output called the “Mardia measures of multivariate
kurtosis” (don’t worry about the terminology for now).
These measures assess the extent to which a set of variables is normally
distributed, not only individually but as a whole set.
As seen in
Figure 12.2,
the raw statistic assessing the normality of the variables is called
the “Mardia Multivariate Kurtosis”. Technically, this
Mardia score needs to be zero for perfect normality. But it is not:
in the example the score is 1.65. The question is whether
this is too far from zero. What should we conclude: does our statistic
tell us that the variables are not normally distributed because the score
is not exactly zero? Or is it close enough to zero to conclude that the
variables are normally distributed? Unfortunately we cannot really tell,
because the raw Mardia score has no recognizable cut-offs.
Instead, SAS provides
you with a “Normalized Multivariate Kurtosis” score,
which is standardized onto a scale that does have a known cut-off.
On this particular standardized scale, a value bigger than 3 is usually
considered large enough to suggest that the raw score may be far from zero.
Here we see that the Normalized Multivariate Kurtosis score is only .68,
which is smaller than 3, so we would probably conclude that our variables
are normally distributed. It is the provision of an extra standardized
statistic that enables us to reach this conclusion, because the original
raw Mardia statistic was hard to interpret.
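
For readers curious about how a raw score and its normalized version relate,
the following is a rough sketch using NumPy. It assumes the commonly cited
definitions: the raw score is the average fourth power of each observation's
Mahalanobis distance from the mean, minus p(p + 2) so that it is zero under
perfect normality, and the normalized score divides the raw score by the
square root of 8p(p + 2)/n, where p is the number of variables and n the
number of observations. The function name `mardia_kurtosis` is hypothetical,
and dividing the covariance matrix by n rather than n - 1 is also an
assumption, so the results may differ slightly from what your package prints.

```python
import numpy as np


def mardia_kurtosis(data):
    """Return (raw, normalized) Mardia multivariate kurtosis.

    data: array-like with one row per observation, one column per variable.
    Assumes the covariance matrix is estimated by dividing by n.
    """
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    centered = x - x.mean(axis=0)
    cov = centered.T @ centered / n
    cov_inv = np.linalg.inv(cov)
    # Squared Mahalanobis distance of every observation from the mean.
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    # Subtract p(p + 2) so that perfect normality corresponds to zero.
    raw = float(np.mean(d2 ** 2) - p * (p + 2))
    # Rescale by the approximate standard error under normality.
    normalized = float(raw / np.sqrt(8 * p * (p + 2) / n))
    return raw, normalized


if __name__ == "__main__":
    # Simulated data for illustration only; not the data behind Figure 12.2.
    rng = np.random.default_rng(0)
    sample = rng.normal(size=(500, 3))
    raw, normalized = mardia_kurtosis(sample)
    print(f"raw = {raw:.3f}, normalized = {normalized:.3f}")
```

The raw value depends on the number of variables and observations, which is
why it has no fixed cut-off, whereas the normalized value can be compared
against a rule of thumb such as the cut-off of 3 mentioned above.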
Note that this is only
one of many examples of standardizing a statistic
so that its relative size can be ascertained. As you will see in the
next two chapters, certain more advanced measures of association between
variables come with standardized values so that they can be read as
though they are correlations, because we understand correlations well.
You should always try
to use raw, unstandardized values for statistics if you are able to
assess them in their units, because it is the raw values that reflect
the actual impact of the statistic, as covered earlier in the
discussion of magnitude. Standardized values are very useful as extra
analyses and especially useful if you cannot use the raw statistics.
Each time you use a standardized statistic, you need to look up the
particular rules for what a “big” or “small”
standardized value is in the context of that statistic.