Once you have data, there are many ways you may want to analyze it. One
important reason for using numerical data is that it is easier to analyze and
supports a wider range of analyses. For example, you might want to:
- Analyze how many of each category of scores there are (e.g. how many 6s).
- Ask what the score of the average employee is.
- Ask how much spread there is in performance between the different employees.
- Compare relationships between different bits of data.
Consider, for example, the following set of ten scores:
2, 3, 5, 5, 6, 6, 6, 7, 8, 10
We can already see three things about the data:
- It runs from low to high, with a fair number of possible values.
- It has a center, that is, a point where the middle of the data is situated. By eye, you would probably
place the center at about 6.
- It has a spread; in other words, it is spread out on either side of the center.
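As a rough check on these visual impressions, the ten scores listed above can be summarized in a few lines of Python. This is only a minimal sketch using the standard library; the variable names are illustrative, and the choice of the mean, median, range and sample standard deviation as summaries is an assumption made here, not something the text has defined yet.

```python
from collections import Counter
import statistics

# The ten scores from the text.
scores = [2, 3, 5, 5, 6, 6, 6, 7, 8, 10]

# How many of each score there are (e.g. how many 6s).
counts = Counter(scores)

# Two common ways of locating the center.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)

# Two simple ways of describing the spread.
score_range = max(scores) - min(scores)
sample_sd = statistics.stdev(scores)  # sample standard deviation (n - 1 divisor)

print("Number of 6s:", counts[6])
print("Mean:", mean_score)
print("Median:", median_score)
print("Range:", score_range)
print("Sample standard deviation:", round(sample_sd, 2))
```

The mean works out to 5.8 and the median to 6, which agrees with placing the center at about 6 by eye.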
There is more we could ask. For instance, what kind of data is it, compared, say, to an appraisal
system in which employees are rated only as “Below Average,” “Acceptable”
or “Above Average”?
Numerical data therefore has several characteristics that we need to understand.
These characteristics, and how well we understand them, fundamentally affect both
our ability to analyze the data and our interpretation of the analysis.
As we have seen, each variable is measured across multiple observations, such as a
survey question measured across many employees. Because the observations can differ
in their responses, there is a range of measured responses to each variable. This
brings up three very important characteristics of variables:
- Type of variable: This depends on what is being measured, and fundamentally affects the types of data
analysis in which you can use the variable.
- Centrality: This is the most representative response on the variable across the whole sample.
- Spread: This represents the range of responses across the sample.
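To illustrate how the type of variable limits the analysis, the sketch below uses the three-level appraisal system mentioned earlier. The ratings themselves are hypothetical and invented purely for illustration, as are the names `LEVELS` and `ratings`. It assumes that the categories are ordered, so that a most common and a middle rating can be found, while an arithmetic mean of the labels is not defined.

```python
import statistics

# Ordered categories from the appraisal system described in the text.
LEVELS = ["Below Average", "Acceptable", "Above Average"]

# Hypothetical ratings for seven employees (invented for illustration only).
ratings = [
    "Acceptable", "Above Average", "Below Average", "Acceptable",
    "Acceptable", "Above Average", "Acceptable",
]

# Centrality: the most common rating can be found for any categorical data.
most_common = statistics.mode(ratings)

# Because the categories are ordered, a middle rating can also be found
# by converting each label to its position in LEVELS.
ranks = sorted(LEVELS.index(r) for r in ratings)
median_rating = LEVELS[statistics.median_low(ranks)]

# Spread: the lowest and highest ratings observed.
lowest, highest = LEVELS[ranks[0]], LEVELS[ranks[-1]]

# An arithmetic mean of text labels is not defined.
try:
    statistics.mean(ratings)
    mean_is_defined = True
except TypeError:
    mean_is_defined = False

print("Most common rating:", most_common)
print("Middle rating:", median_rating)
print("Ratings run from", lowest, "to", highest)
print("Mean defined for labels:", mean_is_defined)
```

With numerical scores, all of these summaries and the mean are available, which is one reason numerical data supports more kinds of analysis.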
Let us explore these three characteristics of data more thoroughly.