There are a great number of other distributions. This book teaches only a few. The
important thing for the starting learner is to understand the basic concept of distributions,
and to know that they affect the types of statistics that can be done. Some common
examples include:
- The lognormal distribution (see Figure 7.10, Histogram of enquiries from the univariate module) is similar to the normal distribution except that most of the data clusters around
a relatively low median, and there are a few very large observations that make the
right tail of the distribution stretch out. This is important in business because
many finance and other business variables have this distribution. Examples of typically
lognormal variables include firm sizes, employee tenure, many financial variables
(e.g. stock prices), and countless more. If your data are lognormal in shape like
that in Figure 7.10, then you must take this into account in your approach to the statistics. Of course,
this distribution also tells you that you have big outliers and cannot necessarily trust the
average as the measure of the centre of the data.
- Bimodal distributions. Some distributions have two peaks of frequently found data. These are called bimodal
distributions. Take the example of Figure 7.9 (Bimodal data with peaks at low and high scores), where a question was asked on a 5-point Likert-type scale. This allows for only five possible values
(corresponding to Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree). If
most people either strongly agree (a score of 1) or strongly disagree (a score of
5), and far fewer answer in the middle, then we get a bimodal distribution like that
in Figure 7.9.
- Many others. There are many other distributions with well-known properties that analysts
look for and find in data.
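To see why a lognormal shape makes the average an unreliable central measure, the short Python sketch below may help. It uses simulated data, not any data set from this book: the variable name `sizes` and the parameters 3.0 and 1.0 are assumptions chosen purely for illustration. It compares the mean and median of the raw values, then of their logarithms.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical data: 10,000 simulated "firm sizes" drawn from a
# lognormal distribution (parameters are illustrative assumptions).
sizes = [random.lognormvariate(3.0, 1.0) for _ in range(10_000)]

# A few very large values pull the mean well above the median.
mean_size = statistics.mean(sizes)
median_size = statistics.median(sizes)

# The logs of lognormal data are normally distributed, so the mean
# and median of the logged values should sit close together.
log_sizes = [math.log(s) for s in sizes]
mean_log = statistics.mean(log_sizes)
median_log = statistics.median(log_sizes)

print(f"raw:  mean={mean_size:.1f}  median={median_size:.1f}")
print(f"logs: mean={mean_log:.2f}  median={median_log:.2f}")
```

In a run of this kind the raw mean should sit noticeably above the raw median, while the two figures for the logged values should nearly coincide. This is one reason analysts often work with the logarithm of a lognormal variable.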
For the non-statistician, knowing all the distributions that statisticians look for
and study is unnecessary. However, if you are commissioning statistics from a research
firm or statistician you may want to see the distributions of your key variables first.
Make sure that if your important variables are strangely shaped, you address this
with the analyst and find out whether he or she is aware of it and is prepared, if
necessary, to act on it. If you are reading statistical reports, you may also find
references to distributions.
Computer programs can typically generate graphs that help you examine the shape of
data. Histograms, such as that of the variable enquiries in Figure 7.10 (Histogram of enquiries from the univariate module) below, are a good option. The histogram gives the number of customers in various "levels of
average enquiries" bands and, because you asked for it, a line showing the normal
distribution for comparison (this variable appears to be approximately normally distributed).
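If you want to see roughly what such a program does behind the scenes, the following Python sketch counts observations in bands and compares each count with what a normal distribution would predict. It is a minimal illustration only: the `enquiries` values are simulated stand-ins (not the book's data), and the band width of 10 is an arbitrary assumption.

```python
import random
import statistics
from statistics import NormalDist

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical stand-in for the "enquiries" variable: 500 simulated customers.
enquiries = [random.gauss(50, 10) for _ in range(500)]

# Normal distribution with the same mean and standard deviation as the sample.
fitted = NormalDist(statistics.mean(enquiries), statistics.stdev(enquiries))

band_width = 10  # assumed band width, for illustration only
lowest = int(min(enquiries) // band_width) * band_width
highest = int(max(enquiries) // band_width + 1) * band_width

rows = []
for start in range(lowest, highest, band_width):
    end = start + band_width
    # Observed: how many customers fall in this band.
    observed = sum(start <= x < end for x in enquiries)
    # Expected: how many the fitted normal curve predicts for this band.
    expected = len(enquiries) * (fitted.cdf(end) - fitted.cdf(start))
    rows.append((start, end, observed, expected))
    bar = "#" * (observed // 5)
    print(f"{start:>4}-{end:<4} {bar:<40} observed={observed:>3} expected={expected:6.1f}")
```

When the observed and expected counts track each other closely across the bands, the variable is plausibly close to normal. Large gaps in the right-hand bands would instead suggest a stretched tail of the lognormal kind described earlier.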