Why Sample?

Early in Chapter 2 we noted that we analyze data because we want to understand something about variation within a population or a process, and that more often than not we work with just a subset of the target population or process. Such a subset is known as a sample, and in this chapter we focus on the variability across possible samples from a single population.

In most cases it is impossible or impractical to gather data from every single individual within a population. In Chapter 2 we considered several sampling methods. At that time, we also presented the following sequence that describes the situation in many statistical studies:

  • We're interested in the variation within one or more attributes of a population.

  • We cannot gather data from every single individual within the population.

  • Hence, we choose some individuals from the population and gather data from them in order to generalize about the entire population.

We gather and analyze data from the sample (the group of individuals we chose) in place of doing so for the whole population. The particular individuals within the sample are not the group we're ultimately interested in knowing about. We really want to learn about the variability within the population, and to do so we learn about the variability within the sample. Before drawing conclusions, though, we also need to know something about the variability among the samples we might have taken from this population. While many potential samples probably do a very good job of representing the population, some might do a poor job. Variation among the many possible samples is apt to distort our impression of what's happening in the population itself.
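
To make the idea of variability among samples concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the population is simulated (10,000 values drawn from a normal distribution with mean 50 and standard deviation 10), and the function name sample_means, the sample size of 25, and the count of 1,000 repeated samples are all hypothetical choices, not taken from any real study. The sketch draws many simple random samples from the same population and records the mean of each one.

```python
import random
import statistics


def sample_means(population, sample_size, num_samples, seed=0):
    """Draw repeated simple random samples (without replacement within
    each sample) and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


# Assumption: a simulated population standing in for a real one.
pop_rng = random.Random(42)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]
pop_mean = statistics.mean(population)

# Take 1,000 different samples of 25 individuals each.
means = sample_means(population, sample_size=25, num_samples=1000)

print(f"Population mean:       {pop_mean:.2f}")
print(f"Smallest sample mean:  {min(means):.2f}")
print(f"Largest sample mean:   {max(means):.2f}")
print(f"Std. dev. of the sample means: {statistics.stdev(means):.2f}")
```

Most of the sample means should land close to the population mean, but the smallest and largest show that an individual sample can miss by a noticeable margin. That spread is the variability among samples described above.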

Ultimately, the trustworthiness of statistical inference—the subject of all chapters following this one—depends on the degree to which any one sample might misrepresent the parent population. When we sample, we run the risk of being misled by a misrepresentative sample, and therefore it's important to understand as much as we can about the size and nature of the risk. To a large degree, our ability to analyze the nature of the variability across potential samples depends on which method we use to select a sample.
