CHAPTER 2

Probability Sampling

Did you ever go into a coffee shop and discover they were giving away samples of a new blend they were serving? Or how about going to a grocery store and finding a person in one of the aisles distributing hot samples of a new pizza bread snack from the frozen-food section? Beyond their obvious purpose of enticing you to buy the drink or food being offered, you probably never gave much thought to those small portions. Whether you would consider giving the merchant your hard-earned money for a full cup or a large package of pizza bread depends not only on your food preferences and whether you liked the taste but also on an underlying assumption you make regarding the sample you were given—namely, that it is representative of the gallons of that particular coffee blend being brewed or that the large boxes of pizza bread snacks found in the freezer case are the same as the bite-size piece you were offered. Imagine the disaster for both merchants and customers if it were not. This little example illustrates the important elements that pertain to the kind of sampling done in conjunction with surveys.

If done properly, a sample can adequately replace the need to examine each item or individual in a population in order to determine whether a particular characteristic is present. For example, suppose you want to find out whether students who attend colleges that are affiliated with particular religious faiths have different attitudes toward the legalization of recreational marijuana than students who attend colleges that have no religious affiliation.1 Answering your question by interviewing or providing a written survey to every student in the country attending either type of college would be impossible owing to the cost and time required (along with a host of other practical considerations). Instead, by carefully selecting a manageable sample that actually represents students from both types of educational institutions, it is possible to answer the question. The key is in getting a sample of individuals that accurately represents all the individuals in the larger population that you're interested in studying.

Now here's a little secret: the only way we can be absolutely certain that we have determined the true picture of the attribute we are interested in studying, in this case attitudes toward marijuana legalization, is to ask each individual in the study population about her or his attitudes toward legalization and then count up and report the responses; this process is called enumeration. If we could actually pull this off, we might find, for example, that 86 percent of students at nonreligiously affiliated colleges favored legalization of recreational use, but only 63 percent of students at religiously affiliated schools were in favor of this legalization. We could be very certain that our findings were correct, and it wouldn't take any more complicated math than addition and division to get our percentages. But that creates a dilemma—we've already said that it is not possible to get information from each person for a variety of reasons. As a result, we are forced to draw a sample and be content knowing that there is some probability that our sample will miss the mark in terms of accurately representing the population. This fact has driven two major areas of study in research, including survey research. One has concerned itself with the methodology of sampling and with the problems that arise from the way a sample is selected or drawn—which we term sampling error. The second area, also part of the larger field of statistics, has focused on determining the mathematical probabilities of error and developing formulas to estimate true values from a variety of data.

In the next chapter we discuss sampling error and some ways to deal with it, but for now, we’ll simply look at different kinds of sampling approaches and see that how a sample is selected can make a big difference in how much confidence we can have that it actually represents the larger population we want to study.

A few years ago, a large state mental health agency approached one of us about improving the quality of a survey that it administered annually to individuals who were receiving community mental health services throughout the state. The survey was designed to capture the kinds of services used at community mental health facilities and clients' satisfaction with those services. After reviewing the data and the process used to administer the survey, we identified a number of problems. One of the most serious of these was the manner of selecting clients to participate in the survey: staff members handed the survey to every individual who came into a community mental health facility during one specified 2-week period of the year. Can you see any problems with selecting a sample in this way? If your answer included the idea that samples selected this way might not be truly representative of the population using community mental health services, you are right on target. This sample is essentially what is known as a convenience sample, and it is a nonprobability sample because it relies on individuals who happen to be available to take the survey. Essentially, individuals who did not pay a visit during the 2-week sample period had no chance of being included in the sample. Further, while the agency knew the total number of individuals receiving services over the course of a year, it had no way of determining beforehand how many would be in the sample. This is important because, as we'll discuss later in the chapter, the size of the sample affects its ability to accurately represent the population from which it is drawn.

What Is a Probability Sample?

The basic idea of survey sampling emerged in Europe at the beginning of the last century,2 but a major theoretical underpinning for the probability sampling seen today is commonly attributed to Jerzy Neyman, a Polish-born mathematician and statistician who presented a paper entitled "On the Two Different Aspects of the Representative Method: The Method of Stratified Sampling and the Method of Purposive Selection" at the Royal Statistical Society in June 1934.3 Neyman's influential work would help create an acceptance of probability sampling and shape the basic construction of probability samples ever since. In the eight decades since his paper, the basic structure of probability sampling has been further developed and refined, but at its core are three assumptions: (1) that a body or frame of all units in the population can be created (this is called the sampling frame), (2) that each person or thing in the frame has a positive likelihood of being selected into the sample, and (3) that the likelihood or probability of being selected can be computed for each sampled unit. The advantage of probability sampling over nonprobability methods is not that it eliminates the possibility of picking a sample of individuals or units that doesn't accurately reflect the population, but that it allows us to estimate the amount of error that may be present in our findings. In other words, we can determine the probability or odds that we have accurately represented the population about which we're trying to draw inferences. In more technical language, we can gauge the precision of our sample estimates. Thus, when you read that a survey found 46 percent of the public approve of the job the president of the United States is doing and the survey had a ±3-point margin of error, you can assume (all else being equal) that the actual percentage of public approval has a known probability of being between 43 and 49 percent. Traditionally, survey researchers have maintained that we cannot make this type of statement with nonprobability sampling methods because we lack the fundamental ability to calculate the probability that each population element will be chosen; in fact, we cannot even be sure that each individual or thing has any possibility of being selected. Stated in more technical language, with probability sampling we say that each element of the population has a known, nonzero probability of being chosen. Thus, while a nonprobability sample may accurately mirror the population from which it is drawn, we have no way of knowing whether it does.
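To make the margin-of-error idea concrete, here is a minimal sketch in Python (the language we will use for illustrative sketches in this chapter); the poll numbers are hypothetical, and the standard formula for the margin of error of a proportion is assumed:

    import math

    def margin_of_error(p_hat, n, z=1.96):
        # z = 1.96 corresponds to the 95 percent confidence level.
        return z * math.sqrt(p_hat * (1 - p_hat) / n)

    # Hypothetical poll: 46 percent approval among 1,067 respondents.
    moe = margin_of_error(0.46, 1067)
    print(f"46% approval, +/- {moe * 100:.1f} points")   # about +/-3.0

A national poll of roughly 1,000 respondents is what typically produces the ±3-point margins reported in the news.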

Today, however, there are challenges to this traditional probability-sampling standard as the six-billion-dollar, 19-year-old enterprise of online survey research with nonprobability samples continues to develop,4 and as researchers explore models for reducing the bias in nonprobability sampling.5 However, in the sphere of public opinion surveying, organizations such as the American Association for Public Opinion Research (AAPOR) have continued to discourage their members and others from attempting to draw inferences by means other than probability sampling.6

Returning to the example of the mental health clients' survey, clients who had no visits during the 2-week sample period had no chance (a zero probability) of being in the sample. On the other hand, because the survey's selection criterion was a time interval, individuals who visited community mental health centers many times during the year would have a greater likelihood or greater probability of being selected to take the survey. Making this selection approach even more troublesome was the fact that clients were asked to complete a survey each time they came to the community mental health center; thus a single client could have multiple surveys included in the sample. In fact, one individual had completed the survey 16 times during the 2-week sample period! The possibility that more frequent visits to the mental health centers might be associated with more chronic or severe mental health issues highlights the problem with this sampling approach: because these individuals were more likely to complete surveys, the results might present a different picture of services than would be obtained from the larger population utilizing community mental health services.

Thus, in traditional probability sampling approaches, those samples that have the greatest likelihood of being representative are those in which the units (in this case people) have an equal probability of being selected. We term these equal probability of selection methods (EPSEM).

Simple Random Samples

There are many ways to obtain samples in which each person in a population has an equal chance of being selected for a sample; the most straightforward of these is called the simple random sample. Simple random samples work when every individual or thing in the population that is to be sampled can be identified. For example, while possible in principle, imagine the work you would have to do to compile a complete list, with contact information, of everyone who had purchased a new Toyota in the past 5 years. If you had such a listing, however, you could use simple random sampling in the same way that you would use it to select a sample of students from three introductory business classes. Once you have a listing of every element (people or things) in the population, you simply need a way to select the number of them you need for your sample in a truly random fashion. The key here is ensuring the randomness of the selection, which gives each element an equal chance or probability of being selected.
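As a minimal sketch of the idea, assuming a hypothetical listing of students from those three classes, Python's standard library can draw a simple random sample in a single call:

    import random

    # Hypothetical sampling frame: every student in three introductory
    # business classes, each identified by a unique ID.
    frame = [f"student_{i:03d}" for i in range(1, 451)]   # 450 students

    random.seed(42)                      # fixed seed makes the draw repeatable
    sample = random.sample(frame, k=45)  # each student has the same 10% chance

Because random.sample draws without replacement and treats every element alike, each student has an equal probability of selection, which is exactly the EPSEM property described above.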

A simple random sample method with which most people in this country are familiar (although it is doubtful they think about it in that way) is the ubiquitous state or multistate lottery, in which five or six ping-pong balls are (hypothetically) randomly drawn from a universe of around 50 numbered balls, randomly mixed in some type of rotating cage or swirling air container. Each numbered ball should have an equal chance of being selected from the hopper on a given draw except, of course, those already selected, which would entail a different kind of sampling.i The objective is to correctly choose the five or six numbers on the ping-pong balls before they are drawn; doing so wins an enormous jackpot, usually worth millions of dollars. Individuals paying to play the lotto hope to preselect the numbers drawn and frequently invent mysterious and unfathomable methods to choose the right numbers. In reality, because the drawing represents a simple random sample of all the ping-pong balls in the lotto universe, they could simply pick any sequence of numbers, say one through five, and they would have just as good a chance of winning as using some supposed system of getting their lucky numbers.
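A quick calculation shows why no system of "lucky" numbers can help; the sketch below counts the equally likely outcomes of a hypothetical 6-from-50 drawing without replacement (see note i):

    import math

    # Six balls drawn without replacement from 50, order irrelevant:
    # every possible ticket is one of math.comb(50, 6) equally likely draws.
    print(f"1 in {math.comb(50, 6):,}")   # 1 in 15,890,700

Whether you play 1-2-3-4-5-6 or your lucky birthdays, every ticket faces the same 1-in-15,890,700 odds.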

Unfortunately, not all probability sampling methods are as easily carried out as drawing a simple random sample of ping-pong balls in a lottery or pulling numbered strips of paper from a hat. After the introduction and widespread acceptance of probability sampling, but before researchers had the power of today's computers and statistical software at their fingertips, statisticians and researchers devoted considerable energy to finding consistently reliable ways to randomly select samples. One of the centerpieces of most of these methods was the random number table, which dates back to the early 1900s.7 When used in conjunction with beginning and ending points, a table of random numbers allows researchers to select enough random numbers to draw a desired sample. In a book chapter published in the early 1980s, Seymour Sudman reviewed the development of random numbers in sampling, discussing the lengths to which researchers went to develop massive random number tables.

The most convenient and most accurate procedure for obtaining a random process is through the use of tables of random numbers. The largest printed table is A Million Random Digits by the Rand Corporation (1955) . . . . The Rand random digits were generated by an electronic roulette wheel. A random-frequency pulse source passed through a five-binary counter and was then converted to a decimal number. The process continued for 2 months, and even with careful tuning of the equipment the numbers produced at the end of the period began to show signs of nonrandomness, indicating that the machine was running down.8

With the introduction of random number tables, randomly selecting numbers became much easier; however, particularly when larger samples were needed, the process could become cumbersome and tedious. A variant of the simple random sample is the systematic sample, which generally maintains the randomness of the selection process but is easier to use when a very large sample is needed and is often easier to execute without mistakes when done by hand.

Systematic Sampling

Like simple random sampling, systematic sampling starts with a listing that identifies the units making up the population. The systematic approach usually begins by determining the size of the sample needed. For illustration, let's say a researcher is conducting a survey and will require 200 individuals in the sample selected from a population of 2,000 individuals. The sample size is then divided into the population size to determine the interval between individuals to be drawn from the population. Every nth case is then selected, which, in the present example, would be every 10th individual. To inject randomness into the process, the key to the systematic selection is to randomly select a starting-point number. Again, random number tables are a handy tool for finding this starting point. So, if we selected 7 as our starting point using the random number tables, we would draw the 7th individual on the list, then the 17th, 27th, 37th, and so forth. In some cases, the systematic selection even allows us to select the cases without numbering them. For example, suppose a medical facility wanted a survey of patients who had been diagnosed with cancer in the past 2 years.9 The facility is interested in this because it installed a new imaging scanner a year earlier and wants to determine patients' perceptions of the diagnostic process before and after the installation of the new equipment. If the medical researchers used a systematic sample with the same parameters previously mentioned, that is, a total of 2,000 cases and a sample of 200 patients, they could simply identify the starting point, say the 7th patient in the case records, and then select every 10th patient for the survey. Using the case records, the researchers could pull up the case file of every 10th patient after the 7th and contact that individual about participating in the survey. However, a note of caution should be voiced here about the systematic selection process. If the data set for the population is already sorted by some characteristic (sometimes unknown to the researcher), the sort can seriously bias the sample and wreak havoc on the study. Suppose in our example that the medical case records had been arranged not alphabetically by the patients' last names but by the date of their diagnosis (which would be highly correlated with whether they were diagnosed with the old equipment or the new imaging scanner). Can you see a selection problem here? The influence of the filing system on which patients fall into the sample would bias it and have serious implications for a survey dealing with diagnostic procedures.
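A minimal sketch of the systematic selection just described, assuming a hypothetical list of 2,000 numbered cases and a required sample of 200:

    import random

    population = list(range(1, 2001))           # hypothetical case IDs 1..2,000
    sample_size = 200
    interval = len(population) // sample_size   # 2,000 / 200 = every 10th case

    start = random.randrange(1, interval + 1)   # random start between 1 and 10
    sample = population[start - 1::interval]    # a start of 7 gives 7, 17, 27, ...
    assert len(sample) == sample_size

Note that the only random act is choosing the starting point; everything after that is mechanical, which is what makes the method easy to execute by hand but vulnerable to patterns in a sorted list.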

Before moving on to more complex ways of selecting samples, a couple of points are worth noting. First, while some statistics and survey books still contain random number tables, today’s students and researchers have a much easier time using random numbers to select samples because many of today’s statistical analysis programs incorporate random number generators (subroutines) capable of generating random numbers for simple random and systematic sampling and even more sophisticated sampling designs. However, random number tables are still useful tools, and a few statistics and methodology texts continue to include them and provide instructions on their use.10

Second, despite the power of the computer to process samples, several problems may still make it impractical or impossible to carry out the process needed for the simple random or systematic selection of individuals from a population. Fortunately, statisticians and survey methodologists have developed a number of sampling designs that allow us to overcome many of these problems and still maintain the integrity of the probability sample. In the following section, we briefly review some of these approaches.

Stratified Sampling

Let’s suppose that a large software company based in California with an extensive internship program is interested in how interns of various racial and ethnic backgrounds perceive the value of their internship with the company, such as whether interns believe they have been treated well and whether they would be interested in seeking full-time employment with the company based on their internship experience. There are 3,060 interns currently with the company, and the company’s human resources department estimates that it will have time and resources to conduct interviews with about 306 of them. The breakdown of the race and ethnicity of the interns is as follows:

  • 1,200 Caucasians (39.2 percent)
  • 660 Chinese/Southeast Asian (21.6 percent)
  • 540 East Indian (17.6 percent)
  • 240 Latino/Latina or other Hispanic (7.8 percent)
  • 180 African/African American (5.9 percent)
  • 120 Middle Eastern (3.9 percent)
  • 120 Other ethnic groups (3.9 percent)

If we assume that the individuals of various racial and ethnic backgrounds would appear in a random sample in the same proportions as they appear in the population, then we would expect to see the following racial and ethnic distribution in our sample of 306 interns:

  • 120 Caucasians (39.2 percent)
  • 66 Chinese/Southeast Asian (21.6 percent)
  • 54 East Indian (17.6 percent)
  • 24 Latino/Latina or other Hispanic (7.8 percent)
  • 18 African/African American (5.9 percent)
  • 12 Middle Eastern (3.9 percent)
  • 12 Other ethnic groups (3.9 percent)

However, because of the small proportion of certain racial and ethnic backgrounds, it is quite possible that some of the groups might have very few or no interns selected in a simple random sample owing to sampling error. This is particularly problematic in our illustration because the researchers are primarily concerned with perceptions by the different racial and ethnic groups. To overcome this problem, we can use a technique termed stratified sampling, which works particularly well when we have subgroups within the population that are of very different sizes or small proportions of the population. By dividing the population into homogeneous groups or layers called strata and then sampling within those strata, we reduce sampling error. In this example we would have seven strata or groups. Once we have the strata identified, we can then use simple random sampling to select individuals within each stratum.

There are two types of stratified samples: proportional and disproportional. In proportional stratified random sampling, the number selected from each division or stratum is proportionate to that stratum's size in the population. This means that each stratum has the same sampling fraction. In our illustration, there are 180 African American interns and 120 Middle Eastern interns, which are 5.9 and 3.9 percent of the total number of interns, respectively, so if our stratified sample is proportional, we would randomly select 18 interns from the 180 African American intern group and 12 interns from the Middle Eastern intern group. On the other hand, if we use a disproportionate stratified sampling method, the number of individuals from each stratum is not proportional to their representation in the total population. Population elements are not given an equal chance to be included in the sample (recall the previous EPSEM discussion). Therefore, while this allows us to build up or oversample strata that would otherwise contain low numbers of individuals, it creates a problem if we're trying to generalize back to a population. Suppose in our example we sample disproportionately so that we have approximately 44 interns from each stratum. In that case, the responses given by Latino/Latina/Hispanic, African American, Middle Eastern, and Other interns would be overrepresented, while the responses of Caucasian, Chinese/Southeast Asian, and East Indian interns would be underrepresented. To compensate for this, we would need to weight our stratum responses back to the proportions of the strata seen in the population.
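A hedged sketch of the two allocation schemes, using the hypothetical intern counts above, along with the weights needed to generalize from the disproportionate (balanced) design:

    # Hypothetical intern counts from the example above.
    strata = {
        "Caucasian": 1200, "Chinese/Southeast Asian": 660, "East Indian": 540,
        "Latino/Latina or other Hispanic": 240, "African/African American": 180,
        "Middle Eastern": 120, "Other": 120,
    }
    total = sum(strata.values())   # 3,060 interns
    n = 306                        # interviews we can afford

    # Proportional allocation: the same 10% sampling fraction in every stratum.
    proportional = {g: round(n * size / total) for g, size in strata.items()}

    # Disproportionate (balanced) allocation: roughly 44 interns per stratum.
    balanced = {g: round(n / len(strata)) for g in strata}

    # Weights that restore population proportions after balanced sampling:
    # a respondent's weight is (population share) / (sample share).
    weights = {g: (strata[g] / total) / (balanced[g] / n) for g in strata}

Under the balanced design, each Caucasian respondent carries a weight of about 2.7 while each Middle Eastern respondent carries about 0.27, which is how the analysis puts the strata back into their population proportions.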

In our example here, we would likely be interested in comparing our strata or conducting what is termed a between-stratum analysis.ii This would permit us to compare responses on the survey interviews from each of our seven strata against one another.

Cluster Sampling

Another form of sampling that also uses grouping of individuals in the process is called cluster sampling. Because both stratified sampling and cluster sampling use groups in their process, they are frequently confused. Recall that a stratified sample begins by placing individuals into groups based on some characteristic, such as race and ethnicity, marital status, religious preference, and so forth. In cluster sampling, we begin by first randomly selecting a sample of some naturally occurring or known grouping. For example, we might create a cluster sample by randomly selecting a group of outlet malls. We then take all units from each of our randomly selected clusters for our sample. Thus, we might select all the stores from our randomly selected group of outlet malls. This approach is particularly useful when there is no satisfactory list of individuals or things in a larger population that we want to study and no way to get at the population directly, making it impossible to draw a simple random sample. To illustrate the process, let's consider using this approach to solve the problem of an inability to get a listing of individuals in a population. Suppose the National Collegiate Athletic Association (NCAA), responding to growing concern about its athletes suffering concussions while playing, decides to survey NCAA school athletes. The NCAA thinks that a survey of players would be good to determine how aware players were of the problem of concussions, whether they had ever suffered a concussion, and whether they had suffered any longer-term effects from a competition-related head injury. Because of the large number of college sports and players, the NCAA decides to start by conducting surveys of athletes in two sports with higher probabilities of concussions: football and soccer. It contracts with a university to design and conduct two surveys, one for each sport. The NCAA tells the university that it is very difficult to get a single listing of all current NCAA football and soccer players from which to pull a simple random sample. This is not an uncommon problem even with well-defined populations such as college sports teams; imagine, then, the struggle to identify all the members of a less well-defined group such as aerospace workers or the residents of a particular country! Because of this problem, the researchers decide to use cluster sampling. Just as with stratified sampling, every member of the population can be a member of one, and only one, group or cluster—in this case one NCAA college or university. The first step is to identify known or accessible clusters, so in our example, the researchers will start by listing NCAA schools (because they are identifiable) across the country; then using a random selection method (described earlier), they will choose a certain number of schools that are the clusters from which individual athletes will be drawn. Basically, the researchers would ask for the team rosters from each of the selected schools for each of the two sports in order to produce their final two samples of athletes who will receive a survey.
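A minimal sketch of this one-stage cluster selection, with hypothetical schools and rosters standing in for the real NCAA lists:

    import random

    # Hypothetical cluster frame: NCAA schools, each with a team roster.
    rosters = {
        "School A": ["A1", "A2", "A3", "A4"],
        "School B": ["B1", "B2", "B3"],
        "School C": ["C1", "C2", "C3", "C4", "C5"],
        "School D": ["D1", "D2"],
    }   # ...in practice, one entry per NCAA school

    random.seed(3)
    chosen = random.sample(list(rosters), k=2)   # randomly select the clusters

    # One-stage cluster sample: survey every athlete on each chosen roster.
    sample = [player for school in chosen for player in rosters[school]]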

We can extend the basic ideas of clustering, stratification, and random selection to create much more complex designs to deal with specific issues that might present themselves in sampling. Such designs are commonly referred to as multistage sampling. With multistage sampling, we select a sample by using combinations of different sampling methods. For example, because of the large number of student athletes on NCAA college football and soccer teams, the researchers may decide that it’s too costly to send a survey to every athlete at schools in the cluster samples. They might then propose a two-stage sampling process. In Stage 1, for example, they might use cluster sampling to choose clusters from the NCAA college and university population. Then, in Stage 2, they might use simple random sampling to select a subset of students from the rosters of each of the chosen cluster schools for the final sample. As long as every individual (in this case players) can be attached to one of the groups, this sampling approach works very well.11 As you can see, it is quite easy to include additional clustering criteria in the design.
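Building on the same idea, here is a self-contained, hedged sketch of the two-stage design just described (the schools and roster sizes are hypothetical): Stage 1 samples clusters, and Stage 2 draws a simple random sample within each chosen cluster.

    import random

    # Hypothetical frame: 20 schools, each with a roster of 80 athletes.
    rosters = {f"School {i}": [f"S{i}P{j}" for j in range(1, 81)]
               for i in range(1, 21)}

    random.seed(4)
    stage1 = random.sample(list(rosters), k=5)      # Stage 1: cluster sampling
    stage2 = [player
              for school in stage1
              for player in random.sample(rosters[school], k=10)]  # Stage 2: SRS

    assert len(stage2) == 50   # 5 schools x 10 athletes per school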

How Do We Select the Right Sample Size?

As Arlene Fink points out, “The ideal sample is a miniature version of the population.”12 For those pursuing this ideal, there is always a trade-off in considering sample sizes—what is optimal and what is practical? In sampling methodology terms, the researcher must decide how much sampling error he or she is willing to tolerate, balanced against budget, time, and effort constraints.

When talking about sampling needs and sample sizes, it is important to keep in mind that sample size is based on the number of individuals who respond to the survey, not the number of individuals who initially receive the survey. Thus, the response rate needs to be taken into consideration when the sample is being selected. For example, if you have a 1 in 15 response rate, it means that for a sample of 300 individuals you need to start with an initial sample group of 4,500. If your response rate is 1 in 20, your initial sample would need to be 6,000 individuals, so you can see how changing the response rate by only a small amount can have a fairly large impact on the number of individuals you need in your initial sample. This problem has become readily apparent in the realm of telephone surveys, which, up to the early 2000s, were among the most predominant ways of collecting survey data. For a variety of reasons, telephone survey response rates are now in the single digits. To illustrate, a study conducted by the Pew Research Center and presented at an AAPOR conference in 2012 noted, "The response rate of a typical telephone survey was 36% in 1997 and is just 9% today."13
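The arithmetic behind these figures is simple but worth making explicit; here is a minimal sketch (the rates are the hypothetical ones from the paragraph above, plus the 9 percent telephone figure):

    import math

    def initial_sample_needed(completes_needed, response_rate):
        # Invitations required to expect a given number of completed surveys.
        return math.ceil(completes_needed / response_rate)

    print(initial_sample_needed(300, 1 / 15))   # 4,500 invitations
    print(initial_sample_needed(300, 1 / 20))   # 6,000 invitations
    print(initial_sample_needed(300, 0.09))     # 3,334 at a 9% telephone rate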

Before continuing to explore sample size determination, let's revisit some fundamental ideas about drawing inferences about a population from a sample. Recall that the only way we can be absolutely certain we have accurately captured a population attribute is to determine the value of the characteristic from every individual or thing in the population. So if we want to determine the average age of an alumni group and be 100 percent certain we are correct, we would need to ask each person's age, then sum the responses and divide by the number of individuals, which would give us an arithmetic average (mean). For the sake of illustration, let's say that we have 15,000 alumni who range in age from 19 to 73 years and the average age (mean) of the population is 34 years. But, as is the case with most surveys in the real world, we have neither the time nor the resources to get the information from 15,000 alumni, so we will draw a sample and use the average age of the sample to estimate the average age of the population. For a little drama, suppose we ask only one person's age, and that person's age is 26 years. This would be a point estimate of our population's average age. Simply put, if we use that one-person sample to estimate the average age of the entire alumni group, we will be wrong by quite a bit! In fact, if we repeated our rather absurd sampling of one individual many times over, the only time our sample estimate of the population parameter would be correct is when our sample picked an individual who was exactly 34 years old. This would be a fairly rare event, given that only a very small percentage of the alumni population is likely to be exactly 34 years old. So our sampling error will be very high. Now, suppose that we selected five alumni and determined their average age. We would still have a high likelihood of being off from the population's true mean of 34 (i.e., of having a large sampling error), but the chances of getting an accurate estimate would improve slightly over the approach where we asked only one person. This is because many more combinations of five ages than of single ages average out close to the population mean. In other words, there is less sampling error with larger samples. Stated another way, our sample of five individuals more closely resembles the distribution of age in the population than a sample of one. In contrast to the one-person samples, any of the following three samples of five individuals pulled from the population would produce an accurate estimate of the true average age of the alumni population:

Group 1: 20, 30, 30, 40, and 50

Group 2: 25, 28, 28, 32, and 57

Group 3: 30, 32, 34, 36, and 38

You may have also noticed that if you had pulled the same 15 individuals as repeated single-person samples, you would have seen an accurate estimate of the population's average age only once, while 14 of your samples would have yielded an incorrect estimate. You can see how, as you increase your sample size, you increase the probability of accurately mirroring or representing the larger population.
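A short simulation drives the point home. The population below is hypothetical (constructed so its mean age is close to 34), but the pattern it shows, that the typical error of a sample mean shrinks as the sample grows, holds for any population:

    import random
    import statistics

    random.seed(0)
    # Hypothetical alumni population: 15,000 ages between 19 and 73.
    population = [min(73, max(19, round(random.gauss(34, 9))))
                  for _ in range(15000)]
    true_mean = statistics.mean(population)   # close to 34

    def typical_error(n, trials=2000):
        # Average distance between sample means of size n and the true mean.
        means = [statistics.mean(random.sample(population, n))
                 for _ in range(trials)]
        return statistics.mean(abs(m - true_mean) for m in means)

    print(f"n=1:  off by about {typical_error(1):.1f} years")
    print(f"n=5:  off by about {typical_error(5):.1f} years")
    print(f"n=50: off by about {typical_error(50):.1f} years")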

Confidence Intervals and Confidence Levels

If you went to a statistician to ask for help in determining the best sample size for a study you had planned, chances are the statistician would ask you several questions relative to sampling error and the precision of the sampling estimates you would like to have. The two primary questions in this regard would be "What confidence interval would you like to use?" and "What confidence level are you comfortable with?" A confidence interval is the range of values around the true population value within which the estimates from our samples are expected to fall a specific percentage of the time. It is also commonly called the margin of error. That is why, when survey results are reported, they are usually accompanied by a disclaimer about the margin of error (recall the example of the president's approval earlier in the chapter). The statistician would also ask about the confidence level, which describes the uncertainty associated with the interval estimate. For example, the confidence level might be 95 percent. This means that if we used the same sampling method to create a great many different samples and computed an interval estimate for each sample, we would expect the true population parameter to fall within the interval estimates 95 percent of the time. In survey research, confidence levels are commonly set at 90, 95, or 99 percent. In the first case, this means you can expect the confidence interval around your sample estimate to contain the true mean value, such as average age, 90 percent of the time. Conversely, 10 percent of the time your sample interval estimates would not contain the true mean. At the 95 percent level, you would reduce the probability of being wrong to 5 percent, and at 99 percent you would reduce it even further to 1 percent.iii The effect of increasing the confidence level is that the confidence interval becomes wider if the sample size stays constant. One way to counteract that is to increase the size of your sample. While the actual calculations for establishing confidence intervals are outside the scope of the discussion here, many good introductory statistics books will take you through the process. There are also a number of sampling calculators available online, which, once you provide the size of the population, confidence level, and confidence interval parameters, will give the sample size you will need for your survey.iv These sample size calculators also will let you see how adjusting the confidence interval and confidence level affects your sample size.
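While the full calculations are left to the statistics books, a minimal sketch shows how the interval widens as the confidence level rises; the ages are a hypothetical sample, and the standard critical z values are assumed:

    import math
    import statistics

    def z_interval(data, z):
        # Approximate z-based confidence interval for a sample mean.
        mean = statistics.mean(data)
        se = statistics.stdev(data) / math.sqrt(len(data))
        return mean - z * se, mean + z * se

    ages = [26, 31, 34, 29, 41, 38, 22, 35, 30, 44]   # hypothetical sample
    for level, z in [("90%", 1.645), ("95%", 1.960), ("99%", 2.576)]:
        low, high = z_interval(ages, z)
        print(f"{level} confidence: {low:.1f} to {high:.1f} years")

Running this, the 99 percent interval is noticeably wider than the 90 percent interval for the same ten ages, which is exactly the trade-off described above.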

One common mistake made by people who are not familiar with survey sampling is to assume that the size of a sample must be proportional to the size of the population: that while small samples work well for small populations, very large samples must be obtained for very large populations. The following illustrates, however, that as the population grows, we can still obtain reliable estimates with samples that are only modestly bigger. Let's say we are conducting a survey to determine whether voters approve of the president's performance. For the sake of simplicity, let's start with the assumption that half will approve and half will not (which results in the largest required sample size because this is the maximum variation possible). If we use a standard 5 percent margin of error and a confidence level of 95 percent (both of which are commonly used), our sample sizes would vary across different population sizes as follows:

 

Size of population    Size of sample needed

               200                      132
             2,000                      323
            20,000                      377
           200,000                      383
         2,000,000                      385
       200,000,000                      385

With a confidence interval of ±5 percent at the 95 percent confidence level.

As you can see, the sample size initially needs to increase significantly as the population gets bigger; after a point, however, the required sample size essentially flattens out, even as the population increases by a factor of 10 and finally by a factor of 100.
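The flattening you see in the table comes from the finite population correction. The sketch below uses the textbook formula that many online calculators apply (an infinite-population size of z²p(1−p)/e², then corrected for the population size); rounding conventions differ among calculators, so a figure may differ from the table by one:

    import math

    def sample_size(population, moe=0.05, z=1.96, p=0.5):
        # Sample size for a proportion, with finite population correction.
        n0 = z ** 2 * p * (1 - p) / moe ** 2        # infinite-population size
        return math.ceil(n0 / (1 + (n0 - 1) / population))

    for pop in (200, 2_000, 20_000, 200_000, 2_000_000, 200_000_000):
        print(f"{pop:>11,}  {sample_size(pop)}")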

This is the beauty of probability sampling; if done correctly, we can use rather small representative groups or samples to accurately estimate characteristics of large populations!

Before turning our attention to survey error in the next chapter, a final note on probability sampling is in order. As you may have noted, we have not explored nonprobability sampling methods in this chapter. We chose to omit them because of space limitations; the unevenness of some of the emerging approaches;v the demonstrated bias that continues to trouble nonprobability models; and the fact that probability sampling has been a tested framework since its inception in the 1930s, chiefly because of its power in generalizing back to a larger population. Recently, however, there has been increased concern about some of the problems facing probability-based sampling designs as rapidly evolving technology alters the landscape of communication essential for conducting surveys. Landline phones are disappearing as cell phone-only households become commonplace, and electronic communication such as e-mail and text messaging is substituted for hard copy. New messaging platforms that integrate rapid transmission of written material, video and still images, and audio are emerging on an almost daily basis. The recent development of address-based sampling frames has given researchers the ability to draw probability samples of addresses from a database with near universal coverage of residential homes. The web has become an easily accessible and inexpensive tool for survey delivery, even though many web applications use nonprobability sampling methods, such as survey panels, and are therefore suspect in terms of generalizing back to a larger population of interest. With these new technologies come sampling problems that affect the reliability and validity of probability sampling methods when those methods are simply layered over designs created around different data collection methods. The creation of new platforms for survey delivery requires an examination of alternative approaches.14 Recently, AAPOR, one of the most respected professional organizations in this area, commissioned a task force to "examine the conditions under which various survey designs that do not use probability samples might still be useful for making inferences to a larger population."15

The massive amount of data collected through Internet enterprise has even led some to foresee the elimination of the need to do surveys at all. The ability to collect, store, and analyze so-called Big Data clearly offers opportunities to look at the relationships between variables (topics of interest) on the scale of populations rather than samples. In so doing, many of the concerns about sampling and sampling error presumably fall away. The use of Big Data also shifts much of the focus of the analytic process away from concentrated statistical effort after data collection is complete and toward the collecting, organizing, and mining of information: "the fundamental challenge in every Big Data analysis project: collecting the data and setting it up for analysis. The analysis step itself is easy; pre-analysis is the tricky part."16 However, critics such as Tim Harford point out that in the rush to use, and sometimes sell, Big Data approaches, proponents sometimes present "optimistic oversimplifications."17 Harford is particularly critical of the notion that theory-free data correlations obtained from Big Data can tell us what we need to know without the need to examine causality further and that large data sets somehow remove all the analytic pitfalls seen in smaller data sets.

Clearly, the changing landscape of electronic data collection will impact sampling methodologies used in surveys and create new approaches to get information about a population. Online surveys, opt-in panels, and the use of Big Data techniques serve as examples of the drive to capitalize on electronic data collection and storage capabilities offered through the Internet. However, the technology alone does not automatically improve our ability to make valid and reliable inferences about a larger population. In Chapter 4 (Volume II) we discuss the impacts that technology is having on the survey research enterprise as well as the strengths and weaknesses of the new approaches mentioned above.

Summary

Note: In this summary we use the term individuals broadly to refer to the units or elements, which can be people or other individual entities, making up the population from which we draw samples.

  • To collect information from a target population, well-done sampling can replace collecting information from each individual in that population (a process called enumeration). Sampling is more practical because it reduces time, effort, and costs needed to gather information.
  • There are two basic types of sampling—probability and nonprobability.
    • Nonprobability may be used in some cases, but it has a major limitation: it doesn’t allow us to judge how well our sample reflects the larger population about which we are trying to draw inferences. Essentially, we cannot determine sampling error with nonprobability samples.
    • While probability sampling doesn’t eliminate the possibility of picking a sample of individuals that doesn’t accurately reflect the larger population, it does allow us to calculate how much error might be present.
  • Probability samples have three fundamental elements:
    • A group or frame of all individuals in the population can be created. This is termed the sampling frame.
    • Each individual in the population has a positive chance of being selected into the sample.
    • The probability of an individual being selected can be computed for each individual in the population.
  • Simple random sampling is a basic type of probability sampling that uses techniques to randomly choose individuals from the population for the sample. Each individual in the larger population has an equal chance of being selected for the sample. We call methods that have this characteristic equal probability of selection methods (EPSEM).
  • Systematic samples are also EPSEM samples and are very similar to simple random samples but differ in their approach of selecting the sample. Systematic samples use a process where the sample is formed by dividing the number of individuals needed for the sample into the number of individuals in the population to determine an interval between individuals for selection purposes. A random number located within the first interval is selected as a starting point, and then every subsequent nth individual is selected until the number of individuals needed for the sample is reached.
  • Stratified sampling is a more complex sampling strategy that works well when there are subgroups within the population that are of very different sizes or are very small proportions of the larger population. Strata are formed by dividing the population into homogeneous groups or layers, and then sampling is done within those strata. This reduces sampling error. We might, for example, create strata based on the racial and ethnic backgrounds of interns in a large company in order to ensure that certain racial/ethnic subgroups are not missed because they make up such a very small part of the intern population.
  • Cluster sampling is another form of more complex sampling, which also uses a grouping technique like stratified sampling. With cluster sampling, however, the groups are naturally occurring or otherwise known groupings in the population, such as high schools or business organizations.
  • Multistage sampling extends the basic ideas of stratification, clustering, and random selection to create much more complex designs to deal with specific issues that might present themselves in sampling. When engaging in multistage sampling, we simply break our sampling design down into separate stages, sequencing one method after another.
  • Selecting the right size for a sample is a bit of an art and a bit of a science. There is always a trade-off in considering sample sizes—what is optimal and what is practical.
    • It is important to keep in mind that the size of the sample refers to the number of individuals who respond to the survey, not the number who initially receive the survey.
    • Response rate refers to the proportion of individuals who respond out of the number of individuals who are initially selected and asked to participate. For example, a 1-in-15 response rate means that 1 out of every 15 individuals asked to participate actually completed a survey.
    • Confidence intervals, also called the margin of error, refer to the range of values around the true population value within which our sample estimates are expected to fall a specific percentage of the time.
      • Confidence levels reflect how often we can expect the interval estimates derived from our samples to contain the true population value, with the remainder being the chance of error. In social science research, confidence levels are typically set at 90 percent (10 percent error), 95 percent (5 percent error), or 99 percent (1 percent error).
      • Increasing the confidence level without increasing the sample size widens the confidence interval.
  • One of the common mistakes that people not familiar with surveys make is assuming that the size of samples must be proportional to the size of the population. In reality, after population sizes reach a certain level, the sample size needs to increase only a small amount even if the population increases by magnitudes of 10 or 100 or more.
  • This is the beauty of probability sampling; if done correctly, we can use rather small representative groups to accurately estimate characteristics of large populations.

Annotated Bibliography

  • There are a number of good books and articles on research sampling, from very technical presentations to general introductory discussions.
  • A good general introduction to sampling can be found in Earl Babbie’s classic The Practice of Social Research.18
  • The Handbook of Survey Research edited by Peter H. Rossi, James D. Wright, and Andy B. Anderson provides a comprehensive look at survey methodology and data. Chapter 2, “Sampling Theory,” by Martin Frankel,19 provides the underlying foundation for survey sampling, and Chapter 5, by Seymour Sudman,20 covers the theory of sampling and different sampling approaches.
  • If you haven’t had statistics or are new to the idea of sampling, some of the self-help websites such as Stat Trek provide overviews to survey methodology on topics such as sampling.21
  • For insight into more advanced sampling techniques such as proportionate stratified sampling and more complex disproportionate stratification methods such as disproportionate optimum allocation, see Daniel’s Essentials of Sampling.22
  • Chapter 2 focuses on probability sampling. Other nonprobability types of sampling are discussed in most statistics or survey research methodology texts. Advances in technology are resulting in rapid change in survey sampling and data collection, such as the use of online survey panels. These methods have come under increasing scrutiny because of questions surrounding their ability to produce representative samples. Information on a recent examination of these by the American Association for Public Opinion Research is available.23 A nice review of recent advances and how methodology can be improved is provided by Engel.24

iThe typical sampling done in lotteries is termed sampling without replacement because once a numbered ball is selected, it is taken from the pool and cannot be selected again. If it were returned to the pool of balls, the approach would be termed sampling with replacement. The odds of correctly choosing the entire sequence of five or six balls needed to win the lottery change dramatically under the two methods.

iiTo do so, we would use a balanced allocation (also termed factorial allocation), selecting strata with an equal number of interns in each. Since we have limited our study to 306 individuals and we have seven strata, we would disproportionately sample so that we had approximately 44 interns in each stratum.

iiiYou may have heard the term significance level when we are reporting results of statistical tests for differences between individuals or groups. The significance level is set by the confidence level, so when a researcher reports that there were significant differences between the average ages of the two groups at the p ≤ 0.05 level, what the person is really saying is that the apparent differences in average age between the groups could be due to chance 5 percent of the time or less.

ivSee, for example, the online sample size calculators provided by Creative Research Systems at http://www.surveysystem.com/sscalc.htm or the one offered by Raosoft Inc. at http://www.raosoft.com/samplesize.html

vSome of the new approaches driven by technology are discussed in Chapter 4 (Volume II).
