10

The Basics of Experimental Research in Media Studies

Glenn Leshner

ABSTRACT

This chapter discusses the basics of experimental design used in media studies research. Many scientific researchers believe, with good reason, that the controlled experiment is perhaps the best research method for determining a cause and effect relationship between variables. The chapter considers the types of research questions experiments can address, how the experiment can demonstrate causal relationships between variables, common experimental designs, data collection and interpretation, and ethical considerations. A major part of the chapter deals with internal validity: pitfalls such as confounds, along with safeguards such as randomization and how to determine acceptable sample sizes. It also considers what may be regarded as controversial issues in experimental research, such as generalizability (or lack thereof), how to create treatment variance, the issues involved with creating message variance, and handling the issues of design complexity and manipulation checks.

This chapter discusses the basics of experimental design used in media studies research. Many scientific researchers believe, with good reason, that the controlled experiment is perhaps the best research method for determining a cause and effect relationship between variables. And yet, experimental research is relatively rare in mass communication studies, especially compared to other quantitative methods such as surveys and content analyses. Thorson, Wicks, and Leshner (2012) report that only 12% of social scientific manuscripts submitted to Journalism & Mass Communication Quarterly in 2008–2009 were based on experiments. Grabe and Westley (2003) reported that only 9% to 13% of studies in mass communication use experimental methods. Given that most of the theories in scientific mass communication research involve some sort of causal relationship between variables, why is the controlled experiment so rarely used as a research methodology? There are likely many reasons why this is the case, including the perception that controlled experiments are difficult to do well. Another may rest on misconceptions some in our field have about several key issues that must be considered and resolved when conducting experiments. This chapter is designed to address both of these concerns.

The chapter considers the types of research questions experiments can address, how experiments can demonstrate causal relationships between variables, and common experimental designs. A major component of this examination relates to internal validity: pitfalls such as confounds are discussed in detail, along with safeguards such as randomization and how to determine acceptable sample sizes. The chapter also reflects on what some would consider more controversial issues in experimental research, such as generalizability (or lack thereof), how to create treatment variance, the issues involved with creating message variance, and manipulation checks. The ultimate goal here is to help students conduct controlled experiments that contribute to our understanding of mass communication processes and effects.

Before discussing the particulars of controlled experiments, it is important to note that a student attempting to conduct an experiment in the absence of compelling theory is likely to become frustrated, if not confused, during the process. The activity of designing and executing a controlled experiment involves many decisions, each with its own trade-offs. The benefits of having a carefully developed theory to help guide the process cannot be overstated. Many, if not most, decisions that need to be made – including those having to do with stimuli design, controls, samples, and so on – can be shaped, either directly or indirectly, by the theory the researcher seeks to inform.

Logic of the Controlled Experiment

The philosophy of the experimental method is to determine cause and effect relationships between two or more variables and to eliminate all possible alternative explanations for the relationship. How is this accomplished? Controlled experiments require three elements that generally are not a part of other research methods: (1) manipulation of an independent variable, (2) random assignment to condition (sometimes called “treatment”), and (3) strong control of sources of variation in the dependent variable that are not due to the manipulation.

A manipulation is an independent variable in which the researcher creates the variance (or levels) of the variable. In media studies, a manipulated independent variable is often operationalized as media that vary on particular characteristics. For example, Grabe and Samson (2011) were interested in how participants would evaluate and remember what they called “anchor grooming.” For their experiment, they created two versions of a TV newscast with a female anchor – one that was “sexualized” and one that was “unsexualized.” The sexualized condition showed the anchor in a tight-fitting jacket and skirt, wearing bright red lipstick and a necklace. The unsexualized condition showed the anchor in a loose-fitting jacket and skirt, wearing no lipstick and no necklace.

Random assignment to a condition (also called randomization) is the process of randomly assigning participants to treatment groups (or randomly assigning treatments to participants). Random assignment to condition eliminates the influences of extraneous variables, such as individual differences between participants. In the above study, Grabe and Samson randomly assigned their participants to one of the two “anchor grooming” conditions.
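To make the mechanics concrete, the following minimal sketch (in Python, using hypothetical participant IDs and condition labels) illustrates one common way to carry out random assignment: a balanced list of condition labels is shuffled and paired with participants, so that assignment is random while group sizes stay roughly equal.

import random

def randomly_assign(participant_ids, conditions, seed=None):
    # Shuffle a balanced list of condition labels and pair it with participants.
    rng = random.Random(seed)
    ids = list(participant_ids)
    labels = (conditions * (len(ids) // len(conditions) + 1))[:len(ids)]
    rng.shuffle(labels)
    return dict(zip(ids, labels))

# Hypothetical example: 40 participants, two "anchor grooming" conditions.
assignments = randomly_assign(range(1, 41), ["sexualized", "unsexualized"], seed=1)
print(assignments[1], assignments[40])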

Controlling other sources of variation helps to ensure that the cause of differences observed in the dependent variables is due to the manipulation and not to anything else. In their study, Grabe and Samson created two versions of the newscast in which everything was the same across the two conditions except for the manipulation. Also, participants followed the same procedures. Some scholars require that controlled experiments include a theory that convincingly explains the relationships between variables found in the research. Although this last point is not unique to controlled experiments, the lack of a suitable theory that explains the relationships between variables greatly limits the usefulness of the research findings, and thus devalues the research itself.

The manipulation of the independent variable, randomization, and the controlling of other sources of variance are requirements for establishing causal relationships. In order to argue that an independent variable caused change in dependent variables, the independent and dependent variables must covary. That is, changes in the value of an independent variable must vary along with changes in a dependent variable. In addition, temporal precedence must be established. That is, the independent variable must precede the dependent variable in time. The manipulation helps the researcher set up a time order where changes in the independent variable occur prior to measurement of the dependent variable. This helps to eliminate the alternative explanation that changes in the dependent variable caused changes in the independent variable. Randomization is crucial to reduce the likelihood that individual differences between participants caused changes in the dependent variable.

Strong controls help to eliminate alternative explanations in that they reduce confounds in your design. The ability to eliminate alternative explanations in cause and effect relationships is called internal validity. Well-executed controlled experiments are high in internal validity. However, because experiments often occur in artificial contexts, because the contexts are under the control of the researcher, and because convenience samples of participants are often used, controlled experiments are said to lack external validity. That is, the results of experiments cannot be easily generalized to larger populations because their samples are rarely, if ever, drawn randomly.

Establishing Causality

At the heart of an experiment's effectiveness for discovering causal relationships is the control of events in order to eliminate alternative explanations for any observed covariation between the independent variable and the dependent variable. As noted above, this is done by detecting covariation between the independent variable and the dependent variable, by manipulating an independent variable, by randomly assigning participants to condition, and by including strong controls. All of these must be present in order to infer a causal relationship between an independent and dependent variable in an experiment.

We detect covariation between independent and dependent variables by examining statistical relationships between the two. If an independent variable and a dependent variable do not covary, then the independent variable cannot be the cause of changes in values of the dependent variable.

Much of the time, effort, and money researchers spend on experimental research is directed toward the creation of a suitable manipulation. In media studies, this often means systematically varying properties of media messages, media contexts, or media technologies. Depending on the independent variable of interest, creating a treatment that manipulates only the media feature of interest without concurrently manipulating other media features can be very difficult. However, care in manipulations is crucial because it is possible to unwittingly incorporate a confound into the study. Let's suppose that in the study mentioned above, Grabe and Samson had the female anchor read five news stories about the economy in the unsexualized condition and five news stories about entertainment celebrities in the sexualized condition (of course, they did not do this). In this case, a confound would be that the topics of the stories covaried along with the anchor grooming. Any difference found in a dependent variable (e.g., perceived professionalism) could be due to the story content and not the image of the anchor. The reason why we create variance in the independent variable is twofold. First, variation in the independent variable is a necessary (but not sufficient) condition to detect covariation with the dependent variable. Second, the manipulation helps to set up the time order of events where the independent variable occurred prior to the measurement of the dependent variable.

Research Questions That Can Be Addressed by Controlled Experiments

A good research question that can be addressed by a controlled experiment must ask about the causal relationships between at least two variables. These variables must be observable by either manipulation (independent variable) or measurement (either independent variable or dependent variable). A good way to begin to frame a simple research question is to ask, “What is the relationship between A and B?” where A and B are your two variables of interest (for argument's sake, let's say that A is your independent variable and B is your dependent variable). The benefit of beginning with this type of question is that it makes you think of your study in terms of your variables rather than in terms of a topic. Another benefit of this question format is that it requires you to think about the kind of relationship you think might exist between your independent and dependent variables. Does a change in A cause a change in B? If so, does variation in A increase B or decrease B? It is difficult for a researcher (whether novice or expert) to begin designing an experiment without first asking the research question in this or in a similar form. In a classic study, Janis and Feshbach (1953) asked if increases in levels of fear in a persuasive message (the independent variable; theirs had three levels) affect message effectiveness. Message effectiveness was represented by a variety of dependent variables that measured beliefs, attitudes, and behaviors.

Designing an Experiment

Let's suppose that you have a research question about the causal relationship between two variables. As you think about the steps needed to ensure that any change in the dependent variable is due solely to changes in the independent variable, you soon realize that there are dozens of important decisions to make. While these decisions are addressable when handled with sufficient knowledge, each involves making choices that often do not have easy answers. The key is to make those choices having understood the benefits and costs of each decision. Keep in mind that any “cost” that adversely affects the internal validity of an experiment is likely to be a major obstacle to learning much about the causal relationships you are seeking to test.

The literature refers to several different kinds of experiments. The seminal resource on experimental design is Campbell and Stanley's (1963) Experimental and Quasi-Experimental Designs for Research. Campbell and Stanley identify six designs. Three are described as “pre-experimental designs” and the other three as “true experimental designs.”

The pre-experimental designs include one-shot case study, one-group pretest–post-test, and a static-group comparison. The main problem with the first two designs, and why Campbell and Stanley refer to them as “pre-experimental” designs, is that they include no comparison groups. Recall (above) that one of the goals of the experimenter is to identify covariation between independent and dependent variables. In both the one-shot case study and the one-group pretest–post-test designs, what might be considered the independent variable is in reality a constant. That is, the independent variable is not a variable at all because it doesn't vary.

The static-group comparison does have a comparison group, but the main problem with that design is that there are no formal means to ensure that the groups are equivalent beyond the manipulation. Thus, this design has possible confounds built into it and the causal influences of the independent variable on the dependent variable cannot be established with confidence. Because none of these designs can establish a cause and effect relationship, Campbell and Stanley call them “pre-experimental” designs. Others have referred to them as “quasi-experimental” designs.

True Experimental Designs

Campbell and Stanley describe three “true” experimental designs: pretest–post-test control group design, post-test-only control group design, and the Solomon four-group design. The three designs are adequately explained elsewhere (e.g., Bradley, 2011; Campbell & Stanley, 1963; Grabe & Westley, 2003). The key features of these designs are the inclusion of (1) experimental conditions, (2) comparison groups, and (3) random assignment to conditions. Experimental conditions represent the levels of the independent variables and help to determine if there are differences in the dependent variables that are due to your conditions. The comparison group, sometimes called a control group, is important because it controls many threats to validity (e.g., history, testing, maturation) that could yield alternative explanations. Random assignment to condition also controls for alternative explanations due to individual differences of your participants.

Factorial Studies

Perhaps the most common type of experimental design in media studies research includes at least two independent variables. When these variables are crossed (meaning that each level of each independent variable occurs in combination with every level of the other independent variables) they are called factorial designs. In factorial designs, independent variables are called factors. Factors generally refer to manipulated independent variables, such as when one group of participants sees one kind of message and another group sees another kind of message. Factorial designs are expressed in the number of factors (i.e., independent variables) they have, and the number of levels of each factor. For example, a 2 × 2 factorial study has two independent variables, each with two levels. A 2 × 3 factorial study also has two independent variables, where the first independent variable has two levels and the second independent variable has three levels. By definition, each factor must have at least two levels since independent variables must vary (that's what makes them variables) even if they are nominal variables (e.g., present/not present; high/low; one medium versus another medium).

Factorial studies have several advantages compared to single independent variable designs. They permit the simultaneous analysis of two or more independent variables. This design can be more efficient than a study that has only one independent variable because it can cost less in terms of the number of participants needed and the amount of time it takes to run the study. But just as importantly, factorial designs permit the researcher to examine interactions – when the influence of an independent variable on a dependent variable may depend in part on the level of another independent variable in the design.

The simplest factorial design is a 2 × 2 design. An example of a 2 × 2 factorial design would be an experiment that asks the question “What is the impact of fear appeals and channel on message compliance?” In this case, there are two independent variables. The first is fear appeal, which can be operationalized as a message having a fear (threat) component and one that does not. The second independent variable could be operationalized as the medium format (e.g., radio/television). A graphical representation of the 2 × 2 design is shown in Figure 10.1.

Notice that this design shows four boxes. Each box represents something about the condition in which it appears: Box A represents the condition of a fear message on radio; Box B represents the condition of a fear message on television; Box C represents the no fear message on radio; and Box D represents the condition of a no fear message on television. You can put different types of information in each of these boxes, such as the stimuli for each condition, the order of the stimuli, the means and standard deviations on one of the dependent variables, and so on. A graphic like this helps to organize your study.

Suppose you wanted to add a third condition to the medium variable that represented a message presented on the Internet. You could envision a 2 × 3 factorial design (see Figure 10.2), where the first independent variable is still the fear appeal (fear/no fear), but the second independent variable (medium) now has three levels (radio/television/Internet). A fully crossed design would have six conditions. As a general rule, experimental designs that have more than three factors can be problematic, even if each factor has only two levels (Smith, Levine, Lachlan, & Fediuk, 2002) as they become harder to interpret.


Figure 10.1 A 2 × 2 factorial design


Figure 10.2 A 2 × 3 factorial design
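Because a fully crossed design simply enumerates every combination of factor levels, the cells of Figure 10.2 can be generated programmatically. The short Python sketch below uses the hypothetical fear appeal and medium factors from the example above to list the six conditions of the 2 × 3 design.

from itertools import product

fear_appeal = ["fear", "no fear"]             # factor 1: two levels
medium = ["radio", "television", "Internet"]  # factor 2: three levels
# Fully crossing the factors yields every combination (2 x 3 = 6 conditions).
for i, (appeal, channel) in enumerate(product(fear_appeal, medium), start=1):
    print(f"Condition {i}: {appeal} appeal presented on {channel}")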

Confounds

When designing an experiment, there are additional important issues to consider. Several have to do with eliminating confounds. A confound is a source of variation that is inextricably tied to one of the independent variables. A confound makes it impossible to determine what is having the observed effect on any of the dependent variables. Eliminating confounds can be accomplished, in part, by paying close attention to treatment variance among conditions, that is, to the process of operationalizing independent variables.

Often, manipulated independent variables in media studies experiments have to do with the form and/or content properties of messages. The fundamental consideration in thinking about message manipulations is to design them in such a way as to eliminate confounds. Sometimes that requires that the message be controlled to such a point that it may not represent actual media messages very well. Although external validity of messages used in experiments is important, the internal validity of message manipulations is even more crucial because, without the confidence that confounds have been avoided, there is little reason to conduct an experiment. We have already discussed how having the female anchor read five economic stories in the “unsexualized” condition and five celebrity entertainment stories in the “sexualized” condition would have introduced a confound in Grabe and Samson's (2011) study. Instead, the authors kept other message properties constant. They knew that simultaneously altering stories would have resulted in what some academics refer to as a “fatal flaw.”

Creating Treatment and Message Variance

The most common way of creating treatment variance in media studies controlled experiments is through message alteration, where the researcher either alters existing messages to create different versions (each of which represents a level of an independent variable), or first creates the messages, and then different versions of each. The point of message alteration is to change only the attribute of a message that represents a level of an independent variable while keeping the rest of the message constant. In their study on the effect of compelling negative images in TV news stories on memory for story content, Newhagen and Reeves (1992) created two versions of news stories to represent their news story independent variable – compelling negative images in news stories (absent/present). First, they collected stories about disasters (e.g., hurricane, riots, accidents, etc.). Then they created two versions. In one condition, they inserted compelling negative video footage in the middle of the story. In the other, the stories had no compelling negative video. Based on this manipulation, the researchers were able to see how the presence of the negative video impacted memory for the story.

Another issue is deciding how many messages should represent a level of a treatment. Is one enough? This is not an easy question to answer, especially since accomplished researchers vary on their positions. The benefits of including message variance in a design are discussed in detail elsewhere (Jackson, O'Keefe, Jacobs, & Brashers, 1989; Reeves & Geiger, 1994; Slater, 1991; Thorson et al., 2012). Thorson et al. explain that two kinds of variance – treatment variance and message variance – matter in experiments, and they argue that both are important in media studies research. Treatment variance refers to how the levels of each independent variable vary within the message. Often, experimenters focus primarily on treatment variance and not enough on message variance, which is created by employing multiple messages per treatment level.

Suppose you were interested in how news sources affect the perceived credibility of news stories. You might choose or write a news story and create two versions, one in which the primary source is a government official and another in which the primary source is a private citizen. In this case, you have created variance between your two treatment levels (government source versus private citizen) and held all other message features the same. This is an example of creating treatment variance through message alteration, which is common in media studies. However, this design, sometimes called a “single message design,” is problematic because it ignores message variance (Thorson et al., 2012). That is, any effect you might obtain with this message alteration may be dependent in part on other features of the message that were held constant. Suppose your news story was about the effort of the local government to levy an additional sales tax. It is possible that the source effect you found may only occur in stories that focus on taxes, on economic issues, on financial concerns, or on the gender of the particular sources you used, and so on. It is impossible to determine if the observed effect is unique to that story. One way to reduce this problem is to create message variance. Employing multiple messages per treatment level strengthens the ability to generalize to message categories.

In addition to message alteration, one can create treatment variance by employing a sample of messages so each level of an independent variable is represented by several messages. When selecting a sample of messages to represent each treatment level, the researcher would ideally sample from a knowable message population that contains all of the messages that represent a particular treatment. Rarely, however, do we have access to such a population of messages. Often, the best we can do is find multiple messages that represent one treatment, where the variance within treatment would ideally be less than the variance between treatments. Given that messages vary on a host of features other than the ones of interest, many of which we cannot know, Thorson et al. (2012) and others (e.g., Reeves & Geiger, 1994) suggest that the best strategy is to include as many messages per treatment level as participants can reasonably be expected to attend to without fatigue or boredom. The more messages naturally vary within a treatment, the larger the number of messages per treatment that is needed. This strategy creates both treatment and message variance.
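As a rough illustration of this strategy, the Python sketch below draws an equal number of messages for each treatment level from a larger pool of pretested messages. The pool, the emotion labels, and the number of messages per level are all hypothetical.

import random

# Hypothetical pool of pretested PSAs: message ID mapped to its dominant emotion from pretest ratings.
pretested_pool = {
    "psa_01": "fear", "psa_02": "fear", "psa_03": "fear", "psa_04": "fear",
    "psa_05": "sadness", "psa_06": "sadness", "psa_07": "sadness", "psa_08": "sadness",
    "psa_09": "joy", "psa_10": "joy", "psa_11": "joy", "psa_12": "joy",
}

def sample_messages(pool, per_level, seed=None):
    # Group messages by treatment level, then draw the same number from each group.
    rng = random.Random(seed)
    by_level = {}
    for message, level in pool.items():
        by_level.setdefault(level, []).append(message)
    return {level: rng.sample(messages, per_level) for level, messages in by_level.items()}

stimuli = sample_messages(pretested_pool, per_level=3, seed=7)
print(stimuli)  # e.g., three PSAs each for fear, sadness, and joy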

Lee and Lang (2009) provide such an example. They tested the effects of five discrete emotional public service announcements (PSAs) – fear, sadness, joy, anger, neutral – on several cognitive and emotional dependent variables, including self-report and psychophysiological measures. Based on pretest ratings, Lee and Lang selected 30 PSAs to represent the five-level discrete emotion independent variable. Six different messages represented each of the five conditions. Note that this message repetition procedure created both treatment and message variance. Clearly, messages within a single treatment varied widely, but within-treatment message variance was treated as random error in tests across treatments. The significant findings of their study occurred despite this within-treatment message variance. This approach is becoming more common in media studies, especially as continuous response and psychophysiological measures spread across research laboratories in journalism, communication, and media studies programs.

Another option for creating both treatment and message variance is to employ both message alteration and message repetition strategies. An example of such a study is a 3 × 2 experiment in which Miller (2006) manipulated story type (three levels: live, breaking, traditional) and emotional message content (two levels: threat, negative graphic images). She created treatment variance for story type with message alteration by having an anchor verbally refer to a story as “live” or as “breaking,” or by including no verbal designation for the “traditional” story. To create treatment variance for emotional content, three stories were chosen to represent the threat condition and three different stories were chosen to represent the negative graphic image condition. Miller used a fractional design, such that each participant saw six stories that represented the six combinations of the two independent variables (traditional–threat, traditional–graphic image, live–threat, live–graphic image, breaking–threat, breaking–graphic image, presented in random orders). No participant saw two versions of the same story. By including a message repetition factor for the emotional content treatment, Miller was better able to generalize her results to other similar emotional news stories than if she had relied on only one story per condition. Although message repetitions may require more work on the part of the researcher, their inclusion in an experimental research design greatly strengthens what can be learned from a study.

Operationalizing Independent Variables

There have been discussions in the literature on how to characterize the structure of media stimuli, as summarized in O'Keefe (2003), Tao and Bucy (2007), and Thorson et al. (2012). Many scholars define media stimuli in “industry units” (e.g., commercials, news stories). There are good reasons for such a choice. Much of our research interests involve how people respond to news, ads, programs, stories, and other such journalism and advertising products. Other scholars suggest it is often more useful to describe media messages in terms of variables more closely related to how messages are psychologically processed (Reeves, 1989). This approach emphasizes form and content message attributes such as visual complexity, brightness, contrast, movement of objects on the screen, story length, tone, frame, arousing content, and so on. This distinction becomes increasingly important if research questions focus on the impact of specific formal (e.g., pacing, type size, color, movement) or content features (e.g., emotional content, bias, sources, attacks, acclaims).

One common mistake media studies students new to experimental research make is confounding message attributes with their intended effects. The classic example is research on “fear appeals.” How fear appeals work to change individuals' knowledge, attitudes, and behaviors about health threats has been examined with experimental research since the mid-1950s. It is tempting to define a fear appeal message in terms of the effect it has on a participant (i.e., a message that scares someone). Although frightening an audience member may be the goal of a fear appeal, that definition does nothing to tell us how to create one. A better approach to operationalizing a fear appeal is to decide which message features constitute such an appeal. Perhaps the most common definition of a fear appeal is as message content that contains a threat to the well-being of target individuals (Dillard, 1994). Note that in practice, “fear appeals are executed by writing copy in a way that directly associates the targeted behavior (e.g., tobacco use) with a threat (e.g., disease, death)” (Leshner, Bolls, & Wise, 2011, p. 79). Note that the content described here is independent of how an individual feels upon exposure.

Sundar (2004) provides another example with the concept of interactivity. He points out that if interactivity is conceptualized as a feature of the physical stimulus, then it is a mistake to operationalize it in terms of people's perception of interactivity. Instead, it is necessary to develop a theory of what it is about a stimulus that makes it interactive (e.g., features like functionality, user accommodations, organization, control, choice, and contingencies). These features are physical, not psychologically perceived independent variables.

Operationalizing Dependent Variables

Dependent variables in media studies experiments often share properties with those measured in survey research. Most often, dependent variables are continuous measures, meaning that they are measured at the ordinal, interval, or ratio level. These include Likert scales, semantic differential scales, and magnitude response scales (e.g., seven-point scales anchored by “not at all” and “a lot,” or “never” and “always”). Such scales have been used to measure attitudes, opinions, and behaviors. Researchers can compute the means and standard deviations across conditions to see if being in a condition affects those responses. Knowledge and/or memory can be measured by factual tests (e.g., series of multiple choice items summed to create a “knowledge” score for each participant). Other types of self-report measures include cued and free recall, thought listing, thinking aloud, and so on. Self-report dependent variables can have problems associated with them, such as social desirability (Crowne & Marlowe, 1960), demand characteristics (Orne, 1969), and satisficing (e.g., Krosnick, Narayan, & Smith, 1996).

As more media studies research investigates how individuals process media messages, researchers are adopting dependent variables used in psychology and other fields. Many of these are implicit measures, meaning that they are less susceptible than self-report variables to the threats to validity mentioned above. These can include secondary task reaction time (STRT), a response latency measure (how long it takes the participant to respond) (Geiger & Newhagen, 1993), speeded recognition tasks, and psychophysiological measures (e.g., heart rate, skin conductance, facial electromyography, startle) (Potter & Bolls, 2011). These measures can require sophisticated equipment and software as well as considerable training and theoretical understanding. Even so, laboratories that can accommodate these types of measures are becoming more common in journalism, communication, and mass communication programs. There are currently upward of two dozen psychophysiological research laboratories in the United States supervised by mass communication researchers, and several more outside the United States.

Between or Within Subjects?

A major decision that needs to be made in any experiment in media studies is deciding what participants will see, hear, or read. Specifically, would each participant see stimuli that represent only one level of an independent variable (between subjects), all levels of all independent variables (within subjects), or some levels of some independent variables (mixed)? The most common strategy in media studies is the between-subjects design, where participants are randomly assigned to only one level of an independent variable. The within-subjects design is a repeated-measures design where participants see all levels of all independent variables. The third option is having some independent variables run between subjects and some independent variables run within subjects.

Between-Subjects Designs

In between-subjects designs each participant is randomly assigned to a condition. Comparisons can be made across conditions on the values of each dependent variable. The advantages of between-subjects designs are that procedures are generally easier and you don't have to worry about participants figuring out your manipulations. Also, treatment orders are not considerations in between-subjects designs. Since each group is exposed to different treatment levels, the mean scores (systematic variance) on each dependent variable are compared relative to their within-group variation (error variance).

An example of a between-subjects design is the previously mentioned fear appeal study by Janis and Feshbach (1953). The researchers developed three versions of a 15-minute illustrated talk about the causes of tooth decay and oral hygiene recommendations. Included in the talk were 20 slides illustrating points being made by the speaker. All three versions presented essentially the same content, but they varied in the amount of fear-arousing material presented: strong fear, moderate fear, or minimal fear. Participants were high school students who were assigned either to one of the three fear conditions or to a fourth, control group that saw a presentation on a different topic. The design permits the comparison of scores on dependent variables across the four groups (or across the three fear groups on measures that could only be asked of participants who saw one of the fear messages). Since participants were not exposed to other conditions, Janis and Feshbach did not have to worry about threats to validity that may occur when participants are exposed to multiple levels of an independent variable, such as sensitization and carry-over effects.
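To see how the comparison of between-group and within-group variance plays out analytically, the sketch below runs a one-way analysis of variance in Python with SciPy. The scores are invented for illustration (they are not Janis and Feshbach's data) and stand in for a dependent measure collected in three between-subjects fear conditions.

from scipy import stats
# Hypothetical ratings (e.g., 7-point compliance scores) from participants
# randomly assigned to one of three fear conditions.
strong_fear = [5, 6, 4, 5, 6, 5, 4, 6]
moderate_fear = [4, 5, 5, 4, 3, 5, 4, 4]
minimal_fear = [3, 4, 2, 3, 4, 3, 3, 2]
# One-way ANOVA: is between-condition variance large relative to within-condition variance?
f_statistic, p_value = stats.f_oneway(strong_fear, moderate_fear, minimal_fear)
print(f"F = {f_statistic:.2f}, p = {p_value:.3f}")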

Within-Subjects Designs

Within-subjects designs are less common in media studies, although the frequency with which they appear in media-related journals is increasing, in part due to the growth of media psychology research. In within-subjects designs, each participant receives all of the manipulations, regardless of how many independent variables and levels of each there are. The main advantage of within-subjects designs is that each participant acts as his or her own control, meaning that the amount of error variance per condition is greatly reduced. The consequence of reduced error variance is that the same size effect can be detected with many fewer participants than in a between-subjects design. This means that, all things being equal (the same number of conditions), within-subjects designs can be considerably less expensive to conduct in terms of both time (experimenter and participant) and money. However, within-subjects designs can have a major drawback. Since every participant experiences all levels of all independent variables, it is possible (even likely in many situations) that they could be sensitized to the manipulations, which could in turn impact their responses. Further, within-subjects designs are susceptible to carry-over effects, which means that the induction of one treatment impacts the induction of the next. It is very difficult to eliminate carry-over effects, if they occur, in within-subjects designs. The most common strategy to deal with possible carry-over effects is to randomize the presentation order of stimuli to participants. Although carry-over effects are not eliminated, this approach treats the impact of presentation order as error.
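One simple way to implement this strategy is to generate an independently randomized presentation order for each participant, as in the hypothetical Python sketch below. Seeding the random number generator with the participant ID keeps each order reproducible for auditing while still varying order across participants; the stimulus labels are invented.

import random
stimuli = ["fear_psa_1", "fear_psa_2", "joy_psa_1", "joy_psa_2", "neutral_psa_1", "neutral_psa_2"]
def presentation_order(participant_id, stimuli):
    # Copy the master list, then shuffle it with a participant-specific seed.
    rng = random.Random(participant_id)
    order = list(stimuli)
    rng.shuffle(order)
    return order
for participant_id in (101, 102, 103):  # hypothetical participant IDs
    print(participant_id, presentation_order(participant_id, stimuli))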

Sample Sizes

An important decision to be made is the size of the sample needed to conduct an experiment. This consideration is primarily informed by the size of the effect to be detected, the number of factors and levels of each, whether factors are run within subjects or between subjects, the type of statistical analyses to be conducted (e.g., ANOVA (analysis of variance), regression), and the statistical criterion for rejecting the null hypothesis.

In order to determine sample size, you will need to have an a priori idea of how large an effect you wish to detect. You can estimate effect size by reviewing literature and seeing what effect sizes others have found in studies that used variables similar to yours. If insufficient information from prior literature exists, then you may need to estimate effect size either by reasoning from theory or by using rules of thumb. Cohen (1988, 1992) categorizes effects as “small,” “medium,” and “large.” Clearly, the choice of such a crude classification will depend on the specifics of the research. For example, we might expect to find a “medium” sized effect when doing a study on priming, but a “small” sized effect if measuring heart rate. Absent any convincing evidence for a priori effect size estimates, most researchers opt for expecting a “medium” sized effect. Once you have chosen the effect size you wish to detect, you will need to determine which independent variables you wish to run between subjects and which you wish to run within subjects. For within-subjects independent variables, you may need to estimate the correlation among the repeated measures. The statistical tests you wish to run to test your hypotheses will be determined by your design. Finally, your criterion for rejecting the null, also called type I error, is set by convention at α = .05 (admittedly this level is arbitrary).

There are several power analysis software packages you can use to compute required sample sizes a priori, some of which are free (e.g., Faul, Erdfelder, Lang, & Buchner, 2007). There are also rules of thumb that can be used (e.g., VanVoorhis & Morgan, 2007), although this approach can be less precise than a power analysis program, tables, or computation. Cohen (1988), for example, provides power tables for multiple statistical tests and research designs.
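As an illustration of what such a computation looks like, the sketch below uses the power module of the Python package statsmodels to solve for the per-group sample size needed to detect a “medium” standardized mean difference (Cohen's d = .50) between two between-subjects conditions at α = .05 with power of .80. It is offered as a sketch under those assumptions, not as a substitute for G*Power or Cohen's tables.

from statsmodels.stats.power import TTestIndPower
# A priori sample size for comparing two independent (between-subjects) groups:
# "medium" effect (Cohen's d = .50), alpha = .05, desired power = .80, equal group sizes.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.50, alpha=0.05, power=0.80, ratio=1.0)
print(f"Approximately {n_per_group:.0f} participants per condition")  # about 64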

Sample Composition

There has been considerable discussion in journals of whether it is necessary to randomly sample experimental participants from a targeted population (Abelman, 1996; Basil, 1996; Basil, Brown, & Bocarnea, 2002; Courtright, 1996; Lang, 1996; Potter, Cooper, & Dupagne, 1993, 1995; Sparks, 1995a, 1995b). This issue feeds into a larger theoretical one, which is, to what do we wish to generalize (Shapiro, 2002)?

When researchers randomly sample from a population, they can infer from findings in the sample to the particular population from which the sample was drawn. For example, if a researcher obtains a random sample of people in a county to study support for a local ballot initiative, the percentage of people who respond positively can be used to estimate the actual support in the county's population, within the sample's sampling error. Random sampling enables statistical generalization from sample attributes to population attributes.

Samples for experiments in media studies, however, rarely involve random samples. In fact, one does not have to read many reports of experiments in mass communication to recognize that almost none employs samples randomly selected from populations. Instead, experimental researchers typically acquire convenience samples (e.g., college students enrolled in large communication classes, or adults who agree to participate in an experiment for a chance to win a retail store gift card). These individuals are then randomly assigned to the conditions in the experiment. Because there is no random sampling of participants, statistical inferences cannot be made about whether the values found in the experimental sample represent the values that would be found in the population as a whole. Instead, logical inferences are made about the multivariate relationships among the variables in the experiment (Basil, 1996; Basil et al., 2002). Lang (1996) explains that the “population” in experiments is all the possible samples one could randomly draw from the group of individuals in the experiment.

The strength of those logical inferences depends on how well the experimenter can make the argument for them, which should be based on a combination of theoretical and conceptual considerations. For example, suppose a study examined the relationship between exposure to emotional media and psychophysiological processes that are unlikely to be affected by demographics or other individual difference variables. Then generalizing to “human processing” could be argued since those processes are expected to be similar across samples with different characteristics. On the other hand, if an experiment involves variables clearly related to demographics or individual differences, then generalization to people who do not share those characteristics would be unconvincing. Often the generalizability of experimental findings is also supported by replication of the experiment across different samples of both people and messages and across different contexts. A good example of a finding that seems to be robust regardless of when or where it is tested is the third-person effect.

In his cogent discussion about the differences between social and surface realism, Shapiro (2002) argues that generalizability is inextricably linked to the theoretical principles that underlie our controlled observations (in the case of experiments). Thus, the focus on what we can learn about the processes involved in much media studies research should be more on the theoretical relationships among variables and less on surface similarities, such as random sampling or the operationalizations of variables. Basil et al. (2002) showed that multivariate relationships can be stable across samples even when the samples differ greatly on demographic variables and the data collection method. Basil and his colleagues conducted a study of celebrity effects around the time of the death of Diana, princess of Wales, with three different samples: a random telephone sample drawn from seven states, a nonrandom sample of college students in three states, and a nonrandom web-based survey sample. The samples varied widely in terms of age, gender, and media use, yet Basil and his colleagues found a remarkably consistent relationship across the different samples between respondents' identification with Diana and their media use. Thorson et al. (2012) recommend that sample characteristics and selection methods should be reported so the reader can evaluate claims of generalizability.

Manipulation Checks

Manipulation checks are traditionally included in media studies research, especially in studies in which a message manipulation represents levels of an independent variable. For example, some fear appeal researchers will manipulate the level of threat in a message, then measure participants' perceived threat or fear arousal in order to be certain that the message manipulation performed the way in which it was intended. However, this approach has come under considerable criticism (e.g., O'Keefe, 2003; Tao & Bucy, 2007; Thorson et al., 2012). The criticism is based on three arguments: (1) message manipulations are independently verifiable, making such checks unnecessary; (2) manipulation checks conflate message manipulations with their effects; and (3) measures treated as manipulation checks could instead be treated as mediating psychological states.

The first argument can be illustrated with a study reported by Lang, Schwartz, Chung, and Lee (2004). They defined pacing of videos, which they hypothesized would impact message processing, as the number of scene changes per unit of time in public service announcements. That is, PSAs with few scene changes represented “slow” pacing and PSAs with many scene changes represented “fast” pacing (they included a third condition, where a moderate number of scene changes represented “medium” pacing). The average number of scene changes per condition is verifiable – either pacing (the rate of scene changes) varied across the three conditions or it didn't. Asking participants their perception of pacing as a check on actual pacing is unnecessary.
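A manipulation of this sort can be verified directly from the stimuli. The brief Python sketch below, using invented scene-change counts, simply reports the mean number of scene changes per pacing condition; no participant judgments are required.

# Hypothetical scene-change counts for the PSAs in each pacing condition.
scene_changes = {
    "slow": [2, 3, 2, 3, 2],
    "medium": [6, 7, 6, 5, 7],
    "fast": [11, 12, 10, 13, 11],
}
for condition, counts in scene_changes.items():
    print(condition, sum(counts) / len(counts))  # mean scene changes per PSA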

The second argument can be demonstrated by the fear appeal example, in which perceived threat (important in a number of fear appeal processing models) and/or fear arousal are measured and treated as checks that the message manipulation has worked as expected. In research where the theoretical interest is on the message attributes that increase fear, such a manipulation check is not measuring the message feature; rather it is measuring the effect the manipulation has. In other words, this approach defines the media content by its presumed effects and not by the particular message features of theoretical interest. O'Keefe (2003) specifically argues that conceptualizing media content according to concrete message properties that can be manipulated during message production provides more theoretical and practical value for media processes and effects research than defining it according to how it impacts participants.

The third argument involves not so much the measures themselves as how those measures are handled during data analyses. O'Keefe (2003) argues, for example, that if messages are manipulated based on the intensity of the health threat (e.g., the threat can reduce the ability to climb stairs versus the threat can kill you) and perceived threat is measured, then perceived threat could be thought of as a mediating psychological state, and therefore should be included as a mediator in subsequent analyses. Too often, researchers will present perceived threat as a manipulation check rather than include it analytically in a mediation model. Treating such measures as mediators would add considerable value to our theoretical models.

To sum up, O'Keefe argues that manipulation checks, as such, are unnecessary:

From my vantage point, a researcher who reports a message manipulation check has probably made one of two mistakes – either the mistake of defining a message variable in terms of effects rather than in terms of intrinsic properties or the mistake of confusing the assessment of a potential mediating state with the description of message properties. (2003, p. 268)

Summary

The controlled experiment is a valuable, if underutilized, research method in media studies. It is the best way to show a causal relationship between variables because it can set up a temporal order of events (independent variables occur prior to the measurement of dependent variables) and eliminate alternative explanations by eliminating confounds. The key aspects of a controlled experiment include manipulations of independent variables, random assignment to conditions, and controlling all other variables. When variations in message properties represent different levels of independent variables, manipulations can be accomplished through message alteration or message sampling. Message sampling can also increase generalizability of the findings to other similar messages. Many experiments in media studies are factorial designs, where more than one independent variable is systematically varied. These designs allow the examination of interactions in addition to main effects. Samples based on college students should not be discounted out of hand. There are cogent arguments against manipulation checks, although measures of perceived message effects can be theoretically important as mediating variables.

REFERENCES

Abelman, R. (1996). Can we generalize from Generation X? Not! Journal of Broadcasting & Electronic Media, 40, 441–446.

Basil, M. D. (1996). The use of student samples in communication research. Journal of Broadcasting & Electronic Media, 40, 431–440.

Basil, M. D., Brown, W. J., & Bocarnea, M. C. (2002). Differences in univariate values versus multivariate relationships: Findings from a study of Diana, Princess of Wales. Human Communication Research, 28(4), 501–514.

Bradley, S. D. (2011). Experiment. In S. Zhou & W. D. Sloan (Eds.), Research methods in communication (pp. 161–180). Northport, AL: Vision.

Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research. Boston, MA: Houghton Mifflin.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155–159.

Courtright, J. A. (1996). Rationally thinking about nonprobability. Journal of Broadcasting & Electronic Media, 40, 414–421.

Crowne, D. P., & Marlowe, D. (1960). A new scale of social desirability independent of psychopathology. Journal of Consulting Psychology, 24, 349–354.

Dillard, J. (1994). Rethinking the study of fear appeals: An emotional perspective. Communication Theory, 4, 295–323.

Faul, F., Erdfelder, E., Lang, A. G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175–191.

Geiger, S., & Newhagen, J. (1993). Revealing the black box: Information processing and media effects. Journal of Communication, 43(4), 42–50.

Grabe, M. E., & Samson, L. (2011). Sexual cues emanating from the anchorette chair: Implications for perceived professionalism, fitness for beat, and memory for news. Communication Research, 38, 471–496.

Grabe, M. E., & Westley, B. (2003). The controlled experiment. In G. H. Stempel, III, D. H. Weaver, & G. C. Wilhoit (Eds.), Mass communication research and theory (pp. 267–298). Boston, MA: Allyn & Bacon.

Jackson, S., O'Keefe, D. J., Jacobs, S., & Brashers, D. E. (1989). Messages as replications: Toward a message-centered design strategy. Communication Monographs, 56, 364–384.

Janis, I. L., & Feshbach, S. (1953). Effects of fear-arousing communications. Journal of Abnormal and Social Psychology, 48(1), 78–92.

Krosnick, J. A., Narayan, S. S., & Smith, W. R. (1996). Satisficing in surveys: Initial evidence. In M. T. Braverman & J. K. Slater (Eds.), Advances in survey research (pp. 29–44). San Francisco, CA: Jossey-Bass.

Lang, A. (1996). The logic of using inferential statistics with experimental data from nonprobability samples: Inspired by Cooper, Dupagne, Potter, and Sparks. Journal of Broadcasting & Electronic Media, 40, 422–430.

Lang, A., Schwartz, N., Chung, Y., & Lee, S. (2004). Processing substance abuse messages: Production pacing, arousing content, and age. Journal of Broadcasting & Electronic Media, 48, 61–88.

Lee, S., & Lang, A. (2009). Discrete emotion and motivation: Relative activation in the appetitive and aversive motivational systems as a function of anger, sadness, fear, and joy during televised information campaigns. Media Psychology, 12, 148–170.

Leshner, G., Bolls, P. D., & Wise, K. (2011). Motivated processing of fear appeal and disgust images in televised anti-tobacco ads. Journal of Media Psychology, 23(2), 77–89.

Miller, A. (2006). Watching viewers watch TV: Processing live, breaking, and emotional news in a naturalistic setting. Journalism & Mass Communication Quarterly, 83, 511–529.

Newhagen, J. E., & Reeves, B. (1992). The evening's bad news: Effects of compelling negative television news images on memory. Journal of Communication, 42(2), 25–41.

O'Keefe, D. J. (2003). Message properties, mediating states, and manipulation checks: Claims, evidence, and data analysis in experimental persuasive message effects research. Communication Theory, 13(3), 251–274.

Orne, M. T. (1969). Demand characteristics and the concept of quasi-controls. In R. Rosenthal & R. L. Rosnow (Eds.), Artifact in behavioral research (pp. 143–179). New York, NY: Academic Press.

Potter, R. F., & Bolls, P. D. (2011). Psychophysiological measurement and meaning: Cognitive and emotional processing of media. New York, NY: Routledge.

Potter, W. J., Cooper, R., & Dupagne, M. (1993). The three paradigms of mass media research in mainstream communication journals. Communication Theory, 3(4), 317–335.

Potter, W. J., Cooper, R., & Dupagne, M. (1995). Is media research prescientific? Reply to Sparks's critique. Communication Theory, 5(3), 280–286.

Reeves, B. (1989). Theories about news and theories about cognition: Arguments for a more radical separation. American Behavioral Scientist, 33(2), 191–198.

Reeves, B., & Geiger, S. (1994). Designing experiments that assess psychological responses. In A. Lang (Ed.), Measuring psychological responses to media (pp. 165–180). Hillsdale, NJ: Lawrence Erlbaum.

Shapiro, M. (2002). Generalizability in communication research. Human Communication Research, 28(4), 491–500.

Slater, M. D. (1991). Use of message stimuli in mass communication experiments: A methodological assessment and discussion. Journalism Quarterly, 68(3), 412–421.

Smith, R. A., Levine, T. R., Lachlan, K. A., & Fediuk, T. A. (2002). The high cost of complexity in experimental design and data analysis. Human Communication Research, 28(4), 515–530.

Sparks, G. G. (1995a). Is media research prescientific? Comments concerning the claim that mass media research is “prescientific”: A response to Potter, Cooper, and Dupagne. Communication Theory, 5(3), 273–280.

Sparks, G. G. (1995b). Is media research prescientific? A final reply to Potter, Cooper, and Dupagne. Communication Theory, 5(3), 286–289.

Sundar, S. S. (2004). Theorizing interactivity's effects. Information Society, 20, 385–389.

Tao, C., & Bucy, E. P. (2007). Conceptualizing media stimuli in experimental research: Psychological versus attribute-based definitions. Human Communication Research, 33, 397–426.

Thorson, E., Wicks, R., & Leshner, G. (2012). Experimental methodology in journalism and mass communication research. Journalism & Mass Communication Quarterly, 89(1), 112–124.

VanVoorhis, C. R. W., & Morgan, B. L. (2007). Understanding power and rules of thumb for determining sample sizes. Tutorials in Quantitative Methods for Psychology, 3(2), 43–50.
