Gathering and Entering Data

The Questionnaire

Assume that you conduct a study in which the participants are 50 college students, each of whom is currently involved in a romantic relationship. Each participant completes a 24-item questionnaire designed to assess the six investment model constructs: commitment, satisfaction, rewards, costs, investment size, and alternative value. (One construct, satisfaction, is not included in the version of the investment model to be tested in this study and is therefore not used in any of the analyses in this chapter.)

Each investment model construct is assessed with four questions. For example, the four items that assess the “commitment” construct are reproduced here:

21. How committed are you to your current relationship?
  
Not at all   1 2 3 4 5 6 7 8 9   Extremely
 committed                       committed

22. How long do you intend to remain in your current relationship?
  
    A very   1 2 3 4 5 6 7 8 9   A very
short time                       long time

23. How often do you think about breaking up with your current partner?
  
Frequently   1 2 3 4 5 6 7 8 9   Never

24. How likely is it that you will maintain your current relationship for a long time?
  
Extremely   1 2 3 4 5 6 7 8 9   Extremely
 unlikely                       likely


Notice that, with each item, selecting a higher response number (such as 8 or 9) indicates a higher level of commitment, whereas selecting a lower response number (such as 1 or 2) indicates a lower level of commitment. Participants’ responses to these four items are summed to arrive at a single score that reflects their overall level of commitment. This summed score will serve as the commitment variable in your analyses. Scores on this variable can range from 4 (4 × 1) to 36 (4 × 9), with higher values indicating higher levels of commitment.

Each remaining investment model construct is assessed in the same way: responses to four survey items are summed to create a single overall measure of the construct. With each variable, scores can range from 4 to 36, and higher scores indicate higher levels of the construct being assessed.

Entering the Data

Analyzing Raw Data

In practice, the easiest way to create these summed scores would involve entering the raw data and then using data manipulation statements in the SAS program to create the summed scores. Raw data in this sense refers to participants’ responses to the 24 items on the questionnaire. You could prepare a single SAS program that would input these raw data, create new variables (overall scale scores) from existing variables, and perform the regression analyses on the new variables.

For example, imagine that you entered the raw data so that there is one line of data for each participant, and this line contains the participant’s responses to questions 1 through 24. Assume that items 1 through 4 assessed the rewards construct, items 5 through 8 assessed the costs construct, and so forth. The following program would input these data and create the necessary composite scores:

 1      DATA D1;
 2         INPUT   #1   @1   (Q1-Q24)   (1.)   ;
 3
 4            REWARD        = Q1  + Q2  + Q3  + Q4;
 5            COST          = Q5  + Q6  + Q7  + Q8;
 6            INVESTMENT    = Q9  + Q10 + Q11 + Q12;
 7            ALTERNATIVES  = Q13 + Q14 + Q15 + Q16;
 8            SATISFACTION  = Q17 + Q18 + Q19 + Q20;
 9            COMMITMENT    = Q21 + Q22 + Q23 + Q24;
10
11            IF REWARD NE . AND COST NE . AND INVESTMENT NE . AND
12               ALTERNATIVES NE . AND COMMITMENT NE . ;
13
14      DATALINES;
15      343434346465364735748234
16      867565768654544354865767
17      434325524243536435366355
18      .
19      .
20      .
21      874374848763747667677467
22      534232343433232343524253
23      979876768968789796868688
24      ;
25      RUN;
26
27      Place PROC statements here

Lines 1 and 2 of the preceding program tell the system that there is one line of data for each participant, variables Q1 through Q24 begin at column 1, and each variable is one column wide.

Line 4 of the program tells the system to create a new variable called REWARD, and participants’ scores on REWARD should be equal to the sum of their scores on variables Q1, Q2, Q3, and Q4 (questionnaire items 1 to 4). Lines 5 through 9 of the program create the study’s remaining variables in the same way.

Lines 11 and 12 contain a subsetting IF statement that eliminates from the dataset any participant with missing data on any of the five variables to be analyzed. This ensures that each analysis is performed on exactly the same participants. You do not have to worry that some analyses may be based on 50 participants while others are performed on only 45 due to missing data. (Chapter 4 discusses the use of subsetting IF statements.)

Notice that SATISFACTION (the satisfaction variable) is not listed in this subsetting IF statement. This is because SATISFACTION is not included in any of the analyses (i.e., SATISFACTION will not be listed as a predictor variable or a criterion variable in any analysis). Including SATISFACTION in the subsetting IF statement might have unnecessarily eliminated some participants from your dataset. For example, if a participant named Luc Dion had missing data on SATISFACTION, and SATISFACTION were in fact included in this subsetting IF statement, then Luc Dion would have been dropped from the dataset and each analysis would have been based on 49 observations rather than 50. But dropping Luc Dion in this case would have been pointless, because SATISFACTION is not going to be used as a predictor or criterion variable in any analysis. To summarize, only variables that are to be used in your analyses should be listed in the subsetting IF statement that eliminates participants with missing data.
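The effect of leaving SATISFACTION out of the subsetting IF statement can be seen in the following minimal sketch. The dataset name DEMO and the three data lines are made up for illustration (they are not the Appendix B data). The NMISS function is used here only as a compact stand-in for the chained NE comparisons and should behave the same way for ordinary missing values; it is not the form used in this chapter's programs.

```sas
DATA DEMO;
   /* Hypothetical scores, entered with simple list input */
   INPUT COMMITMENT SATISFACTION REWARD COST INVESTMENT ALTERNATIVES;

   /* Keep a participant only if none of the five analysis   */
   /* variables is missing. SATISFACTION is deliberately     */
   /* left out of the list.                                  */
   IF NMISS(REWARD, COST, INVESTMENT, ALTERNATIVES, COMMITMENT) = 0;
DATALINES;
34 30 25 13 25 12
32  . 27 14 32 13
34 23 24  . 30 14
;
RUN;

PROC PRINT DATA=DEMO;
RUN;
```

In this sketch, the second participant (missing only on SATISFACTION) is retained, whereas the third participant (missing on COST) is dropped, so PROC PRINT should list two observations.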

The DATALINES statement and some fictitious data are presented on lines 14 to 24. Line 27 shows where the PROC statements (discussed later) would be placed in the program.
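Before running any analyses, it can be worth confirming that the composite scores fall within the expected range of 4 to 36. The following is only a sketch of one such check, not one of the analyses discussed later: the dataset name CHECK is hypothetical, and only the first three fictitious data lines from the preceding program are read.

```sas
DATA CHECK;
   /* Read 24 one-column item responses starting in column 1 */
   INPUT @1 (Q1-Q24) (1.);

   /* Composite scores for the five variables to be analyzed */
   REWARD       = Q1  + Q2  + Q3  + Q4;
   COST         = Q5  + Q6  + Q7  + Q8;
   INVESTMENT   = Q9  + Q10 + Q11 + Q12;
   ALTERNATIVES = Q13 + Q14 + Q15 + Q16;
   COMMITMENT   = Q21 + Q22 + Q23 + Q24;
DATALINES;
343434346465364735748234
867565768654544354865767
434325524243536435366355
;
RUN;

/* N and NMISS show how many scores are present or missing; */
/* MIN and MAX should lie between 4 and 36.                 */
PROC MEANS DATA=CHECK N NMISS MIN MAX;
   VAR REWARD COST INVESTMENT ALTERNATIVES COMMITMENT;
RUN;
```

A minimum below 4 or a maximum above 36 on any variable would suggest a keying error in the raw data or a mistake in the statements that create the composites.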

How This Program Handles Missing Data When Creating Scale Scores

With the preceding program, if a participant left blank any one of the four questions that constitute a scale, that participant will receive a score of “missing” (.) on that scale. For example, if Luc Dion completes items 1, 2, and 3 of the rewards scale but leaves item 4 blank, he is assigned a missing value on REWARD. If the dataset were printed using PROC PRINT, his value on REWARD would appear as “.”.

Consider the problems that would arise if the system did not assign missing values in this way. What if the system simply summed each participant’s responses to the items that he or she did complete, even when some had been left blank? Remember that Luc Dion completed only 3 items on the 4-item rewards scale. In his case, the highest score he could possibly receive on REWARD would be 27. (This is because each item uses a 9-point response scale, and 3 × 9 = 27.) For those participants who completed all 4 items in the rewards scale, the highest score possible is 36 (because 4 × 9 = 36). Under these circumstances, the scores from participants with incomplete responses would not be directly comparable to scores from those with complete data. It is therefore best that participants with incomplete responses be assigned missing values when scale scores are computed.
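The contrast between these two ways of handling a blank item can be illustrated with a minimal sketch. The dataset name DEMO, the variable names REWARD_PLUS, REWARD_SUM, and N_BLANK, and the two data lines are all hypothetical. The sketch relies on the fact that the + operator returns a missing value when any operand is missing, whereas the SUM function adds only the nonmissing values.

```sas
DATA DEMO;
   /* Four one-column item responses; a period marks a blank item */
   INPUT @1 (Q1-Q4) (1.);

   /* Approach used in this chapter: missing if any item is missing */
   REWARD_PLUS = Q1 + Q2 + Q3 + Q4;

   /* Alternative shown for contrast: adds only the answered items */
   REWARD_SUM  = SUM(OF Q1-Q4);

   /* Number of items left blank */
   N_BLANK     = NMISS(OF Q1-Q4);
DATALINES;
9999
999.
;
RUN;

PROC PRINT DATA=DEMO;
RUN;
```

For the second data line, REWARD_PLUS should be missing while REWARD_SUM should be 27, which is the non-comparable score described above. This is why the chapter's program builds the composites with the + operator rather than the SUM function.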

Analyzing Pre-Calculated Scale Scores

A second option is to compute each participant’s score on each of the six constructs by hand and then prepare a SAS program to analyze these pre-calculated scores. In other words, you could review the questionnaire completed by participant 1, sum the responses to items 1 through 4 (perhaps by using a hand calculator), record this sum on a summary sheet, repeat this process for the remaining scales, and repeat the entire process for all remaining participants.

Later, the summed scores could be entered as data in a SAS program. This dataset would consist of 50 lines of data (one line for each participant). Instead of raw data, a given participant’s summed score on each of the six constructs would appear on his or her data line. Scores on the commitment variable might be keyed in columns 1 to 2, scores on the satisfaction variable might be keyed in columns 4 to 5, and so forth. The guide used in entering these data is presented here:

Line   Column   Variable Name   Explanation
1      1-2      COMMITMENT      Scores on commitment variable.
       3        blank
       4-5      SATISFACTION    Scores on the satisfaction variable.
       6        blank
       7-8      REWARD          Scores on rewards variable.
       9        blank
       10-11    COST            Scores on costs variable.
       12       blank
       13-14    INVESTMENT      Scores on investment size variable.
       15       blank
       16-17    ALTERNATIVES    Scores on the alternative value variable.

Fictitious data from the 50 participants in your study have been entered according to these guidelines and are presented in Appendix B. The analyses discussed in this chapter are performed on this dataset.

Following is the SAS program that will input these data. Lines 9 and 10 again include a subsetting IF statement that would eliminate from the dataset any participant with missing data on any of the variables to be analyzed (as was explained in the preceding section). A few lines of the actual data are presented as lines 13 through 22. Line 26 indicates the position where the PROC statements (to be discussed later) would be placed in this program.

 1       DATA D1;
 2          INPUT   #1    @1   COMMITMENT      2.
 3                        @4   SATISFACTION    2.
 4                        @7   REWARD          2.
 5                        @10  COST            2.
 6                        @13  INVESTMENT      2.
 7                        @16  ALTERNATIVES    2.   ;
 8
 9           IF REWARD NE . AND COST   NE . AND INVESTMENT NE .
10           AND ALTERNATIVES NE . AND COMMITMENT NE . ;
11
12        DATALINES;
13        34 30 25 13 25 12
14        32 27 27 14 32 13
15        34 23 24 21 30 14
16        .
17        .
18        .
19        .
20        36 32 28 13 15 5
21        32 29 30 21 32 22
22        30 32 33 16 34 9
23        ;
24        RUN;
25
26        Place PROC statements here
