Chapter 7

More about Defining Your Data

IN THIS CHAPTER

Understanding the special properties of dates and times

Creating multiple response sets

Copying variable definitions from another file

Without a definition, a number serves no purpose. For example, the number 3 could be the number of miles, or an answer to a multiple-choice question, or the number of jelly beans in your pocket. As you can see, a variable’s definition is important.

Data type is part of a variable’s definition. The data type is more than just a tag — it determines how a value can be manipulated. For example, calculating date arithmetic (the distance in time between two dates) would be a nightmare without the proper date format. As soon as you specify the proper date format, you can take advantage of special menus for manipulating dates.

Multiple-response variables — those “check all that apply” questions on surveys — are another kind of variable type that needs extra attention. Again, when you create multiple-response variables properly, you can use a special menu dedicated to this type of variable. Finally, all this data definition stuff can be time consuming, so SPSS includes a special shortcut menu for copying your data and variable definitions from one dataset to another.

Working with Dates and Times

Calendar and clock arithmetic can be tricky, but SPSS can handle it all for you. Just enter date and time variables in whatever format you specify, and SPSS will convert the values internally to do the calculations. Also, SPSS displays the newly created date and time variables in your specified format, so the variable is easy to read.

SPSS understands the meaning of slashes, commas, colons, blanks, and names in the dates and times you enter, so you can write the date and time almost any way you’d like. If SPSS can’t figure out what you’ve typed, it clears away what you typed and waits for you to type something again.
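The try-each-layout idea behind this kind of forgiving entry can be sketched in Python. This is only an illustration under assumptions: `ACCEPTED_LAYOUTS` and `parse_entry` are hypothetical names, the layout codes are Python's `strptime` codes rather than SPSS format names, and SPSS's real parsing rules are broader than this short list. Returning `None` stands in for SPSS clearing an entry it can't interpret.

```python
from datetime import datetime

# Hypothetical list of layouts to try, written with Python strptime codes.
# Month names are matched using the current locale (English assumed here).
ACCEPTED_LAYOUTS = [
    "%m/%d/%Y",        # 3/15/2024
    "%d-%b-%Y",        # 15-Mar-2024
    "%d %B %Y",        # 15 March 2024
    "%m/%d/%Y %H:%M",  # 3/15/2024 14:30
]


def parse_entry(text):
    """Return a datetime for the first layout that fits, or None if none does."""
    for layout in ACCEPTED_LAYOUTS:
        try:
            return datetime.strptime(text.strip(), layout)
        except ValueError:
            continue  # this layout did not match, so try the next one
    return None  # nothing matched; the entry is rejected


print(parse_entry("3/15/2024 14:30"))
print(parse_entry("no idea"))
```

An entry without a time comes back with the time set to midnight, which mirrors how a date entered without hours and minutes is treated.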

Technical Stuff: Internally, SPSS keeps all dates as a positive or negative count of the number of seconds from a zero date. Here’s a bit of trivia for you: The zero date in SPSS is midnight on October 14, 1582, chosen to mark the birth of the Gregorian calendar. No kidding! You can choose a display format that includes or excludes the time, but the information is always there. You can even change the display format without loss of data. If the time isn’t included in the data you enter, SPSS assumes zero hours and minutes (midnight).
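To make the seconds-from-a-zero-date idea concrete, here is a minimal Python sketch. It assumes the zero date is midnight at the start of October 14, 1582, and the function names are made up for this illustration; nothing here is SPSS code.

```python
from datetime import datetime, timedelta

# Assumed zero date: midnight at the start of October 14, 1582.
SPSS_EPOCH = datetime(1582, 10, 14)


def spss_seconds_to_datetime(seconds):
    """Turn a stored count of seconds into a calendar date and time."""
    return SPSS_EPOCH + timedelta(seconds=seconds)


def datetime_to_spss_seconds(moment):
    """Turn a calendar date and time into a count of seconds from the zero date."""
    return (moment - SPSS_EPOCH).total_seconds()


stored = datetime_to_spss_seconds(datetime(2024, 3, 15, 12, 0))
print(stored)                            # a large positive count of seconds
print(spss_seconds_to_datetime(stored))  # the same moment, recovered
print(spss_seconds_to_datetime(-86400))  # negative counts fall before the zero date
```

Because the stored value is a single number, subtracting one date from another is plain arithmetic, which is what makes date calculations manageable.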

On the Variable View tab of the Data Editor window, you can set the data type for each variable. You choose the type from the list shown on the left in Figure 7-1. On the right, you select a format. SPSS uses this format to interpret your input and to format the dates for display.

Screenshot of the Variable Type window to select the data type and the format for each variable from the list of types displayed on the left.

FIGURE 7-1: Select the data type and the format.

Remember: SPSS uses the format you select for both reading your input and formatting the output of dates and times.

Warning: The Columns setting of the Date variable on the Variable View tab of the Data Editor window is important. The column width determines the maximum number of characters that can be displayed. If the column is too narrow to fit the format you choose, the date will appear as a row of asterisks.
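The width rule can be pictured with a few lines of Python. This is a sketch of the behavior described in the warning, not SPSS's own logic, and `display_cell` is a hypothetical helper.

```python
def display_cell(formatted_value, column_width):
    """Show the value if it fits in the column; otherwise show asterisks."""
    if len(formatted_value) > column_width:
        return "*" * column_width  # too wide for the column
    return formatted_value


print(display_cell("03/15/24", 8))
print(display_cell("03/15/2024", 8))
```

The stored value is untouched in both cases; widening the column is enough to make the date readable again.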

The available formats are defined as a group and change according to the variable type. For example, the Dollar type has a different list of choices than those offered for the Date type.

The list of format definitions you have to choose from is constructed by combining the specifiers listed in Table 7-1. Format definitions look like mm/dd/yy and ddd:hh:mm.

TABLE 7-1 Specifiers in Date and Time Formats

| Specifier | Means |
| --- | --- |
| dd | A two-digit day of the month in the range 01, 02, …, 30, 31. |
| ddd | A three-digit day of the year in the range 001, 002, …, 364, 365 (366 in a leap year). |
| hh | A two-digit hour of the day in the range 00, 01, …, 22, 23. |
| Jan, Feb, … | The abbreviated name of the month of the year, as in JAN, FEB, …, NOV, DEC. |
| January, February, … | The name of the month of the year, as in JANUARY, FEBRUARY, …, NOVEMBER, DECEMBER. |
| mm | When adjacent to a dd specifier in a format, a two-digit month of the year in the range 01, 02, …, 11, 12. When adjacent to an hh specifier in a format, a two-digit specifier of the minute in the range 00, 01, …, 58, 59. |
| mmm | A three-character name of a month, as in JAN, FEB, …, NOV, DEC. |
| Mon, Tue, … | The abbreviated name of the day of the week, as in MON, TUE, …, SAT, SUN. |
| Monday, Tuesday, … | The name of the day of the week, as in MONDAY, TUESDAY, …, SATURDAY, SUNDAY. |
| q Q | The quarter of the year, as in 1 Q, 2 Q, 3 Q, or 4 Q. |
| ss | Following a colon, the number of seconds in the range 00, 01, …, 58, 59. Following a period, the number of hundredths of a second. |
| ww WK | The one- or two-digit number of the week of the year in the range 1 WK, 2 WK, …, 51 WK, 52 WK. Note: Although week numbers can be one or two digits, the numbers always align when printed in columns because SPSS inserts a blank in front of single-digit numbers. |
| yy | A two-digit year in the range 00, 01, …, 98, 99. The assumed first two digits of the four-digit year are determined by the configuration found at Edit ⇒ Options ⇒ Data. |
| yyyy | A four-digit year in the range 0001, 0002, …, 9998, 9999. |
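The way a two-digit year picks up its missing first two digits can be sketched as a 100-year window. The sketch below is an assumption-laden illustration: `expand_two_digit_year` is a hypothetical helper, and the starting year 1950 is just an example value, not a statement of what your copy of SPSS is configured to use.

```python
def expand_two_digit_year(yy, window_start):
    """Map a two-digit year into the 100-year range that begins at window_start."""
    if not 0 <= yy <= 99:
        raise ValueError("a two-digit year must be between 0 and 99")
    century = window_start - window_start % 100  # for example, 1950 -> 1900
    year = century + yy
    if year < window_start:
        year += 100  # falls before the window, so it belongs to the next century
    return year


# With an example window of 1950 through 2049:
for yy in (50, 99, 0, 49):
    print(yy, expand_two_digit_year(yy, 1950))
```

Every two-digit value lands on exactly one year inside the window, which is why the configured range matters when you enter old or far-future dates with only two digits.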

You can change the format of a date variable at any time without fear of losing information. For example, you could enter the data under a format that accepted only the year, month, and day, and then change the format to something that contains only the hours and minutes. The format may not display all the information you entered (in this case, it won’t), but when you change the format back to something more inclusive, all your data is still there.
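The point that a display format hides information without destroying it can be shown with a small Python sketch. The format names on the left are written in the style of Table 7-1, the codes on the right are Python's own `strftime` codes, and the mapping between them is an assumption made for this illustration.

```python
from datetime import datetime

# One stored value that holds both a date and a time.
stored = datetime(2024, 3, 15, 14, 30)

# Illustrative mapping from SPSS-style format names to Python strftime codes.
DISPLAY_FORMATS = {
    "mm/dd/yy": "%m/%d/%y",
    "hh:mm": "%H:%M",
}


def show(value, format_name):
    """Render the stored value under the chosen display format."""
    return value.strftime(DISPLAY_FORMATS[format_name])


print(show(stored, "mm/dd/yy"))  # date only; the time is hidden, not lost
print(show(stored, "hh:mm"))     # time only; the date is hidden, not lost
print(stored)                    # the stored value never changed
```

Switching formats only changes the rendering step, so you can move back and forth as often as you like.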

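Warning: To enter data, choose a format (any format) that contains all the data you have. Later, you can change to a more limited format that displays only the information you want. But you can’t go the other way: If you enter data under a limited format, switching to a fuller format later can’t bring back details you never entered.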

Using the Date and Time Wizard

If you have dates that have been properly declared, you can easily do numerous types of calculations. Just follow these steps:

  1. Open the nenana2.sav dataset.

    The file is not in the SPSS installation directory. You have to download it from the book’s companion website at www.dummies.com/go/spss4e.

  2. Choose Transform ⇒   Date and Time Wizard.

    The window shown in Figure 7-2 appears.

    Screenshot of the welcome page of the Date and Time Wizard window to select and learn how dates and times are represented in SPSS Statistics.

    FIGURE 7-2: The Date and Time Wizard.

  3. Select the Extract a Part of a Date or Time Variable radio button, and click Continue.
  4. In the Date or Time list, choose DateTime. In the Unit to Extract list, choose Day of Week (see Figure 7-3) and click Continue.
    Screenshot of the Date and Time Wizard window to choose DateTime in the Date or Time list, and choose Day of Week in the Unit to Extract list.

    FIGURE 7-3: Extracting Day of Week.

  5. Call the Result variable Day_of_Week2, and click Done.

    Let's do a second calculation next.

  6. Choose Transform ⇒   Date and Time Wizard.
  7. Select the Calculate with Dates and Times radio button this time, and click Continue.
  8. Select the Calculate Number of Time Units between Two Dates radio button, and click Continue.
  9. Select Current Date and Time [$Time] and move it to the Date1 field. Then select DateTime and move it to the minus Date2 field.
  10. Select the Retain Fractional Part radio button, as shown in Figure 7-4, and click Continue.
  11. Name the Result variable Years_Since2, and click Done.

    The screen shown in Figure 7-5 appears with the two new variables you created. Note that your Years_Since2 variable will be slightly different from what is shown in the image because the current date and time will be different.

Screenshot of the Date and Time Wizard window to select the Retain Fractional Part radio button and click Continue.

FIGURE 7-4: Date subtraction.

Illustration of a screen displaying the dataset of two new variables with calculation added.

FIGURE 7-5: The dataset with calculations added.
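The two calculations made with the wizard can be approximated in Python to show what the new variables hold. Treat this as a sketch under assumptions: the day-of-week numbering (Sunday = 1 through Saturday = 7) and the 365.25-day year are assumptions about how the results are computed, and the function names are hypothetical.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400


def day_of_week_number(moment):
    """Day of week numbered Sunday = 1 through Saturday = 7 (assumed numbering)."""
    # isoweekday() gives Monday = 1 through Sunday = 7, so shift it by one day.
    return moment.isoweekday() % 7 + 1


def years_between(later, earlier, days_per_year=365.25):
    """Elapsed time in years, keeping the fractional part."""
    elapsed_seconds = (later - earlier).total_seconds()
    return elapsed_seconds / (days_per_year * SECONDS_PER_DAY)


some_date = datetime(2000, 1, 1, 9, 30)  # example value, not from the dataset
print(day_of_week_number(some_date))
print(years_between(datetime.now(), some_date))
```

Because the second result depends on the current date and time, it changes every time you run it, just as Years_Since2 does in the dataset.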

Creating and Using a Multiple-Response Set

Most survey questions allow respondents to provide only one answer. For example, the question “What is your favorite baseball team?” allows for only one response. Multiple-response questions, on the other hand, allow respondents to provide more than one answer. For instance, the question, “What TV shows have you seen in the last six months?” allows for more than one response.

A multiple-response set, or multiple-response variable, is much like a new variable made of other variables you already have. A multiple-response set acts like a variable in some ways, but in other ways it doesn't. You define it based on the variables you’ve already defined, but it doesn’t appear on the Variable View tab. It doesn’t even show up when you list your data on the Data View tab. But it does appear among the items you can choose when defining graphs and tables.

The following steps explain how to define a multiple-response set, but not how to use one; that comes later, when you generate a table or a graph. Also, SPSS has three Multiple Response menus: The one in the Data menu is for graphs; the one under Analyze ⇒ Tables ⇒ Multiple Response Sets is for tables; and the one under Analyze ⇒ Multiple Response is for the special menus that you see in this example.

Warning: Having three menus for declaring this variable type can be a little confusing.

A multiple-response set can contain a number of variables of various types, but it must be based on two or more dichotomy variables (variables with just two values — for example, yes or no or 0 or 1) or two or more category variables (variables with several values — for example, country names or modes of transportation). For example, suppose you have two dichotomy variables with the value 1 defined as no and the value 2 defined as yes. You can create a multiple-response set consisting of all the cases where the answer to both is yes, where the answer to both is no, or whatever combination you want.

Do the following to create a simple multiple-response set:

  1. Open the Apples and Oranges.sav dataset.

    The file is not in the SPSS installation directory. You have to download it from the book’s companion website at www.dummies.com/go/spss4e. Four dichotomous variables have 1 for Yes and 0 for No as their possible answers. Note in Figure 7-6 that the dataset has only ten respondents.

  2. Choose Analyze ⇒   Multiple Response ⇒   Define Variable Sets.

    Your variables appear in the Set Definition area. If you previously defined any multiple-response sets, they appear in the list on the right.

  3. Select all four variables in the Set Definition box, and then click the arrow to move the selections to the Variables in Set box.
  4. In the Variable Coding area, select the Dichotomies option and specify a Counted Value of 1.
  5. Provide a multiple response set name.

    To follow along with the example, type Fruit.

    Illustration of a screen displaying four fruits variables set in four columns that have been labeled Yes and No, respectively.

    FIGURE 7-6: The variables are nominals with possible values of 1 and 0, which have been labeled Yes and No, respectively.

  6. Click Add.

    The new multiple-response set is created and a dollar sign ($) is placed before the name, as shown in Figure 7-7. The dollar sign in the set name identifies the variable as a multiple-response set. The new name will appear in two special menus in the Analyze menu.

    Screenshot of the Define Multiple Response Sets dialog box where a new multiple-response set is created and the dollar sign in the filename identifies the variable.

    FIGURE 7-7: The window showing the complete definition.

  7. Click Close.

    Now that you've created the multiple-response variable, you'll be able to use it to obtain frequencies.

  8. Choose Analyze ⇒   Multiple Response ⇒   Frequencies.

    The new special variable appears. To obtain frequencies, you'll need to place the multiple-response variable in the Table(s) For box.

  9. Move the $Fruit variable into the Table(s) For area, as shown in Figure 7-8.

    The new special Frequencies report appears in the output window, as shown in Figure 7-9.

Screenshot of the Multiple Response Frequencies dialog box to move the $Fruit variable into the Table(s) For area.

FIGURE 7-8: The Multiple Response Frequencies dialog.

The Multiple Response Frequencies table in which the new special Frequencies report appears in the output window.

FIGURE 7-9: The Multiple Response Frequencies table.

This table might look confusing at first. As mentioned, the dataset has only ten people, and you can see that they bought 24 pieces of fruit. Nine pieces of fruit were apples — 37.5 percent of the fruit. Nine out of ten people bought apples — 90 percent of the people. So the difference is the denominator: 9/24 or 9/10. What makes this table special is that what you usually care about are people with multiple responses. In other words, how many people shopping at the store will buy apples along with other purchases?
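The two denominators are easier to see in a short calculation. The Python sketch below uses made-up data: the fruit names and the ten rows are hypothetical, constructed only so that the totals match the figures quoted above (ten people, 24 pieces of fruit, nine apple buyers). The function name is also hypothetical, and percent of cases is taken here over respondents who gave at least one counted answer.

```python
# Hypothetical data: one row per respondent, 1 = Yes and 0 = No.
FRUITS = ["apples", "oranges", "bananas", "pears"]
RESPONSES = [
    (1, 1, 1, 1),
    (1, 1, 1, 0),
    (1, 1, 0, 1),
    (1, 1, 1, 0),
    (1, 0, 1, 1),
    (1, 1, 0, 0),
    (1, 0, 1, 0),
    (1, 1, 0, 0),
    (1, 0, 0, 0),
    (0, 0, 0, 1),
]


def multiple_response_frequencies(rows, names, counted_value=1):
    """Count each item, then express it per response and per case."""
    counts = {
        name: sum(1 for row in rows if row[i] == counted_value)
        for i, name in enumerate(names)
    }
    total_responses = sum(counts.values())
    # Cases = respondents with at least one counted answer.
    valid_cases = sum(1 for row in rows if counted_value in row)
    table = {
        name: {
            "n": count,
            "pct_of_responses": 100 * count / total_responses,
            "pct_of_cases": 100 * count / valid_cases,
        }
        for name, count in counts.items()
    }
    return table, total_responses, valid_cases


table, total_responses, valid_cases = multiple_response_frequencies(RESPONSES, FRUITS)
for name, row in table.items():
    print(name, row["n"], row["pct_of_responses"], row["pct_of_cases"])
print("responses:", total_responses, "cases:", valid_cases)
```

The percent-of-responses column adds up to 100, while the percent-of-cases column adds up to more than 100 because each person can be counted under several fruits.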

Tip: If multiple-response sets are a common variable type for you, consider getting the Custom Tables module because it offers lots of options for this kind of variable. You can read more about modules in Chapter 28.

Copying Data Properties

Suppose you have some data definitions in another SPSS file, and you want to copy one or more of those definitions but you don’t want the data. (All you want is the metadata.) SPSS enables you to choose from several files and copy only the variable definitions you want into your current dataset.

Tip: If you have a variable of the same name defined in your dataset before you run the Copy Data Properties procedure, you can choose to change the existing variable definition by loading new information from another file. The copied definition simply overwrites the previous information. Otherwise, the copying procedure creates a new variable.
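The overwrite-or-create rule can be modeled with plain Python dictionaries. This is a conceptual sketch, not how SPSS stores its data dictionary: `copy_properties`, the property names, and the sample definitions are all hypothetical.

```python
from copy import deepcopy


def copy_properties(source, active, selected, create_missing=True):
    """Return a new set of definitions with the selected ones copied from source.

    Variables that already exist in the active definitions are overwritten;
    variables that do not exist are created only if create_missing is True.
    """
    result = deepcopy(active)  # leave the caller's definitions untouched
    for name in selected:
        definition = deepcopy(source[name])
        if name in result:
            result[name].update(definition)  # overwrite matching properties
        elif create_missing:
            result[name] = definition        # create a new variable
    return result


# Hypothetical definitions, keyed by variable name.
source_file = {
    "Apples": {"label": "Bought apples", "values": {0: "No", 1: "Yes"}},
    "Oranges": {"label": "Bought oranges", "values": {0: "No", 1: "Yes"}},
}
active_file = {"Apples": {"label": "old label"}}

updated = copy_properties(source_file, active_file, ["Apples", "Oranges"])
print(updated["Apples"]["label"])
print(sorted(updated))
```

Only definitions move in this sketch; no data values are involved, which is the whole point of copying metadata rather than the file itself.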

The following steps show you how to copy data properties. You start with a new data file and copy the properties from the Apples and Oranges.sav data file, which was used in the previous exercise:

  1. Choose File ⇒   New ⇒   Data.

    You now have a new dataset with no data and no variable information.

  2. Choose Data ⇒   Copy Data Properties.

    The Copy Data Properties procedure on the Data menu provides the facility to copy variable properties from one dataset to another. Dictionary information can be copied to the active data file from an open dataset or from an external SPSS data file.

  3. In the An Open Dataset box, select Apples and Oranges.sav, as shown in Figure 7-10.

    If the Apples and Oranges.sav data file does not appear in the An Open Dataset box, select the An External SPSS Statistics Data File radio button, click the Browse button, locate the file, and click Open.

  4. Click Continue.

    A screen appears with the following options:

    • Apply Properties from Selected Source Dataset Variables to Matching Active Dataset Variables: This option is the default.

      To update the source list to display all variables in the source data file, also select Create Matching Variables in the Active Dataset If They Do Not Already Exist. If you select source variables that do not exist in the active dataset (based on variable name), new variables are created in the active dataset with the variable names and properties from the source data file. If the active dataset does not contain variables (a blank, new dataset), all variables in the source data file are displayed, and new variables based on the selected source variables are automatically created in the active dataset.

      Screenshot of the welcome page of the Copy Data Properties Wizard page to select the file you want to use as the source of variable definitions.

      FIGURE 7-10: Select the file you want to use as the source of variable definitions.

    • Apply Properties from a Single Source Variable to Selected Active Dataset Variables of the Same Type: Applies the selected variable properties from one source variable to the selected variables in the active dataset. This option is useful when you're copying value labels or missing value designations to a series of variables.
    • Apply Dataset Properties Only — No Variable Selection: Applies only file properties to the active dataset. (For example, a dataset can have a file label and this file label will be copied). No variable properties will be applied (this option is not available if the active dataset is also the source data file).
  5. Use the default option.

    Now you need to specify from which variables you want to copy properties.

  6. Select the variables from which you want to copy information.

    To follow along with the example, select all the variables, as shown in Figure 7-11. (To deselect a variable, hold down the Ctrl key, and click the variable.)

  7. To use the variables you’ve selected, click Continue.

    If instead you wanted to copy the complete definitions of all the variables you selected and overwrite what you have, you would click Done. Clicking the Continue button, as in this example, allows you to be more specific about which parts of the variable definitions you want to copy.

    Screenshot of the Copy Data Properties page to apply properties from selected source dataset variables to matching active dataset variables.

    FIGURE 7-11: Select the source variable names you want to use for definitions.

  8. Choose the properties of the existing variable definitions you want to copy to the variables you’re modifying.

    To follow along with the example, copy all the properties. In Figure 7-12, everything is selected by default, but you can deselect any properties you don't want. These selections apply to all variables you’ve chosen. If you want to handle each variable separately, you’ll have to run through this procedure again for each one, selecting different variables each time.

    Screenshot of the Copy Data Properties page to select the variable properties to be copied to the active dataset.

    FIGURE 7-12: Select which attributes you want to copy.

  9. Click Continue.

    If you were satisfied with your choices, you could click Finish to complete the process. By clicking Continue, however, you can select from a list of available properties to be copied.

  10. Choose any properties made available in the dialog shown in Figure 7-13.

    Depending on the variable type, different properties are available to be copied. Unavailable properties appear dimmed. By default, none of them are selected. The example dataset does not have any of these properties.

    Screenshot of the Copy Data Properties page where different properties are available to be copied depending on the variable type.

    FIGURE 7-13: Attributes other than variable definitions can be copied from the source.

  11. Click Continue to move to the final dialog.

    The screen displays the number of existing variable definitions to be changed, the number of new variables to be created, and the number of other properties that will be copied.

  12. To execute the copy procedure immediately, click Done.

    You could instead save the set of instructions as a Syntax script so you can execute them later. (Part 8 describes using the Syntax language.)

Using the basic variable types and adding property descriptions as needed, you should be able to concoct any type of variable you want.
