Chapter 4. Entering Data from the Keyboard

In This Chapter

  • Considering your choices when defining a variable

  • Defining variables

  • Entering numbers

  • Making sure that you're using the right measurement type

To process your data, you have to get it into the computer. Entering data is always difficult; it has been a problem with computers since the beginning. No matter how you decide to get your numbers into SPSS, at some point someone has to type them (unless they come from some form of automatic monitoring). SPSS can read data from other places. You can also type directly into SPSS — and, if you want, copy the data to places other than SPSS later.

Entering data into SPSS is a two-step process: First you define what sort of data you will be entering, and then you enter the actual numbers. This may sound difficult, but it isn't so bad. When you see how data entry works in SPSS, you'll discover you have some pretty nifty software to help you.

You organize your data into cases. Each case is made up of a collection of variables. First, you define the characteristics of the variables that make up a case, and then you enter the data into the variables to make up the contents of the cases. This chapter shows you how to work with this technique of getting data into your system.
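If it helps to picture the two-step process, here is a minimal sketch in Python (not something SPSS runs). The variable names and the helper function `undefined_variables` are made up for illustration; the point is only that the definitions come first and each case then supplies one value per defined variable.

```python
# Step 1: define the variables (hypothetical names, chosen for illustration).
variables = ["age", "income", "sex"]

# Step 2: enter the data. Each case holds one value per defined variable.
cases = [
    {"age": 34, "income": 52000, "sex": 1},
    {"age": 27, "income": 41000, "sex": 2},
]


def undefined_variables(case, variables):
    """Return names used in a case that were never defined as variables."""
    return sorted(set(case) - set(variables))


# Print each case as a row, in the order the variables were defined.
for number, case in enumerate(cases, start=1):
    print(number, [case[name] for name in variables])
```

A case that uses a name you never defined (say, odor) would be flagged by `undefined_variables`, which is the same discipline the Variable View imposes.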

The Variable View Is for Entering Variable Definitions

You use the Variable View, shown in Figure 4-1, to define the names and characteristics of variables. This is where you always start if you plan on entering data into SPSS. You get to this window by clicking the Variable View tab at the bottom of the Data Editor window of SPSS. As you can see in Figure 4-1, every characteristic you can define about your variables is named at the top of the window. All you have to do is enter something in each column for each variable.

Note

These 11 predefined characteristics are the only ones needed to completely specify the attributes of any variable. SPSS uses all of them in its internal processing. When you add a new variable, you will find that reasonable defaults appear for most characteristics.

Figure 4.1. You use Variable View to define the characteristics of variables.

Note

The Variable View window is just for defining the variables. The entry of the actual numbers comes later. (See the section "The Data View Is for Entering and Viewing Data Items" later in this chapter.)

Each variable characteristic has a default, so if you don't specify a characteristic, SPSS fills one in for you. However, what it selects may not be what you want, so let's look at all the possibilities.

Name

The cell on the far left is where you enter the name of the variable. Just click in the cell and type a short descriptor such as age, income, sex, or odor. A longer descriptor, called a label, comes later. You could type longer names here, but you should keep them short because they'll be used in named lists and as identifier tags on the data graphs and such — where the format can be a bit crowded. Names that are too long can cause the output from SPSS to be garbled or truncated.

If the name you assigned turns out to be too long or is misspelled, you can always pop back into Variable View and change it. One of the nice things about SPSS is that you can correct mistakes quickly. I like that. (I had to hide a lot of them for the screen shots in this book.)

Tip

Here are some handy hints about names:

  • You can use some bizarre characters in a name, such as @, #, and $, as well as the underscore character (_) and numbers.

    If you decide you want to use some screwy characters in a name, go ahead and try it. It can't hurt. SPSS never does anything about it other than make you type something else.

  • Be sure to start every name with an uppercase or lowercase letter.

  • You can't include blanks anywhere in a name.
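The naming hints above can be sketched as a quick check in Python. This is only an approximation: it encodes just the rules listed in this chapter, and SPSS enforces additional rules (such as reserved words and a length limit) that are not modeled here. The function name `looks_like_valid_name` is hypothetical.

```python
import re

# Only the rules from this chapter: start with a letter, then any mix of
# letters, digits, @, #, $, and underscore. Blanks are not allowed anywhere.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9@#$_]*")


def looks_like_valid_name(name: str) -> bool:
    """Return True if the name passes the chapter's three naming hints."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["age", "income_2010", "my var", "2fast"]:
    print(candidate, looks_like_valid_name(candidate))
```

A name that passes this check can still be rejected by SPSS, so treat it as a first filter rather than the final word.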

Note

If you want to export data to another application, make sure the names you use are in a form acceptable to that application. Watch out for special characters.

Type

Most data you enter will be just regular numbers. Some, however, will be a special type, such as currency, and some will be displayed in a special format. Other data, such as dates, will require special procedures for calculation. You simply specify what type you have and SPSS takes care of those other details for you.

Click the cell in the Type column you want to fill in, and a button with three dots appears on its right. Click that button and the dialog box shown in Figure 4-2 appears.

Figure 4.2. The Variable Type dialog box allows you to specify the type of variable you are defining.

You can choose from the following predefined types of variables:

  • Numeric: Standard numbers in any recognizable form. The values are entered and displayed in the standard form, with or without decimal points. Values can be formatted in standard scientific notation, with an embedded E to represent the start of the exponent. The Width value is the total number of all characters in a number — including any positive or negative signs and the exponent indicator. The Decimal Places value specifies the number of digits displayed to the right of the decimal point, not including the exponent.

  • Comma: This type specifies numeric values with commas inserted between three-digit groups. The format includes a period as a decimal point. The Width value is the total width of the number, including all commas and the decimal point. The Decimal Places value specifies the number of digits to the right of the decimal point. You may enter data without the commas, but SPSS will insert them when it displays the value. Commas are never placed to the right of the decimal point.

  • Dot: Same as Comma, except a period character (.) is used to group the digits into threes, and a comma is used for the decimal point.

  • Scientific Notation: A numeric variable that always includes the E to designate the power-of-ten exponent. The base, the part of the number to the left of the E, may or may not contain a decimal point. The exponent, the part of the number to the right of the E (which also may or may not contain a decimal), is the power to which 10 is raised; that power of 10 is then multiplied by the base to produce the actual number. You may enter D or E to mark the exponent, but SPSS always displays the number using E. For example, the number 5,286 can be written as 5.286E3, which means 5.286 × 10³. To represent a small number, the exponent can be negative. For example, the number 0.0005 can be written as 5E-4. This format is useful for very large or very small numbers.

  • Date: A variable that can include the year, month, day, hour, minute, and second. When you select Date, the available format choices appear in a list on the right side of the dialog box, as shown in Figure 4-3. Choose the format that best fits your data. Your selection determines how SPSS will format the contents of the variable for display. This format also determines, to some extent, the form in which you enter the data. You can enter the data using slashes, colons, spaces, or other characters. The rules are loose — if SPSS doesn't understand what you enter, it tells you, and you can re-enter it another way. For example, if you select a format with a two-digit year, SPSS accepts and displays the year that way, but it will use four digits to perform calculations. The first two digits (the number of the century) will be selected according to the configuration you set by choosing Edit

    Figure 4.3. Selecting a date format also selects which items are included.

  • Dollar: When you select Dollar, the available format choices appear in a list on the right side of the dialog box. Dollar values are always displayed with a leading dollar sign and a period for a decimal point, and, for large values, will include commas to collect the digits in groups of threes. You select the format and its Width and Decimal Places values, as shown in Figure 4-4. The format choices are similar, but it's important that you choose one that's compatible with your other dollar-variable definitions so they line up when you print and display monetary values in output tables. The Width and Decimal Places settings help with vertical alignment in the output, no matter how many digits you include in the format itself. No matter what format you choose, you can enter the values without the dollar sign and the commas; SPSS inserts those for you.

    Figure 4.4. The different dollar formats mostly specify the number of digits to be included.

  • Custom Currency: The five custom formats for currency are named CCA, CCB, CCC, CCD, and CCE, as shown in Figure 4-5. You can view and modify the details of these formats by choosing Edit

    Figure 4.5. Five custom currency formats are available.

  • String: A freeform non-numeric item. Because it is non-numeric, the contents of a variable of this type can never be used for calculations. You can enter any characters you like, up to the maximum length you specify, as shown in Figure 4-6. You can also use a variable of this type as a descriptor or an identifier of a particular case.

Figure 4.6. A freeform type never used in calculations.
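To see how the Comma, Dot, and Scientific Notation types differ, here is a small Python sketch that imitates the display rules described in the list above. It is not SPSS code, and the function names are invented for this example; it only mimics the grouping and exponent conventions the text describes.

```python
def comma_format(value: float, decimals: int) -> str:
    """Group digits in threes with commas; use a period as the decimal point."""
    return f"{value:,.{decimals}f}"


def dot_format(value: float, decimals: int) -> str:
    """Same grouping, with the roles of the comma and the period swapped."""
    swapped = str.maketrans({",": ".", ".": ","})
    return comma_format(value, decimals).translate(swapped)


def parse_scientific(text: str) -> float:
    """Accept either D or E as the exponent marker, as the text describes."""
    return float(text.upper().replace("D", "E"))


print(comma_format(1234567.891, 2))   # 1,234,567.89
print(dot_format(1234567.891, 2))     # 1.234.567,89
print(parse_scientific("5.286E3"))    # 5286.0
print(parse_scientific("5D-4"))       # 0.0005
```

Notice that the stored number is the same in every case; only the way it is typed in and displayed changes.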

Width

The width setting in the definition of a variable determines the number of characters used to display the value. If the value to be displayed is not large enough to fill the space, the output will be padded with blanks. If it is larger than you specify, it will either be reformatted to fit or asterisks will be displayed.
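Here is a rough Python sketch of that behavior. It makes two simplifying assumptions that are mine, not SPSS's: values are padded on the left, and a value that is too wide goes straight to asterisks (SPSS may try to reformat it first). The function name `display` is hypothetical.

```python
def display(value: float, width: int, decimals: int) -> str:
    """Render a value in a fixed number of characters."""
    text = f"{value:.{decimals}f}"
    if len(text) > width:
        # Simplification: skip any reformatting attempt and show asterisks.
        return "*" * width
    # Assumption: pad with blanks on the left to fill the width.
    return text.rjust(width)


print(repr(display(3.14159, 8, 2)))     # '    3.14'
print(repr(display(123456.789, 6, 2)))  # '******'
```

The second call shows why a row of asterisks in your output usually means the width is too small, not that the data is gone.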

Tip

Certain type definitions allow you to set a width value. The value in the Width column and the Width value you enter when you define the type are one and the same: If you change the value in one place, SPSS changes it in the other place automatically.

At this point, you can do one of three things:

  • Skip this cell and accept the default (or the number you entered previously under Type).

  • Enter a number and move on.

  • Use the up and down arrows that appear in the cell to select a numeric value.

Decimals

The number of decimals is the number of digits that appear to the right of the decimal point when the value appears on-screen. This is the same setting as the Decimal Places value you may have specified when you defined the variable type. If you entered a number there, it appears here as the default; if you enter a number here, it replaces the one you entered for the type.

Now you can do one of three things:

  • Skip this cell and accept the default (or the number you entered earlier under Type).

  • Enter a number and move on.

  • Use the up and down arrows that appear in the cell to select a numeric value.

Label

The name and the label serve the same basic purpose: They are descriptors that identify the variable. The difference is that the name is the short identifier and the label is the long one. You need one of each because some output formats work fine with a long identifier and other formats need the short form.

You can use just about anything for the label. What you choose has to do with how you expect to use your data and what you want your output to look like. For example, the name may be sex and the longer label may be Boys and Girls, Men and Women, or simply Gender.

Tip

The length of the label is not determined by some sort of software requirement. However, output looks better if you use short names and somewhat longer labels. Each one should make sense standing alone. After you produce some output, you may find that your label is lousy for your purposes. That's okay; it's easy to change. Just pop back to the Variable View and make the change. The next time you produce output, the new label will be used.

You can also just skip defining a label. If you don't have a label defined for a variable, SPSS will use the name you defined for everything.

Values

The Values column is where you assign labels to all the possible values of a variable. If you select a cell in the Values column, a button with three dots appears. Clicking that button displays the dialog box shown in Figure 4-7.

Figure 4.7. You can assign a name to each possible value of a variable.

Normally, you make one entry for each possible value that a variable can assume. For example, for a variable named Sex you could have the value 1 assigned the label Male and 2 assigned Female. Or, for a variable named Committed you could have 0 for No, 1 for Yes, and 2 for Undecided. If you have labels defined, when SPSS displays output, it will show the labels instead of the values.

To define a label for a value:

  1. In the Value box, enter the value.

  2. In the Label box, enter a label.

  3. Click the Add button.

    The value and label appear in the large text block. To change or remove a definition, simply select it in the text box and make your changes.

  4. Repeat Steps 1–3 as needed.

  5. Click the OK button to save the value labels and close the window.

You can always come back and change the definitions, using the same process you used to enter them. The window will reappear, filled in with all the definitions; then you can update the list.
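Conceptually, the value labels you define amount to a lookup table. The following Python sketch uses the two examples from this section; the fallback to the raw value when no label exists is an assumption made for the sketch, and `label_for` is a made-up helper name.

```python
# Value labels from the examples in this section.
sex_labels = {1: "Male", 2: "Female"}
committed_labels = {0: "No", 1: "Yes", 2: "Undecided"}


def label_for(value, labels):
    """Show the label when one is defined; otherwise show the raw value."""
    return labels.get(value, str(value))


entered = [1, 2, 2, 1]
print([label_for(v, sex_labels) for v in entered])
```

The data you typed stays numeric; the labels only change what appears in the output.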

Missing

You can specify what is to be entered for a value that is missing for a variable in a case. That is, when you have values for all variables in a case except one, you can specify a placeholder for the missing value. Click a cell in the Missing column, and the dialog box shown in Figure 4-8 appears.

Figure 4.8. You can specify exactly what is entered for a missing value.

For example, say you are entering responses to questions, and one of the questions is, "How many dirigibles do you own?" The normal answer to this question is a number, so you define the variable type as a number. If someone chooses to ignore this question, this variable won't have a value. However, you can specify a placeholder value. Perhaps 0 seems like a good choice for a placeholder here, but it's not really: A common answer will be 0. Instead, a less likely value — like, say, −1 — makes a better choice.

You can even specify unique values to represent different reasons for a value being missing. In the previous example, you could define −1 as the value entered when the answer is, "I don't remember," and −2 could be used when the answer is, "None of your business." If you specify that a value is representing a missing value, that value is not included in general calculations. During your analysis, however, you can determine how many values are missing for each of the different reasons. You can specify up to three specific values (called discrete values) to represent missing data, or you can specify a range of numbers along with one discrete value, all to be considered missing. The only reason you would need to specify a range of values is if you have lots of reasons why data is missing and want to track them all.
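The dirigible example can be sketched in Python to show both halves of the idea: the placeholders stay out of the calculation, yet you can still count how often each reason occurred. The responses are invented for illustration, and `summarize` is a hypothetical helper, not an SPSS feature.

```python
from collections import Counter
from statistics import mean

# Discrete missing values and the reasons they stand for (from the example).
MISSING_REASONS = {-1: "I don't remember", -2: "None of your business"}


def summarize(values):
    """Return the mean of the valid answers and a count of each missing reason."""
    valid = [v for v in values if v not in MISSING_REASONS]
    missing = Counter(MISSING_REASONS[v] for v in values if v in MISSING_REASONS)
    return mean(valid), missing


# Made-up answers to "How many dirigibles do you own?"
responses = [0, 2, -1, 0, 1, -2, -1]
average, reasons = summarize(responses)
print(average)        # 0.75
print(dict(reasons))
```

If the placeholders had been left in, the average would have come out negative, which is a clear sign that nobody owns that few dirigibles.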

Columns

Columns is where you specify the width of the column you will use to enter the data. The folks at SPSS could have used the word Width to describe it, but they already used that term for the width of the data itself. A better name might have been the two words Column Width, but that would have been too long to display nicely in this window, so they just called it Columns. To specify the number of columns, select a cell and enter the number.

Align

The Align column determines the position of the data in its allocated space, whenever the data is displayed for input or output. The data can be left-aligned, right-aligned, or centered. You've defined the width of the data and the size of the column in which the data will be displayed; the alignment determines what is done with any space left over.

When you select a cell in the Align column, a list appears and you can choose one of the three alignment possibilities, as shown in Figure 4-9. Aligning to the left means inserting all blanks on the right; aligning right inserts all the extra spaces on the left; centering the data splits the spaces evenly on each side — but I don't know what it does if an odd space is left over. (I also worry about things like the number of seeds in a tomato and where the clouds go at night.)

Figure 4.9. Values can be justified right or left, or positioned in the center.
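The three alignments can be sketched in a few lines of Python. Because the text leaves open what SPSS does with an odd leftover blank when centering, this sketch simply makes a choice (the extra blank goes on the right); that choice is an assumption, not a description of SPSS. The function name `align` is hypothetical.

```python
def align(text: str, columns: int, how: str) -> str:
    """Place text in a field of the given number of columns."""
    spare = max(columns - len(text), 0)
    if how == "left":
        return text + " " * spare
    if how == "right":
        return " " * spare + text
    if how == "center":
        # Assumption: when the spare space is odd, the extra blank goes right.
        left = spare // 2
        return " " * left + text + " " * (spare - left)
    raise ValueError(f"unknown alignment: {how}")


for how in ("left", "right", "center"):
    print(repr(align("42", 5, how)))
```

Printing with `repr` makes the blanks visible, so you can see exactly where the leftover space ends up.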

Measure

The Measure setting tells SPSS which of three kinds of measurement the values of the variable represent. When you click a cell in the Measure column, you can select one of these choices (see Figure 4-10):

  • Scale: A number that specifies a magnitude. It can be distance, weight, age, or a count of something. Most numbers fall into this category. The technical name for this type of number is cardinal, but SPSS uses Scale to keep life simple.

  • Ordinal: These numbers specify the position (order) of something in a list. For example, first, second, and third are ordinal numbers.

  • Nominal: Numbers that specify categories or types of things. You can have 0 represent Disapprove and 1 represent Approve. Or you can use 1 to mean Fast and 2 to mean Slow.

Figure 4.10. The type of measurement being made by the values in this variable.
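One way to see why the measure matters is to ask which "typical value" makes sense for each kind of number. The Python sketch below reflects general statistical practice rather than anything SPSS does automatically, and the function name `typical_value` is invented for this example.

```python
from statistics import mean, median, mode


def typical_value(values, measure):
    """Pick a summary that is meaningful for the given kind of measurement."""
    if measure == "scale":
        return mean(values)      # magnitudes can be averaged
    if measure == "ordinal":
        return median(values)    # positions have an order but no fixed spacing
    if measure == "nominal":
        return mode(values)      # categories can only be counted
    raise ValueError(f"unknown measure: {measure}")


print(typical_value([10, 20, 60], "scale"))
print(typical_value([1, 2, 2, 3, 5], "ordinal"))
print(typical_value([1, 1, 2], "nominal"))
```

Averaging a nominal variable (where 1 means Fast and 2 means Slow) would produce a number, but not one that means anything, which is why it pays to set the measure correctly.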

Role

Some of the SPSS dialog boxes select variables according to their role and include them as defaults. You don't need to worry about this characteristic. It can be handy after you have some experience with SPSS and understand how defaults are chosen. When you click a cell in the Role column, you can select one of six choices (see Figure 4-11):

Figure 4.11. The role assumed by this variable in certain SPSS dialog boxes.

  • Input: This variable is used for input. This is the default role. Definition of Roles is new to version 18 of SPSS, and all data imported from earlier versions will be assigned this role.

  • Target: This variable is used as output by SPSS procedures.

  • Both: This variable is used as both input and output.

  • None: This variable has no role assignment.

  • Partition: This variable is used to partition the data into separate samples for training, testing, and validation.

  • Split: This option is included for round-trip compatibility with SPSS Modeler. This capability, however, should not be confused with file splitting (described in Chapter 16).

The Data View Is for Entering and Viewing Data Items

After you've defined all the variables for each case, switch the display to the Data View so you can begin typing the data. You make the switch by clicking the Data View tab at the bottom of the window. When you do, the Data Editor window appears.

At the top of the columns in Figure 4-12, you can see some names I chose for variables. Switching to Data View makes the window ready to receive entered data — and to verify that what's entered matches the specified format and type of the data.

Figure 4.12. The Data Editor window, ready to accept new data.

Entering data into one of these cells is straightforward: You simply click the cell and start typing.

If something is already in a cell and you want to change it instead of just typing over it, look up toward the top of the window, just underneath the toolbar: You'll see the name of the variable and the currently selected value. Click the value in the field at the top, and you can edit it right there. You can do all the normal mouse and keyboard stuff there, too — you can use the Backspace key to erase characters, or select the entire value and type right over it.

Tip

If you feel like a lousy (or inexperienced) mouse driver, take some time to experiment and figure out how to edit data. Lots of software uses these same editing techniques, so becoming proficient now will pay you dividends later.

If your data is already in a file, you might be able to avoid typing it in again by reading that file directly into SPSS. For more information, see Chapter 5.

Warning

Don't take chances. As soon as you type a few values, save your data to a file by choosing File

We all have to go back and refine our variable definitions from time to time. That's normal. When you come across something that doesn't do what you want it to, just switch back to Variable View and correct it. Nobody but you and SPSS will ever know about it, and SPSS never talks.

Filling In Missed Categorical Values

Now that you have defined your variables and entered your data, you might want to check that you have names defined for all your actual ordinal and nominal values, and that you have defined the correct measures for them. SPSS can help by scanning your data, finding values for which you don't have definitions, and pointing them out in a friendly way.

The following steps use an existing file to walk through a demonstration:

  1. Choose File > Open > Data to load the file named Cars.sav.

    This file came with your installation of SPSS and is found, along with a number of other files, in the same directory in which you installed SPSS. You can load any of these data files, but Cars.sav is the one used in this demonstration. If you load this file while you already have some other data showing in the window, SPSS will open a new Data Editor window to display the new information; your existing data will not be lost.

    When you open this data file (or any data file, for that matter), SPSS opens an SPSS Viewer window to tell you that it has opened a file (or the information could be displayed in an SPSS Viewer window that is already open). You won't need this information for what you are doing here, so you can just close the window.

  2. Choose Data > Define Variable Properties.

    The Define Variable Properties dialog box appears.

  3. On the left, select all the names of the variables you want to check, and then click the arrow in the center of the window to move them to the right, as shown in Figure 4-13.

    Selecting variables to check their properties.

    Figure 4.13. Selecting variables to check their properties.

  4. Click the Continue button.

  5. Select one of the variable names in the list on the left.

    Its different values appear in the center of the window, as shown in Figure 4-14. (In this example, every value has a name assigned to it.)

  6. Ask SPSS to suggest a new type for this variable by clicking the Suggest button in the top center of the window.

    The window in Figure 4-15 appears, telling you what SPSS concludes about this variable and its values. This same window, with different text, appears for each variable you test. Sometimes the text suggests changes in the variable definition, and sometimes it does not.

    Figure 4.14. The values of the selected variable.

    Figure 4.15. From the pattern of values, SPSS concludes whether you may have chosen the wrong measurement.

  7. To apply any changes, click Continue.

    You return to the window shown in Figure 4-14, where you can select another variable.

You won't want to make changes to all your variables, but SPSS helps you find the ones that you do need to change. Values defined as Missing are not included in the computations. The text in the window always explains the criteria used to reach a conclusion, and SPSS allows you to make the final decision.
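The scan at the heart of this section, finding values that appear in your data but have no label, can be sketched in a few lines of Python. The data and labels here are invented, `unlabeled_values` is a hypothetical helper, and the real dialog box does considerably more (such as suggesting a measurement).

```python
def unlabeled_values(data, labels):
    """Return the distinct values in the data that have no label defined."""
    return sorted(set(data) - set(labels))


# Labels defined for a nominal variable, and some values typed into it.
approve_labels = {0: "Disapprove", 1: "Approve"}
entered = [0, 1, 1, 2, 0, 9, 1]

print(unlabeled_values(entered, approve_labels))  # [2, 9]
```

Each value the scan turns up is either a category you forgot to label or a typing mistake, and only you can decide which.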
