Chapter 21

Adding Syntax to Your Toolkit

In This Chapter

Understanding the fundamental form of Syntax commands

Controlling the flow of execution through a program

Finding some useful commands and keywords

In the last chapter, you saw that Syntax can save you time (avoiding repetitive steps and making it easier for others to know exactly what you’ve done) and that ultimately it’s really pretty easy to get started. When you incorporate Syntax into your routine, you’ll want to start expanding your use of it, as long as it continues to save you time and effort.

Some Syntax commands can’t be pasted from the menus using the Paste button. Some are just easier to type. If you’re going to start typing some Syntax, you’ll have to learn a little bit more about the grammar and the rules of Syntax.

Remember: You can always choose to do most things in SPSS with the menus. It’s okay to use the menus for a while before you master Syntax. There is an old saying, “When you’re ready, your teacher will appear.” So, when you’re ready, Syntax (and this chapter) will be waiting for you.

Your Wish Is My Command

A single Syntax language instruction can be very simple, or it can be complex enough to serve as an entire program. A single instruction consists of a command followed by arguments to modify or expand the actions of the command. For example, the following Syntax command generates a report:

REPORT /FORMAT=LIST /VARIABLES=MPG.

The first thing you probably noticed is that the command is written in all uppercase. That’s tradition — not a requirement. You can write in lowercase (or even mixed case) if you want. The old-school way of writing it, dating back to when everything was typed, is to write commands in uppercase, and variable names in lowercase. The Syntax window’s autocomplete function will write commands in uppercase. Notice, too, that the end of the list of arguments is terminated by a single period; the terminator must be there or else SPSS will complain.

Now, about those forward slashes and equal signs: Sometimes you need them, and sometimes they’re optional. Always use them and you won’t have any trouble. The presence of slashes and equal signs reduces ambiguity for both you and SPSS. Also, commands can be abbreviated as long as you have at least three letters that uniquely identify the command. Abbreviating commands was a popular strategy when everything was typed, but we can’t think of a single reason to abbreviate anything, especially with the autocomplete feature. Figuring out how to abbreviate a command is more work than just typing it, and abbreviation makes the program harder to figure out later.

The command in this example is REPORT, which causes text to be written to SPSS Statistics Viewer. In fact, all output produced by running Syntax programs goes to SPSS Statistics Viewer. The FORMAT specification tells REPORT to make a list of the values. The VARIABLES specification tells REPORT which variables are to be included in the list.

Commands can begin anywhere on a line and continue for as many lines as necessary. That’s why SPSS is so persnickety about that terminator (the period): It’s the only way it has of detecting the end of a command. If you forget it, SPSS may think that the additional lines belong to the earlier command. If the syntax turns bright red, it’s a bad sign. Very bad. Try deleting a period, and the colors will change. Trouble. Big trouble. What you’re hoping for is for the command to turn blue. Subcommands will be green, and keywords will be maroon. All the stuff that is unique to your program, like your dataset’s variables, will be plain old black text. Table 21-1 lists all the command types and the colors in the Syntax Editor.

Table 21-1 Color Coding in the Syntax Editor

Syntax Command Type                 Color in the Syntax Editor
Command                             Blue
Subcommand                          Green
Keyword                             Maroon
Other (including variable names)    Black
Error                               Red
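To make the layout rules concrete, here is a minimal sketch of the same REPORT command written two ways. It assumes the active dataset contains a variable named MPG, as in the earlier example.

```spss
* One command spread over three lines; the period ends it.
REPORT
  /FORMAT=LIST
  /VARIABLES=MPG.

* The same command in lowercase, on one line.
report /format=list /variables=mpg.
```

Both forms ask for the same listing. The multi-line layout is just easier to read when a command has several subcommands.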

Using Keywords

All the commands in Syntax are keywords in the language. A keyword is a word already known to the language; it has a predefined action. The variable names you define are not keywords, but SPSS can tell which is which by the way you use them. That is, you can name one of your variables the same name as one of the keywords, and SPSS can tell what you mean by how you use the word. Usually.

The names of commands, subcommands, and functions are all keywords — and there are lots of them — but they aren’t reserved and you can use them freely. For example, you could have variables named format and report, and you could use the following Syntax command to display a list of their values:

REPORT /FORMAT=LIST /VARIABLES=REPORT FORMAT.

Remember: Don’t try to name variables AND, OR, or NOT. These logical operators are keywords in the Syntax language and are also reserved words. If you try to use a reserved word as a variable name, SPSS will catch it and tell you that you can’t do it. Relational operators are used in the Syntax language to compare values and are also reserved words. The relational operators are EQ, NE, LT, GT, LE, and GE. Some other reserved words are ALL, BY, TO, and WITH. (These operators are discussed in more detail later in this chapter.)

Working with Variables and Constants

Most of the values used in Syntax come from the variables in the dataset you currently have loaded and displayed in SPSS. You simply use your variable names in your program, and SPSS knows where to go and get the values. Some other variables are already defined, and you can use them anywhere in a program. Predefined variables, which are called system variables, all begin with a dollar sign ($) and already contain values. The system variables are listed in Table 21-2.

Table 21-2 System Variables

Variable Name    Description
$CASENUM         The current case number. It’s the count of cases from the beginning case to the current case.
$DATE            The current date in international date format with a two-digit year.
$DATE11          The current date in international date format with a four-digit year.
$JDATE           The count of the number of days since October 14, 1582 (the first day of the Gregorian calendar).
$LENGTH          The current page length.
$SYSMIS          The system missing value. This prints as a period or whatever is defined as the decimal point.
$TIME            The number of seconds since midnight October 14, 1582 (the first day of the Gregorian calendar).
$WIDTH           The current page width.
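As a rough illustration of how system variables can appear in a program, the following sketch uses $CASENUM and $SYSMIS. The variable name rownum, the cutoff of 120, and the three data rows are made up for this example.

```spss
DATA LIST LIST (",") / ID AGE.
BEGIN DATA.
1,28
2,29
3,141
END DATA.

* rownum is a hypothetical variable name used only for this sketch.
COMPUTE rownum = $CASENUM.
* Treat an implausible age as missing.
IF (AGE > 120) AGE = $SYSMIS.
LIST.
```

In the listing, rownum should hold the case number of each row, and the age of 141 should show up as system-missing.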

Remember: When a Syntax program executes, it’s associated with the currently loaded dataset and uses its variable names and values. This can get confusing when you have more than one dataset open. If SPSS claims that there is no variable with that name, make sure that the correct dataset is active.

Declaring Data

You can define variables and their values right in the Syntax window. You might wonder why you would do this. Why not just have a data file? This is a great way to ask for advice and to prototype calculations. You can send the DATA LIST command with just a few rows of data when you ask a colleague for help. Just copy and paste it right into an email along with your question and the code that you’re trying to fix.

To do so, you create a DATA LIST, which defines the variable names, and follow it with the list of values between BEGIN DATA and END DATA commands. The following example creates three variables (ID, SEX, and AGE) and fills them with four instances of data:

DATA LIST / ID 1-3 SEX 5 (A) AGE 7-8.
BEGIN DATA.
001 m 28
002 f 29
003 f 41
004 m 32
END DATA.
PRINT / ID SEX AGE.
EXECUTE.

The DATA LIST command defines the variables. The first variable is ID. Its values are found in the input stream in columns 1 through 3; therefore it’s defined as being three digits long. It has no type definition, so it defaults to numeric. The second variable is named SEX. It is one character long, and its values are in column 5 of the input. Its type is declared as alpha (A), so it’s declared as a one-character string. The third variable, AGE, is two digits long, is a numeric value, and has its values in columns 7 and 8 of the input.

The BEGIN DATA command comes immediately after the DATA LIST command and marks the beginning of the lines of data — each line is a case. If you’ve ever wondered what it was like to place data on punched cards, this is it. The fundamental design of SPSS is that old. This form of data entry still works, but this is the old way of getting data into a program. When this list of commands is executed, a normal SPSS window appears, showing a dataset with the variable names and values.

You can do all your processing this way, if you prefer. But you don’t have to do it by column numbers. You can enter the data in a comma-separated list, as follows:

DATA LIST LIST (",") / ID SEX AGE.
BEGIN DATA.
1,1,28
2,2,29
3,2,41
4,1,32
END DATA.
PRINT / ID SEX AGE.
EXECUTE.

Warning: END DATA must begin in the first column of a command line. It’s the only command in Syntax that has this requirement.
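The comma-separated form can also carry a string variable. The sketch below reuses the m and f codes from the column-based example and declares SEX as a one-character string with (A1). The data rows are illustrative only.

```spss
DATA LIST LIST (",") / ID SEX (A1) AGE.
BEGIN DATA.
1,m,28
2,f,29
3,f,41
4,m,32
END DATA.
LIST.
```

Here ID and AGE have no type definition, so they default to numeric, just as in the column-based version.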

Commenting Your Way to Clarity

You can insert descriptive text, called a comment, into your program. This text doesn’t do anything except help clarify how the program works when you read (or somebody else reads) your code. You start a comment the same way you start any other command: on its own line, using the keyword COMMENT or an asterisk. The comment is terminated by a period. Here’s an example:

COMMENT This is a comment and will not be executed.

An asterisk can be used with the same result, which is the way that everyone really does it:

* This is a comment placed here for the purpose of
describing what is going on, and it continues until
it is terminated.

You can also put comments on the same line as a command by surrounding them with /* and */. A comment like this can be inserted anywhere inside the command where you’d normally put a blank. For example, you could put a comment at the end of a command line like this:

REPORT /FORMAT=LIST /VARIABLES=AGE /* The comment */.

Remember: The command is terminated with a period, but the period comes after the comment because the comment is part of the statement. If you forget, the next line will get swallowed up into the comment and ignored. The following line will not be color-coded correctly either, which may help you catch your mistake. Watch out.

Executing Commands

Commands are executed one at a time, starting from the top of the program. The order is important. In particular, if a variable has not been created yet, you can’t use it. For the most part, the order is intuitive; you don’t have to think much about what exists and what doesn’t.

Some statements don’t execute right away. Instead, they’re stored for later execution. This makes SPSS run faster, so it’s all for a good reason, but most folks don’t know how it works. This is normally of no consequence because the statements will be executed when their result is needed. But you should be aware this is going on because it can cause surprises in some circumstances. For example, the PRINT command has a delayed execution:

PRINT / ALL.

This is a command to print the complete list of values for every case in your dataset. It will print all the values, or by naming variables it can be instructed to print values of only the ones you choose. However, the PRINT command doesn’t do it right away. It stores the instruction for later. Commands like this are called transformations. As you might guess, all the commands in the Transform menu are of this type. Commands like COMPUTE, COUNT, and RECODE are transformations.

When your program comes to a command that executes immediately, the stored commands are executed first. That works fine as long as there’s another statement to be executed, but if the PRINT statement is the last one in your program, nothing happens. That is, nothing happens until you run another program, and then the stored statement becomes the first one executed.

But there is an easy fix that you may see in some Syntax programs. All you need to do is end your program this way:

PRINT / ALL.
EXECUTE.

All the EXECUTE command does is execute any statements that have been stored for future execution. You’ll see programs written by others who do it this way, but you generally don’t want to solve the problem this way. Procedure commands (commands that generate output) will accomplish the same thing. So, just put any old procedure that you had to do anyway after your transformation, like FREQUENCIES, as shown below:

PRINT / ALL.
FREQUENCIES VARIABLES=ALL.

For the PRINT command, there is another option. The LIST command does the same thing the PRINT command does, but it executes immediately instead of waiting until the next command:

LIST VARIABLES=ALL.

A number of commands have a Transform version and a Procedure version. For instance, SAVE is a Procedure, and XSAVE is a Transformation. (The Command Syntax Reference, which is located under the Help menu, has lists of which commands are which.) This execution delay may seem odd at first, but there’s a really good reason for it: If SPSS executed every line one at a time, it would have to reread the data for every line and would be very slow. A little tricky, but very important information.

Tip: Remember that the transformations are delayed, but the procedures happen right away. Your Syntax programs will run best (and fastest) when you try to put all your transformations at the top, and all your procedures at the bottom. Transformations Pending is the status message that you’ll see when you’ve managed to end your program with a transformation.
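One way to picture the transformations-first, procedures-last layout is the sketch below. The variable names AGE_MONTHS and AGEGROUP, the cutoffs, and the data rows are assumptions made for illustration, and the RECODE ranges assume whole-number ages.

```spss
DATA LIST LIST (",") / ID AGE.
BEGIN DATA.
1,28
2,29
3,41
4,32
END DATA.

* Transformations: stored until a procedure needs the data.
COMPUTE AGE_MONTHS = AGE * 12.
RECODE AGE (LOWEST THRU 29 = 1) (30 THRU HIGHEST = 2) INTO AGEGROUP.

* Procedures: each one reads the data and runs anything pending.
FREQUENCIES VARIABLES=AGEGROUP.
DESCRIPTIVES VARIABLES=AGE_MONTHS.
```

Because the program ends with procedures, no EXECUTE is needed and nothing is left pending.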

Controlling Flow and Executing Conditionals

Unless you specify otherwise, a program starts at the top and executes one statement at a time through your program until it reaches the bottom, where it stops. But you can change that. Situations come up where you need to execute a few statements repeatedly, or maybe you want to conditionally skip one or more statements. In either case, you want program execution to jump from one place to another under your control. What you’re really trying to do is say that certain cases will be treated one way (by certain lines) and other cases will be treated another way (by other lines).

IF

You use the IF command when you have a single statement you want to execute only if conditions are right. For example:

IF (AGE > 20) GROUP=2.

This statement asks the simple question of whether AGE is greater than 20. If so, the value of GROUP is set to 2. We could’ve used the GT keyword in place of the > symbol. Table 21-3 lists the relational operators you can use to compare numbers.

Table 21-3 Relational Operators

Symbol    Alpha    Definition
=         EQ       Is equal to
<         LT       Is less than
>         GT       Is greater than
<>        NE       Is not equal to
<=        LE       Is less than or equal to
>=        GE       Is greater than or equal to
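The symbol and alpha forms in Table 21-3 are interchangeable. As a small sketch, the two IF commands below make the same comparison and write to two hypothetical variables, GROUP_SYM and GROUP_ALPHA, so the results can be compared side by side. The data rows are made up.

```spss
DATA LIST LIST (",") / ID AGE.
BEGIN DATA.
1,18
2,20
3,41
END DATA.

* Same test written with the symbol and with the alpha keyword.
IF (AGE > 20) GROUP_SYM = 2.
IF (AGE GT 20) GROUP_ALPHA = 2.
LIST VARIABLES=AGE GROUP_SYM GROUP_ALPHA.
```

The two new columns should match on every case: 2 where AGE is greater than 20, and system-missing everywhere else.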

You can also combine the relational expressions with logical operators to ask longer and more complex questions. Here’s an example:

IF (AGE > 20 AND SEX = 1) GROUP=2.

This statement asks whether AGE is greater than 20 and SEX is equal to 1. If so, GROUP is set to 2. The logical operators are listed in Table 21-4.

Table 21-4 Logical Operators

Symbol    Alpha    Definition
&         AND      Both relational expressions must be true.
|         OR       Either relational expression can be true.
~         NOT      Reverses the result of a relational expression.

Tip: You should use parentheses to organize expressions so there is no ambiguity about what is being compared. When you construct a complicated conditional expression, it’s easy to lose track of your original line of scrimmage.

You have to write your expressions so the computer knows what you’re talking about. Spell them out. For example, IF (A LT B OR GT 5) is not valid. It can be written IF ((A LT B) OR (A GT 5)), which is a longer form but has a clearer meaning.

You can compare strings to strings and numbers to numbers, but you can’t compare strings to numbers.
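The following sketch puts these rules together: fully spelled-out comparisons, parentheses around each one, a string compared with a quoted string, and a number compared with a number. The data and the GROUP codes are illustrative assumptions.

```spss
DATA LIST LIST (",") / ID SEX (A1) AGE.
BEGIN DATA.
1,m,28
2,f,29
3,f,41
4,m,32
END DATA.

* String compared with a quoted string, number with a number.
IF ((SEX = 'f') AND (AGE >= 30)) GROUP = 1.
* NOT reverses the comparison inside the parentheses.
IF (NOT (AGE >= 30)) GROUP = 2.
LIST.
```

With these four rows, case 3 gets GROUP 1, cases 1 and 2 get GROUP 2, and case 4 matches neither condition, so its GROUP stays system-missing.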

DO IF

The DO IF statement works the same way as the IF statement, but with DO IF you can execute several statements instead of just one. Because you can enter several statements before the terminating END IF, the END IF is required to tell SPSS when the DO IF is over. The following is an example with two statements:

DO IF (AGE < 5).
COMPUTE YOUNG = 1.
COMPUTE SCHOOL = 0.
END IF.

In addition to having the option of including a number of statements at once, you can use DO IF to test several conditions in a series — and execute only the statements of the first true condition(s) by using ELSE IF:

DO IF (AGE < 5).
COMPUTE YOUNG = 1.
ELSE IF (AGE < 9).
COMPUTE YOUNG = 2.
ELSE IF (AGE < 12).
COMPUTE YOUNG = 3.
END IF.
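In the series above, a case with an AGE of 12 or more matches none of the conditions, so YOUNG is left unset for it. A possible extension, not covered in the text so far, is the ELSE command, which catches every case for which no earlier condition was true. The value 0 and the data rows below are assumptions for this sketch.

```spss
DATA LIST LIST (",") / ID AGE.
BEGIN DATA.
1,3
2,7
3,10
4,15
END DATA.

DO IF (AGE < 5).
COMPUTE YOUNG = 1.
ELSE IF (AGE < 9).
COMPUTE YOUNG = 2.
ELSE IF (AGE < 12).
COMPUTE YOUNG = 3.
ELSE.
* No earlier condition was true for this case.
COMPUTE YOUNG = 0.
END IF.
LIST.
```

With these rows, YOUNG comes out as 1, 2, 3, and 0, in that order.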

SELECT IF

The SELECT IF statement is not really flow control, but it works that way. You can use it to remove specific cases, and, as a result, include only the cases you want in your analysis. For example, the following sequence of commands prints only the age values greater than 40:

SELECT IF (AGE > 40).
PRINT / AGE.
EXECUTE.

Watch out, though. If you save your dataset right after this command, you’ll lose data! Any of the logical operators and relational operators that can be used in other IF statements can be used in SELECT IF statements. A really powerful and popular way to modify this is the TEMPORARY command.

TEMPORARY.
SELECT IF (AGE > 40).
PRINT / AGE.
EXECUTE.

The SELECT IF will work only until it hits the EXECUTE (or any other procedure). Then SPSS immediately goes back to using all the data. Much better, because it’s less risky. Less risky is nice.

Remember: If you have any procedures that you have to do anyway, you can (and should) delete the EXECUTE command. Just make sure that some procedure (any procedure) comes after the transformation. Pretty much any command that generates output (table or graph) is a procedure.
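As a sketch of that advice, the version below drops EXECUTE and lets a procedure end the temporary selection instead. DESCRIPTIVES is just one possible choice of procedure, and the data rows are made up.

```spss
DATA LIST LIST (",") / ID AGE.
BEGIN DATA.
1,28
2,29
3,41
4,32
5,57
END DATA.

TEMPORARY.
SELECT IF (AGE > 40).
* This procedure sees only the cases with AGE above 40.
DESCRIPTIVES VARIABLES=AGE.

* The temporary selection has ended, so this one sees every case.
DESCRIPTIVES VARIABLES=AGE.
```

The first table should summarize two cases and the second all five, which shows that no data was lost.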
