Chapter 2: Shifting Gears: Macro Language

As the focus shifts from the SAS language to the macro language, similar principles apply. It pays to expand your knowledge, to experiment, and to keep up with the latest software features, all the while attempting to keep programs as simple as possible. For example, just as the SCAN function can read a list from right to left, the %SCAN function can as well. Revisit this macro parameter from the beginning of Chapter 1:

by_varlist = state county town,

Based on that parameter, a macro must generate this code (both BY statements, and the variable name in LAST.TOWN, are based on &BY_VARLIST):

proc sort data=municipalities;

     by state county town;

run;

data last_one;

     set municipalities;

     by state county town;

     if last.town;

run;

The text substitution is easy (in two places):

   by &by_varlist;

But how does macro language locate the last word in the list to create:

   if last.town;

In years past the %SCAN function had to read from left to right, using %str( ) to indicate that blanks are the only delimiters. The answer looked something like this:

%local i;

%let i = 0;

%do %until (%scan(&by_varlist, &i+1, %str( ))=);

    %let i = %eval(&i + 1);

%end; 

if last.%scan(&by_varlist, &i, %str( ));

The %DO loop continues to increment &I, until the next word in the list is blank. The result: the final value of &I matches the number of words in the list. The last statement retrieves that word.
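To watch the loop at work, one option is to wrap it in a small macro that writes a line to the log on each pass. The macro name %TRACE_LAST_WORD and the %PUT messages are illustrative additions rather than part of the original program, so treat this as a sketch:

```sas
/* Hypothetical tracing macro: same loop as above, plus %PUT statements */
%macro trace_last_word(by_varlist);
   %local i;
   %let i = 0;
   /* %SCAN evaluates &i+1 as an expression; the loop stops when word i+1 is blank */
   %do %until (%scan(&by_varlist, &i+1, %str( ))=);
      %let i = %eval(&i + 1);
      %put Pass &i: word &i is %scan(&by_varlist, &i, %str( ));
   %end;
   %put Final value of i is &i;
%mend trace_last_word;

%trace_last_word(state county town)
```

For the three-word list, the loop makes three passes and &I ends at 3. Because %DO %UNTIL evaluates its condition at the bottom of the loop, the body always runs at least once; how the loop behaves for an empty list is worth an experiment of its own.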

Starting in SAS 8, the %SCAN function can read from right to left, enabling a simpler approach. The first five lines of code vanish, leaving only:

if last.%scan(&by_varlist, -1, %str( ));
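Putting the pieces together, a macro along these lines could generate the whole program from the parameter. The macro name %LAST_ONE, the DATA= and OUT= parameters, and the helper variable &LAST_VAR are assumptions made for illustration; only BY_VARLIST comes from the original example.

```sas
/* A minimal sketch, assuming a data set named MUNICIPALITIES exists */
%macro last_one(data=, out=, by_varlist=);
   %local last_var;
   /* -1 asks %SCAN for the first word counting from the right */
   %let last_var = %scan(&by_varlist, -1, %str( ));

   proc sort data=&data;
      by &by_varlist;
   run;

   data &out;
      set &data;
      by &by_varlist;
      if last.&last_var;   /* resolves to: if last.town; */
   run;
%mend last_one;

%last_one(data=municipalities, out=last_one, by_varlist=state county town)
```

Storing the last word in a local macro variable is a design choice, not a requirement. Placing the %SCAN call directly after LAST. works just as well, as shown above.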

Expanding your arsenal of tools is important. But equally important is experimenting with existing tools. Explore their limits without worrying about failure. To illustrate, consider some simple assignment statements:

%let ten=10;

%let twenty=20;

What would happen if we tried to get the same result this way?

%let ten=%let twenty=20; 10;

It turns out this generates an error message. The software won’t begin a second %LET statement in the middle of an executing %LET statement. But what if we attempt to hide the structure:

data _null_;

   call symput('test', '%let twenty=20; 10');

run;

%let ten=&test;

There is still an error (although the error message changes). What about hiding the interior %LET statement in a macro, like this:

%macro test;

   %let twenty=20; 10

%mend test;

%let ten=%test;

Well, this version works! Executing %TEST assigns 20 as the value for &TWENTY and then generates the text 10 used as the value for &TEN. Why should this version work while the others fail? A few experiments shed some light on the reasons. They begin by assigning values to these macro variables:

%global ten twenty thirty;

data _null_;

   call symput('test1', '%let twenty=20; 10');

   call symput('test2', '%let thirty=30; %let twenty=20; 10');

   call symput('test3', '%let thirty=%let twenty=20; 30; 10');

run;

Outside of a macro definition, all of these statements would generate an error:

%let ten=&test1;   /* resolves to: %let ten=%let twenty=20; 10; */

%let ten=&test2;   /* resolves to: %let ten=%let thirty=30; %let twenty=20; 10; */

%let ten=&test3;   /* resolves to: %let ten=%let thirty=%let twenty=20; 30; 10; */

In every case, macro language interprets the resulting statements as beginning a %LET statement inside a partially completed %LET statement. However, defining macros can erase some of the error conditions. Remember that &TEN, &TWENTY, and &THIRTY are already defined as global variables:

%macro retrieve_test1;

   %let ten=;

   %let twenty=;

   %let thirty=;

   %let ten=&test1;  /* resolves to: %let ten=%let twenty=20; 10; */

%mend retrieve_test1;

%retrieve_test1

%put &ten;                10

%put &twenty;          20

%put &thirty;          

In this case, all the %LET statements execute without error.

Here’s the next experiment:

%macro retrieve_test2;

   %let ten=;

   %let twenty=;

   %let thirty=;

   %let ten=&test2;  /* resolves to: %let ten=%let thirty=30; %let twenty=20; 10; */

%mend retrieve_test2;

%retrieve_test2

%put &ten;               10

%put &twenty;         20

%put &thirty;           30

Once again, the error messages vanish. The macro has embedded multiple %LET statements inside a %LET statement.

Try one final test:

%macro retrieve_test3;

   %let ten=;

   %let twenty=;

   %let thirty=;

   %let ten=&test3;    /* resolves to: %let ten=%let thirty=%let twenty=20; 30; 10; */

%mend retrieve_test3;

%retrieve_test3           Error!

%put &ten;

%put &twenty;

%put &thirty;

Finally, an error message surfaces: Open code recursion detected.

Do these results make sense? Why should the final test generate an error when the first two do not? Any conclusion we arrive at is really a theory, an explanation that we concoct that is consistent with the results of our experiments. Here is one such explanation. Evidently, the process of compiling a macro has an impact: the semicolons that appear within the macro definition get interpreted as ending the statements that appear within the macro. For example, %retrieve_test1 contains this statement:

%let ten=&test1;

By compiling the macro, the software determines that the semicolon it can "see" is the one that ends the %LET statement. Later, executing the macro generates this statement:

%let ten=%let twenty=20; 10;

Still, the software "remembers" that the final semicolon on the line is the one that ends the first %LET statement. It "figures out" that the interior semicolon ends the interior %LET statement.

Executing the macro does not change that interpretation. So executing the macro can add entire %LET statements, each one having its own semicolon, without introducing an error condition. However, executing the macro still cannot add two %LET statements that are embedded within one another. Consider %RETRIEVE_TEST3, which contains this statement:

%let ten=&test3;

During the macro compilation process, the software still determines that the semicolon it can "see" ends the %LET statement. But a problem arises when macro execution generates this statement:

%let ten=%let thirty=%let twenty=20; 30; 10;

The software still "remembers" that the final semicolon on the line ends the first %LET statement. But the interior section of the statement is all new, generated code:

%let thirty=%let twenty=20; 30;

As newly generated code, what should this statement mean? The possibilities are:

•    Assign &TWENTY a value of "20" and &THIRTY a value of "30"

•    Assign &THIRTY a value of "%let twenty=20; 30"

Rather than try to figure out what you meant (or perhaps for other technical reasons that are hidden inside the black box of macro language), the software generates an error. Overall, however, our experimentation has discovered cases where it is possible to embed multiple statements within a %LET statement. Chapter 6 contains an example that utilizes this discovery.

In essence, we now have a theory. As part of the macro compilation process, the software interprets the semicolons it can "see" as ending the statements it can "see". Can we isolate those conditions, and simplify our experiments, to support that theory? Here is just one supporting test:

data _null_;

   call symput ('semicolon', ';');

run;

%put &semicolon;            null

%macro supporting_test;

   %put &semicolon;

%mend supporting_test;

%supporting_test              ;

Twice, this program generates:

%put ;;

The first time, when the generated text appears in open code, the software interprets this as being two statements. The first semicolon ends the %PUT statement. So the %PUT statement writes nothing, and then a null statement appears. The second time, when the macro call generates the %PUT statement, the software interprets the entire generated code as a single statement. The macro compilation process has already determined that the semicolon it can “see” ends the %PUT statement. Therefore, executing the macro causes %PUT to write a semicolon. Chapter 7 contains another example utilizing this discovery. For now, though, consider that our theory might not be limited to semicolons. It might apply to parentheses as well, as in this statement:

%let b = %length (&a);

Would it be possible that this statement could produce different results, depending on whether it appears inside or outside a macro? Here is the macro that will test that theory:

%macro length_test; 

     %let b1 = %length(&balanced);

     %let b2 = %length(&right_only);

%mend length_test;

The tests will run on two incoming strings:

%let balanced = (text)ing;

%let right_only = text)ing;

%let b1 = %length(&balanced);

%let b2 = %length(&right_only); 

%put &b1;            9

%put &b2;            4)ing

%length_test

%put &b1;            9

%put &b2;            4)ing

Evidently, the rules that apply to semicolons don’t apply to right-hand parentheses. In compiling the macro, the software doesn’t determine that the right parenthesis it can “see” is the one that defines the characters that %LENGTH should measure.
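In the same experimental spirit, a natural follow-up is to ask whether a macro quoting function changes the outcome. The sketch below is an addition to the original tests: it uses %SUPERQ, which retrieves a macro variable's value with its special characters masked, and the variable name &B3 is chosen here purely for illustration.

```sas
%let right_only = text)ing;

/* %SUPERQ takes the macro variable name without the ampersand. */
/* The masked right parenthesis should no longer end %LENGTH early. */
%let b3 = %length(%superq(right_only));

%put &b3;
```

If %SUPERQ masks the parenthesis as documented, %LENGTH should measure the entire string text)ing, which is 8 characters long, instead of stopping after the first 4.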

Let’s consider one more experiment. An interactive session can appear unresponsive for a number of reasons. Besides forgetting a final RUN statement, the cause might be:

•    An embedded comment began with /* but was never ended.

•    A macro definition is missing a %MEND statement.

•    There are unmatched quotes (either single or double).

Under any of these circumstances, additional code can be submitted but will not run. The color-coded program editor can help identify some of these conditions. And it would be nicer still if the interactive Run menu were to include an option to clear out all already submitted statements, allowing you to start over. Submitting this magical string might refresh a hung session:

*);*/;/*'*/ /*"*/; %MEND;run;quit;;;;;

But this is a lot to remember. Would it be possible to capture this code as a macro variable? A DATA step would certainly allow it:

call symput('fixit', '*);*/;/*''*/ /*"*/; %MEND;run;quit;;;;;');

However, this is not a viable approach to refreshing a hung session. Issuing this statement won’t be sufficient:

&fixit

If the cause of the hang-up is unbalanced single quotes or an unfinished macro definition, the macro variable won’t resolve. Remember, if all your experiments succeed, it's a sign that you are not experimenting enough!

Besides experimenting and expanding your list of tools, be creative with the ones that you have. Compare these two statements, for example:

%if &total < 0 %then %put WARNING;

%if &total < 0 %then %put WAR%str()NING;

Clearly, both statements generate the same result. However, the second statement is an improvement. Why? Because the log can echo the program's source code, a search of the log for the word WARNING will always find the first statement, whether or not the condition was true. With the second, you only find a WARNING when the logic (&TOTAL < 0) has generated the word WARNING.
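Because %IF statements traditionally have to appear inside a macro definition, trying out this technique means wrapping it in one. The macro name %CHECK_TOTAL, its TOTAL parameter, and the message text are hypothetical, chosen only to make the sketch self-contained:

```sas
/* Hypothetical wrapper: the word WARNING is assembled only at execution time */
%macro check_total(total);
   %if &total < 0 %then %put WAR%str()NING: total is negative (&total);
%mend check_total;

%check_total(-5)   /* condition is true, so the message is written */
%check_total(12)   /* condition is false, so nothing is written    */
```

The source code echoed in the log contains WAR%str()NING, which a search for WARNING does not match. Only the first call should add a line containing the intact word.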

Even a simple comment statement can take on different formats. Compare these two approaches:

%* This is a comment that
   spans two lines.;

%** This is a comment that ;
%** spans two lines.       ;

The second approach has a distinct advantage. It allows text-searching tools to easily locate and/or extract all the comment lines within a program.

Even when using the simplest of tools, tinker. Ponder how they could be used. Understanding what they do is just the starting point.

As we move on to the next set of chapters, we will explore a variety of tools and concepts, both old and new. Experimentation and pushing the tools to the limit will abound throughout.
