Chapter 5: CALL SYMPUT

5.1 Leading and Trailing Blanks

5.2 A Similar Lesson, Using Recursion

5.3 Test Your Skill

5.3.1 Programming Challenge #4a

5.3.2 Solution

5.3.3 Programming Challenge #4b

5.3.4 Solution

5.3.5 Programming Challenge #4c

5.3.6 Solution

5.4 Function Shifts in the Real World

5.5 A Key Issue: Extra Blanks

A macro variable that is only two characters long can take on 65,536 different values. When the length of a macro variable reaches 65,534 characters, the number of values reaches a staggering level: 256 ** 65,534. With so many possible values, some of them would certainly cause trouble if they found their way into a program. And CALL SYMPUT is the best tool for causing such trouble. It can assign any conceivable value to a macro variable. Consider this trouble-making combination:

data _null_;

     call symput('a', 'NOW; THERE WILL be a #@%&PROBLEM');

run;

%let b = &a;

This simple %LET statement resolves into:

%let b = NOW; THERE WILL be a #@%&PROBLEM;

Although %LET successfully assigns NOW to &B, problems arise:

•    The second SAS statement generates an error message because it begins with THERE.

•    The macro processor attempts to resolve %&PROBLEM.

It is easy for CALL SYMPUT to assign a troublesome value. Single quotes around the second parameter turn off macro resolution, giving the DATA step great flexibility to assign a multitude of problematic values. Some problems occur naturally, with no attempt to create a troublesome value. That is the focus of the next section.

5.1 Leading and Trailing Blanks

Consider two simple-looking statements:

%let a =          250;

%let a = Bob         ;

In both cases, the value of &A is three characters long, because %LET ignores leading and trailing blanks. That feature makes this statement potentially useful:

%let a = &a;

It would remove any leading or trailing blanks from the value of &A. How do leading or trailing blanks become part of &A to begin with? CALL SYMPUT could easily generate either scenario above:

call symput('a', '         250');

call symput('a', 'Bob         ');

These statements explicitly include extra blanks within the single quotes. In that sense, they represent an unusual usage of CALL SYMPUT. More commonly, careless usage introduces those extra blanks. The statements below, for example, inadvertently add leading or trailing blanks to &A:

call symput('a', n_recipes);

call symput('a', name);

The first statement automatically performs a numeric-to-character conversion, transforming N_RECIPES into a twelve-character text string with leading blanks. The second statement transfers the full value of NAME, including any trailing blanks. This issue of extra blanks forms a recurring theme throughout this chapter.
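
To see the extra blanks, it helps to print each value between brackets. The following is a minimal sketch rather than part of the original example: the macro variable names A_NUM and A_CHR, the length of NAME, and the data values are all assumptions chosen for illustration.

```sas
data _null_;
   length name $ 12;                 /* assumed length, for illustration only */
   n_recipes = 250;
   name      = 'Bob';
   call symput('a_num', n_recipes);  /* implicit numeric-to-character conversion */
   call symput('a_chr', name);       /* full length of NAME, trailing blanks included */
run;

/* Brackets make leading and trailing blanks visible in the log */
%put [&a_num];
%put [&a_chr];
```

The log should show 250 preceded by blanks and Bob followed by blanks, each inside its brackets. The DATA step also writes a note about the numeric-to-character conversion.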

Other tools would have a lot more trouble assigning leading or trailing blanks. Certainly this would be a legal possibility:

%let a = %str(         250);

However, quoted blanks remain as part of the value of &A. Those quoted blanks would not be removed by coding:

%let a = &a;

5.2 A Similar Lesson, Using Recursion

Take one more troublesome example. Macro language would normally reject this statement as being "recursive":

%let a = %let a = #@%?&;

But CALL SYMPUT has no trouble assigning the intended value:

call symput ('a', '%let a = #@%?&');

Remember, this was our original useful statement:

%let a = &a;

But that statement now generates the "recursive" result:

%let a = %let a = #@%?&;

Note that this combination of statements might function differently when embedded in a macro definition. Chapter 2 illustrated a similar case.

The basic lesson, though, is that the single quotes around the second parameter in CALL SYMPUT can assign a percent sign, a single quote, a semicolon, or any other character to a macro variable. Single quotes allow CALL SYMPUT to transfer a virtually limitless set of text values to a macro variable. Some values from that limitless assortment produce troublesome results when the macro variable resolves.

5.3 Test Your Skill

Did the message sink in? Here is a test to find out. In the set of problems below, assume that the statements are used properly within a macro definition.

5.3.1 Programming Challenge #4a

These statements generate Match #2, but they do not generate Match #1. How could this happen?

%if "&flower" = "rose" %then %put Match #1;

%let flower     = &flower;

%if "&flower" = "rose" %then %put Match #2;

5.3.2 Solution

First, consider these DATA step statements:

if flower = "rose"    then put 'Match #1';

if flower = "rose   " then put 'Match #2';

When either comparison is true, the other must be true as well. When the DATA step compares strings of different lengths, it forces the lengths to match by padding the shorter string with blanks. However, in the macro language, these comparisons are different:

%if "&flower" = "rose"    %then %put Match #1;

%if "&flower" = "rose   " %then %put Match #2;

When one statement produces a match, the other cannot also match. Macro language looks for an exact match on all characters, including both the quotes and the trailing blanks that are within quotes.
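
The contrast between the two languages can be sketched side by side. In this hedged example, the macro name %COMPARE_DEMO, the message text, and the value assigned to FLOWER are assumptions added for illustration.

```sas
/* DATA step: the shorter string is padded with blanks before comparing */
data _null_;
   flower = 'rose';
   if flower = "rose"    then put 'DATA step: Match #1';
   if flower = "rose   " then put 'DATA step: Match #2';
run;

/* Macro language: every character counts, including quoted trailing blanks */
%macro compare_demo;
   %local flower;
   %let flower = rose;
   %if "&flower" = "rose"    %then %put Macro: Match #1;
   %if "&flower" = "rose   " %then %put Macro: Match #2;
%mend compare_demo;

%compare_demo
```

If the rules behave as described above, the DATA step writes both messages, while the macro writes only the first.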

With regard to challenge 4a, suppose CALL SYMPUT creates a variable with trailing blanks:

call symput('flower', 'rose   ');

In that case, the first statement below resolves into the second statement:

%if "&flower" = "rose" %then %put Match #1;

%if "rose   " = "rose" %then %put Match #1;

This comparison does not produce a match. However, the next statement changes &FLOWER by removing the trailing blanks:

%let flower = &flower;

Subsequently, the final statement produces a match:

%if "&flower" = "rose" %then %put Match #2; 
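
For readers who want to run the whole scenario, here is one possible arrangement. The wrapper macro name %CHALLENGE_4A is an assumption, added only so that the %IF statements sit inside a macro definition.

```sas
/* Create &FLOWER with three trailing blanks */
data _null_;
   call symput('flower', 'rose   ');
run;

%macro challenge_4a;
   /* Trailing blanks inside the quotes prevent a match */
   %if "&flower" = "rose" %then %put Match #1;

   /* %LET drops the unquoted trailing blanks */
   %let flower = &flower;

   %if "&flower" = "rose" %then %put Match #2;
%mend challenge_4a;

%challenge_4a
```

Following the explanation above, the log should contain Match #2 but not Match #1.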

Too easy? Try the second problem.

5.3.3 Programming Challenge #4b

This problem is similar, but the %LET statement in the middle changes. Once again, these statements generate Match #2, but they do not generate Match #1. How could this happen?

%if "&flower" = "rose" %then %put Match #1; 

%let leaf = &leaf;

%if "&flower" = "rose" %then %put Match #2;

5.3.4 Solution

Once you have solved challenge 4a, challenge 4b becomes easier. Somehow, changing the value of &LEAF also changes the value of &FLOWER. The simplest variation sets them equal to one another:

data _null_;

     call symput('leaf', 'rose   ');

     call symput('flower', '&leaf');

run;

Within the DATA step, single quotes prevent any attempt to resolve &LEAF. But later references to &FLOWER allow the resolution to take place. So the “before” comparison resolves in two steps:

%if "&flower" = "rose" %then %put Match #1;

Substituting for &FLOWER generates:

%if "&leaf" = "rose" %then %put Match #1;

Next, substituting for &LEAF generates:

%if "rose   " = "rose" %then %put Match #1;

As before, the result is a non-match. Then comes the change:

%let leaf = &leaf;

This statement removes any leading and trailing blanks from &LEAF:

%let leaf = rose   ;

On the final statement, substituting for &FLOWER generates just four characters:

%if "rose" = "rose" %then %put Match #2;

One of the keys to solving the problem is recognizing the power of single quotes within CALL SYMPUT. Single quotes prevent any attempts to resolve macro references, allowing live macro triggers to become part of a macro variable.

Still too easy? The theme continues in the next challenge.

5.3.5 Programming Challenge #4c

Now there is no %LET statement between the two comparisons. So how could these statements generate Match #2 but not Match #1?

%if "&flower" = "rose" %then %put Match #1;

%if "&flower" = "rose" %then %put Match #2;

5.3.6 Solution

In the previous problems, single quotes permitted a live & to become part of a macro variable. In this problem, they permit a live % to become part of a macro variable. Consider this variation:

data _null_;

     call symput('flower', '%leaf');

run;

Could a macro call generate one result the first time and a different result the second time? When the question takes on that form, the problem becomes feasible. This macro fits the bill:

%macro leaf;

     %global ever_done_before;

     %if &ever_done_before=Yes %then rose;

     %else petunia;

     %let ever_done_before=Yes;

%mend leaf;

Unless external code somehow alters &EVER_DONE_BEFORE, this macro generates "petunia" the first time it gets called and "rose" every subsequent time. Under these conditions, both of these %IF conditions would be true:

%if "&flower" = "rose"       %then %put Flower is petunia.;

%if "&flower" = "petunia" %then %put Flower is rose.;
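
Returning to challenge 4c itself, the pieces can be assembled into one sketch. The wrapper macro name %CHALLENGE_4C is an assumption, added only so that the %IF statements run inside a macro definition. The rest repeats the code shown above. The sketch also assumes a fresh session in which &EVER_DONE_BEFORE has not yet been set.

```sas
%macro leaf;
   %global ever_done_before;
   %if &ever_done_before=Yes %then rose;
   %else petunia;
   %let ever_done_before=Yes;
%mend leaf;

/* Single quotes store the macro call itself, unresolved */
data _null_;
   call symput('flower', '%leaf');
run;

%macro challenge_4c;
   /* First reference to &FLOWER calls %LEAF for the first time */
   %if "&flower" = "rose" %then %put Match #1;

   /* Second reference calls %LEAF again, after the switch has flipped */
   %if "&flower" = "rose" %then %put Match #2;
%mend challenge_4c;

%challenge_4c
```

If %LEAF behaves as described, only Match #2 should appear in the log.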

Moreover, it is possible to generate a series of ever-changing values by altering the macro's %LET statement:

%let ever_done_before=%eval(&ever_done_before+1);

The variations are limited only by your imagination. For example, alter the %LEAF definition along these lines:

%if       &ever_done_before=1 %then proc;

%else %if &ever_done_before=2 %then print;

%else %if &ever_done_before=3 %then data;

. . .

%else %if &ever_done_before=10 %then run;

With the complete %IF %THEN logic in place, the macro calls on the left could generate the program on the right:

%leaf %leaf %leaf %leaf %leaf;    proc print data=cities;

%leaf %leaf %leaf %leaf;          var state city pop;

%leaf;                            run;

But coming back to reality for a moment, why would anyone ever do this? Are there any practical applications for a macro that operates differently the first time it gets called? The simplest example might be a single program that:

•    Generates a series of reports

•    Includes data-generated footnotes

•    Numbers the footnotes sequentially

•    Uses a single page at the end of all reports to explain the details of the footnotes

Hypothetically, a macro named %FNOTE could generate all the footnotes. Each call to %FNOTE would have to increment a global counter and then add text to the current report to print the counter in a raised position.
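
A bare-bones sketch of the counting portion might look like the following. The global variable name FNOTE_COUNT is an assumption, and the sketch makes no attempt to print the counter in a raised position or to collect the footnote text for the final page.

```sas
%macro fnote;
   %global fnote_count;

   /* First call: the counter does not exist yet, so start at zero */
   %if %length(&fnote_count)=0 %then %let fnote_count = 0;

   /* Every call: increment, then generate the new footnote number */
   %let fnote_count = %eval(&fnote_count + 1);
   &fnote_count
%mend fnote;

%put First footnote number: %fnote;
%put Second footnote number: %fnote;
```

Each call generates the next number in sequence, so the two %PUT statements should report 1 and then 2 in a session where FNOTE_COUNT has not been set beforehand.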

5.4 Function Shifts in the Real World

Even simple processes might shift how they function. For example, consider a simple INPUT statement:

input amount 8.;

As the software searches through eight characters, it must determine whether the characters it finds form a valid numeric. It follows one set of rules until it finds a nonblank. Then it switches gears and follows a different set of rules (because a plus sign, a minus sign, and an embedded blank are now invalid characters). Once a decimal point is found, the rules change again because another decimal point would not be valid. While this description is not all-inclusive (remember that scientific notation is a possibility), it illustrates how a standard process can change each time it encounters another character.

What about macro language? Does it make sense that a macro might change how it operates from one execution to the next?

Recall this situation from Section 4.2. A program generates 100 pages of output. Most of the time, the analyst wants to see only two of those pages to obtain a few key numbers. But occasionally, the other 98 pages are required for further investigation. In that situation, a good programmer would consider creating two output files. Use the .lst file to hold the two key pages, and create a second output file to hold the other 98 pages of backup information. As the program begins creating backup information, add:

proc printto print="some_other_file" NEW;

run;

Then just before the section creating the two key pages, add:

proc printto;

run;

Finally, once the key two pages have been created, redirect any remaining output:

proc printto print="some_other_file";

run;

Leave out the word "NEW" so that additional output gets appended to the existing file.

Before writing the macro, let’s embellish the objective. Define a second output file that travels with the program. As we saw in Section 4.2, batch jobs can retrieve the full path to the submitted program:

%let path = %sysfunc(getoption(sysin)); 

&PATH is the complete path to the program. Now it’s a simple matter to adjust that path:

%let path = %substr(&path, 1, %length(&path)-3);

By removing the last three characters, &PATH now contains the complete path to the program, minus the letters "sas" at the end. It ends with the dot that used to precede "sas". One final change adds an alternate extension:

%let path = &path.backup;
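
The three steps can be tried outside of batch mode by substituting a literal path for the GETOPTION call. In this sketch, the path /projects/reports/monthly.sas is purely hypothetical.

```sas
/* Hypothetical program path, standing in for %sysfunc(getoption(sysin)) */
%let path = /projects/reports/monthly.sas;

/* Drop the last three characters ("sas"), keeping the dot */
%let path = %substr(&path, 1, %length(&path)-3);

/* The dot after &path only marks the end of the variable name.      */
/* The dot that appears in the result is the one left inside &PATH. */
%let path = &path.backup;

%put &path;
```

The %PUT statement should write /projects/reports/monthly.backup. Note that the dot in &path.backup acts as a delimiter and is not itself added to the value.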

Now the program is ready to create a second output file that automatically follows the program path:

proc printto print="&path" NEW;

run;

Let’s wrap that code in a macro and then return to that concept where the macro should operate differently the first time vs. later times it gets called. First, the macro:

%macro reroute;

     %global path;

     %if %length(&path)=0 %then %do;

          %let path = %sysfunc(getoption(sysin)); 

          %let path = %substr(&path, 1, %length(&path)-3);

          %let path = &path.backup;

     %end;

     proc printto print="&path" new;

     run;

%mend reroute;

A simple command now reroutes subsequent output to the .backup file:

%reroute

Once the program reaches the point where it is ready to create the key pages, redirect subsequent output to the .lst file:

proc printto;

run;

The problem becomes more complex if the program shifts several times, alternating between generating important output vs. backup information. Only the first PROC PRINTTO should add the word NEW. But for all subsequent PROC PRINTTO statements, NEW should be eliminated. That enhancement might look like this:

%macro reroute;

   %global path;

   %if %length(&path)=0 %then %do;

       %let path = %sysfunc(getoption(sysin)); 

       %let path = %substr(&path, 1, %length(&path)-3);

       %let path = &path.backup;

       proc printto print="&path" NEW;

   %end;

   %else %do;

       proc printto print="&path";

   %end; 

   run;

%mend reroute;

NEW is necessary the first time the macro runs. Just in case the same program were to run several times, each run must overwrite the output file from earlier runs. The final version of this macro represents a practical application.

5.5 A Key Issue: Extra Blanks

Many of the CALL SYMPUT issues revolve around the presence of leading or trailing blanks in the second argument. To prevent such issues, programmers often remove the blanks with code along these lines:

call symput('leaf', trim(flower));

call symput('tot_amount', trim(left(put(amount, 12.2))));

Technically, switching from TRIM to STRIP would remove both leading and trailing blanks, and it would eliminate the need for the LEFT function. In the second statement, the PUT function controls the numeric-to-character conversion, eliminating conversion messages that would otherwise appear on the log. To help with these issues, macro language now contains an expanded version of CALL SYMPUT, named CALL SYMPUTX. Some of its key features center around the second argument:

•    If the second argument is character, CALL SYMPUTX automatically removes any leading or trailing blanks.

•    If the second argument is numeric, CALL SYMPUTX makes the numeric-to-character conversion automatically but without generating a conversion message in the log. Again, leading or trailing blanks get removed automatically.

The syntax becomes simpler:

call symputx('leaf', flower);

call symputx('tot_amount', amount);

By allowing CALL SYMPUTX to make the numeric-to-character conversion, the programmer gives up control of the numeric format used to convert to character. But it is easy to take back that control:

call symputx('tot_amount', put(amount, 12.2));

Because the second argument is now character, CALL SYMPUTX still removes leading and trailing blanks. Finally, CALL SYMPUTX contains additional extended features that are unrelated to the second argument. Section 8.4 covers some of those additional features.
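
A side-by-side comparison makes the difference visible. In this sketch, the macro variable names ending in _OLD and _NEW, the length of FLOWER, and the data values are assumptions chosen for illustration.

```sas
data _null_;
   length flower $ 10;               /* assumed length, for illustration only */
   flower = 'rose';
   amount = 1234.5;

   /* CALL SYMPUT keeps whatever blanks the second argument contains */
   call symput ('leaf_old', flower);
   call symput ('amt_old',  put(amount, 12.2));

   /* CALL SYMPUTX removes leading and trailing blanks */
   call symputx('leaf_new', flower);
   call symputx('amt_new',  put(amount, 12.2));
run;

/* Brackets make leading and trailing blanks visible in the log */
%put [&leaf_old] [&leaf_new];
%put [&amt_old] [&amt_new];
```

The _OLD values should appear with blanks inside the brackets (trailing for the flower, leading for the amount), while the _NEW values should fill their brackets exactly.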

CALL SYMPUT plays a role in most macro applications. Always pay attention to leading and trailing blanks, and you’ll have much less debugging to perform.
