Using a Variety of Advanced Functions

The LAG Function

A Brief Overview

Suppose you have the Certadv.Stock6Mon data set that contains opening and closing stock prices for the past six months for two different companies. You are trying to determine which company has the bigger difference in the daily opening price between consecutive days.
To start, consider what you want the LAG function to return, as shown below.
Figure 14.1 Desired LAG Function Results for ABC Company, by Day
Result to include three new variables FirstPrevDay, SecondPrevDay, and ThirdPrevDay.
The LAG function enables you to compare the daily opening prices between consecutive days by retrieving the previous values of a column from the last time that the LAG function executed.

LAG Function Syntax

The LAG function retrieves a value from a previous observation. It is able to do so because the function maintains a queue of the previous values. If you use LAG or LAG1, you are looking for the previous value one row back. LAG2 gives you the previous value two rows back. LAG3 gives you the previous value three rows back, and so on. The LAG function is useful for computing differences between rows and computing moving averages.
Syntax, LAG function:
LAG<n>(column);
n
specifies the number of lagged values.
column
specifies a numeric or character constant, variable, or expression.

Example: Retrieving Previous Values

The following example uses the LAG function to retrieve previous values using assignment statements. The LAG function also creates new variables based on the previous values of Open. Using March 6 as an example, the first previous value is 59.45, the second previous value is 59.53, and the third previous value is 54.37. These values are highlighted in the table below.
Note: For the first observation, there are no previous values to look up, so the assignment statement returns a missing value. For the second observation, there is a first previous value but no second and third previous values, and so on.
data work.stockprev;
   set certadv.Stock6Mon(drop=Close);
   FirstPrevDay=lag1(Open);
   SecondPrevDay=lag2(Open);
   ThirdPrevDay=lag3(Open);
run;
proc print data=work.stockprev;
run; 
Output 14.1 PROC PRINT Output of Work.StockPrev (partial output)
Partial PROC PRINT Output of Work.StockPrev

Example: Calculating a Moving Average

In addition to computing differences between rows, you can calculate a moving average using the LAG function.
Suppose you have stock prices for the Random Company. The data set contains the opening stock price for the first work day of each month. You need to calculate a moving three-month average. Again, consider what you want the LAG function to return, as shown below.
Figure 14.2 Desired LAG Function Results for Random Company, by Month
Result to include three new variables Open1Month, Open2Month, and Open3MonthAvg.
You can use the LAG function to get the stock price for the past two months. The third row is the first row that calculates an average based on three values.
data work.stockavg;
   set certadv.stocks(drop=Close);
   Open1Month=lag1(Open);
   Open2Month=lag2(Open);
   Open3MonthAvg=mean(Open,Open1Month,Open2Month);
   format Open3MonthAvg 8.2;
run;
proc print data=work.stockavg;
run;
Output 14.2 PROC PRINT Output of Work.StockAvg
PROC PRINT Output of Work.StockAvg
Note: The best practice is to create a lagged value in an assignment statement before using it in a conditional statement.

The COUNT/COUNTC/COUNTW Function

A Brief Overview

Suppose you have the Certadv.Slogans data set, which contains numerous slogans that a company can use for its business. You are asked to identify the number of times a specific word, 24/7, was used in a slogan and how many words were in a slogan. The slogans are separated by commas in a row.
You can use the COUNT function to count the number of times a specific word such as 24/7 appears in the slogan, or you can use the COUNTW function to count the number of words in a slogan. You could even use the COUNTC function to count the number of characters in a slogan.
Note: Word is defined as a character constant, variable, or expression.

COUNT/COUNTC/COUNTW Syntax

There are three variations of the COUNT function. Note the slight difference in syntax for the three functions.
Table 14.1 COUNT/COUNTC/COUNTW Syntax
Function Name
Syntax
Function Definition
COUNT
COUNT(string, substring <,modifiers>)
Counts the number of times that a specified substring appears within a character string.
COUNTC
COUNTC(string, character-list <,modifiers>)
Counts the number of characters in a string that appear or do not appear in a list of characters.
COUNTW
COUNTW(string <,delimiters><,modifiers> )
Counts the number of words in a character string.
character-list
specifies a character constant, variable, or expression that initializes a list of characters. COUNTC counts characters in this list, provided that you do not specify the V modifier in the modifier argument. If you specify the V modifier, all characters that are not in this list are counted. You can add more characters to the list by using other modifiers.
delimiters
can be any of several characters that are used to separate words. You can specify the delimiters by using the chars argument, the modifier argument, or both.
modifiers
is a character constant, variable, or expression that specifies one or more modifiers. modifiers is an optional argument.
i or I
ignores the case of the characters. If this modifier is not specified, COUNT counts character substrings only with the same case as the characters in substring.
t or T
trims trailing blanks from string, substring, and chars arguments.
string
specifies a character constant, variable, or expression in which substrings are to be counted.
Tip
Enclose a literal string of characters in quotation marks.
substring
is a character constant, variable, or expression that specifies the substring of characters to search for in string.
Tip
Enclose a literal substring of characters in quotation marks.

Example: Counting the Number of Words

The following example uses the COUNT function to count the number of times 24/7 appears in the Slogans column. The COUNTW function counts the number of words in the Slogans column. The COUNTW function does not specify any delimiter. Therefore, a default list of blank ! $ % & () * + , - . / ; < ^ | is used.
data work.sloganact;
   set certadv.slogans;
   Num24=count(Slogans,'24/7');
   NumWord=countw(Slogans);
run;
proc print data=work.sloganact;
run;
The COUNT function returns the number of times 24/7 appeared in the Slogan column and assigns the value to Num24. Notice that observation 5 contains 24/365. However, this was not counted as a part of the 24/7. If you change the string to search for to 24/, then 24/365 would appear in the Num24 column. The COUNTW function counts the number of words in each slogan and assigns the value to NumWord.
Output 14.3 PROC PRINT Output of Work.SloganAct
PROC PRINT Output of Work.SloganAct

The FIND/FINDC/FINDW Function

A Brief Overview

Suppose you were asked to identify the starting position of the first occurrence of 24/7 in a string. The FIND function finds the starting position of the first occurrence of a substring in a string. Alternatives to the FIND function are the FINDC and FINDW functions, which are also based on finding the first occurrence. The FINDC function returns the starting position where a character from a list of characters is found in a string, and the FINDW function returns the starting position of a word in a string or the number of the word in a string.

FIND/FINDC/FINDW Function Syntax

There are three variations of the FIND function. Note the slight difference in syntax for the three functions.
Table 14.2 FIND/FINDC/FINDW Function Syntax
Function Name
Syntax
Function Definition
FIND
FIND (string, substring <, modifiers><, start-position>);
Searches for a specific substring of characters within a character string.
Returns the starting position where a substring is found in a string.
FINDC
FINDC (string, character-list <, modifiers> <, start-position>);
Searches a string for any character in a list of characters.
Returns the starting position where a character from a list of characters is found in a string.
FINDW
FINDW (string, word<, delimiters><, modifiers> <, start-position>);
Returns the character position of a word in a string, or returns the number of the word in a string.
character-list
is a constant, variable, or character expression that initializes a list of characters. FINDC searches for the characters in this list, provided that you do not specify the K modifier in the modifiers argument. If you specify the K modifier, FINDC searches for all characters that are not in this list of characters. You can add more characters to the list by using other modifiers.
delimiters
can be any of several characters that are used to separate words. You can specify the delimiters by using the chars argument, the modifiers argument, or both.
modifiers
is a character constant, variable, or expression that specifies one or more modifiers.
i or I
ignores the case of the characters. If this modifier is not specified, FIND searches only for character substrings with the same case as the characters in substring.
t or T
trims trailing blanks from the string, word, and chars arguments.
start-position
is a numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction of the search.
string
specifies a character constant, variable, or expression that will be searched for substrings.
Tip
Enclose a literal string of characters in quotation marks.
substring
is a character constant, variable, or expression that specifies the substring of characters to search for in string.
Tip
Enclose a literal substring of characters in quotation marks.
word
is a character constant, variable, or expression that specifies the word to be searched for.

Example: Finding the Word Number

You can use the FINDW function to return the number of the word 24/7 in the Slogans string. The third argument uses a blank to specify the delimiter separating the words in the string. The E modifier tells SAS to count the number of words instead of returning the starting position. The modifiers argument must be positioned after the delimiters argument. The E modifier is just one of the modifiers that can be used.
data work.sloganact;
   set certadv.slogans;
   Num24=count(Slogans,'24/7');
   NumWord=countw(Slogans);
   FindWord24=findw(Slogans,'24/7',' ','e');
run;
proc print data=work.sloganact;
run;
Output 14.4 PROC PRINT Output of Work.SloganAct (partial output)
Partial Output: PROC PRINT Result of Work.SloganAct
Last updated: October 16, 2019
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.217.110.0