Example 9.6 Sorting Variable Values within an Observation

Goal

Sort a series of character variable values and a series of numeric variable values within an observation.

Example Features

Featured Step: DATA step
Featured Step Options and Statements: CALL SORTC and CALL SORTN CALL routines; LARGEST, ORDINAL, and SMALLEST functions
Related Technique: DATA step and ORDINAL function: ORDINAL sort in descending order

Input Data Set

Data set ADSURVEY contains responses from five participants in a survey for a store. The three character variables NOTICED1, NOTICED2, and NOTICED3 list where the respondent saw an ad. The three numeric variables AMOUNT1, AMOUNT2, and AMOUNT3 are the amounts spent (if positive) or refunded (if negative) as recalled by the respondent on recent visits to the store. The NOTICED and the AMOUNT series are stored in no specific order.

                       ADSURVEY

Obs surveyid noticed1 noticed2  noticed3 amount1 amount2 amount3
 1     106   website  flyer     MAIL        300      0       0
 2     153   website  newspaper email       -55      .       .
 3     192   email    ?         email         0      .       .
 4     145   Online   n/a       website      75     95     250
 5     162   insert   on-line               -45      .      10

Resulting Data Set

Output 9.6a SURVEYSORTED Data Set

               Example 9.6 SURVEYSORTED Data Set

 Obs   surveyid    noticed1    noticed2     noticed3   amount1
  1       106       MAIL       flyer        website        0
  2       153       email      newspaper    website        .
  3       192       ?          email        email          .
  4       145       Online     n/a          website       75
  5       162                  insert       on-line        .

 Obs   amount2    amount3    ord_amt1    ord_amt2      ord_amt3
  1        0         300         0            0           300
  2        .         -55         .            .           -55
  3        .           0         .            .             0
  4       95         250        75           95           250
  5      -45          10         .          -45            10

        small_    small_    small_    large_    large_    large_
 Obs     amt1      amt2      amt3      amt1      amt2      amt3
  1         0        0        300       300         0        0
  2       -55        .          .       -55         .        .
  3         0        .          .         0         .        .
  4        75       95        250       250        95       75
  5       -45       10          .        10       -45        .


Example Overview

This example demonstrates how you can use CALL routines and SAS language functions in a DATA step to sort character values and numeric values within an observation. PROC SORT, in contrast, reorders observations according to the values of variables; it does not rearrange values within an observation.

The following DATA step sorts a series of character variable values with CALL SORTC. It sorts a series of numeric variable values with CALL SORTN and the functions LARGEST, ORDINAL, and SMALLEST.

Data set ADSURVEY has responses from five participants in a store survey. A series of character variables NOTICED1, NOTICED2, and NOTICED3 lists the places that the respondent had seen a recent advertisement. The responses are not in any specific order nor are the text values formatted uniformly. A series of numeric variables AMOUNT1, AMOUNT2, and AMOUNT3 shows the respondents' recalled amounts spent (if positive) or amount refunded (if negative) on three visits to the store. The responses are not in any specific order.

The DATA step sorts the values of the NOTICED series of character variables in ascending order by using CALL SORTC. The three variables are not elements of an array.

The DATA step sorts the AMOUNT series of numeric variables four times by using four tools: the CALL SORTN routine and the LARGEST, ORDINAL, and SMALLEST functions. The three AMOUNT variables are defined as elements in the AMOUNT array.

The DATA step places calls to the three functions within an iterative DO loop. The upper bound of the DO loop is the total number of elements in the numeric array AMOUNT whose elements are the variables that are being sorted. The first argument that is supplied to each of the functions is the DO loop index variable I.

On each iteration of the DO loop, the ORDINAL and SMALLEST functions find the ith smallest value in the AMOUNT array. The ORDINAL function includes any missing values in the selection and places missing values lowest in the ordering while SMALLEST ignores any missing values. The LARGEST function finds the ith largest value in the AMOUNT array ignoring missing values.

The three functions do not overwrite the original values of the variables with the same set of values in sorted order. Instead the DATA step creates three new arrays of variables, one for each of the functions.

  • The results of applying the ORDINAL function are stored in the ORD_AMT array.

  • The results of applying the SMALLEST function are stored in the SMALL_AMT array.

  • The results of applying the LARGEST function are stored in the LARGE_AMT array.
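The difference among the three functions is easiest to see on a single observation that contains a missing value. The following is a minimal sketch, not part of the example program: it builds one hypothetical observation with the same AMOUNT values as SURVEYID=162 (-45, missing, 10) and writes the three result arrays to the SAS log. The use of DATA _NULL_, the initial values in the ARRAY statement, and the PUT statements are assumptions made for illustration.

```sas
/* Sketch only: one hypothetical observation whose AMOUNT values   */
/* match SURVEYID=162 in ADSURVEY (-45, missing, 10).              */
data _null_;
  array amount{3} amount1-amount3 (-45 . 10);
  array ord_amt{3} ord_amt1-ord_amt3;
  array small_amt{3} small_amt1-small_amt3;
  array large_amt{3} large_amt1-large_amt3;
  do i=1 to dim(amount);
    /* ith smallest value, counting the missing value as lowest */
    ord_amt{i}=ordinal(i, of amount{*});
    /* ith smallest value among the nonmissing values only */
    small_amt{i}=smallest(i, of amount{*});
    /* ith largest value among the nonmissing values only */
    large_amt{i}=largest(i, of amount{*});
  end;
  /* Write each result array to the log as name=value pairs */
  put (ord_amt1-ord_amt3) (=);
  put (small_amt1-small_amt3) (=);
  put (large_amt1-large_amt3) (=);
run;
```

The log values should correspond to the row for SURVEYID=162 in Output 9.6a: ORDINAL places the missing value first, while SMALLEST and LARGEST return a missing value only in the third position, where no nonmissing value remains to be selected.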

The two CALL routines sort the values of the variables that are specified as their arguments. Unlike the three functions, they overwrite the original values of the variables with the same set of values in sorted order. The CALL routines sort only in ascending order, placing missing values lowest in the sort order.
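Because the CALL routines overwrite their arguments, one way to keep the original values is to copy them into a second array and sort the copy. The following is a minimal sketch under that assumption; the array SORTED_AMT, the sample values, and the use of DATA _NULL_ are hypothetical and do not appear in the example program.

```sas
/* Sketch only: keep the original values by sorting a copy. */
data _null_;
  array amount{3} amount1-amount3 (75 . -55);
  /* Hypothetical work array that receives the sorted copy */
  array sorted_amt{3} sorted_amt1-sorted_amt3;
  do i=1 to dim(amount);
    sorted_amt{i}=amount{i};
  end;
  /* CALL SORTN rearranges SORTED_AMT in place; AMOUNT is untouched */
  call sortn(of sorted_amt{*});
  put (amount1-amount3) (=);
  put (sorted_amt1-sorted_amt3) (=);
run;
```

Following the ordering rules described above, the sorted copy should list the missing value first, followed by the nonmissing values in ascending order, while AMOUNT1 through AMOUNT3 keep their original order.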

Note that the values of the NOTICED variables are mixed-case alphabetic and nonalphabetic characters and that NOTICED3 is missing for SURVEYID=162. Uppercase letters sort before lowercase letters. For example, for SURVEYID=106, the value 'MAIL' is lower in order than the value 'flyer'.
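If the mixed-case ordering is not wanted, one possible approach is to convert the values to a single case before sorting. The following is a minimal sketch of that idea and is not part of the example program: it uses the values for SURVEYID=106, applies the LOWCASE function to each value, and then calls CALL SORTC. The NOTICED array, the common length of 9, and the use of DATA _NULL_ are assumptions made for illustration; the example program itself does not place the NOTICED variables in an array.

```sas
/* Sketch only: normalize case before sorting within the observation. */
data _null_;
  /* Give all three variables the same length before assigning values */
  length noticed1-noticed3 $ 9;
  /* Hypothetical array used only to loop over the three variables */
  array noticed{3} noticed1-noticed3;
  noticed1='website';
  noticed2='flyer';
  noticed3='MAIL';
  do i=1 to dim(noticed);
    noticed{i}=lowcase(noticed{i});
  end;
  /* Sort the lowercase values in place in ascending order */
  call sortc(of noticed1-noticed3);
  put (noticed1-noticed3) (=);
run;
```

With all values in lowercase, 'mail' should sort between 'flyer' and 'website' rather than first. Note that this approach changes the stored text, so it is suitable only when the original capitalization does not need to be kept.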

Note that several of the AMOUNT variable values are missing and a few are negative.

Program

Create data set SURVEYSORTED. Read the observations in ADSURVEY.

Define the array of numeric values that the three functions will sort.

Save the results from applying the three functions, ORDINAL, SMALLEST, and LARGEST, in these arrays.

With ORDINAL, sort AMOUNT in ascending order. Find the ith element in ascending order on each iteration of the DO loop. Include missing values in the sorting process.

With SMALLEST, sort AMOUNT in ascending order. Find the ith element in ascending order on each iteration of the DO loop, ignoring any missing values that are present in the list.

With LARGEST, sort AMOUNT in descending order. Find the ith element in descending order on each iteration of the DO loop, ignoring any missing values that are present in the list.

Sort character variables NOTICED1, NOTICED2, and NOTICED3. Overwrite the original values with the sorted values.

Sort the AMOUNT array. Overwrite the original values with the sorted values.

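data surveysorted;
  set adsurvey;
  /* Numeric values that the three functions sort */
  array amount{3} amount1-amount3;
  /* Arrays that hold the results of ORDINAL, SMALLEST, and LARGEST */
  array ord_amt{3} ord_amt1-ord_amt3;
  array small_amt{3} small_amt1-small_amt3;
  array large_amt{3} large_amt1-large_amt3;
  drop i;
  do i=1 to dim(amount);
    /* ith smallest value, with missing values included and lowest */
    ord_amt{i}=ordinal(i, of amount{*});
    /* ith smallest value, ignoring missing values */
    small_amt{i}=smallest(i, of amount{*});
    /* ith largest value, ignoring missing values */
    large_amt{i}=largest(i, of amount{*});
  end;
  /* Sort NOTICED1-NOTICED3 in place, overwriting the original values */
  call sortc(of noticed1-noticed3);
  /* Sort the AMOUNT array in place, overwriting the original values */
  call sortn(of amount{*});
run;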

Related Technique

You can achieve a descending sort with the ORDINAL function by modifying the index specification on the array element that is receiving the result. The following program applies the ORDINAL function to the AMOUNT array. The results are saved in array REV_AMT. The index value specification for REV_AMT is modified so the elements are arranged in descending order. Output 9.6b shows the elements of the REV_AMT array.

Specify the index value for REV_AMT so that its elements are arranged in descending order.

data revsorted;
  set adsurvey;
  array amount{3} amount1-amount3;
  array rev_amt{3} rev_amt1-rev_amt3;

  drop i;

  do i=1 to dim(amount);
    rev_amt{dim(amount)-i+1}=ordinal(i, of amount{*});
  end;
run;

Output 9.6b REVSORTED Data Set

                               Example 9.6 REVSORTED Data Set

 Obs surveyid noticed1 noticed2  noticed3 amount1 amount2 amount3 rev_amt1 rev_amt2 rev_amt3

  1     106   website  flyer     MAIL        300      0       0      300        0       0
  2     153   website  newspaper email      -55       .       .      -55        .       .
  3     192   email    ?         email        0       .       .        0        .       .
  4     145   Online   n/a       website     75      95     250      250       95      75
  5     162   insert   on-line              -45       .      10       10      -45       .

