Sort the rows in a table in a customized, complex order. Define a new column temporarily to hold the sort order and do not save it in the output table.
Featured Step | PROC SQL |
Featured Step Options and Statements | Subquery CASE and ORDER expressions |
Table EMP_ROSTER contains recent on-call hours for an employee group. Variable HOURS_WEEK is the number of hours the employee has been on-call in the past week. Variable HOURS_4WEEKS is the number of hours the employee has been on-call in the past four weeks.
EMP_ROSTER hours_ hours_ Obs id week 4weeks level 1 4GJBU 0 130 Staff 2 7SFWY 5 43 Staff 3 AT4S3 28 102 Staff 4 BGPP9 18 25 Staff 5 BXPQM 38 152 Staff 6 EYFYX 29 123 Staff 7 GSSMJ 14 56 Staff 8 IRFVM 8 62 Staff 9 L15GM 41 142 Staff 10 XCU73 32 115 Staff 11 Z86ZR 20 40 Staff 12 62EFL 5 105 Lead 13 6EI4N 21 87 Lead 14 7TZ7T 12 31 Lead 15 E2C61 27 54 Lead 16 MVZ8P 14 36 Lead 17 XLFLN 2 28 Lead 18 YELJB 29 44 Lead 19 2G8JY 3 21 Senior 20 63337 14 55 Senior 21 IHLJF 8 25 Senior 22 J4V0J 29 71 Senior 23 URAOV 18 41 Senior 24 IM76W 3 10 Manager 25 KULNC 12 20 Manager
Output 8.6 ONCALL_ROSTER TableExample 8.6 ONCALL_ROSTER Table Created with PROC SQL hours_ hours_ Obs id week 4weeks level 1 7SFWY 5 43 Staff 2 IRFVM 8 62 Staff 3 GSSMJ 14 56 Staff 4 BGPP9 18 25 Staff 5 Z86ZR 20 40 Staff 6 XLFLN 2 28 Lead 7 2G8JY 3 21 Senior 8 62EFL 5 105 Lead 9 IHLJF 8 25 Senior 10 7TZ7T 12 31 Lead 11 63337 14 55 Senior 12 MVZ8P 14 36 Lead 13 AT4S3 28 102 Staff 14 URAOV 18 41 Senior 15 6EI4N 21 87 Lead 16 E2C61 27 54 Lead 17 J4V0J 29 71 Senior 18 YELJB 29 44 Lead 19 XCU73 32 115 Staff 20 4GJBU 0 130 Staff 21 IM76W 3 10 Manager 22 KULNC 12 20 Manager 23 EYFYX 29 123 Staff 24 BXPQM 38 152 Staff 25 L15GM 41 142 Staff |
This example shows you how to rearrange the rows in a table in a customized, complex order with PROC SQL.
The goal of the program is to determine the on-call order for the group of employees in EMP_ROSTER. Complex rules based on the employee's level, hours worked in the past week, and hours worked in the past four weeks determine where to place the employee on the call list. The rules are expressed as the WHEN conditions in a CASE clause that is assigned the alias SORTCOLUMN. The lower the value assigned to SORTCOLUMN, the higher the employee is placed on the call list. In general, employees with level="STAFF" are higher on the list, while employees with level="MANAGER" are lower on the list.
The SORTCOLUMN column is used only to sort the rows in the input table EMP_ROSTER and is not saved in the output table ONCALL_ROSTER.
The step performs a subquery of EMP_ROSTER that defines SORTCOLUMN and returns the sorted rows, including the new column SORTCOLUMN to the outer query. The outer query selects only the columns found in EMP_ROSTER, and it does not select SORTCOLUMN.
This process can be easily accomplished in the DATA step by writing a series of IF-THEN statements to define new variable SORTCOLUMN, and then sorting the observations and not saving SORTCOLUMN in the output data set from PROC SORT. A reason to use PROC SQL is that the creation of the new column, ordering the rows, and saving specific columns can be done in one step.
Create table ONCALL_ROSTER. Specify the outer query. Select only the columns found in EMP_ROSTER. Begin the subquery. Select the columns found in EMP_ROSTER and a new column defined by the following CASE expression. Specify the complex rules to determine the order of the call list. When the combination of LEVEL, HOURS_WEEK, and HOURS_4WEEKS is not present in the WHEN conditions, assign a value of 9 to SORTCOLUMN to put these observations at the bottom of the list.
Select rows from EMP_ROSTER. Assign an alias to EMP_ROSTER. Terminate the subquery. Order the rows in the output table.
proc sql; create table oncall_roster as select emp.* from (select emp.*, case when hours_week gt 35 or hours_4weeks gt 120 then 5 when level='Manager' then 5 when (level='Lead' or level='Senior') and hours_week le 16 then 2 when (level='Lead' or level='Senior') and hours_week between 17 and 24 then 3 when (level='Lead' or level='Senior') and hours_week gt 24 then 4 when level='Staff' and hours_week gt 30 and hours_4weeks gt 40 then 4 when level='Staff' and hours_week gt 30 and hours_4weeks le 40 then 3 when level='Staff' and hours_week le 20 then 1 when level='Staff' and (hours_week between 21 and 30) and hours_4weeks le 40 then 1 when level='Staff' and (hours_week between 21 and 30) and hours_4weeks gt 40 then 2 else 9 end as sortcolumn from emp_roster as emp) order by sortcolumn, hours_week; quit;
18.226.98.166