AVG, MEAN
|
mean or average of values
|
COUNT, FREQ, N
|
number of nonmissing
values
|
CSS
|
corrected sum of squares
|
CV
|
coefficient of variation
(percent)
|
MAX
|
largest value
|
MIN
|
smallest value
|
NMISS
|
number of missing values
|
PRT
|
probability of a greater
absolute value of student's t
|
RANGE
|
range of values
|
STD
|
standard deviation
|
STDERR
|
standard error of the
mean
|
SUM
|
sum of values
|
T
|
student's t value
for testing the hypothesis that the population mean is zero
|
USS
|
uncorrected sum of squares
|
VAR
|
variance
|
Syntax, GROUP BY clause:
PROC SQL <options>;
SELECT column-1 <,...column-n>
FROM input-tables
WHERE expression
GROUP
BY column-name <,column-name>;
QUIT;
|
proc sql; select membertype, sum(milestraveled) as TotalMiles /*1*/ from certadv.frequentflyers group by membertype; /*2*/ quit;
1 | The SUM function totals the values of the MilesTraveled column to create the TotalMiles column. |
2 | The GROUP BY clause groups the data by the values of MemberType. |
Summary Function Behavior
|
Calculation
|
Sample Code and Result
|
|
---|---|---|---|
specifies one column
as argument
|
performed down the column
|
proc sql;
select sum(boarded), sum(transferred),sum(nonrevenue)
as Total
from certadv.marchflights;
quit;
|
|
specifies multiple columns
as arguments
|
performed across columns
for each row
|
proc sql;
select sum(boarded,transferred,nonrevenue)
as Total
from certadv.marchflights;
quit;
|
GROUP BY Clause Presence
|
PROC SQL Behavior
|
Example
|
---|---|---|
is not present in the
query
|
applies the function
to the entire table
|
proc sql;
select jobcode, avg(salary)
as AvgSalary
from certadv.payrollmaster;
quit;
|
is present in the query
|
applies the function
to each group specified in the GROUP BY clause
|
proc sql; select jobcode, avg(salary) as AvgSalary from certadv.payrollmaster group by jobcode; quit; If a query contains
a GROUP BY clause, all columns in the SELECT clause that do not contain
a summary function should typically be listed in the GROUP BY clause.
Otherwise, unexpected results might be returned.
|
SELECT Clause Contents
|
PROC SQL Behavior
|
Example
|
---|---|---|
contains summary functions
and no columns outside summary functions
|
calculates a single
value by using the summary function for the entire table. However,
if groups are specified in the GROUP BY clause, for each group it
combines the information into a single row of output for the entire
table.
|
proc sql;
select avg(salary)
as AvgSalary
from certadv.payrollmaster;
quit;
|
contains summary functions
and additional columns outside summary functions
|
calculates a single
value for the entire table. However, if groups are specified, for
each group it displays all rows of output within the single grouped
value. If the data in the table is grouped by more than one value,
then grouped values are repeated.
|
proc sql; select EmpId, jobcode, dateofhire, avg(salary) as AvgSalary from certadv.payrollmaster group by jobcode; quit; |
title 'Average Salary for All Employees';
proc sql;
select avg(salary) as AvgSalary
from certadv.payrollmaster;
quit;
proc sql;
select sum(boarded, transferred, nonrevenue) as Total
from certadv.marchflights;
quit;
proc sql;
select jobcode, avg(salary) as AvgSalary
from certadv.payrollmaster;
quit;
54079.62
).NOTE: The query requires remerging summary statistics back with the original data.
proc sql;
select jobcode, avg(salary) as AvgSalary format=dollar11.2
from certadv.payrollmaster
group by jobcode;
quit;
Form for COUNT
|
Result
|
Example
|
---|---|---|
COUNT(*)
|
the total number of
rows in a group or in a table
|
select count(*) as Count |
COUNT(column)
|
the total number of
rows in a group or in a table for which there is a nonmissing value
in the selected column
|
select count(jobcode) as Count |
COUNT(DISTINCT column)
|
the total number of
unique values in a column
|
select count(distinct jobcode) as Count |
proc sql; select count(*) as Count from certadv.payrollmaster; quit;
proc sql; select substr(jobcode,1,2) /*1*/ label='Job Category', count(*) as Count /*2*/ from certadv.payrollmaster group by 1; /*3*/ quit;
1 | The first column, which is labeled JobCategory, is created by using the SAS function SUBSTR. The SUBSTR function extracts the two-character job category from the existing JobCode field. |
2 | The second column, Count, is created by using the COUNT function. |
3 | The GROUP BY clause specifies that the results are to be grouped by the first defined column, which is referenced by 1 because the column was not assigned a name. |
proc sql;
select count(distinct jobcode) as Count
from certadv.payrollmaster;
quit;
proc sql;
select distinct jobcode
from certadv.payrollmaster;
quit;
3.135.199.27