The PROC SQL SELECT Statement

A Brief Overview

The SELECT statement retrieves and displays data. It consists of a SELECT clause and several optional clauses that can be used within the SELECT statement. Each clause begins with a keyword and is followed by one or more components. The optional clauses name the input data set, subset, group, or sort the data.
A PROC SQL step that contains one or more SELECT statements is referred to as a PROC SQL query. The SELECT statement is only one of several statements that can be used with PROC SQL.

SELECT Statement Syntax

The SELECT statement is the primary tool of PROC SQL. Using the SELECT statement, you can identify, manipulate, and retrieve columns of data from one or more tables and views. The SELECT statement must contain a SELECT clause and a FROM clause, both of which are required in a PROC SQL query.
Syntax, SELECT statement:
PROC SQL <options>;
SELECT column-1 <,...column-n>
FROM input-table
<WHERE clause>
<GROUP BY clause>
<HAVING clause>
<ORDER BY clause>
;
QUIT;
When you construct a SELECT statement, you must specify the clauses in the following order:
  • The SELECT clause selects columns.
  • The FROM clause selects one or more source tables or views.
  • The WHERE clause enables you to filter your data.
  • The GROUP BY clause enables you to process data in groups.
  • The HAVING clause works with the GROUP BY clause to filter grouped results.
  • The ORDER BY clause specifies the order of the rows.

Example: Selecting Columns

To specify which columns to display in a query, write a SELECT clause. After the keyword SELECT, list one or more column names and separate the column names with commas. The SELECT clause specifies existing columns and can create columns. The existing columns are already stored in a table.
The following SELECT clause specifies the columns EmpID, JobCode, Salary, and bonus. The columns EmpID, JobCode, and Salary are existing columns. The column named Bonus is a new column. The column alias appears as a column heading in the output and matches the case that you used in the SELECT clause.
proc sql;
   select empid, jobcode, salary, salary*.06 as bonus
      from certadv.payrollmaster
      where salary<32000
      order by jobcode;
quit;
Output 1.1 PROC SQL Query Result
PROC SQL Query Result

Example: Displaying All Columns Using SELECT *

Use an asterisk (*) in the SELECT clause to display all columns in the order in which they are stored in a table. All rows are displayed, by default, unless you limit or subset them.
The following SELECT statement displays all columns and rows in the table Certadv.Staffchanges. Certadv.Staffchanges lists all employees in a company who have had changes in their employment status.
proc sql;
   select *
      from certadv.staffchanges;
quit;
Output 1.2 PROC SQL Query Result of SELECT *
PROC SQL Query Result of SELECT *

Example: Using the FEEDBACK Option

When you specify SELECT * you can use the FEEDBACK option in the PROC SQL statement to write the expanded list of columns to the SAS log. For example, the PROC SQL query shown below contains the FEEDBACK option:
proc sql feedback;
   select *
      from certadv.staffchanges;
quit;
The following is written to the SAS log.
Log 1.1 SAS Log
NOTE: Statement transforms to:

        select STAFFCHANGES.EmpID, STAFFCHANGES.LastName, STAFFCHANGES.FirstName,
STAFFCHANGES.City, STAFFCHANGES.State, STAFFCHANGES.PhoneNumber
          from CERTADV.STAFFCHANGES;
The FEEDBACK option is a debugging tool that lets you see exactly what is being submitted to the SQL processor. The resulting message in the SAS log expands asterisks (*) into column lists. It also resolves macro variables and places parentheses around expressions to show their order of evaluation.

Example: Creating a New Column

You can create new columns that contain either text or a calculation. New columns appear in output, along with any existing columns that are selected. The new columns exist only for the duration of the query, unless a table or a view is created.
To create a new column, include any valid SAS expression in the SELECT clause list of columns. You can assign a column alias, a name, to a new column by using the keyword AS followed by the name that you would like to use.
Note: A column alias must follow the rules for SAS names.
In the following example, an expression is used to calculate the new column, Bonus. The value of the new column is Salary multiplied by 0.06. The keyword AS is used to assign the column alias Bonus to the new column.
proc sql;
   select empid, jobcode, salary, salary*.06 as bonus
      from certadv.payrollmaster
      where salary<32000
      order by jobcode;
quit;
Output 1.3 PROC SQL Query Result: New Column – Bonus
PROC SQL Query Result: New Column – Bonus
A column alias is useful because it enables you to reference the column elsewhere in the query. The column alias appears as a column heading in the output and matches the case that you used in the SELECT clause. You can specify a label for an existing or a new column in the SELECT clause. If both a label and a column alias are specified for a new column, the label is displayed as the column heading in the output[1]. If only a column alias is specified, specify the column alias exactly as you want it to appear in the output.

Example: Eliminating Duplicate Rows from Output

You can use the DISTINCT keyword to eliminate duplicate rows. The DISTINCT keyword applies to all columns, and only those columns, that are listed in the SELECT clause.
proc sql;
   select distinct flightnumber, destination
      from certadv.internationalflights;
quit;
Note: The DISTINCT keyword is identical to UNIQUE. Although the UNIQUE keyword is identical to DISTINCT, it is not an ANSI standard.
Output 1.4 PROC SQL Query Result: Unique Values
PROC SQL Query Result: Unique Values
Last updated: October 16, 2019
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.188.110.106