Querying Data

One of the main purposes of a database is to have a repository or a data storage system where information can be extracted quickly. The SQL statement used to extract information from tables in a database is the SELECT statement.

The SELECT Statement

The SELECT statement is used to extract information from the database or, in other words, to ask questions (or queries) to the database.

The clauses or elements of the SELECT statement are FROM, WHERE, ORDER BY, GROUP BY, HAVING, TOP, and INTO. The only element that is always required for queries is the FROM clause, which is used to specify the table or view from which to extract information.

The basic syntax of SELECT is

SELECT column_list
FROM Table_name

When you issue a SELECT statement using this syntax, all rows are returned to the client because there are no restrictions (the query doesn't have a WHERE clause).

The output of the SELECT statement is a result set that is composed of rows that come from one or more tables or views (working with multiple tables at the same time using JOIN is covered in Chapter 5, "Querying Multiple Tables: JOINs" ). If you want to get all columns of a table in a SELECT statement, use the * wildcard instead of specifying the whole column list. However, if you want only certain columns to appear in the output, these specific columns must be specified in the column list.

Listing 4.1 shows how to query a table using the * wildcard and using a column list. Notice that in both cases the query returns all rows on the table without restrictions, but the second one shows only certain columns.

Code Listing 4.1. Using a Basic SELECT Statement
						
USE Northwind

SELECT *
FROM Shippers

SELECT ShipperID,CompanyName
FROM Shippers
GO

ShipperID   CompanyName                              Phone
----------- ---------------------------------------- ------------------------
1           Speedy Express                        (503) 555-9831
2           United Package                        (503) 555-3199
3           Federal Shipping                      (503) 555-9931

(3 row(s) affected)

ShipperID   CompanyName
----------- ----------------------------------------
1          Speedy Express
2          United Package
3          Federal Shipping

(3 row(s) affected)

Notice that the SELECT statement can be used by itself when printing constants or values of variables. Also, SELECT is used to assign values to variables; similar to SET, which is used for the same purpose, but you can assign only one variable for each SET statement. On the other hand, you can assign values to more than one variable using one SELECT statement. In these cases (variable assignment and output), SELECT doesn't need a FROM clause. Listing 4.2 demonstrates how to use SELECT and SET to assign values to variables, and then it uses SELECT to show these values.

Code Listing 4.2. Using SELECT and SET to Assign Values to Variables and to Show These Values
						
DECLARE @firstname VARCHAR(10), @middlename VARCHAR(10), @lastname VARCHAR(10)
SET @firstname = 'Maria'
SELECT @middlename = 'Angelica', @lastname = 'Rojas'

SELECT @firstname, @middlename, @lastname
GO
---------- ---------- ----------

Maria      Angelica    Rojas

In the column list of a SELECT statement, you also can include constants (or literals), which appear as new columns in the result set. Furthermore, columns can be concatenated (using the + string concatenation operator) to form a new column. These two techniques can be useful when populating tables using SELECT .. INTO, to calculate values, and to build scripts dynamically.

Listing 4.3 contains two queries. The first one has a constant ('The name of the table is: ') and a column (the name of the table that is extracted from the INFORMATION_SCHEMA.TABLES view). Notice that in the output of the first query, the constant appears as the first column. The second query uses + to concatenate two strings (a constant and a column) and generates one string (or a column resulting from the concatenation). This query generates a script as output that can be used later.

Code Listing 4.3. Using Constants and Concatenating Strings in the Column List of SELECT
						
USE Northwind

SELECT 'The name of the table is: ', table_name
FROM INFORMATION_SCHEMA.TABLES
WHERE table_type = 'base table'

SELECT 'DROP TABLE '+ table_name
FROM INFORMATION_SCHEMA.TABLES
WHERE table_type = 'base table'
GO

-- partial results are shown

                           table_name
-------------------------- ----------------------------
The name of the table is:  Cars
The name of the table is:  Categories
The name of the table is:  CategoriesBudget

(20 row(s) affected)


-----------------------------------------

DROP TABLE Cars
DROP TABLE Categories
DROP TABLE CategoriesBudget

(20 row(s) affected)

When concatenating columns, make sure that the data type of the column is character. Otherwise, use CONVERT or CAST to change it to character data to be able to use the concatenation operator. Listing 4.4 illustrates how to use CAST with a column whose data type is MONEY to change it to VARCHAR to be able to concatenate it with other columns and constants.

Code Listing 4.4. Using String Concatenation and CAST in SELECT Statements
						
USE Northwind

SELECT 'The cost per unit of '+ productname + 'is '+
       CAST(unitprice as VARCHAR(10))
FROM Products
GO

-- partial results are shown

-----------------------------------------------------------
The cost per unit of Chai is 18.00
The cost per unit of Chang is 19.00
The cost per unit of Aniseed Syrup is 10.00
The cost per unit of Chef Anton's Cajun Seasoning is 22.00
The cost per unit of Chef Anton's Gumbo Mix is 21.35

The DISTINCT clause is used to eliminate duplicates in a result set. For example, the Employees table has more than one person with the same title. If you want to display all possible values of this column, you will get repeated data, but if DISTINCT is used in the SELECT clause, only unique values will be listed. Listing 4.5 shows the difference between a query without DISTINCT and another one with it.

Code Listing 4.5. Using DISTINCT to Remove Duplicate Rows from a Result Set
						
USE Northwind

SELECT title
FROM Employees

SELECT DISTINCT title
FROM Employees
GO

title
------------------------------
Sales Representative
Vice President, Sales
Sales Representative
Sales Representative
Sales Manager
Sales Representative
Sales Representative
Inside Sales Coordinator
Sales Representative

(9 row(s) affected)

title
------------------------------
Inside Sales Coordinator
Sales Manager
Sales Representative
Vice President, Sales

(4 row(s) affected) 

In SELECT statements, the IDENTITYCOL keyword can be used instead of the name of an IDENTITY column. For example, the Shippers table has a column with the IDENTITY property, shipperid. Therefore, when referencing this column in a SELECT statement, you can use either its name or the IDENTITYCOL keyword, which appears in Listing 4.6.

Code Listing 4.6. Using IDENTITYCOL Instead of the IDENTITY Column Name
						
USE Northwind

SELECT shipperid
FROM Shippers

SELECT IDENTITYCOL
FROM Shippers
GO

shipperid
-----------
1
2
3

(3 row(s) affected)

ShipperID
-----------
1
2
3
.

(3 row(s) affected)

Column Aliases

You can use aliases to change default column names. Sometimes, assigning labels or aliases to the columns in a SELECT statement can be beneficial because

  • There is more than one column with the same name—This usually happens when you're working with more than one table (using JOINs) and they have a column with the same name. In this case, it is beneficial to use column aliases to differentiate between these columns.

  • The column is the result of a calculation, becoming an expression—In these cases, SQL Server doesn't assign a column name to these kinds of columns.

A column alias is specified using the following syntax:

column_name AS alias_name

The AS keyword is optional; therefore, the column name can be followed by the alias name. Also, the alias name can contain up to 128 characters.

Caution

The alias name must be enclosed in single quotation marks or brackets if it contains spaces.


Listing 4.7 shows how to use column aliases.

Code Listing 4.7. Using Column Aliases
							
USE Northwind


SELECT productname + '('+ quantityperunit + ')'as product_quantities,
       unitsinstock + unitsonorder units
FROM Products
GO

partial results shown

product_quantities                                              units 
--------------------------------------------------------------- ------
Chai (10 boxes x 20 bags)                                       39
Chang (24 - 12 oz bottles)                                      57
Aniseed Syrup (12 - 550 ml bottles)                             83
Chef Anton's Cajun Seasoning (48 - 6 oz jars)                   53
Chef Anton's Gumbo Mix (36 boxes)                               0
Grandma's Boysenberry Spread (12 - 8 oz jars)                   120
Uncle Bob's Organic Dried Pears (12 - 1 lb pkgs.)               15
Northwoods Cranberry Sauce (12 - 12 oz jars)                    6

The FROM Clause

You use the FROM clause to specify the tables or views involved in a query. In the case of multiple tables, the type of JOIN and the JOIN condition are also specified in the FROM clause. Listing 4.8 shows a SELECT statement that retrieves information from two tables, Territories and Region (Chapter 5 goes over queries that involve multiple tables using JOINs).

Code Listing 4.8. Using the FROM Clause to Specify the Tables from Which Data Will Be Retrieved
						
USE Northwind

SELECT Territories.territorydescription, Region.regiondescription
FROM Territories JOIN Region
ON Territories.regionid = Region.regionid
GO

partial results shown

territorydescription     regiondescription
------------------------ ------------------
Westboro                 Eastern
Bedford                  Eastern
Georgetown                Eastern
Boston                   Eastern
Cambridge                Eastern
Braintree                Eastern

Tables in other databases can be referenced in the FROM clause if you qualify them with the database name and the owner (the last is optional). Further more, if you're working with linked servers (which are covered in Chapter 15, "Working with Heterogeneous Environments: Setting Up Linked Servers" ), you can access tables in those servers, but in this case you must qualify the table with the name of the server, and then the database name (or catalog) and owner (or schema). Listing 4.9 illustrates this situation, retrieving data stored in the Authors table in Pubs, from the Northwind database.

Code Listing 4.9. Using the FROM Clause to Specify Tables in Other Databases
						
USE Northwind

SELECT au_fname + ''+ au_lname AS name
FROM Pubs..Authors
GO

name
----------------------------------
Abraham Bennet
Reginald Blotchet-Halls
Cheryl Carson
Michel DeFrance
Innes del Castillo
Ann Dull
Marjorie Green
Morningstar Greene
Burt Gringlesby
Sheryl Hunter
Livia Karsen
Charlene Locksley
Stearns MacFeather
Heather McBadden
Michael O'Leary
Sylvia Panteley
Albert Ringer
Anne Ringer
Meander Smith
Dean Straight
Dirk Stringer
Johnson White
Akiko Yokomoto

(23 row(s) affected) 

A maximum of 256 tables can be referenced in a SELECT statement. If you have a query that requires extracting information from more than 256 tables, use temporary tables or derived tables to store partial results.

Table Aliases

You can use table aliases to make queries more readable, adding a label to a table (usually an identifier that is shorter than the name of the table), and using this label to reference the table in the rest of the query. Generally, table aliases are useful when you are writing queries that involve multiple tables (joining tables).

A table alias is specified using the following syntax (similar to column aliases):

Table_name AS alias_name

Notice that the AS keyword can be omitted. Therefore, specifying the table name followed by the alias name is also valid. Listing 4.10 shows a query similar to the one shown in Listing 4.8, but this one uses table aliases, which are used to qualify the columns in the column list and in the JOIN condition.

Code Listing 4.10. Using Table Aliases
							
USE Northwind

SELECT T.territorydescription, R.regiondescription
FROM Territories T JOIN Region R
ON T.regionid = R.regionid
GO
partial results shown

territorydescription     regiondescription
------------------------ ------------------
Westboro                 Eastern
Bedford                  Eastern
Georgetown               Eastern
Boston                   Eastern
Cambridge                Eastern
Braintree                Eastern

Caution

If an alias is specified for a table, it must be used in the rest of the query—the name of the table cannot be used.


The WHERE Clause

You have learned how to query a table (retrieving all rows) using the SELECT statement and the FROM clause. Generally, you must restrict the number of rows that a query returns; therefore, only rows that meet certain criteria or conditions will be part of the result set of the query. The WHERE clause restricts the result set of a query based on a search condition. As a result, just the rows that meet the search condition will be returned by the query. The syntax of a query that contains a WHERE clause is

SELECT column_list
FROM Table_name
WHERE conditions

Listing 4.11 shows a SELECT statement that retrieves the lastname, firstname, and hiredate of the employees who live in Seattle.

Code Listing 4.11. Using the WHERE Clause to Restrict the Output of a Query
						
USE Northwind

SELECT lastname, firstname, hiredate
FROM Employees
WHERE city = 'seattle'
GO

lastname             firstname  hiredate
-------------------- ---------- ---------------------------
Davolio              Nancy      1992-05-01 00:00:00.000
Callahan             Laura      1994-03-05 00:00:00.000

(2 row(s) affected)

In Transact-SQL, operators are used to work with expressions. Because a WHERE clause contains one or more expressions to restrict the output of a query, all operators covered in Chapter 2 can be used in WHERE clauses. These are LIKE, NOT LIKE, IN, NOT IN, BETWEEN, NOT BETWEEN, and comparison operators (=, <>, <, >, <=, and >=). Listing 4.12 demonstrates how these operators are used in queries.

Code Listing 4.12. Using Operators in WHERE Clauses
						
USE Northwind

-- Returns all employees whose last name begins with 'b'
SELECT lastname, firstname
FROM Employees
WHERE lastname LIKE 'b%'

-- Returns all employees who don't live in Seattle, Redmond or Tacoma
SELECT lastname, firstname, city
FROM Employees
WHERE city NOT IN ('seattle','redmond','tacoma')

-- Returns all employees that were hired between 1/1/1993 and 12/31/1993
SELECT lastname, firstname, hiredate
FROM Employees
WHERE hiredate BETWEEN '1993.1.1'AND '1993.12.31'

-- Returns all employees that live in any other city than London
SELECT lastname, firstname, city
FROM Employees
WHERE city <> 'london'.
GO

lastname             firstname
-------------------- ----------
Buchanan             Steven

(1 row(s) affected)

lastname             firstname  city
-------------------- ---------- ---------------
Leverling            Janet      Kirkland
Buchanan             Steven     London
NewFamily            Michael    London
King                 Robert     London
Dodsworth            Anne       London

(5 row(s) affected)

lastname             firstname  hiredate
-------------------- ---------- -----------------------------
Peacock              Margaret   1993-05-03 00:00:00.000
Buchanan             Steven     1993-10-17 00:00:00.000
NewFamily            Michael    1993-10-17 00:00:00.000

(3 row(s) affected)

lastname             firstname  city
-------------------- ---------- ---------------
Davolio              Nancy      Seattle
Fuller               Andrew     Tacoma
Leverling            Janet      Kirkland
Peacock              Margaret   Redmond
Callahan             Laura      Seattle.

(5 row(s) affected)

In a WHERE clause, many expressions can be combined using the AND and OR logical operators. Therefore:

  • If AND is used, the rows returned by the query will be the ones that meet all the search conditions.

  • On the other hand, if OR is used, the result set will contain the rows that meet any of the search conditions.

An example of how these logical operators (AND and OR) are used in the WHERE clause that appears in Listing 4.13.

Code Listing 4.13. Combining Expressions in the WHERE Clause Using AND and OR
						
USE Northwind

-- Returns all employees whose last name begins with 'b'
-- and don't live in Seattle, Redmond or Tacoma
SELECT lastname, firstname, city
FROM Employees
WHERE lastname LIKE 'b%'
AND city NOT IN ('seattle','redmond','tacoma')

-- Returns all employees that either:
-- were hired between 1/1/1993 and 12/31/1993
-- or live in any other city than London
SELECT lastname, firstname, city, hiredate
FROM Employees
WHERE hiredate BETWEEN '1993.1.1'AND '1993.12.31'.
OR city <> 'london'
GO

lastname             firstname  city
-------------------- ---------- ---------------
Buchanan             Steven     London

(1 row(s) affected)

lastname             firstname  city            hiredate
-------------------- ---------- --------------- -----------------------------
Davolio              Nancy      Seattle         1992-05-01 00:00:00.000
Fuller               Andrew     Tacoma          1992-08-14 00:00:00.000
Leverling            Janet      Kirkland        1992-04-01 00:00:00.000
Peacock              Margaret   Redmond         1993-05-03 00:00:00.000
Buchanan             Steven     London          1993-10-17 00:00:00.000
NewFamily            Michael    London          1993-10-17 00:00:00.000
Callahan             Laura      Seattle         1994-03-05 00:00:00.000

(7 row(s) affected)

When comparing DATETIME values in WHERE clauses, be aware that this data type stores both date and time. Hence, if you want to compare just the date portion of the whole value, use the CONVERT function to get just the portion you want. For example, if you need to retrieve all orders posted on 7/4/1996 no matter the time they were posted, you can use the query shown in Listing 4.14.

Code Listing 4.14. Comparing DATETIME Values in WHERE Clauses
						
USE Northwind

SELECT orderid, customerid, employeeid, orderdate
FROM Orders
WHERE CONVERT(VARCHAR(20),orderdate,102) = '1996.07.04'
GO

orderid     customerid employeeid  orderdate
----------- ---------- ----------- ----------------------------
10248       VINET      5           1996-07-04 00:00:00.000

(1 row(s) affected)

Caution

NULL values should be handled with care when comparing them in WHERE clauses. Specifically, use IS NULL or IS NOT NULL, according to each case, to check for NULL values, and avoid using comparison operators with NULL values—for example, 'column_name = NULL'—because their behavior depends on the SET ANSI_NULLS setting.


Listing 4.15 shows an example of how IS NULL and IS NOT NULL are used within expressions in a WHERE clause to search for NULL values.

Code Listing 4.15. Using IS NULL and IS NOT NULL to Make Comparisons with NULL Values
						
USE Northwind

-- Retrieves all suppliers whose region doesn't have a NULL value
SELECT companyname, contactname, region
FROM Suppliers
WHERE region IS NOT NULL

-- Retrieves all suppliers whose region is NULL (or unknown)
SELECT companyname, contactname, region
FROM Suppliers
WHERE region IS NULL
GO

-- the output has been simplified

companyname                              contactname                    region
---------------------------------------- ------------------------------ ------
New Orleans Cajun Delights               Shelley Burke                  LA
Grandma Kelly's Homestead                Regina Murphy                  MI
Cooperativa de Quesos 'Las Cabras'      Antonio del Valle Saavedra     Asturia

(9 row(s) affected)
.
companyname                              contactname                    region
---------------------------------------- ------------------------------ ------
Exotic Liquids                           Charlotte Cooper               NULL
Tokyo Traders                            Yoshi Nagase                   NULL
Mayumi's                                 Mayumi Ohno                    NULL.

(21 row(s) affected)

Multiple expressions and the ISNULL function can be used in a WHERE clause as an elegant solution for queries that contain optional search fields. For example, suppose you want to search for employees based on city, title, or both. Two variables can be created to store the value of the city or title to search for (@city and @title). If a variable equals NULL—for example, @city—this means that you are searching for a specific title (which is stored in the @title variable). If both variables are NULL, you want to retrieve all rows in the table.

Usually, to solve this problem, you can validate each variable and create a query accordingly. These are the possible cases:

  • If just the city is used (@title equals NULL), build a query that just searches for employees who live in this city.

  • If just the title is used (@city equals NULL), build a query that just searches for employees whose title is the one stored in the @title variable.

  • If both values are used (city and title), build a query with two expressions in the WHERE clause, and connect these two expressions with an AND operator.

Listing 4.16 shows how to code these three queries (each of them is based on the value of the @title and @city variables). In this example, the @title variable is set to NULL and @city is set to London to retrieve all employees who live in London.

Code Listing 4.16. Using Optional Search Fields with Different Queries
						
USE Northwind

DECLARE @title VARCHAR(60), @city VARCHAR(30)

-- Setting @title to NULL and searching for all employees
-- who live in London

SET @title = NULL
SET @city = 'London'

IF @title IS NOT NULL AND @city IS NULL

  SELECT lastname, firstname, title, city
  FROM Employees
  WHERE title = @title

IF @title IS NULL AND @city IS NOT NULL

  SELECT lastname, firstname, title, city
  FROM Employees
  WHERE city = @city

IF @title IS NOT NULL AND @city IS NOT NULL

  SELECT lastname, firstname, title, city
  FROM Employees
  WHERE city = @city
  AND title = @title..

GO

lastname             firstname  title                          city
-------------------- ---------- ------------------------------ ---------------
Buchanan             Steven     Sales Manager                  London
NewFamily            Michael    Sales Representative           London
King                 Robert     Sales Representative           London
Dodsworth            Anne       Sales Representative           London

(4 row(s) affected)

However, as stated before, you can build just one query using the ISNULL function to validate each variable, and one expression per variable; thus, solving the problem of the optional search fields with just a query and not using any IF statements. This query appears in Listing 4.17 (notice that it has the same output as the preceding example).

Code Listing 4.17. Using Optional Search Fields Using One Query
						
USE Northwind
DECLARE @title VARCHAR(60), @city VARCHAR(30)

-- Setting @title to NULL and serarching for all employees
-- who live in London

SET @title = NULL
SET @city = 'London'

SELECT lastname, firstname, title, city
FROM Employees
WHERE city = ISNULL(@city,city)
AND title = ISNULL(@title,title) .
GO

lastname             firstname  title                          city
-------------------- ---------- ------------------------------ ---------------
Buchanan             Steven     Sales Manager                  London
NewFamily            Michael    Sales Representative           London
King                 Robert     Sales Representative           London
Dodsworth            Anne       Sales Representative           London.

(4 row(s) affected)

Caution

Be aware that IS NULL is different from the ISNULL function. The IS NULL clause is used to make comparisons with NULL values, whereas ISNULL is a function that takes two arguments. If the first one is NULL, it returns the second one; otherwise, the first argument is returned.


Data Aggregation and the GROUP BY Clause

One of the benefits of the SQL language is that it enables you to generate summaries of the data stored in a database. Sometimes, data as a whole might not make sense, but when summarized, it can be used for many purposes.

Transact-SQL provides aggregate functions, which are used to generate summary values. Basically, they return a single value based on a calculation on a set of values. Table 4.1 shows the most common aggregate functions used in Transact-SQL.

Table 4.1. Transact-SQL's Aggregate Functions
Aggregate Function Description
AVG Returns the average or arithmetic mean.
COUNT Returns the number of values (an INT data type). COUNT(*) can be used to return the number of rows in a group (the group can be the whole table, obtaining the number of rows in the table).
COUNT_BIG Similar to COUNT in that it returns the number of values, but it returns a BIGINT data type. BIG_COUNT(*) can be used to return the number of rows in a group (the group can be the whole table, obtaining the number of rows in the table).
MAX Returns the maximum value.
MIN Returns the minimum value.
SUM Returns the sum of the values. Takes only numbers as arguments.

Listing 4.18 indicates how these aggregate functions are used to return summary values based on the values on the whole table.

Code Listing 4.18. Using Aggregate Functions
						
USE Northwind

-- Returns the average of unitsinstock
SELECT AVG(unitsinstock)
FROM Products

-- Returns the number of rows in the Employees table
SELECT COUNT(*)
FROM Employees

-- Returns the price of the most expensive product
SELECT MAX(unitprice)
FROM Products

-- Returns the birthdate of the oldest employee
SELECT MIN(birthdate)
FROM Employees

-- Returns the number of products in stock
SELECT SUM(unitsinstock)
FROM Products
GO

-----------
40
-----------
9

---------------------
263.5000

--------------------------------
1937-09-19 00:00:00.000:

-----------
3119

The DISTINCT keyword can be used in any aggregate function to consider repeating values just once. For example, to retrieve how many different titles the Employees table has, you can use the COUNT aggregate function with the DISTINCT keyword, as shown in Listing 4.19. In this case, the DISTINCT keyword is needed because more than one employee has the same title, and you want to count each title once to see how many different titles are in this table.

Code Listing 4.19. Using DISTINCT in Aggregate Functions
						
USE Northwind

SELECT COUNT(DISTINCT title)
FROM Employees
GO

-----------
4

You use the GROUP BY clause to group rows in a result set, generating a summary row for each group of data. All columns specified in SELECT must also be specified in GROUP BY. However, columns specified in the GROUP BY clause don't have to be in the SELECT column list.

To illustrate, Listing 4.20 shows an example that retrieves the number of employees per title. SQL Server generates a row per each title (this is the column specified in the GROUP BY clause) and counts the number of rows per title.

Code Listing 4.20. Using the GROUP BY Clause
						
USE Northwind

SELECT title, COUNT(*)
FROM Employees
GROUP BY title
GO
title
------------------------------ -----------
Inside Sales Coordinator       1
Sales Manager                  1
Sales Representative           6
Vice President, Sales          1

(4 row(s) affected)

It might be necessary to generate a summary row for a table (just one row and not a row for each group). In this case, because it is just one group (the whole table), use aggregate functions without the GROUP BY clause, as previously shown in Listing 4.18. Moreover, you can use more than one aggregate function in the same query. For example, to get the most recent date in which an order was placed, and the minimum orderid in the Orders table, use the query shown in Listing 4.21.

Code Listing 4.21. Summarizing Data
						
USE Northwind


SELECT MAX(orderdate), MIN(orderid)
FROM orders
GO
--------------------------- -----------

1998-05-06 00:00:00.000     10248

(1 row(s) affected)

If there's a WHERE clause in the query, it must be specified before the GROUP BY clause. SQL Server evaluates the WHERE clause first, and then it generates the groups based on the columns specified in GROUP BY. For example, to retrieve the number of customers in Spain and Venezuela, use the query shown in Listing 4.22.

Code Listing 4.22. Restricting the Groups Generated by GROUP BY
						
USE Northwind


SELECT country, COUNT(*)
FROM Customers
WHERE country IN ('Spain','Venezuela')
GROUP BY country
GO

country
--------------- -----------
Spain           5
Venezuela       4

(2 row(s) affected)

Tip

As a new feature of SQL Server 2000, BIT columns can be used in a GROUP BY clause. This was a limitation of GROUP BY in previous versions.


The use of column aliases is recommended when working with aggregate functions, because when any function is applied to a column, the result set doesn't show the original name of the column. Listing 4.23 shows an example of column aliases when using aggregate functions.

Code Listing 4.23. Using Column Aliases and Aggregate Functions
						
USE Northwind

SELECT country, COUNT(*) AS [number of customers]
FROM Customers
WHERE country IN ('Spain','Venezuela')
GROUP BY country
GO

country         number of customers
--------------- -------------------
Spain           5
Venezuela       4

(2 row(s) affected)

The HAVING Clause

When using GROUP BY in a query to generate groups, you might want to set restrictions on these groups. Specifically, the HAVING clause sets restrictions on the groups generated by GROUP BY. HAVING is similar to WHERE in the sense that it restricts the output of the query, but HAVING is evaluated by SQL Server after the groups are generated.

It's important to know that WHERE is evaluated first, then groups are generated (as a result of GROUP BY), and finally, the HAVING clause is evaluated. Therefore, aggregate functions cannot be referenced in the WHERE clause; they can be referenced only in the HAVING clause.

Listing 4.24 retrieves the number of customers of the countries that have more than five customers. This is done by setting a restriction after the groups are generated (using a HAVING clause); hence, showing only the countries that have more than five customers.

Code Listing 4.24. Setting Restrictions on the Groups Generated by GROUP BY Using HAVING
							
USE Northwind

SELECT country, COUNT(*) AS [number of customers]
FROM Customers
GROUP BY country
HAVING COUNT(*) > 5
GO

country         number of customers
--------------- -------------------
Brazil          9
France          11
Germany         11
UK              7
USA             16

(5 row(s) affected)

Similar to WHERE, multiple conditions can be specified in the HAVING clause, combining them with a logical operator (OR or AND). Listing 4.25 shows how conditions can be combined in a HAVING clause.

Code Listing 4.25. Combining Conditions in a HAVING Clause
							
USE Northwind


SELECT country, COUNT(*) AS [number of customers]
FROM Customers
GROUP BY country
HAVING COUNT(*) > 5
AND COUNT(*) < 10
GO

country         number of customers
--------------- -------------------
Brazil          9
UK              7

(2 row(s) affected) 

The ORDER BY Clause

A table comprises a set of rows, and a set, by definition, is unordered. Therefore, when retrieving data from tables, SQL Server doesn't guarantee the order of the rows in the result set. This is because SQL Server might optimize the query in a different way each time it is executed, depending on the data; resulting in a different order of the rows each time the same query is executed. To guarantee a specific order in a result set, use the ORDER BY clause. Listing 4.26 retrieves information from the Shippers table ordered by company name in ascending order (this is the default in SQL Server).

Code Listing 4.26. Using ORDER BY to Guarantee the Order of Rows
						
USE Northwind

SELECT companyname, phone
FROM Shippers
ORDER BY companyname
GO

companyname                              phone
---------------------------------------- ------------------------
Federal Shipping                         (503) 555-9931
Speedy Express                           (503) 555-9831
United Package                           (503) 555-3199

(3 row(s) affected)

You can include more than one column in the ORDER BY clause, and you also can specify how these values will be sorted, either ascending (using the ASC keyword), which is the default, or descending (using the DESC keyword). If more than one column is specified in the ORDER BY clause, SQL Server sorts the result set in the order in which these columns appear (first, the first column, then the second column, and so on). Listing 4.27 shows how to specify multiple columns and how to order them (either ascending or descending) in the ORDER BY clause.

Code Listing 4.27. Using Multiple Expressions in the ORDER BY Clause
						
USE Northwind

SELECT lastname, firstname
FROM Employees
ORDER BY lastname ASC, firstname DESC
GO

lastname             firstname
-------------------- ----------
Buchanan             Steven
Callahan             Laura
Davolio              Nancy
Dodsworth            Anne
Fuller               Andrew
King                 Robert
Leverling            Janet
NewFamily            Michael
Peacock              Margaret

(9 row(s) affected) 

Tip

As discussed in previous chapters, use TOP if you want to specify the ORDER BY clause when creating a view.


The TOP N Clause

TOP is used to limit the results of a query. It can be used in two ways: to retrieve the first N rows or to retrieve the first N percent of the rows in the result set. The TOP clause must be used along with ORDER BY; otherwise, SQL Server doesn't guarantee a specific ordering, and the TOP clause will be meaningless.

TOP returns the least significant values if they are sorted in ascending order. On the other hand, TOP retrieves the most significant values if they are sorted in descending order. For example, to retrieve the most expensive products, use a TOP clause and an ORDER BY clause sorting the unitprice column in descending order, as shown in Listing 4.28.

Code Listing 4.28. Limiting the Output of a Query Using the TOP Clause
						
USE Northwind

SELECT TOP 10 productid, productname, unitprice
FROM Products
ORDER BY unitprice DESC

SELECT TOP 1 PERCENT productid, productname, unitprice
FROM Products
ORDER BY unitprice DESC
GO

productid   productname                              unitprice
----------- ---------------------------------------- ------------ 
38          Côte de Blaye                      263.5000
29          Thüringer Rostbratwurst             123.7900
9           Mishi Kobe Niku                          97.0000
20          Sir Rodney's Marmalade                   81.0000
18          Carnarvon Tigers                         62.5000
59          Raclette Courdavault                     55.0000
51          Manjimup Dried Apples                    53.0000
62          Tarte au sucre                           49.3000
43          Ipoh Coffee                              46.0000
28          Rössle Sauerkraut                   45.6000
(10 row(s) affected)

productid   productname                              unitprice
----------- ---------------------------------------- -----------
38          Côte de Blaye                      263.5000

(1 row(s) affected)

Caution

If you're concerned about portability, be careful when using TOP because it is not ANSI standard. Instead, it is a feature of Transact-SQL.


The argument of TOP is a positive integer in either case (percent or fixed number of rows).

Caution

The argument of the TOP clause must be an integer; it cannot be a variable. If you want to use a variable, use dynamic queries (EXEC or sp_executesql).


In previous versions of SQL Server (6.5 and earlier), the only way to limit the result set of a query was by using SET ROWCOUNT, which stops the processing of the query when it reaches the number of rows specified by SET ROWCOUNT.

Be aware that TOP is more efficient than SET ROWCOUNT because TOP is evaluated at parse time, not at execution time like SET ROWCOUNT. Another disadvantage of using SET ROWCOUNT is that it remains set until you execute SET ROWCOUNT 0 to reset it to its original behavior (all rows are returned when executing a query). When SET ROWCOUNT is enabled, it also affects modification operations (INSERT, UPDATE, and DELETE). Listing 4.29 demonstrates the usage of SET ROWCOUNT (notice that the result set is equivalent to the one shown in Listing 4.28).

Code Listing 4.29. Using SET ROWCOUNT
						
USE Northwind

-- Use SET ROWCOUNT 10 to limit the output of all queries to 10 rows
SET ROWCOUNT 10

SELECT productid, productname, unitprice
FROM Products
ORDER BY unitprice DESC

-- Use SET ROWCOUNT 0 to reset it to its original state (all rows are returned)
SET ROWCOUNT 0
GO

productid   productname                              unitprice
----------- ---------------------------------------- ----------------
38          Côte de Blaye                      263.5000
29          Thüringer Rostbratwurst             123.7900
9           Mishi Kobe Niku                          97.0000
20          Sir Rodney's Marmalade                   81.0000
18          Carnarvon Tigers                         62.5000
59          Raclette Courdavault                     55.0000
51          Manjimup Dried Apples                    53.0000
62          Tarte au sucre                           49.3000
43          Ipoh Coffee                              46.0000
28          Rössle Sauerkraut                   45.6000

(10 row(s) affected)

Caution

If you use SET ROWCOUNT, don't forget to execute SET ROWCOUNT 0 to turn this setting off; otherwise, it remains set during the connection, affecting all subsequent queries.


Use the WITH TIES keyword of the TOP clause when you want to include ties in the result set. If WITH TIES is specified, the result set may contain more rows than the number of rows specified in the TOP clause because all ties would be included. For example, Listing 4.30 shows a query that retrieves the top six units in stock. Notice that seven rows are returned because there's a tie in the sixth position, and the query returns all ties (two in this case).

Code Listing 4.30. Using WITH TIES in TOP Clauses
						
USE Northwind

SELECT TOP 6 WITH TIES productid, productname, unitsinstock
FROM Products
ORDER BY unitsinstock DESC
GO

productid   productname                              unitsinstock
----------- ---------------------------------------- ------------
75          Rhönbräu Klosterbier           125
40          Boston Crab Meat                         123
6           Grandma's Boysenberry Spread             120
55          Pâté chinois                115
61          Sirop d'érable                    113
33          Geitost                                  112
36          Inlagd Sill                              112

(7 row(s) affected) 

Using Dynamic Queries

In some situations, you might want to parameterize queries using variables to specify, for example, the table to query. However, some elements cannot be specified dynamically in queries, such as the table name and column names. In these specific cases, dynamic queries might be beneficial. Specifically, there are two ways to execute dynamic queries: using EXEC (or EXECUTE), and using the sp_executesql system stored procedure. These two ways are listed in Listing 4.31.

Caution

The string (a dynamic query) that is passed as an argument to sp_executesql must be a Unicode string (to specify Unicode strings, use the N prefix when building the string).


Code Listing 4.31. Dynamically Generating and Executing Queries Using EXEC and sp_executesql
						
USE Northwind

DECLARE @tablename VARCHAR(20), @query NVARCHAR(100)
SET @tablename = 'Shippers'
SET @query = N'SELECT * FROM '+ @tablename

-- Executing the dynamic query using EXEC
EXEC (@query)

-- Executing the dynamic query using sp_executesql
EXEC sp_executesql @query
GO

ShipperID   CompanyName                              Phone
----------- ---------------------------------------- ------------------------
1           Speedy Express                           (503) 555-9831
2           United Package                           (503) 555-3199
3           Federal Shipping                         (503) 555-9931

ShipperID   CompanyName                              Phone
----------- ---------------------------------------- ------------------------
1           Speedy Express                           (503) 555-9831
2           United Package                           (503) 555-3199
3           Federal Shipping                         (503) 555-9931

(3 row(s) affected)

The following are the disadvantages of using dynamic queries:

  • The statements inside EXEC or sp_executesql are executed inside its own batch; therefore, these statements cannot access variables declared in the outside batch.

  • If the query to be executed by EXEC is not similar enough to a previously executed query due to different format, values, or data types, SQL Server cannot reuse a previously executed query plan. However, sp_executesql overcomes this limitation, allowing SQL Server to reuse the execution plan of the query (because it can be cached in memory).

Tip

Use sp_executesql whenever possible when executing dynamic queries, because the plan has a better chance of being reused.


Sometimes the dynamic query is very long and it becomes illegible. In these cases, you can use a variable to store the entire string and then use this variable as the argument of EXEC or sp_executesql, as shown in Listing 4.31. Also, you might want to insert carriage returns (using CHAR(13)) in the query to make it more legible (in case you want to display it). Listing 4.32 indicates how to insert carriage returns in a dynamic query.

Code Listing 4.32. Inserting Carriage Returns When Building Dynamic Queries
						
USE Northwind

DECLARE @query NVARCHAR(100)
SET @query = N'SELECT * '+ CHAR(13)+ 'FROM Shippers'

-- To display the query (which has a carriage return)
SELECT @query

-- Executing the dynamic query
EXEC sp_executesql @query
GO

---------------------------

SELECT * FROM Shippers

(1 row(s) affected)

ShipperID   CompanyName                              Phone
----------- ---------------------------------------- ------------------------
1           Speedy Express                           (503) 555-9831
2           United Package                           (503) 555-3199
3           Federal Shipping                         (503) 555-9931

(3 row(s) affected) 

Caution

In SQL Server, EXECUTE can be used for three different purposes: to execute dynamic queries, to execute stored procedures, and to assign execute permissions to users on stored procedures (using GRANT, DENY, or REVOKE). The difference between executing a stored procedure and a dynamic statement using EXECUTE is that the first one doesn't need to be enclosed in parentheses, whereas the dynamic statement does.


There are some security issues when dynamic statements are executed inside a stored procedure. Usually, to be able to execute a stored procedure, a user just needs to have EXECUTE permissions on the stored procedure. However, if a dynamic query is used, the user also needs permissions on every object referenced by the dynamic query. This is because the dynamic query is not parsed until the stored procedure is executed, and SQL Server must check permissions on every object referenced by the dynamic query.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.191.240.222