HOUR 7. Building SQL Statements Based on Multiple Tables

Now that you’ve learned the basics of the T-SQL language, you’re ready to move on to more advanced techniques. To really take advantage of what T-SQL has to offer, you must know how to return recordsets that contain data based on multiple tables. In this hour you’ll learn:

What Join Types Are Available and How to Use Them

How to Output Data Based on Multiple Tables

How to Group Your Query Output

About the Aggregate Functions Available to You

What the Having Clause Is and When to Use It

How to Union Data from Multiple Tables

How to Build Subqueries


By the Way

The examples that follow are based on a database called Northwind which is included on the sample code CD. Instructions as to how to install the Northwind database are included in Appendix A.


Working with Join Types

When you build a system based on normalized table structures, you must join the tables back together to see the data in a useable format. For example, if you have separated customers, orders, and order details, you need to join these tables in a query to see the name of the customer who placed an order for a particular item. Several types of joins are available. They include inner joins, outer joins, full joins, and self-joins. The sections that follow cover each of these join types.

Using Inner Joins

An inner join is the most common type of join. When you use an inner join, only rows on the one side of the relationship that have matching rows on the many side of the relationship are included in the output. Here’s an example:

SELECT Customers.CustomerID,
     Customers.CompanyName, Orders.OrderID,
     Orders.OrderDate

     FROM Customers
     INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID

This example includes only those customers who have orders.


By the Way

The word OUTER is assumed in the LEFT JOIN clause used when building a left outer join.


At times you need to join more than two tables in a SQL statement. The most common syntax is

FROM table1 JOIN table2 ON condition1 JOIN table3 ON condition2

The following example joins the Customers, Orders, and OrderDetails tables:

SELECT Customers. CustomerID, Customers.CompanyName,
     Orders.OrderID, Orders.OrderDate

     FROM (Customers
     INNER JOIN Orders
     ON Customers.CustomerID = Orders.CustomerID)
     INNER JOIN [Order Details]
     ON Orders.OrderID = [Order Details].OrderID

In the example, the order of the joins is unimportant. The exception to this is when you combine inner and outer joins. When you combine inner and outer joins, the SQL Server engine applies two specific rules. First, the nonpreserved table in an outer join cannot participate in an inner join. The nonpreserved table is the one whose rows may not appear. In the case of a left outer join from Customers to Orders, the Orders table is considered the nonpreserved table. Therefore, it cannot participate in an inner join with OrderDetails. The second rule is that the nonpreserved table in an outer join cannot participate with another nonpreserved table in another outer join.

Creating Outer Joins

An outer join enables you to include rows from one side of the join in the output, regardless of whether matching rows exist on the other side of the join. Two types of outer joins exist: left outer joins and right outer joins. With a left outer join, SQL Server includes in the output all rows in the first table specified in the SELECT statement. Here’s an example:

SELECT Customers.CustomerID,
     Customers.CompanyName, Orders.OrderID,
     Orders.OrderDate

     FROM Customers
     LEFT OUTER JOIN Orders ON Customers.CustomerID = Orders.CustomerID

In the previous example, customers are included regardless of whether they have orders. With the right outer join shown next, orders are included whether or not they have associated customers. If you have properly enforced referential integrity, this scenario should never exist.

SELECT Customers.CustomerID,
     Customers.CompanyName, Orders.OrderID,
     Orders.OrderDate

     FROM Customers
     RIGHT OUTER JOIN Orders ON Customers.CustomerID = Orders.CustomerID

Utilizing Full Joins

A full join combines the behavior of the left and right outer joins. It looks like this:

SELECT Customers.CustomerID,
     Customers.CompanyName, Orders.OrderID,
     Orders.OrderDate

     FROM Customers
     FULL JOIN Orders ON Customers.CustomerID = Orders.CustomerID

In this example, all customers appear in the output regardless of whether they have orders, and all orders appear in the output whether or not they are associated with customers.

Taking Advantage of Self-Joins

A self-join involves joining a table to itself. Although it is not the most common type of join, this join type is very valuable. Imagine the scenario in which an Employee table contains a field called EmployeeID and another field called ReportsTo. The ReportsTo field must contain a valid EmployeeID. It would not make sense to have separate Employee and Supervisor tables because supervisors are employees. This is where the self-join comes in. A self-join looks like this:

SELECT Employees.EmployeeID, Employees.LastName, Employees.FirstName,
    Supervisors.EmployeeID
 as SupervisorID,
    Supervisors.LastName
 as SupervisorLastName,
    Supervisors.FirstName
 as SupervisorFirstName
    FROM Employees INNER JOIN Employees as Supervisors
    ON Employees.ReportsTo = Supervisors.EmployeeID

In this example, the EmployeeID from the Employees table is joined to an alias of the ReportsTo field of an alias of the Employees table (called Supervisors). The resulting employee and supervisor information is output from the query (see Figure 7.1).

Figure 7.1. A SELECT statement that shows the result of joining the Employee table to itself.

Image

Working with the GROUP BY Clause

You can use the GROUP BY clause to calculate summary statistics. The syntax of the GROUP BY clause is

GROUP BY group-by-expression1 [,group-by-expression2 [,...]]

You use the GROUP BY clause to dictate the fields on which SQL Server groups the query result. When you include multiple fields in a GROUP BY clause, they are grouped from left to right. SQL Server automatically outputs the fields in the order designated in the GROUP BY clause. In the following example, the SELECT statement returns the country, city, and total freight for each country/city combination. The results are displayed in order by country and city (see Figure 7.2):

SELECT Customers.Country, Customers.City,
    Sum(Orders.Freight) AS SumOfFreight

    FROM Customers
    INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    GROUP BY Customers.Country, Customers.City

Figure 7.2. A SELECT statement that returns the country, city, and total freight for each country/city combination.

Image

The GROUP BY clause indicates that SQL Server doesn’t display the detail for the selected records. Instead, it displays the fields indicated in the GROUP BY uniquely. One of the fields in the SELECT statement must include an aggregate function. SQL Server displays this result of the aggregate function along with the fields specified in the GROUP BY clause.

Including Aggregate Functions in Your SQL Statements

You use aggregate functions to summarize table data. The aggregate functions available include COUNT, COUNT_BIG, SUM, AVG, MIN, and MAX. The following sections discuss each of these aggregate functions. You can find additional aggregate functions in the Books Online.

Using the COUNT Function

You use the COUNT function to count the number of rows in a table. It looks like this:

SELECT COUNT(*) AS CountOfCustomers FROM Customers

The example counts the number of rows in the Customers table (see Figure 7.3).

Figure 7.3. A SELECT statement that counts the number of rows in the Customers table.

Image

As an alternative, you can count values in a particular column. The SQL statement looks like this:

SELECT COUNT(Region) AS CountOfCustomers FROM Customers

This example counts the number of regions found in the Customers table (see Figure 7.4).

Figure 7.4. A SELECT statement that counts the number of regions found in the Customers table.

Image

Working with the COUNT_BIG Function

The COUNT_BIG function is identical to the COUNT function, except that it returns a bigint data type. It looks like this:

SELECT COUNT_BIG(Region) AS CountOfCustomers FROM Customers

Exploring the SUM Function

The SUM function is available only for numeric columns. It adds the data in the columns. Here’s an example:

SELECT SUM(Freight) FROM Orders

The example totals the Freight column for all rows in the Orders table. When used with the GROUP BY clause, the SUM function can easily total values for each grouping.

SELECT ShipVia, SUM(Freight) FROM Orders GROUP BY ShipVia

The example totals the freight for each shipper (see Figure 7.5).

Figure 7.5. A SELECT statement that totals the freight for each shipper found in the Orders table.

Image

Working with the AVG Function

Just as you can easily total data, you can average data. The following statement finds the average freight for all orders in the Orders table:

SELECT AVG(Freight) FROM Orders

When used with the GROUP BY clause, the AVG function can easily average values in each grouping.

SELECT ShipVia, AVG(Freight) FROM Orders GROUP BY ShipVia

The result provides the average freight for each shipper (see Figure 7.6).

Figure 7.6. A SELECT statement that provides the average freight for each shipper found in the Orders table.

Image

Using the MIN Function

Another important aggregate function is MIN. You use the MIN function to find the minimum value in a column. This statement finds the minimum freight in the Orders table:

SELECT MIN(Freight) FROM Orders

When used with the GROUP BY clause, the MIN function can easily find the minimum values in each grouping.

SELECT ShipVia, MIN(Freight) FROM Orders GROUP BY ShipVia

The result provides the minimum freight for each shipper (see Figure 7.7).

Figure 7.7. A SELECT statement that provides the minimum freight for each shipper found in the Orders table.

Image

Using the MAX Function

A related aggregate function is MAX. You use the MAX function to find the maximum value in a column. This statement finds the maximum freight in the Orders table (see Figure 7.8):

SELECT MAX(Freight) FROM Orders

Figure 7.8. You use the MAX function to find the maximum value in a column.

Image

When used with the GROUP BY clause, the MAX function can easily find the maximum values in each grouping.

SELECT ShipVia, MAX(Freight) FROM Orders GROUP BY ShipVia

The result provides the maximum freight for each shipper (see Figure 7.9).

Figure 7.9. When used with the GROUP BY clause, the MAX function can easily find the maximum values in each grouping.

Image

Taking Advantage of the HAVING Clause

A HAVING clause is similar to a WHERE clause, but it differs in one major respect: SQL Server applies it after it summarizes the data rather than beforehand. In other words, the WHERE clause is used to determine which rows are grouped. The HAVING clause determines which groups are included in the output. A HAVING clause looks like this:

HAVING expression1 [{AND|OR} expression2[...]]

In the following example, SQL Server applies the criteria > 1000 after it applies the aggregate function SUM to the grouping. Therefore, SQL Server includes only country/city combinations with total freight greater than 1000 in the output (see Figure 7.10).

Figure 7.10. A SELECT statement that includes country/city combinations with total freight greater than 1000 in the output.

Image

SELECT Customers.Country, Customers.City,
    Sum(Orders.Freight
) AS SumOfFreight
    FROM Customers
    INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    GROUP BY Customers.Country, Customers.City
    HAVING (((Sum(Orders.Freight))>1000))

Exploring the Power of Union Queries

Union queries enable you to combine rather than join data from more than one table. A typical example of a union query is one that combines data from a Products table and a DiscontinuedProducts table. Another example is a query that combines data from a Customers table and a CustomerArchive table. Here’s an example of a union query:

SELECT ProductID, ProductName, UnitPrice FROM Products
UNION ALL
SELECT ProductID, ProductName, UnitPrice FROM DiscontinuedProducts

This example outputs all rows from the Products table as well as from the DiscontinuedProducts table (see Figure 7.11).

Figure 7.11. A SELECT statement that outputs all rows from the Products table as well as from the DiscontinuedProducts table.

Image


By the Way

The Northwind database does not contain a DiscontinuedProducts table. To follow along with this example, you need to create a table with the same structure as the Products table, except the primary key field should not be an IDENTITY column.


If you want to order the results, you must place the ORDER BY statement after the second SELECT statement:

SELECT ProductID, ProductName, UnitPrice FROM Products
UNION ALL
SELECT ProductID, ProductName, UnitPrice FROM DiscontinuedProducts
ORDER BY UnitPrice DESC

In this example, SQL Server combines the results of both SELECT statements in descending order by UnitPrice.

Working with Subqueries

A subquery is a query that SQL Server evaluates before it evaluates the main query. Here’s an example:

SELECT CustomerID, CompanyName, City, Country FROM Customers WHERE
    Customer ID Not In(Select CustomerID FROM Orders)

In this example, SQL Server executes the statement that selects data from the Orders table before it evaluates the statement that selects data from the Customers table (see Figure 7.12).

Figure 7.12. A SELECT statement that selects data from the Orders table before it evaluates the statement that selects data from the Customers table.

Image

This is not a very efficient method of accomplishing the task of finding all the customers without orders. A better solution would be to use an outer join to solve this problem. You could modify the SQL statement to look like this:

SELECT Customers.CustomerID, CompanyName, City, Country
    FROM Customers LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    WHERE Orders.CustomerID Is Null

This example uses a left outer join to select all customers who do not have orders. Because this uses a left outer join, customers are included whether or not they have orders. Because the criteria designate that only rows with a null CustomerID appear in the output, only Customers without orders are included (see Figure 7.13).

Figure 7.13. A SELECT statement that uses a left outer join to select all customers who do not have orders.

Image

Summary

In a world of normalized data, it is important that you understand how to use queries to join your tables back together. This chapter showed you how to join table data. You learned about the various join types available to you. You also learned how to group data, and how to work with aggregate functions. You learned about the HAVING clause, a clause necessary for working with aggregated data. Finally, you learned about two special types of queries: union queries and subqueries.

Q&A

Q

Why must you join tables together in a query?

A

When you build a system based on normalized table structures, you must join the tables back together to see the data in a useable format.

Q

Explain what a full join is.

A

A full join combines the behavior of a left outer join and a right outer join. An example is where you show all customers whether or not they have orders, and all orders whether or not they are associated with a customer.

Q

Explain the difference between the HAVING clause and the WHERE clause.

A

SQL Server Express applies the HAVING clause after it summarizes the data, whereas it applies the WHERE clause before it summarizes the data.

Q

Describe a subquery.

A

SQL Server Express evaluates a subquery before it evaluates the main query.

Workshop

Quiz

1. Name the five join types.

2. What is the purpose of aggregate functions?

3. Name four aggregate functions.

4. One of the aggregate functions is COUNT LARGE (true/false).

5. A union query joins two tables (true/false)?

Quiz Answers

1. Inner, left outer, right outer, full, and self.

2. They summarize table data.

3. Sum, Count, Min, Max, Avg.

4. False. It is called COUNT BIG.

5. False. Union queries allow you to combine two tables.

Activities

Practice joining the Customers and Orders tables in the Northwind sample database with an inner join, a left-outer join, a right-outer join, and a full join. Add a few customers without orders. Note that the number of rows in the output differs for the various join types. Perform a self-join by joining the EmployeeID in the Employees table to the ReportsTo field in the Employees table. Output the LastName and FirstName fields from each table instance used in the query. You should see a list of all employees and their supervisors. Finally, find the sum, average, minimum, and maximum freight amounts in the Orders table.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
13.59.177.14