4. Simple SQL Retrieval
It may seem a bit backwards to talk about retrieval before creating a database or entering data, but much of SQL's data modification syntax relies on finding data to be changed. You will therefore find it easier to work with modification statements if you are first familiar with retrieving data. We are therefore going to assume that someone else has created a database and loaded it with data for our use.
SQL has one command for retrieving data: SELECT. This is nowhere as restrictive as it might seem. SELECT contains syntax for choosing columns, choosing rows, combining tables, grouping data, and performing some simple calculations. In fact, a single SELECT statement can result in a DBMS performing any or all of the relational algebra operations.
The basic syntax of the SELECT statement has the following general structure:
SELECT column1, column2 …
FROM
table1, table2 …
WHERE predicate
The SELECT clause specifies the columns you want to see. You specify the tables used in the query in the FROM clause. The operational WHERE clause can contain a wide variety of criteria that identify which rows you want to retrieve.
Note: Most SQL command processors are not case sensitive when it comes to parts of a SQL statement. SQL keywords, table names, column names, and so on can be in any case you choose. However, most DBMSs are case sensitive when it comes to matching data values. Therefore, whenever you place a value in quotes for SQL to match, you must match the case of the stored data. In this book, SQL keywords will appear in uppercase letters; database components such as column and table names will appear in lowercase letters.
In addition to these basic clauses, SELECT has many other syntax options. Rather than attempt to summarize them all in a single general statement, you will learn to build the parts of a SELECT gradually throughout this and the next few chapters of this book.
Note: The SQL queries you see throughout the book are terminated by a semi-colon (;). This is not part of the SQL standard, but is used by many DBMSs so that you can type a command on multiple lines. The SQL command processor doesn't execute the query until it encounters the semi-colon.
Choosing Columns
One of the characteristics of a relation is that you can view any of the columns in any order you choose. SQL therefore lets you specify the columns you want to see and the order in which you want to see them, using the relational algebra project to produce the final result table.
Retrieving All Columns
To retrieve all the columns in a table, viewing the columns in the order in which they were defined when the table was created, you can use an asterisk (*) rather than listing each column. For example, to see all the works that the rare book store has handled, you would use
SELECT *
FROM work;
Because this query is requesting all rows in the table, there is no WHERE clause. As you can see in Figure 4-1, the result table labels each column with its name.
Note: The layout of the printed output of many SQL queries in this book has been adjusted so that it will fit across the width of the pages. When you actually view listings on the screen, each rowwill be in a single horizontal line. If a listing is too wide to fit on the screen or a terminal program's window, you will need to scroll.
B9780123756978500042/f04-01-9780123756978.jpg is missing
Figure 4-1
Viewing all columns in a table
Using the * operator to view all columns is a convenient short-hand for interactive SQL when you want a quick overview of data. However, it can be troublesome when used in embedded SQL. If the columns in the table are changed. In particular, if a column is added to the table and the application is not modified to handle the new column, then the application may not work properly.
Retrieving Specific Columns
In most SQL queries, you will want to specify exactly which column or columns you want retrieved. To specify columns, you list them following SELECT in the order in which you want to see them. For example, a query to view the names and phone numbers of all of our store's customers is written
SELECT first_name,last_name,contact_phone
FROM customer;
The result (see Figure 4-2) shows all rows in the table for just the three columns specified in the query. The order of the columns in the result table matches the order in which the columns appeared after the SELECT keyword.
B9780123756978500042/f04-02-9780123756978.jpg is missing
Figure 4-2
Choosing specific columns
Removing Duplicates
Unique primary keys ensure that relations have no duplicate rows. However, when you view only a portion of the columns in a table, you may end up with duplicates. For example, executing the following query produced the result in Figure 4-3:
SELET customer_numb, credit_card_numb
FROM sale;
B9780123756978500042/f04-03-9780123756978.jpg is missing
Figure 4-3
A result table with duplicate rows
Duplicates appear because the same customer uses the same credit card umber for more than one purchase. Keep in mind that although this table with duplicate rows is not a legal relation, that doesn't present a problem for the database because it is not stored in the database.
To remove duplicates from a result table, you insert the keyword DISTINCT following SELECT:
SELECT DISTINCT customer_numb, credit_card_numb
FROM sale;
The result is a table without the duplicate rows (see Figure 4-4). Although a legal relation has no duplicate rows, most DBMS vendors have implemented SQL so that it leaves the duplicates. As you read in Chapter 2, the primary reason is performance. To remove duplicates, a DBMS must sort the result table by every column in the table. It must then scan the table from top to bottom, looking at every “next” row to identify duplicate rows that are next to one another. If a result table is large, the sorting and scanning can significantly slow down the query. It is therefore up to the user to decide whether to request unique rows.
B9780123756978500042/f04-04-9780123756978.jpg is missing
Figure 4-4
The result table in Figure 4-3 with the duplicates removed
Ordering the Result Table
The order in which rows appear in the result table may not be what you expect. In some cases, rows will appear in the order in which they are physically stored. However, if the query optimizer uses an index to process the query, then the rows will appear in index key order. If you want row ordering to be consistent and predictable, you will need to specify how you want the rows to appear.
When you want to control the order of rows in a result table you add an ORDER BY clause to your SELECT statement.
For example, if you issue the query
SELECT *
FROM author;
you will see the unordered listing in Figure 4-5. Adding the ORDER BY clause sorts the result in alphabetical order (see Figure 4-6):
SELECT *
FROM author
ORDER BY author_last_first;
B9780123756978500042/f04-05-9780123756978.jpg is missing
Figure 4-5
An unordered result table
B9780123756978500042/f04-06-9780123756978.jpg is missing
Figure 4-6
The result table from Figure 4-6 sorted in alphabetical order by author name
The keywords ORDER BY are followed by the column or columns on which you want to sort the result table. When you include more than one column, the first column represents the outer sort, the next column a sort within it. For example, assume that you issue the query
SELECT zip_postcode, last_name, first_name
FROM customer
ORDER BY zip_postcode, last_name;
The result (see Figure 4-7) first orders by the zipcode and then sorts by the customer's last name within each zipcode. If we reverse the order of the columns on which the output is to be sorted, as in
SELECT zip_postcode, last_name, first_name
FROM customer
ORDER BY last_name, zip_postcode;
the output (see Figure 4-8) then sorts first by last name and then by zipcode within each last name.
B9780123756978500042/f04-07-9780123756978.jpg is missing
Figure 4-7
Sorting output by two columns
B9780123756978500042/f04-08-9780123756978.jpg is missing
Figure 4-8
Reversing the sort order of the query in Figure 4-8
Choosing Rows
As well as viewing any columns from a relation, you can also view any rows you want. We specify row selection criteria in a SELECT statement's WHERE clause.
In its simplest form, a WHERE clause contains a logical expression against which each row in a table is evaluated. If a row meets the criteria in the expression, then it becomes a part of the result table. If the row does not meet the criteria, then it is omitted. The trick to writing row selection criteria—one example of the predicates to which you were introduced in Chapter 2—is therefore knowing how to create logical expressions against which data can be evaluated.
Predicates
As you read in Chapter 2, a logical expression that follows WHERE is known as a predicate. It uses a variety of operators to represent row selection criteria. If a row meets the criteria in a predicate (in other words, the criteria evaluate as true), then the row is included in the result table. If the row doesn't meet the criteria (the criteria evaluate as false), then the row is excluded.
Relationship Operators
In Table 4-1 you can see the six operators used to express data relationships. 1 To write an expression using one of the operators, you surround it with two values. In database queries, such expression have either a column name on one side and a literal value on the other, as in
cost > 1.95
or column names on both sides:
numb_on_hand <= reorder_point
1The symbol used for the “not equal to” operator varies from one DBMS to another. Check the documentation that accompanies your software to determine whether the “not equal to” operator is != or < >.
Table 4-1 The relationship operators
OperatorMeaningExamples
=Equal tocost = 1.95
numb_in_stock = reorder_point
<Less thancost < 1.95
numb_in_stock < reorder_point
<=Less than or equal tocost <= 1.95
numb_in_stock <= reorder_point
>Greater thancost > 1.95
numb_in_stock > reorder_point
>=Greater than or equal tocost >= 1.95
numb_in_stock >= reorder_point
!= or < >Not equal tocost != 1.95
numb_in_stock != reorder_point
The first expression asks the question “Is the cost of the item greater than 1.95?” The second asks “Is the number of items in stock less than or equal to the reorder point?”
The way in which you enter literal values into a logical expression depends on the data type of the column to which you are comparing the value:
◊ Numbers: Type numbers without any formatting. In other words, leave out dollar signs, commas, and so on. You should, however, put decimal points in the appropriate place in a number with a factional portion.
◊ Characters: Type characters surrounded by quotation marks. Most DBMSs accept pairs of either single or double quotes. If your characters include an apostrophe (a single quote), then you should use double quotes. Otherwise, use single quotes.
◊ Dates: Type dates in the format used to store them in the database. This will vary from one DBMS to another.
◊ Times: Type times in the format used to store them in the database. This will vary from one DBMS to another.
When you are using two column names, keep in mind that the predicate is applied to each row in the table individually. The DBMS substitutes the values stored in the columns in the same row when making its evaluation of the criteria. You can therefore use two column names when you need to examine data that are stored in the same row but in different columns.
However, you cannot use a simple logical expression to compare the same column in two or more rows.
The DBMS also bases the way it evaluates data on the type of data:
◊ Comparisons involving numeric data are based on numerical order.
◊ Comparisons involving character data are based on alphabetical order.
◊ Comparisons involving dates and times are based on chronological order.
Logical Operators
Sometimes a simple logical expression is not enough to identify the rows you want to retrieve; you need more than one criterion. In that case, you can chain criteria together with logical operators. For example, assume that you want to retrieve volumes that you have in stock that cost more than $75 and that are in excellent condition. The predicate you need is therefore made up of two simple expressions:
condition_code = 2
asking_price > 75
A row must meet both of these criteria to be included in the result table. You therefore connect the two expressions with the logical operator AND into a single complex expression:
condition_code = 2 AND asking_price >75
Whenever you connect two simple expressions with AND, a row must meet all of the conditions to be included in the result.
You can use the AND operators to create a predicate that includes a range of dates. For example, if you want to find all sales that were made in August and September of 2013, the predicate would be written:
sale_date >= ‘01-Aug-2013’ AND sale_date <= ‘31-Sep-2013’
To be within the interval, a sale date must meet both individual criteria. 2
2The date format used in the sample queries is a fairly generic one that is recognized by most DBMSs. However, you should consult the documentation for your DBMS to determine exactly what will work with your product.
You will find a summary of the action of the AND operators in Table 4-2. The labels in the columns and rows represent the result of evaluating the single expressions on either side of the AND. As you can see, the only way to get a true result for the entire expression is for both simple expressions to be true.
Table 4-2 AND truth table
ANDTrueFalse
TrueTrueFalse
FalseFalseFalse
If you want to create an expression from which a row needs to meet only one condition, then you connect simple expressions with the logical operator OR. For example, if you want to retrieve volumes that cot more than $125 or less than $50, you would use the predicate
asking_price > 100 OR asking_price < 50
Whenever you connect two simple expressions with OR, a row needs to meet only one of the conditions to be included in the result of the query. When you want to create a predicate that looks for dates outside an interval, you use the OR operator. For example, to see sales that occurred prior to March 1, 2013 or after December 31, 2013, the predicate is written
sale_date < ‘01-Mar-2013’ OR sale_date > ‘31-Dec-2013’
You can find a summary of the OR operation in Table 4-3. Notice that the only way to get a false result is for both simple expression surrounding OR to be false.
Table 4-3 OR truth table
ORTrueFalse
TrueTrueTrue
FalseTrueFalse
There is no limit to the number of simple expression you can connect with AND and OR. For example, the following expression is legal:
condition_code >= 3
AND selling_price < asking_price
AND selling_price > 75
Negation
The logical operator NOT (or !) inverts the result of logical expression. If a row meets the criteria in a predicate, then placing NOT in front of the criteria excludes the row from the result. By the same token, if a row does not meet the criteria in a predicate, then placing NOT in front of the expression includes the row in the result. For example,
NOT (asking_price <= 50)
retrieves all rows where the cost is not less than or equal to $50 (in other words, greater than $50). First the DBMS evaluates the value in the asking_price column against the expression asking_price <= 50. If the row meets the criteria, then the DBMS does nothing. If the row does not meet the criteria, it includes the row in the result.
The parentheses in the preceding example group the expression to which NOT is to be applied. In the following example, the NOT operator applies only to the expression asking_price <= 50.
NOT (asking_price <= 50)
AND selling_price < asking_price
NOT can be a bit tricky when it is applied to complex expressions As an example, consider this expression:
NOT (asking_price <= 50
AND selling_price < asking_price)
Rows that have both an asking price of less than or equal to $50 and a selling price that was less than the asking price will meet the criteria within parentheses. However, the NOT operator excludes them from the result. Those rows that have either an asking price of more than $50 or a selling price greater than or equal to the asking price will fail the criteria within the parentheses, but will be included in the result by the NOT. This means that the expression is actually the same as
asking_price > 50
OR selling_price >= asking_price
or
NOT (asking_price <= 50)
OR NOT (selling_price < asking_price)
Precedence and Parentheses
When you create an expression with more than one logical operation, the DBMS must decide on the order in which it will process the simple expressions. Unless you tell it otherwise, a DBMS uses a set of default rules of precedence. In general, a DBMS evaluates simple expressions first, followed by the logical expression. When there is more than one operator of the same type, evaluation proceeds from left to right.
As a first example, consider this expression:
asking_price < 50
OR condition_code = 2
selling_price > asking_price
If the asking price of a book is $25, its condition code is 3, and the selling price was $20, the DBMS will exclude the row from the result. The first simple expression is true; the second is false. An OR between the first two produces a true result because at least one of the criteria is true. Then the DBMS performs an AND between the true result of the first portion and the result of the third simple expression (false). Because we are combining a true result and a false result with AND, the overall result is false. The row is therefore excluded from the result.
We can change the order in which the DBMS evaluates the logical operators, and coincidentally, the result of the expression, by using parentheses to group the expressions that are to have higher precedence:
asking_price < 50 OR (condition_code = 2
AND selling_price > asking_price)
A DBMS gives highest precedence to the parts of the expression within parentheses. Using the same sample data from the preceding paragraph, the expression within parentheses is false (both simple expressions are false). However, the OR with the first simple expression produces true, because the first simple expression is true. Therefore, the row is included in the result.
Special Operators
SQL predicates can include a number of special operators that make writing logical criteria easier. These include BETWEEN, LIKE, IN, and IS NULL.
Note: There are additional operators that are used primarily with subqueries, SELECT statements in which you embed one complete SELECT within another. You will be introduced to them inChapter 5.
The BETWEEN operator simplifies writing predicates that look for values that lie within an interval. Remember the example you saw earlier in this chapter using AND to generate a date interval? Using the BETWEEN operator, you could rewrite that predicate as
sale_date BETWEEN ‘01-Aug-2013’
AND ‘31-Sep-2013’
Any row with a sale date of August 1, 2013 through September 31, 2013 will be included in the result.
If you negate the BETWEEN operator, the DBMS returns all rows that are outside the interval. For example,
sale_date NOT BETWEEN ‘01-Aug-2013’
AND ‘31-Sep-2013’
retrieves all rows with dates prior to August 1, 2013 and after September 31, 2013. It does not include the 01-Aug-2013 or 31-Sep-2013. NOT BETWEEN is therefore a shorthand for the two simple expressions linked by OR that you saw earlier in this chapter.
LIKE
The LIKE operator provides a measure of character string pattern matching by allowing you to use placeholders (wildcards) for one or more characters. Although you may find that the wildcards a different in your particular DBMS, in most case, % stands for zero, one, or more characters and _ stands for zero or one character.
The way in which the LIKE operator works is summarized in Table 4-4. As you can see, you can combine the two wildcards to produce a variety of begins with, ends with, and contains expressions.
Table 4-4 Using the LIKE operator
ExpressionMeaning
LIKE ‘Sm%’Begins with Sm
LIKE ‘%ith’Ends with ith
LIKE ‘%ith%’Contains ith
LIKE ‘Sm_’Begins with Sm and is followed by at most one character
LIKE ‘_ith’Ends with ith and is preceded by at most one character
LIKE ‘_ith_’Contains ith and begins and ends with at most one additional character
LIKE ‘%ith_’Contains ith, begins with any number of characters, and ends with at most one additional character
LIKE ‘_ith%’Contains ith, begins with at most one additional character, and ends with any number of characters
As with BETWEEN you can negate the LIKE operator:
last_name NOT LIKE ‘Sm%’
Rows that are like the pattern are therefore excluded from the result.
One of the problems you may run into when using LIKE is that you need to include the wildcard characters as part of your data. For example, what can you do if you want rows that contain ‘nd_by’? The expression you want is
column_name LIKE ‘%nd_by%’
The problem is that the DBMS will see the _ as a wildcard, rather than as characters in your search string. The solution was introduced in SQL-9s, providing you with the ability to designate an escape character.
An escape character removes the special meaning of the character that follows. Because many programming languages use as the escape character, it is a logical choice for pattern matching, although it can be any character that is not part of your data. To establish the escape character, you add the keyword ESCAPE, followed by the escape character, to your expression:
column_name LIKE ‘%nd\_by%’ ESCAPE ‘’
IN
The IN operator compares the value in a column against a set of values. IN returns true if the value is within the set. For example, assume that a store employee is checking the selling price of a book and wants to know if it is either $25, $50, or $60. Using the IN operator, the expression would be written:
selling_price IN (25,50,60)
This is shorthand for
selling_price = 25 OR selling_price = 50
OR selling_price = 60
Therefore, any row whose price is one of those three values will be included in the result. Conversely, if you write the predicate
selling_price NOT IN (25,50,60)
the DBMS will return the rows with pries other than those in the set of values. The preceding expression is therefore the same as
selling_price != 25 AND selling_price != 50
AND selling_price !=60
or
NOT (selling_price = 25 OR selling_price = 50
OR selling_price = 60)
Note: The most common use of IN and NOT IN is with a sub-query, where the set of values to which data are compared are generated by an embedded SELECT. You will learn about this inChapter 5.
IS NULL
As you know, null is a specific indicator in a database. Although columns that contain nulls appear empty when you view them, the database actually stores a value that represents null so that an unknown value can be distinguished from, for example, a string value containing a blank. As a user, however, you will rarely know exactly what a DBMS is using internally for null. This means that you need some special way to identify null in a predicate so you can retrieve rows containing nulls. That is where the IS NULL operator comes in.
For example, an expression to identify all rows for volumes that have not been sold is written as
sale_date IS NULL
Conversely, to find all volumes that have been sold, you could use
sale_date IS NOT NULL
Performing Row Selection Queries
To perform SQL queries that select specific rows, you place a predicate after the SQL keyword WHERE. Depending on the nature of the predicate, the intention of the query may be to retrieve one or more rows. In this section you will therefore see some SELECT examples that combine a variety of row selection criteria. You will also see how those criteria are combined in queries with column selection and with sorting of the output.
Using a Primary Key Expression to Retrieve One Row
A common type of SQL retrieval query uses a primary key expression in its predicate to retrieve exactly one row. For example, if someone at the rare book store wants to see the name and telephone number of customer number 6, then the query is written
SELECT first_name, last_name, contact_phone
FROM customer
WHERE customer_numb = 6;
The result is the single row requested by the predicate.
first_namelast_namecontact_phone
JaniceSmith518-555-6666
If a table has a concatenated primary key, such as the employee number and child name for the dependents table you saw in Chapter 1, then a primary key expression needs to include a complex predicate in which each column of the primary key appears in its own simple logical expression. For example, if you wanted to find the birthdate of employee number 0002's son John, you would use following query:
SELECT child_birth_date
FROM dependents
WHERE employee_number = ‘0002’
AND child_name = ‘John’;
In this case, the result is simply
child_birth_date
---------------
2-Dec-1999
Retrieving Multiple Rows
Although queries with primary key expressions are written with the intention of retrieving only one row, more commonly SQL queries are designed to retrieve multiple rows.
Using Simple Predicates
When you want to retrieve data based on a value in a single column, you construct a predicate that includes just a simple logical expression. For example, to see all the books ordered on sale number 6, someone at the store would use
SELECT isbn
FROM volume
WHERE sale_id = 6;
The output (see Figure 4-9) displays a single column for rows where the sale_id is six.
B9780123756978500042/f04-09-9780123756978.jpg is missing
Figure 4-9
Displaying a single column from multiple rows using a
Using Complex Predicates
When you want to see rows that meet two or more simple conditions, you use a complex predicate in which the simple conditions are connected by AND or OR. For example, if someone wanted to see the books on order number 6 that sold for less than the asking price, the query would be written
SELECT isbn
FROM volume
WHERE sale_id = 6
AND selling_price < asking_price;
Only two rows meet the criteria:
isbn
----------------
978-1-11111-130-1
978-1-11111-139-1
By the same token, if you wanted to see all sales that took place prior to August 1, 2013 and for which the total amount of the sale was less than $100, the query would be written
SELECT sale_id, sale_total_amt
FROM sale
WHERE sale_date < ‘1-Aug-2012’
AND sale_total_amt < 100;
It produces the result in Figure 4-10.
Note: Don't forget that the date format required by your DBMS may be different from the one used in examples in this book.
B9780123756978500042/f04-10-9780123756978.jpg is missing
Figure 4-10
Retrieving rows using a complex predicate including a date
Alternatively, if you needed information about all sales that occurred prior to or on August 1, 2013 that totaled more than 100 along with sales that occurred after August 1, 2013 that totaled less than 100, you would write the query
SELECT sale_id, sale_date, sale_total_amt
FROM sale
WHERE (sale_date <= ‘1-Aug-2013’
AND sale_total_amt > 100)
OR (sale_date > ‘1-Aug-2013’
AND sale_total_amt < 100);
Notice that although the AND operator has precedence over OR and therefore the parentheses are not strictly necessary, the predicate in this query includes parentheses for clarity. Extra parentheses are never a problem—as long as you balance every opening parenthesis with a closing parenthesis—and you should feel free to use them whenever they help make it easier to understand the meaning of a complex predicate. The result of this query can be seen in Figure 4-11.
B9780123756978500042/f04-11-9780123756978.jpg is missing
Figure 4-11
Using a complex predicate that includes multiple logical operators
Using BETWEEN and NOT BETWEEN
As an example of using one of the special predicate operators, consider a query where someone wants to see all sales that occurred between July 1, 2013 and August 31, 2013. The query would be written
SELECT sale_id, sale_date, sale_total_amt
FROM sale
WHERE sale_date BETWEEN ‘1-Jul-2013’ AND ‘31-Aug-2013’;
It produces the output in Figure 4-12.
B9780123756978500042/f04-12-9780123756978.jpg is missing
Figure 4-12
Using BETWEEN to retrieve rows in a date range
The inverse query retrieves all orders not placed between July 1, 2013 and August 31, 2013 is written
SELECT sale_id, sale_date, sale_total_amt
FROM sale
WHERE sale_date NOT BETWEEN ‘1-Jul-2013’ AND ‘31-Aug-2013’;
and produces the output in Figure 4-13.
B9780123756978500042/f04-13-9780123756978.jpg is missing
Figure 4-13
Using NOT BETWEEN to retrieve rows outside a date range
If we want output that is easier to read, we might ask the DBMS to sort the result by sale date:
SELECT sale_id, sale_date, sale_total_amt
FROM sale
WHERE sale_date NOT BETWEEN ‘1-Jul-2013’ AND ‘31-Aug-2013’
ORDER BY sale_date;
producing the result in Figure 4-14.
B9780123756978500042/f04-14-9780123756978.jpg is missing
Figure 4-14
Output sorted by date
Nulls and Retrieval: Three-Valued Logic
The predicates you have seen to this point omit one important thing: the presence of nulls. What should a DBMS do when it encounters a row that contains null rather than a known value? As you read in Chapter 2, the relational data model doesn't have a specific rule as to what a DBMS should do, but it does require that the DBMS act consistently when it encounters nulls.
Consider the following query as an example:
SELECT inventory_id, selling_price
FROM volume
WHERE selling_price < 100;
The result can be found in Figure 4-15. Notice that every row in the result table has a value of selling price, which means that rows for unsold items—those with null in the selling price column—are omitted. The DBMS can't ascertain what the selling price for unsold items will be: Maybe it will be less than $100 or maybe it will be greater than or equal to $100.
B9780123756978500042/f04-15-9780123756978.jpg is missing
Figure 4-15
Retrieval based on a column that includes rows with nulls
The policy of most DBMSs is to exclude rows with nulls from the result. For rows with null in the selling price column, the maybe answer to “Is selling price less than 100” becomes false. This seems pretty straightforward, but what happens when you have a complex logical expression of which one portion returns maybe? The operation of AND, OR, and NOT must be expanded to take into account that they may be operating on a maybe.
The three-valued logic table for AND can be found in Table 4-5. Notice that something important hasn't changed: The only way to get a true result is for both simple expressions linked by AND to be true. Given that most DBMSs exclude rows where the predicate evaluates to maybe, the presence of nulls in the data will not change what an end user sees.
Table 4-5 Three-valued AND truth table
ANDTrueFalseMaybe
TrueTrueFalseMaybe
FalseFalseFalseFalse
MaybeMaybeFalseMaybe
The same is true when you look at the three-valued truth table for OR (see Table 4-6). As long as one simple expression is true, it does not matter whether the second returns true, false, or maybe. The result will always be true.
Table 4-6 Three-valued OR truth table
ORTrueFalseMaybe
TrueTrueTrueTrue
FalseTrueFalseMaybe
MaybeTrueMaybeMaybe
If you negate an expression that returns maybe, the NOT operator has no effect. In other words, NOT (MAYBE) is still maybe.
To see the rows that return maybe, you need to add an expression to your query that uses the IS NULL operator. For example, the easiest way to see which volumes have not been sold is to write a query like:
SELECT inventory_id, isbn, selling_price
FROM volume
WHERE selling_price is null;
The result can be found in Figure 4-16. Note that the selling price column is empty in each row. (Remember that you typically can't see any special value for null.) Notice also that the rows in this result table are all those excluded from the query in Figure 4-15.
Four-Valued Logic
Codd's 330 rules for the relational data model include an enhancement to three-valued logic that he called four-valued logic. In four-valued logic, there are actually two types of null: “null and it doesn't matter that it's null” and “null and we've really got a problem because it's null.” For example, if a company sells internationally, then it probably has a column for the country of each customer. Because it is essential to know a customer's country, a null in the country column would fall into the category of “null and we've really got a problem.” In contrast, a missing value in a company name column would be quite acceptable in a customer table for rows that represented individual customers. Then the null would be “null and it doesn't matter that it's null.” Four-valued logic remains purely theoretical, however, and isn't implemented in DBMSs.
B9780123756978500042/f04-16-9780123756978.jpg is missing
Figure 4-16
Using IS NULL to retrieve rows containing nulls
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.116.23.112