Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

26
Introduction to SQL

Eventually, you (or someone else) must build the database that you've designed. Also at some point, someone will probably want to actually use the database you've spent so much time designing.

Structured Query Language (SQL, sometimes pronounced “sequel”) is a database language that includes commands that let you build, modify, and manipulate relational databases. Many database tools use SQL either directly or behind the scenes to create and manipulate the database. For example, Figure 26.1 shows how the pgAdmin tool uses SQL to define the order_items table used in Chapters 18 and 19, which cover PostgreSQL.

Because SQL is standardized, many database management tools use it to control relational databases. They use buttons, drop-down menus, check boxes, and other user interface controls to let you design a database interactively, and then they send SQL code to the database engine to do the work.

In addition to using database tools to generate your SQL code, you can write that code yourself, which lets you create and modify the database as needed. It also lets you add data to put the database in a known state for testing. In fact, SQL is so useful for building databases that it's the topic of the next chapter.

NoSQL MAY HAVE SQL

SQL was designed to work with relational databases, so you might think that NoSQL databases don't support it. Some don't, but others do. Whether a NoSQL database supports SQL depends on the type of database and the specific product.

For example, a column-oriented database can act as a relational database that uses a particular storage strategy, so it may be able to execute standard SQL statements.

A document database may also be able to execute SQL on the fields defined inside its documents, although you may need some extra condition checking and error handling to deal with documents that contain different fields.

A graph database, such as Neo4j AuraDB, may support graph query language (GQL), but its only real similarity to SQL is the fact that they both end with QL.

A representation of the pgAdmin screen exhibits the name of the owner. brew_master is the exposed name. — **FIGURE 26.1**

SQL is such an important part of database development that your education as a database designer is sadly lacking if you don't at least understand the basics. (The other developers will rightfully mock you if you don't chuckle when you see a T-shirt that says, “SELECT * FROM People WHERE NOT Clue IS null.”)

In this chapter, you'll learn how to use SQL to do the following:

Create and delete tables.
Insert data into tables.
Select data from the database using various criteria and sort the results.
Modify data in the database.
Delete records.

BACKGROUND

SQL was developed by IBM in the mid-1970s. It is an English-like command language for building and manipulating relational databases.

From a small but ambitious beginning, SQL has grown into a large language containing about 70 commands with hundreds of variations. Because SQL has grown so large, I cannot possibly cover every nook and cranny. Instead, this chapter gives a brief introduction to SQL and then describes some of the most useful SQL commands in greater detail. Even then, this chapter doesn't have room to cover most of the more interesting commands completely. The SELECT statement alone includes so many variations that you could practically write a book about just that one command.

FINDING MORE INFORMATION

SQL is intuitive enough that, once you master the basics, you should be able to get pretty far on your own. The Internet is practically clogged with websites that are chock-full of SQL goodness in the form of tutorials, references, FAQs, question and answer forums, and discussion groups.

In fact, I'll start you off with a small list of websites right now. The following list shows a few websites that provide SQL tutorials:

For help on specific issues, you should find a few SQL forums where you can ask questions. An enormous number of developers work with databases and SQL on a daily basis, so there are lots of forums out there. Just search for SQL forum to find more forums than you can shake a SELECT statement at.

Tutorials can help you get started using some of the more common SQL statements, but they aren't designed as references. If you need more information about a particular command, you should look for a SQL reference. At the following sites you'll find references for different versions of SQL:

MariaDB—https://mariadb.com/kb/en/sql-statements
Microsoft Transact-SQL—https://learn.microsoft.com/en-us/sql/t-sql/language-reference
MySQL—https://dev.mysql.com/doc/refman/8.0/en
Oracle SQL—www.adp-gmbh.ch/ora/sql
PostgreSQL—https://postgresql.org/docs/8.1/sql-commands.html

Each of these pages provides simple navigation to look up specific SQL commands.

Note, however, that these web pages deal with specific versions of SQL. Though the basics of any version of SQL are fairly standard, there are some differences between the various flavors.

STANDARDS

Although all relational databases support SQL, different databases may provide slightly different implementations. This is such an important point that I'll say it again.

Both the International Organization for Standardization (ISO) and the American National Standards Institute (ANSI) have standards for the SQL language, and most database products follow those standards pretty faithfully. However, different database products also add extra features to make certain chores easier. Those extras can be useful but only if you are aware of which features are standard and which are not, and you use the extra features with caution.

For example, in the Transact-SQL language used by SQL Server, the special values @@TOTAL_ERRORS, @@TOTAL_READS, and @@TOTAL_WRITES return the total number of disk write errors, disk reads, and disk writes, respectively, since SQL Server was last started. Other relational databases don't provide those, although they may have their own special values that return similar statistics.

If you use these kinds of special features haphazardly, it may be very hard to rebuild your database or the applications that use it if you are forced to move to a new kind of database. For that matter, extra features can make it hard for you to reuse tools and techniques that you develop in your next database project.

Fortunately, most flavors of SQL are 90 percent identical. You can guard against troublesome changes in the future by keeping the nonstandard features in a single code module or library as much as possible.

Usually the places where SQL implementations differ the most is in system-level chores such as database management and searching metadata. For example, different databases might provide different tools for searching through lists of tables, creating new databases, learning about the number of reads and writes that the database performed, examining errors, and optimizing queries.

MULTISTATEMENT COMMANDS

Many (but not all) databases allow you to execute multiple commands at once if you separate them with semicolons. Even then, some databases disable that capability by default. In that case, you need to take extra steps to enable this feature.

For example, the following code shows how a Python program can execute multistatement commands in MariaDB:

# Semi-colons.
 
# Create the tables.
import pymysql
from pymysql.connections import CLIENT
    
# Connect to the database server.
conn = pymysql.connect(
    host="localhost",
    user="root",
    password="G4eJuHFIhFOa13m",
    database="Chapter26Data",
    client_flag=CLIENT.MULTI_STATEMENTS)
 
# Create a cursor.
cur = conn.cursor()
 
cur.execute("""
DROP TABLE IF EXISTS Crewmembers;
CREATE TABLE Crewmembers (
  FirstName    VARCHAR(45)   NOT NULL,
  LastName     VARCHAR(45)   NOT NULL
);
""")
 
cur.execute("""
INSERT INTO Crewmembers VALUES ("Malcolm", "Reynolds");
INSERT INTO Crewmembers VALUES ('Hoban "Wash"', "Washburne");
INSERT INTO Crewmembers VALUES ("Zoë", "Washburne");
INSERT INTO Crewmembers VALUES ("Kaylee", "Frye");
INSERT INTO Crewmembers VALUES ("Jayne", "Cobb");
SELECT * FROM Crewmembers
""")
 
cur.execute("SELECT * FROM Crewmembers")
rows = cur.fetchall()
for row in rows:
    print(row[0] + " " + row[1])
 
# Close the connection.
conn.commit()
conn.close()

This code first imports the pymysql and pymysql.connections.CLIENT modules. It then connects to the database. The code sets the connect method's client_flag parameter to CLIENT.MULTI_STATEMENTS to enable multistatement commands. The setup code finishes by creating a cursor to manipulate the database.

The program then executes a DROP TABLE statement and a CREATE TABLE statement in a single command. Notice how the commands are separated by semicolons. MariaDB is uncooperative if you leave out the semicolons. You can omit the semicolon after the final command if you like, but I've included it for consistency. (It also makes it easier to add more commands at the end of the sequence without forgetting to add new semicolons.)

Next, the code uses a multistatement command to execute five INSERT statements.

In both execute calls, the Python code uses the multiline string delimiter """ so the statements can include single or double quotes. The second INSERT statement uses single quotes around the first name value so that it can include embedded double quotes for Wash's nickname.

The opening and closing triple quotes are on the lines before and after the SQL code, so the SQL code is exactly as it would appear in a script.

After it inserts the records, the code executes a SELECT statement and displays the new records. The code finishes by committing the changes and closing the database connection.

The following shows the program's output:

Malcolm Reynolds
Hoban "Wash" Washburne
Zoë Washburne
Kaylee Frye
Jayne Cobb

If you include a SELECT statement inside a multistatement command the fetchall method returns an empty result, so you should execute SELECT statements separately, at least in MariaDB.

BASIC SYNTAX

As is mentioned earlier in this chapter, SQL is an English-like language. It includes command words such as CREATE, INSERT, UPDATE, and DELETE.

SQL is case-insensitive. That means it doesn't care whether you spell the DELETE keyword as DELETE, delete, Delete, or DeLeTe.

SQL also doesn't care about the capitalization of database objects such as table and field names. If a database has a table containing people from the Administrative Data Organization, SQL doesn't care whether you spell the table's name ADOPEOPLE, AdoPeople, or aDOPEople. (Although the last one looks like “a dope ople,” which may make pronunciation tricky and might annoy your admins.)

To make the code easier to read, however, most developers write SQL command words in ALL CAPS and they write database object names using whatever capitalization they used when building the database. For example, I prefer MixedCase for table and field names.

Unfortunately, not all database products handle variations in capitalization in exactly the same way. For example, as you learned in Chapter 18, “PostgreSQL in Python,” PostgreSQL “helpfully” converts table and column names to lowercase. Run some tests to see whether you and your database have the same capitalization fashion sense, and then be prepared to do some rewriting if you're forced to switch databases.

A final SQL feature that makes commands easier to read is that SQL ignores whitespace. That means it ignores spaces, tabs, line feeds, and other “spacing” characters, so you can use those characters to break long commands across multiple lines or to align related statements in columns.

For example, the following code shows a typical SELECT command as I would write it:

SELECT FirstName, LastName, Clue,
       Street, City, State, Zip
FROM People
WHERE NOT Clue IS NULL
ORDER BY Clue, LastName, FirstName

This command selects name and address information from the records in the People table where the record's Clue field has a value that is not null. (Basically, it selects people who have a clue, normally a pretty small data set.) It sorts the results by Clue, LastName, and FirstName.

This command places the different SQL clauses on separate lines to make them easier to read. It also places the person's name and address on separate lines and indents the address fields so they line up with the FirstName field. That makes it easier to pick out the SELECT, FROM, WHERE, and ORDER BY clauses.

COMMAND OVERVIEW

SQL commands are typically grouped into four categories: Data Definition Language (DDL), Data Manipulation Language (DML), Data Control Language (DCL), and Transaction Control Language (TCL).

Note that some commands have several variations. For example, in Transact-SQL the ALTER command has the versions ALTER DATABASE, ALTER FUNCTION, ALTER PROCEDURE, ALTER TABLE, ALTER TRIGGER, and ALTER VIEW. The following tables just provide an overview of the main function (ALTER) so you know where to look if you need one of these.

NO DO-OVERS

SQL statements are immediate and unundoable. The database doesn't care if you have your fingers crossed, these statements can drop tables, delete records, destroy users, and generally wreak havoc with no warnings or backsies. For example, the database doesn't give you any warning or make you confirm a DROP TABLE statement. It instantly destroys the table and all of the data it contains. You cannot undo this command, so be really, really, really sure you want to do it before you execute a DROP TABLE command.

If you execute these commands interactively, then you're responsible for the outcome. If your program executes these commands, then it must be certain that the user really wants to live with the consequences.

The only mitigating tool at your disposal is the transaction, which lets you commit or roll back one or more transactions as a group. For example, you can execute a series of statements and, if any of them fails, you can roll back the transaction to cancel them all. You must do that, however; the database won't do it for you.

It's also a good idea to make sure your backups are up-to-date before you start any task that may damage the database such as dropping, adding, or modifying tables. (Chapter 28, “Database Maintenance,” talks about backups.)

DDL commands define the database's structure. Table 26.1 briefly describes the most commonly used DDL commands.

TABLE 26.1: DDL commands

COMMAND	PURPOSE
`ALTER`	Modifies a database object such as a table, stored procedure, or view. The most important variation is ALTER TABLE, which lets you change a column definition or table constraint.
`CREATE`	Creates objects such as tables, indexes, views, stored procedures, and triggers. In some versions of SQL, this also creates databases, users, and other high-level database objects. Two of the most important variations are CREATE TABLE and `CREATE INDEX`.
`DROP`	Deletes database objects such as tables, functions, procedures, triggers, and views. Two of the most important variations are DROP TABLE and `DROP INDEX`.

DML commands manipulate data. They let you perform the CRUD operations: Create, Read, Update, and Delete. (Where athletes “talk smack,” database developers “talk CRUD.”) Table 26.2 summarizes the most common DML commands. Generally developers think of only the CRUD commands INSERT, SELECT, UPDATE, and DELETE as DML commands, but this table includes cursor commands because they are used to select records.

TABLE 26.2: DML commands

COMMAND	PURPOSE
`CLOSE`	Closes a cursor.
`DECLARE`	Declares a cursor that a program can use to fetch and process records a few at a time instead of all at once.
`DELETE`	Deletes records from a table.
`FETCH`	Uses a cursor to fetch rows.
`INSERT`	Inserts new rows into a table. A variation lets you insert the result of a query into a table.
`SELECT`	Selects data from the database, possibly saving the result into a table.
`TRUNCATE`	Deletes all of the records from a table as a group without logging individual record deletions. It also removes empty space from the table while DELETE may leave empty space to hold data later. (Some consider `TRUNCATE` a DDL command, possibly because it removes empty space from the table. Or possibly just to be contrary. Or perhaps I'm the contrary one.)
`UPDATE`	Changes the values in a record.

DCL commands allow you control access to data. Depending on the database, you may be able to control user privileges at the database, table, view, or field level. Table 26.3 summarizes the two most common DCL commands.

TABLE 26.3: DCL commands

COMMAND	PURPOSE
`GRANT`	Grants privileges to a user
`REVOKE`	Revokes privileges from a user

TCL commands let you use transactions. A transaction is a set of commands that should be executed atomically as a single unit, so either every command is executed or none of the commands is executed.

For example, suppose you want to transfer money from one bank account to another. It would be bad if the system crashed after you subtracted money from the first account but before you added it to the other. If you put the two commands in a transaction, the database guarantees that either both happen or neither happens.

Table 26.4 summarizes the most common TCL commands.

TABLE 26.4: TCL commands

COMMAND	PURPOSE
`BEGIN`	Starts a transaction. Operations performed before the next COMMIT or `ROLLBACK` statement are part of the transaction.
`COMMIT`	Closes a transaction, accepting its results.
`ROLLBACK`	Rewinds a transaction's commands back to the beginning of the transaction or to a savepoint defined within the transaction.
`SAVE`	Creates a savepoint within a transaction. (Transact-SQL calls this command SAVE TRANSACTION whereas PostgreSQL calls it `SAVEPOINT`.)

The following sections describe the most commonly used commands in greater detail.

CREATE TABLE

The CREATE TABLE statement builds a database table. The basic syntax for creating a table is:

CREATE TABLE table_name (parameters)

Here table_name is the name you want to give to the new table and parameters is a series of statements that define the table's columns. Optionally, parameters can include column-level and table-level constraints.

A column definition includes the column's name, its data type, and optional extras such as a default value or the keywords NULL or NOT NULL to indicate whether the column should allow null values.

A particularly useful option that you can add to the CREATE TABLE statement is IF NOT EXISTS. This clause makes the statement create the new table only if it doesn't already exist.

For example, the following statement creates a Students table with three fields. Notice how the code uses whitespace to make the field data types and NOT NULL clauses align so they are easier to read:

CREATE TABLE IF NOT EXISTS Students (
  StudentId    INT           NOT NULL    AUTO_INCREMENT,
  FirstName    VARCHAR(45)   NOT NULL,
  LastName     VARCHAR(45)   NOT NULL,
  PRIMARY KEY (StudentId)
)

The StudentId field is an integer (INT) that is required (NOT NULL). The database automatically generates values for this field by adding 1 to the value it last generated (AUTO_INCREMENT).

The FirstName and LastName fields are required variable-length strings up to 45 characters long.

The table's primary key is the StudentId field.

A key part of a column's definitions is the data type. The following list summarizes the most common SQL data types:

BLOB—A Binary Large Object. This is any chunk of binary data such as a JPEG file, audio file, video file, or JSON document. The database knows nothing about the internal structure of this data so, for example, if the BLOB contains a JSON document the database cannot search its contents.
BOOLEAN—A true or false value.
CHAR—A fixed-length string. Use this for strings that always have the same length such as two-letter state abbreviations or five-digit ZIP Codes.
DATE—A month, date, and year such as August 29, 1997.
DATETIME—A date and time such as 2:14 a.m., August 29, 1997.
DECIMAL(p, s)—A fixed-point number where p (precision) gives the total number of digits and s (scale) gives the number of digits to the right of the decimal. For example, DECIMAL(6, 2) holds numbers of the form 1234.56.
INT—An integer value.
NUMBER—A floating-point number.
TIME—A time without a date such as 2:14 a.m..
TIMESTAMP—A date and time.
VARCHAR—A variable-length string. Use this for strings of unknown lengths such as names and street addresses.

Specific database products often provide extra data types and aliases for these types. They also sometimes use these names for different purposes. For example, in different databases the INT data type might use 32 or 64 bits, and the database may provide other data types such as SMALLINT, TINYINT, BIGINT, and so forth to hold integers of different sizes.

Most databases can handle the basic data types but before you make specific assumptions (for example, that INT means 32-bit integer), check the documentation for your database product.

The following code builds a frankly hacked together table with the sole purpose of demonstrating most of the common data types. Notice that this command uses the combined FirstName and LastName fields as the table's primary key:

CREATE TABLE IF NOT EXISTS MishmashTable (
  FirstName               VARCHAR(40)     NOT NULL,
  LastName                VARCHAR(40)     NOT NULL,
  Age                     INT             NULL,
  Birthdate               DATE            NULL,
  AppointmentDateTime     DATETIME        NULL,
  PreferredTime           TIME            NULL,
  TimeAppointmentCreated  TIMESTAMP       NULL,
  Salary                  DECIMAL(8,2)    NULL,
  IncludeInvoice          BOOLEAN         NULL,
  Street                  VARCHAR(40)     NULL,
  City                    VARCHAR(40)     NULL,
  State                   CHAR(2)         NULL,
  Zip                     CHAR(5)         NULL,
  PRIMARY KEY (FirstName, LastName)
)

BUILD A TABLE

The CREATE TABLE statement can be quite complicated. You need to specify the fields' names, data types, default values, and whether they allow null values. You can also specify the table's primary key and indexes, and foreign key constraints.

Try writing a CREATE TABLE statement to make an inventory items table:

Create the InventoryItems table.

Give the new table the following fields:

NAME	TYPE	REQUIRED?
UPC	String up to 40 characters	Yes
Description	String up to 45 characters	Yes
BuyPrice	Number of the form 12345678.90	No
SellPrice	Number of the form 12345678.90	Yes
QuantityinStock	Integer	Yes
ExpirationDate	Date	No
StockLocation	String up to 40 characters	No
ShelfLocation	String up to 40 characters	No
ReorderWhen	Integer	No
ReorderAmount	Integer	No
VendorId	Integer	No

Make the table's primary key the UPC field.
Some of the fields are used in foreign key constraints. Make indexes for those fields. Set their DELETE and UPDATE actions to NO ACTION. The fields are as follows:
LOCAL FIELD FOREIGN TABLE FOREIGN FIELD
VendorId Vendors VendorId
StockLocation StockLocations StockLocation
ShelfLocation ShelfLocations ShelfLocation
Make foreign key constraints for the fields listed in step 3. Set their DELETE and UPDATE actions to NO ACTION. (That prevents the database from deleting or modifying a record in the foreign table if the value is needed by one of these constraints.)

How It Works

Create the InventoryItems table.

The main part of the CREATE TABLE statement defines the fields, their data types, default values if any, and whether they are required. The following code shows the basic statement:

CREATE TABLE IF NOT EXISTS InventoryItems (
  UPC               VARCHAR(40)     NOT NULL,
  Description       VARCHAR(45)     NOT NULL,
  BuyPrice          DECIMAL(10,2)   NULL,
  SellPrice         DECIMAL(10,2)   NOT NULL,
  QuantityinStock   INT             NOT NULL,
  ExpirationDate    DATE            NULL,
  StockLocation     VARCHAR(40)     NULL,
  ShelfLocation     VARCHAR(40)     NULL,
  ReorderWhen       INT             NULL,
  ReorderAmount     INT             NULL,
  VendorId          INT             NULL
)

Make the table's primary key the UPC field.
To define the primary key, you simply add a PRIMARY KEY clause to the statement like this:
```
  PRIMARY KEY (UPC),
```
Make indexes for fields used in foreign key constraints.
Creating indexes for fields used in foreign key constraints is not mandatory (at least in many databases), but it makes matching up the related values in the two tables faster. Some databases, such as MySQL Workbench, create indexes for fields used in foreign keys by default.

For this example, use the INDEX clause to define these indexes. The following lines of code create the indexes:
```
  INDEX Index_VendorId (VendorId ASC),
  INDEX Index_StockLocation (StockLocation ASC),
  INDEX Index_ShelfLocation (ShelfLocation ASC),
 
```

Make foreign key constraints for the fields listed in step 3.

Use the CONSTRAINT clause to define the constraints. Give the constraints names that indicate the table and fields, and the fact that these are foreign key constraints. The following code defines these constraints:

  CONSTRAINT FK_InventoryItems_VendorId
    FOREIGN KEY (VendorId)
    REFERENCES Vendors (VendorId)
    ON DELETE NO ACTION
    ON UPDATE NO ACTION,
 
  CONSTRAINT FK_InventoryItems_StockLocation
    FOREIGN KEY (StockLocation)
    REFERENCES StockLocations (StockLocation)
    ON DELETE NO ACTION
    ON UPDATE NO ACTION,
 
  CONSTRAINT FK_InventoryItems_ShelfLocation
    FOREIGN KEY (ShelfLocation)
    REFERENCES ShelfLocations (ShelfLocation)
    ON DELETE NO ACTION
    ON UPDATE NO ACTION

Note that the foreign key fields must be indexes in the foreign table, at least in MariaDB.

The following code shows the complete CREATE TABLE statement:

CREATE TABLE IF NOT EXISTS InventoryItems (
  UPC               VARCHAR(40)     NOT NULL,
  Description       VARCHAR(45)     NOT NULL,
  BuyPrice          DECIMAL(10,2)   NULL,
  SellPrice         DECIMAL(10,2)   NOT NULL,
  QuantityinStock   INT             NOT NULL,
  ExpirationDate    DATE            NULL,
  StockLocation     VARCHAR(40)     NULL,
  ShelfLocation     VARCHAR(40)     NULL,
  ReorderWhen       INT             NULL,
  ReorderAmount     INT             NULL,
  VendorId          INT             NULL,
 
  PRIMARY KEY (UPC),
 
  INDEX Index_VendorId (VendorId ASC),
  INDEX Index_StockLocation (StockLocation ASC),
  INDEX Index_ShelfLocation (ShelfLocation ASC),
 
  CONSTRAINT FK_InventoryItems_VendorId
    FOREIGN KEY (VendorId)
    REFERENCES Vendors (VendorId)
    ON DELETE NO ACTION
    ON UPDATE NO ACTION,
 
  CONSTRAINT FK_InventoryItems_StockLocation
    FOREIGN KEY (StockLocation)
    REFERENCES StockLocations (StockLocation)
    ON DELETE NO ACTION
    ON UPDATE NO ACTION,
 
  CONSTRAINT FK_InventoryItems_ShelfLocation
    FOREIGN KEY (ShelfLocation)
    REFERENCES ShelfLocations (ShelfLocation)
    ON DELETE NO ACTION
    ON UPDATE NO ACTION
)

The following chapter has more to say about using the CREATE TABLE statement to build databases.

CREATE INDEX

The previous CREATE TABLE example uses INDEX clauses to define indexes for the table as it is being created. The CREATE INDEX statement adds an index to a table after the table has been created.

For example, the following statement adds an index named IDX_People_Names to the People table. This index makes it easier to search the table by using the records' combined FirstName/LastName fields:

CREATE INDEX IDX_People_Names ON People (FirstName, LastName)

Some databases, such as PostgreSQL, don't let you create indexes inside a CREATE TABLE statement. In that case, you must use CREATE INDEX if you want the table to have an index. You can also use CREATE INDEX to add an index to a table if you didn't realize you would need one or you just forgot to do it earlier.

There's also sometimes a strategic reason to add the index after you create the table. Relational databases use complicated self-balancing trees to provide indexes. When you add or delete a record, the database must perform a nontrivial amount of work to update its index structures. If you add the records in sorted order, as is often the case when you first populate the database, this can mean even more work than usual because the tree structures tend to have trouble with sorted values so they perform a lot more work.

When you create an index after the table is populated, the database must perform a fair amount of work to build the index tree, but it has a big advantage that it doesn't when indexing records one at a time: it knows how many records are in the table. Instead of resizing the index tree as it is built one record at a time, the database can build a big empty tree, and then fill it with data.

Some databases may not use this information effectively, but some can fill the table and then add an index more quickly than they can fill the table if the index is created first. Note that the difference is small, so you probably shouldn't worry about creating the indexes separately unless you are loading a lot of records.

DROP

The DROP statement removes an object from the database. For example, the following statement removes the index named IDX_People_Names from the People table in a MariaDB database:

DROP INDEX IDX_People_Names ON People

The following statement shows the Transact-SQL language version of the previous command:

DROP INDEX People.IDX_People_Names

The following shows the PostgreSQL version:

DROP INDEX IDX_People_Names

The other most useful DROP statement is DROP TABLE. You can add the IF EXISTS clause to make the database ignore the command if the table does not already exist. That makes it easier to write scripts that drop tables before creating them. (If you don't add IF EXISTS and you try to drop a table that doesn't exist, the script will crash.)

The following command removes the People table from the database:

DROP TABLE IF EXISTS People

INSERT

The INSERT statement adds data to a table in the database. This command has several variations. For the following examples, assume the People table was created with the following command:

CREATE TABLE IF NOT EXISTS People (
  PersonId     INT           NOT NULL   AUTO_INCREMENT,
  FirstName    VARCHAR(45)   NOT NULL   DEFAULT '<missing>',
  LastName     VARCHAR(45)   NOT NULL   DEFAULT '<none>',
  State        VARCHAR(10)   NULL,
  PRIMARY KEY (PersonId)
)

The simplest form of the INSERT statement lists the values to be inserted in the new record after the VALUES keyword. The values must have the correct data types and must be listed in the same order as the fields in the table.

The following command inserts a new record in the People table:

INSERT INTO People VALUES (1, "Rod", "Stephens", "CO")

QUALITY QUOTES

SQL doesn't care whether you use single or double quotes. That can be useful when you need to compose a SQL statement in a particular programming language. For example, the following code builds the earlier INSERT statement in C#:

    string statement = "INSERT INTO People
      VALUES (1, 'Rod', 'Stephens', 'CO')"

This code uses double quotes to delimit the string, so it's convenient to use single quotes to delimit the values inside the statement.

Some databases will not let you specify values for AUTO INCREMENT fields such as PersonId in this example. If you specify the value null for such a field, the database automatically generates a value for you. (However, some databases won't even let you specify null. In that case, you must use a more complicated version of INSERT that lists the fields so you can omit the AUTO INCREMENT field.)

If you replace a value with the keyword DEFAULT, the database uses that field's default value if it has one.

When it executes the following command, the database automatically generates a PersonId value, the FirstName value defaults to <missing>, the LastName value is set to Markup, and the State value is set to null:

INSERT INTO People VALUES (null, DEFAULT, "Markup", null)

The next form of the INSERT statement explicitly lists the fields that it will initialize. The values in the VALUES clause must match the order of those listed earlier. Listing the fields that you are going to enter lets you omit some fields or change the order in which they are given.

The following statement creates a new People record. It explicitly sets the FirstName field to Snortimer and the State field to Confusion. The database automatically generates a new PersonId value and the LastName value gets its default value <none>:

INSERT INTO People (FirstName, State) VALUES ("Snortimer", "Confusion")

The final version of INSERT INTO described here gets the values that it will insert from a SELECT statement (described in the next section) that pulls values from a table.

The following example inserts values into the SmartPeople table's LastName and FirstName fields. It gets the values from a query that selects FirstName and LastName values from the People table where the corresponding record's State value is not Confusion:

INSERT INTO SmartPeople (LastName, FirstName)
  SELECT LastName, FirstName
  FROM People
  WHERE State <> "Confusion"

Unlike the previous INSERT statements, this version may insert many records in the table if the query returns a lot of data.

SELECT

The SELECT command retrieves data from the database. This is one of the most often used and complex SQL commands. The basic syntax is as follows:

SELECT select_clause
FROM from_clause
[ WHERE where_clause ]
[ GROUP BY group_by_clause ]
[ ORDER BY order_by_clause [ ASC | DESC ] ]

The parts in square brackets are optional and the vertical bar between ASC and DESC means you can include one or the other of those keywords.

The following sections describe the main clauses in more detail.

SELECT Clause

The SELECT clause specifies the fields that you want the query to return.

If a field's name is unambiguous given the tables selected by the FROM clause (described in the next section), you can simply list the field's name as in FirstName.

If multiple tables listed in the FROM clause have a field with the same name, you must put the table's name in front of the field's name as in People.FirstName.

The special value * tells the database that you want to select all of the available fields. If the query includes more than one table in the FROM clause and you want all of the fields from a specific table, you can include the table's name before the asterisk, as in People.*.

For example, the following query returns all the fields for all the records in the People table:

SELECT * FROM People

Optionally, you can give a field an alias by following it with the keyword AS and the alias that you want it to have. When the query returns, it acts as if that field's name is whatever you used as an alias. This is useful for such things as differentiating among fields with the same name in different tables, for creating a new field name that a program can use for a nicer display (for example, changing the CustName field to Customer Name), or for creating a name for a calculated column.

A particularly useful option you can add to the SELECT clause is DISTINCT. This makes the database return only one copy of each set of values that are selected.

For example, suppose the Orders table contains customer first and last names. The following MySQL query selects the FirstName and LastName values from the table, concatenates them into a single field with a space in between, and gives that calculated field the alias Name. The DISTINCT keyword means the query will only return one of each Name result even if a single customer has many records in the table.

SELECT DISTINCT CONCAT(FirstName, " ", LastName) AS Name FROM Orders

The following code shows the Transact-SQL equivalent of this statement:

SELECT DISTINCT FirstName + " " + LastName AS Name FROM Orders

FROM Clause

The FROM clause lists the tables from which the database should pull data. Normally if the query pulls data from more than one table, it either uses a JOIN or a WHERE clause to indicate how the records in the tables are related.

For example, the following statement selects information from the Orders and OrderItems tables. It matches records from the two using a WHERE clause that tells the database to associate Orders and OrderItems records that have the same OrderId value.

SELECT * FROM Orders, OrderItems
WHERE Orders.OrderId = OrderItems.OrderId

Several different kinds of JOIN clauses perform roughly the same function as the previous WHERE clause. They differ in how the database handles records in one table that have no corresponding records in the other table.

For example, suppose the Courses table contains the names of college courses and holds the values shown in Table 26.5.

TABLE 26.5: Courses records

COURSEID	COURSENAME
CS 120	Database Design
CS 245	The Customer: A Necessary Evil
D? = h@p	Introduction to Cryptography

Also suppose the Enrollments table contains the information in Table 26.6 about students taking classes.

TABLE 26.6: Enrollments records

FIRSTNAME	LASTNAME	COURSEID
Guinevere	Conkle	CS 120
Guinevere	Conkle	CS 101
Heron	Stroh	CS 120
Heron	Stroh	CS 245
Maxene	Quinn	CS 245

Now, consider the following query:

SELECT * FROM Enrollments, Courses
WHERE Courses.CourseId = Enrollments.CourseId

This may seem like a simple enough query that selects enrollment information plus each student's class name. For example, one of the records returned would be as follows:

FIRSTNAME	LASTNAME	COURSEID	COURSEID	COURSENAME
Guinevere	Conkle	CS 120	CS 120	Database Design

Note that the result contains two CourseId values, one from each table.

The way in which the kinds of JOIN clause differ is in the way they handle missing values. If you look again at the tables, you'll see that no students are currently enrolled in Introduction to Cryptography. You'll also find that Guinevere Conkle is enrolled in CS 101, which has no record in the Courses table.

The following query that uses a WHERE clause discards any records in one table that have no corresponding records in the second table:

SELECT * FROM Enrollments, Courses
WHERE Courses.CourseId = Enrollments.CourseId

The following statement uses the INNER JOIN clause to produce the same result:

SELECT * FROM Enrollments INNER JOIN Courses
  ON (Courses.CourseId = Enrollments.CourseId)

Table 26.7 shows the results of these two queries.

TABLE 26.7: Courses records

FIRSTNAME	LASTNAME	COURSEID	COURSEID	COURSENAME
Guinevere	Conkle	CS 120	CS 120	Database Design
Heron	Stroh	CS 120	CS 120	Database Design
Maxene	Quinn	CS 245	CS 245	The Customer: A Necessary Evil
Heron	Stroh	CS 245	CS 245	The Customer: A Necessary Evil

The following statement selects the same records except it uses the LEFT JOIN clause to favor the table listed to the left of the clause in the query (Orders). If a record appears in that table, it is listed in the result even if there is no corresponding record in the other table.

SELECT * FROM Orders LEFT JOIN OrderItems
  ON (Orders.OrderId = OrderItems.OrderId)

Table 26.8 shows the result of this query. Notice that the results include a record for Guinevere Conkle's CS 101 enrollment even though CS 101 is not listed in the Courses table. In that record, the fields that should have come from the Courses table have null values.

TABLE 26.8: Courses records

FIRSTNAME	LASTNAME	COURSEID	COURSEID	COURSENAME
Guinevere	Conkle	CS 120	CS 120	Database Design
Heron	Stroh	CS 120	CS 120	Database Design
Maxene	Quinn	CS 245	CS 245	The Customer: A Necessary Evil
Guinevere	Conkle	CS 101	NULL	NULL
Heron	Stroh	CS 245	CS 245	The Customer: A Necessary Evil

Conversely the RIGHT JOIN clause makes the query favor the table to the right of the clause, so it includes all the records in that table even if there are no corresponding records in the other table. The following query demonstrates the RIGHT JOIN clause:

SELECT * FROM Orders RIGHT JOIN OrderItems
  ON (Orders.OrderId = OrderItems.OrderId)

Table 26.9 shows the result of this query. This time there is a record for the Introduction to Cryptography course even though no student is enrolled in it.

TABLE 26.9: Courses records

FIRSTNAME	LASTNAME	COURSEID	COURSEID	COURSENAME
Guinevere	Conkle	CS 120	CS 120	Database Design
Heron	Stroh	CS 120	CS 120	Database Design
Maxene	Quinn	CS 245	CS 245	The Customer: A Necessary Evil
Heron	Stroh	CS 245	CS 245	The Customer: A Necessary Evil
NULL	NULL	NULL	D? = h@p	Introduction to Cryptography

Both the left and right joins are called outer joins because they include records that are outside of the “natural” records that include values from both tables.

Many databases, including MySQL and Access, don't provide a join to select all records from both tables like a combined left and right join. You can achieve a similar result by using the UNION keyword to combine the results of a left and right join. The following query uses the UNION clause:

SELECT * FROM Courses LEFT JOIN Enrollments
  ON Courses.CourseId=Enrollments.CourseId
UNION
SELECT * FROM Courses RIGHT JOIN Enrollments
  ON Courses.CourseId=Enrollments.CourseId

Table 26.10 shows the results.

TABLE 26.10: Courses records

FIRSTNAME	LASTNAME	COURSEID	COURSEID	COURSENAME
Guinevere	Conkle	CS 120	CS 120	Database Design
Heron	Stroh	CS 120	CS 120	Database Design
Maxene	Quinn	CS 245	CS 245	The Customer: A Necessary Evil
Guinevere	Conkle	CS 101	NULL	NULL
Heron	Stroh	CS 245	CS 245	The Customer: A Necessary Evil
NULL	NULL	NULL	D? = h@p	Introduction to Cryptography

WHERE Clause

The WHERE clause provides a filter to select only certain records in the tables. It can compare the values in the tables to constants, expressions, or other values in the tables. You can use parentheses and logical operators such as AND, NOT, and OR to build complicated selection expressions.

For example, the following query selects records from the Enrollments and Courses tables where the CourseId values match and the CourseId is alphabetically less than CS 200 (upper division classes begin with CS 200):

SELECT * FROM Enrollments, Courses
WHERE Enrollments.CourseId = Courses.CourseId
  AND Courses.CourseId < 'CS 200'

Table 26.11 shows the result.

TABLE 26.11: Courses records

FIRSTNAME	LASTNAME	COURSEID	COURSEID	COURSENAME
Guinevere	Conkle	CS 120	CS 120	Database Design
Heron	Stroh	CS 120	CS 120	Database Design

GROUP BY Clause

If you include an aggregate function such as AVERAGE or SUM in the SELECT clause, the GROUP BY clause tells the database which fields to look at to determine whether values should be combined.

For example, the following query selects the CustomerId field from the CreditsAndDebits table. It also selects the sum of the Amount field values. The GROUP BY clause makes the query combine values that have matching CustomerId values for calculating the sums. The result is a list of every CustomerId and the corresponding current total balance calculated by adding up all of that customer's credits and debits.

SELECT CustomerId, SUM(Amount) AS Balance
FROM CreditsAndDebits
GROUP BY CustomerId

ORDER BY Clause

The ORDER BY clause specifies a list of fields that the database should use to sort the results. The optional keyword DESC after a field makes the database sort that field's values in descending order. (The default order is ascending. You can explicitly include the ASC keyword if you want to make the order obvious.)

The following query selects the CustomerId field and the total of the Amount values for each CustomerId from the CreditsAndDebits table. It sorts the results in descending order of the total amount so that you can see who has the largest balance first.

SELECT CustomerId, SUM(Amount) AS Balance
FROM CreditsAndDebits
GROUP BY CustomerId
ORDER BY Balance DESC

The following query selects the distinct first and last name combinations from the Enrollments table and orders the results by LastName and then by FirstName. (For example, if two students have the same last name Zappa, then Dweezil comes before Moon Unit.)

SELECT DISTINCT LastName, FirstName
FROM Enrollments
ORDER BY LastName, FirstName

BE SELECTIVE

Though the SELECT statement has many variations, the basic ideas are reasonably intuitive so a little practice will go a long way toward learning how to write SELECT statements.

Suppose the Authors table has fields AuthorId, FirstName, and LastName. Suppose also that the Books table has fields AuthorId, Title, ISBN (International Standard Book Number), MSRP (Manufacturer's Suggested Retail Price), Year, and Pages. Write a query to select book titles, prices, and author names. Concatenate the authors' first and last names and give the result the alias Author. Select only books where MSRP is less than $10.00 (which these days may not return many results). Sort the results by price in ascending order.

Write the SELECT clause.
Write a FROM clause to select an inner join using the Authors and Books tables.
Write a WHERE clause to select records where MSRP < $10.00.
Write an ORDER BY clause to sort the results by MSRP in ascending order.

How It Works

Write the SELECT clause.
The basic SELECT clause includes the Books table's Title and MSRP fields. It also concatenates the Authors table's FirstName and LastName fields with a space in between. The following code shows the SELECT clause:
```
SELECT MSRP, Title, CONCAT(FirstName, " ", LastName) AS Author
```
Write a FROM clause to select an inner join using the Authors and Books tables.
The FROM clause selects an inner join using the Authors and Books tables. This query uses the tables' AuthorId fields to match corresponding records. The following code shows the FROM clause:
```
FROM Books INNER JOIN Authors
  ON (Books.AuthorId = Authors.AuthorId)
```
Write a WHERE clause to select records where MSRP < $10.00.
The WHERE clause adds a further condition on the selected records, requiring that the MSRP value be less than 10. The following code shows the WHERE clause:
```
WHERE MSRP < 10
```
Write an ORDER BY clause to sort the results by MSRP in ascending order.
The ORDER BY clause sorts the results by MSRP. The default order for ORDER BY clauses is ascending, so the statement doesn't need to explicitly include the ASC keyword. The following code shows the ORDER BY clause:
```
ORDER BY MSRP
```
The following code shows the complete query.
```
SELECT MSRP, Title, CONCAT(FirstName, " ", LastName) AS Author
FROM Books INNER JOIN Authors
  ON (Books.AuthorId = Authors.AuthorId)
WHERE MSRP < 10
ORDER BY MSRP
```

UPDATE

The UPDATE statement changes the values in one or more records' fields. The basic syntax is:

UPDATE table SET field = new_value
WHERE where_clause

For example, the following statement fixes a typo in the Books table. It changes the Title field's value to “The Portable Door” in any records that currently have the incorrect Title “The Potable Door.”

UPDATE Books SET Title = "The Portable Door"
WHERE Title = "The Potable Door"

The WHERE clause is extremely important in an UPDATE statement. If you forget the WHERE clause, the update affects every record in the table! (See the earlier warning about do-overs, crossed fingers, and backsies.) If you forget the WHERE clause in the previous example, the statement would change the title of every book to “The Portable Door,” which is probably not what you intended.

The effects of the UPDATE statement are immediate and irreversible so forgetting the WHERE clause can be disastrous. (In fact, some developers have suggested that an UPDATE statement without a WHERE clause should generate an error unless you take special action to say “yes, I'm really, really sure.”)

UPDATES

Suppose the Sales table includes the fields EmployeeId, Year, Month (which holds three-letter month abbreviations), TotalSales (the number of light sabers sold), and Salary. Write an update statement that gives a $100 bonus to employees who made their sales quota of 10 light sabers sold during the month of August 2025.

Write the UPDATE statement, including the table name and the SET clause.
Write the WHERE clause to select the records that should be updated.

How It Works

Write the UPDATE statement, including the table name and the SET clause.
The UPDATE clause will affect the Sales table. You need to add $100 to the Salary field for certain records. To do that, the clause sets the Salary value to current Salary value plus $100. The following code shows the UPDATE clause:
```
UPDATE Sales
SET Salary = Salary + 100
```
Write the WHERE clause to select the records that should be updated.
The WHERE clause has three parts that require the number of light sabers sold to be at least 10, the month to be AUG, and the year to be 2025. The following code shows the WHERE clause:
```
WHERE TotalSales>= 10
  AND Month = "AUG"
  AND Year = 2025
```
The following code shows the complete UPDATE statement:
```
UPDATE Sales
SET Salary = Salary + 100
WHERE TotalSales>= 10
AND Month = "AUG"
AND Year = 2025
```

DELETE

The DELETE statement removes records from a table. The basic syntax is:

DELETE FROM table
WHERE where_clause

For example, the following statement removes all records from the Books table where the AuthorId is 7:

DELETE FROM Books
WHERE AuthorId = 7

As is the case with UPDATE, the WHERE clause is very important in a DELETE statement. If you forget the WHERE clause, the DELETE statement removes every record from the table mercilessly and without remorse.

SUMMARY

SQL is a powerful tool. The SQL commands described in this chapter let you perform basic database operations such as determining the database's structure and contents. This chapter explained how to do the following:

Use the CREATE TABLE statement to create a table with a primary key, indexes, and foreign key constraints.
Use INSERT statements to add data to a table.
Use SELECT statements to select data from one or more tables, satisfying specific conditions, and sort the result.
Use the UPDATE statement to modify the data in a table.
Use the DELETE statement to remove records from a table.

SQL statements let you perform simple tasks with a database such as creating a new table or inserting a record. By combining many SQL statements into a script, you can perform elaborate procedures such as creating and initializing a database from scratch. The next chapter explains this topic in greater detail. It describes the benefits of using scripts to create databases and discusses some of the issues that you should understand before writing those scripts.

Before you move on to Chapter 27, however, use the following exercises to test your understanding of the material covered in this chapter. You can find the solutions to these exercises in Appendix A.

EXERCISES

Write SQL statements to create the three tables shown in Figure 26.2. Include the primary keys, foreign key constraints, and indexes on the fields used in those constraints.

FIGURE 26.2
Write a series of SQL statements to insert the data shown in Figure 26.3.

FIGURE 26.3

Hint: Use 24-hour clock times as in 14:00 for 2:00pm.
Management has decided that no two shows should start fewer than 15 minutes apart. Write SQL statements to change the 2:00 Flamingo Follies show to 2:15 and the 3:00 Sherm's Shark Show to 3:15.
Hint: Include both the ShowTimes and Shows tables in the UPDATE clause. Then use a WHERE clause to select the correct record by Time and ShowName.

Write a SQL statement to select data from the tables and produce the following result.

SHOW	TIME	LOCATION
Deb's Daring Dolphins	11:00am	Peet Amphitheater
Sherm's Shark Show	11:15am	Sherman's Lagoon
Deb's Daring Dolphins	12:00pm	Peet Amphitheater
Meet the Rays	1:15pm	Sherman's Lagoon
The Walter Walrus Comedy Hour	2:00pm	Peet Amphitheater
Flamingo Follies	2:15pm	Ngorongoro Wash
Wonderful Waterfowl	3:00pm	Ngorongoro Wash
Sherm's Shark Show	3:15pm	Sherman's Lagoon
The Walter Walrus Comedy Hour	5:27pm	Peet Amphitheater
Meet the Rays	6:00pm	Sherman's Lagoon
Deb's Daring Dolphins	6:30pm	Peet Amphitheater

Hints:

Sort the results by Show and then Time.
In MySQL at least, SHOW is a keyword, which means you cannot simply use AS Show to give the ShowName field the alias Show because that would confuse the database. Instead put quotes around the word Show wherever you need it as in AS "Show" and ORDER BY "Show".
To format the times as in 6:30 PM in MySQL, use the DATE_FORMAT function. To make the times line up nicely on the right, use the LPAD function to pad them on the left with spaces. The following code shows how:
```
LPAD(DATE_FORMAT(Time, "%l:%i %p"), 8, " ")
```
Unfortunately when you pad the times, the ORDER BY statement treats the result as a string rather than a time. That means, for example, “ 3:00 PM” comes alphabetically before “11:00 AM” because “ 3:00 AM” begins with a space. To fix this, use the TIME function to convert the times as strings back into times in the ORDER BY clause. For example, if you use the alias Time for the result of this field, then the ORDER BY clause should contain the following:
```
TIME(Time)
```

Write a SQL statement to select data from the tables and produce the following result.

TIME	SHOW	LOCATION
11:00am	Deb's Daring Dolphins	Peet Amphitheater
11:15am	Sherm's Shark Show	Sherman's Lagoon
12:00pm	Deb's Daring Dolphins	Peet Amphitheater
1:15pm	Meet the Rays	Sherman's Lagoon
2:00pm	The Walter Walrus Comedy Hour	Peet Amphitheater
2:15pm	Flamingo Follies	Ngorongoro Wash
3:00pm	Wonderful Waterfowl	Ngorongoro Wash
3:15pm	Sherm's Shark Show	Sherman's Lagoon
5:27pm	The Walter Walrus Comedy Hour	Peet Amphitheater
6:00pm	Meet the Rays	Sherman's Lagoon
6:30pm	Deb's Daring Dolphins	Peet Amphitheater

See Exercise 4 for hints.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 26 Introduction to SQL

Create new playlist

Sign In

Sign Up

BACKGROUND

FINDING MORE INFORMATION

STANDARDS

MULTISTATEMENT COMMANDS

BASIC SYNTAX

COMMAND OVERVIEW

CREATE TABLE

CREATE INDEX

DROP

INSERT

SELECT

SELECT Clause

FROM Clause

WHERE Clause

GROUP BY Clause

ORDER BY Clause

UPDATE

DELETE

SUMMARY

Table of Contents for
26 Introduction to SQL