Chapter 6
IN THIS CHAPTER
Executing SQL statements
Using (and misusing) reserved words
Working with SQL’s data types
Handling null values
Applying constraints
In this chapter, I get into the nitty-gritty of SQL. This is knowledge you need to master before you embark on actually writing SQL statements. SQL has some similarities to computer languages you may already be familiar with, and some important differences. I touch on some of these similarities and differences right here in this chapter, but will discuss others later when I get to the appropriate points in a complete discussion of SQL.
SQL is not a complete language, but a data sublanguage. As such, you cannot write a program in the SQL language like you can with C or Java. That doesn’t mean SQL is useless, though. There are several ways that you can use SQL. Say you have a query editor up on your screen and all you want is the answer to a simple question. Just type an SQL query, and the answer, in the form of one or more lines of data, appears on your screen. This mode of operation is called interactive SQL.
If your needs are more complex, you have two additional ways of making SQL queries: embedded SQL and module language, both of which I describe later in this chapter.
Interactive SQL consists of entering SQL statements into a database management system such as SQL Server, Oracle, or DB2. The DBMS then performs the commands specified by the statements. You could build a database from scratch this way, starting with a CREATE DATABASE
statement, and building everything from there. You could fill it with data, and then type queries to selectively pull information out of it.
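For example, a minimal interactive session might look like the following sketch. The table and data here are invented for illustration; you type each statement into your DBMS’s query editor and see the results immediately:

```sql
CREATE TABLE CUSTOMER (
    CustomerID INTEGER PRIMARY KEY,
    LastName   VARCHAR (25)
) ;

INSERT INTO CUSTOMER VALUES (1001, 'Wong') ;

-- The answer appears on your screen as a line of data
SELECT LastName
FROM CUSTOMER
WHERE CustomerID = 1001 ;
```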
Although it’s possible to do everything you need to do to a database with interactive SQL, this approach has a couple of disadvantages, which is why most database applications hide the SQL from their users.
SQL is the only language that most relational databases understand, so there is no getting around using it. However, the people who interact with databases the most — those folks that ask questions of the data — do not need to be exposed to naked SQL. They can be protected from that intimidating prospect by wrapping the SQL in a blanket of code written in another language. With that other language, a programmer can generate screens, forms, menus, and other familiar objects for the user to interact with. Ultimately, those things translate the user’s actions to SQL code that the DBMS understands. The desired information is retrieved, and the user sees the result.
SQL has some fundamental differences from the host languages that you might want to combine it with. Chief among them is that SQL’s data types do not match the data types of any host language, which is why SQL provides the CAST statement for translating one data type into another.

Until recently, the most common form of SQL has been embedded SQL. This method uses a general-purpose computer language such as C, C++, or COBOL to write the bulk of an application. Such languages are great for creating an application’s user interface. They can create forms with buttons and menus, format reports, perform calculations, and basically do all the things that SQL cannot do. In a database application, however, sooner or later, the database must be accessed. That’s a job for SQL.
It makes sense to write the application in a host language and, when needed, drop in SQL statements to interact with the data. It is the best of both worlds. The host language does what it’s best at, and the embedded SQL does what it’s best at. The only downside to the cooperative arrangement is that the host language compiler will not recognize the SQL code when it encounters it and will issue an error message. To avoid this problem, a precompiler processes the SQL before the host language compiler takes over. When everything works, this is a great arrangement. Before everything works, however, debugging can be tough because a host language debugger doesn’t know how to handle any SQL that it encounters. Nevertheless, embedded SQL remains the most popular way to create database applications.
For example, look at a fragment of C code that contains embedded SQL statements. This particular fragment is written in Oracle’s Pro*C dialect of the C language and is code that might be found in an organization’s human resources department. This particular code block is designed to authenticate and log on a user, and then enable the user to change the salary and commission information for an employee.
EXEC SQL BEGIN DECLARE SECTION;
    VARCHAR uid[20];
    VARCHAR pwd[20];
    VARCHAR ename[10];
    float salary, comm;
    short salary_ind, comm_ind;
EXEC SQL END DECLARE SECTION;
main()
{
    int sret;                          /* scanf return code */
    /* Log in */
    strcpy(uid.arr,"Mary");            /* copy the user name */
    uid.len=strlen(uid.arr);
    strcpy(pwd.arr,"Bennett");         /* copy the password */
    pwd.len=strlen(pwd.arr);
    EXEC SQL WHENEVER SQLERROR STOP;
    EXEC SQL WHENEVER NOT FOUND STOP;
    EXEC SQL CONNECT :uid;
    printf("Connected to user: %s\n", uid.arr);
    printf("Enter employee name to update: ");
    scanf("%s", ename.arr);
    ename.len=strlen(ename.arr);
    EXEC SQL SELECT SALARY,COMM INTO :salary,:comm
             FROM EMPLOY
             WHERE ENAME=:ename;
    printf("Employee: %s salary: %6.2f comm: %6.2f\n",
           ename.arr, salary, comm);
    printf("Enter new salary: ");
    sret=scanf("%f",&salary);
    salary_ind = 0;
    if (sret == EOF || sret == 0)      /* set indicator */
        salary_ind = -1;               /* Set indicator for NULL */
    printf("Enter new commission: ");
    sret=scanf("%f",&comm);
    comm_ind = 0;                      /* set indicator */
    if (sret == EOF || sret == 0)
        comm_ind = -1;                 /* Set indicator for NULL */
    EXEC SQL UPDATE EMPLOY
             SET SALARY=:salary:salary_ind,
                 COMM=:comm:comm_ind
             WHERE ENAME=:ename;
    printf("Employee %s updated.\n", ename.arr);
    EXEC SQL COMMIT WORK;
    exit(0);
}
Here’s a closer look at what the code does:
In this implementation, every SQL statement is introduced with an EXEC SQL
directive. This is a clue to the compiler not to try to compile what follows, but instead to pass it directly to the DBMS’s database engine.
Module language is similar to embedded SQL in that it combines the strengths of SQL with those of a host language. However, it does it in a slightly different way. All the SQL code is stored — as procedures — in a module separate from the host language program. Whenever the host language program needs to perform a database operation, it calls a procedure from the SQL module to do the job. With this arrangement, all your SQL is kept out of the main program, so the host language compiler has no problem, and neither does the debugger. All they see is host language code, including the procedure calls. The procedures themselves cause no difficulty because they are in a separate module, and the compiler and debugger just skip over them.
Another advantage of module language over embedded SQL is that the SQL code is separated from the host language code. Because high skill in both SQL and any given host language is rare, it is difficult to find good people to program embedded SQL applications. Because a module language implementation separates the languages, you can hire the best SQL programmer to write the SQL, and the best host language programmer to write the host language code. Neither one has to be an expert in the other language.
To see how this would work, check out the following module definition, which shows you the syntax you’d use to create a module that contains SQL procedures:
MODULE [module-name]
[NAMES ARE character-set-name]
LANGUAGE {ADA|C|COBOL|FORTRAN|MUMPS|PASCAL|PLI|SQL}
[SCHEMA schema-name]
[AUTHORIZATION authorization-id]
[temporary-table-declarations…]
[cursor-declarations…]
[dynamic-cursor-declarations…]
procedures…
The MODULE
declaration is mandatory, but the module name is not. (It’s a good idea to name your modules anyway, just to reduce the confusion.) With the optional NAMES ARE
clause, you can specify a character set — Hebrew, for example, or Cyrillic. The default character set will be used if you don’t include a NAMES ARE
clause.
The next line lets you specify a host language — something you definitely have to do. Each language has different expectations about what the procedure will look like, so the LANGUAGE
clause determines the format of the procedures in the module.
Although the SCHEMA
clause and the AUTHORIZATION
clause are both optional, you must specify at least one of them. The AUTHORIZATION
clause is a security feature. If your authorization ID does not carry sufficient privileges, you won’t be allowed to use the procedures in the module.
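To make the skeleton concrete, here is a sketch of a minimal module containing one procedure. The module, schema, table, and procedure names are all invented for illustration, and the exact parameter syntax varies somewhat among implementations:

```sql
MODULE Inventory_access
    LANGUAGE C
    SCHEMA Inventory
    AUTHORIZATION warehouse_mgr

    PROCEDURE Get_quantity
        (SQLSTATE, :item_id INTEGER, :qty INTEGER) ;
        SELECT Quantity INTO :qty
            FROM ITEM
            WHERE ItemID = :item_id ;
```

A C program could then call Get_quantity like any other external function, and the C compiler would never see a line of SQL.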
Given the fact that SQL makes constant use of command words such as CREATE
and ALTER
, it stands to reason that it would probably be unwise to use these same words as the names of tables or variables. To do so is a guaranteed way to confuse your DBMS. In addition to such command words, a number of other words also have a special meaning in SQL. These reserved words should also not be used for any purpose other than the one for which they are designed. Consider the following SQL statement:
SELECT CustomerID, FirstName, LastName
FROM Customer
WHERE CustomerID < 1000;
SELECT
is a command word, and FROM
and WHERE
are reserved words. SQL has hundreds of reserved words, and you must be careful not to inadvertently use any of them as the names of objects or variables. Appendix A of this book contains a list of reserved words in ISO/IEC SQL:2016.
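If you absolutely must use a reserved word as a name, ISO standard SQL lets you enclose it in double quotes to form a delimited identifier, although doing so invites confusion. The table here is invented for illustration:

```sql
-- ORDER is a reserved word, so this fails in most DBMSs:
--   CREATE TABLE Order (OrderID INTEGER) ;
-- A delimited identifier sidesteps the problem, but you must then
-- quote the name every time you refer to it:
CREATE TABLE "Order" (
    OrderID   INTEGER PRIMARY KEY,
    OrderDate DATE
) ;
```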
SQL is capable of dealing with data of many different types — as this aptly named section will soon make clear. From the beginning, SQL has been able to handle the common types of numeric and character data, but more recently, new types have been added that enable SQL to deal with nontraditional data types, such as BLOB, CLOB, and BINARY. At present, there are eleven major categories of data types: exact numerics, approximate numerics, character strings, binary strings, Booleans, datetimes, intervals, XML type, collection types, REF types, and user-defined types. Within each category, one or more specific types may exist.
With those categories in mind, read on to find brief descriptions of each one, as well as enumerations of the standard types they include.
Because computers store numbers in registers of finite size, there is a limit to how large or small a number can be and still be represented exactly. There is a range of numbers centered on zero that can be represented exactly. The size of that range depends on the size of the registers that the numbers are stored in. Thus a machine with 64-bit registers can exactly represent a range of numbers that is wider than the range that can be exactly represented on a machine with 32-bit registers.
After doing all the complex math, you’re left with six standard exact numeric data types. They are
INTEGER
SMALLINT
BIGINT
NUMERIC
DECIMAL
DECFLOAT
The next few sections drill down deeper into each type.
Data of the INTEGER
type is numeric data that has no fractional part. Any given implementation of SQL will have a limit to the number of digits that an integer can have. If, for some reason, you want to specify a maximum size for an integer that is less than the default maximum, you can restrict the maximum number of digits by specifying a precision argument. By declaring a variable as having type INTEGER
(10)
, you are saying numbers of this type can have no more than ten digits, even if the system you are running on is capable of handling more digits. Of course, if you specify a precision that exceeds the maximum capacity of the system, you’re not gonna get it no matter how much you whine. You cannot magically expand the sizes of the hardware registers in a machine with an SQL declaration.
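For example, assuming your implementation accepts a precision argument for integers, column declarations might look like this (the table and column names are invented for illustration):

```sql
CREATE TABLE FLEET (
    VehicleID INTEGER,        -- the implementation's default precision
    Odometer  INTEGER (10)    -- no more than ten digits
) ;
```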
The SMALLINT
data type is similar to the INTEGER
type, but how it differs from the INTEGER
type is implementation-dependent. It may not differ from the INTEGER
type at all. The only constraint on the SMALLINT
type is that its precision may be no larger than the precision of the INTEGER
type.
For systems where the precision of the SMALLINT
type actually is less than the precision of the INTEGER
type, it may be advantageous to specify variables as being of the SMALLINT
type if you can be sure that the values of those variables will never exceed the precision of the SMALLINT
type. This saves you some storage space. If storage space is not an issue, or if you cannot be absolutely sure that the value of a variable will never exceed the precision of the SMALLINT
type, you may be better off specifying it as being of the INTEGER
type.
The BIGINT
type is similar to the SMALLINT
type. The only difference is that the precision of the BIGINT
type can be no smaller than the precision of the INTEGER
type. As is the case with SMALLINT
, the precision of the BIGINT
type could be the same as the precision of the INTEGER
type.
If the precision of the BIGINT
type for any given implementation is actually larger than the precision of the INTEGER
type, a variable of the BIGINT
type will take up more storage space than a variable of the INTEGER
type. Only use the BIGINT
type if there is a possibility that the size of a variable may exceed the precision of the INTEGER
type.
Data of the NUMERIC
type does have a fractional part. This means the number contains a decimal point and zero or more digits to the right of the decimal point. For NUMERIC
data, you can specify both precision and scale. The scale of a number is the number of digits to the right of the decimal point. For example, a variable declared as of type NUMERIC
(10, 2)
would have a maximum of ten digits, with two of those digits to the right of the decimal point. The largest number you can represent with this type is 99,999,999.99. If the system you are running on happens to be able to handle numbers with precision greater than ten, only the precision you specify will be used.
Data of the DECIMAL
type is similar to data of the NUMERIC
type with one difference. For data of the DECIMAL
type, if the system you are running on happens to be able to handle numbers with larger precision than what you have specified, the extra precision will be used.
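For example, the following invented table shows both types side by side, with the contrast between them noted in comments:

```sql
CREATE TABLE INVOICE (
    InvoiceID INTEGER,
    Subtotal  NUMERIC (10, 2),   -- never exceeds 99,999,999.99
    Tax       DECIMAL (10, 2)    -- may use extra precision if the
                                 -- hardware provides it
) ;
```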
DECFLOAT is a new exact numeric data type in SQL:2016, added to the ISO/IEC standard specifically for business applications that deal with exact decimal values. Floating-point data types, such as REAL and DOUBLE PRECISION, can handle larger numbers than exact numerics such as NUMERIC and DECIMAL, but they cannot be counted upon to produce exact decimal values. DECFLOAT can handle larger numbers than the other exact numeric data types while retaining the exactness of an exact numeric type.
The approximate numeric types (all three of them) exist so that you can represent numbers either too large or too small to be represented by an exact numeric type. If, for example, a system has 32-bit registers, then the largest number that can be represented with an exact numeric type is the largest number that can be represented with 32 binary digits — which happens to be 4,294,967,295 in decimal. If you have to deal with numbers larger than that, you must move to approximate numerics or buy a computer with 64-bit registers. Using approximate numerics may not be much of a hardship: For most applications, after you get above four billion, approximations are good enough.
Similarly, values very close to zero cannot be represented with exact numerics either. The smallest number that can be represented exactly on a 32-bit machine has a one in the least significant bit position and zeros everywhere else. This is a very small number, but there are a lot of numbers of interest, particularly in science, that are smaller. For such numbers, you must also rely on approximate numerics.
With that intro out of the way, it’s time to meet the three approximate numeric types: REAL
, DOUBLE PRECISION
, and FLOAT
.
The REAL
data type is what you would normally use for single-precision floating-point numbers. The exact meaning of the term single precision is hardware-dependent: a machine with 64-bit registers will, in general, have greater precision than a machine with 32-bit registers. How much greater may vary from one implementation to another.
A double-precision number, which is the basis for the double precision (DOUBLE
) data type, on any given system has greater precision than a real number on the same system. However, despite the name, a double-precision number does not necessarily have twice the precision of a real number. The most that can be said in general is that a double-precision number on any given system has greater precision than does a real number on the same system. On some systems, a double-precision number may have a larger mantissa than does a real number. On other systems, a double-precision number may support a larger exponent (absolute value). On yet other systems, both mantissa and exponent of a double-precision number may be larger than for a real number. You will have to look at the specifications for whatever system you are using to find out what is true for you.
The FLOAT
data type is very similar to the REAL
data type. The difference is that with the FLOAT
data type you can specify a precision. With the REAL
and DOUBLE PRECISION
data types, the default precision is your only option. Because the default precision of these data types can vary from one system to another, porting your application from one system to another could be a problem. With the FLOAT
data type, specifying the precision of an attribute on one machine guarantees that the precision will be maintained after porting the application to another machine. If a system’s hardware supports double-precision operations and the application requires double-precision operations, the FLOAT
data type automatically uses the double-precision circuitry. If single-precision is sufficient, it uses that.
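Declarations of the three approximate numeric types might look like the following sketch (the table and column names are invented, and the FLOAT precision is given in binary digits):

```sql
CREATE TABLE READING (
    SensorID    INTEGER,
    Temperature FLOAT (20),        -- at least 20 bits of precision,
                                   -- preserved across platforms
    Voltage     REAL,              -- the system's single precision
    Distance    DOUBLE PRECISION   -- the system's double precision
) ;
```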
After numbers, the next most common thing to be stored is strings of alphanumeric characters. SQL provides several character string types, each with somewhat different characteristics from the others. The three main types are CHARACTER
, CHARACTER VARYING
, and CHARACTER LARGE OBJECT
. These three types are mirrored by NATIONAL CHARACTER
, NATIONAL CHARACTER VARYING
, and NATIONAL CHARACTER LARGE OBJECT
, which deal with character sets other than the default one (usually the character set of the English language).
A column defined as being of type CHARACTER
or CHAR
can contain any of the normal alphanumeric characters of the language being used. A column definition also includes the maximum length allowed for an item of the CHAR
type. Consider this example:
Name CHAR (15)
This field can hold a name up to 15 characters long. If the name is less than 15 characters long, the remaining spaces are filled with blank characters to bring the total length up to 15. Thus a CHARACTER
field always takes up the same amount of space in memory, regardless of how long the actual data item in the field is.
The CHARACTER VARYING
or VARCHAR
data type is like the CHARACTER
type in all respects except that short entries are not padded out with blanks to fill the field to the stated maximum.
Name VARCHAR (15)
The VARCHAR
data type doesn’t add blanks on the end of a name. Thus if the Name field contains Joe, the length of the field that is stored will be only three characters rather than fifteen.
Any implementation of SQL has a limit to the number of characters that are allowed in a CHARACTER
or CHARACTER VARYING
field. For example, the maximum length of a VARCHAR2 value in Oracle 11g is 4,000 bytes. If you want to store text that goes beyond your implementation’s limit, you can use the CHARACTER LARGE OBJECT
data type. The CLOB
type, as it is affectionately known, is much less flexible than either the CHAR
or VARCHAR
types in that it does not allow you to do many of the fine-grained manipulations that you can do in those other types. You can compare two CLOB
items for equality, but that’s about all you can do. With CHARACTER
type data you can, for example, scan a string for the first occurrence of the letter W, and display where in the string it occurs. This type of operation is not possible with CHARACTER LARGE OBJECT
data.
Here’s an example of the declaration of a CHARACTER LARGE OBJECT
:
Dream CLOB (8721)
Another restriction on CLOB
data is that a CLOB
data item may not be used as a primary key or a foreign key. Furthermore, you cannot apply the UNIQUE
constraint to an item of the CLOB
type. The bottom line is that the CLOB
data type enables you to store and retrieve large blocks of text, but it turns out you can’t do much with them beyond that.
Different languages use different character sets. For example, Spanish and German have letters with diacritical marks that change the way the letter is pronounced. Other languages, such as Russian, have an entirely different character set. To store character strings that contain these different character sets, the various national character types have been added to SQL. If the English character type is the default on your system, as it is for most people, you can designate a different character set as your national character set. From that point on, when you specify a data type as NATIONAL CHARACTER
, NATIONAL CHARACTER VARYING
, or NATIONAL CHARACTER LARGE OBJECT
, items in columns so specified use the chosen national character set rather than the default character set.
In addition to whatever national character set you specify, you can use multiple other character sets in a table definition, by specifying them explicitly. Here’s an example where the national character set is Russian, but you explicitly add Greek and Kanji (Japanese) to the mix:
CREATE TABLE BOOK_TITLE_TRANSLATIONS (
English CHARACTER (40),
Greek VARCHAR (40) CHARACTER SET GREEK,
Russian NATIONAL CHARACTER (40),
Japanese CHARACTER (40) CHARACTER SET KANJI
) ;
The various binary string data types were added to SQL:2008. Binary strings are like character strings except that the only characters allowed are 1 and 0. There are three different types of binary strings, BINARY
, BINARY VARYING
, and BINARY LARGE OBJECT
.
A string of binary characters of the BINARY
type must be some multiple of eight bits long. You can specify such a string with BINARY (x)
, where x is the number of bytes of binary data contained in the string. For example, if you specify a binary string with BINARY (2)
, then the string will be two bytes, or 16 bits long. Byte one is defined as the first byte of the string.
The BINARY VARYING
or VARBINARY
type is like the BINARY
type except the string length need not be x bytes long. A string specified as VARBINARY (x)
can be a minimum of zero bytes long and a maximum of x bytes long.
The BINARY LARGE OBJECT
(BLOB
) type is used for a really large binary number. That large binary number may represent the pixels in a graphical image, or something else that doesn’t seem to be a number. However, at the most fundamental level, it is a number.
The BLOB
type, like the CLOB
type, was added to the SQL standard to reflect the reality that more and more of the things that people want to store in databases do not fall into the classical categories of being either numbers or text. You cannot perform arithmetic operations on BLOB
data, but at least you can store it in a relational database and perform some elementary operations on it.
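Declarations of the three binary string types might look like this sketch (the table, column names, and sizes are invented for illustration):

```sql
CREATE TABLE BADGE (
    EmployeeID INTEGER,
    Flags      BINARY (2),       -- exactly 2 bytes (16 bits)
    Thumbprint VARBINARY (512),  -- from 0 to 512 bytes
    Photo      BLOB (1000000)    -- a large binary object, such as
                                 -- the pixels of an image
) ;
```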
A column of the BOOLEAN
data type, named after nineteenth-century English mathematician George Boole, will accept any one of three values: TRUE
, FALSE
, and UNKNOWN
. The fact that SQL entertains the possibility of NULL
values expands the traditional restriction of Boolean values from just TRUE
and FALSE
to TRUE
, FALSE
, and UNKNOWN
. If a Boolean TRUE
or FALSE
value is compared to a NULL
value, the result is UNKNOWN
. Of course, comparing a Boolean UNKNOWN
value to any value also gives an UNKNOWN
result.
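The following sketch (with an invented table) shows the practical consequence: a row whose Boolean column is NULL compares as UNKNOWN, so it matches neither the TRUE test nor the FALSE test:

```sql
CREATE TABLE TASK (
    TaskName VARCHAR (30),
    Done     BOOLEAN          -- TRUE, FALSE, or NULL
) ;

-- A row where Done is NULL is returned by neither of these queries:
SELECT TaskName FROM TASK WHERE Done = TRUE ;
SELECT TaskName FROM TASK WHERE Done = FALSE ;
```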
You often need to store either dates, times, or both, in addition to numeric and character data. ISO/IEC standard SQL defines five datetime types. Because considerable overlap exists among the five types, not all implementations of SQL include all five types. This could cause problems if you try to migrate a database from a platform that uses one subset of the five types to a platform that uses a different subset. There is not much you can do about this except deal with it when the issue arises.
The DATE
data type is the one to use if you care about the date of something but could not care less about the time of day within a date. The DATE
data type stores a year, month, and day in that order, using ten character positions in the form yyyy-mm-dd. If you were recording the dates that humans first landed on the Moon, the entry for Apollo 11 would be 1969-07-20.
Suppose you want to store the time of day, but don’t care which day, and furthermore, don’t even care which time zone the time refers to? In that case, the TIME WITHOUT TIME ZONE
data type is just the ticket. It stores hours, minutes, and seconds. The hours and minutes data occupies two digits apiece. The seconds data also occupies two digits, but in addition may include a fractional part for fractions of a second. If you specify a column as being of TIME WITHOUT TIME ZONE
type, with no parameter, it will hold a time that has no fractional seconds. An example is 02:56:31, which is fifty-six minutes and thirty-one seconds after two in the morning.
For greater precision in storing a time value, you can use a parameter to specify the number of digits beyond the decimal point that will be stored for seconds. Here’s an example of such a definition:
Smallstep TIME WITHOUT TIME ZONE (2),
In this example, there are two digits past the decimal point, so time is measured down to a hundredth of a second. It would take the form of 02:56:31.17.
The TIME WITH TIME ZONE
data type gives you all the information that you get in the TIME WITHOUT TIME ZONE
data type, and adds the additional fact of what time zone the time refers to. All time zones around the Earth are referenced to Coordinated Universal Time (UTC), formerly known as Greenwich Mean Time (GMT). Coordinated Universal Time is the time in Greenwich, U.K., which was the place where people first started being concerned with highly accurate timekeeping. Of course, the United Kingdom is a fairly small country, so UTC is in effect throughout the entire U.K. In fact, a huge “watermelon slice” of the Earth, running from the North Pole to the South Pole, is also in the same time zone as Greenwich. There are 24 such slices that girdle the Earth. Times around the earth range from eleven hours and fifty-nine minutes behind UTC to twelve hours ahead of UTC (not counting Daylight Saving Time). If Daylight Saving Time is in effect, the offset from UTC could be as much as –12:59 or +13:00. The International Date Line is theoretically exactly opposite Greenwich on the other side of the world, but is offset in spots so as to keep some countries in one time zone.
Just as sometimes you will need to record dates, and other times you will need to record times, it’s certain that there will also be times when you need to store both times and dates. That is what the TIMESTAMP WITHOUT TIME ZONE
data type is for. It is a combination of the DATE
type and the TIME WITHOUT TIME ZONE
type. The one difference between this data type and the TIME WITHOUT TIME ZONE
type is that the default value for fractions of a second is six digits rather than zero. You can, of course, specify zero fractional digits, if that is what you want. Suppose you specified a database table column as follows:
Smallstep TIMESTAMP WITHOUT TIME ZONE (0),
A valid value for Smallstep would be 1969-07-21 02:56:31. That was the date and time in Greenwich when Neil Armstrong’s foot first touched the lunar soil. It consists of ten date characters, a blank space separator, and eight time characters.
If you have to record the time zone that a date and time refers to, use the TIMESTAMP WITH TIME ZONE
data type. It’s the same as the TIMESTAMP WITHOUT TIME ZONE
data type, with the addition of an offset that shows the time’s relationship to Coordinated Universal Time. Here’s an example:
Smallstep TIMESTAMP WITH TIME ZONE (0),
In this case, Smallstep might be recorded as 1969-07-20 21:56:31-05:00. That is the date and time in Houston when Neil Armstrong’s foot first touched the lunar soil. Houston time is normally six hours behind Greenwich time, but in July, it is only five hours behind due to Daylight Saving Time.
An interval is the difference between two dates, two times, or two datetimes. There are two different kinds of intervals, the year-month interval and the day-hour-minute-second interval. A day always has 24 hours. An hour always has 60 minutes. A minute always has 60 seconds. However, a month may have 28, 29, 30, or 31 days. Because of that variability, you cannot mix the two kinds of intervals. A field of the INTERVAL
type can hold either a year-month interval or a day-time interval, but not a mixed interval such as 2 years, 7 months, 13 days, 5 hours, 6 minutes, and 45 seconds.
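In a table definition, you declare which kind of interval a column holds. Here is a sketch with invented names:

```sql
CREATE TABLE PROJECT (
    ProjectName VARCHAR (30),
    Duration    INTERVAL YEAR TO MONTH,   -- e.g., 2 years 7 months
    TimeWorked  INTERVAL DAY TO SECOND    -- e.g., 13 days, 5 hours,
                                          -- 6 minutes, 45 seconds
) ;
```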
The SQL/XML:2003 update to the ISO/IEC SQL standard introduced the XML data type. Values in the XML type are XML values, meaning you can now manage and query XML data in an SQL database.
With SQL/XML:2006, folks moved to the XQuery Data Model, which means that any XML value is also an XQuery sequence. The details of the XQuery Data Model are beyond the scope of this book. Refer to Querying XML, by Jim Melton and Stephen Buxton (published by Morgan Kaufmann), for detailed coverage of this topic.
With the introduction of SQL/XML:2006, three specific subtypes of the XML type were defined. They are XML(SEQUENCE)
, XML(CONTENT)
, and XML(DOCUMENT)
. The three subtypes are related to each other hierarchically. An XML(SEQUENCE)
is any sequence of XML nodes, XML values, or both. An XML(CONTENT)
is an XML(SEQUENCE)
that is an XML fragment wrapped in a document node. An XML(DOCUMENT)
is an XML(CONTENT)
that is a well-formed XML document.
Every XML value is at least an XML(SEQUENCE)
. An XML(SEQUENCE)
that is a document node is an XML(CONTENT)
. An XML(CONTENT)
that has legal document children is an XML(DOCUMENT)
.
XML types may be associated with an XML schema. There are three possibilities:
UNTYPED: There is no associated XML schema.
XMLSCHEMA: There is an associated XML schema.
ANY: There may or may not be an associated XML schema.
So a document of type XML(DOCUMENT(ANY))
may or may not have an associated XML schema. If you specify a column as being of type XML
with no modifiers, it must be either XML(SEQUENCE)
, XML(CONTENT(ANY))
, or XML(CONTENT(UNTYPED))
. Which of those it is depends on the implementation.
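Assuming your DBMS supports the SQL/XML type modifiers, a column declaration could pin down the subtype explicitly. The table here is invented for illustration:

```sql
CREATE TABLE REPORT (
    ReportID INTEGER,
    Body     XML(DOCUMENT(UNTYPED))   -- a well-formed document with
                                      -- no associated XML schema
) ;
```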
The ROW
type, introduced in the 1999 version of the ISO/IEC SQL standard (SQL:1999), represents the first break of SQL away from the relational model, as defined by its creator, Dr. E.F. Codd. With the introduction of this type, SQL databases can no longer be considered pure relational databases. One of the defining characteristics of Codd’s First Normal Form (1NF) is the fact that no field in a table row may be multivalued. Multivalued fields are exactly what the ROW
type introduces. The ROW
type enables you to place a whole row’s worth of data into a single field, effectively nesting a row within a row. To see how this works, create a ROW
type.
Note: The normal forms constrain the structure of database tables as a defense against anomalies, which are inconsistencies in table data or even outright wrong values. 1NF is the least restrictive of the normal forms, and thus the easiest to satisfy. Notwithstanding that, a table that includes a ROW
type fails the test of First Normal Form. According to Dr. Codd, such a table is not a relation, and thus cannot be present in a relational database. I give extensive coverage to normalization and the normal forms in Book 2, Chapter 2.
CREATE ROW TYPE address_type (
Street VARCHAR (25),
City VARCHAR (20),
State CHAR (2),
PostalCode VARCHAR (9)
) ;
This code effectively compresses four attributes into a single type. After you have created a ROW
type — such as address_type
in the preceding example — you can then use it in a table definition.
CREATE TABLE VENDOR (
VendorID INTEGER PRIMARY KEY,
VendorName VARCHAR (25),
Address address_type,
Phone VARCHAR (15)
) ;
If several tables, such as those for vendors, employees, customers, stockholders, or prospects, all need address information, you declare a single attribute of type address_type in each one rather than four separate columns. That may not seem like much of a savings, but you're not limited to putting just four attributes into a ROW type. What if you had to type the same forty attributes into a hundred tables?
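Once a ROW-typed column is in place, many implementations let you reach inside it with dot notation. Here is a sketch of what such a query might look like; the exact syntax for accessing a field of a ROW value varies by DBMS:

```sql
/* Retrieve each vendor's name along with the City field
   nested inside the Address ROW value */
SELECT VendorName, Address.City
FROM VENDOR
WHERE Address.State = 'OR' ;
```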
The introduction of ROW
types in SQL:1999 was not the only break from the ironclad rules of relational database theory. In that same version of the standard, the ARRAY
type was introduced, and in SQL:2003, the MULTISET
type was added. Both of these collection types violate the ol’ First Normal Form (1NF) and thus take SQL databases a couple of steps further away from relational purity.
The ARRAY
type violates 1NF, but not in the same way that the ROW
type does. The ARRAY
type enables you to enhance a field of an existing type by putting more than one entry into it. This creates a repeating group, which was demonized in Codd’s original formulation of the relational model, but now reappears as a desirable feature. Arrays are ordered in the sense that each element in the array corresponds to exactly one ordinal position in the array.
You might ask how a repeating group of the ARRAY
type differs from the ROW
type’s ability to put “a whole row’s worth of data into a single field.” The distinction is subtle. The ROW
type enables you to compress multiple different attributes into a single field, such as a street, city, state, and postal code. The repeating group of the ARRAY
type enables you to put multiple instances of the same attribute into a single field, such as a phone number and three alternate phone numbers.
As an example, suppose you want to have alternate ways of contacting your vendors in case the main telephone number does not work for you. Perhaps you would like the option of storing as many as four telephone numbers, just to be safe. A slight modification to the code shown previously will do the trick.
CREATE TABLE VENDOR (
VendorID INTEGER PRIMARY KEY,
VendorName VARCHAR (25),
Address address_type,
Phone VARCHAR (15) ARRAY [4]
) ;
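Because an array is ordered, each element can be addressed by its ordinal position. The following sketch retrieves a vendor's first phone number (the SQL standard uses one-based indexing, but element-access syntax and support vary by implementation; the VendorID value 101 is made up for illustration):

```sql
/* Pull the first of up to four phone numbers for one vendor */
SELECT VendorName, Phone[1]
FROM VENDOR
WHERE VendorID = 101 ;
```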
When he created the relational model, Dr. Codd made a conscious decision to sacrifice some functional flexibility in exchange for enhanced data integrity. The addition of the ARRAY
type, along with the ROW
type and later the MULTISET
type, takes back some of that flexibility in exchange for added complexity. That added complexity could lead to data integrity problems if it is not handled correctly. The more complex a system is, the more things that can go wrong, and the more opportunities there are for people to make mistakes.
Whereas an array is an ordered collection of elements, a multiset is an unordered collection. You cannot reference individual elements in a multiset because you don’t know where they are located in the collection. If you want to have multiples of an attribute, such as phone numbers, but don’t care what order they are listed in, you can use a multiset rather than an array.
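To store those same phone numbers when their order doesn't matter, you could declare the column as a multiset rather than an array. This is a sketch following the SQL:2003 syntax; vendor support for MULTISET is limited:

```sql
CREATE TABLE VENDOR (
    VendorID   INTEGER PRIMARY KEY,
    VendorName VARCHAR (25),
    Address    address_type,
    /* An unordered collection of phone numbers; individual
       elements cannot be addressed by position */
    Phone      VARCHAR (15) MULTISET
) ;
```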
REF
types are different from distinct data types such as INTEGER
or CHAR
. They are used in obscure circumstances by highly skilled SQL wizards, and just about nobody else. Instead of holding values, a REF
type references a user-defined structured type associated with a typed table. Typed tables are beyond the scope of this book, but I mention REF
type here for the sake of completeness.
REF
types are not a part of core SQL. This means that database vendors can claim compliance with the SQL standard without implementing REF
types.
The REF
type is an aspect of the object-oriented nature of SQL since the SQL:1999 standard. If object-oriented programming seems obscure to you, as it does to many programmers of a more traditional bent, you can probably survive quite well without ever needing the REF
type.
User-defined types (UDTs) are another addition to SQL imported from the world of object-oriented programming. If the data types that I have enumerated here are not enough for you, you can define your own data types. To do so, use the principles of abstract data types (ADTs) that are major features of such object-oriented languages as C++.
The object-oriented nature of UDTs becomes evident when you see that a UDT has attributes and methods encapsulated within it. The attribute definitions and the results of the methods are visible to the outside world, but the ways the methods are actually implemented are hidden from view. In this object-oriented world, you can declare attributes and methods to be public, private, or protected. A public attribute or method is available to anyone who uses the UDT. A private attribute or method may be used only by the UDT itself. A protected attribute or method may be used only by the UDT itself and its subtypes. (If this sounds familiar to you, don’t be surprised — an SQL UDT is much like a class in object-oriented programming.)
There are two kinds of UDTs: distinct types and structured types. The next sections take a look at each one in turn.
A distinct type is very similar to a regular predefined SQL type. In fact, a distinct type is derived directly from a predefined type, called the source type. You can create multiple distinct types from a single source type, each one distinct from all the others and from the source type. Here’s how to create a distinct type from a predefined type:
CREATE DISTINCT TYPE USdollar AS DECIMAL (10,2) ;
This definition (USdollar
) creates a new data type for (wait for it) U.S. dollars, based on the predefined DECIMAL
type. You can create additional distinct types in the same way:
CREATE DISTINCT TYPE Euro AS DECIMAL (10,2) ;
Now you can create tables that use the new types:
CREATE TABLE USinvoice (
InvoiceNo INTEGER PRIMARY KEY,
CustomerID INTEGER,
SalesID INTEGER,
SaleTotal USdollar,
Tax USdollar,
Shipping USdollar,
GrandTotal USdollar
) ;
CREATE TABLE Europeaninvoice (
InvoiceNo INTEGER PRIMARY KEY,
CustomerID INTEGER,
SalesID INTEGER,
SaleTotal Euro,
Tax Euro,
Shipping Euro,
GrandTotal Euro
) ;
The USdollar
type and the Euro
type are both based on the DECIMAL
type, but you cannot directly compare a USdollar
value to a Euro
value, nor can you directly compare either of those to a DECIMAL
value. This is consistent with reality because one U.S. dollar is not equal to one euro. However, it is possible to exchange dollars for euros and vice versa when traveling. You can make that exchange with SQL too, but not directly. You must use a CAST
operation, which I describe in Book 3, Chapter 1.
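For example, converting a dollar amount into euros takes two casts with the arithmetic in between: cast away the distinct type, apply the exchange rate, and cast the result into the target type. In this sketch, the 0.93 exchange rate and the InvoiceNo value 1001 are made-up placeholders:

```sql
/* Convert a USdollar value to a Euro value by way of DECIMAL.
   The 0.93 rate is a placeholder, not a real quote. */
SELECT CAST (CAST (SaleTotal AS DECIMAL (10,2)) * 0.93 AS Euro)
FROM USinvoice
WHERE InvoiceNo = 1001 ;
```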
Structured types are not based on a single source type as are the distinct types. Instead, they are expressed as a list of attributes and methods. When you create a structured UDT, the DBMS automatically creates a constructor function, a mutator function, and an observer function. The constructor for a UDT is given the same name as the UDT. Its job is to initialize the UDT’s attributes to their default values. When you invoke a mutator function, it changes the value of an attribute of a structured type. You can then use an observer function to retrieve the value of an attribute of a structured type. If you include an observer function in a SELECT
statement, it will retrieve values from the database.
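For instance, given the DATABASEBOOKS table that appears later in this section, an observer function could pull a single attribute out of the UDT column. This is a sketch; the method-invocation syntax varies by implementation:

```sql
/* Use the Title observer function to read one attribute
   of the structured UDT stored in StockItem */
SELECT StockItem.Title()
FROM DATABASEBOOKS
WHERE StockNumber = 271828 ;
```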
A hierarchical relationship can exist between two structured types. One structured type can be a “child” or subtype of a “parent” or supertype. Consider an example involving books. Suppose you have a UDT named BookUDT
, which has a subtype named NovelUDT
and another subtype named TechBookUDT
. BookUDT
is a supertype of both subtypes. Suppose further that TechBookUDT
has a subtype named DatabaseBookUDT
. DatabaseBookUDT
is not only a subtype of TechBookUDT
, but also a subtype of BookUDT
. Because DatabaseBookUDT
is a direct child of TechBookUDT,
it is considered a proper subtype of TechBookUDT
. Since DatabaseBookUDT
is not a direct child of BookUDT
, but rather a grandchild, it is not considered a proper subtype of BookUDT
.
A structured type that has no supertype is considered a maximal supertype, and a structured type that has no subtypes is considered a leaf subtype.
Here’s how you can create structured UDTs:
/* Create a UDT named BookUDT */
CREATE TYPE BookUDT AS (
/* Specify attributes */
Title CHAR (40),
Author CHAR (40),
MyCost DECIMAL (9,2),
ListPrice DECIMAL (9,2) )
/* Allow for subtypes */
NOT FINAL ;
/* Create a subtype named TechBookUDT */
CREATE TYPE TechBookUDT UNDER BookUDT NOT FINAL ;
/* Create a subtype named DatabaseBookUDT */
CREATE TYPE DatabaseBookUDT UNDER TechBookUDT FINAL ;
Note: In this code, comments are enclosed within /* comment */
pairs. The NOT FINAL
keywords indicate that even though a semicolon is closing out the statement, there is more to come. Subtypes are about to be defined under the supertype. The lowest level subtype closes out with the keyword FINAL
.
Now that the types are defined, you can create tables that use them.
CREATE TABLE DATABASEBOOKS (
StockItem DatabaseBookUDT,
StockNumber INTEGER
) ;
Now that the table exists, you can add data to it.
BEGIN
/* Declare a temporary variable x */
DECLARE x DatabaseBookUDT;
/* Execute the constructor function */
SET x = DatabaseBookUDT() ;
/* Execute the first mutator function */
SET x = x.Title('SQL for Dummies') ;
/* Execute the second mutator function */
SET x = x.Author('Allen G. Taylor') ;
/* Execute the third mutator function */
SET x = x.MyCost(23.56) ;
/* Execute the fourth mutator function */
SET x = x.ListPrice(29.99) ;
INSERT INTO DATABASEBOOKS VALUES (x, 271828) ;
END
Table 6-1 summarizes the SQL data types and gives an example of each.
TABLE 6-1 Data Types
Data Type | Example Value
[Most of the table's entries are unrecoverable. Of the notes that survive: REF is not an ordinary type, but a pointer to a referenced type; USdollar is a currency type based on DECIMAL.]
1 Argument specifies number of fractional digits.
SQL is different from practically any computer language that you may have encountered up to this point in that it allows null values. Other languages don’t. Allowing null values gives SQL a flexibility that other languages lack, but also contributes to the impedance mismatch between SQL and host languages that it must work with in an application. If an SQL database contains null values that the host language does not recognize, you have to come up with a plan that handles that difference in a consistent way.
I’m borrowing the term impedance mismatch from the world of electrical engineering. If, for example, you’ve set up your stereo system using speaker cable with a characteristic impedance of 50 ohms feeding speakers with an impedance of 8 ohms, you’ve got yourself a case of impedance mismatch and you’ll surely get fuzzy, noisy sound — definitely low fidelity. If a data type of a host language does not exactly match the corresponding data type of SQL, you have a similar situation, bad communication across the interface between the two.
A null value is a nonvalue. If you are talking about numeric data, a null value is not the same as zero; zero is a definite value (it is one less than one). If you are talking about character data, a null value is not the same as a blank space; a blank space is also a definite value. If you are talking about Boolean data, a null value is not the same as FALSE
. A false Boolean value is a definite value too.
A null value is the absence of a value. It reminds me of the Buddhist concept of emptiness. I almost feel that if I ever come to understand null values completely, I will have transcended the illusions of this world and achieved a state of enlightenment.
A field may contain a null value for several reasons: the value may exist but be unknown, the value may not be known yet, or the field may simply not apply to that particular row.
In any situation where knowledge is incomplete, null values are possible. Because in most application areas, knowledge is never complete, null values are very likely to appear in most databases.
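Null values also change how you write queries. Because a null is the absence of a value, an ordinary comparison such as Phone = NULL never evaluates to true; SQL provides the IS NULL predicate for this job. A sketch, using the CLIENT table defined in the constraints discussion that follows:

```sql
/* Wrong: this comparison is never true, because NULL is not
   equal to anything, not even another NULL */
SELECT ClientName FROM CLIENT WHERE Phone = NULL ;

/* Right: IS NULL tests for the absence of a value */
SELECT ClientName FROM CLIENT WHERE Phone IS NULL ;
```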
Constraints are one of the primary mechanisms for keeping the contents of a database from turning into a misleading or confusing mess. By applying constraints to tables, columns, or entire databases, you prevent the addition of invalid data or the deletion of data that is required to maintain overall consistency. A constraint can also identify invalid data that already exists in a database. If an operation that you perform in a transaction causes a constraint to be violated, the DBMS will prevent the transaction from taking effect (being committed). This protects the database from being put into an inconsistent state.
You can constrain the contents of a table column. In some cases, that means constraining what the column must contain, and in other cases, what it may not contain. There are three kinds of column constraints: the NOT NULL
, UNIQUE
, and CHECK
constraints.
Although SQL allows a column to contain null values, there are times when you want to be sure that a column always has a distinct value. In order for one row in a table to be distinguished from another, there must be some way of telling them apart. This is usually done with a primary key, which must have a unique value in every row. Because a null value in a column could be anything, it might match the value for that column in any of the other rows. Thus it makes sense to disallow a null value in the column that is used to distinguish one row from the rest. You can do this with a NOT NULL
constraint, as shown in the following example:
CREATE TABLE CLIENT (
ClientName CHAR (30) NOT NULL,
Address1 CHAR (30),
Address2 CHAR (30),
City CHAR (25),
State CHAR (2),
PostalCode CHAR (10),
Phone CHAR (13),
Fax CHAR (13),
ContactPerson CHAR (30)
) ;
When entering a new client into the CLIENT table, you must make an entry in the ClientName column.
The NOT NULL
constraint is a fairly weak constraint. You can satisfy the constraint as long as you put anything at all into the field, even if what you put into it would allow inconsistencies into your table. For example, suppose you already had a client named David Taylor in your database, and someone tried to enter another record with the same client name. If the table was protected only by a NOT NULL
constraint, the entry of the second David Taylor would be allowed. Now when you go to retrieve David Taylor’s information, which one will you get? How will you tell whether you have the one you want? A way around this problem is to use the stronger UNIQUE
constraint. The UNIQUE
constraint will not only disallow the entry of a null value in a column, but it will also disallow the entry of a value that matches a value already in the column.
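Adding the UNIQUE constraint to the CLIENT table is a small change. An abbreviated sketch:

```sql
CREATE TABLE CLIENT (
    /* UNIQUE disallows duplicate names; combined with NOT NULL,
       every row must carry its own distinct ClientName */
    ClientName CHAR (30) NOT NULL UNIQUE,
    Address1   CHAR (30),
    City       CHAR (25),
    State      CHAR (2),
    Phone      CHAR (13)
) ;
```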
Use the CHECK
constraint for preventing the entry of invalid data that goes beyond maintaining uniqueness. For example, you can check to make sure that a numeric value falls within an allowed range. You can also check to see that a particular character string is not entered into a column.
Here’s an example that ensures that the charge for a service falls within the acceptable range. It ensures that a customer is not mistakenly given a credit rather than a debit, and that she is not charged a ridiculously high amount either.
CREATE TABLE TESTS (
TestName CHARACTER (30) NOT NULL,
StandardCharge NUMERIC (6,2)
CHECK (StandardCharge >= 0.00
AND StandardCharge <= 200.00)
) ;
The constraint is satisfied only if the charge is positive and less than or equal to $200.
Sometimes a constraint applies not just to a column, but to an entire table. The PRIMARY KEY
constraint is the principal example of a table constraint; it applies to an entire table.
Although a primary key may consist of a single column, it could also be made up of a combination of two or more columns. Because a primary key must be guaranteed to be unique, multiple columns may be needed if one column is not enough to guarantee uniqueness.
To see what I mean, check out the following, which shows a table with a single-column primary key:
CREATE TABLE PROSPECT (
ProspectName CHAR (30) PRIMARY KEY,
Address1 CHAR (30),
Address2 CHAR (30),
City CHAR (25),
State CHAR (2),
PostalCode CHAR (10),
Phone CHAR (13),
Fax CHAR (13)
) ;
The primary key constraint in this case is listed with the ProspectName column, but it is nonetheless a table constraint because it guarantees that the table contains no duplicate rows. By applying the primary key constraint to ProspectName, you are guaranteeing that ProspectName cannot have a null value, and no entry in the ProspectName column may duplicate another entry in the ProspectName column. Because ProspectName is guaranteed to be unique, every row in the table must be distinguishable from every other row.
ProspectName may not be a particularly good choice for a proposed primary key. Some people have rather common names, such as Joe Wilson or Jane Adams. It is quite possible that two people with the same name might both be prospects of your business. You could overcome that problem by using more than one column for the primary key. Here's one way to do that:
CREATE TABLE PROSPECT (
ProspectName CHAR (30) NOT NULL,
Address1 CHAR (30) NOT NULL,
Address2 CHAR (30),
City CHAR (25),
State CHAR (2),
PostalCode CHAR (10),
Phone CHAR (13),
CONSTRAINT prospect_pk PRIMARY KEY
(ProspectName, Address1)
) ;
A composite primary key is made up of both ProspectName and Address1.
You might ask, “What if a father and son have the same name and live at the same address?” The more such scenarios you think up, the more complex things tend to get. In many cases, it’s best to make up a unique ID number for every row in a table and let that be the primary key. If you use an autoincrementer to generate the keys, you can be sure they are unique, which keeps things relatively simple. You can also generate your own unique ID numbers by keeping a counter, using its current value as the primary key of each new record, and incrementing it by one after every addition.
CREATE TABLE PROSPECT (
ProspectID INTEGER PRIMARY KEY,
ProspectName CHAR (30),
Address1 CHAR (30),
Address2 CHAR (30),
City CHAR (25),
State CHAR (2),
PostalCode CHAR (10),
Phone CHAR (13)
) ;
Many database management systems automatically create autoincrementing primary keys for you as you enter new rows into a table.
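The syntax for such autoincrementing keys is vendor-specific. Two common forms, sketched here (check your DBMS's documentation for the exact spelling):

```sql
/* SQL standard identity column, supported by PostgreSQL,
   DB2, and Oracle 12c and later */
CREATE TABLE PROSPECT (
    ProspectID   INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    ProspectName CHAR (30)
) ;

/* MySQL's AUTO_INCREMENT attribute */
CREATE TABLE PROSPECT (
    ProspectID   INTEGER AUTO_INCREMENT PRIMARY KEY,
    ProspectName CHAR (30)
) ;
```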
Relational databases are categorized as they are because the data is stored in tables that are related to each other in some way. The relationship occurs because a row in one table may be directly related to one or more rows in another table.
For example, in a retail database, the record in the CUSTOMER table for customer Lisa Mazzone is directly related to the records in the INVOICE table for purchases that Ms. Mazzone has made. To establish this relationship, one or more columns in the CUSTOMER table must have corresponding columns in the INVOICE table.
The primary key of the CUSTOMER table uniquely identifies each customer. The primary key of the INVOICE table uniquely identifies each invoice. In addition, the primary key of the CUSTOMER table acts as a foreign key in INVOICE to link the two tables. In this setup, the foreign key in each row of the INVOICE table identifies the customer who made this particular purchase. Here’s an example:
CREATE TABLE CUSTOMER (
CustomerID INTEGER PRIMARY KEY,
CustomerName CHAR (30),
Address1 CHAR (30),
Address2 CHAR (30),
City CHAR (25),
State CHAR (2),
PostalCode CHAR (10),
Phone CHAR (13)
) ;
CREATE TABLE SALESPERSON (
SalespersonID INTEGER PRIMARY KEY,
SalespersonName CHAR (30),
Address1 CHAR (30),
Address2 CHAR (30),
City CHAR (25),
State CHAR (2),
PostalCode CHAR (10),
Phone CHAR (13)
) ;
CREATE TABLE INVOICE (
InvoiceNo INTEGER PRIMARY KEY,
CustomerID INTEGER,
SalespersonID INTEGER,
CONSTRAINT customer_fk FOREIGN KEY (CustomerID)
REFERENCES CUSTOMER (CustomerID),
CONSTRAINT salesperson_fk FOREIGN KEY (SalespersonID)
REFERENCES SALESPERSON (SalespersonID)
) ;
Each invoice is related to the customer who made the purchase and the salesperson who made the sale.
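With the foreign keys in place, a query can traverse the relationships; joining on the shared key columns reassembles each invoice with its customer and salesperson:

```sql
/* Follow both foreign keys from INVOICE back to the rows
   they reference */
SELECT I.InvoiceNo, C.CustomerName, S.SalespersonName
FROM INVOICE I
JOIN CUSTOMER C ON I.CustomerID = C.CustomerID
JOIN SALESPERSON S ON I.SalespersonID = S.SalespersonID ;
```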
Using constraints in this way is what makes relational databases relational. This is the core of the whole thing right here! How do the tables in a relational database relate to each other? They relate by the keys they hold in common. The relationship is established, but also constrained, by the fact that a column in one table has to match a corresponding column in another table. The only relationships present in a relational database are those where there is a key-to-key link mediated by a foreign key constraint.
Sometimes a constraint may apply not just to a column or a table, but to multiple tables or even an entire database. A constraint with such broad applicability is called an assertion.
Suppose a small bookstore wants to control its exposure to dead inventory by not allowing total inventory to grow beyond 20,000 items. Suppose further that stocks of books and DVDs are maintained in different tables — the BOOKS and DVD tables. An assertion can guarantee that the maximum is not exceeded.
CREATE TABLE BOOKS (
ISBN INTEGER,
Title CHAR (50),
Quantity INTEGER ) ;
CREATE TABLE DVD (
BarCode INTEGER,
Title CHAR (50),
Quantity INTEGER ) ;
CREATE ASSERTION inventory_limit  /* the standard requires a name */
CHECK ((SELECT SUM (Quantity)
FROM BOOKS)
+ (SELECT SUM (Quantity)
FROM DVD)
< 20000) ;
This assertion adds up all the books in stock, then adds up all the DVDs in stock, and finally adds those two sums together. It then checks to see that the sum of them all is less than 20,000. Whenever an attempt is made to add a book or DVD to inventory, and that addition would push total inventory to 20,000 or more, the assertion is violated and the addition is not allowed.
Most popular implementations do not support assertions. For example, SQL Server 2016, DB2, Oracle Database 18c, SAP SQL Anywhere, MySQL, and PostgreSQL do not. Assertions may become available in the future, since they are a part of SQL:2003, but it would not be wise to hold your breath until this functionality appears. Although assertions would be a nice feature to have, they are far down on the list of features to add for most DBMS vendors.
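Until assertions are widely supported, you can run the equivalent check yourself (or wrap it in a trigger) before committing new stock. This query performs the same test the assertion would:

```sql
/* Total items in stock across both tables; compare the
   result against the 20,000-item limit by hand */
SELECT (SELECT SUM (Quantity) FROM BOOKS)
     + (SELECT SUM (Quantity) FROM DVD)
  AS TotalInventory ;
```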