MySQL Column Types

Each table in a database is made up of one or more columns. When you create a table using a CREATE TABLE statement, you specify a type for each column. A column type is more specific than a data type, which is just a general category such as "number" or "string." A column type precisely characterizes the kind of values a given table column may contain, such as SMALLINT or VARCHAR(32).

MySQL's column types are the means by which you describe what kinds of values a table's columns contain, which in turn determines how MySQL treats those values. For example, if you have numeric values, you can store them using a numeric or a string column type, but MySQL will treat the values somewhat differently depending on how you store them. Each column type has several characteristics:

  • What kind of values you can store in it

  • How much space values take up, and whether the values are fixed-length (all values of the type taking the same amount of space) or variable-length (the amount of space depending on the particular value being stored)

  • How values of the type are compared and sorted

  • Whether or not the type allows NULL values

  • Whether or not the type can be indexed

We'll survey MySQL's column types briefly to get a broad overview, then discuss in more detail the properties that characterize each type.

Overview of Column Types

MySQL provides column types for values from all the general data type categories except the NULL value. NULL spans all types in the sense that whether or not a column may contain NULL values is treated as a type attribute.

MySQL has column types for both integer and floating-point numeric values, as shown in Table 2.2. Integer columns can be signed or unsigned. A special attribute allows integer column values to be generated automatically, which is useful in applications that require unique sequence or identification numbers.

Table 2.2. Numeric Column Types
Type Name Meaning
TINYINTA very small integer
SMALLINTA small integer
MEDIUMINTA medium-sized integer
INTA standard integer
BIGINTA large integer
FLOATA single-precision floating-point number
DOUBLEA double-precision floating-point number
DECIMALA floating-point number, represented as a string

MySQL string column types are shown in Table 2.3. Strings can hold anything, even arbitrary binary data such as images or sounds. Strings can be compared according to whether or not they are case sensitive. In addition, you can perform pattern matching on strings. (Actually, in MySQL you can perform pattern matching on any column type, but it's most often done with string types.)

Table 2.3. String Column Types
Type NameMeaning
CHARA fixed-length character string
VARCHARA variable-length character string
TINYBLOBA very small BLOB (binary large object)
BLOBA small BLOB
MEDIUMBLOBA medium-sized BLOB
LONGBLOBA large BLOB
TINYTEXTA very small text string
TEXTA small text string
MEDIUMTEXTA medium-sized text string
LONGTEXTA large text string
ENUMAn enumeration; columns may be assigned one enumeration member
SETA set; columns may be assigned multiple set members

MySQL date and types are shown in Table 2.4. For temporal values, MySQL provides types for dates (either with or without a time), times, and timestamps (a special type that allows you to track when changes were last made to a record). There is also a type for efficiently representing year values when you don't need an entire date.

Table 2.4. Date and Time Column Types
Type NameMeaning
DATEA date value, in "YYYY-MM-DD" format
TIMEA time value, in "hh:mm:ss" format
DATETIMEA date and time value, in "YYYY-MM-DD hh:mm:ss" format
TIMESTAMPA timestamp value, in YYYYMMDDhhmmss format
YEARA year value, in YYYY format

To create a table, you issue a CREATE TABLE statement and specify a list of the columns that make up the table. Each column has a name and a type, and various attributes may be associated with each type. Here's an example that creates a table my_table with three columns named f, c,and i:

CREATE TABLE my_table
(
    f FLOAT(10,4),
    c CHAR(15) NOT NULL DEFAULT "none",
    i TINYINT UNSIGNED NULL
)

The syntax for declaring a column is as follows:

							col_name
							col_type [col_attributes] [general_attributes]

The name of the column is given by col_name. Column names may be up to 64 characters long, and may consist of alphanumeric characters, as well as the underscore and dollar sign characters ('_' and '$'). A column name may begin with any character that is legal in a name, including a digit. A name may not consist entirely of digits, however, because that would make it indistinguishable from a number. Words such as SELECT, DELETE, and CREATE are reserved and cannot be used as column names. However, function names (words such as POS and MIN) are not reserved and may be used.

The column type col_type indicates the specific kind of values the column can hold. The type specifier may also indicate the maximum length of the values you store in the column. For some types, you specify the length explicitly as a number. For others, the length is implied by the type name. For example, CHAR(10) specifies an explicit length of 10 characters, whereas TINYBLOB values have an implicit maximum length of 255 characters. Some of the type specifiers allow you to indicate a maximum display width (how many characters to use for displaying values). Floating-point types allow the number of decimal places to be specified, so you can control how precise values are.

Following the column type, you may specify optional type-specific attributes as well as more general attributes. The attributes function as type modifiers. They cause MySQL to change the way it treats column values in some way:

  • The type-specific attributes that are allowable depend on the column type you choose. For example, UNSIGNED is allowable only for integer types, and BINARY is allowable only for CHAR and VARCHAR.

  • The general attributes may be given for any column type, with a few exceptions. You may specify NULL or NOT NULL to indicate whether or not a column can hold NULL values. You may also specify DEFAULT def_value to indicate that a column should be assigned the value def_value when a new row is created without explicitly specifying the column's value. The value of def_value must be a constant; it cannot be an expression or refer to other columns. You cannot specify a default value for BLOB or TEXT columns.

If multiple column-specific attributes are given, they may be specified in any order as long as they follow the column type and precede any general attributes. Similarly, if multiple general attributes are given, they may be specified in any order as long as they follow the column type and any column-specific attributes that may be present.

The rest of this section discusses each of MySQL's column types to show the syntax for declaring the type and the properties that characterize it, such as its range and storage requirements. The type specifications are shown as you use them in CREATE TABLE statements. Optional information is indicated by square brackets ([]). For example, the syntax MEDIUMINT[(M] indicates that the maximum display width, specified as (M), is optional. On the other hand, for CHAR(M), the lack of brackets indicates that (M) is required.

Numeric Column Types

MySQL's numeric column types fall into two general classifications:

  • Integer types. For numbers that have no fractional part, such as 1, 43, -3, 0, or-798432. You can use integer columns for data represented by whole numbers, such as weight to the nearest pound, height to the nearest inch, number of stars in a galaxy, number of people in a household, or number of bacteria in a petri dish.

  • Floating-point types. For numbers that may have a fractional part, such as 3.14159, -.00273, -4.78, or 39.3E+4. You can use floating-point column types for values that may have a fractional part or that are extremely large or small. Some types of data you might represent as floating-point values are average crop yield; distances; money values, such as item cost or salary; unemployment rates; or stock prices. Integer values may be assigned to floating-point columns. They are treated as floating-point values with a fractional part of zero.

The name and range of each numeric type are shown in Table 2.5. The amount of storage required for values of each type is shown in Table 2.6.

The CREATE TABLE Statement

The examples used throughout this chapter use CREATE TABLE extensively. The statement should be reasonably familiar to you since we used it in the tutorial section of Chapter 1, "Introduction to MySQL and SQL." See also the entry for CREATE TABLE in Appendix D, "SQL Syntax Reference."


Table 2.5. Numeric Column Type Ranges
Type SpecificationRange
TINYINT[(M)]Signed values: -128 to 127 (-27 to 27–1) Unsigned values: 0 to 255 (0 to 28–1)
SMALLINT[(M)]Signed values: -32768 to 32767 (-215 to 215–1) Unsigned values: 0 to 65535 (0 to 216–1)
MEDIUMINT[(M)]Signed values: -8388608 to 8388607 (-223 to 223–1) Unsigned values: 0 to 16777215 (0 to 224–1)
INT[(M)]Signed values: -2147683648 to 2147483647 (-231 to 231–1) Unsigned values: 0 to 4294967295 (0 to 232–1)
BIGINT[(M)]Signed values: -9223372036854775808 to 9223372036854775807 (-263 to 263–1) Unsigned values: 0 to 18446744073709551615 (0 to 264–1)
FLOAT[(M,D)],FLOAT(4)Minimum non-zero values: ±1.175494351E–38Maximum non-zerovalues: ±3.402823466E+38
DOUBLE[(M,D)],FLOAT(8)Minimum non-zero values:±2.2250738585072014E–308Maximum non-zero values: ±1.7976931348623157E+308
DECIMAL(M,D)Varies; range depends on M and D

Table 2.6. Numeric Column Type Storage Requirements
Type SpecificationStorage Required
TINYINT[(M)]1 byte
SMALLINT[(M)]2 bytes
MEDIUMINT[(M)]3 bytes
INT[(M)]4 bytes
BIGINT[(M)]8 bytes
FLOAT[(M,D)], FLOAT(4)4 bytes
DOUBLE[(M,D)], FLOAT(8)8 bytes
DECIMAL(M,D)M bytes (MySQL < 3.23), M+2 bytes (MySQL ≥ 3.23)

MySQL provides five integer types: TINYINT, SMALLINT, MEDIUMINT, INT, and BIGINT. INTEGER is a synonym for INT. These types vary in the range of values they can represent. Integer columns can be declared as UNSIGNED to disallow negative values; this shifts the range for the column upward to begin at 0. The types also vary in the amount of storage required. Types with a larger range require more storage.

MySQL provides three floating-point types: FLOAT, DOUBLE, and DECIMAL. Unlike integer types, floating-point types cannot be UNSIGNED, and their range is different from integer types in the sense that there is not only a maximum value the type can represent, but there is also a minimum non-zero value. The minimum values provide a measure of how precise the type is, which is often important for recording scientific data. (There are, of course, corresponding negative maximum and minimum values.)

DOUBLE PRECISION[(M,D)] and REAL[(M,D)] are synonyms for DOUBLE[(M,D)]. NUMERIC(M,D) is a synonym for DECIMAL(M,D)..FLOAT(4) and FLOAT(8) are provided for ODBC compatibility.Prior to MySQL 3.23, they are synonyms for FLOAT(10,2) and DOUBLE(16,4). From MySQL 3.23 on, FLOAT(4) and FLOAT(8) have their own behavior, described shortly below.

When you choose a numeric type, consider the range of values you need to represent and choose the smallest type that will cover the range. Choosing a larger type wastes space, leading to tables that are unnecessarily large and that cannot be processed as efficiently as if you had chosen a smaller type. For integer values, TINYINT is the best if the range of values in your data is small, such as a person's age or number of siblings. MEDIUMINT can represent millions of values and can be used for many more types of values, at some additional cost in storage space. BIGINT has the largest range of all but requires twice as much storage as the next smallest integer type (INT) and should be used only when really necessary. For floating-point values, DOUBLE takes twice as much space as FLOAT. Unless you need exceptionally high precision or an extremely large range of values, you can probably represent your data at half the storage cost by using FLOAT.

When you declare an integer column, you can specify an optional display size M. If given, M should be an integer from 1 to 255. It represents the number of characters used to display values for the column. For example, MEDIUMINT(4) specifies a MEDIUMINT column with a display width of 4. If you declare an integer column without an explicit width, a default width is assigned. The defaults are the lengths of the "longest" values for each type. If the printable representation of a particular value requires more than M characters, the full value is displayed; values are not chopped to fit within M characters.

For each floating-point type, you may specify a maximum display size M and the number of decimal places D. The value of M should be from 1 to 255. The value of D may be from 0 to 30, but should be no more than M–2. (If you're more familiar with ODBC terms, M and D correspond to the ODBC concepts of "precision" and "scale.") M and D are optional for FLOAT and DOUBLE, but required for DECIMAL.

Where M and D are optional, default values are used if they are omitted. The following statement creates a table to illustrate the default values of M and D for numeric column types (DECIMAL is not included because M and D are not optional for that type):

CREATE TABLE my_table
(
    itiny TINYINT, itiny_u TINYINT UNSIGNED,
    ismall SMALLINT, ismall_u SMALLINT UNSIGNED,
    imedium MEDIUMINT, imedium_u MEDIUMINT UNSIGNED,
    ireg INT, ireg_u INT UNSIGNED,
    ibig BIGINT, ibig_u BIGINT UNSIGNED,
    fp_single FLOAT, fp_double DOUBLE
)

If you issue a DESCRIBE my_table statement after creating the table, the Field and Type columns of the output look like this:[1]

[1] The display width for BIGINT will be 21 (not 20) if you run this query using a version of MySQL older than 3.23, due to a minor glitch.

+-----------+-----------------------+
| Field     | Type                  |
+-----------+-----------------------+
| itiny     | tinyint(4)            |
| itiny_u   | tinyint(3) unsigned   |
| ismall    | smallint(6)           |
| ismall_u  | smallint(5) unsigned  |
| imedium   | mediumint(9)          |
| imedium_u | mediumint(8) unsigned |
| ireg      | int(11)               |
| ireg_u    | int(10) unsigned      |
| ibig      | bigint(20)            |
| ibig_u    | bigint(20) unsigned   |
| fp_single | float(10,2)           |
| fp_double | double(16,4)          |
+-----------+-----------------------+

Every numeric column has a range of values determined by the column's type. If you attempt to insert a value that lies outside the column's range, truncation occurs: MySQL clips the value to the appropriate endpoint of the range and uses the result. No truncation occurs when values are retrieved.

Value truncation occurs according to the range of the column type, not the display width. For example, a SMALLINT(3) column has a display width of 3 and a range from -32768 to 32767. The value 12345 is wider than the display width but within the range of the column, so it is inserted without clipping and retrieved as 12345. The value 99999 is outside the range, so it is clipped to 32767 when inserted. Subsequent retrievals retrieve the value 32767.

In general, values assigned to a floating-point column are rounded to the number of decimals indicated by the column specification. If you store 1.23456 in a FLOAT(8,1) column, the result is 1.2. If you store the same value in a FLOAT(8,4) column, the result is 1.2346. This means you should declare floating-point columns with a sufficient number of decimals to give you values as precise as you require. If you need accuracy to thousandths, don't declare a type with only two decimal places.

The display width for BIGINT will be 21 (not 20) if you run this query using a version of MySQL older than 3.23, due to a minor glitch.

The exception to this handlingof floating-point values is that in MySQL 3.23, the behavior of FLOAT(4) and FLOAT(8) changed. These two types are nowsingle-precision (4-byte) and double-precision (8-byte) types that are true floating-point types in the sense that values are stored as given, within the limits imposed by your hardware.

The DECIMAL type differs from FLOAT and DECIMAL in that DECIMAL values are actually stored as strings. The maximum possible range for DECIMAL is the same as for DOUBLE, but the effective range is determined by the values of M and D. If you vary M and hold D fixed, the range becomes larger as M becomes larger. This is illustrated by the first three rows of Table 2.7. If you hold M fixed and vary D, the range becomes smaller as D becomes larger (though the precision increases). This is shown by the last three rows of Table 2.7.

Table 2.7. How M and D Affect the Range of DECIMAL(M,D)
Type SpecificationRange (for MySQL < 3.23)Range (for MySQL ≥ 3.23)
DECIMAL(4,1)-9.9 to 99.9-999.9 to 9999.9
DECIMAL(5,1)-99.9 to 999.9-9999.9 to 99999.9
DECIMAL(6,1)-999.9 to 9999.9-99999.9 to 999999.9
DECIMAL(6,2)-99.99 to 999.99-9999.99 to 99999.99
DECIMAL(6,3)-9.999 to 99.999-999.999 to 9999.999

The range for a given DECIMAL type depends on your version of MySQL. For versions of MySQL prior to 3.23, DECIMAL(M,D) columns are stored using M bytes per value, and the sign character (if needed) and decimal point are included in the M bytes. Thus, for a type DECIMAL(5,2), the range is –9.99 to 99.99 because those cover all the possible 5-character values.

As of MySQL 3.23, DECIMAL values are handled according to the ANSI specification, which states that a type of DECIMAL(M,D) must be able to represent any value with M digits and D decimal places. For example, DECIMAL(5,2) must be able to represent values from –999.99 to 999.99. The sign character and decimal point must still be stored, so DECIMAL values from MySQL 3.23 and up use M+2 bytes. For DECIMAL(5,2), 7 bytes are needed for the "longest" value (–999.99). At the positive end of the range, the sign byte is not needed to hold a sign character, so MySQL uses it to extend the range beyond that required by the ANSI specification. For DECIMAL(5,2), the maximum value can be 9999.99 because 7 bytes are available.

In short, the range for DECIMAL(M,D) in MySQL 3.23 and up is equivalent to the range for (DECIMALM+2,D) in earlier versions.

In all versions of MySQL, if D is 0 for a DECIMAL column, the decimal point is not stored. The effect of this is to extend the range of the column by an extra order of magnitude because the byte normally used to store the decimal point can be used for another digit.

Numeric Column Type Attributes

The ZEROFILL attribute may be specified for all numeric types. It causes displayed values for the column to be padded with leading zeros to the display width. You can use ZEROFILL when you want to make sure column values always display using a given number of digits. Actually, it's more accurate to say "a given minimum number of digits" because values wider than the display width are displayed in full without being chopped. You can see this by issuing the following statements:

CREATE TABLE my_table (my_zerofill INT(5) ZEROFILL)
INSERT INTO my_table VALUES(1),(100),(10000),(1000000)
SELECT my_zerofill FROM my_table

The output from the SELECT statement is as follows. Note that the final value, which is wider than the column's display width, is displayed in full:

+-------------+
| my_zerofill |
+-------------+
|       00001 |
|       00100 |
|       10000 |
|     1000000 |
+-------------+

Two other attributes may be specified for integer column types only:

  • AUTO_INCREMENT. Use the AUTO_INCREMENT attribute when you want to generate unique identifiers or values in a series. AUTO_INCREMENT values normally begin at 1 and increase by 1 per row. When you insert NULL into an AUTO_INCREMENT column, MySQL inserts a value one greater than the current maximum value in that column. You may have at most one AUTO_INCREMENT column in a table.

    For any column that you want to use with AUTO_INCREMENT, the column should be declared NOT NULL, and it should be declared as a PRIMARY KEY or as a UNIQUE key. For example, you can declare such a column in any of the following ways:

    CREATE TABLE ai (i INT AUTO_INCREMENT NOT NULL PRIMARY KEY)
    CREATE TABLE ai (i INT AUTO_INCREMENT NOT NULL, PRIMARY KEY (i))
    CREATE TABLE ai (i INT AUTO_INCREMENT NOT NULL, UNIQUE (i))
    

    The behavior of AUTO_INCREMENT is discussed further in "Working With Sequences."

  • UNSIGNED. This attribute disallows negative values. Making a column UNSIGNED doesn't change the size of the underlying data type's range; it just shifts the range upward. Consider this table specification:

    CREATE TABLE my_table
    (
    itiny TINYINT,
    itiny_u TINYINT UNSIGNED
    )
    

    itiny and itiny_u are both TINYINT columns with a range of 256 values, but the range of itiny is -128 to 127, whereas the range of itiny_u is 0 to 255.

    UNSIGNED is useful for columns into which you plan to store data that don't take on negative values, such as population counts or attendance figures. If you use a regular, signed column for such values, you use only half of the column type's range. By making the column UNSIGNED, you effectively double your range. If you use the column for sequence numbers, it will take twice as long to run out of values if you make it UNSIGNED.

Following the attributes just described, which are specific to numeric columns, you may also specify the general attributes NULL or NOT NULL. If you do not specify NULL or NOT NULL, the default is NULL. You may also specify a default value using the DEFAULT attribute. If you do not specify a default value, one is chosen automatically. For all numeric column types, the default is NULL for columns that may contain NULL, and 0 otherwise.

The following example creates a table with three INT columns, having default values of -1, 1, and NULL:

CREATE TABLE t
(
    i1 INT DEFAULT -1,
    i2 INT DEFAULT 1,
    i3 INT DEFAULT NULL
)

Working With Sequences

Many applications need to use unique numbers for identification purposes. The requirement for unique values occurs in a number of contexts: membership numbers, sample or lot numbering, customer IDs, bug report or trouble ticket tags, and so forth.

MySQL's mechanism for providing unique numbers is through AUTO_INCREMENT columns.These allow you to generate sequential numbers automatically. Unfortunately, AUTO_INCREMENT is also sometimes poorly understood, a phenomenon perhaps compounded by the changes that were made to AUTO_INCREMENT in MySQL 3.23. This section describes how AUTO_INCREMENT columns behave so that you can use them effectively without running into the traps that sometimes surprise people. It also describes how you can generate sequences without using an AUTO_INCREMENT column.

AUTO_INCREMENT for MySQL Versions up to 3.23

For versions of MySQL up to 3.23, AUTO_INCREMENT columns behave as follows:

  • Inserting NULL into an AUTO_INCREMENT column causes MySQL to automatically generate the next sequence number and insert that value into the column instead. AUTO_INCREMENT sequences begin at 1, so the first record inserted into the table gets a sequence column value of 1, and subsequent records get values of 2, 3, and so forth. In general, each automatically generated value will be one more than the current maximum value stored in the column.

  • Inserting 0 into an AUTO_INCREMENT column is like inserting NULL into the column. Inserting a row without specifying a value for the AUTO_INCREMENT column also is like inserting NULL.

  • If you insert a record and explicitly specify a value for the AUTO_INCREMENT column, one of two things will occur. If a record already exists with that value, an error occurs because values in AUTO_INCREMENT columns must be unique. If a record does not exist with that value, the record is inserted and, if the value in the column is the new largest value, the sequence continues with the next value after that for subsequent rows. In other words, you can "bump up" the counter by inserting a record with a sequence value greater than the current counter value.

    Bumping up the counter can result in gaps in the sequence, but you can also exploit this behavior to your advantage. Suppose you create a table with an AUTO_INCREMENT column, but you want a sequence to begin at 1000 rather than at 1. You can achieve this one of two ways. First, you can insert the first record with an explicit sequence value of 1000, then insert subsequent records by inserting NULL into the AUTO_INCREMENT column. Second, you can insert a fake record with a value of 999 in the AUTO_INCREMENT column. The first real record you insert after that will get a sequence number of 1000, after which you can delete the fake record.

  • If you insert an illegal value into an AUTO_INCREMENT column, do not expect anything useful to happen. The result is unpredictable.

  • If you delete the record containing the largest value in an AUTO_INCREMENT column, that value is reused the next time you generate a new value. If you delete all the records in the table, all values are reused: The sequence begins over, starting with 1.

  • REPLACE statements work normally.

  • UPDATE statements work using rules similar to those that apply to inserting new records. If you update an AUTO_INCREMENT column to NULL or 0, it is updated to the next sequence number. If you attempt to update the column to a value that already exists, an error occurs (unless you happen to be setting the column to the value that it already has). If you update the column to a value larger than any existing column value, the sequence continues with the next number after that for subsequent records.

  • The value of the most recent automatically generated sequence number is available by calling the LAST_INSERT_ID() function. This allows you to reference the AUTO_INCREMENT value in other statements without knowing what the value is. LAST_INSERT_ID() is tied to AUTO_INCREMENT values generated during the current server session; it is not affected by AUTO_INCREMENT activity associated with other clients. If no AUTO_INCREMENT value has been generated during the current session, LAST_INSERT_ID() returns 0.

It's extremely useful to be able to generate sequence numbers automatically. However, the behavior just described has two shortcomings. First, the reuse of sequence values when records at the top of the sequence are deleted makes it harder to generate a monotonic (strictly increasing) set of values for applications that may delete records as well as insert them. Second, the means by which you begin a sequence at a value higher than 1 are clumsy.

AUTO_INCREMENT for MySQL Versions 3.23 and up

MySQL 3.23 introduced the following changes to AUTO_INCREMENT behavior to deal with the concerns just noted:

  • The values in an automatically generated series are strictly increasing and are not reused. If the maximum value is 143 and you delete the record containing that value, MySQL still generates the next value as 144.

  • You can specify the initial sequence number explicitly when you create the table. The following example creates a table with an AUTO_INCREMENT column seq that begins at 1,000,000:

    CREATE TABLE my_table
        (seq INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY)
        AUTO_INCREMENT = 1000000
    

    When a table has multiple columns (as most tables do), there is no ambiguity about which column the terminating AUTO_INCREMENT = 1000000 clause applies to because you can have only one AUTO_INCREMENT column per table.

Issues to Consider with AUTO_INCREMENT

You should keep the following points in mind to avoid being surprised when you use AUTO_INCREMENT columns:

  • AUTO_INCREMENT is not a column type; it's a column type attribute. Furthermore, AUTO_INCREMENT is an attribute intended for use only with integer types. Versions of MySQL earlierthan 3.23 are lax in enforcing this constraint and will let you declare a column type such as CHAR with the AUTO_INCREMENT attribute. However, only the integer types work correctly as AUTO_INCREMENT columns.

  • The primary purpose of the AUTO_INCREMENT mechanism is to allow you to generate a sequence of positive integers, and it's best if you stick to using AUTO_INCREMENT columns that way. For this reason, you should declare AUTO_INCREMENT columns to be UNSIGNED. This also has the advantage of giving you twice as many sequence numbers before you hit the upper end of the column type's range.

    It is possible under some circumstances to generate sequences of negative values using an AUTO_INCREMENT column, but I don't recommend it. If you're determined to try it, make sure to perform adequate testing, and retest if you upgrade to a different version of MySQL. My own experiments indicate somewhat inconsistent behavior between versions with regard to negative sequences.

  • Don't be fooled into thinking that adding AUTO_INCREMENT to a column declaration is a magic way of getting an unlimited sequence of numbers. It's not; AUTO_INCREMENT sequences are bound by the range of the underlying column type. For example, if you use a TINYINT UNSIGNED column, the maximum sequence number is 255. When you reach that limit, your application will begin to fail with "duplicate key" errors.

  • MySQL3.23 introduced the new AUTO_INCREMENT behaviors of not reusing sequence numbers and allowing you to specify an initial sequence number in the CREATE TABLE statement. These behaviors are undone if you delete all records in the table using a DELETE statement of the following form:

    DELETE FROM tbl_name
    										

    In this case, the sequence starts over from 1 rather than continuing in strictly increasing order. The sequence starts over even if your CREATE TABLE statement specifies an initial sequence number explicitly. This occurs due to the way MySQL optimizes DELETE statements that empty a table entirely: It re-creates the data and index files from scratch rather than deleting each record, and that causes all sequence number information to be lost. If you want to delete all records but preserve the sequence information, you can suppress the optimization and force MySQL to perform a row-by-row delete operation instead, like this:

    DELETE FROM tbl_name WHERE 1 > 0
    

What can you do to maintain a strictly increasing series if you have a version of MySQL older than 3.23? One solution is to maintain a separate table that you use only for generating AUTO_INCREMENT values, and from which you never delete records. That way, the values in the table are never reused. When you need to generate a new record in your main table, first insert a NULL into the sequence number table. Then insert the record into your main table using the value of LAST_INSERT_ID() for the column that you want to contain a sequence number:

INSERT INTO ai_tbl SET ai_col = NULL
INSERT INTO main_tbl SET id=LAST_INSERT_ID() …

Suppose you want to write an application that generates AUTO_INCREMENT values, but you want the sequence to start with 100 rather than 1. Suppose also that you want it to be portable to all versions of MySQL. How can you accomplish this?

If portability is a goal, you can't rely on the capability that MySQL 3.23 provides for specifying the initial sequence number in the CREATE TABLE statement. Instead, when you want to insert a record, first check whether or not the table is empty by issuing the following statement:

SELECT COUNT(*) FROM tbl_name
								

This is an extra step, but it doesn't cost you much because SELECT COUNT(*) with no WHERE clause is optimized to return quickly. If the table is empty, insert the record and explicitly specify a value of 100 for the sequence number column. If the table isn't empty, simply specify NULL for the sequence number column value and let MySQL generate the next number automatically.

This approach allows you to insert records with sequence numbers of 100, 101, and so on, and it works whether or not MySQL allows the initial sequence value to be specified. This approach does not work if you require sequence numbers to be strictly increasing even when records are deleted from the table. In that case, you can combine this method with the technique described earlier of using a secondary table for nothing else but to generate sequence numbers for use in your main table.

Why might you want to begin a sequence with a value higher than 1? One reason is to make sequence numbers all have the same number of digits. If you're generating customer ID numbers, and you expect never to have more than a million customers, you could begin the series at 1,000,000. You'll be able to add a million customer records before the digit count for customer ID values changes.

Another way to force sequence numbers to be a certain widthis to use a ZEROFILL column, of course. That can present problems, depending on the context in which you use your data. For example, if you manipulate sequence numbers with leading zeros in Perl orPHP scripts, you have to be careful to use them only as strings; if they get converted to numbers, the leading zeros will be lost. The following short Perl script illustrates the perils of dealing with numbers like this:

#! /usr/bin/perl
$s = "00010";   # create "number" with leading zeroes
print "$s
";
$s++;           # use Perl's intelligent increment
print "$s
";
$s += 1;        # use $s in a numeric context
print "$s
";

When executed, the script prints the following output:

00010  Okay
00011  Okay
12  Oops!

Perl's '++' autoincrement operator is smart and can create sequence values from either strings or numbers, but '+=' operates on numbers only. In the output just displayed, you can see that '+=' causes a string-to-number conversion and the leading zeroes in the value of $s are lost.

Other reasons for not beginning a sequence at 1 might have nothing to do with technical considerations. For example, if you were assigning membership numbers, you might want to begin a sequence at a number higher than 1 to forestall political squabbling over who gets to be member number 1 by making sure there isn't any such number. Hey, it happens. Sad, but true.

Generating Sequences Without AUTO_INCREMENT

Another method for generating sequence numbers doesn't use an AUTO_INCREMENT column at all. Instead, it uses the variant form of the LAST_INSERT_ID() function that takes an argument. (This form was introduced in MySQL 3.22.9.) If you insert or update a column using LAST_INSERT_ID(expr), the next call to LAST_INSERT_ID() with no argument will return the value of expr. In other words, expr is treated as though it had been generated by the AUTO_INCREMENT mechanism. This allows you to generate a sequence number and then use it in a later statement within the client session, without having the value be affected by other clients.

One way to use this strategy is to create a single-row table containing a value that is updated each time you want the next value in the sequence. For example, you can create the table like this:

CREATE TABLE seq_table (seq INT UNSIGNED NOT NULL)
INSERT INTO seq_table VALUES(0)

These statements create the table seq_table and initialize it with a single row containing a seq value of 0. To use the table, generate the next sequence number like this:

UPDATE seq_table SET seq = LAST_INSERT_ID(seq+1)

This statement retrieves the current value of the seq column and increments it by 1 to produce the next value in the sequence. Generating the new value using LAST_INSERT_ID(seq+1) causes it to be treated as though it were an AUTO_INCREMENT value, and the value can be retrieved in a later statement by calling LAST_INSERT_ID() without an argument. This works even if some other client has generated another sequence number in the meantime because LAST_INSERT_ID() is client-specific.

You can also use this method if you want to generate sequence values that increment by a value other than 1, or that are negative. For example, the following two statements could be used to generate a sequence of numbers that increase by 100 each time or a sequence of negative numbers:

UPDATE seq_table SET seq = LAST_INSERT_ID(seq+100)
UPDATE seq_table SET seq = LAST_INSERT_ID(seq-1)

You can also use this method to generate a sequence that begins at an arbitrary value by setting the seq column to an appropriate initial value.

For an application of this sequence-generation method for multiple counters, see "Setting Up a Counter Table" in Chapter 3, "MySQL SQL Syntax and Use."

String Column Types

MySQL provides several string types to hold character data. Strings are often used for values like these:

"N. Bertram, et al."
"Pencils (no. 2 lead)"
"123 Elm St."
"Monograph Series IX"

But strings are actually "generic" types in a sense because you can use them to represent any value. For example, you can use string types to hold binary data, such as images or sounds, or output from gzip, should you wish to store compressed data.

For all string types, values that are too long are chopped to fit. But string types range from very small to very large, with the largest type able to hold nearly 4GB of data, so you should be able to find something long enough to avoid truncation of your information.[2]

[2] Due to limitations imposed by the maximum packet size of the client/server communication protocol,the effective limit on column values is 24MB.

Table 2.8 shows the types provided by MySQL for declaring string-valued columns, and the maximum size and storage requirements of each type. For variable-length column types, the amount of storage taken by a value varies from row to row, and depends on the length of the values actually stored in the column. This length is represented by L in the table.

The extra bytes required in addition to L are the number of bytes needed to store the length of the value. MySQL handles variable-length values by storing both the content of the value and its length. These extra bytes are treated as an unsigned integer. Notice the correspondence between a variable-length type's maximum length, the number of extra bytes required for that type, and the range of the unsigned integer type that uses the same number of bytes. For example, MEDIUMBLOB values may be up to 224–1 bytes long and require 3 bytes to record the result. The 3-byte integer type MEDIUMINT has a maximum unsigned value of 224–1. That's not a coincidence.

Table 2.8. String Column Types
Type SpecificationMaximum SizeStorage Required
CHAR(M)M bytesM bytes
VARCHAR(M)M bytesL+1 bytes
TINYBLOB, TINYTEXT28–1 bytesL+1 bytes
BLOB, TEXT216–1 bytesL+2 bytes
MEDIUMBLOB, MEDIUMTEXT224–1 bytesL+3 bytes
LONGBLOB, LONGTEXT232–1 bytesL+4 bytes
ENUM("value1","value2",…)65535 members1 or 2 bytes
SET("value1","value2",…)64 members1, 2, 3, 4, or 8 bytes

CHAR and VARCHAR Column Types

CHAR and VARCHAR are the most commonly used string types. The difference between them is that CHAR is a fixed-length type and VARCHAR is a variable-length type. Values in a CHAR(M) column each take M bytes; shorter values are right-padded with spaces when they are stored. (Trailing spaces are stripped off on retrieval, however.) Values in a VARCHAR(M) column are stored using only as many bytes as necessary, plus one byte to record the length.[3]

[3] Trailing spaces are stripped when values are stored; this differs from the ANSI SQL standard for VARCHAR values.

If your values don't vary much in length, CHAR is a better choice than VARCHAR because tables with fixed-length rows can be processed more efficiently than tables with variable-length rows. If your values are all the same length, VARCHAR will actually use more space due to the extra byte used to record the length of values.

Prior to MySQL 3.23, CHAR and VARCHAR columns may be declared with a maximum length M from 1 to 255. Beginning with MySQL 3.23, CHAR(0) is also legal. CHAR(0) is useful as a placeholder when you want to declare a column but don't want to allocate space for it if you're not sure yet how wide to make it. You can use ALTER TABLE to widen the column later. A CHAR(0) column may also be used to represent on/off values if you allow it to be NULL. Values in such a column may have two values, NULL and the empty string. A CHAR(0) column takes very little storage space in the table—only a single bit.

With a few limited exceptions, you can't mix CHAR and VARCHAR within the same table. MySQL will even changecolumns from one type to another, depending on the circumstances. (This is something that other databases do not do.) The principles that apply are as follows:

  • Tables with fixed-length rows are processed more easily than tables with variable-length rows. (The reasons for this are discussed in the section "Choosing Column Types.")

  • Table rows are fixed-length only if all the columns in the table are fixed-length types. If even a single column has a variable length, table rows become variable-length as well.

  • Because the performance advantages of fixed-length rows are lost when the row becomes variable-length, any fixed-length columns may as well be converted to variable-length equivalents when that will save space.

What this means is that if you have VARCHAR columns in a table, you cannot also have CHAR columns; MySQL silently converts them to VARCHAR. Suppose you create a table like this:

CREATE TABLE my_table
(
    c1 CHAR(10),
    c2 VARCHAR(10)
)

If you issue a DESCRIBE my_table query, the output looks like this:

+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| c1    | varchar(10) | YES  |     | NULL    |       |
| c2    | varchar(10) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+

Notice that the presence of the VARCHAR column causes MySQL to convert c1 to VARCHAR as well. If you try using ALTER TABLE to convert c1 to CHAR, it won't work. The only way to convert a VARCHAR column to CHAR is to convert all VARCHAR columns in the table at the same time:

ALTER TABLE my_table MODIFY c1 CHAR(10), MODIFY c2 CHAR(10)

The BLOB and TEXT column types are variable-length like VARCHAR, but they have no fixed-length equivalent, so you cannot use CHAR columns in the same table as BLOB or TEXT columns. Any CHAR column will be converted to VARCHAR.

The exception to non-mixing of fixed-length and variable-length columns is that CHAR columns shorter than four characters are not converted to VARCHAR. For example, MySQL will not change the CHAR column in the following table to VARCHAR:

CREATE TABLE my_table
(
    c1 CHAR(2),
    c2 VARCHAR(10)
)

The reason columns shorter than four characters are not converted is that, on average, any savings you might gain by not storing trailing spaces are offset by the extra byte needed in a VARCHAR column to record the length of each value. In fact, if all your columns are short, MySQL will convert any that you declare as VARCHAR to CHAR. MySQL does this because the conversion won't increase storage requirements on average and will improve performance by making table rows fixed-length. If you create a table with the following specification, the VARCHAR columns are all silently changed to CHAR:

CREATE TABLE my_table
(
    c1 VARCHAR(1),
    c2 VARCHAR(2),
    c3 VARCHAR(3)
)

You can see that the columns are changed by examining the output of DESCRIBE my_table:

+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| c1    | char(1) | YES  |     | NULL    |       |
| c2    | char(2) | YES  |     | NULL    |       |
| c3    | char(3) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+

BLOB and TEXT Column Types

A "BLOB" is abinary large object—basically, a container that can hold anything you want to toss into it, and that you can make about as big as you want. In MySQL, the BLOB type is really a family of types (TINYBLOB,BLOB, MEDIUMBLOB, LONGBLOB), which are identical except in the maximum amount of information they can hold (see Table 2.8). MySQL also has a family of TEXT types (TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT). These are identical to the corresponding BLOB types in all respects except that for comparison and sorting purposes, BLOB values are case sensitive and TEXT values are not. BLOB and TEXT columns are useful for storing data that may grow very large or that may vary widely in size from row to row. Some examples are word-processing documents, images and sounds, compound data, and news articles.

BLOB or TEXT columns can be indexed from MySQL 3.23.2 on, although you must specify a prefix size to be used for the index to avoid creating index entries that might be huge and thereby defeat any benefits to be gained by that index. Besides, you generally don't perform searches by looking through a BLOB or TEXT column anyway because such columns often contain binary data (such as images). It's more common to use other columns in the table to record some sort of identifying information about the BLOB or TEXT values and use those to determine which rows you want.

BLOB or TEXT columns may require special care:

  • Due to the typical large variation in the size of BLOB and TEXT values, tables containing them are subject to high rates of fragmentation if many deletes and updates are done. You'll want to run OPTIMIZE TABLE periodically to reduce fragmentation and maintain good performance. See Chapter 4, "Query Optimization," for more information.

  • If you're using very large values, you may need to tune the server to increase the value of the max_allowed_packet parameter. See Chapter 11, "General MySQL Administration," for more information. You will also need to increase the packet size for any client that wishes to use very large values. Appendix E, "MySQL Program Reference," describes how to do this for the mysql and mysqldump clients.

ENUM and SET Column Types

ENUM and SET are special string types for which column values must be chosen from a fixed set of strings. The primary difference between them is that ENUM column values must consist of exactly one member of the set of values whereas SET column values may contain any or all members of the set. In other words, ENUM is used for values that are mutually exclusive, whereas SET allows multiple choices from a list of values.

The ENUM column type defines an enumeration. ENUM columns may be assigned values consisting of exactly one member chosen from a list of values specified at table-creation time. An enumeration may have up to 65,536 members (one of which is reserved by MySQL). Enumerations are commonly used to represent category values. For example, values in a column declared as ENUM("N","Y") can be either "N" or "Y". Or you can use ENUM for such things as answers to multiple-choice questions in a survey or questionnaire, or available sizes or colors for a product:

employees ENUM("less than 100","100-500","501-1500","more than 1500")
color ENUM("red","green","blue","black")
size ENUM("S","M","L","XL","XXL")

If you are processing selections from Web pages, you can use an ENUM to represent the option that a visitor to your site chooses from a set of mutually exclusive radio buttons on a page. For example, if you run an online pizza ordering service, an ENUM can be used to represent the type of crust a customer orders:

crust ENUM("thin","regular","pan style")

If enumeration categories represent counts, it's important to choose your categories properly when you create the enumeration. For example, when recording white blood cell counts from a laboratory test, you may group the counts into categories like this:

wbc ENUM("0-100","101-300",">300")

When a test result comes in as an exact count, you record the value in terms of the category into which the count falls. But you cannot recover the original count if you decide you want to convert the column from a category-based ENUM to an integer column based on exact count.

The SET type is similar to ENUM in the sense that when you create a SET column, you specify a list of legal set members. But unlike ENUM, each column value may consist of any number of members from the set. The set may have up to 64 members. You can use a SET when you have a fixed set of values that are not mutually exclusive, as they are in an ENUM column. For example, you might use a SET to represent options available for an automobile:

SET("luggage rack","cruise control","air conditioning","sun roof")

Then particular SET values would represent those options actually ordered by customers:

SET("cruise control,sun roof")
SET("luggage rack,air conditioning")
SET("luggage rack,cruise control,air conditioning")
SET("air conditioning")
SET("")

The empty string means the customer ordered no options. This is a legal SET value.

SET column values are represented as a single string. If a value consists of multiple set members, the members are separated in the string by commas. Obviously, this means you shouldn't use a string containing a comma as a SET member.

Other uses for SET columns might be for representing information such as patient diagnoses or results from selections on Web pages. For a diagnosis, there may be a standard list of symptoms to ask a patient about, and the patient might exhibit any or all of them. For your online pizza service, the Web page for ordering could have a set of check boxes for ingredients that a customer wants on a pizza, and several might be chosen.

The way you declare the legal value list for an ENUM or SET column is significant in several ways:

  • The list determines the possible legal values for the column, as has already been discussed.

  • You can insert ENUM or SET values in any lettercase, but the lettercase of the strings specified in the column declaration determines the lettercase of column values when they are retrieved later. For example, if you have an ENUM("Y","N") column and you store "y" and "n" in it, the values are displayed as "Y" and "N" when you retrieve them. This does not affect comparison or sorting behavior because ENUM and SET columns are not case sensitive.

  • The order of values in an ENUM declaration is the order used for sorting. The order of values in a SET declaration also determines sort order, although the relationship is more complicated because column values may contain multiple set members.

  • The order of values in a SET declaration determines the order in which substrings appear when SET column values consisting of multiple set members are displayed.

ENUM and SET are classified as string types because enumeration and set members are specified as strings when you create columns of these types. However, the members are stored internally as numbers and you can operate on them as such. This means that ENUM and SET types are more efficient than other string types because they often can be handled using numeric operations rather than string operations. It also means that ENUM and SET values can be used in either string or numeric contexts.

ENUM members in the column declaration are numbered sequentially beginning with 1. (0 is used by MySQL for the error member, which is represented in string form by the empty string.) The number of enumeration values determines the storage size of an ENUM column. One byte can represent 256 values, two bytes can represent 65,536 values. (Compare this to the ranges of the one-byte and two-byte integer types TINYINT UNSIGNED and SMALLLINT UNSIGNED.) Thus, the maximum number of enumeration members is 65,536 (counting the error member) and the storage size depends on whether or not there are more than 256 members. You can specify a maximum of 65,535 (not 65,536) members in the ENUM declaration because MySQL reserves a spot for the error member as an implicit member of every enumeration. When you assign an illegal value to an ENUM column, MySQL assigns the error member instead.

Here is an example you can try using the mysql client. It shows the numeric ordering of enumeration members and also demonstrates that the NULL value has no number in the ordering:

mysql> CREATE TABLE e_table (e ENUM("jane","fred","will","marcia"));
mysql> INSERT INTO e_table VALUES("jane"),("fred"),("will"),("marcia"),(""),(NULL);
mysql> SELECT e, e+0, e+1, e*3 FROM e_table;
+--------+------+------+------+
| e      | e+0  | e+1  | e*3  |
+--------+------+------+------+
| jane   |    1 |    2 |    3 |
| fred   |    2 |    3 |    6 |
| will   |    3 |    4 |    9 |
| marcia |    4 |    5 |   12 |
|        |    0 |    1 |    0 |
| NULL   | NULL | NULL | NULL |
+--------+------+------+------+

You can operate on ENUM members either by name or number:

mysql> SELECT e FROM e_table WHERE e="will";
+------+
| e    |
+------+
| will |
+------+
mysql> SELECT e FROM e_table WHERE e=3;
+------+
| e    |
+------+
| will |
+------+

It is possible to declare the empty string as a legal enumeration member. It will be assigned a non-zero numeric value, just as would be any other member listed in the declaration. However, using an empty string may cause some confusion because that string is also used for the error member that has a numeric value of 0. In the following example, assigning the illegal enumeration value "x" to the ENUM column causes the error member to be assigned. This is distinguishable from the empty string member only when retrieved in numeric form:

mysql> CREATE TABLE t (e ENUM("a","","b"));
mysql> INSERT INTO t VALUES("a"),(""),("b"),("x");
mysql> SELECT e, e+0 FROM t;
+------+------+
| e    | e+0  |
+------+------+
| a    |    1 |
|      |    2 |
| b    |    3 |
|      |    0 |
+------+------+

The numeric representation of SET columns is a little different than for ENUM columns. Set members are not numbered sequentially. Instead, each member corresponds to an individual bit in the SET value. The first set member corresponds to bit 0, the second member corresponds to bit 1, and so on. A numeric SET value of 0 corresponds to the empty string. SET members are maintained as bit values. Eight set values per byte can be stored this way, so the storage size for a SET column is determined by the number of set members, up to a maximum of 64 members. SET values take 1, 2, 3, 4, or 8 bytes for set sizes of 1 to 8, 9 to 16, 17 to 24, 25 to 32, and 33 to 64

The representation of a SET as a set of bits is what allows a SET value to consist of multiple set members. Any combination of bits can be turned on in the value, so the value may consist of any combination of the strings in the SET declaration that correspond to those bits.

Here's an example that shows the relationship between the string and numeric forms of a SET column; the numeric value is displayed in both decimal and binary form:

mysql> CREATE TABLE s_table (s SET("jane","fred","will","marcia"));
mysql> INSERT INTO s_table VALUES("jane"),("fred"),("will"),("marcia"),(""),(NULL);
mysql> SELECT s, s+0, BIN(s+0) FROM s_table;
+--------+------+----------+
| s      | s+0  | BIN(s+0) |
+--------+------+----------+
| jane   |    1 | 1        |
| fred   |    2 | 10       |
| will   |    4 | 100      |
| marcia |    8 | 1000     |
|        |    0 | 0        |
| NULL   | NULL | NULL     |
+--------+------+----------+

If you assign to a SET column a value containing substrings that are not listed as set members, those strings drop out and the column is assigned a value consisting of the remaining substrings. When you assign values to SET columns, the substrings don't need to be listed in the same order that you used when you declared the column. However, when you retrieve the value later, members will be listed in declaration order. Suppose you declare a SET column to represent furniture items using the following declaration:

SET("table","lamp","chair")

If you assign a value of "chair,couch,table" to this column, two things happen. First, "couch" drops out because it's not a member of the set. Second, when you retrieve the value later, it appears as "table,chair". This occurs because MySQL determines which bits correspond to each substring of the value to be assigned and turns them on in the stored value. "couch" corresponds to no bit and is ignored. On retrieval, MySQL constructs the string value from the numeric value by scanning the bits in order, which automatically reorders the substrings to the order used when the column was declared. This behavior also means that if you specify a set member more than once in a value, it will appear only once when you retrieve the value. If you assign "lamp,lamp,lamp" to a SET column, it will be simply "lamp" when retrieved.

The fact that MySQL reorders members in a SET value means that if you search for values using a string, you must list members in the proper order. If you insert "chair,table" and then search for "chair,table" you won't find the record; you must look for it as "table,chair".

Sorting and indexing of ENUM and SET columns is done according to the internal (numeric) values of column values. The following example might appear to be incorrect otherwise because the values are not sorted in alphanumeric order:

mysql> SELECT e FROM e_table ORDER BY e;
+--------+
| e      |
+--------+
| NULL   |
|        |
| jane   |
| fred   |
| will   |
| marcia |
+--------+

The NULL value sorts before other values (or after, for a descending sort).

You can exploit the ENUM sorting order if you have a fixed set of values and you want them to sort in a particular order. Make the column an ENUM when you create the table and list the enumeration values in the column declaration in the order that you want them to be sorted.

For cases where you want an ENUM to sort in regular lexicographic order, you can convert the column to a non-ENUM string by using CONCAT() and sorting the result:

mysql> SELECT CONCAT(e) as e_str FROM e_table ORDER BY e_str;
+--------+
| e_str  |
+--------+
| NULL   |
|        |
| fred   |
| jane   |
| marcia |
| will   |
+--------+

String Column Type Attributes

The BINARY attribute may be specified for the CHAR and VARCHAR types to cause column values to be treated as binary strings (that is, case sensitive in comparison and sorting operations).

The general attributes NULL or NOT NULL may be specified for any of the string types. If you don't specify either of them, NULL is the default. However, declaring a string column as NOT NULL does not prevent entry of an empty string. An empty value is different than a missing value, so don't make the mistake of thinking that you can force a string column to contain non-empty values by declaring it NOT NULL. If you require string values to be non-empty, that is a constraint you must enforce within your own applications.

You may also specify a default value using the DEFAULT attribute for all string column types except the BLOB and TEXT types. If you don't specify a default value,one is chosen automatically. The default is NULL for columns that may contain NULL. For columns that may not contain NULL, the default is the empty string except for ENUM, where the default is the first enumeration member. (For SET, the default when the column cannot contain NULL is actually the empty set, but that is equivalent to the empty string.)

Date and Time Column Types

MySQL provides several column types for temporal values: DATE, DATETIME, TIME, TIMESTAMP, and YEAR. Table 2.9 shows the types provided by MySQL for declaring columns that hold date and time values and the range of legal values for each type. The YEAR type was introduced in MySQL 3.22 . The others have been present in all versions of MySQL. The storage requirements for each type are shown in Table 2.10.

Table 2.9. Date and Time Column Types
Type SpecificationRange
DATE"1000-01-01" to "9999-12-31"
TIME"-838:59:59" to "838:59:59"
DATETIME"1000-01-01 00:00:00" to "9999-12-31 23:59:59"
TIMESTAMP[(M)]19700101000000 to sometime in the year 2037
YEAR[(M)]1901 to 2155

Table 2.10. Date and Time Column Type Storage Requirements
Type SpecificationStorage Required
DATE3 bytes (4 bytes prior to MySQL 3.22)
TIME3 bytes
DATETIME8 bytes
TIMESTAMP4 bytes
YEAR1 byte

Each date and time type has a "zero" value that is stored when you insert a value that is illegal for the type, as shown in Table 2.11. This value is also the default value for date and time columns that are declared NOT NULL.

Table 2.11. Date and Time Type "Zero" Values
Type SpecificationZero Value
DATE"0000-00-00"
TIME"00:00:00"
DATETIME"0000-00-00 00:00:00"
TIMESTAMP00000000000000
YEAR0000

MySQL always represents dates with the year first, in accordance with the ANSI specification. For example, December 3, 1999 is represented as "1999-12-03". MySQL does allow some leeway in the way it allows input dates to be specified. For example, it will convert two-digit year values to four digits, and you need not supply leading two digits for month and day values that are less than 10. However, you must specify the year first. Formats that you may be more used to, such as "12/3/99" or "3/12/99", will be interpreted incorrectly. The date interpretation rules MySQL uses are discussed further in "Working with Date and Time Columns."

Time values are returned in the time zone local to the server; MySQL doesn't make any time zone adjustments for the values that it returns to the client.

DATE, TIME, and DATETIME Column Types

The DATE, TIME, and DATETIME types hold date, time, and combined date and time values. The formats are "YYYY-MM-DD", "hh:mm:ss", and "YYYY-MM-DD hh:mm:ss". For the DATETIME type, the date and time parts are both required; if you assign a DATE value to a DATETIME column, MySQL automatically adds a time part of "00:00:00".

MySQL treats the time in DATETIME and TIME values slightly differently. For DATETIME, the time part represents a time of day. A TIME value, on the other hand, represents elapsed time (that's why the range is so great and why negative values are allowed). The rightmost part of the value is taken to indicate seconds, so if you insert a "short" (not fully qualified) time value, such as "12:30", into a TIME column, the value stored is "00:12:30". That is, it's interpreted as "12 minutes, 30 seconds." You can use TIME columns to represent time of day if you like, but keep this conversion rule in mind to avoid problems. To insert a value of "12 hours, 30 minutes," you must specify it as "12:30:00".

TIMESTAMP Column Type

TIMESTAMP columns represent values in YYYYMMDDhhmmss format,with a range from 19700101000000 to sometime in the year 2037. The range is tied to UNIX time, where the first day of 1970 is "day zero," also known as "the epoch." The beginning of 1970 determines the lower end of the TIMESTAMP range. The upper end of the range corresponds to the four-byte limit on UNIX time, which can represent values into the year 2037.[4]

[4] The upper limit on TIMESTAMP values will increase as operating systems are modified to extend the upper range of UNIX time values. This is something that must be addressed at the system library level. MySQL will take advantage of these changes as they are made.

The TIMESTAMP type is so called because it has the special property of recording when a record is created or modified. If you insert a NULL into a TIMESTAMP column, the column value is set automatically to the current date and time. This also happens if you create or update a row but assign no explicit value to the column. However, only the first TIMESTAMP column in a row is treated this way, and even for the first TIMESTAMP column, you can defeat timestamping by inserting an explicit date and time value into the column rather than NULL.

A TIMESTAMP column declaration may include a specification for a maximum display width M. Table 2.12 shows the display formats for the allowed values of M. If M is omitted from a TIMESTAMP declaration or has a value of 0 or greater than 14, the column is treated as TIMESTAMP(14). Odd values of M in the range from 1 to 13 are treated as the next higher even number.

Table 2.12. TIMESTAMP Display Formats
Type SpecificationDisplay Format
TIMESTAMP(14)YYYYMMDDhhmmss
TIMESTAMP(12)YYYYMMDDhhmm
TIMESTAMP(10)YYMMDDhhmm
TIMESTAMP(8)YYYYMMDD
TIMESTAMP(6)YYMMDD
TIMESTAMP(4)YYMM
TIMESTAMP(2)YY

The display width for TIMESTAMP columns has nothing to do with storage size or with the values stored internally. TIMESTAMP values are always stored in 4 bytes and used in calculations to full 14-digit precision, regardless of the display width. To see this, suppose you declare a table as follows, then insert some rows into it and retrieve them:

CREATE TABLE my_table
(

    ts TIMESTAMP(8),
    i INT
)
INSERT INTO my_table VALUES(19990801120000,3)
INSERT INTO my_table VALUES(19990801120001,2)
INSERT INTO my_table VALUES(19990801120002,1)
INSERT INTO my_table VALUES(19990801120003,0)
SELECT * FROM my_table ORDER BY ts, i

The output from the SELECT statement looks like this:

+----------+------+
| ts       | i    |
+----------+------+
| 19990801 |    3 |
| 19990801 |    2 |
| 19990801 |    1 |
| 19990801 |    0 |
+----------+------+

On the face of it, the rows appear to be sorted in the wrong order—the values in the first column are all the same, so it seems the sort should order the rows according to the values in the second column. This apparently anomalous result is due to the fact that MySQL is sorting based on the full 14-digit values inserted into the TIMESTAMP column.

MySQL has no column type that can be set to the current date and time when the record is created and that remains immutable thereafter. If you want to achieve that, you can do it two ways:

  • Use a TIMESTAMP column. When a record is first created, set the column to NULL to initialize it to the current date and time:

    INSERT INTO tbl_name (ts_col, …) VALUES(NULL, …)
    

    Whenever you update the record thereafter, explicitly set the column to the value it already has. Assigning an explicit value defeats the timestamping mechanism because it prevents the column's value from being automatically updated:

    UPDATE tbl_name SET ts_col=ts_col WHERE …
    
  • Use a DATETIME column. When you create a record, initialize the column to NOW():

    INSERT INTO tbl_name (dt_col, …) VALUES(NOW(), …)
    

Whenever you update the record thereafter, leave the column alone:

UPDATE tbl_name SET /* anything BUT dt_col here */ WHERE …

If you want to use TIMESTAMP columns to maintain both a time-created value and a last-modified value, you can do so by using one TIMESTAMP for the time-modified value, and a second TIMESTAMP for the time-created value. Make sure the time-modified column is the first TIMESTAMP, so that it's set when the record is created or changed. Make the time-created column the second TIMESTAMP, and initialize it to NOW() when you create new records. That way its value will reflect the record creation time and will not change after that.

YEAR Column Type

YEAR is a one-byte column type used for efficient representation of year values. It has a range of 1901 to 2155. You can use the YEAR type when you want to store date information but only need the year part of the date, such as year of birth, year of election to office, and so forth. When you do not need a full date value, YEAR is much more space-efficient than other date types.

A YEAR column declaration may include a specification for a display width M, which should be either 4 or 2. If M is omitted from a YEAR declaration, the default is 4.

TINYINT has the same storage size as YEAR (one byte), but not the same range. To cover the same range of years as YEAR by using an integer type, you would need a SMALLINT, which takes twice as much space. If the range of years you need to represent coincides with the range of the YEAR type, YEAR is more space-efficient than SMALLINT. Another advantage of YEAR over an integer column is that MySQL will convert two-digit values into four-digit values for you using MySQL's usual year-guessing rules. For example, 97 and 14 become 1997 and 2014. However, be aware that inserting the numeric value 00 will result in the value 0000 being stored, not 2000. If you want a value of zero to convert to 2000, you must specify it as a string "00".

Date and Time Column Type Attributes

There are no attributes that are specific to the date and time column types. The general attributes NULL or NOT NULL may be specified for any of the date and time types. If you don't specify either of them, NULL is the default. You may also specify a default value using the DEFAULT attribute. If you don't specify a default value, one is chosen automatically. The default is NULL for columns that may contain NULL. Otherwise, the default is the "zero" value for the type.

Working with Date and Time Columns

MySQL tries to interpret date and time values in a variety of formats. DATE values may be specified in any of the following formats, including both string and numeric forms. Table 2.13 shows the allowable formats for each of the date and time types.

Table 2.13. Date and Time Type Input Formats
TypeAllowable Formats
DATETIME, TIMESTAMP"YYYY-MM-DD hh:mm:ss"
 "YY-MM-DD hh:mm:ss"
 "YYYYMMDDhhmmss"
 "YYMMDDhhmmss"
 YYYYMMDDhhmmss
 YYMMDDhhmmss
DATE"YYYY-MM-DD"
 "YY-MM-DD"
 "YYYYMMDD"
 "YYMMDD"
 YYYYMMDD
 YYMMDD
TIME"hh:mm:ss"
 "hhmmss"
 hhmmss
YEAR"YYYY"
 "YY"
 YYYY
 YY

Formats that have two digits for the year value are interpreted using the rules described in "Interpretation of Ambiguous Year Values." For string formats that include delimiter characters, you don't have to use '-' for dates and ':' for times. Any punctuation character may be used as the delimiter. Interpretation of values depends on context, not on the delimiter. For example, although times are typically specified using a delimiter of ':', MySQL won't interpret a value containing ':' as a time in a context where a date is expected. In addition, for the string formats that include delimiters, you need not specify two digits for month, day, hour, minute, or second values that are less than 10. The following are all equivalent:

"2012-02-03 05:04:09"
"2012-2-03 05:04:09"
"2012-2-3 05:04:09"
"2012-2-3 5:04:09"
"2012-2-3 5:4:09"
"2012-2-3 5:4:9"

Note that values with leading zeroes may be interpreted differently depending on whether they are specified as strings or numbers. The string "001231" will be seen as a six-digit value and interpreted as "2000-12-31" for a DATE, as "2000-12-31 00:00:00" for a DATETIME. On the other hand, the number 001231 will be seen as 1231 after the parser gets done with it and then the interpretation becomes problematic. This is a case where it's best to supply a string value, or else use a fully qualified value if you are using numbers (that is, 20001231 for DATE and 200012310000 for DATETIME).

In general, you may freely assign values between the DATE, DATETIME, and TIMESTAMP types, although there are certain restrictions to keep in mind:

  • If you assign a DATETIME or TIMESTAMP value to a DATE, the time part is discarded.

  • If you assign a DATE value to a DATETIME or TIMESTAMP, the time part of the resulting value is set to zero.

  • The types have different ranges. In particular, TIMESTAMP has a more limited range (1970 to 2037), so, for example, you cannot assign a pre-1970 DATETIME value to a TIMESTAMP and expect reasonable results. Nor can you assign values that are well in the future to a TIMESTAMP.

MySQL provides many functions for working with date and time values. See Appendix C for more information.

Interpretation of Ambiguous Year Values

For all date and time types that include a year part (DATE, DATETIME TIMESTAMP, YEAR), MySQL handles values that contain two-digit years by converting them to four-digit years. This conversion is performed according to the following rules:[5]

[5] In MySQL 4.0, the rules will change slightly in that 69 will be converted to 1969 rather than to 2069. This is according to the rules specified by the X/Open UNIX standard.

  • Year values from 00 to 69 become 2000 to 2069

  • Year values from 70 to 99 become 1970 to 1999

You can see the effect of these rules most easily by assigning different two-digit values into a YEAR column and then retrieving the results. This will also demonstrate something you should take note of:

mysql> CREATE TABLE y_table (y YEAR);
mysql> INSERT INTO y_table VALUES(68),(69),(99), (00);
mysql> SELECT * FROM y_table;
+------+
| y    |
+------+
| 2068 |
| 1969 |
| 1999 |
| 0000 |
+------+

Notice that 00 was converted to 0000, not to 2000. That's because 0 is a perfectly legal value for the YEAR type; if you insert a numeric zero, that's what you get. To get 2000, insert the string "0" or "00". You can make sure MySQL sees a string and not a number by inserting YEAR values using CONCAT(). This function returns a string result uniformly regardless of whether its argument is a string or a number.

In any case, keep in mind that the rules for converting two-digit to four-digit year values provide only a reasonable guess. There is no way for MySQL to be certain about the meaning of a two-digit year when the century is unspecified. If MySQL's conversion rules don't produce the values that you want, the solution is obvious: Provide unambiguous data with four-digit years.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.205.223