String Column Types

MySQL has a range of string column types. Although in name they are intended for storing string or textual data, you can in fact use them to store any kind of data you want, including numbers, images, sounds, and binary data.

Your choice of string column type for a given application largely depends on the length of data you want to store and how you want to work with the data. Table 5.3 shows MySQL's string column types.

Table 5.3. String Column Types
Type NameMaximum Size
CHAR(M)M bytes
VARCHAR(M)M bytes
TINYBLOB28-1 bytes
BLOB216-1 bytes
MEDIUMBLOB224-1 bytes
LONGBLOB232-1 bytes
TINYTEXT28-1 bytes
TEXT216-1 bytes
MEDIUMTEXT224-1 bytes
LONGTEXT232-1 bytes
ENUM('value1','value2',...)65565 values
SET('value1','value2',...)64 values

In essence, CHAR defines a fixed-length column type, whereas VARCHAR and the TEXT and BLOB types are variable-length types. CHAR and VARCHAR are intended for small strings (up to 255 characters long), whereas the TEXT and BLOB types are intended for longer strings.

ENUM and SET are special string column types that must take values from a list of allowed values.

CHAR and VARCHAR Column Types

The CHAR and VARCHAR column types both take the M parameter, which sets the maximum length permissible. M must be between 1 and 255. If you try to insert a string longer than M into a CHAR or VARCHAR field, it will be truncated.

The main difference between CHAR and VARCHAR types is in the way they use their storage: a CHAR(M) type always uses M bytes of storage, values being padded out to the right with spaces to the specified length, M. When you retrieve a value from a CHAR field, the trailing spaces will be removed.

A VARCHAR, in contrast, takes up just enough storage to hold the value (in fact, 1 byte more, because it also stores the value's length). So if you have a VARCHAR(255) column and insert a 10-character string into it, it takes up 11 bytes of storage.

Consider the following example, which creates a table called people to illustrate the use of CHAR and VARCHAR:

CREATE TABLE people (
 initials CHAR(3),
 last_name VARCHAR(100))

Now insert some data, using the INSERT statement and a list of value sets:

INSERT INTO people VALUES ("Elizabeth", "Downey"), ("G", "Washington")

Now see what's stored in the table:

+----------+------------+
| initials | last_name  |
+----------+------------+
| Eli      | Downey     |
| G        | Washington |
+----------+------------+

As you can see, Elizabeth has been truncated to Eli because it's too long for the three-character column. (Your application would have trapped this attempt to put a name into an initials field, right?) However, the second row has been stored correctly.

How can you check that the spaces will be removed from the CHAR field when you read from it? Use the CONCAT function, which concatenates, or joins, values together. If there's nothing after initials, you know the trailing spaces have been removed.

The following line

SELECT CONCAT(initials, last_name) FROM people

produces this:

+-----------------------------+
| CONCAT(initials, last_name) |
+-----------------------------+
| EliDowney                   |
| GWashington                 |
+-----------------------------+

Note

The CONCAT() function is just one of many functions that exist in MySQL's function library.

Although the purpose of this function is to concatenate a number of string values together, MySQL has functions for handling strings, performing arithmetic, working with times and dates, running encryption algorithms, and much more.

You will learn about the function library in Day 10, “Operators and Functions in MySQL.”


It worked. The values joined seamlessly together, proving that the trailing spaces in the CHAR field were removed.

Silent Column Changes

It may sound strange, but MySQL is loathe to mix CHAR and VARCHAR column types in the same table. In the preceding example, the table looked like this when we created it:

+-----------+--------------+------+-----+---------+-------+
| Field     | Type         | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| initials  | char(3)      | YES  |     | NULL    |       |
| last_name | varchar(100) | YES  |     | NULL    |       |
+-----------+--------------+------+-----+---------+-------+

This is what we expected. But let's add another column, a CHAR(10). We could do this using an ALTER TABLE statement (more on how to alter tables in Day 6):

ALTER TABLE people ADD COLUMN title CHAR(10)

Now view the description of the table again:

+-----------+--------------+------+-----+---------+-------+
| Field     | Type         | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| initials  | char(3)      | YES  |     | NULL    |       |
| last_name | varchar(100) | YES  |     | NULL    |       |
| title     | varchar(10)  | YES  |     | NULL    |       |
+-----------+--------------+------+-----+---------+-------+

Notice what's happened to the title column: MySQL has silently changed it from a CHAR(10) to a VARCHAR(10).

It's more efficient for MySQL to process fixed-length rows of data. In other words, for optimal performance, it prefers the entire table to be defined using fixed-length columns. However, if any column is variable length (VARCHAR, TEXT, or BLOB), it might as well treat the whole thing as variable length.

MySQL converts any VARCHAR columns with a length of less than four to type CHAR. But if any column has a variable length, all CHAR columns longer than three characters are converted to VARCHAR.

MySQL performs these changes for efficiency of performance and to make best use of its data storage. However, they should not affect your handling of data.

BINARY Column Attribute

You can specify the BINARY column attribute for CHAR and VARCHAR columns. This causes values in those columns to be treated as binary strings.

You can use it as follows:

								column_name CHAR(M) BINARY

or

								column_name VARCHAR(M) BINARY

Values in those columns will then be handled in a case-sensitive way in comparison and sorting operations.

The BINARY attribute is “sticky”: if you use a BINARY column in an expression with other non-BINARY columns, the entire expression will be handled as a binary value. Any comparison will thus be done in a case-sensitive way.

BLOB and TEXT Column Types

BLOB and TEXT are families of column types intended as high-capacity binary and textual storage. You can use them to store all manner of data: text, binary data, images, compressed files, and so on. They are typically used where large-volume storage is required.

The actual column types come in the forms TINYBLOB, TINYTEXT, BLOB, TEXT, MEDIUMBLOB, MEDIUMTEXT, LONGBLOB, and LONGTEXT. As shown previously in Table 5.3, the BLOB and TEXT types themselves can store 216-1 bytes, which is 65 kilobytes. The LONG forms can store as much as 232-1 bytes, which is 4 gigabytes of data.

BLOB means binary large object, and this family of types is intended for storing binary data (or any kind of data represented in binary form), whereas TEXT types are intended for large amounts of text. However, BLOB and TEXT are essentially the same thing, and the only difference between the two families is that when doing comparison and sorting, BLOB values, being binary, are case sensitive. TEXT values are not, and comparison and sorting are not performed in a case-sensitive way.

A TEXT family column is really just like a large capacity VARCHAR column, and like VARCHAR, its storage requirements are dictated by the length of the stored value in each row. (TEXT types require between 1 and 4 bytes of storage in addition to the length of data stored. These extra bytes record the length of the stored value.)

In comparison, a BLOB family column is really just like a VARCHAR BINARY column. BLOB storage requirements are the same as for the corresponding TEXT type.

ENUM and SET Column Types

The ENUM and SET column types can be used to hold column values chosen from a given set of strings. The possible strings (think of them as “members”) are declared at table creation time, and after that only these values may be inserted into the column.

ENUM

You can use the ENUM column type where you want the column to contain exactly one of the allowed members specified at table creation time.

For example, you might create a table with an ENUM column specifying a set of three enumeration members:

CREATE TABLE my_table (
 number ENUM("one", "two", "three") NULL)

This example declared that the column can be NULL, though you could also have put NOT NULL to not allow null values to exist.

The ENUM column would be allowed to hold any one of the string values: NULL (because we declared it can be NULL), "", "one", "two", and "three".

For each ENUM column, a corresponding index exists, which is a numeric representation of the number of each possible member, starting from 1. In this case, the corresponding indexes would be NULL, 0, 1, 2, and 3, respectively.

Although NULL is a legal value in this case, an index of 0 denotes that an illegal value has been inserted. 1, 2, and 3 correspond with the valid members.

Now see what happens by inserting some rows of data, some of it valid, and some not:

INSERT INTO my_table VALUES (NULL), (""), ("one"), ("three"), ("four")

Note that the second value "" and the final value "four" are not in the set of valid members. Let's write a simple query to display the contents of the table, showing both the string value of the number column and the corresponding index value, which you get by treating the ENUM column as a number (by adding zero to it, number+0):

SELECT number, number+0 FROM my_table

This gives the following output:

+--------+----------+
| number | number+0 |
+--------+----------+
| NULL   |     NULL |
|        |        0 |
| one    |        1 |
| three  |        3 |
|        |        0 |
+--------+----------+

As you can see, we retrieved both the string value of the ENUM column and the corresponding index value. The 0 values denote the attempts to enter illegal values.

NULL is a valid value, although if we had declared the column as NOT NULL, the insert would be treated as illegal. Inserting NULL among other valid values would put a 0 into that row, whereas inserting a single row with NULL would cause an error.

If you insert a numeric into an ENUM column, that number will be assumed to be the index of that enumeration member. Thus if we do this:

INSERT INTO my_table VALUES (2)

it would cause a row with value two to be added because this member has the index value 2.

When sorting on an ENUM column, the sort order is governed by the index of the members (in other words, from the order in the CREATE TABLE declaration) rather than the members' textual representations. Therefore, sorting the data in the simple table here by adding ORDER BY:

SELECT number FROM my_table ORDER BY number

would produce the following:

+--------+
| number |
+--------+
| NULL   |
|        |
|        |
| one    |
| two    |
| three  |
+--------+

As you can see, one, two, and three are usefully in the order in which they were declared rather than in string-comparison order.

SET

The SET column type works in much the same way as ENUM. However, although ENUM can hold only one of the enumeration members, SET can hold any number of members, including all of them.

Here's an example that declares the members of a column:

CREATE TABLE pizza (
 topping SET ("pepperoni", "prawns", "anchovies", "tuna", "cheese") NULL)

To insert data into the rows, you can now specify the SET column to have no value, a single value from the valid members, or many values:

INSERT INTO pizza VALUES (""), ("pepperoni"), ("anchovies,tuna")

Notice how when we inserted data for the third row (in the third set of parentheses), we specified two valid members separated by a comma. Here's what the table now holds:

+----------------+
| topping        |
+----------------+
|                |
| pepperoni      |
| anchovies,tuna |
+----------------+

The first item was nothing (""), which leaves nothing in the table at the first row. Now let's try inserting two more rows, one with an invalid member, and one with a NULL:

INSERT INTO pizza VALUES ("chicken"), (NULL)

This adds two more rows, one containing no value, and one containing NULL:

+----------------+
| topping        |
+----------------+
|                |
| pepperoni      |
| anchovies,tuna |
|                |
| NULL           |
+----------------+

Because we declared that the column could contain NULL when we created the table, inserting a NULL results in a NULL in the data. However, if we had declared the column as NOT NULL, inserting a NULL would have resulted in an empty value in that row, denoting an attempt to enter an illegal value.

As with ENUM, we can display the SET field and the index numbers of its members:

SELECT topping, topping+0 FROM pizza

produces

+----------------+-----------+
| topping        | topping+0 |
+----------------+-----------+
|                |         0 |
| pepperoni      |         1 |
| anchovies,tuna |        12 |
|                |         0 |
| NULL           |      NULL |
+----------------+-----------+

The indexes are created in a different way from ENUM and are derived from the combined binary representations of the members present in the value. Study the MySQL Reference Manual if you require more information on how these values are represented.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.129.39.252