Indices are database objects that can greatly increase database performance, enabling
faster execution of statements involving comparative criteria. An index tracks the data on one
or more columns in a table, allowing conditional clauses (such as the WHERE
clause) to find their targeted rows more efficiently.
The internal workings of indices vary, and there are several implementations to choose from. This section describes the different types of indices available, and explains when you should use one type over the other.
While indices exist to enhance performance, they also contribute to system overhead. Indices must be updated as data in the column that they are applied to fluctuates. Maintaining infrequently used indices decreases performance when the amount of time spent maintaining them outweighs the time saved through using them. In general, indices should be applied only to columns that you expect to use frequently in comparative expressions.
Indices
are created with the CREATE INDEX
SQL command. Here is the syntax for
CREATE INDEX
:
CREATE [ UNIQUE ] INDEX indexname ON table [ USING indextype ] ( column [ opclass ] [, ...] )
In this syntax, indexname
is the name of the new
index to be created, table
is the name of the table to
be indexed, and column
is the name of a specific
column to be indexed. Optionally, the indextype
parameter may be set to specify index implementation, and the opclass
parameter may be set to indicate what operator class should be
used in sorting input values.
Operator classes are stored in PostgreSQL’s pg_opclass
column. Unless
you are especially knowledgeable of the technical inner workings of PostgreSQL’s operator
classes, you shouldn’t use this option.
Regarding the column
to index, multiple names may
be specified, separated by commas; doing so creates an index across all
specified columns. Such an index will only be used by PostgreSQL when executing SQL statements
that search all indexed columns in the WHERE
clause through the AND
keyword. Multicolumn indices are limited to a maximum of 16 columns in a
default PostgreSQL installation, and may not use an index type other than B-tree.
In determining upon which columns to create an index, consider which columns will be most
frequently accessed for search conditions. For example, while the books
table keeps an index on its id
column (the primary key), the title
column is also frequently checked for in WHERE
conditions. Adding a secondary index to the title
column will greatly
increase the performance of SQL statements making comparisons against values in that
column.
Example 7-1 shows how to create such an index and uses
the d
psql slash command to view the books
table. In
addition to showing the table’s column types, this command also shows the indices that have
been created on it.
Example 7-1. Creating an index
booktown=# CREATE INDEX books_title_idx booktown-# ON books (title); CREATE booktown=# d books Table "books" Attribute | Type | Modifier ------------+---------+---------- id | integer | not null title | text | not null author_id | integer | subject_id | integer | Indices: books_id_pkey, books_title_idx
Certain types of table
constraints, notably the PRIMARY KEY
and UNIQUE
constraints, result in the implicit creation of an index for use in enforcing the constraint.
In Example 7-2 you see the creation of the Book Town authors
table with the PRIMARY KEY
constraint placed on its
id
column. This use of PRIMARY KEY
causes an index
called authors_pkey
to be implicitly created.
Example 7-2. Implicit index creation
booktown=# CREATE TABLE authors (id integer PRIMARY KEY, booktown(# last_name text, booktown(# first_name text); NOTICE: CREATE TABLE/PRIMARY KEY will create implicit index 'authors_pkey' for table 'authors' CREATE example=# d authors Table "authors" Attribute | Type | Modifier ------------+---------+---------- id | integer | not null last_name | text | first_name | text | Index: authors_pkey
The index created in Example 7-2 enables PostgreSQL to
quickly verify that the primary key is unique for any new row inserted into the table, as well
as enhances the performance of queries which use the id
column as a search
condition.
Specifying the UNIQUE
keyword causes the index to disallow duplicate
values within the column (or set of columns) it indexes. Creating a UNIQUE
index on a table is functionally identical to creating a table with the UNIQUE
constraint (see the section titled Using Constraints
later in this chapter).
Example 7-3 creates a unique index called unique_publisher_idx
on the publishers
table’s name
column. This will disallow two publishers from having the same name in this
table.
Example 7-3. Creating a unique index
booktown=# CREATE UNIQUE INDEX unique_publisher_idx booktown-# ON publishers (name); CREATE booktown=# d publishers Table "publishers" Attribute | Type | Modifier -----------+---------+---------- id | integer | not null name | text | address | text | Indices: publishers_pkey, unique_publisher_idx
As
the NULL
value does not technically match any value, duplicate instances
of NULL
can be inserted into a column with a unique index placed on it.
This is the main practical difference between a unique index and an index implicitly created
by the PRIMARY KEY
constraint, which does not allow NULL
values at all.
The optional USING
clause can be used to specify the type of index to
implement. PostgreSQL 7.1.x supports three types of indices including:
B-tree
R-tree
Hash
PostgreSQL’s B-tree implementation uses Lehman-Yao high-concurrency B-tree algorithms and is both the most capable, and most commonly used indexing method. For this reason, it is the default index type.
The R-tree implementation is primarily useful for spacial data type operations (i.e.,
operations on geometric types) and utilizes Guttman’s quadratic split algorithm. The Hash
implementation utilizes Litwin’s linear hashing routines, which have been traditionally used
for indices that involve frequent direct equal-to comparisons (e.g., with the =
operator).
At the time of this writing, for PostgreSQL 7.1.x, the B-tree index implementation is by
far the most capable and flexible of the available index types. At this time, it is
recommended that you use the B-tree index type over the Hash implementation, even
for direct =
comparisons. The Hash index exists mostly for legacy reasons,
though it may still be used if you are sure your system would benefit from a Hash index over
a B-tree index.
As stated, the R-tree index implementation is recommended for indexing geometric types; be aware, however, that several limitations exist. For example, you cannot create a unique R-tree index, nor can you create an R-tree index on more than one column. In these cases, it makes more sense to rely on the capable B-tree index type.
The USING
clause can be used with the keywords BTREE
, RTREE
, and HASH
in order to
explicitly choose the type of index you want to create. Specifying BTREE
explicitly chooses the default index type.
Example 7-4 creates a table called polygons
, which stores spacial data of the type polygon
. An
index named spacial_idx
is then applied to its shape
column with the R-tree implementation.
A slightly modified form of the CREATE INDEX
command can be used to
index the results of a function on a column value, rather than the column value itself. This
is called a functional index.
Use the following syntax to create a functional index:
CREATE [ UNIQUE ] INDEX indexname ON table [ USING indextype ] ( functionname ( column [, ...] ) [ opclass ] )
The only difference in this syntax is that the index is created on the results of the specified function applied to each column value. All other clauses have the same effect as the functional index.
Functional indices are useful on table columns that commonly have their values prepared
through a function before being compared against values in a SQL statement. For example, the
upper( )
function is commonly used to make case-insensitive comparisons.
Creating an index using the upper( )
function enables such
case-insensitive comparisons to be carried out efficiently.
Example 7-5 creates a functional index named upper_title_idx
on the books
table. It uses the upper( )
function on the title
column as the basis to create
the index. Then it performs an example SQL query that is more efficiently executed as a
result of the newly created functional index.
The SQL command to permanently destroy an existing index from a table is DROP
INDEX
. Here is the syntax for DROP INDEX
:
DROP INDEX indexname [, ...]
In this syntax, indexname
is the name of the
index that you wish to permanently remove. Multiple indices to drop may be specified,
separated by commas.
Example 7-6 drops the upper_title_idx
index
created in Example 7-5.
18.191.176.194