Table Management

The tables are the reason for the existence for the database system. Using the "black box" theory of databases, users tend to believe that once they have created a table, everything is automatically taken care of and handled by the omnipotent and omniscient database management system. For the user, it is often enough to allow his understanding to stop at this point. Ask a user where his data resides, and you'll probably get "in a table" as an answer.

One of the most important DBA functions is the arrangement and management of the tables and indexes in the database system. Unless the basic requirements of efficient table and index layout are achieved, the entire system starts off at a disadvantage and optimum performance becomes an illusive dream. Once the system is set up properly, the DBA must continuously monitor these aspects of the database system. It will always be evolving, and the DBA needs to be able to at least stay abreast of the changes. To effectively set up and maintain a database system, the DBA needs to have a deeper understanding of the physical layout of tables and indexes. This understanding must start with the sizing and physical storage of tables within databases.

Sizing and Physical Storage

Database tables are stored either in normal UNIX or NT files or on raw devices. Raw device storage is more efficient and much more common than UNIX file system storage. Since tables are not stored in individual files that are visible to UNIX, the DBA has to use Informix utilities to view table data.

Much useful information is contained in the table systables. Every database managed by Informix will have a separate set of system tables. These tables can be inspected either by the isql utility, by the dbaccess utility, by INFORMIX-4GL or other client programs, by GUI-based Informix clients, or by embedded ESQL code.

Data about physical storage locations on disk is available through use of the tb/oncheck utility. This utility is very useful and is one of the few ways that the DBA can see how the physical data is actually stored and arranged. IDS systems can also duplicate much of the data from oncheck by running queries against the sysmaster tables. See Chapter 11 for examples.

When a table is first created, a typical SQL statement to create the table might be:

CREATE TABLE test_table
     (
field1  SMALLINT,
field2  SMALLINT
)IN dbspacename EXTENT SIZE XXX NEXT SIZE YYY

IDS systems also have the option to use a FRAGMENT BY clause to allow the data to be intelligently located across multiple disk devices.

From a physical storage standpoint there are three interesting areas of this SQL statement. They are discussed in the next three sections.

in <Dbspacename>

Here, DBSPACENAME would be replaced with the name of one of your system's dbspaces. At table creation time, this is one way that you can specify where to put the table. Physical table location is most important when trying to place active tables in preferred locations on the disk. If you have some tables that are very active, you may wish to create dbspaces that are made up of chunks from your fastest disk devices. You may also wish to create dbspaces composed of chunks on different devices or controlled by different disk controllers to separate two tables that often get accessed together. Such a scheme would allow simultaneous access of both tables without the disk thrashing that would be caused if both tables were on the same disk and accessed by the same read head.

In the absence of the IN <dbspacename> clause, the table will be placed in the default dbspace for the current database. When the database is first created, it can be created in a particular dbspace by the following SQL:

CREATE database databasename IN dbspacename;

If the IN <dbspacename> clause is omitted, the database will be placed in the rootdbs, which is usually a very bad idea.

Extent Size

This is the size in kilobytes of the first extent of the table. Whether the table has one row or hundreds, it will still occupy this amount of space in the database. This space is reserved at table creation time and is released only if the table is recreated. Due to the internal layout of Informix OnLine engines, it is better if an entire table will fit within a total of eight extents. If the number of extents goes above eight, the engine has to access another page from disk each time the table is accessed. It is best to try to build your tables so that the first extent will contain all the data and indexes. Putting the majority of data into the first extent ensures that most of the data will be located in a contiguous area of disk, with resulting performance gains in long sequential reads. This is less critical in IDS systems, which keeps the free extent list in SMI tables rather than in memory caches.

Next Size

When the first extent is filled, the Informix engine automatically allocates an additional extent of NEXT size in kilobytes. As these extents are filled, more are allocated. Once an extent is allocated, it is not released unless the table is rebuilt. This means that when left alone, tables will grow in size or remain the same in size. They never decrease in size without manual intervention.

There are some cases in which the engine modifies the NEXT SIZE. There is an internal limit to the number of extents that Informix engines can physically assign to a tablespace. This limit varies with the Informix version. For example, in Version 4.XX the limit is approximately 200 extents. If a table is created using the default EXTENT SIZE and NEXT SIZE, it is possible for the table to grow to its maximum number of extents, preventing it from growing any further.

To partially alleviate this limitation, the engine does several things to minimize the number of extents used. In the first case, when Informix is allocating a NEXT extent, it will try to allocate it contiguous to the previous extent. If it is possible to do this, it considers the added extent space to simply be an enlargement of the last extent and does not call it a separate extent. This can sometimes cause confusion to the DBA when she is looking at tb/oncheck -pe outputs. Although the CREATE TABLE statement for the table may specify a NEXT SIZE the real extents may vary in size from that specified.

Another approach the engine takes to reduce extent usage is called extent doubling. Whenever the number of extents in a table reaches a multiple of 32, the NEXT SIZE parameter is automatically doubled. For example, in a table created using the default eight pages of EXTENT and NEXT sizes (16K on 2K page machines), the 32nd extent will be 16K in size. The 33rd extent will be 32K in size.

At any time during the life and growth of a table, you can alter the NEXT SIZE parameter by using the ALTER TABLE SQL statement. This will not affect existing extents in use, but the next extent to be allocated will be given the new size. This allows a certain amount of tuning to be done to table allocation schemas after the table is created.

Finally, if a dbspace begins to get full, the NEXT SIZE parameter will be limited to the actual available space in the dbspace. For example, suppose your NEXT size is 64K and when it comes time to allocate another extent to the tablespace, there is only 40K of contiguous space left in the tablespace. The final extent will be 40K, not 64K. This will allow all of the space in the dbspace to be used. If there are other noncontiguous spaces left in the tablespace, they will be used for further table needs in order of decreasing size.

Correct use of EXTENT SIZE and NEXT SIZE can allow creation of extremely large tables without choking the engine. There is one case, however, that can cause a problem. It involves the creation of implicit temporary tables. You will remember that these implicit temporary tables are created by the engine for various internal uses. These tables are created with default extents and they are not under user control. It is possible for these implicit temporary tables to reach their maximum number of extents at about 8 megabytes due to the limitations on the number of extents.

Reclaiming Space

One way to recover the extents allocated to a table is by dropping and re-creating the table. This can be done explicitly with commands to drop and re-create the table or implicitly as part of a command that re-creates the table as a side effect of doing something else.

Dropping and Re-creating the Table

This is the most flexible method of recovering space that is no longer needed by a table. The DBA simply creates a dummy table in the correct dbspace, copies the data from the original table into the new table, drops the original table, and then renames the new table the same as the old. Since the DBA is recreating the table totally, the new table can be in whichever dbspace is needed, with whatever FIRST SIZE and NEXT SIZE are desired. Sometimes, especially in systems with very large tables, this is not possible due to a shortage in disk space. In certain cases, attempting to recreate the table becomes a time-consuming and risky proposition that entails unloading the table to tape, re-creating it with the proper sizing, and finally reloading from tape. Due to my inherent distrust of tape devices, this is not something I will do lightly.

Implicitly Re-creating the Table

There are several SQL commands that re-create a table as a side effect of doing another job. In general, any Data Definition Language (DDL) statement can cause a table to be rebuilt. For example, the following command causes the table sample to be rebuilt.

ALTER TABLE sample ADD newfield integer

The table must be rebuilt because of the addition of the new column, newfield. Any such DDL command can cause the table to be rebuilt. Statements that create cluster indexes also force a physical rebuild of the table.

ALTER INDEX xxxxxxx TO CLUSTER;
CREATE CLUSTER INDEX  yyyyyyy ON tabname (colname);

Both these SQL statements make use of the main property of clustered indexes, that is, a clustered index rearranges the physical layout of the table. It essentially creates a new table with the rows rearranged in the order of the cluster index. This has the by-product of freeing up unused NEXT extents. It does not allow placing the table in another dbspace or altering the FIRST EXTENT or NEXT EXTENT numbers. It simply releases any extents that are not used.

When a new cluster index is created, or when an existing index is altered to cluster, the engine creates a temporary copy of the table that is dropped when the index is completed. If the table is very large, this could result in space problems during this process. Be sure that you have enough space available in your system to create the large temporary table needed. This table will be the same size as the existing table.

IDS systems have additional tools for reorganizing the tables, the most efficient of which is

ALTER FRAGMENT ON TABLE tablename INIT IN spacename

IDS systems beginning with version 7.30 also have the capability of doing an in-place table reorganization that does not require that there be enough space in the database to hold a temporary copy of the table. Note that an in-place reorganization of the table will not reorganize extents or reclaim used pages.

Altering a Table

Later versions of IDS (I'm not sure when this began. Check your own version.) have a feature that is quite useful for those cases when you need to add a column to a table. In earlier versions, any table alter required that the engine go through the entire table adding an additional column. In later versions, if you add a field to the end of the table, the alter table is almost instantaneous as the engine does not have to traverse the entire table. If you try to alter the same table by inserting a field between two existing fields, the traversal still occurs. This speedup of the alter table is quite helpful, especially in data warehouse systems that may have hundreds of millions of rows in a table.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.171.20