Defining a table space

This section provides more detailed information about three different types of table spaces—segmented, partitioned, and LOB. Each type of table space has its own advantages and disadvantages. This information will help you choose the table space that best suits your needs. This section also summarizes the process of defining table spaces.

DB2 divides table spaces into equal-sized units, called pages, which are written to or read from disk in one operation. You can specify page sizes for the data; the default page size is 4 KB.

Recommendation: Use partitioned table spaces for all table spaces that are referred to in queries that can take advantage of query parallelism. Use segmented table spaces for other queries. The explanations of the different table space types will help you decide.

General naming guidelines for table spaces

A table space name is an identifier of up to eight characters, which you can qualify with a database name. The default database name is database DSNDB04. The following table space name is typical:

ObjectName
Table spaceMYDB.MYTS

Coding guidelines for defining table spaces

DB2 stores the names and attributes of all table spaces in the SYSIBM.SYSTABLESPACE catalog table, regardless of whether you define the table spaces explicitly or implicitly.

Recommendation: For large tables, use a partitioned table space. For small tables, use a segmented table space.

Defining a table space explicitly

Use the CREATE TABLESPACE statement to create a table space explicitly. This statement allows you to specify the attributes of the table space. The following list introduces some of the clauses of the CREATE TABLESPACE statement that you will read about in this section.

LOB

Indicates that the table space is to be a large object (LOB) table space.

DSSIZE

Indicates the maximum size, in GB, for each partition or, for LOB table spaces, each data set.

FREEPAGE integer

Specifies how often DB2 should leave a page of free space when the table space or partition is loaded or reorganized. You specify that DB2 should set aside one free page for every integer number of pages. Using free pages can improve performance for applications that perform high-volume inserts or that update variable-length columns.

PCTFREE integer

Indicates the percentage (integer) of each page that DB2 should leave as free space when the table is loaded or reorganized. Specifying PCTFREE can improve performance for applications that perform high-volume inserts or that update variable-length columns.

COMPRESS

Specifies that data is to be compressed. You can compress data in a table space and thereby store more data on each data page. “Compressing data” on page 297 has information about data compression.

BUFFERPOOL bpname

Identifies the buffer pool that this table space is to use and determines the page size of the table space. The buffer pool is a portion of memory in which DB2 temporarily stores data for retrieval. You can read about the effect of buffer pool size on performance in “Caching data: The role of buffer pools” on page 294.

LOCKSIZE

Specifies the size of locks that DB2 is to use within the table space. DB2 uses locks to protect data integrity. Use of locks results in some overhead processing costs, so choose the lock size carefully. You can read about locking in “Improving performance for multiple users: Locking and concurrency” on page 301.

You can create segmented, partitioned, and LOB table spaces.

This section provides an overview of how to implement segmented and partitioned table spaces. “Defining large objects” on page 268 has information about LOB table spaces.

A segmented table space can hold one or more tables. Segmented table spaces hold a maximum of 64 GB of data. They might use one or more VSAM data sets. A table space can be larger if either of the following conditions is true:

  • The table space is a partitioned table spaces that you create with the DSSIZE option.

  • The table space is a LOB table space.

Table space pages are either 4 KB, 8 KB, 16 KB, or 32 KB in size. As a general rule, each DB2 database should have no more than 50 to 100 table spaces. Following this guideline helps minimize maintenance, increase concurrency, and decrease log volume.

Defining a table space implicitly

For small tables, you implicitly create a segmented table space when you use the CREATE TABLE statement to create a table and do not specify an existing table space name. When this occurs, DB2 performs the following tasks:

  • Generates a table space for you

  • Derives a table space name from the name of your table

  • Uses default values for space allocation and other table space attributes

One or more tables are created for segmented table spaces.

For large tables, you need to explicitly create a partitioned table space or a LOB table space before you create a table. One table is created for partitioned and LOB table spaces. If your CREATE TABLE statement does not specify a database name, DB2 uses the default database, DSNDB04, and the default DB2 storage group, SYSDEFLT.

You also need to explicitly create a table space when you define a declared temporary table. (You read about declared temporary tables in “Types of tables” on page 218.)

Segmented table spaces

A segmented table space is ideal for storing more than one table, especially relatively small tables. The pages hold segments, and each segment holds records from only one table.

Each segment contains the same number of pages, which must be a multiple of 4 (from 4 to 64). Each table uses only as many segments as it needs.

To search all the rows for one table, you don't need to scan the entire table space. Instead, you can scan only the segments that contain that table. Figure 7.2 shows a possible organization of segments in a segmented table space.

Figure 7.2. A possible organization of segments in a segmented table space


When you use the INSERT statement or the LOAD utility to insert records into a table, records from the same table are stored in different segments. You can reorganize the table space to move segments of the same table together. You can read more about reorganization and other techniques that influence performance of your DB2 subsystem in “Chapter 8. Managing DB2 performance.”

Coding the definition of a segmented table space

A segmented table space consists of segments that hold the records of one table. You define a segmented table space by using the CREATE TABLESPACE statement with a SEGSIZE clause. If you use this clause, the value that you specify represents the number of pages in each segment. The value must be a multiple of 4 (from 4 to 64). The choice of the value depends on the size of the tables that you store. Table 7.6 summarizes the recommendations for SEGSIZE.

Table 7.6. Recommendations for SEGSIZE
Number of pagesSEGSIZE recommendation
≤ 284 to 28
> 28 < 128 pages32
≥ 128 pages64

Another clause of the CREATE TABLESPACE statement is LOCKSIZE TABLE. This clause is valid only for tables that are in segmented table spaces. DB2, therefore, can acquire locks that lock a single table, rather than the entire table space. You can read about locking in “Improving performance for multiple users: Locking and concurrency” on page 301.

If you want to leave pages of free space in a segmented table space, you must have at least one free page in each segment. Specify the FREEPAGE clause with a value that is less than the SEGSIZE value.

Example: If you use FREEPAGE 30 with SEGSIZE 20, DB2 interprets the value of FREEPAGE as 19, and you get one free page in each segment.

You can read more about free space in “Using free space in data and index storage” on page 298.

If you are creating a segmented table space for use by declared temporary tables, you cannot specify the FREEPAGE or LOCKSIZE clause.

Characteristics of segmented table spaces

Segmented table spaces share the following characteristics:

  • When DB2 scans all the rows for one table, only the segments that are assigned to that table need to be scanned. DB2 doesn't need to scan the entire table space. Pages of empty segments do not need to be fetched.

  • When DB2 locks a table, the lock does not interfere with access to segments of other tables. (You can read more about locking in “Improving performance for multiple users: Locking and concurrency” on page 301.)

  • When DB2 drops a table, its segments become available for reuse immediately after the drop is committed without waiting for an intervening REORG utility job. (You can read more about this utility in “Determining when to reorganize data” on page 298.)

  • When all rows of a table are deleted, all segments except the first segment become available for reuse immediately after the delete is committed. No intervening REORG utility job is necessary.

  • A mass delete, which is the deletion of all rows of a table, operates much more quickly and produces much less log information. In some cases, you must delete each individual row.

  • If the table space contains only one table, segmenting it means that the COPY utility does not copy pages that are empty. The pages can be empty as a result of a dropped table or a mass delete.

  • Some DB2 utilities, such as LOAD with the REPLACE option, RECOVER, and COPY, operate on only a table space or a partition, not on individual segments. Therefore, for a segmented table space, you must run these utilities on the entire table space. For a large table space, you might notice availability problems.

  • Maintaining the space map creates some additional overhead.

Creating fewer table spaces by storing several tables in one table space can help you avoid reaching the maximum number of concurrently open data sets. Each table space requires at least one data set. A maximum number of concurrently open data sets is determined during installation. Using fewer table spaces means less time spent allocating and deallocating data sets.

Partitioned table spaces

You use a partitioned table space to store a single table. DB2 divides the table space into partitions. The partitions are based on the boundary values defined for specific columns. Utilities and SQL statements can run concurrently on each partition.

In Figure 7.3, each partition contains one part of a table.

Figure 7.3. Pages in a partitioned table space


Defining partitioned table spaces

In a partitioned table space, you can think of each partition as a unit of storage. You use the PARTITION clause of the CREATE TABLESPACE statement to define a partitioned table space. For each partition that you specify in the CREATE TABLESPACE statement, DB2 creates a separate data set. You assign the number of partitions (from 1 to 4096), and you can assign partitions independently to different storage groups.

The maximum number of partitions in a table space depends on the data set size (DSSIZE parameter) and the page size. The size of the table space depends on the data set size and on how many partitions are in the table space.

Characteristics of partitioned table spaces

Partitioned table spaces share the following characteristics:

  • You can plan for growth. When you define a partitioned table space, DB2 usually distributes the data evenly across the partitions. Over time, the distribution of the data might become uneven as inserts and deletes occur.

    You can rebalance data among the partitions by redefining partition boundaries with no impact to availability. You can also add a partition to the table and to each partitioned index on the table; the new partition becomes available immediately.

  • You can spread a large table over several DB2 storage groups or data sets. Not all the partitions of the table need to use the same storage group.

  • Partitioned table spaces let a utility job work on part of the data while allowing other applications to concurrently access data on other partitions. In that way, several concurrent utility jobs can, for example, load all partitions of a table space concurrently. Because you can work on part of your data, some of your operations on the data may require less time.

  • You can break mass update, delete, or insert operations into separate jobs, each of which works on a different partition. Breaking the job into several smaller jobs that run concurrently can reduce the elapsed time for the whole task.

    If your table space uses nonpartitioned indexes, you might need to modify the size of data sets in the indexes to avoid I/O contention among concurrently running jobs. Use the PIECESIZE parameter of the CREATE INDEX or the ALTER INDEX statement to modify the sizes of the index data sets.

  • You can put frequently accessed data on faster devices. Evaluate whether table partitioning or index partitioning can separate more frequently accessed data from the remainder of the table. You can put the frequently accessed data in a partition of its own. You can also use a different device type. You can read more about table and index partitioning later in this chapter.

  • You can take advantage of parallelism for certain read-only queries. When DB2 determines that processing will be extensive, it can begin parallel processing of more than one partition at a time. Parallel processing (for read-only queries) is most efficient when you spread the partitions over different disk volumes and allow each I/O stream to operate on a separate channel.

    You can take advantage of query parallelism. Use the Parallel Sysplex data sharing technology to process a single read-only query across many DB2 subsystems in a data sharing group. You can optimize Parallel Sysplex query processing by placing each DB2 subsystem on a separate central processor complex. You can read more about Parallel Sysplex processing in “Chapter 12. Data sharing with your DB2 data”.

  • Partitioned table space scans are sometimes less efficient than table space scans of segmented table spaces.

  • DB2 opens more data sets when you access data in a partitioned table space than when you access data in other types of table spaces.

  • Nonpartitioned indexes and data-partitioned secondary indexes are sometimes a disadvantage for partitioned tables spaces. You can read more about these types of indexes later in this chapter.

EA-enabled table spaces and index spaces

You can enable partitioned table spaces for extended addressability (EA), a function of DFSMS. The term for table spaces and index spaces that are enabled for extended addressability is EA-enabled. You must use EA-enabled table spaces or index spaces if you specify a maximum partition size (DSSIZE) that is larger than 4 GB in the CREATE TABLESPACE statement.

Both EA-enabled and non-EA-enabled partitioned table spaces can have only one table and up to 4096 partitions. Table 7.7 summarizes the differences.

Table 7.7. Differences between EA-enabled and non-EA-enabled table spaces
EA-enabled table spacesNon-EA-enabled table spaces
Holds up to 4096 partitions of 64 GBHolds up to 4096 partitions of 4 GB
Created with any valid value of DSSIZEDSSIZE cannot exceed 4 GB
Data sets are managed by SMSData sets are managed by VSAM or SMS
Requires setupNo additional setup

You can read more about this topic in “Assignment of table spaces to physical storage” on page 249.

Large object table spaces

LOB table spaces (also known as auxiliary table spaces) are necessary for holding large object data, such as graphics, video, or very large text strings. If your data does not fit entirely within a data page, you can define one or more columns as LOB columns.

LOB objects can do more than store large object data. You can also define LOB columns for infrequently accessed data; the result is faster table space scans on the remaining data in the base table. The table space scan is faster because potentially fewer pages are accessed.

A LOB table space always has a direct relationship with the table space that contains the logical LOB column values. The table space that contains the table with the LOB columns is, in this context, the base table space. LOB data is logically associated with the base table, but it is physically stored in an auxiliary table that resides in a LOB table space. Only one auxiliary table can exist in a large object table space. A LOB value can span several pages. However, only one LOB value is stored per page.

You must have a LOB table space for each LOB column that exists in a table. For example, if your table has LOB columns for both resumes and photographs, you need one LOB table space (and one auxiliary table) for each of those columns. If the base table space is a partitioned table space, you need one LOB table space for each LOB in each partition.

If the base table space is not a partitioned table space, each LOB table space is associated with one column of LOBs in a base table. If the base table space is a partitioned table space, each column of LOBs in each partition is associated with a LOB table space.

In a partitioned table space, you can store more LOB data in each column because each partition must have a LOB table space. Table 7.8 shows the approximate amount of data that you can store in one column for the different types of table spaces.

Table 7.8. Approximate maximum size of LOB data in a column
Table space typeMaximum (approximate) LOB data in each column
Segmented16 TB
Partitioned, with NUMPARTS up to 641000 TB
Partitioned with DSSIZE, NUMPARTS up to 2544000 TB
Partitioned with DSSIZE, NUMPARTS up to 409664000 TB

You can read more about the process of defining LOB table spaces in “Defining large objects” on page 268.

Recommendation: Consider defining long string columns as LOB columns when a row does not fit in a 32-KB page. Use the following guidelines to determine if a LOB column is a good choice:

  • Defining a long string column as a LOB column might be better if the following factors are true:

    - Table space scans are normally run on the table.

    - The long string column is not referenced often.

    - Removing the long string column from the base table will considerably increase the performance of table space scans.

  • LOBs are physically stored in another table space. Therefore, performance for inserting, updating, and retrieving long strings might be better for non-LOB strings than for LOB strings.

Assignment of table spaces to physical storage

You can store table spaces and index spaces in user-managed storage, in DB2-managed storage groups, or in SMS-managed storage. (A storage group is a set of disk volumes.) See the “IBM Storage Management Subsystem” sidebar for more information.

If you don't use SMS, you need to name the DB2 storage groups when you create table spaces or index spaces. DB2 will allocate space for these objects from the named storage group. You can assign different partitions of the same table space to different storage groups.

Recommendation: Use products in the IBM Storage Management Subsystem (SMS) family, such as Data Facility SMS (DFSMS), to manage some or all of your data sets. Organizations that use SMS to manage DB2 data sets can define storage groups with the VOLUMES(*) clause. As a result, SMS assigns a volume to the table spaces and index spaces in that storage group.

IBM Storage Management Subsystem

IBM offers the Storage Management Subsystem (SMS) family of products. A key product in the SMS family is the Data Facility Storage Management Subsystem (DFSMS). DFSMS can automatically manage all the data sets that DB2 uses and requires. If you use DFSMS to manage your data sets, the result is a reduced workload for DB2 database administrators and storage administrators.

You can experience the following benefits by using DFSMS:

  • Simplified data set allocation

  • Improved allocation control

  • Improved performance management

  • Automated disk space management

  • Improved management of data availability

  • Simplified data movement

DB2 database administrators can use DFSMS to achieve all their objectives for data set placement and design. To successfully use DFSMS, DB2 database administrators and storage administrators need to work together to ensure that the needs of both groups are satisfied.


Figure 7.4 shows how storage groups work together with the various DB2 data structures.

Figure 7.4. Hierarchy of DB2 structures


To create a DB2 storage group, use the SQL statement CREATE STOGROUP. This statement provides a list of volumes that DB2 can use.

After you define a storage group, DB2 stores information about it in the DB2 catalog. The catalog table SYSIBM.SYSSTOGROUP has a row for each storage group, and SYSIBM.SYSVOLUMES has a row for each volume in the group.

The process of installing DB2 includes the definition of a default storage group, SYSDEFLT. If you have authorization, you can define tables, indexes, table spaces, and databases. DB2 uses SYSDEFLT to allocate the necessary auxiliary storage. DB2 stores information about SYSDEFLT and all other storage groups in the catalog tables SYSIBM.SYSSTOGROUP and SYSIBM.SYSVOLUMES.

Recommendation: Use storage groups whenever you can, either explicitly or implicitly, by using the default storage group. In some cases, organizations need to maintain closer control over the physical storage of tables and indexes. These organizations choose to manage their own user-defined data sets rather than to use storage groups. Because this process is complex, this book does not describe the details.

Example: Consider the following CREATE STOGROUP statement:

CREATE STOGROUP MYSTOGRP
 VOLUMES (*)
 VCAT ALIASICF;

This statement creates storage group MYSTOGRP. The * on the VOLUMES clause indicates that SMS is to manage your storage group. The VCAT clause identifies ALIASICF as the name or alias of the catalog of the integrated catalog facility that the storage group is to use. The catalog of the integrated catalog facility stores entries for all data sets that DB2 creates on behalf of a storage group.

A few examples of table space definitions

You have read about different types of table spaces. This section provides two examples of table space definitions, which use the following clauses:

IN

Identifies the database in which DB2 should create the table space.

USING STOGROUP

Indicates that you want DB2 to define and manage the data sets for this table space. If you specify the DEFINE NO clause, you can defer allocation of data sets until data is inserted or loaded into a table in the table space.

PRIQTY integer

Specifies the minimum primary space allocation for a DB2-managed data set. This parameter applies only to table spaces that are using storage groups. The integer represents the number of kilobytes.

SECQTY integer

Specifies the minimum secondary space allocation for a DB2-managed data set. This parameter applies only to table spaces that are using storage groups. The integer represents the number of kilobytes.

Example definition for a segmented table space

The following CREATE TABLESPACE statement creates a segmented table space with 32 pages in each segment:

CREATE TABLESPACE MYTS
  IN MYDB
  USING STOGROUP MYSTOGRP
    PRIQTY 30720
    SECQTY 10240
  SEGSIZE 32
  LOCKSIZE TABLE
  BUFFERPOOL BP0
  CLOSE NO;

Example definition for an EA-enabled partitioned table space

The following CREATE TABLESPACE statement creates an EA-enabled table space, SALESHX. Assume that a large query application uses this table space to record historical sales data for marketing statistics. The first USING clause establishes the MYSTOGRP storage group and space allocations for all partitions:

CREATE TABLESPACE SALESHX
  IN MYDB
  USING STOGROUP MYSTOGRP
    PRIQTY 4000
    SECQTY 130
    ERASE NO
  DSSIZE 16G
  NUMPARTS 48
   (PARTITION 46
     COMPRESS YES,
    PARTITION 47
     COMPRESS YES,
    PARTITION 48
     COMPRESS YES)
  LOCKSIZE PAGE
  BUFFERPOOL BP1
  CLOSE NO;

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.21.248.47