Components of the Informix Dynamic Server

Informix Dynamic Server is composed of three separate components: the process component, the shared memory component, and the disk component. The disk component is almost identical to that of OnLine. There are some differences in the shared memory component, mainly due to structures that are put into place to handle parallelism. The major differences between the earlier version and the latest version lie in the process component, which was drastically redesigned.

Process Component

OnLine Version 5.X is a two-process architecture, in which a client process communicates with a sqlturbo process that does the processing work in a linear, single-threaded manner. Each client process in OnLine has a corresponding sqlturbo process, which leads to performance limitations as more and more clients attach to the server.

Informix Dynamic Server did away with the one-to-one correspondence between front-end and back-end processors. Instead, IDS is configured with multiple virtual processors running multiple threads.

Virtual Processors

A virtual processor (VP) is an operating system process that can manage multiple threads. Each VP runs as a separate oninit process within the operating system and can perform tasks for multiple client programs. This allows for a great reduction in the number of operating system processes that run on the system. It also allows a client process to call upon multiple VPs to do its work. Allowing the virtual processors to manage thread scheduling and synchronization is more efficient than allowing the operating system to manage the threads. Removing this control from the operating system and giving it to the virtual processors also reduces conflict between the database processes and other applications running on the machine.

Virtual processors are segmented by function into VP classes. The DBA can control the number of VPs within each class through entries in the $ONCONFIG file. Informix Dynamic Server defines the following VP classes:

  • CPU All user processing threads. These are usually the most active virtual processors and handle the lion's share of the database activities. CPU VPs should be kept as active as possible so that they are neither paged out nor allowed to sleep. CPU VPs are not allowed to run any tasks that call for built-in wait states, such as waiting for disk i/o or for any type of message. These types of tasks are handled by other VP classes.

  • AIO These are asynchronous input/output processes. They perform all disk input and output to the database chunks as well as writes to the message log. If your system is configured to use KAIO (kernel asynchronous input/output), disk i/o to raw disk devices is handled by KAIO threads instead. These threads use operating system calls to access the data and often run faster than the AIO VPs. Even with KAIO configured, i/o to operating system files is still handled by the AIO VPs.

  • PIO Virtual processors of this class handle writes to the physical log.

  • LIO Virtual processors of this class handle writes to the logical logfiles.

  • MSC This class has only one VP, and it handles various miscellaneous tasks such as managing licenses and authenticating UNIX users.

  • ADT This class handles auditing tasks if auditing is set up.

  • SOC Runs communication protocols using sockets. Note: An individual system may use SOC VPs or TLI VPs, but not both.

  • TLI Runs TLI communications protocols if TLI networking is in use.

  • SHM Runs shared memory communication threads if one of the shared memory protocols is established and if the NETTYPE ipcshm parameter specifies NET. Otherwise, shared memory listener threads run in CPU VPs.

  • ADM Runs the Informix internal timer.

  • OPT Used to transfer blob data to and from optical disks.
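The mix of VPs in each class is set by the DBA in the $ONCONFIG file. A hypothetical fragment for a small multiprocessor machine might look like the following (the parameter names follow the later 7.3-style VPCLASS syntax; earlier releases used individual parameters such as NUMCPUVPS and NUMAIOVPS, and all of the values here are invented for illustration):

```
# $ONCONFIG fragment -- illustrative values only
VPCLASS cpu,num=4,noage      # four CPU VPs, no priority aging
VPCLASS aio,num=2            # two AIO VPs for cooked-file i/o
NETTYPE ipcshm,1,50,CPU      # shared memory connections polled in a CPU VP
NETTYPE soctcp,1,100,NET     # socket connections polled in a NET (SOC) VP
```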

Shared Memory Component

Shared memory in both OnLine and IDS is used mainly to cache data pages from disk. Since access to RAM is much faster (by a factor of roughly 50 to 1) than access to the hard disk system, performance is greatly enhanced every time the Informix engine is able to read a page from shared memory instead of having to go to the disk. Much of the effort of tuning Informix systems is a quest to improve the cache hit ratio of database reads.
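That cache hit ratio is simply the share of page reads satisfied from the buffer pool rather than from disk. A minimal sketch of the arithmetic, using invented counter values of the kind reported by onstat -p (the bufreads and dskreads figures are not from any real instance):

```shell
#!/bin/sh
# Invented example counters of the kind "onstat -p" reports:
bufreads=1000000   # page requests satisfied through the buffer pool
dskreads=45000     # page requests that had to go to disk

# %cached = 100 * (bufreads - dskreads) / bufreads
awk -v br="$bufreads" -v dr="$dskreads" \
    'BEGIN { printf "read cache rate: %.2f%%\n", 100 * (br - dr) / br }'
```

A read cache rate in the high 90s is the usual tuning goal for an OLTP system; the invented counters above work out to 95.50%.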

Shared memory in IDS is divided into three separate components: the resident component, the virtual component, and the message component. The major difference between OnLine and IDS lies in the virtual portion of shared memory, as OnLine does not have a virtual component.

Resident Portion of Shared Memory

The resident portion of IDS's shared memory contains the same structures as the earlier OnLine product. The resident portion is static; its size does not change unless changes are made in the onconfig file and the engine is shut down and restarted. In most online transaction processing (OLTP) systems, the resident portion of shared memory is the largest portion. In systems that run a lot of decision support (DSS) queries, you may see the virtual portion become the largest part of shared memory.

The main function of the resident portion is to handle caching of disk data. In some Informix ports (check your release notes), the onconfig parameter RESIDENT can be set to a value of 1, which guarantees that the resident portion is never paged out. Since this is a very critical portion of shared memory that receives a lot of use, any paging can be detrimental to performance. One of the negatives of setting RESIDENT to 1 is that other applications may be adversely affected if Informix grabs and holds much of the shared memory. For a system that runs only the Informix engine, this may be a good thing, but it may not endear the DBA to other users who are running non-Informix applications on the same machine. Yet another reason to have Informix running on a machine of its own.
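On ports that support forced residency, this is a one-line entry in the onconfig file (shown here as a sketch; check your release notes before relying on it):

```
RESIDENT 1    # 1 = never page out the resident portion; 0 = allow paging
```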

For a more detailed explanation of this portion of shared memory, read the chapter on shared memory in the OnLine architecture. The concepts and execution are the same.

Buffer Pools

The buffer pools are usually the largest portion of the resident shared memory. These are the buffers that are established with the BUFFERS configuration parameter in the onconfig file. As in OnLine, the buffers are used to cache data that is read from disk. Each buffer contains one page of data that has been read from disk. Buffers use a least recently used (LRU) algorithm whenever a buffer needs to be recycled. This works exactly as it does in OnLine.
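Because each buffer holds exactly one page, the memory consumed by the buffer pool is just BUFFERS multiplied by the page size. A small sketch of that arithmetic, with invented values (a BUFFERS setting of 200000 and a 2K page size, not recommendations):

```shell
#!/bin/sh
# Invented example values -- not recommendations:
buffers=200000   # BUFFERS parameter from the onconfig file
pagesize=2       # page size in KB: 2 on most ports, 4 on some

echo "buffer pool: $((buffers * pagesize)) KB"
echo "buffer pool: roughly $((buffers * pagesize / 1024)) MB"
```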

Internal Tables

Internal tables are virtual tables that reside in the resident portion of shared memory. They hold data related to the setup, configuration, and status of the Informix engine. There are internal tables for dbspaces, chunks, tablespaces, locks, and logfile locations. It is these internal tables that are accessed by monitoring utilities such as onstat. Since these are virtual tables, they don't really exist as tables; you cannot see them in any way except through the monitoring utilities. These tables are similar across the engine versions, with the exception that the later IDS has more features and thus more internal tables than OnLine. IDS systems maintain SMI tables for all shared memory data structures.

Physical Log Buffer

There are two physical log buffers in the resident portion. These buffers are configurable in size by the PHYSBUFF parameter in the onconfig file. The value of PHYSBUFF is the size in kilobytes of each buffer.

Logical Log Buffer

There are three logical log buffers in the resident portion. They are configured through the LOGBUFF parameter, which represents the size in kilobytes of each of the three buffers.
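Taken together, the log buffers occupy a fixed slice of the resident portion: two physical log buffers of PHYSBUFF kilobytes each and three logical log buffers of LOGBUFF kilobytes each. A sketch of the arithmetic with invented onconfig values:

```shell
#!/bin/sh
# Invented example values from a hypothetical onconfig file:
physbuff=32   # PHYSBUFF: KB per physical log buffer (two buffers)
logbuff=32    # LOGBUFF: KB per logical log buffer (three buffers)

echo "physical log buffers: $((2 * physbuff)) KB"
echo "logical log buffers:  $((3 * logbuff)) KB"
echo "total:                $((2 * physbuff + 3 * logbuff)) KB"
```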

Virtual Portion of Shared Memory

The virtual portion of shared memory has no analog in the earlier Informix versions. This section of memory can contract or expand according to the needs of the engine and the desires of the DBA. IDS is officially known as Informix Dynamic Server, and it is this portion of memory that gives rise to the "Dynamic" in the name.

Session Data

Information about different users and different connections to an IDS database.

Dictionary Cache

As tables are used in the IDS database, their structural information is stored in this area.

Stored Procedure Cache

As stored procedures are run, their preparsed and precompiled contents are saved in this cache for later access.

Thread Information

Information about various threads running inside of virtual processors.

Sort Space

Used for in-memory sorts in some ORDER BY and GROUP BY operations, dynamic hash joins, temporary indexes, and index build operations.

Big Buffers

Big buffers are data areas that are used to perform readaheads in order to maximize the efficiency of sequential scans.

Message Portion of Shared Memory

The message portion of shared memory is used on systems where clients connect through one of the shared memory communication protocols rather than streams or sockets.

Disk Component of Both IDS and OnLine

Both the OnLine and IDS engine series handle the low-level organization, access, and structures of the disks in similar ways. IDS extends some of the concepts of logical and physical layout to embrace a concept known as fragmentation. For many of us, the term fragmentation has a negative connotation, as in "the disk is highly fragmented and doesn't perform well." In the Informix IDS world, though, fragmentation has a much kinder, gentler interpretation. IDS defines fragmentation as intelligently placing data across multiple disks for maximum performance. As the OnLine product matured into IDS, Informix found that by smartening up this concept of fragmentation, it was able to take more advantage of multiple CPU systems. In this way the fragmentation concepts are an extension to the disk component in OnLine. Fragmentation has also been called "horizontal partitioning of data" by C.J. Date.

To understand how the engines deal with the data on the disk, the DBA needs to understand the differences between physical and logical storage. Physical storage refers to the actual devices or files that Informix uses to talk to the hardware by way of the UNIX operating system. Logical storage refers to the method in which the database views the data structure internally.

Units of Disk Storage

Both IDS and OnLine can address the disks directly, bypassing the UNIX filesystem. This results in efficiencies because the Informix disk access routines can be much more specialized than the UNIX routines. These routines are designed strictly for database access, while the UNIX routines have to be able to handle everything from file input and output (i/o) to device i/o. The engine can also take advantage of the fact that these raw devices are large, contiguous blocks of disk, which allows the Informix database engines to use more efficient i/o methods such as big reads and direct memory-to-disk transfers. The disk subsystem is just another resource to the system.

In IDS versions prior to 7.3, NT systems do not utilize this concept of raw disks. All database chunks on NT are stored as files visible to the operating system. This is one of the major differences between IDS on UNIX and IDS on NT. IDS versions 7.3 and later allow NT systems to use raw disk devices.

Disk space is made available to either engine through the tb/onmonitor or tb/onspaces utilities. Using tb/onmonitor is somewhat safer for the DBA, as tb/onmonitor will respond with verification prompts along the lines of "Do you really want to do something as stupid as this?"

Throughout this book, we will be using the tb/onmonitor command because it is simple and intuitive and because it provides a menuing interface. Most, if not all, of the functions of tb/onmonitor can be called individually by using the underlying programs, and the reader should realize that most of the work can be done directly from the command line should it be necessary.

Don't get too comfortable with tb/onmonitor. When IDS came out on Windows NT, the graphical utilities completely replaced onmonitor. Most of the functions can be performed through the graphical user interface (GUI), but in several instances you have to go to the underlying utilities such as onspaces or onparams.

Tb/onmonitor allows the DBA to assign parts of a raw UNIX device or a UNIX file to a dbspace. It will also allow the DBA to specify offsets into the device to be used. An offset is a certain number of kilobytes that the engine is instructed to skip at the beginning of a device. Informix does not place any data within the offset. In some systems, the formatting of a disk will place system data in the first few tracks of a device. If Informix writes over this, the disk format is blown. Specifying an offset allows the DBA to map around areas such as this.
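From the command line, the same thing is done with the -o (offset) flag of onspaces. As a sketch (the dbspace name, device, and sizes are invented), creating a dbspace that skips the first 16 kilobytes of a device might look like:

```
onspaces -c -d datadbs1 -p /dev/chunk1 -o 16 -s 100000
```

Here -c creates a new space, -d names the dbspace, -p gives the device path, -o is the offset in kilobytes, and -s is the size in kilobytes.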

Using offsets also can allow OnLine to place several chunks within the same UNIX device. If separate data areas are needed, this can be a boon to the DBA, as reformatting a hard disk is often quite a chore. Even if it were easy, the wise DBA will avoid making such UNIX-level changes lightly.

Chunks

The actual unit in which disk storage is made available for overall database access is the chunk. Chunks are sections of the hard disk. Chunks can be either actual UNIX files or portions of the disk outside the UNIX file system. Using the non-UNIX, or raw, chunk is preferable to using UNIX files because the additional overhead of managing the UNIX file system will slow the system down.

Note that both raw devices and devices holding UNIX filesystems will show up in a listing of /dev. The difference is that the raw devices are never mounted and do not show up in UNIX df output. They also never have a mkfs done on them, as it is the mkfs command that creates the UNIX file system.

For the most part, an Informix engine is happy with a chunk if it can identify it through a UNIX device. OnLine does not know whether the device is a single disk partition or a virtual device composed of multiple disks, perhaps striped and mirrored. It's all the same to Informix. Anything that the DBA can do to speed up access to the chunk will help Informix. If your UNIX supports striped or mirrored disks or RAID (redundant arrays of inexpensive disks) and if you have the option of using them, they will provide payoffs in the areas of performance and fault tolerance. There's seldom any reason not to use UNIX disk striping, as striping generally provides overall improvements in disk performance. Mirroring and RAID can greatly increase your fault tolerance, but there is often a cost to balance against the improvements. Mirroring will usually slow down disk writes a bit, whether you are using the operating system's or Informix's mirroring. Using a RAID can also slow down writes to a disk. For areas of high write activity, such as the physical and logical logfile spaces, the DBA may want to explore using lower RAID levels to optimize write speeds. Most popular for database work is RAID5, which stripes data across all the disks and provides the capability to lose any one disk in a RAID cluster without losing data. RAID5 is usually as fast as or faster than a single disk in read performance, but can be costly in write performance.

The speed and efficiency of disk access is a powerful indicator of overall database speed. A little effort spent at the time the system is installed will pay you back many times over in better performance.

There are both advantages and drawbacks to using raw devices or operating system files. You need to be aware of these limitations and risks before committing to either type of storage, because changing from one to the other later can be complicated. Advantages of raw disk over cooked files are:

  • Raw supports KAIO (kernel asynchronous i/o) on ports where the operating system uses it. KAIO often represents a significant performance increase.

  • Raw devices won't be destroyed when some well-meaning new system administrator decides to clean out some of the "large junk files" that happen to be in your data directory. Raw disks are harder to create, but they are much less liable to be destroyed by someone who doesn't know what he's doing.

  • Raw disk devices are totally controlled by the Informix processes, so Informix knows their status.

  • Raw disk devices do not carry the relatively large operating system overhead that files in a UNIX filesystem do.

On the other hand, there are several advantages to using the cooked files over the raw devices:

  • Cooked files are easier to deal with because they can be seen by normal operating system utilities.

  • With cooked files, you can let the operating system tell you how much space is free. With raw devices, it's hard to see them in the utilities, and it's easy to make mistakes regarding them.

  • On NT versions prior to 7.30, for example, you're forced to use the cooked files. There's no choice. IDS versions later than 7.30 give you the option of using raw devices on NT systems.

  • With cooked files, you're not likely to have two chunks overlap each other or leave a big hole in the middle of the device that nobody knows is there. These are problems that plague the layout of raw devices.

  • Having the data in operating system files also adds other possibilities to database backup, in that normal backup procedures can "see" the data. Be careful, however, to have the database engine completely shut down during such backups. If your backup program locks a file and Informix tries to read it, the chunk will be marked offline. This can range from a mere nuisance in later versions of IDS, to an event requiring a call into the system by Informix Support.

Usually, performance is a deciding factor, and raw spaces are preferable if you have a choice. If you have other factors involved, it may make sense to look at cooked files. The two can coexist, however. Often, creating a database in a hastily created cooked file dbspace is the fastest, cleanest method of solving a problem. There have been reports of cooked filesystems performing as well as raw filesystems in more recent UNIX implementations. The implementation on NT relies solely on Microsoft's NTFS filesystem, and the performance has been shown to match the performance of UNIX systems.

The only real rule here is not to take everything you hear or read at face value. Even though it may go against "conventional wisdom," try out your options if you have the chance and time. Systems are evolving so fast that what may be true one day is hopelessly outdated the next. Don't be afraid to experiment.

The Informix manuals recommend that the actual names of the partitions that comprise a chunk not be used, but that they be linked to a more descriptive name. Thus, you may have a chunk that is physically known to UNIX as /dev/rdisk23h but known to Informix as /dev/chunk1. The DBA has used the UNIX ln command as follows:

ln /dev/rdisk23h /dev/chunk1    # link the raw device to a descriptive name
chown informix /dev/chunk1      # must be owned by user informix
chgrp informix /dev/chunk1      # ...and by group informix
chmod 660 /dev/chunk1           # read/write for informix only

Using links rather than actual device names serves more than a mnemonic purpose. If you ever lose a disk or have to rearrange your system's disk layout, the links will be much easier to set up than the actual devices. For instance, in the above example suppose you have a disk crash on /dev/rdisk23h. If you have called your chunk /dev/rdisk23h, you will be down until you physically replace /dev/rdisk23h. The chunk name and its path are stored within the chunk reserved page for the OnLine system. When the system comes up, it will always look for chunks and paths found on this reserved page.

If you have used links, you can scrounge around and find another device to link to /dev/chunk1. This is important because disk problems usually lead to the need to restore your system from a tape archive. The tape archive program requires that the device names of the target system match the source system. The sizes of the target system's chunks must be at least as large as those of the corresponding source system's. They can be larger.

The ownership and group ownership are critical. If you have it wrong in OnLine systems, you will often have no problems as long as user informix is using the system, but you will get error messages when another user tries to use the system. IDS systems will not even start up if the permissions and ownerships are not correct.

Notice also that the UNIX /dev/rdisk*** devices are used. These are character-based devices, not block-based devices such as the /dev/disk*** devices. If you have accidentally created your raw devices on block devices, you may find that the initial disk initialization takes a long time and that subsequent accesses to the disk take much longer than usual. This is because of the additional overhead of using the UNIX kernel services: going through the block devices routes i/o through the UNIX kernel buffer cache, while the character devices bypass it. To guarantee that writes are flushed to disk in a timely manner, you must use the character devices.

Chunks can be further divided into tablespaces and pages. Many chunks can be joined together to form dbspaces, which can be joined together to form databases. Let's look at the differences between these terms.

Pages

The smallest, most atomic unit of disk storage is the page. Page sizes are either 2K or 4K, depending upon your architecture. This page size is not changeable. Most of the reports generated by tbcheck and the other utilities express size in numbers of pages. SQL CREATE TABLE statements refer to sizes in kilobytes. Remember the difference.
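Converting between the two units only requires knowing the page size. A trivial sketch, using an invented report figure of 500 pages on a 2K-page system:

```shell
#!/bin/sh
# Invented example: a tbcheck report shows a table using 500 pages.
pages=500
pagesize=2   # KB per page: 2K on most architectures, 4K on others

echo "$pages pages = $((pages * pagesize)) KB"
```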

Extents

Extents are groups of pages that are contiguous on the disk. The first extent is allocated to a tablespace at table creation time. When a table is created in the database, the table creator can specify the FIRST SIZE and NEXT SIZE in the CREATE TABLE statement. The tablespace is initially created with one extent with a size of FIRST SIZE (in kilobytes, not pages). As the table grows, additional extents of size NEXT SIZE (in kilobytes) are allocated. Although the space within an extent is contiguous, when additional extents are allocated they are not necessarily contiguous with the previous extent(s). The Informix engines attempt to allocate subsequent extents contiguously, but if there is not enough available space to add an extent of NEXT SIZE, the database engine will look elsewhere for the space. When two extents are created contiguously, they are merged into a single extent. Informix defaults to four pages FIRST SIZE and four pages NEXT SIZE if the table builder does not specify the sizes.
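In the SQL syntax, what is described here as FIRST SIZE appears as the EXTENT SIZE clause. A hypothetical example (the table, dbspace, and sizes are all invented):

```
CREATE TABLE cust_history
  (
    cust_id   INTEGER,
    hist_date DATE,
    notes     CHAR(80)
  )
  IN datadbs1         -- dbspace to hold the tablespace
  EXTENT SIZE 512     -- first extent: 512 kilobytes
  NEXT SIZE 256;      -- each additional extent: 256 kilobytes
```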

Tablespaces

A tablespace is a collection of all of the extents that are allocated for a given table. These extents hold the table's data and indexes and any free space that may be in the table. A tablespace can gain extents, but it never loses them unless the table is rebuilt, either as an explicit rebuild or by use of an ALTER INDEX TO CLUSTER statement or by issuing any ALTER TABLE statement that affects the structure of the table. Another way of rebuilding a table to consolidate extents is to use the ALTER FRAGMENT ON TABLEā€¦INIT syntax in IDS. This is often the fastest way of consolidating a table.

Dbspaces

A dbspace is a collection of chunks. Additional chunks can be added to a dbspace at any time if there is adequate disk space and an available UNIX device or file, and provided that there are adequate slots in the chunks pseudotable. Adding chunks can be done without bringing the database down.

The chunks are known as the primary chunk for the first chunk in a dbspace and as secondary chunks for any additional chunks added. There are also mirror chunks if you have chosen to use Informix mirroring. Databases and tables are created in particular dbspaces. Tables cannot span multiple dbspaces, but databases can span multiple dbspaces. If you run out of space in a dbspace, you must add another chunk. In a pinch, create a chunk out of a cooked UNIX file until you can get your system cleaned out.

It is important to note that once added, a chunk cannot be dropped in OnLine systems. The DBA has to drop the entire dbspace to clear up its chunks. IDS allows you to drop unused chunks from a dbspace.

Blobspaces

A blob is a "binary large object." Blobs are used to store text or binary data. Blob storage can be used for such items as compressed picture files, encoded voice data, word processing documents, and the like. Blob data can be stored either in ordinary tablespaces or in dedicated blobspaces. In most cases, using blobspaces is more efficient, since the blobspace creator can specify the page sizes for the blobspace. Using larger page sizes for blobspaces is a good way to reduce the number of locks required for updating the blobs. This will also allow the engine to handle the large I/O requests that are used to access blobs much more efficiently. Thus, blobspaces can be tailor-made to hold particular types of blob data.
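Blobspaces are created with tb/onspaces as well; the -g flag sets the blobpage size in units of disk pages. A sketch with invented names and sizes:

```
onspaces -c -b blobdbs1 -g 32 -p /dev/chunk4 -o 0 -s 200000
```

Here each blobpage spans 32 disk pages, so on a 2K-page system a single blobpage holds 64 KB of blob data that can be locked and transferred as one unit.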
