Migration and expiring data and indexes
IBM Content Manager OnDemand (Content Manager OnDemand) provides multiple methodologies for expiring report data (documents) and their indexes. In this chapter, we describe the overall lifecycle of report data, including loading, storage, migration, and expiration.
In this chapter, we cover the following topics:
Expiring data on OnDemand for i
10.1 Introduction
In this chapter, unless explicitly stated otherwise, the term “data” refers to the report data, the extracted documents or segments, their related indexes, and the extracted resources.
A Content Manager OnDemand system logically stores data in application groups. An application group is defined by the Content Manager OnDemand administrator. It consists of data that has the same indexing, data storage, and expiration requirements. The application group definition also specifies where the report and document data are stored, how long the data is stored, and how the data expires. The method or methods that can be used to expire the data are a function of the application group parameters that are defined before the data is loaded into Content Manager OnDemand. In a Content Manager OnDemand system, data typically goes through a lifecycle of loading, storing, migration, and an expiration process.
10.2 Loading and storing the data
The Content Manager OnDemand architecture allows the control and management of the data throughout its lifecycle. The data lifecycle begins with running an efficient load process. Each load process invocation ingests report data for a specified application group.
During a load process, Content Manager OnDemand stores report (document) data, its resources, and index data, as shown in Figure 10-1.
Figure 10-1 Data and index storage locations
The Content Manager OnDemand load process identifies, segments, and compresses groups of documents into storage objects that are then stored in the Content Manager OnDemand archive, as illustrated in Figure 10-1. To improve the efficiency of the storage process, Content Manager OnDemand aggregates the stored documents (typically a few kilobytes in size) into storage objects. This aggregation provides efficient, high-volume storage, retrieval, and expiration performance.
The object size is defined by clicking Advanced on the Storage Manager tab of the Application Group window. The object size is the size of a storage object in kilobytes (KB). By default, Content Manager OnDemand segments and compresses report data into 10 MB storage objects. For most use cases, the default value is appropriate. Valid values are 1 KB - 150 MB.
 
Object size value: Exercise caution when you change the object size value. Specifying too large or too small a value can adversely affect performance when you load data.
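The effect of the object size on the number of storage objects can be estimated with a short calculation. The following Python sketch is illustrative only and is not part of Content Manager OnDemand. The function names are hypothetical, 1 MB is assumed to equal 1024 KB, and the model treats the object size as the amount of document data that is placed in each storage object, ignoring compression details.

```python
import math

MIN_OBJECT_SIZE_KB = 1                # lower bound stated in the text
MAX_OBJECT_SIZE_KB = 150 * 1024       # 150 MB, assuming 1 MB = 1024 KB
DEFAULT_OBJECT_SIZE_KB = 10 * 1024    # 10 MB default


def validate_object_size(size_kb):
    """Return size_kb if it is inside the documented 1 KB - 150 MB range."""
    if not MIN_OBJECT_SIZE_KB <= size_kb <= MAX_OBJECT_SIZE_KB:
        raise ValueError("object size must be between 1 KB and 150 MB")
    return size_kb


def estimate_storage_objects(doc_count, avg_doc_kb,
                             object_size_kb=DEFAULT_OBJECT_SIZE_KB):
    """Rough estimate of how many storage objects one load produces."""
    validate_object_size(object_size_kb)
    if doc_count == 0:
        return 0
    total_kb = doc_count * avg_doc_kb
    return math.ceil(total_kb / object_size_kb)


if __name__ == "__main__":
    # 100,000 documents of about 5 KB each with the default object size
    print(estimate_storage_objects(100_000, 5))
```

With a much smaller object size, the same load produces far more objects for the storage manager to track, which is one reason to change the default with care.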
The storage objects are stored in storage sets. The storage sets contain one or more primary storage nodes. The storage node points to the location where the data is stored, which can be cache, the storage manager (Tivoli Storage Manager, object access method (OAM), or Archive Storage Manager (ASM)), or a combination.
The primary storage nodes can be on one or more object servers. When the Load Type is Local, Content Manager OnDemand loads data on the server on which the data loading program runs in the primary storage node with the Load Data property specified. If the Load Type is Local, and the storage set contains primary nodes on different object servers, you must select the Load Data check box for one primary node on each object server.
The storage set must support the number of days that you plan to maintain reports in the application group. For example, if you must maintain reports in archive storage for seven years, the storage set must identify a storage node (or migration policy on an IBM i server) that is maintained by ASM for seven years.
A detailed description of adding storage sets and storage nodes is in Chapter 5, “Storage management” on page 89 and the related OnDemand Administrative Guide.
10.2.1 Storing the report (document) data
To improve efficiency and scalability, stored documents are embedded within storage objects. The storage objects are then stored in cache or a storage manager (OAM, Tivoli Storage Manager, or ASM). The storage objects are eventually expired from the system based on values that are defined by the Content Manager OnDemand administrator. In this section, we describe each scenario and how it is implemented. The parameters that are described in this section are on the Storage Manager tab of the Application Group window unless otherwise specified.
Three sets of data are stored when you load a report:
Index data, which is extracted by the indexing program and used by the search process
Resources, such as an overlay and fonts, which are used to customize the viewed data
Documents (or report segments) that will be viewed
Figure 10-2 on page 222 shows the datasets and illustrates four scenarios of their storage and expiration.
Figure 10-2 Data, resource, and index storage and expiration scenarios
Scenario 1: Cache only, then expiration
In this scenario, the storage object is stored to cache only and it is expired from cache after a predetermined period. Typically, this methodology is employed under the following circumstances:
The life of the data is short, and hierarchical storage management (HSM) is not necessary.
The life of the data is long, and a backup process exists for the data in cache.
The cache device is large enough to hold the total archived data, and the cache device is reliable and performs well.
This method is enabled by selecting a cache-only storage set and entering a number in the Cache Data for __ Days field.
When you select a cache-only storage set, Content Manager OnDemand automatically sets Migrate Data from Cache to No and sets the Expire in __ Days field to the same value as the Cache Data for __ Days field. (The default value is 90 days.)
Selecting a cache-only storage set requires the creation of backup and data management systems that are external to the Content Manager OnDemand system.
 
Cache-only storage: If the storage set contains cache-only storage nodes, ensure that the Cache Data value and the Life of Data and Indexes value are the same. Otherwise, the add or update operation cannot be completed.
Scenario 2: Cache, then migration to storage, and then expiration
In this scenario, the storage object is first stored to cache for a short period, after which it is migrated to a storage manager for long-term storage.
Typically, this methodology is employed under the following circumstances:
Most of the data access occurs during the initial period. After that period, the data is infrequently accessed, if ever. So, after this initial period, the data is migrated to the storage manager.
Retrieving the data from cache can be faster than retrieving it from the storage manager. This performance advantage can occur if the storage manager is on a device that is separate from the Content Manager OnDemand object server, or if the storage manager is local but the storage device is relatively slow, such as tape or an optical disk.
Migrating data from cache
This function, which can be accessed by clicking Advanced on the Storage Manager tab of the Application Group window, determines how long the data is kept in cache before it is migrated to archive storage (on a potentially slower archive storage device).
The data needs to be kept on a high-performance storage device for the period during which it is retrieved frequently. The storage set must support the type of media that is required to hold reports that are stored in the application group. For example, if you must maintain reports in cache storage for 90 days and in archive storage for seven years, the storage set must identify a storage node (or migration policy) that causes ASM to maintain the data for seven years, and you must select Cache Data for __ Days and enter 90 in its field.
From a user’s perspective, no procedural difference exists in retrieving the data from either cache or archive storage. The only user-perceivable difference is the response time. Various archive storage mechanisms provide different performance profiles. For example, when you use OAM and the data is stored in DB2 tables on disk, the response time is as fast as the cache response time. The main difference in response time is based on the type of disk that is used by either method. Conversely, if the OAM data is stored on optical disks or tape, the response time is increased dramatically. If you use a network-attached Tivoli Storage Manager server, the retrieval rates (throughput and response times) are governed by the Tivoli Storage Manager device and the TCP/IP connection to that device.
Typically in a z/OS environment, data is not stored in cache. Content Manager OnDemand for z/OS customers typically use OAM as their storage manager. OAM supports storing the data directly in DB2 where the storing and retrieval rates are exceptionally fast, which eliminates the need to maintain and monitor cache file systems in the z/OS file system (zFS) or the hierarchical file system (HFS).
Scenario 3: Storage manager only, then expiration
The storage object is stored directly to the storage manager. Typically, this methodology is employed under the following circumstances:
The performance of the storage manager equals the performance of the local file system, which implies that the storage manager stores data to a relatively fast device, such as local disk.
Hierarchical storage management is beneficial. An example is z/OS systems where storing directly to OAM is the most popular solution.
If you do not need to maintain reports in cache storage, select a storage set that identifies a storage node (or migration policy) that is maintained by ASM and set Cache Data to No. Content Manager OnDemand automatically sets Migrate Data from Cache to When Data is Loaded.
Scenario 4: Both cache and storage manager, then expiration
The storage object is stored directly to both cache and the storage manager. After a short period, the data is expired from cache. Then, after a much longer period, the data is expired from the storage manager. Typically, this methodology is used under the following circumstances:
The cache file system allows more efficient data retrieval.
The data needs to be kept for a longer period.
The hierarchical storage management (or other features) of the storage manager is required.
The Cache data field determines whether Content Manager OnDemand stores data in cache storage. If the storage set is a cache-only storage set, Yes is the only selection. If the storage set is an archive manager-controlled storage set (OAM, Tivoli Storage Manager, or ASM), you can optionally add storing the data in cache.
 
Note: Whether the data is stored in cache or in a storage manager, the main performance differences are a result of the following items:
The hardware speed (and I/O channels and interfaces) on which the data is stored.
The location of the hardware device in relation to the object server.
If the hardware device connects over a TCP/IP link, that link can form a bottleneck, depending on the link’s throughput and the required data retrieval rate.
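The four scenarios in this section can be summarized as a decision over three settings. The following Python sketch is illustrative only. The function name and the string values for the Migrate Data from Cache setting are hypothetical encodings, and the mapping of scenario 4 to a cached application group that migrates when data is loaded is our reading of the text rather than a documented rule.

```python
def classify_scenario(cache_only_storage_set, cache_data, migrate):
    """Map application group storage settings to scenario 1-4.

    cache_only_storage_set: True if the storage set has only cache nodes.
    cache_data: True if Cache Data is enabled.
    migrate: hypothetical encoding of Migrate Data from Cache:
             "no", "when_loaded", or "after_days".
    """
    if cache_only_storage_set:
        # Cache-only storage sets always cache and never migrate.
        if not cache_data or migrate != "no":
            raise ValueError("cache-only storage sets require caching and no migration")
        return 1
    if not cache_data:
        # Data goes directly to the storage manager.
        return 3
    if migrate == "when_loaded":
        # Data is written to cache and to the storage manager at load time.
        return 4
    if migrate == "after_days":
        # Data stays in cache first and is migrated later.
        return 2
    raise ValueError("unsupported combination of settings")


if __name__ == "__main__":
    print(classify_scenario(False, True, "after_days"))
```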
10.2.2 Storing the index data
The Content Manager OnDemand load process extracts document indexes from the report data and stores the indexes in the Content Manager OnDemand database application group data tables. With these indexes, users can efficiently locate, select, and retrieve documents. Typically, indexes are expired when the document data is expired.
Each application group is segmented into multiple physical tables by using a date or a date and time field. The size of each physical table is determined by the Max rows setting. Each row in the table contains a set of user-defined and system-defined indexes that enable the search for a report segment or a document. Index data is loaded into a table. When the Max rows value is reached, the table is closed and a new table is created. The number of physical tables that represent an application group might grow from 1 to n.
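As a rough planning aid, the number of physical tables can be estimated from the expected number of index rows and the Max rows setting. The following Python sketch is illustrative only. The function name is hypothetical, and the model assumes that every table is filled to exactly the Max rows value before the next table is created.

```python
import math


def count_index_tables(total_rows, max_rows):
    """Estimate how many physical tables an application group needs."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    # At least one table exists, even before any rows are loaded.
    return max(1, math.ceil(total_rows / max_rows))


if __name__ == "__main__":
    # 25 million index rows with a Max rows value of 10 million
    print(count_index_tables(25_000_000, 10_000_000))
```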
10.2.3 Storing the resource data
If data caching is enabled, Content Manager OnDemand stores resources in the cache. Two locations on the Storage Management tab affect how resources are stored:
Resource Data
Document Data
Resource Data
The following selections are possible for Resource Data:
Always Maintain in Cache: The resource data stays in cache forever, and it does not expire.
Cache Resource Data for xxx Days: The resource data stays in cache for xxx number of days before the data expires.
Restore Resources to Cache: The resource data is not in cache, and the resource data is requested. The resources are restored to cache from the storage manager.
The ARSLOAD program saves one copy of a resource on each node for each application group. The resource can be stored multiple times, depending on how the ARSLOAD program compares the data. The ARSLOAD program compares the last 50 resources against the resource that is generated by the load. If a match is not found, a new resource is stored.
When the ARSLOAD program processes a resource group file, it checks the resource identifier to determine whether the resource is present on the system.
If the storage node identifies a client node in OAM or Virtual Storage Access Method (VSAM), the storage manager copies the resources to archive storage.
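The comparison against the most recent resources explains why the same resource can be stored more than once. The following Python sketch is illustrative only and does not show how the ARSLOAD program is implemented. The class name is hypothetical, the comparison is modeled with a SHA-256 digest, and a match is assumed not to refresh the position of a resource in the comparison window.

```python
import hashlib
from collections import deque
from typing import Tuple


class ResourceStore:
    """Toy model: a new resource is compared only with the last N stored."""

    def __init__(self, window=50):
        self.recent = deque(maxlen=window)   # (resource_id, digest) pairs
        self.stored_count = 0

    def save(self, data: bytes) -> Tuple[int, bool]:
        """Return (resource_id, stored_new) for the resource in this load."""
        digest = hashlib.sha256(data).hexdigest()
        for resource_id, known_digest in self.recent:
            if known_digest == digest:
                return resource_id, False    # reuse the existing resource
        self.stored_count += 1
        self.recent.append((self.stored_count, digest))
        return self.stored_count, True       # no match found, store a copy


if __name__ == "__main__":
    store = ResourceStore(window=2)
    for payload in (b"A", b"A", b"B", b"C", b"A"):
        print(payload, store.save(payload))
```

In the usage example, the window is reduced to two entries so that the effect is visible: resource A is reused on the second load, but it is stored again after two other resources push it out of the comparison window.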
Document Data
For Document Data, the following selections are valid:
Yes for Cache Data: You can cache document data and resource data or only resource data.
No Cache: Document data is not stored in cache.
Cache Document Data for xxx Days: Document data is stored in cache for xxx number of days before the data expires.
10.3 Configuring for migration and expiration
Many customers choose to expire their document data and indexes somewhere in the range of 5 - 10 years. In one extreme, document and index data might expire daily. In another extreme, document and index data might never expire.
Four lifecycle scenarios are typical. The Content Manager OnDemand administrator selects the scenario to implement through various parameters (as shown in this section), which are on the Storage Management tab of the Application Group window. The four scenarios are illustrated in Figure 10-2 on page 222.
10.3.1 Migrating index data
Index migration is the process by which Content Manager OnDemand moves index data from the database to archive storage. Index migration optimizes database storage space. With index migration, you can maintain index data for a long time. You typically migrate index data only after users no longer need to access the reports. However, for legal or other requirements, you often must maintain data for a number of months or years.
If a user queries the index data that was migrated, an administrator must act to import a copy of the migrated table or tables by running ARSADMIN (Multiplatforms or z/OS) or Start Import into Content Manager OnDemand (STRIMPOND) on IBM i. After Content Manager OnDemand maintains the imported index data in the database for the number of days that is specified in the Keep Imported Migrated Indexes field, Content Manager OnDemand deletes the data from the database.
Migration of indexes
This configuration is set up by clicking Advanced on the Storage Manager tab of the Application Group window.
This field determines when Content Manager OnDemand migrates index data to archive storage. Choose from No Migration or Migrate After __ Days. As a preferred practice, do not migrate indexes to archive storage. Indexes that are migrated cannot be searched until after they are imported by an administrator. Use this capability only under limited circumstances.
 
Closing index tables: Before you can migrate index data, the index tables must be closed. The following Database Organization field options are valid:
The Single Load per table setting for the Database Organization field is no longer supported.
If the Database Organization field for the application group is set to Multiple Loads per table, the index table is closed when the Maximum Rows value is reached.
The Single table for all loads option is available for Content Manager OnDemand for z/OS and Content Manager OnDemand for IBM i. Select the Single table for all loads check box if you want to create one database table for each application group. This option is most frequently used when you load a small amount of data. If you select this option, the Maximum Rows field in this window is removed.
To close a table to loading before the Maximum Rows value is reached, run the ARSTBLSP program with the -a1 parameter.
The index data must be migrated only after users no longer need to access the data. If a user must access data in the migrated tables, the process of importing the data into the database requires administrator intervention, and usually results in a significant delay in completing the query. Additional space is required in the database and temporary storage areas to import the data.
To enable the migration of index data, you must define a storage set that identifies a storage node that is maintained by ASM and update the System Migration application group to use the storage set.
10.3.2 Expiring data and indexes
In all four of the storage and expiration scenarios, the index data is stored in the Content Manager OnDemand database in application group data tables. Typically, these indexes are expired when the document data is expired from the system.
Life of Data and Indexes field
This field determines when Content Manager OnDemand deletes documents, resources, and index data from the application group.
The following options are valid:
Never Expires: Content Manager OnDemand maintains the application group data indefinitely.
Expires in __ Days: After the data reaches this threshold, Content Manager OnDemand can delete data from the application group the next time that ARSMAINT (with Content Manager OnDemand for Multiplatforms or z/OS) or Disk Storage Management (DSM) (with IBM i) is run. The default value is 2555 days (seven years). The maximum value that you can use is 99999 days (273 years).
 
Note: If you plan to maintain application group data in archive storage, the length of time that ASM maintains the data must be equal to or exceed the value that you specify for the Life of Data and Indexes field.
Life of Data and Indexes can be used only if ARSMAINT (with Multiplatforms or z/OS) or Disk Storage Management (DSM) (with IBM i) handles the expiration.
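The arithmetic behind the Life of Data and Indexes field can be sketched as follows. This Python example is illustrative only. The function names are hypothetical, None is used to model the Never Expires option, and the data is assumed to become eligible on the day that the threshold is reached.

```python
from datetime import date, timedelta
from typing import Optional

MAX_LIFE_DAYS = 99999  # maximum value stated in the text


def expiration_date(latest_date: date, life_days: Optional[int]) -> Optional[date]:
    """Return the first day on which data is eligible for deletion."""
    if life_days is None:          # models the Never Expires option
        return None
    if life_days < 0 or life_days > MAX_LIFE_DAYS:
        raise ValueError("life_days must be between 0 and 99999")
    return latest_date + timedelta(days=life_days)


def is_eligible(latest_date: date, life_days: Optional[int], today: date) -> bool:
    """True if the next maintenance run can delete the data."""
    expires_on = expiration_date(latest_date, life_days)
    return expires_on is not None and today >= expires_on


def archive_retention_ok(archive_retention_days: int, life_days: int) -> bool:
    """Check the note above: archive retention must cover the life of data."""
    return archive_retention_days >= life_days


if __name__ == "__main__":
    # Default life of 2555 days for data dated 1 January 2020
    print(expiration_date(date(2020, 1, 1), 2555))
```

Eligibility alone does not delete anything. The data is removed only when the maintenance program next runs.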
10.3.3 Expiring document data
Document data expiration is affected by the document expiration type.
Expiration type
The document expiration type determines how data is deleted from the application group. The expiration type option is on the Storage Management tab of the Application Group window.
Four expiration types are valid:
Load
Storage Manager
Segment
Document
Expiration type: Load
When the expiration type is set to Load, the system deletes an input file (a load) from the application group. Load is the default expiration type. The latest date value from the input file and the Life of Data and Indexes field determine when the data is eligible to be deleted.
 
Note: The application group must have an expiration type of Load if any of the following circumstances are true:
You use or plan to use the Enhanced Retention Management feature.
You use or plan to use the full text search feature.
You use or plan to integrate with FileNet P8.
For application groups with expiration types of Document, Segment, or Storage Manager, utilities exist to convert these application groups to Load.
Consider engaging IBM Lab Services to provide these services.
With Content Manager OnDemand for Multiplatforms or z/OS, when the expiration type is set to Load, if your object server is on z/OS and your storage manager is OAM, you can allow OAM to handle the data expiration and Content Manager OnDemand to handle the index expiration by using the ARSEXOAM program.
With Content Manager OnDemand for i, when the expiration type is set to Load, you can still allow ASM to handle the data and index expiration by creating an expiration level in the migration policy.
Expiration type: Storage Manager (z/OS)
The storage manager (OAM or VSAM) determines when data is deleted from the system. Storage Manager expiration works with either the ARSEXPIR program or the ARSEXOAM program.
For more information about how to configure the system to use the ARSEXPIR and ARSEXOAM programs, see the IBM Content Manager OnDemand for z/OS Administration Guide.
Storage Manager expiration is supported only on Content Manager OnDemand for z/OS systems.
Expiration type: Segment
The system deletes a segment (table) of data from the application group. The system can delete a segment of data only after the segment is closed and every record in the segment reaches its expiration date.
With Multiple Loads per Database Table enabled, the system uses the maximum number of rows to determine when to close a table. A segment likely contains the data from more than one input file. If the Maximum Rows setting is too large, the segment is not expired until all of the documents in the table reach their expiration dates. If the Maximum Rows setting is too small, segments are created constantly and potentially deleted (based on the expiration date). This large number of tables imposes a performance impact during the search query time and expiration time.
The system derives the expiration date from the Segment field (or the date that the data was loaded, if there is no Segment field) and the Life of Data and Indexes field. If the Segment field contains a date in the MMYY format, data is eligible to be deleted on the first day of the month (MM).
To specify the Segment field, complete the following steps:
1. Click the Field Information tab.
2. Select a date or date and time field.
3. Select the Segment check box.
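The rule that a segment can be deleted only after it is closed and every record in it reaches its expiration date can be sketched as follows. This Python example is illustrative only. The function name is hypothetical, and each record is represented only by the value of its Segment date field.

```python
from datetime import date, timedelta
from typing import Iterable


def segment_expirable(closed: bool, segment_dates: Iterable[date],
                      life_days: int, today: date) -> bool:
    """True if the whole segment (table) can be deleted."""
    if not closed:
        return False  # an open segment still accepts loads
    # Every record must be past its own expiration date.
    return all(d + timedelta(days=life_days) <= today for d in segment_dates)


if __name__ == "__main__":
    dates = [date(2024, 1, 1), date(2024, 1, 20)]
    print(segment_expirable(True, dates, 30, date(2024, 2, 10)))
    print(segment_expirable(True, dates, 30, date(2024, 2, 19)))
```

In the usage example, the first record expires on 31 January, but the segment is held until the last record expires on 19 February. This shows why a large Maximum Rows value delays the expiration of older data in the same table.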
Expiration type: Document
When the expiration type is set to Document, the system deletes a document from the application group. To determine when to delete a document, the system uses the value of the Expire Date field and the Life of Data and Indexes field. If the Expire Date field contains only the month and year (MMYY format), the system deletes documents on the first day of the month (MM).
To specify the Expire Date field, complete the following steps:
1. Click the Field Information tab.
2. Select a date or date and time field.
3. Select the Expire Date check box.
 
Performance note: Individual document deletion is the most costly type of deletion in terms of processor consumption and run time.
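The handling of a month-and-year value can be sketched as a normalization to the first day of the month. This Python example is illustrative only. The function name is hypothetical, and the two-digit year is assumed to belong to the century that is passed in the century parameter, which the text does not specify.

```python
from datetime import date


def first_day_from_mmyy(value: str, century: int = 2000) -> date:
    """Convert an MMYY string to the first day of that month."""
    if len(value) != 4 or not value.isdigit():
        raise ValueError("expected four digits in MMYY format")
    month = int(value[:2])
    year = century + int(value[2:])
    # date() raises ValueError if the month is not in the range 1-12.
    return date(year, month, 1)


if __name__ == "__main__":
    print(first_day_from_mmyy("0324"))
```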
10.3.4 Expiring annotations
Annotations for all application groups are kept in a single application group data table, which allows the expiration of annotations to be controlled at a system-wide level. The Life Of Annotations field setup is on the System Parameters General tab. Annotations can be set to never expire or to expire after N days. After the number of days (N) passes and ARSMAINT is run, Content Manager OnDemand removes the annotation.
10.4 Reloading data
If you are migrating data by unloading and then reloading the data, you need to determine your future expiration policy.
Reloading to change the expiration type
For example, if your current expiration policy is set to Storage Manager but you later want to perform holds on the data, during the migration process (when you create the application group and before you load any data), change your expiration policy from Storage Manager to Load.
When you use the Enhanced Retention Management feature with Content Manager OnDemand or IBM Enterprise Records (formerly IBM FileNet Records Manager), Content Manager OnDemand must be in complete control of expiration processing. Therefore, if you are using Tivoli Storage Manager or OAM, you must disable the ability for either of these storage managers to expire data.
Also, you can use Enhanced Retention Management and Content Federation Services for Content Manager OnDemand only with application groups with an expiration type of Load. For those application groups with expiration types of Document, Segment, or Storage Manager, utilities exist to convert these application groups to an expiration type of Load.
Consider engaging IBM Lab Services to provide these services.
Reloading ad hoc stored documents
If you choose not to take advantage of the ability of Content Manager OnDemand to aggregate documents but instead you choose to load documents ad hoc by using the storeDocument Java API, StoreDoc Object Linking and Embedding (OLE) API, or CommonStore, you must migrate the data later.
Loading documents individually, instead of aggregating them into 10 MB storage objects, might result in millions of small objects in your storage manager. The storage manager might then experience performance problems when it migrates these small objects to tape.
 
Note: Consider aggregating these smaller objects into larger objects for performance reasons.
Aggregating these small objects into larger objects after they are stored individually requires that you retrieve and reload them as larger objects. You might want to engage IBM Lab Services to assist you with this task.
Another option is to not migrate objects to tape, but to use another random access hardware device instead.
10.5 Expiration processing on Multiplatforms and z/OS
This section goes into detail about the expiration process on Multiplatforms and z/OS.
10.5.1 Content Manager OnDemand expiration: ARSMAINT
The ARSMAINT program manages application group data in the Content Manager OnDemand database and in cache storage.
You typically run the ARSMAINT program on a regular schedule to perform the following tasks:
Migrate files from cache storage to archive storage.
Delete files from cache storage.
Optionally, migrate index data from the database to archive storage.
Delete index data from the database.
The ARSMAINT program manages the application group data and the data in cache by using the storage management values from the application groups that are defined to the system.
Here are the storage management field values that are used:
Life of Data and Indexes
Length of Time to Cache Data on Magnetic
Length of Time Before Copying Cache to Archive Media
Length of Time Before Migrating Indexes to Archive Media
Length of Time to Maintain Imported Migrated Indexes
Expiration Type
10.5.2 Expiring indexes
The ARSMAINT program uses the Expiration Type field value to determine how to delete index data from an application group. The ARSMAINT program can expire a table of application group data at a time, a load at a time, or individual documents. Ensure that the ARSMAINT program runs periodically (for example, daily) so that Content Manager OnDemand deletes indexes and cache data (and the storage manager deletes archive data, if applicable). By running the ARSMAINT program regularly, you ensure that the expired documents can no longer be retrieved.
Additionally, you can start manual expiration processing by running the ARSMAINT program from the command line. For example, to run expiration processing, run the following command at the command line:
arsmaint -d
When the ARSMAINT program removes indexes, it saves the following message in the system log:
“128 ApplGrp Segment Expire (ApplGrp) (Segment)”
One message is saved in the system log for each table that was dropped during expiration processing.
 
When to run the maintenance processes: Run most maintenance processes when no other application is updating the database or needs exclusive access to it, and when you are sure that no one is retrieving documents from the system. For example, you must not perform maintenance on the database while you are loading data into the system.
The relationship between ARSMAINT and ARSSOCKD processing is illustrated in Figure 10-3.
Figure 10-3 Relationship between ARSMAINT and ARSSOCKD programs
Collecting statistics
Content Manager OnDemand provides two programs to collect statistics on database tables: the ARSDB program and the ARSMAINT program.
When you run the ARSMAINT program to collect statistics, it collects statistics on all of the tables in the database that changed since the last time that you collected statistics. You can automate the collection of statistics by scheduling the ARSMAINT program to run with the appropriate options.
You can use the ARSDB program to collect statistics on the Content Manager OnDemand system tables. The Content Manager OnDemand system tables include the user table, the group table, and the application group table. For most systems, the Content Manager OnDemand system tables require little maintenance. You can probably schedule the ARSDB program to collect statistics once a month (or less often).
The syntax for the ARSDB program is shown:
/opt/IBM/ondemand/V9.0/bin/arsdb <options>
The options are explained:
-e Drop configuration indexes.
-r Create configuration indexes.
-s Collect statistics.
System log messages
When you run the ARSMAINT program, it saves messages about its activities in the system log. The types of messages that are saved in the system log depend on the options that you specify when you run the ARSMAINT program.
The number of messages that are saved in the system log each time that expiration processing runs depends on the following factors:
The options that you specify for the ARSMAINT program
The number of application groups that is processed
The number of segments of data that is processed
The number of cache storage file systems that are defined on the server
 
Note: You see one set of messages for each object server on which you run the ARSMAINT program.
For example, when expiration processing starts on a specified server, you might see the following message:
“109 Cache Expiration (Date) (Min%) (Max%) (Server)”
Migration processing uses the specified date (the default is “today” in internal format). Expiration processing begins on each cache file system whose used space exceeds the Max% (default 80%) and ends when the used space in the file system falls below the Min% (default 80%).
One deletion message is saved for each storage object that is deleted from cache storage. A storage object is eligible to be deleted when its “Cache Document Data for n Days” or “Life of Data” period passes (whichever occurs first).
A storage deletion message looks similar to the following message:
“196 Cache Migration (ApplGrp) (ObjName) (Server)”
Also, information-only messages report the percentage of space that is used in the file system.
An information message looks similar to the following message:
“124 Filesystem Statistics (filesystem) (% full) (server)”
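The threshold behavior of cache expiration can be sketched as a simple loop. This Python example is illustrative only and does not show how the ARSMAINT program is implemented. The class and function names are hypothetical, both thresholds are interpreted as percentages of used space, and objects are assumed to be examined in the order in which they are listed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CachedObject:
    name: str
    size_kb: int
    eligible: bool  # True once its cache period or life of data has passed


def expire_cache(capacity_kb: int, objects: List[CachedObject],
                 min_pct: float = 80.0, max_pct: float = 80.0) -> List[str]:
    """Return the names of the objects that one expiration pass deletes."""
    used_kb = sum(obj.size_kb for obj in objects)
    deleted: List[str] = []
    # Expiration starts only if usage exceeds the Max% threshold.
    if 100.0 * used_kb / capacity_kb <= max_pct:
        return deleted
    for obj in objects:
        # Stop as soon as usage falls below the Min% threshold.
        if 100.0 * used_kb / capacity_kb < min_pct:
            break
        if obj.eligible:
            used_kb -= obj.size_kb
            deleted.append(obj.name)
    return deleted


if __name__ == "__main__":
    cache = [CachedObject("obj1", 300, True),
             CachedObject("obj2", 300, False),
             CachedObject("obj3", 300, True)]
    print(expire_cache(1000, cache))
```

In the usage example, the file system is 90% full, so expiration starts. Deleting the first eligible object brings usage to 60%, which is below the Min% value, so the pass ends before the third object is examined.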
Load table (ARSLOAD)
The ARSLOAD table can be used to track loads for expiration. This table maintains a record of all successful loads to application groups with the “expire by load” expiration type.
10.5.3 Removing documents from the Tivoli Storage Manager archive
Removing a document from archive storage means that the backup (if the primary document copy is in cache) or long-term copy (if the primary document copy is in archive) of the document is deleted from the system. You remove documents from archive storage when you no longer have a business or legal requirement to keep them.
A management class contains an archive copy group that specifies the criteria that makes a document eligible for deletion. Documents become eligible for deletion under the following conditions:
Administrators delete documents from client nodes
An archived document exceeds the time criteria in the archive copy group (how long archived copies are kept)
ASM does not delete information about expired documents from its database until expiration processing runs. You can run expiration processing either automatically or manually by command. Ensure that expiration processing runs periodically to allow ASM to reuse storage pool space that is occupied by expired documents.
When expiration processing runs, ASM deletes documents from its database. The storage space that these documents used to occupy then becomes reclaimable. For more information, see “Reclaiming space in storage pools”.
You control automatic expiration processing by using the expiration processing interval (EXPINTERVAL) in the server options file (dsmserv.opt). You can set the option by editing the dsmserv.opt file. For more information, see the Content Manager OnDemand Installation and Configuration Guide:
You can obtain more information in the “Running expiration processing automatically” section at the following website:
If you use the server option to control when expiration processing occurs, ASM processes expirations each time that you start the server. Afterward, it runs expiration processing at the interval that you specified with the option, which is measured from the start time of the server.
You can manually start expiration processing by running the EXPIRE INVENTORY command. Expiration processing then deletes information about expired files from the database. You can schedule this command by running the DEFINE SCHEDULE command. If you schedule the EXPIRE INVENTORY command, set the expiration interval to 0 (zero) in the server options so that ASM does not run expiration processing when you start the server. You can control how long the expiration process runs by using the DURATION parameter with the EXPIRE INVENTORY command.
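The timing of automatic expiration runs can be sketched as follows. This is an illustration only: it assumes that the interval is expressed in hours, and the expiration_runs function is a hypothetical helper, not a storage manager interface.

```python
from datetime import datetime, timedelta


def expiration_runs(server_start, interval_hours, count):
    """List the first `count` automatic expiration run times."""
    if interval_hours == 0:
        # An interval of 0 disables automatic expiration, so runs happen
        # only when EXPIRE INVENTORY is issued manually or by a schedule.
        return []
    step = timedelta(hours=interval_hours)
    # The first run occurs at server start; later runs are measured from it.
    return [server_start + i * step for i in range(count)]
```

For example, a server that starts at 06:00 with a 24-hour interval runs expiration at 06:00 each day, which might not match a preferred maintenance window. Scheduling the EXPIRE INVENTORY command with the interval set to 0 avoids that dependency on the server start time.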
Reclaiming space in storage pools
Space on a storage pool volume becomes reclaimable as documents expire or as they are deleted from the volume. For example, documents become obsolete because of aging.
ASM reclaims the space in storage pools based on a reclamation threshold that you can set for each storage pool. When the percentage of space that can be reclaimed on a volume rises above the reclamation threshold, ASM reclaims the volume. ASM rewrites documents on the volume to other volumes in the storage pool, making the original volume available for new documents.
ASM checks whether reclamation is needed at least once each hour and begins space reclamation for eligible volumes. You can set a reclamation threshold for each storage pool when you define or update the storage pool.
During reclamation, ASM copies the files to volumes in the same storage pool unless you specified a reclamation storage pool. Use a reclamation storage pool to allow automatic reclamation for a storage pool with only one drive. See your ASM documentation for details.
After ASM moves all documents to other volumes, one of the following actions occurs for the reclaimed volume:
If you explicitly defined the volume to the storage pool, the volume becomes available for reuse by that storage pool.
If the volume was acquired as a scratch volume, ASM deletes the volume from its database.
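The reclamation decision and its two outcomes can be summarized in a short sketch. The Volume class and the reclaim function are hypothetical and model only the rules that are described in this section; they do not represent a storage manager API.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    reclaimable_pct: float  # percentage of the volume that can be reclaimed
    scratch: bool           # True if acquired as a scratch volume


def reclaim(volumes, threshold_pct):
    """Return (reusable, removed, untouched) volume names after one pass."""
    reusable, removed, untouched = [], [], []
    for vol in volumes:
        if vol.reclaimable_pct > threshold_pct:
            # Remaining documents are rewritten to other volumes, which
            # empties this volume.
            if vol.scratch:
                removed.append(vol.name)   # deleted from the database
            else:
                reusable.append(vol.name)  # stays defined to the pool
        else:
            untouched.append(vol.name)
    return reusable, removed, untouched
```

A lower threshold reclaims volumes sooner at the cost of more data movement, which is one reason the threshold is set for each storage pool.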
 
Important: For more information about reclamation processing, including choosing a reclamation threshold, reclaiming volumes in a storage pool with one drive, reclaiming Write Once Read Many (WORM) optical media, reclaiming for copy storage pools, and reclaiming offsite volumes, see your Tivoli Storage Manager documentation.
Managing Tivoli Storage Manager storage
For each automated library, Tivoli Storage Manager tracks in its volume inventory for the library whether a volume has scratch or private status:
A scratch volume is a labeled volume that is empty or contains no valid data, and it can be used to satisfy any request to mount a scratch volume. To support Content Manager OnDemand, you define scratch volumes to Tivoli Storage Manager. Tivoli Storage Manager uses scratch volumes as needed, and returns the volumes to scratch when they become empty (for example, when all data on the volume expires).
A private volume is a volume that is in use or owned by an application, and it might contain valid data. Volumes that you define to Tivoli Storage Manager are private volumes. A private volume is used to satisfy only a request to mount that volume by name. When Tivoli Storage Manager uses a scratch volume, it changes the volume’s status to private. Tivoli Storage Manager tracks whether defined volumes were originally scratch volumes. Volumes that were originally scratch volumes return to scratch status when they become empty.
Secondary storage of storage volumes
For instructions that describe how to handle physical storage volumes and remove them from the library, see the documentation that is provided by the library manufacturer.
For instructions about documentation that you might need to complete when you remove storage volumes from a library and where to store them for safekeeping, see your organization’s media storage guide.
Protecting data with data retention protection
To avoid the accidental erasure or overwriting of critical data, Content Manager OnDemand supports the Tivoli Storage Manager APIs that relate to data retention. Data retention protection prohibits the explicit deletion of documents until their specified retention criterion is met. Although documents can no longer be explicitly deleted, they can still expire.
 
Important notes:
Data retention protection is permanent. After it is turned on, it cannot be turned off.
Content Manager OnDemand does not support the deletion hold feature. This feature prevents held data from being deleted until the hold is released.
Tivoli Storage Manager supports two retention policies:
In creation-based retention, the policy becomes active when the data is stored (created) on the Tivoli Storage Manager server. This policy is the default retention policy method and it is used with normal backup/archive clients.
In event-based retention, the policy becomes active when the client sends a retention event to the Tivoli Storage Manager server. The retention event can be sent to the server any time after the data is stored on the server. Until the retention event is received, the data is indefinitely stored on the Tivoli Storage Manager server. For Content Manager OnDemand, the retention event is the call to delete the data. A load, unload, application group delete, or expiration of data triggers the retention event.
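The difference between the two policies is the point from which the retention period is counted. The following sketch illustrates that difference under simplified assumptions: retention is a whole number of days, and the expiration_date function and its policy names are hypothetical.

```python
from datetime import date, timedelta


def expiration_date(policy, retain_days, stored_on, event_on=None):
    """Return the date an object becomes eligible to expire, or None."""
    if policy == "creation":
        # The retention period starts when the data is stored.
        start = stored_on
    elif policy == "event":
        if event_on is None:
            # No retention event received yet: the data is kept indefinitely.
            return None
        start = event_on
    else:
        raise ValueError(f"unknown policy: {policy}")
    return start + timedelta(days=retain_days)
```

With event-based retention and a retention period of 0 days, the object becomes eligible to expire on the day that Content Manager OnDemand signals the event, which keeps control of the data lifetime in Content Manager OnDemand.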
If you decide to use these policies in Tivoli Storage Manager, the Content Manager OnDemand scenarios that are described in the rest of this section are supported.
Turning off data retention protection
When data retention protection is turned off, Content Manager OnDemand behaves as follows for the creation-based and the event-based retention object expiration policies:
Creation-based object expiration policy: Content Manager OnDemand issues a delete object command through the Tivoli Storage Manager API. Objects are deleted during the next inventory expiration. If a Content Manager OnDemand application group is deleted, a delete filespace command is issued instead, and the objects are immediately deleted with the file space.
Event-based retention object expiration policy: Content Manager OnDemand issues an event trigger command through the Tivoli Storage Manager API. The status of the objects that are affected changes from PENDING to STARTED, and the objects are expired by Tivoli Storage Manager based on their retention parameters. If the retention parameters are set to NOLIMIT, the objects never expire. If a Content Manager OnDemand application group is deleted, a delete filespace command is issued instead, and the objects are immediately deleted with the file space.
Turning on data retention protection
When data retention protection is turned on, Content Manager OnDemand behaves as follows for the creation-based and the event-based retention object expiration policies:
Creation-based object expiration policy: Content Manager OnDemand issues no commands to Tivoli Storage Manager. The objects are effectively orphaned by Content Manager OnDemand and are expired by Tivoli Storage Manager based on their retention parameters. If the retention parameters are set to NOLIMIT, the objects never expire.
Event-based retention object expiration policy: Content Manager OnDemand issues an event trigger command through the Tivoli Storage Manager API. The event status of the objects that are affected is changed from PENDING to STARTED, and the affected objects are expired by Tivoli Storage Manager based on their retention parameters. If the retention parameters are set to NOLIMIT, the objects never expire.
If a Content Manager OnDemand application group is deleted, a delete filespace command cannot be used with data retention protection; the operation is treated the same as though a delete is indicated. The status of all of the affected objects is changed from PENDING to STARTED, and the affected objects are expired by Tivoli Storage Manager based on their retention parameters. This action leaves the file space entries in Tivoli Storage Manager, so you must manually delete these entries when the file space is empty (even with data retention protection on).
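The four scenarios can be condensed into a small decision function. This sketch restates only the behavior that is described in this section; the archive_delete_action function and its return strings are hypothetical labels, not Tivoli Storage Manager API calls.

```python
def archive_delete_action(protection_on, policy, group_deleted=False):
    """Return the command that Content Manager OnDemand issues on a delete."""
    if policy not in ("creation", "event"):
        raise ValueError(f"unknown policy: {policy}")
    if not protection_on:
        if group_deleted:
            # Objects are deleted immediately with the file space.
            return "delete filespace"
        if policy == "creation":
            # Objects are deleted during the next inventory expiration.
            return "delete object"
        # Event status changes from PENDING to STARTED.
        return "event trigger"
    # With protection on, delete filespace cannot be used, so deleting an
    # application group is handled like any other delete.
    if policy == "creation":
        # No command is issued; retention parameters alone decide expiration.
        return "none"
    return "event trigger"
```

The sketch shows why event-based copy groups are preferred with data retention protection: they are the only combination in which Content Manager OnDemand still signals the storage manager when data is deleted.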
Recommendations
Consider the following preferred practices when you work with data retention protection:
Set up the application groups to expire by load.
Define the Tivoli Storage Manager archive copy groups to be event-based, and retain data for 0 days.
Run the Tivoli Storage Manager inventory expiration regularly to ensure that expired data is removed.
The following devices are supported by Content Manager OnDemand:
IBM DR450 and DR550
These devices are disk-based systems that contain a Tivoli Storage Manager server that runs with data retention protection.
EMC Centera
This device is a disk-based system that is treated as a device by Tivoli Storage Manager. Tivoli Storage Manager must run data retention protection.
10.5.4 Storage Manager-based expiration (z/OS only)
The ARSEXOAM and ARSEXPIR programs are used for storage manager-based expiration.
ARSEXOAM
The ARSEXOAM program is used to process the rows in the ARSOAM_DELETE table that indicate that Content Manager OnDemand OAM objects expired and to remove the associated table entries for those objects. This program works for z/OS only.
Figure 10-4 shows how the ARSEXOAM program deletes the index entries for object stores in OAM.
Figure 10-4 How ARSEXOAM deletes index entries for object stores in OAM
 
Notes:
If one object for a load ID is deleted, all of the index entries for that load ID are deleted.
Index entries of all OAM objects that are recorded as being deleted by rows in the ARSOAM_DELETE table are deleted regardless of the settings in the Life of Data and Indexes section on the Storage Management tab of the application group.
If you plan to use Storage Management expiration, ensure that you set the expiration type of all application groups to Storage Manager.
The recommended expiration type for Content Manager OnDemand is Load. Content Manager OnDemand supports the expiration type of Load with the use of ARSEXOAM for expiring the indexes in Content Manager OnDemand.
Storage Manager expiration is incompatible with Enhanced Retention Manager and Content Federation Services for Content Manager OnDemand.
The following parameters relate to the ARSEXOAM program:
COMMITCNT
Specifies the number of fetches from the ARSOAM_DELETE, ARSOD, and ARSODIND tables that are performed between COMMITS.
If this parameter is not specified, 1000 is used. If 0 is specified, no commits are performed while fetching. The ARSOD and ARSODIND tables are processed only if Content Manager OnDemand for OS/390 Version 2 migrated index rows are being deleted.
UNLOADMAX
Specifies how many objects to hold in memory at any time. The default is 100,000.
REQLIMIT
Specifies the maximum number of objects to send to the server in each request. This number defaults to the ARS_EXPIRE_REQLIMIT parameter in the ars.cfg, or 100 if ARS_EXPIRE_REQLIMIT is not specified. Load IDs for the same application group can be grouped up to the ARS_EXPIRE_REQLIMIT value. All load IDs in a single expiration request must belong to the same application group. For example, adding ARS_EXPIRE_REQLIMIT=100 allows up to 100 load IDs for an application group to be processed at a time. The optimum value to use is a function of multiple variables, including table size. Suboptimal values might lead to table scans. Running EXPLAIN against SQL statements of the type that is involved helps determine whether an index or a table scan is used.
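The grouping rule for REQLIMIT can be illustrated with a minimal sketch. It assumes that expired objects are available as pairs of application group and load ID. The build_requests function and the load ID values in the test data are hypothetical and do not reflect how ARSEXOAM is implemented.

```python
from collections import defaultdict


def build_requests(load_ids, req_limit=100):
    """Group (application_group, load_id) pairs into expiration requests."""
    by_group = defaultdict(list)
    for group, load_id in load_ids:
        by_group[group].append(load_id)
    requests = []
    for group, ids in by_group.items():
        # A request never mixes application groups and never carries
        # more than req_limit load IDs.
        for i in range(0, len(ids), req_limit):
            requests.append((group, ids[i:i + req_limit]))
    return requests
```

A larger limit means fewer requests to the server, but each request then deletes index rows for more load IDs at one time, which is where the risk of table scans arises.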
ARSEXPIR
The ARSEXPIR program can be used to process System Management Facility (SMF) records that indicate that Content Manager OnDemand objects expired and to remove the associated index entries for those objects.
Figure 10-5 illustrates two methods that the ARSEXPIR program uses to expire OAM and VSAM objects.
Figure 10-5 Two ways ARSEXPIR expires OAM and VSAM objects
The ARSEXPIR program uses SMF type 65 records (for VSAM objects) or SMF type 85 records (for OAM objects). The installation must collect these records and install ARSSMFWR as the CBRHADUX OAM auto-delete exit. For more information, see “Deleting OAM and VSAM Objects” in the IBM Content Manager OnDemand for z/OS: Administration Guide, SC19-1213.
ARSSMFWR determines which objects were deleted. The ARSEXPIR program then instructs the Content Manager OnDemand server to remove the index entries.
 
Notes:
If one object for a load ID is deleted, all of the index entries for that load ID are deleted.
Index entries of all objects that are recorded as being deleted by the SMF records are deleted regardless of the settings in the Life of Data and Indexes section on the Storage Management tab of the application group. If you want to use Storage Management expiration, ensure that you set the expiration types of all application groups to Storage Manager.
Important keywords that affect the expiration performance are COMMITCNT, REQLIMIT, UNLOADMAX, and USERSMF:
COMMITCNT
This keyword specifies the number of fetches from the ARSOD and ARSODIND tables that are to be performed between COMMITS. If this number is not specified, 1000 is used. If this number is 0, no commits are performed while fetching. This parameter is used only if Content Manager OnDemand for OS/390 Version 2 migrated index rows are being deleted.
REQLIMIT
This keyword specifies the maximum number of objects to send to the server in each request. The REQLIMIT keyword defaults to the ARS_EXPIRE_REQLIMIT parameter in the ars.cfg, or 100 if ARS_EXPIRE_REQLIMIT is not specified.
UNLOADMAX
Specifies how many objects to hold in memory at any one time. The default is 100,000.
USERSMF
This keyword specifies the SMF record type that is written by the ARSSMFWR exit (if used). This parameter can be omitted if ARSSMFWR is omitted. For more information about the ARSSMFWR exit, see IBM Content Manager OnDemand for z/OS Configuration Guide, SC19-3363.
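The COMMITCNT semantics that are described for both ARSEXOAM and ARSEXPIR can be sketched as a simple counter. This is an illustration only: the fetch_with_commits function is hypothetical, and the commit argument stands in for whatever database commit call a real program uses.

```python
def fetch_with_commits(rows, commit, commit_cnt=1000):
    """Consume rows, calling commit() after every commit_cnt fetches."""
    fetched = 0
    for _ in rows:
        fetched += 1
        # A commit_cnt of 0 means that no commits occur while fetching.
        if commit_cnt and fetched % commit_cnt == 0:
            commit()
    return fetched
```

Frequent commits release database locks sooner, while infrequent commits reduce overhead, so the best value depends on how much concurrent activity the tables see.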
10.6 Expiring data on Content Manager OnDemand for i
In most circumstances, you must run Disk Storage Management (DSM) and Archived Storage Management (ASM) to expire data from Content Manager OnDemand for i.
10.6.1 Content Manager OnDemand expiration
Disk Storage Management (DSM) is the process for performing Content Manager OnDemand based expiration. DSM performs the following functions:
Controls the expiration of indexes and data from Content Manager OnDemand (if you do not use storage manager-based expiration).
Migrates data from cache to the storage manager (if the Migrate Data from Cache option is not set to When data is loaded).
Expires data from cache if Cache Data is set to Yes.
If you do not run DSM, your disk storage requirements for Content Manager OnDemand might be higher than expected. The number of objects that are stored in the integrated file system (IFS) might also be higher than necessary, which results in longer save and restore times.
 
Note: If you have never run DSM, the first execution of the Start Disk Storage Management (STRDSMOND) command might last for an extended period.
If you want to configure Content Manager OnDemand so that DSM is not required in the future, see the section “Eliminating the need to run Disk Storage Manager (DSM)” in the latest Content Manager OnDemand for i Common Server Administration Guide, SC19-2792.
10.6.2 Storage Manager expiration
ASM is the process for performing Storage Manager-based expiration. ASM performs the following functions:
Controls the expiration of indexes and data from Content Manager OnDemand (if you use Storage Manager-based expiration)
Aggregates data before it migrates it to archive media (if you select the Aggregation option in the migration policy)
Moves data between storage levels of the migration policy
If you do not run ASM, your disk storage requirements for Content Manager OnDemand are probably higher than expected. The number of objects that are stored in the IFS is also higher than necessary, which results in longer save and restore times.
If you have never run ASM, the first execution of the Start Archived Storage Management (STRASMOND) command or the Start Disk Storage Management (STRDSMOND) command with the STRASMOND parameter set to YES might last for an extended period.
For more information about expiring archives by using ASM, see Expiration processing in Common Server Archive Storage Manager (ASM):
 