Architecture, components, and functional characteristics
This chapter provides a description of the architecture of the IBM TS7700. The description includes general virtualization concepts, new concepts, and functions that were introduced with TS7700 R4.1. In addition, configuration examples are addressed.
First, stand-alone and clustered configuration features are explained, followed by features that apply only to multi-cluster grid configurations.
The following topics are described:
Terms and expressions that are used to describe the TS7700
Architecture of the TS7700
Underlying concepts of tape virtualization within the TS7700
Differences of the TS7700 models
Hardware components for TS7700 Release 4.1
Attachment of the TS7740 and TS7700T to an IBM TS3500 or TS4500 tape library
Tape drive support
Multi-cluster grid examples
Functional characteristics of the TS7700:
 – Replication policies
 – Tape Partitions and delay premigration concepts on a TS7700T
 – Cluster families
 – Logical Write Once Read Many (LWORM) support
 – Enhanced cache removal policies for grids that contain one or more TS7700D clusters
 – IBM FlashCopy® and Selective write protect for disaster recovery (DR) testing
 – Device allocation assistance (DAA)
 – Scratch allocation assistance (SAA)
 – Selective Device Access Control (SDAC)
 – On-demand support of up to 4,000,000 logical volumes in a grid
 – Support for up to 496 devices per cluster
User security and user access enhancements
Grid network support
2.1 TS7700 architecture
The architectural design of the IBM TS7700 and many of its capabilities are addressed. A short description of the original IBM Virtual Tape Server (VTS) architecture is included to help you understand the differences.
The TS7700 family now includes three different models:
IBM TS7760D and IBM TS7760T
IBM TS7720D and IBM TS7720T
IBM TS7740
Though there are some differences between these models, the underlying architecture is the same. If a function or feature is unique or behaves differently for a given model, it is clearly stated. If not, you can assume that it is common across all models.
When the TS7700 is referenced, it implies all models and types: TS7760D, TS7760T, TS7720D, TS7720T, and TS7740. When a function applies only to the disk-only models, TS7700D is used; when it applies only to the tape-attached models, TS7700T is used. If a function applies only to a specific model (TS7760D, TS7760T, TS7720D, TS7720T, or TS7740) or a subset of models, the specific product name or names are referenced.
2.1.1 Monolithic design of a Virtual Tape Server
The previous IBM 3494 VTS performed all functions within a single IBM System p server, which also served as the Redundant Array of Independent Disks (RAID) disk controller. The RAID subsystem was tightly integrated into the system. Peer-to-Peer (PtP) functions required additional hardware components and were limited to two sites. This complex design had reached an architectural limit that made it difficult to enhance further, so a fundamental architecture change was required.
IBM decided that it was time to create a next-generation solution with a focus on scalability and business continuance. Many components of the original VTS were retained, although others were redesigned. The result was the TS7700 Virtualization Engine.
2.1.2 Modular design of the TS7700
The modular design of the TS7700 separates the functions of the system into smaller components. These components have well-defined functions that are connected by open interfaces. The platform enables components to be scaled up from a small configuration to a large one. This provides the capability to grow the solution to meet your business objectives.
The TS7700 is built on a multi-node architecture. This architecture consists of nodes, clusters, and grid configurations. The elements communicate with each other through standards-based interfaces. In the current implementation, a virtualization node (vNode) and a hierarchical data storage management node (hnode) are combined into a general node (gnode), running on a single System p server.
A TS7700 and the previous VTS design are shown in Figure 2-1.
Figure 2-1 TS7700 virtualization design compared to a VTS design
Nodes
Nodes are the most basic components in the TS7700 architecture. A node takes a different name depending on the role that is associated with it. There are three types of nodes:
Virtualization nodes
Hierarchical data storage management nodes
General nodes
Virtualization node
A vNode is a code stack that presents the virtual image of a library and its drives to a host system. When the TS7700 is attached as a virtual tape library, the vNode receives the tape drive and library requests from the host and processes them as real devices would. It converts the tape requests into operations against a virtual drive, and uses a file in the cache subsystem to represent the virtual tape volume. After the logical volume is created or altered by the host system through a vNode, it is in disk cache.
Hierarchical data storage management node
A hierarchical data storage management node (hnode) is a code stack that manages all logical volumes that are in disk cache or on physical tape. This management occurs after the logical volumes are created or altered by the host system through a vNode.
The hnode is the only node that is aware of physical tape resources and the relationships between the logical volumes and physical volumes. It is also responsible for any replication of logical volumes and their attributes between clusters. An hnode uses standardized interfaces, such as Transmission Control Protocol/Internet Protocol (TCP/IP), to communicate with external components.
General node
A general node (gnode) can be considered a vNode and an hnode sharing a physical controller. The current implementation of the TS7700 runs on a gnode. The engine has both a vNode and hnode that are combined in an IBM POWER8 processor-based server.
Figure 2-2 shows a relationship between nodes.
Figure 2-2 Node architecture
Cluster
The TS7700 cluster combines the TS7700 server with one or more external (from the server’s perspective) disk subsystems: the TS7700 cache controller and any attached cache drawers. This architecture enables expansion of disk cache capacity.
Figure 2-3 shows the TS7700 configured as a cluster.
Figure 2-3 TS7700 cluster
A TS7700 cluster provides Fibre Channel connection (IBM FICON) host attachment, and a default count of 256 virtual tape devices. Features are available that enable the device count to reach up to 496 devices per cluster. The IBM TS7740 and IBM TS7700T cluster also includes the assigned TS3500 or TS4500 tape library partition, fiber switches, and tape drives. The IBM TS7720 and TS7760 can include one or more optional cache expansion frames.
Figure 2-4 shows the components of a TS7740 cluster.
Figure 2-4 TS7740 cluster components
The TS7700 Cache Controller and associated disk storage media act as cache storage for data. The capacity of each disk drive module (DDM) depends on your configuration.
The TS7700 Cache Drawer acts as an expansion unit for the TS7700 Cache Controller. One or more controllers and their expansion drawers are collectively referred to as the TS7700 Tape Volume Cache (TVC). The amount of cache that is available per TS7700 Tape Volume Cache depends on your configuration.
The TS7760 Cache (CSA, XSA) provides a new form of TVC protection, called Dynamic Disk Pooling (DDP).
The TS7740 Cache provides a RAID 6 (CC9) or RAID 5 (CC8 and earlier) protected TVC to temporarily contain compressed virtual volumes before they are offloaded to physical tape.
The TS7720 and TS7720T CS9/XS9 caches use RAID 6 protection. If an existing installation is upgraded, the existing cache remains protected by RAID 6, whereas the new CSA/XSA cache uses DDP for protection.
2.1.3 Previous Peer-to-Peer Virtual Tape Server design
In the IBM 3494 PtP VTS, external Virtual Tape Controller (VTC) hardware was needed to present two VTSs as a single library to the host. The VTCs were connected to the host through IBM Enterprise Systems Connection (ESCON) or FICON channels, and each VTC was connected to both VTSs. Only two VTSs were supported in a PtP configuration.
This limited PtP design was one of the main reasons that the previous VTS needed to be redesigned. The new TS7700 replaced the PtP concepts with an industry-leading new technology that is referred to as a grid.
2.1.4 Principles of grid design
A TS7700 R4.1 grid configuration consists of two to eight clusters. These clusters are interconnected through grid links over a grid network to constitute resilient DR and HA solutions.
 
Fast path: Seven and eight cluster grid configurations are available with a request for price quotation (RPQ).
A grid configuration and all virtual tape drives emulated in all configured clusters appear as one large library to the attached IBM Z hosts.
Logical volumes that are created within a grid can be selectively replicated to one or more peer clusters by using a selection of different replication policies. Each replication policy or Copy Consistency Point provides different benefits, and can be intermixed. The grid architecture also enables any volume that is located within any cluster to be accessed remotely, which enables ease of access to content anywhere in the grid.
In general, any data that is initially created or replicated between clusters is accessible through any available cluster in a grid configuration. This concept ensures that data can still be accessed even if a cluster becomes unavailable. In addition, it can reduce the need to have copies in all clusters because the adjacent or remote cluster’s content is equally accessible.
A grid can consist of all one TS7700 model type, or any mixture of model types, including TS7760D, TS7760T, TS7720D, TS7720T, and TS7740. When a mixture of models is present within the same grid, it is referred to as a hybrid grid.
The term multi-cluster grid is used for a grid with two or more clusters. For a detailed description, see 2.3, “Multi-cluster grid configurations: Components, functions, and features” on page 61.
2.1.5 TS7700 Models
With R4.0, a new model, the TS7760, was introduced to the TS7700 family. A disk-only model is available (referenced as TS7760D). The TS7760T provides attachment to a physical library, either an IBM TS3500 or an IBM TS4500. In an IBM TS3500, all tape drive types of the TS1100 family are supported. In an IBM TS4500, the TS1140 and TS1150 can be used. Both models provide up to 2.5 petabytes (PB) of usable data in cache.
The TS7760’s predecessor, the TS7720, likewise provides a disk-only model and an option to attach to a physical tape library with TS1100 tape drives. To support the IBM TS4500, R4.0 or later must be installed on a TS7720T. Both models deliver a maximum of 2.5 PB of usable data in cache.
The TS7740 provides up to 28 terabytes (TB) of usable disk cache space, and supports the attachment to the IBM TS3500 tape library and TS1100 family of physical tape drives.
2.1.6 Introduction of the TS7700T
When the TS7700 was first created, the product family’s first model (the TS7740) was designed with disk cache and physical tape concepts similar to the original VTS. The disk cache was used primarily for temporary storage, and most of the solution’s capacity was provided by back-end physical tape. As the IBM Z tape industry evolved, a need emerged for large-capacity, disk-only tape solutions that enable more primary data workloads to move to tape with minimal performance penalty.
The TS7720 was introduced as a response to this need. Hybrid grid configurations combined the benefits of both the TS7720 (with its large disk cache) and the TS7740 (with its economical and reliable tape store). Through Hybrid grids, large disk cache repositories and physical tape offloading were all possible. The next evolution of the TS7700 was combining the benefits of both TS7720 and TS7740 models into one solution.
Through the combination of these technologies, the benefits of the TS7760, TS7720, TS7740, and hybrid grids can now be achieved with a single product. All features and functions of the TS7720, the TS7740, and hybrid grids have been maintained, and additional features and functions have been introduced to further help with the industry’s evolving use of IBM Z virtual tape.
In addition to the features and functions that are provided on the TS7700D and TS7740, two key, unique features were introduced as part of the R3.2 TS7720T product release:
Disk Cache Partition, which provides better control of how workloads use the disk cache
Delay Premigration, or the ability to delay movement to tape
TS7700T disk cache partitioning
With the TS7700T supporting up to 2.5 PB of disk cache, the traditional TS7740 style of disk cache management might not be adequate for many workload types. Different workloads can have different disk cache residency requirements, and treating all of them with one least recently used algorithm is not always sufficient. A method to manage disk cache usage at workload granularity might be needed.
The TS7700T supports the ability to create 1 - 7 tape-managed partitions. Each partition is user-defined in 1 TB increments. Workloads that are directed to a tape-managed partition are managed independently concerning disk cache residency. After you create 1 - 7 tape-managed partitions, the disk cache capacity that remains is viewed as the resident-only partition. Partitions can be created, changed, and deleted concurrently from the Management Interface (MI).
Within this document, the tape-managed partitions are referred to as CP1 - CP7, or generically as cache partitions (CPx). The resident-only partition is referred to as CP0. The partitions are logical, and have no direct relationship to one or more physical disk cache drawers or types. All CPx partitions can use back-end physical tape, but the CP0 partition has no direct access to back-end physical tape. In addition, CPx partitions have no direct relationship to physical tape pools. Which partition and which pool are used for a given workload is independent.
Storage Class (SC) is used to direct workloads to a given partition. There is no automatic method to have content move between partitions. However, it can be achieved through mount/demount sequences, or through the LIBRARY REQUEST command.
TS7700T tape-managed partitions (CP1-CP7, CPx)
At least one tape-managed partition must exist in a TS7700T configuration. The default is CP1. However, you can configure new partitions and delete CP1 if you like, provided that at least one other CPx partition exists. Each CPx partition can have a unique customized size in 1 TB increments. The minimum size is 1 TB, and the maximum size is the size of the TS7700T disk cache minus 2 TB. CPx partitions support the movement of content to tape (premigration) and the removal of content from disk cache (migration).
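These sizing rules can be pictured with a minimal conceptual sketch (Python; not actual TS7700 code, and the function name and whole-terabyte model are assumptions for illustration):
    # Conceptual sketch of CPx sizing rules, based on the constraints described above.
    # All capacities are expressed in whole terabytes (1 TB increments).
    def validate_cpx_size(requested_tb: int, total_cache_tb: int) -> bool:
        """Return True if a tape-managed partition (CP1 - CP7) size is acceptable."""
        minimum_tb = 1                       # smallest allowed CPx partition
        maximum_tb = total_cache_tb - 2      # cache size minus 2 TB
        return minimum_tb <= requested_tb <= maximum_tb

    # Example: a TS7700T with 200 TB of configured disk cache
    print(validate_cpx_size(1, 200))    # True  (minimum size)
    print(validate_cpx_size(198, 200))  # True  (maximum size)
    print(validate_cpx_size(199, 200))  # False (would leave less than 2 TB)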
Workloads that are directed to a given CPx partition are handled similarly to a TS7740, except that the hierarchal storage management of the CPx content is only relative to workloads that target the same partition. For example, workloads that target a particular CPx partition do not cause content in a different CPx partition to be migrated. This enables each workload to have a well-defined disk cache residency footprint.
Content that is replicated through the grid accepts the SC of the target cluster, and uses the assigned partition. If more than one TS7700T exists in a grid, the partition definitions of the two or more TS7700Ts do not need to be the same.
TS7700T CPx premigration queue
All CPx partitions share a premigration queue. The maximum amount of content that can be queued for premigration is limited by a new TS7700T feature code, FC5274. Each feature provides 1 TB of premigration queue. The minimum is one feature for 1 TB, and the maximum is 10 features for 10 TB of premigration queue.
Content queued for premigration is already compressed, so the premigration queue size is based on post-compressed capacities. For example, if you have a host workload that compresses at 3:1, 6 TB of host workload results in only 2 TB of content queued for premigration.
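As a simple worked example of this post-compression accounting (hypothetical numbers only):
    # Premigration queue occupancy is based on post-compression capacity.
    host_written_tb = 6.0       # host data written to CPx partitions
    compression_ratio = 3.0     # example 3:1 compression
    queued_for_premigration_tb = host_written_tb / compression_ratio
    print(queued_for_premigration_tb)   # 2.0 TB counted against the premigration queue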
PMPRIOR and PMTHLVL are LIBRARY REQUEST-tunable thresholds that are used to help manage and limit content in the premigration queue. As data is queued for premigration, premigration activity remains minimal until the PMPRIOR threshold is crossed. When it is crossed, the premigration activity increases based on the defined premigration drive count.
If the amount of content in the premigration queue continues to increase, the PMTHLVL threshold is crossed, and the TS7700T intentionally begins to throttle inbound host and copy activity into all CPx partitions to maintain the premigration queue size. This is when the TS7700T enters the sustained state of operation. The PMPRIOR and PMTHLVL thresholds can be no larger than the FC5274 resulting premigration queue size. For example, if three FC5274 features are installed, PMTHLVL must be set to a value of 3 TB or smaller.
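The interplay of the two thresholds can be sketched conceptually as follows (Python; the state names and the example threshold values are assumptions, not the actual microcode logic or defaults):
    # Conceptual view of how premigration queue depth relates to PMPRIOR and PMTHLVL.
    def premigration_state(queue_tb: float, pmprior_tb: float, pmthlvl_tb: float) -> str:
        if queue_tb < pmprior_tb:
            return "peak: minimal premigration activity, no throttling"
        if queue_tb < pmthlvl_tb:
            return "premigration ramped up to the defined premigration drive count"
        return "sustained: inbound host and copy activity throttled to hold the queue size"

    # Example with three FC5274 features (3 TB of premigration queue)
    print(premigration_state(0.5, pmprior_tb=1.0, pmthlvl_tb=3.0))
    print(premigration_state(2.0, pmprior_tb=1.0, pmthlvl_tb=3.0))
    print(premigration_state(3.2, pmprior_tb=1.0, pmthlvl_tb=3.0))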
After a logical volume is premigrated to tape, it is no longer counted against the premigration queue. The volume exists in both disk cache and physical tape until the migration policies determine whether and when the volume should be deleted from disk cache.
How many FC5274 features should be installed is based on many factors. The IBM tape technical specialists can help you determine how many are required based on your specific configuration.
TS7700T CPx delay premigration
With more workloads benefiting from a larger disk cache, you might determine that copying data to tape is not necessary unless the data has aged a certain amount of time. Delay premigration provides a method to retain data only in disk cache until a delay criterion is met, and only then queue it for premigration. If the logical volume expires before this delay period, the data is never moved to tape. This reduces physical tape activity to only the workload that is viewed as archive content. It can also greatly reduce the back-end physical tape reclamation processing that can result from data that expires quickly.
Another reason that you might want to delay premigration is to run the TS7700T longer in the peak mode of operation, which can help reduce your job run times. By delaying premigration, the amount of content in the premigration queue can be reduced, which helps eliminate any throttling that can occur if the PMTHLVL threshold is crossed while running your workloads.
The delay is normally long enough to get you through your daily job window. However, this is only valid for environments that have a clearly defined window of operation. The delayed premigration content is eventually queued, and any excessive queuing past the PMTHLVL threshold might result in heavy throttling. If workloads continue throughout the day, this might not be a feasible option.
The delay period is specified in hours, and is an attribute of the SC. Independent of the CPx partition to which the data is assigned, the delay period can be unique per workload.
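A minimal sketch of the delay criterion, assuming creation time as the reference point (Python; illustrative only):
    from datetime import datetime, timedelta

    # Conceptual check: queue a volume for premigration only after the Storage Class
    # delay period (in hours) has elapsed since the chosen reference point.
    def ready_for_premigration(reference_time: datetime, delay_hours: int,
                               now: datetime) -> bool:
        return now >= reference_time + timedelta(hours=delay_hours)

    created = datetime(2017, 9, 1, 6, 0)
    print(ready_for_premigration(created, delay_hours=24, now=datetime(2017, 9, 1, 18, 0)))  # False
    print(ready_for_premigration(created, delay_hours=24, now=datetime(2017, 9, 2, 8, 0)))   # True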
TS7700T CPx Migrations
TS7700T migration operates similarly to the TS7740, except that each configured CPx partition is treated independently concerning space management. Migration is the process of removing a logical volume from disk cache after a copy has been placed on physical tape and the following criteria are met (see the sketch after this list):
A copy of the logical volume must already be premigrated to primary physical tape, and if configured, secondary physical tape.
Peer clusters in a TS7700 grid configuration have completed copies of the logical volume.
This prerequisite can be lifted when an excessive backlog of copies exists.
The preference group criteria have been met.
PG0: Volumes are removed from disk cache immediately, independent of which CPx partition they are contained in.
PG1: Volumes are removed from disk cache based on a least recently used algorithm. Only when space is required for a specific CPx partition are these logical volumes migrated.
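A conceptual sketch of these migration rules (Python; the field names and the simplified LRU selection are assumptions, not actual TS7700 code):
    # Conceptual sketch of migration eligibility and PG0/PG1 handling.
    def migration_candidate(premigrated: bool, peer_copies_complete: bool,
                            copy_backlog_excessive: bool) -> bool:
        if not premigrated:
            return False                                  # a tape copy must exist first
        return peer_copies_complete or copy_backlog_excessive

    def select_pg1_victims(volumes: list, space_needed_tb: float) -> list:
        """Remove least recently used PG1 volumes only until enough space is freed."""
        victims, freed = [], 0.0
        for vol in sorted(volumes, key=lambda v: v["last_access"]):
            if freed >= space_needed_tb:
                break
            victims.append(vol["volser"])
            freed += vol["size_tb"]
        return victims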
TS7700T resident-only partition (CP0)
Logical volumes that are stored in CP0 are treated the same as volumes in a TS7700 disk-only cluster. The logical volumes in CP0 cannot directly move to tape. Auto-removal policies are applicable to the content assigned to the CP0 partition, including pinned, prefer keep, prefer remove, and retention policies. Content that is assigned to CPx partitions is never a candidate for auto removal. If CP0 is less than 10 TB in usable size, auto removal is disabled.
The CP0 usable size is determined by the remaining configured capacity after defining one or more CPx partitions. The CP0 partition must be at least 2 TB, and can be as large as the configured cache size minus 3 TB. As CPx partitions are created or increased in size, CP0 loses usable capacity. As CPx partitions are deleted or decreased in size, CP0 gains usable capacity. Other than workloads directly targeting CP0, the CP0 usable capacity is also used for FlashCopy processing, and for overcommit or overspill, as described in the next section.
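A conceptual calculation of the resulting CP0 usable capacity and the auto-removal threshold (Python; the capacities are arbitrary examples):
    # CP0 usable capacity is simply what remains after the CPx partitions are defined.
    def cp0_usable_tb(total_cache_tb: int, cpx_sizes_tb: list) -> int:
        return total_cache_tb - sum(cpx_sizes_tb)

    def auto_removal_enabled(cp0_tb: int) -> bool:
        # Auto removal is disabled when CP0 is smaller than 10 TB of usable size.
        return cp0_tb >= 10

    cache = 200                                    # example: 200 TB configured cache
    cp0 = cp0_usable_tb(cache, [50, 80, 40])
    print(cp0, auto_removal_enabled(cp0))          # 30 True
    print(cp0_usable_tb(cache, [100, 95]), auto_removal_enabled(5))   # 5 False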
Overcommit and overspill
When a CPx partition contains more content than its configured size, the partition is moved to the overcommit state. The partition remains in this state until the excess can be migrated. There are a few ways to have a partition enter the overcommitted state:
An existing partition is decreased by a user to a new size that is smaller than the total amount of data currently resident in the CPx partition that is being resized.
An excess of volume content is moved from one partition to another as part of a policy change.
CPx receives more content during a workload than can be premigrated before the partition fills. This is referred to as overspill.
In each of these cases, the excess space is taken from CP0’s usable capacity, and the TS7700T is not viewed as degraded. It is by design that CP0 lends capacity for these expected use cases.
If CP0 has no remaining free space, further overspill is prevented, and the CPx partitions are not allowed to overcommit any further. A new LI REQ command was introduced in R4.0 to reserve space in CP0 that cannot be used for overspill purposes.
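The borrowing of CP0 capacity can be sketched as follows (Python; conceptual only, and the reserve parameter merely stands in for the space reserved through the LI REQ setting mentioned above):
    # Conceptual overspill check: a CPx partition may exceed its configured size by
    # borrowing free CP0 capacity, unless that capacity is reserved or exhausted.
    def overspill_allowed(cp0_free_tb: float, cp0_reserved_tb: float,
                          requested_overspill_tb: float) -> bool:
        lendable_tb = max(cp0_free_tb - cp0_reserved_tb, 0.0)
        return requested_overspill_tb <= lendable_tb

    print(overspill_allowed(cp0_free_tb=12.0, cp0_reserved_tb=5.0, requested_overspill_tb=4.0))  # True
    print(overspill_allowed(cp0_free_tb=12.0, cp0_reserved_tb=5.0, requested_overspill_tb=9.0))  # False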
Moving logical volumes between CP partitions
In the following scenarios, a logical volume can be moved from one partition to another:
A virtual volume’s policy changes during a mount/demount sequence, and a new SC rule is applied. The movement occurs when the volume is closed or unmounted. While mounted, the volume remains in its originally assigned partition.
The LI REQ PARTRFSH command was run, which enables a partition assignment change to occur without a mount/demount sequence. When using PARTRFSH, no other construct changes are refreshed other than the assigned partition. For example, pool properties that are assigned to the volume during its last mount/demount sequence are retained. If more construct changes are required, such as moving data from one pool to another, use a mount/demount sequence instead. The command must be issued to a TS7700T distributed library, and the command supports only a single volume at a time within the current release.
In either case, logical volumes can be moved from CP0 to CPx, from CPx to CP0, and from CPx to a different CPx partition. The movement rules are described next and are summarized in a sketch at the end of this section.
The following rules apply for CPx to CPx movements:
Virtual volumes that are still in disk cache are reassigned to the new partition, and adjust the active content of both the source and target partition.
Any delay in premigration continues to be respected, assuming that the target partition can accommodate the delay.
If the movement is part of a mount/demount sequence, any delay relative to the last access is refreshed.
Any other changes in constructs, such as preference group, premigration delay rules, and pool attributes, are only kept if the movement is the result of a mount/demount sequence.
The following rules apply for CPx to CP0 movements:
Virtual volumes only in CPx cache are reassigned to CP0.
Virtual volumes that are currently in CPx cache and on tape have the cache copy reassigned to CP0, and all copies on tape are invalidated.
Virtual volumes currently in CPx and only on tape have the partition reassigned, but a copy is not automatically moved to CP0 disk cache. If a recall occurs later, the instance recalled into CP0 disk cache becomes the only copy, and all instances on physical tape become invalid. Until then, the content remains only on physical tape.
Any other changes in constructs, such as removal properties, are only kept if the movement is the result of a mount/demount sequence.
The following rules apply for CP0 to CPx movements:
The partition assignment of the logical volume in disk cache is immediately reassigned.
If no delay premigration is active for the assigned SC, the volume is immediately queued for premigration.
If a delay premigration is active for the assigned SC, the delay criterion based on last access or creation time is honored. The movement itself does not alter the last access or creation time reference point.
If the logical volume was previously migrated in a CPx partition, moved to CP0, and then moved back to a CPx partition before it was recalled into CP0 disk cache, it operates the same as though it is a CPx to CPx move.
Any other changes in constructs, such as preference group, premigration delay rules, and pool attributes, are only kept if the movement is the result of a mount/demount sequence.
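These rules can be condensed into a small decision sketch (Python; the field names and the simplified volume model are assumptions, not actual TS7700 code):
    # Condensed sketch of the partition movement rules described above.
    def move_volume(vol: dict, target_partition: int, via_mount_demount: bool) -> dict:
        source = vol["partition"]
        vol = dict(vol, partition=target_partition)    # partition assignment is changed

        if source > 0 and target_partition == 0:       # CPx -> CP0
            if vol["in_cache"] and vol["on_tape"]:
                vol["on_tape"] = False                 # copies on physical tape are invalidated
            # a volume that is only on tape keeps its tape copy until a later recall into CP0
        elif source == 0 and target_partition > 0:     # CP0 -> CPx
            if vol["delay_premigration_hours"] == 0:
                vol["queued_for_premigration"] = True  # queued for premigration immediately
            # otherwise the delay criterion (last access or creation time) still applies

        if not via_mount_demount:
            # LI REQ PARTRFSH changes only the partition; other constructs (preference
            # group, pool attributes, removal properties) are retained unchanged.
            pass
        return vol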
2.1.7 Management of the TS7700
The management of the TS7700 is based on the following key components:
TS7700 MI
TS3500 or TS4500 web interface
Advanced (outboard) policy management
Data Facility Storage Management Subsystem (DFSMS) integration with the TS7700 to provide the storage management subsystem (SMS) constructs’ names for policy management
Host commands to control the TS7700
Messages for automated alerting and operations
Tools
Call home support
TS7700 Management Interface
The TS7700 MI is a web-based graphical user interface (GUI). It is used to configure the TS7700, set up outboard policy management behavior, monitor the systems, and perform many other customer-facing management functions.
TS3500/ TS4500 web interface
The TS3500 and TS4500 web interface is used to configure and operate the tape library, particularly for the management of physical drives and media.
Advanced (outboard) policy management
Policy management enables you to better manage your logical and stacked volumes through the usage of the SMS construct names. With IBM z/OS and DFSMS, the SMS construct names that are associated with a volume (Storage Class (SC), Storage Group (SG), Management Class (MC), and Data Class (DC)) are sent to the library.
When a volume is written from load point, the eight-character SMS construct names (as assigned through your automatic class selection (ACS) routines) are passed to the library. At the library’s MI, you can then define policy actions for each construct name, enabling you and the TS7700 to better manage your volumes. For the other IBM Z platforms, constructs can be associated with the volumes, when the volume ranges are defined through the library’s MI.
DFSMS constructs on the IBM Z platform and their equivalents in the TS7700
On the IBM Z platform, the following DFSMS constructs exist:
Storage Class
Storage Group
Management Class
Data Class
Each of these constructs is used to determine specific information about the data that must be stored. All construct names are also presented to the TS7700, and they need to have an equivalent definition at the library. You can define these constructs in advance on the TS7700 MI. For more information, see “Defining TS7700 constructs” on page 555. If a construct name is sent to the TS7700 without having been predefined there, the TS7700 creates the construct with default parameters.
 
Tip: Predefine your SMS constructs on the TS7700. The constructs that are created automatically might not be suitable for your requirements.
Storage Class in SMS
SCs perform three functions: they determine whether data is SMS-managed, the level of performance that is provided for a data set, and whether you can override SMS and place data on specific volumes.
Storage Class in TS7700
The SC in TS7700 is used to set the cache preferences for the logical volume. This definition is cluster-based.
Storage Group in SMS
SGs are the fundamental concept of DFSMS. DFSMS groups disks together into storage pools, so you allocate by storage pool. Storage pools can also consist of tape volumes. This enables SMS to direct tape allocations to a VTS or automated library. For tape SGs, one or more tape libraries can be associated with them.
Connectivity is defined at both the library level and the SG level. If an SG is connected to certain systems, any libraries that are associated with that SG must be connected to the same systems. You can direct allocations to a local or remote library, or to a specific library by assigning the appropriate SG in the SG ACS routine.
Storage Group in TS7700
The SG in the TS7700 is used to map the logical volume to a physical pool and to the primary pool number. This definition is cluster-based.
Management Class in SMS
MCs are used to determine backup and migration requirements. When assigned to data sets, MCs replace and expand attributes that otherwise are specified on job control language (JCL) data definition (DD) statements, IDCAMS DEFINE commands, and DFSMS Hierarchical Storage Manager (DFSMShsm) commands. An MC is a list of data set migration, backup, and retention attribute values. An MC also includes object expiration criteria, object backup requirements, and class transition criteria for the management of objects.
Management Class in TS7700
From the TS7700 side, the MC is used for functions, such as Copy Policy, Selective Dual Copy Pool (depending on the physical pool, this function might be used for Copy Export), Retain Copy Mode, and Scratch Mount Candidate for Scratch Allocation assistance. This definition is cluster-based.
Data Class in SMS
The DATACLAS construct defines what a file looks like. The Data Class ACS routine is always started, even if a file is not SMS-managed. A Data Class is only ever assigned when a file is created and cannot be changed. A file is described by its data set organization, its record format, its record length, its space allocation, how many volumes it can span, its data compaction, its media type, and its recording information.
Data Class TS7700
The DATACLAS in the TS7700 is used to define the virtual volume size, and whether the volume must be treated as an LWORM volume. This definition is shared across the grid: if you define it on one cluster, it is propagated to all other clusters in the grid. In R4.1.2, new compression optimization options are supported, along with the 3490 block handling counters.
 
Important: The DATACLAS assignment is applied to all clusters in a grid when a volume is written from beginning of tape. Given that SG, SC, and MC can be unique per cluster, they are independently recognized at each cluster location for each mount/demount sequence.
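The outboard roles of the four construct names, as described above, can be summarized in a short sketch (Python; the condensed wording is illustrative only):
    # Outboard roles of the four SMS construct names when they arrive at the TS7700.
    TS7700_CONSTRUCT_ROLES = {
        "Storage Class":    "cache preference for the logical volume (cluster-based)",
        "Storage Group":    "maps the logical volume to a primary physical pool (cluster-based)",
        "Management Class": "copy policy, dual copy/Copy Export, Retain Copy Mode, SAA candidates (cluster-based)",
        "Data Class":       "virtual volume size, LWORM, compression options (shared across the grid)",
    }

    for construct, role in TS7700_CONSTRUCT_ROLES.items():
        print(f"{construct:16s} -> {role}")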
Host commands
Several commands to control and monitor your environment are available. They are described in detail in Chapter 6, “IBM TS7700 implementation” on page 225, Chapter 8, “Migration” on page 299, Chapter 9, “Operation” on page 339, and Appendix F, “Library Manager volume categories” on page 877. These major commands are available:
D SMS,LIB Display library information for composite and distributed libraries.
D SMS,VOLUME Display volume information for logical volumes.
LI REQ The LIBRARY REQUEST command, also known as the Host Console Request function, is initiated from a z/OS host system to a TS7700 composite library or a specific distributed TS7700 library within a grid. Use the LIBRARY REQUEST command to request information that is related to the current operational state of the TS7700, its logical and physical volumes, and its physical resources.
The command can also be used to run outboard operations at the library, especially setting alerting thresholds. Because all keyword combinations are passed to the TS7700 and all responses are text-based, the LIBRARY REQUEST command is a primary means of adding management features with each TS7700 release without requiring host software changes.
The LIBRARY REQUEST command can be issued from the MI for TS7700 clusters that are running R3.2 or later. When settings are changed, the TS7700 behavior can change for all of the hosts that use the TS7700, which you need to consider when changing settings by using the LI REQ command. For more information, see the white paper found on the following website: http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
DS QLIB Use the DEVICE SERVICES QUERY LIBRARY command to display library and device-related information for the composite and distributed libraries.
There is a subtle but important difference to understand: the DS QLIB command can return different data depending on the host from which it is entered. An LI command returns the same data regardless of the host, provided that both hosts have full accessibility.
Automation handling and messages
R4.1.2 introduces the ability to influence the following settings on the MI for each message independently:
The severity of the message
Whether the message is presented to the notification channels
To which notification channels (Host, SNMP, RSYSLOG server) the message is sent
Whether additional information is sent to the notification channels
The changes to the event notification settings are grid-wide and persist when new microcode levels are installed.
In addition, you can back up these settings and restore them on other grids to ease the management and maintenance of a multi-grid environment.
For information about content-based retrieval (CBRxxxx) messages, see the TS7700 Series Operator Informational Messages white paper.
Tools
Many helpful tools are provided for the TS7700. For more information, see Chapter 9, “Operation” on page 339.
Remote System Log Processing Support (RSYSLOG)
Because security is becoming a critical aspect for most customers, a new audit trail is introduced with R4.1.2. All events and messages can now be sent to one or two different RSYSLOG servers. RSYSLOG is an open standard utility that uses TCP for the message transport.
A new MI window is provided to configure the TCP port and IP address of an RSYSLOG server to which the TS7700 sends its events.
Call Home support
The Call Home function automatically generates a service alert when a problem is detected within the subsystem, such as a problem in the following components:
Inside the TS7700 components themselves
In the associated TS3500 or TS4500 library and tape drives
In the cache disk subsystem
Status information is transmitted to the IBM Support Center for problem evaluation. An IBM Service Support Representative (IBM SSR) can be dispatched to the installation site if maintenance is required. Call Home is part of the service strategy that is adopted in the TS7700 family. It is also used in a broad range of tape products, including VTS models and tape controllers, such as the IBM System Storage® 3592-C07.
The Call Home information for the problem is transmitted with the appropriate information to the IBM product support group. This data includes the following information:
Overall system information, such as system serial number and Licensed Internal Code level
Details of the error
Error logs that can help to resolve the problem
After the Call Home is received by the assigned IBM support group, the associated information is examined and interpreted. Following analysis, an appropriate course of action is defined to resolve the problem. For instance, an IBM SSR might be sent to the site location to take the corrective actions. Alternatively, the problem might be repaired or resolved remotely by IBM support personnel through a broadband (if available) or telephone (if necessary) connection.
The TS3000 Total Storage System Console (TSSC) is the subsystem component responsible for placing the service call or Call Home when necessary. Since model 93p and release TSSC V4.7, only broadband connection is supported.
2.2 Stand-alone cluster: Components, functions, and features
In general, any cluster can be used as a stand-alone cluster. The TS7700 has several internal characteristics for High Availability (DDP or RAID 6 protection, dual power supplies, and so forth). However, a grid configuration can be configured for both additional HA and DR functions with different levels of business continuance. See Chapter 3, “IBM TS7700 usage considerations” on page 111.
Next, general information is provided about the components, functions, and features that are used in a TS7700 environment. The general concepts and information in this section also apply to multi-cluster grids. Only deviations and additional information for multi-cluster grids are described in 2.3, “Multi-cluster grid configurations: Components, functions, and features” on page 61.
2.2.1 Views from the Host: Library IDs
All host interaction with tape data in a TS7700 is through virtual volumes and virtual tape drives.
You must be able to identify the logical entity that represents the virtual drives and volumes, but also address the single entity of a physical cluster. Therefore, two types of libraries exist, a composite library and a distributed library. Each type is associated with a library name and a Library ID.
Composite library
The composite library is the logical image of the stand-alone cluster or grid that is presented to the host. All logical volumes and virtual drives are associated with the composite library. In a stand-alone TS7700, the host sees a logical tape library with up to 31 3490E tape CUs. These CUs each have 16 IBM 3490E tape drives, and are connected through 1 - 8 FICON channels. The composite library is defined through the Interactive Storage Management Facility (ISMF). A composite library is made up of one or more distributed libraries.
Figure 2-5 illustrates the host view of a stand-alone cluster configuration.
Figure 2-5 TS7700 stand-alone cluster configuration
Distributed library
Each cluster in a grid is a distributed library, which consists of a TS7700. In a TS7700T or TS7740, it is also attached to a physical TS3500 or TS4500 tape library. At the host, the distributed library is also defined to SMS. It is defined by using the existing ISMF windows, and has no tape devices defined. The virtual tape devices are defined to the composite library only.
A distributed library consists of the following cluster hardware components:
A virtualization engine
A TS7700 TVC
A 3952-F05 frame or 3952-F06 frame
Attachment to a physical library (TS7700T or TS7740)
Several physical tape drives (TS7700T or TS7740)
 
Important: A composite library ID must be defined both for a multi-cluster grid and a stand-alone cluster. For a stand-alone cluster, the composite library ID must not be the same as the distributed library ID. For a multiple grid configuration, the composite library ID must differ from any of the unique distributed library IDs. Both the composite library ID and distributed library ID are five-digit hexadecimal strings.
The Library ID is used to tie the host’s definition of the library to the actual hardware.
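A minimal illustration of these naming rules (Python; the sample library IDs are made up):
    import string

    # Library ID sanity checks per the note above: five hexadecimal digits, and the
    # composite library ID must differ from every distributed library ID.
    def valid_library_id(library_id: str) -> bool:
        return len(library_id) == 5 and all(c in string.hexdigits for c in library_id)

    def valid_grid_ids(composite_id: str, distributed_ids: list) -> bool:
        return (valid_library_id(composite_id)
                and all(valid_library_id(d) for d in distributed_ids)
                and composite_id not in distributed_ids)

    print(valid_grid_ids("C0001", ["D0001", "D0002"]))  # True
    print(valid_grid_ids("D0001", ["D0001", "D0002"]))  # False: IDs must differ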
2.2.2 Tape Volume Cache
The TS7700 TVC is a disk buffer that receives all emulated tape write data, and provides all emulated tape read data.
The host operating system (OS) sees the TVC as virtual IBM 3490E Tape Drives, and the 3490 tape volumes are represented by storage space in a fault-tolerant disk subsystem. The host never writes directly to the physical tape drives attached to a TS7740 or TS7700T.
Originally, the TS7760 was delivered with 4 TB disk drive support. Since August 2017, only 8 TB disk drives are available. You can mix 4 TB and 8 TB drives within a frame, but you cannot mix them within a drawer or enclosure.
The following fault-tolerant TVC options are available. The TS7760 CSA/XSA cache is protected by Dynamic Disk Pooling (DDP). For TS7740 configurations that use CC6, CC7, or CC8 technology, the TVC is protected with RAID 5. For all TS7720D, TS7720T, or TS7740 configurations that use CC9 technology, RAID 6 is used.
DDP on the TS7760 models provides not only a higher protection level, but also allows faster rebuild times. A DDP is built from either one or two drawers. In a single-drawer DDP, the data can be re-created when up to two disks in the DDP become unavailable. In a two-drawer DDP configuration, up to four disks can become unavailable and the data can still be re-created, but only two disks can be rebuilt at the same time.
Whether a DDP is built from a single drawer or from two drawers depends on your configuration. Even numbers of drawers are always bound into two-drawer DDPs. If an odd number of drawers is installed, the last drawer in the frame is configured as a single-drawer DDP.
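The drawer grouping rule can be illustrated with a short sketch (Python; conceptual only):
    # DDP grouping: drawers are paired into two-drawer DDPs; an odd final drawer
    # becomes a single-drawer DDP.
    def ddp_layout(drawer_count: int) -> list:
        layout = ["two-drawer DDP"] * (drawer_count // 2)
        if drawer_count % 2:
            layout.append("single-drawer DDP")
        return layout

    print(ddp_layout(4))   # ['two-drawer DDP', 'two-drawer DDP']
    print(ddp_layout(5))   # ['two-drawer DDP', 'two-drawer DDP', 'single-drawer DDP']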
The DDP does not use a global spare concept anymore. Instead, it provides free space on each of the 12 DDMs in a CSA/XSA drawer. In case of a DDM failure, the data is read from all remaining DDMs and rewritten into the free space on those remaining DDMs. This procedure is called reconstruction.
If a second DDM fails during the reconstruction, the drawer pauses the first reconstruction and starts a “Critical reconstruction.” This process allows much faster rebuild times. After the Critical reconstruction is finished, the paused reconstruction resumes.
Unlike in a RAID-protected system, the data is not copied back to the replacement DDM after the failing DDM is replaced. Instead, newly arriving data is used to rebalance the usage of the DDMs. This behavior uses fewer internal resources and allows a faster return to normal processing.
For older cache models, the RAID configurations provide continuous data availability to users. If up to one data disk (RAID 5) or up to two data disks (RAID 6) in a RAID group become unavailable, the user data can be re-created dynamically from the remaining disks by using parity data that is provided by the RAID implementation. The RAID groups contain global hot spare disks to take the place of a failed hard disk drive (HDD).
Using parity, the RAID controller rebuilds the data from the failed disk onto the hot spare as a background task. This process enables the TS7700 to continue working while the IBM SSR replaces the failed HDD in the TS7700 Cache Controller or Cache Drawer.
The TS7720T and the TS7760T support cache partitions. Virtual volumes in the resident-only partition (CP0) are treated in the same manner as volumes in a TS7720. Virtual volumes in the tape-attached partitions (CP1 - CP7) are treated in the same manner as volumes in a TS7740. For a detailed description of cache partitions, see 2.1.6, “Introduction of the TS7700T” on page 22.
2.2.3 Virtual volumes and logical volumes
Tape volumes that are created and accessed through the TS7700 virtual devices are referred to as logical volumes or virtual volumes. Either name can be used interchangeably. Logical volumes are objects that are in the TVC. They can optionally replicate to peer locations and also offload to back-end physical tape.
Each logical volume, like a real volume, has the following characteristics:
Has a unique volume serial number (VOLSER) known to the host and to the TS7700.
Is loaded and unloaded on a virtual device.
Supports all tape write modes, including Tape Write Immediate mode.
Contains all standard tape marks and data blocks.
Supports an IBM, International Organization for Standardization (ISO), or American National Standards Institute (ANSI) standard label.
Prior to R3.2, Non-Initialized tapes or scratch mounts required that the tape be written from beginning of tape (BOT) for the first write. Appends could then occur at any legal position.
With R3.2 and later, Non-Initialized auto-labeled tapes allow the first write to occur at any position between BOT and just after the first tape mark after the volume label.
The application is notified that the write operation is complete when the data is written to a buffer in vNode. The buffer is implicitly or explicitly synchronized with the TVC during operation. Tape Write Immediate mode suppresses write data buffering.
Each host-written record has a logical block ID.
The end of volume is signaled when the total number of bytes written into the TVC after compression reaches one of the following limits:
 – 400 mebibytes (MiB) for an emulated cartridge system tape (CST).
 – 800 MiB for an emulated enhanced capacity cartridge system tape (ECCST) volume.
 – 1000, 2000, 4000, 6000, or 25,000 MiB using the larger volume size options that are assigned by DC.
The default logical volume sizes of 400 MiB or 800 MiB are defined at insert time. These volume sizes can be overwritten at every individual scratch mount, or any private mount where a write from BOT occurs, by using a DC construct option.
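The end-of-volume check can be illustrated conceptually as follows (Python; illustrative only, using the post-compression limits listed above):
    # End of volume is signaled when post-compression bytes reach the size that the
    # Data Class assigned to the volume (default 400 MiB CST or 800 MiB ECCST).
    MIB = 1024 * 1024
    VOLUME_SIZES_MIB = (400, 800, 1000, 2000, 4000, 6000, 25000)

    def end_of_volume(compressed_bytes_written: int, assigned_size_mib: int) -> bool:
        assert assigned_size_mib in VOLUME_SIZES_MIB
        return compressed_bytes_written >= assigned_size_mib * MIB

    print(end_of_volume(300 * MIB, 400))   # False: still room on an emulated CST volume
    print(end_of_volume(820 * MIB, 800))   # True: ECCST-sized volume is full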
Virtual volumes can exist only in a TS7700. You can direct data to a virtual tape library by assigning a system-managed tape SG through the ACS routines. SMS passes DC, MC, SC, and SG names to the TS7700 as part of the mount operation. The TS7700 uses these constructs outboard to further manage the volume. This process uses the same policy management constructs defined through the ACS routines.
Beginning with TS7700 R2.0, a maximum of 2,000,000 virtual volumes per stand-alone cluster or multi-cluster grid was supported. With the model V07/VEB server at R3.0, followed by the model VEC server at R4.0, a maximum of 4,000,000 virtual volumes per stand-alone cluster or multi-cluster grid is supported.
The default maximum number of supported logical volumes is still 1,000,000 per grid. Support for extra logical volumes can be added in increments of 200,000 volumes by using FC5270. Larger capacity volumes (beyond 400 MiB and 800 MiB) can be defined through DC and associated with CST (MEDIA1) or ECCST (MEDIA2) emulated media.
The VOLSERs for the logical volumes are defined through the MI when inserted. Virtual volumes go through the same cartridge entry processing as native cartridges inserted into a tape library that is attached directly to an IBM Z host.
After virtual volumes are inserted through the MI, they are placed in the insert category and handled exactly like native cartridges. When the TS7700 is varied online to a host, or after an insert event occurs, the host operating system interacts with the library by using the object access method (OAM).
Depending on the definitions in the DEVSUPxx and EDGRMMxx parmlib members, the host operating system assigns newly inserted volumes to a particular scratch category. The host system requests a particular category when it needs scratch tapes, and the TS7700 knows which group of volumes to use to satisfy the scratch request.
2.2.4 Logical volumes and compression
Two new compression algorithms are available with R4.1.2. Which compression algorithm will be used is based on the definition in the Data Class. This definition allows you to select the most suitable compression algorithm for different workloads.
Before R4.1.2, compression was based only on an IBMLZ1 algorithm within the FICON channel adapter in a TS7700. Two additional TS7700 CPU-based algorithms can now be selected:
LZ4
ZSTD
The differences are the delivered compression ratio and the required TS7700 CPU consumption. ZSTD can generally achieve a higher compression ratio than LZ4, but uses more CPU.
To use the new compression algorithms, all clusters in a grid must have R4.1.2 or later microcode. This microcode can also be installed on TS7740 (V07) and TS7720 (VEB) if 32 GB main memory is configured. Note that, especially with ZSTD compression, there might be performance considerations to the TS7700 throughput on heavily loaded existing workloads running on older hardware. VEB/V07 clients should test how the new algorithms work on their TS7700 configuration with small workloads before putting them into production. To avoid any negative effect, analyze the VEHSTATS performance reports in advance. For most V07 and VEB installations, LZ4 provides a good compromise to reach a higher compression ratio that does not exhaust the CPU power in TS7700 models.
Using the new compression algorithms will have a positive impact on these areas:
Cache resources (cache bandwidth and cache space) required
Grid link bandwidth
Physical tape resources
Premigration queue length (FC 5274)
It can also have a positive impact on your recovery point objective, depending on your configuration.
The actual host data that is stored on a virtual CST or ECCST volume is displayed by the LI REQ commands and in the MI. Depending on the selected logical volume size (400 MB to 25 GB), the uncompressed size varies between 1200 MiB and 75,000 MiB (assuming a 3:1 compression ratio).
2.2.5 Mounting a scratch virtual volume
When a request for a scratch mount is sent to the TS7700, the request specifies a mount category. The TS7700 selects a virtual VOLSER from the candidate list of scratch volumes in that category.
Scratch volumes at the mounting cluster are chosen by using the following priority order:
1. All volumes in the source or alternative source category that are owned by the local cluster, not currently mounted, and do not have pending reconciliation changes against a peer cluster
2. All volumes in the source or alternative source category that are owned by any available cluster, not currently mounted, and do not have pending reconciliation changes against a peer cluster
3. All volumes in the source or alternative source category that are owned by any available cluster and not currently mounted
4. All volumes in the source or alternative source category that can be taken over from an unavailable cluster that has an explicit or implied takeover mode enabled
The first volumes that are chosen in the preceding steps are the volumes that have been in the source category the longest. Volume serials are also toggled between odd and even serials for each volume selection.
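A conceptual rendering of this selection order (Python; the attribute names are assumptions, and the odd/even VOLSER toggling is omitted):
    # Conceptual sketch of the scratch selection priority (not actual TS7700 code).
    def scratch_priority(vol: dict) -> int:
        if vol["mounted"]:
            return 99                                  # mounted volumes are never chosen
        if vol["owned_by_local"] and not vol["pending_reconciliation"]:
            return 1
        if vol["owner_available"] and not vol["pending_reconciliation"]:
            return 2
        if vol["owner_available"]:
            return 3
        if vol["takeover_enabled"]:
            return 4
        return 99

    def pick_scratch(candidates: list) -> dict:
        # Within the best priority tier, the volume that has been in the source
        # category the longest is chosen first.
        eligible = [v for v in candidates if scratch_priority(v) < 99]
        return min(eligible, key=lambda v: (scratch_priority(v), -v["time_in_category"]))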
For all scratch mounts, the volume is temporarily initialized as though the volume was initialized by using the EDGINERS or IEHINITT program. The volume has an IBM-standard label that consists of a VOL1 record, an HDR1 record, and a tape mark.
If the volume is modified, the temporary header information is applied to a file in the TVC. If the volume is not modified, the temporary header information is discarded, and any previously written content (if it exists) is not modified. In addition to choosing a volume, TVC selection processing is used to choose which TVC acts as the input/output (I/O) TVC, as described in 2.3.4, “I/O TVC selection” on page 67.
 
Important: In Release 3.0 or later of the TS7700, all categories that are defined as scratch inherit the Fast Ready attribute. There is no longer a need to use the MI to set the Fast Ready attribute to scratch categories. However, the MI is still needed to indicate which categories are scratch.
When the Fast Ready attribute is set or implied, no recall of content from physical tape is required in a TS7740 or TS7700T. No mechanical operation is required to mount a logical scratch volume. In addition, the volume’s current consistency is ignored because a scratch mount requires a write from BOT.
The TS7700 with SAA function activated uses policy management with z/OS host software to direct scratch allocations to specific clusters within a multi-cluster grid.
2.2.6 Mounting a specific virtual volume
In a stand-alone environment, the mount is directed to the virtual drives of this cluster. In a grid environment, specific mounts are more advanced. See 2.3.12, “Mounting a specific virtual volume” on page 74.
In the stand-alone environment, the following scenarios are possible:
1. There is a valid copy in the TVC. In this case, the mount is signaled as complete and the host can access the data immediately.
2. There is no valid copy in the TVC. In this case, there are further options:
a. If it is a TS7760D, TS7720D, or TS7700T CP0, the mount fails.
b. If it is a TS7740 or TS7700T CP1 - CP7, and the volume is on back-end physical tape, the virtual volume is recalled from a stacked volume. Mount completion is signaled to the host system only after the entire volume is available in the TVC.
The recalled virtual volume remains in the TVC until it becomes the least recently used (LRU) volume, unless the volume was assigned a Preference Group of 0 or the Recalls Preferred to be Removed from Cache override is enabled by using the TS7700 Library Request command.
If the mounted virtual volume was modified, the volume is again premigrated.
If modification of the virtual volume did not occur when it was mounted, the TS7740 or TS7700T does not schedule another copy operation, and the current copy of the logical volume on the original stacked volume remains active. Furthermore, copies to remote TS7700 clusters in a grid configuration are not required if modifications were not made. If the primary or secondary pool location has changed, it is recognized now, and one or two new copies to tape are queued for premigration.
In a z/OS environment, to mount a specific volume in the TS7700, that volume must be in a private category within the library. The tape management system (TMS) prevents a scratch volume from being mounted in response to a specific mount request. Also, the TS7700 treats any specific mount that targets a volume in a category that is configured through the MI as scratch (Fast Ready) as a host scratch mount. In Release 3.0 or later of the TS7700, all scratch categories are Fast Ready. If this occurs, the temporary tape header is created, and no recall takes place.
In this case, DFSMS Removable Media Manager (DFSMSrmm) or other TMS fails the mount operation, because the expected last written data set for the private volume was not found. Because no write operation occurs, the original volume’s contents are left intact, which accounts for categories that are incorrectly configured as scratch (Fast Ready) within the MI.
2.2.7 Logical WORM support and characteristics
The TS7700 supports the LWORM function through TS7700 software emulation. The host views the TS7700 as an LWORM-compliant library that contains WORM-compliant 3490E logical drives and media.
The LWORM implementation of the TS7700 emulates physical WORM tape drives and media. TS7700 provides the following functions:
Provides an advanced function DC construct property that enables volumes to be assigned as LWORM-compliant during the volume’s first mount, where a write operation from BOT is required, or during a volume’s reuse from scratch, where a write from BOT is required
Generates, during the assignment of LWORM to a volume’s characteristics, a temporary worldwide identifier that is surfaced to host software during host software open and close processing, and then bound to the volume during the first write from BOT
Generates and maintains a persistent Write-Mount Count for each LWORM volume, and keeps the value synchronized with host software
Enables only appends to LWORM volumes by using physical WORM append guidelines
Provides a mechanism through which host software commands can discover LWORM attributes for a given mounted volume
No method is available to convert previously written volumes to LWORM volumes without having to read the contents and rewrite them to a new logical volume that has been bound as an LWORM volume.
TS7700 reporting volumes (BVIR) cannot be written in LWORM format. For more information, see 11.14.1, “Overview of the BVIR function” on page 679.
 
Clarification: Cohasset Associates, Inc. has assessed the LWORM capability of the TS7700. The conclusion is that the TS7700 meets all US Securities and Exchange Commission (SEC) requirements in Rule 17a-4(f), which expressly enables records to be retained on electronic storage media.
2.2.8 Virtual drives
From a host perspective, each TS7700 appears as 16 logical IBM 3490E tape CUs. With R3.2, up to 31 logical control units (LCUs) can be defined, providing up to 496 drives. Each CU has 16 unique drives that are attached through FICON channels. Virtual tape drives and CUs are defined just like physical IBM 3490 systems through the hardware configuration definition (HCD). Defining a preferred path for the virtual drives gives you no benefit.
Each virtual drive has the following characteristics of physical tape drives:
Uses host device addressing
Is included in the I/O generation for the system
Is varied online or offline to the host
Signals when a virtual volume is loaded
Responds and processes all IBM 3490E I/O commands
Becomes not ready when a virtual volume is rewound and unloaded
Supports manual stand-alone mount processing for host initial program load (IPL) when initiated from the MI
For software transparency reasons, the functions of the 3490E integrated cartridge loader (ICL) are also included in the virtual drive’s capability. All virtual drives indicate that they have an ICL. For scratch mounts, using the emulated ICL in the TS7700 to preinstall virtual cartridges is of no benefit.
With FC 5275, you can add one LCU (16 drives) at a time, up to the maximum of 496 logical drives per cluster.
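The device counts follow directly from the arithmetic of 16 drives per LCU and the base configuration of 16 CUs. The following Python fragment only restates that arithmetic; the variable names are illustrative.

```python
DRIVES_PER_LCU = 16
BASE_LCUS = 16               # base configuration: 16 CUs = 256 drives
MAX_DRIVES = 496             # maximum logical drives per cluster

base_drives = BASE_LCUS * DRIVES_PER_LCU        # 256
max_lcus = MAX_DRIVES // DRIVES_PER_LCU         # 31 LCUs in total
fc5275_increments = max_lcus - BASE_LCUS        # 15 additional LCUs of 16 drives each

print(base_drives, max_lcus, fc5275_increments)  # 256 31 15
```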
 
Note: 8 Gigabit (Gb) and 16 Gigabit (Gb) FICON adapters must be installed in a cluster before these additional devices can be defined. Existing configurations with 4 Gb FICON adapters do not support these additional devices.
2.2.9 Selective Device Access Control
Due to the expanding capacity and throughput characteristics of the TS7700, there is an increased need for multiple system plexes or tenants that share a common TS7700 or TS7700 grid. Selective Device Access Control (SDAC) meets this need by enabling a secure method of hard partitioning. The primary intent of this function is to prevent one host logical partition (LPAR) or sysplex with an independent TMS from inadvertently modifying or removing data that is owned by another host.
This is valuable in a setup where you have a production system and a test system with different security settings on the hosts, and you want to separate the access to the grid in a more secure way. It can also be used in a multi-tenant service provider to prevent tenants from accessing each other’s data, or when you have different IBM Z operating systems that share the TS7700, such as z/OS, IBM z/VSE, IBM z/Transaction Processing Facility (IBM z/TPF), and IBM z/VM.
Hard partitioning is a way to give a fixed number of LCUs to a defined host group, and connect the units to a range of logical volumes that are dedicated to a particular host or hosts. SDAC is a useful function when multiple partitions have the following characteristics:
Separate volume ranges
Separate TMS
Separate tape configuration database
SDAC enables you to define a subset of all of the logical devices per host (CUs in ranges of 16 devices based on the LIBPORT definitions in HCD). It enables exclusive control on host-initiated mounts, ejects, and attribute or category changes. The implementation of SDAC is described in Appendix I, “Case study for logical partitioning of a two-cluster grid” on page 919.
Implementing SDAC requires planning and orchestration with other system areas, to map the wanted access for the device ranges from individual servers or LPARs, and consolidate this information in a coherent input/output definition file (IODF) or HCD. From the TS7700 subsystem standpoint, SDAC definitions are set up using the TS7700 MI.
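Conceptually, SDAC maps groups of devices (in 16-device increments, by LIBPORT ID) to the volume ranges that a host group is allowed to act on. The following Python sketch illustrates that idea only; the group names, LIBPORT ranges, and VOLSER ranges are invented examples, not an actual SDAC definition.

```python
# Hypothetical sketch of SDAC hard partitioning: device ranges mapped to
# the volume ranges that the owning host group may act on.
SDAC_GROUPS = {
    "PRODPLEX": {"libports": range(0x01, 0x11), "volsers": ("A00000", "A99999")},
    "TESTPLEX": {"libports": range(0x11, 0x21), "volsers": ("T00000", "T99999")},
}

def access_allowed(libport, volser):
    """Allow mounts, ejects, and attribute or category changes only inside the owning partition."""
    for group in SDAC_GROUPS.values():
        if libport in group["libports"]:
            low, high = group["volsers"]
            return low <= volser <= high
    return False

print(access_allowed(0x05, "A12345"))   # True  - production devices, production volumes
print(access_allowed(0x05, "T12345"))   # False - production devices cannot touch test volumes
```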
 
Important: SDAC is based on the availability of LIBPORT definitions or another equivalent way to define device ranges and administratively protect those assignments. Device partitions must be defined on 16 device boundaries to be compatible with SDAC.
2.2.10 Physical drives
The physical tape drives used by a TS7740 or TS7700T are installed in an IBM TS3500 or IBM TS4500 tape library. The physical tape drives are not addressable by any attached host system, and are controlled by the TS7740 or TS7700T. The TS7740 and TS7700T support TS1150, TS1140, TS1130, TS1120, and IBM 3592-J1A physical tape drives installed in an IBM TS3500 tape library. If an IBM TS4500 tape library is used, only TS1140 and TS1150 are supported.
 
Remember: Do not change the assignment of physical tape drives attached to a TS7740 or TS7700T in the IBM TS3500 or IBM TS4500 Tape Library web interface. Consult your IBM SSR for configuration changes.
Before Release 3.3, all attached physical drives had to be homogeneous. With Release 3.3, support was added for the use of a mix between the TS1150 and one other tape drive generation. This is called heterogeneous tape drive support and is for migration purposes only. Although the TS1150 does not support JA and JB cartridges, it might be necessary to read the existing data with a tape drive from the previous generation, and then write the data with the TS1150 to a JC or JD cartridge. No new data can be placed on the existing JA and JB cartridges using the heterogeneous support. This is referred to as sunset media.
To support the heterogeneous tape drives, additional controls were introduced to handle the reclaim value for sunset media differently from the rest of the tape media. Also, two more SETTING ALERTS were introduced to allow the monitoring of the sunset drives.
2.2.11 Stacked volume
Physical cartridges that are used by the TS7740 and TS7700T to store logical volumes are under the control of the TS7740 or TS7700T node. The physical cartridges are not known to the hosts. Physical volumes are called stacked volumes. Stacked volumes must have unique, system-readable VOLSERs and external labels like any other cartridges in a tape library.
 
Tip: Stacked volumes do not need to be initialized before inserting them into the TS3500 or TS4500. However, the internal VOL1 labels must match the external labels if they were previously initialized or used.
After the host closes and unloads a virtual volume, the storage management software inside the TS7740 or TS7700T schedules the virtual volume to be copied (also known as premigration) onto one or more physical tape cartridges. The TS7740 or TS7700T attempts to keep the number of filling stacked volumes to which virtual volumes are copied to a minimum.
Therefore, mount activity is reduced because a minimal number of physical cartridges are mounted to service multiple virtual volume premigration requests that target the same physical volume pool. How many physical cartridges for premigration per pool can be mounted in parallel is defined within the MI as part of the pool property definitions. Virtual volumes are already compressed and are written in that compressed format to the stacked volume. This procedure maximizes the use of a cartridge’s storage capacity.
A logical volume that cannot fit in the currently filling stacked volume does not span across two or more physical cartridges. Instead, the stacked volume is marked full, and the logical volume is written on another stacked volume from the assigned pool.
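The stacking and no-spanning rules can be illustrated with a short sketch. The following Python fragment is a simplified model (the capacities and volume sizes are invented); it is not the TS7740/TS7700T premigration code.

```python
# Simplified sketch of the stacking rule described above: a compressed logical
# volume never spans stacked volumes; if it does not fit, the filling cartridge
# is marked full and another scratch cartridge from the pool is used.

def stack(logical_volumes, cartridge_capacity_gib):
    filling = {"used": 0, "volumes": []}
    full = []
    for name, size in logical_volumes:            # size = compressed GiB
        if filling["used"] + size > cartridge_capacity_gib:
            full.append(filling)                  # mark the current cartridge full
            filling = {"used": 0, "volumes": []}  # take a new scratch cartridge from the pool
        filling["volumes"].append(name)
        filling["used"] += size
    return full + [filling]

for cart in stack([("V00001", 2.4), ("V00002", 1.1), ("V00003", 4.0)], cartridge_capacity_gib=7):
    print(cart)
```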
Due to business reasons, it might be necessary to separate logical volumes from each other (selective dual write, multi-client environment, or encryption requirements). Therefore, you can influence the location of the data by using volume pooling. For more information, see “Using physical volume pools” on page 51.
Through the TS3500 or TS4500 web interface, physical cartridge ranges should be assigned to the library partition that is associated with your TS7700. This enables them to become visible to the correct TS7700. The TS7700 MI must further be used to define the pool to which physical tapes are assigned when they are initially inserted into the TS3500 or TS4500, which includes the common scratch pool. How physical tapes can move between pools for scratch management is also defined by using the MI.
2.2.12 Selective Dual Copy function
In a TS7740 or TS7700T, a logical volume and its internal data usually exist as a single entity that is copied to a single stacked volume. If the stacked volume is damaged, you can lose access to the data within one or more logical volumes that are contained on the damaged physical tape. The TS7700 provides a method to create redundant copies on independent physical tapes to help reduce the risk of such a loss.
With the Selective Dual Copy function, storage administrators can selectively create two copies of logical volumes within two pools of a TS7740 or TS7700T. The Selective Dual Copy function can be used with the Copy Export function to provide a secondary offsite physical copy for DR purposes. For more information about Copy Export, see 2.2.26, “Copy Export function” on page 56.
The second copy of the logical volume is created in a separate physical pool to ensure physical cartridge separation. Control of Dual Copy is through the MC construct (see “Management Classes window” on page 456). The second copy is created when the original volume is pre-migrated.
 
Important: When used for Copy Export, ensure that reclamation in the secondary physical volume pool is self-contained (the secondary volume pool reclaims onto itself) to keep secondary pool cartridges isolated from the others. Otherwise, Copy Export DR capabilities might be compromised.
The second copy that is created through the Selective Dual Copy function is only available when the primary volume cannot be recalled or is inaccessible. It cannot be accessed separately, and cannot be used if the primary volume is being used by another operation. The second copy provides a backup if the primary volume is damaged or inaccessible.
Selective Dual Copy is defined to the TS7740/TS7700T and has the following characteristics:
The selective dual copy feature is enabled by the MC setting through the MI where you define the secondary pool.
Secondary and primary pools can be intermixed:
 – A primary pool for one logical volume can be the secondary pool for another logical volume unless the secondary pool is used as a Copy Export pool.
 – Multiple primary pools can use the same secondary pool.
At Rewind Unload (RUN) time, the secondary pool assignment is determined, and the copy of the logical volume is scheduled. The scheduling of the backup is determined by the premigration activity occurring in the TS7740 or TS7700T.
The secondary copy is created before the logical volume is migrated to the primary pool.
2.2.13 General TVC management in a stand-alone cluster
The TS7700 cluster manages the TVC cache. Through policy settings and LI REQ settings, you can influence the behavior of the way the cluster performs these actions. You can define which data to keep longer in the TVC, and which data is preferably removed from cache.
The following topics are described in the next sections:
Rules for cache management
Short introduction of how you control the contents of cache
Description of how the TVC cache management mechanism works
Description of which TVC cache management processes exist
Rules for Cache Management
Cache Management has the following rules:
The TVC contents are managed by definitions in the SC.
In a stand-alone TS7700D or TS7700T CP0, active data always remains in the cache.
In a TS7740 or TS7700T CPx, if volumes are not in cache during a tape volume mount request, they are scheduled to be brought back into the disk cache from a physical tape device (recall).
In a TS7740 configuration, if a modified virtual volume is closed and dismounted from the host, it is scheduled to be copied to a stacked volume (premigration).
In a TS7700T CPx configuration, if a modified virtual volume is closed and dismounted from the host, it can be scheduled to be copied to a stacked volume (premigration) or kept in the delay premigration queue, depending on the SC definition. Virtual volumes in the delay premigration queue are only subject to an earlier premigration if the amount of data for this specific tape partition in the delay premigration queue is above the delay premigration threshold.
In a TS7740 or TS7700T CPx, if the TVC runs out of space, the cache management removes or migrates previously premigrated volumes. Candidates for removal from cache are selected by using an LRU algorithm that honors the PG0/PG1 definitions.
In addition, a TS7700T CPx partition can temporarily overspill into CP0 if CP0 space is available when the CPx partition has no migration candidates remaining.
The TS7700 emulates a 3490E tape of a specific size that is chosen through the DC construct. However, the space that is used in the TVC is the number of bytes of data that is written to the virtual volume after compression and after a minimal amount of TS7700 metadata is introduced. When the virtual volume is written to the physical tape, it uses only the space that is occupied by the compressed data and resulting metadata.
How you control the content of the TVC (TS7700T and TS7740)
You control the content through the SC construct. Through the MI, you can define one or more SC names. If the selected cluster possesses a physical library, you can assign Preference Level 0 or 1. If the selected cluster does not possess a physical library, volumes in that cluster’s cache display a Level 1 preference.
The following values are possible:
Use IART Volumes are removed according to the IBM TS7700’s Initial Access Response Time (IART) value that is assigned by the host during the volume’s creation. The result is either Level 0 or Level 1.
Level 0 Volumes are removed from the TVC as soon as they are copied to tape. This is called Preference Group 0 (PG0). This control is suitable for data that is unlikely to be read again.
Level 1 Copied volumes remain in the TVC until more space is required, and then volumes are removed from disk cache in a least recently used order.
In a z/OS environment, the SC name that is assigned to a volume in the ACS routine is directly passed to the TS7700 and mapped to the predefined constructs. Figure 2-6 shows this process.
Figure 2-6 TS7740 TVC management through Storage Class
If the host passes a previously undefined SC name to the TS7700 during a scratch mount request, the TS7700 adds the name by using the definitions for the default SC.
 
Define SCs: Ensure that you predefine the SCs. The default SC might not support your needs.
For environments other than z/OS (SMS), an SC can be assigned to a range of logical volumes during insert processing by using the MI. The SC can also be updated for a range of volumes after they have been inserted through the MI.
To be compatible with the IART method of setting the preference level, the SC definition also enables a Use IART selection to be assigned. Even before Outboard Policy Management was made available for the previous generation VTS, you could assign a preference level to virtual volumes by using the IART attribute of the SC. The IART is an SC attribute that was originally added to specify the wanted response time (in seconds) for an object by using the OAM.
If you want a virtual volume to remain in cache, assign an SC to the volume whose IART value is 99 seconds or less. Conversely, if you want to give a virtual volume preference to be removed from cache, assign an SC to the volume whose IART value is 100 seconds or more. Assuming that the Use IART selection is not specified, the TS7700 sets the preference level for the volume based on the Preference Level 0 or 1 of the SC assigned to the volume.
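The resolution of the preference level from the SC can be summarized as follows. The Python fragment below is an illustrative sketch only; the dictionary keys are invented for the example.

```python
# Sketch of the preference-level resolution described above.  "Use IART"
# derives the preference group from the host-assigned IART value; otherwise
# the Storage Class level (0 or 1) is used directly.
def preference_group(storage_class):
    if storage_class.get("use_iart"):
        # IART of 99 seconds or less -> keep in cache (PG1);
        # 100 seconds or more -> prefer removal from cache (PG0).
        return 1 if storage_class["iart_seconds"] <= 99 else 0
    return storage_class["preference_level"]      # explicit Level 0 or Level 1

print(preference_group({"use_iart": True, "iart_seconds": 30}))    # 1 (stay in cache)
print(preference_group({"use_iart": True, "iart_seconds": 120}))   # 0 (remove after premigration)
print(preference_group({"use_iart": False, "preference_level": 0}))
```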
2.2.14 TVC Cache management in a TS7740 stand-alone cluster
As mentioned, virtual volumes with an SC with Preference Level 0 (PG0) are deleted from cache as soon as the logical volume is premigrated. If there are no more PG0 volumes that have been copied to physical volumes to remove, the TS7740 selects Preference Level 1 (Preference Group 1 or PG1) volumes. PG1 virtual volumes stay in the TVC for as long a time as possible.
When a volume is assigned Preference Level 1, the TS7740 adds it to the queue of volumes to be copied to physical tape after a 4-minute time delay, and after any volumes are assigned to Preference Level 0. The 4-minute time delay is to prevent unnecessary copies from being performed when a volume is created, then quickly remounted, and appended to again.
When space is needed in cache, the TS7740 first determines whether there are any PG0 volumes that can be removed. If not, the TS7740 selects PG1 volumes to remove based on an LRU algorithm. This process results in volumes that have been copied to physical tape, and have been in cache the longest without access, to be removed first.
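The selection order that is described above can be sketched as follows. The Python fragment is a conceptual model of the PG0-first, then LRU-ordered PG1 selection; the volume records are invented examples.

```python
# Sketch of the free-space selection order: premigrated PG0 volumes are removed
# first; if none remain, premigrated PG1 volumes are removed in least recently
# used (LRU) order.  Volumes that are not yet premigrated cannot be removed.
def removal_candidates(cache_volumes):
    eligible = [v for v in cache_volumes if v["premigrated"]]
    pg0 = [v for v in eligible if v["pg"] == 0]
    pg1 = sorted((v for v in eligible if v["pg"] == 1), key=lambda v: v["last_access"])
    return pg0 + pg1          # oldest-unreferenced PG1 volumes leave cache first

cache = [
    {"volser": "V00001", "pg": 1, "premigrated": True,  "last_access": 100},
    {"volser": "V00002", "pg": 0, "premigrated": True,  "last_access": 500},
    {"volser": "V00003", "pg": 1, "premigrated": False, "last_access": 50},
]
print([v["volser"] for v in removal_candidates(cache)])   # ['V00002', 'V00001']
```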
Figure 2-7 shows cache usage with policy-based cache management.
Figure 2-7 TS7740 cache usage with policy-based cache management
When a preference level is assigned to a volume, that assignment persists until the volume is reused from scratch and a new preference level is assigned, or until the policy is changed and a mount/dismount occurs, at which point the new policy takes effect.
 
Important: As of R2.1, all scratch volumes, independent of their preference group assignment, are favored for migration before selecting PG0 and PG1 candidates.
Recalled logical volumes are preferred for migration
Normally, a volume recalled into cache is managed as though it were newly created or modified, because it is in the TVC selected for I/O operations on the volume. A recalled volume displaces other volumes in cache, and moves to the end of the list of PG1 candidates to migrate due to how the LRU algorithm functions. The default behavior assumes any recall of a volume into the TVC might follow with additional host access.
However, there might be use cases where volumes recalled into cache are known to be accessed only once and should be removed from disk cache as soon as they are read (for example, during a multi-volume data set restore). In this case, you do not want the volumes to be kept in cache, because keeping them forces other, more important cache-resident data to be migrated.
Each TS7740 and TS7700T has an LI REQ setting that determines how it handles recalled volumes. The LI REQ SETTING RECLPG0 option determines whether volumes that are recalled into cache are forced to PG0. If forced to PG0, they are migrated promptly, freeing up space for other recalls without the need to migrate critical PG1 content.
Based on your current requirements, you can set or modify this control dynamically through the LI REQ SETTING RECLPG0 option (a sketch of the resulting behavior follows this list):
When DISABLED, which is the default, logical volumes that are recalled into cache are managed by using the actions that are defined for the SC construct associated with the volume as defined at the TS7700.
When ENABLED, logical volumes that are recalled into cache are managed as PG0 (preferable to be removed from cache). This control overrides the actions that are defined for the SC associated with the recalled volume.
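The effect of the two settings can be sketched as follows; the function and parameter names are invented for the illustration.

```python
# Sketch of the recall handling that the LI REQ SETTING RECLPG0 control selects.
def preference_after_recall(reclpg0_enabled, storage_class_pg):
    if reclpg0_enabled:
        return 0   # recalled volume forced to PG0: migrate it again as soon as possible
    return storage_class_pg   # default: honor the Storage Class assigned to the volume

print(preference_after_recall(False, storage_class_pg=1))   # 1 - stays in cache like new data
print(preference_after_recall(True,  storage_class_pg=1))   # 0 - removed soon after being read
```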
2.2.15 About TVC cache management in a TS7700D and TS7700T CP0 stand-alone cluster
In a stand-alone TS7700D configuration, virtual volumes always remain in the TVC, because no physical tape drives are attached to the TS7700D. With a TS7700T configuration, contents in the CP0 partition are also not candidates for moving to physical tape. There is no autoremoval function in a stand-alone environment.
In a TS7700D stand-alone cluster, you can influence the TVC content only with the Delete Expired and EJECT settings. No further cache management is available. For a TS7700T, the LI REQ PARTRFSH command can be used to move data between partitions.
If a TS7700D runs out of cache space or a TS7700T runs out of CP0 space, warning messages and critical messages are shown. If the TS7700D enters the Out of cache condition, it moves to a read-only state. If a TS7700T CP0 partition becomes full, it becomes read-only for workloads that target CP0.
 
Important: Monitor your cache in a TS7700D stand-alone environment to avoid an Out of Cache Resources situation.
2.2.16 TVC Cache management in a TS7700T CPx stand-alone cluster
The TS7700T Tape partitions (CPx) have the same TVC cache controls as the TS7740. Two extra features exist on the TS7700T for cache management control when compared to the TS7740. These features are described in the following sections.
Multiple tape partitions
As mentioned earlier, you can specify 1 - 7 independent tape partitions. Each tape partition has its own independent cache management. The TVC management based on PG0/PG1 and LRU management is on a tape partition level, which means that data in one tape partition has no influence on the cache management in another tape partition.
Time-delayed premigration
You can use the delay premigration to delay the premigration of data. The delay premigration is controlled by two different components:
In the partition itself, the amount of data that can be stored in the delay premigration queue is determined. The minimum is 500 gigabytes (GB). The maximum is the tape partition size minus 500 GB.
In the SC, the delay premigration settings are defined. You can choose either Volume creation or Volume last accessed as the reference point. In addition, you define the length of the grace period. The time is specified in hours, 0 - 65535. A value of 0 means that no delay premigration time is set.
You can have multiple SCs, with different delay premigration definitions, pointing to the same tape partition. You can also have multiple SCs with different delay premigration definitions that point to different tape partitions.
Consider a situation where the amount of delayed premigration content that has not yet met its delay criteria exceeds a partition’s configured maximum delay premigration limit. In this case, the delayed content that has been present in the partition for the longest time is moved to the premigration queue proactively to maintain the configured limit. If the defined delay premigration size is too small, data is always pushed out to tape too early, which can create excess back-end tape usage and its associated activity.
One important aspect of using delay premigration is that the content that is delayed for premigration is not added to the premigration queue until its delay criteria has been met. This means that if a large amount of delayed content meets its criteria at the same time, the premigration queue can rapidly increase in size. This rapid increase can result in unexpected host throttling.
Ensure that your FC5274 feature counts can accommodate these large increases in premigration activity. Alternatively, try to ensure that multiple workloads that are delayed for premigration do not reach their criteria at the same time.
Assume that you have three different tape partitions and a unique SC for each one. The following list describes the SC definitions:
CP1: Delay premigration 12 hours after volume creation
CP2: Delay premigration 6 hours after volume creation
CP3: Delay premigration 3 hours after volume creation
In CP1 at 22:00, 6 TB are written every night. The 12-hour delay ensures that they are premigrated later in the day when there is a lower workload. To make the example simpler, we assume that no compression exists for all data.
In CP2 at 04:00, 2 TB are written. The six-hour delay makes them eligible for premigration also at 10:00 in the morning.
In CP3 at 07:00, 1 TB is written. The three-hour delay has them eligible for premigration at the same time as the other two workloads.
Therefore, all 9 TB of the workload is meeting its delay criteria at roughly the same time, producing a large increase in premigration activity. If the premigration queue size is not large enough, workloads into the TS7700T are throttled until the premigration process can reduce the queue size. Ensure that the number of FC 5274 features are suitable, or plan the delay times so that they do not all expire at the same time.
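The timing in this example can be checked with simple date arithmetic. The following Python fragment uses invented calendar dates purely to reproduce the times of day in the example.

```python
# Worked check of the example above: all three delayed workloads become
# eligible for premigration at the same time of day.
from datetime import datetime, timedelta

workloads = [
    ("CP1", datetime(2024, 1, 1, 22, 0), 12, 6.0),   # written 22:00, 12 h delay, 6 TB
    ("CP2", datetime(2024, 1, 2, 4, 0),   6, 2.0),   # written 04:00,  6 h delay, 2 TB
    ("CP3", datetime(2024, 1, 2, 7, 0),   3, 1.0),   # written 07:00,  3 h delay, 1 TB
]

for partition, created, delay_hours, terabytes in workloads:
    eligible = created + timedelta(hours=delay_hours)
    print(partition, eligible.strftime("%H:%M"), f"{terabytes} TB")
# All three print 10:00, so roughly 9 TB joins the premigration queue at once.
```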
2.2.17 Expired virtual volumes and the Delete Expired function
To remain compatible with physical tape, logical volumes that are returned to scratch, or that are expired, retain all previously written content until they are reused or written from BOT. In a virtual tape environment, the retention of this scratched content can lead to any of the following situations:
TVCs might fill up with large amounts of expired data.
Stacked volumes might retain an excessive amount of expired data.
Stacked volumes fill up with already expired data.
To help manage the expired content, the TS7700 supports a function referred to as delete expire. When enabling delete expire processing against a configured scratch category, you can set a grace period for expired volumes ranging from 1 hour to 144 weeks (the default is 24 hours). If the volume has not already been reused when the delay period has passed, the volume is marked as a candidate for auto deletion or delete expire.
When deleted, its active space in TVC is freed. If it was also stacked to one or more physical tapes, that region of physical tape is marked inactive.
The start timer for delete expire processing is set when the volume is moved to a designated scratch category, or a category with the Fast Ready attribute set, which has defined a delete expire value. If the scratch category has no delete expire value, the timer is not set.
During the delete expire process, the start timer and the delete expire value are used to determine whether the logical volume is eligible for the delete expire processing. If so, the content is deleted immediately.
If the logical volume is reused during a scratch mount before the expiration delete time expires, the existing content is immediately deleted at the time of first write.
It does not matter whether the volume is in cache or on back-end tape; after the delete expire time passes, the volume is no longer accessible without IBM SSR assistance. The default behavior is to Delete Expire up to 1000 delete-expire candidates per hour. This value can be modified by using the LI REQ command.
For more information about expired volume management, see “Defining the logical volume expiration time” on page 553. The explicit movement of a volume out of the delete expired configured category can occur before the expiration of this volume.
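The eligibility check and the hourly processing limit that are described above can be sketched as follows. The Python fragment is an illustrative model only; the timestamps and VOLSERs are invented.

```python
# Sketch of the Delete Expired behavior: the timer starts when a volume enters
# a scratch (Fast Ready) category that has a delete expire value, and deletion
# candidates are processed up to an hourly limit (1,000 per hour by default,
# tunable with the LI REQ command).
from datetime import datetime, timedelta

def delete_expire_candidates(scratch_volumes, grace_hours, now, hourly_limit=1000):
    candidates = [
        v for v in scratch_volumes
        if now - v["scratched_at"] >= timedelta(hours=grace_hours)
    ]
    return candidates[:hourly_limit]   # only this many are deleted in one pass

now = datetime(2024, 1, 3, 12, 0)      # illustrative timestamp
vols = [
    {"volser": "S00001", "scratched_at": datetime(2024, 1, 1, 9, 0)},   # eligible
    {"volser": "S00002", "scratched_at": datetime(2024, 1, 3, 2, 0)},   # still in grace period
]
print([v["volser"] for v in delete_expire_candidates(vols, grace_hours=24, now=now)])
```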
 
Important: Disregarding the Delete Expired Volumes setting can lead to an out-of-cache state in a TS7700D. With a TS7740 or TS7700T, it can cause excessive tape usage. In an extreme condition, it can cause an out-of-physical scratch state.
The disadvantage of not having this option enabled is that scratched volumes needlessly use TVC and physical stacked volume resources, so they demand more TVC active space while also requiring more physical stacked volumes in a TS7740 or TS7700T. The time that it takes a physical volume to fall below the reclamation threshold is also increased, because the data is still considered active. This delay in data deletion also causes scratched stale logical volumes to be moved from one stacked volume to another during reclamation.
Expire Hold settings
A volume that is expired or returned to scratch might be reused during a scratch mount before its delete expire grace period has passed. If retention of expired content is required, an extra Expire Hold setting can be enabled.
When Expire Hold is enabled as part of the delete expire settings, the expired or scratched volume is moved into a protected hold state in which it is not a candidate for scratch mounts. The volume is also not accessible from any host operation until the configured expire time grace period has passed. Starting with Release 2.1 of the TS7700, these held volumes can be moved back to a private category while still in a held state.
This additional option is made available to prevent any malicious or unintended overwriting of scratched data before the duration elapses. After the grace period expires, the volume is simultaneously removed from a held state and made a deletion candidate.
 
Remember: Volumes in the Expire Hold state are excluded from DFSMS OAM scratch counts, and are not candidates for TS7700 scratch mounts.
Delete Expired data that was previously stacked onto physical tape remains recoverable through an IBM services salvage process if the physical tape has not yet been reused, or if the secure erase process was not performed against it. Contact your IBM SSR if these services are required. Also, disabling reclamation as soon as any return to scratch mistake is made can help retain any content still present on physical tape.
 
Important: When Delete Expired is enabled for the first time against a scratch category, all volumes that are contained within that category are not candidates for delete expire processing. Only volumes that moved to the scratch category after the enablement of the Delete Expired are candidates for delete expire processing.
Changes to the Delete Expired values are effective to all logical volumes that are candidates for delete expire processing.
2.2.18 TVC management processes for TS7740 or TS7700T CPx
Two processes manage the TVC of the TS7740 and TS7700T in a stand-alone environment:
Premigration Management (TS7740 and TS7700T CPx)
This process is always actively queuing newly created volumes for movement to back-end physical tape. When the TS7700 determines that minimal host activity is occurring, or if the PMPRIOR threshold has been crossed, it begins servicing the premigration queue by copying the logical volume content to one or more physical tapes.
When copied or stacked to physical tape, the volume is a candidate for migration or the deletion out of TVC. Content in a TS7700T CPx partition that is configured with a delay premigration time is not queued for premigration until the delay criteria is met, or until the maximum delay premigration threshold for the partition is exceeded.
If your TS7740/TS7700T exceeds its PMPRIOR threshold of content queued for premigration, the TS7740 or TS7700T likely enters the sustained mode of operation, in which the rate at which the TS7740 or TS7700T can absorb new workload is limited to the rate at which data can be premigrated to physical tape. The thresholds that are associated with the priority of premigration and the sustained mode of operation are tunable by using the LI REQ commands.
Free-space Management (TS7740 and TS7700T CPx)
This process manages the amount of free space within the TVC of a TS7740 or TS7700T. When the premigration process completes, a volume is a candidate for deletion from disk cache, otherwise known as migration. Volumes that are selected for migration are chosen based on how much free space is needed, LRU algorithms, and configured policies, such as preference group.
In a TS7740 or in the CPx partitions of a TS7700T, the inability to free disk cache space through premigration and migration can lead to heavy host throttling. Volumes targeting the CP0 partition of a TS7700T are not susceptible to any throttling associated with moving content to physical tape.
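The interplay of the two processes can be sketched as follows. The following Python fragment is a simplified model; the threshold names, values, and returned states are illustrative and do not correspond to exact LI REQ keywords or defaults.

```python
# Simplified sketch of premigration and free-space behavior as the amount of
# content queued for premigration grows.
def premigration_state(queued_gib, priority_threshold_gib, throttle_threshold_gib):
    if queued_gib >= throttle_threshold_gib:
        # Sustained mode: new host workload is absorbed only as fast as
        # content can be premigrated to physical tape.
        return "sustained (host writes throttled)"
    if queued_gib >= priority_threshold_gib:
        return "priority premigration"
    return "peak (premigration serviced during low host activity)"

for queued in (200, 1500, 3500):
    print(queued, "GiB queued ->", premigration_state(queued, 1000, 3000))
```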
2.2.19 TVC handling in outage situations
In a TS7740 environment, a “force paused” mode of the TS3500 or a resource shortage (out of physical scratch) leads to the situation where the stand-alone TS7740 does not accept any writes. That is true even when the TS7740 has free cache space. This function was introduced to ensure that no cache overflow occurs.
However, the cache sizes in a TS7700T are much larger and depend on the installed configuration, so this behavior might not be appropriate. Therefore, Release 3.3 introduced an LI REQ setting that defines how a TS7700T behaves in such a condition.
You can now specify whether a TS7700T CPx reacts the same as a TS7740, or accepts incoming writes in stand-alone mode until the cache resources are exhausted.
2.2.20 Copy Consistency Point: Copy policy modes in a stand-alone cluster
In a stand-alone cluster, you cannot define any Copy Consistency Point.
2.2.21 TVC selection in a stand-alone cluster
Because there is only one TVC in a stand-alone cluster available, no TVC selection occurs.
2.2.22 TVC encryption
With R3.0, a TVC encryption feature was introduced.
TVC encryption is turned on for the whole disk cache. You cannot encrypt a disk cache partially. Therefore, all DDMs in all strings must be full disk encryption (FDE)-capable to enable the encryption. The disk cache encryption is supported for all TS7760 models with CSA, all TS7720 models with 3956-CS9 cache or higher, and for TS7740 with 3956-CC9.
Encryption can be enabled in the field at any time, and retroactively encrypts all existing content that is stored within the TVC. Because the encryption is done at the HDD level, encryption is not apparent to the TS7700 and has no effect on performance.
With R3.0, only local key management was supported. Local key management is automated; there are no encryption keys (EKs) for the user to manage. Release 3.3 added support for the external key manager IBM Security Key Lifecycle Manager (SKLM, formerly IBM Tivoli Key Lifecycle Manager).
 
Note: The IBM Security Key Lifecycle Manager for z/OS (ISKLM) external key manager supports TS7700 physical tape, but does not support TS7700 disk encryption.
If you want to use an external key manager for both TVC and physical tape, you must use the same external key manager instance for both of them.
There are two differences between the usage of local or external key management:
If the TS7700 has no connection to the external key manager, it does not run. Therefore, you must plan carefully to have a primary and an alternative key manager that are reachable in a disaster situation.
If a cluster that uses disk encryption with an external key manager is unjoined from a grid, the encryption must be disabled during this process. Otherwise, the TS7700 cannot be reused. Therefore, during the unjoin, the cluster is securely erased.
2.2.23 Physical volume pools
You can use the TS7740 and TS7700T to group volumes by pools when stacking to physical tape takes place.
The following list includes some examples of why physical volume pools are helpful:
Data from separate customers on the same physical volume can compromise certain outsourcing contracts.
Customers want to be able to “see, feel, and touch” their data by having only their data on dedicated media.
Customers need separate pools for different environments, such as test, user acceptance test (UAT), and production.
Traditionally, users are charged by the number of volumes they have in the tape library. With physical volume pooling, users can create and consolidate multiple logical volumes on a smaller number of stacked volumes, and reduce their media charges.
Recall times depend on the media length. Small logical volumes on high-capacity cartridges (JA, JB, and JC) can take a longer time to recall than volumes on economy cartridges (JJ or JK). Therefore, pooling by media type is also beneficial.
Some workloads have a high expiration rate, which causes excessive reclamation. These workloads are better suited in their own pool of physical volumes.
Protecting data through encryption can be set on a per pool basis, which enables you to encrypt all or some of your data when it is written to the back-end tapes.
Migration from older tape media technology.
Reclaimed data can be moved to a different target pool, which enables aged data to move to a specific subset of physical tapes.
Second dedicated pool for key workloads to be Copy Exported.
There are benefits to using physical volume pools, so plan for the number of physical pools. See also “Relationship between reclamation and the number of physical pools” on page 55.
Using physical volume pools
Physical volume pool properties enable the administrator to define pools of stacked volumes within the TS7740/TS7700T. You can direct virtual volumes to these pools by using SMS constructs. There can be up to 32 general-purpose pools (01 - 32) and one common pool (00). A common scratch pool (Pool 00) is a reserved pool that contains only scratch stacked volumes for the other pools.
Each TS7740/TS7700T that is attached to an IBM TS4500 or IBM TS3500 tape library has its own set of pools.
Common scratch pool (Pool 00)
The common scratch pool is a pool that contains only scratch stacked volumes, and serves as a reserve pool. You can define a primary pool to borrow scratch stacked cartridges from the common scratch pool (Pool 00) if a scratch shortage occurs. This can be done either on a temporary or permanent basis.
Each pool can be defined to borrow single media type (for example, JA, JB, JC, JD), borrow mixed media, or have a first choice and a second choice. The borrowing options can be set by using the MI when you are defining stacked volume pool properties.
 
Remember: The common scratch pool must have at least three scratch cartridges available; with fewer than three, low scratch count warnings are reported.
General-purpose pools (Pools 01 - 32)
There are 32 general-purpose pools available for each TS7740/TS7700T cluster. These pools can contain both empty and full or filling stacked volumes. All physical volumes in a TS7740/TS7700T cluster are distributed among available pools according to the physical volume range definitions in place. The distribution is also based on the pools’ borrow and return attribute settings.
Those pools can have their properties tailored individually by the administrator for various purposes. When initially creating these pools, it is important to ensure that the correct borrowing properties are defined for each one. For more information, see “Stacked volume pool properties” on page 53.
By default, there is one pool, Pool 01, and the TS7740/TS7700T stores virtual volumes on any stacked volume available to it. This creates an intermix of logical volumes from differing sources (for example, different LPARs and applications) on a physical cartridge.
The user cannot influence the physical location of the logical volume within a pool. Having all of the logical volumes in a single group of stacked volumes is not always optimal.
Using this facility, you can also perform the following tasks:
Separate different clients or LPAR data from each other.
Intermix or segregate media types.
Map separate SGs to the same primary pools.
Set up specific pools for Copy Export.
Set up pool or pools for encryption.
Set a reclamation threshold at the pool level.
Set reclamation parameters for stacked volumes.
Set up reclamation cascading from one pool to another.
Set the maximum number of devices that are used for concurrent premigration on a pool basis.
Assign or eject stacked volumes from specific pools.
Physical pooling of stacked volumes is identified by defining a pool number, as shown in Figure 2-8.
Figure 2-8 TS7740/TS7700T Logical volume allocation to specific physical volume pool flow
Through the MI, you can add an SG construct, and assign a primary storage pool to it. Stacked volumes are assigned directly to the defined storage pools. The pool assignments are stored in the TS7740/TS7700T database. During a scratch mount, a logical volume is assigned to a selected SG.
This SG is connected to a storage pool with assigned physical volumes. When a logical volume is copied to tape, it is written to a stacked volume that belongs to this storage pool. In addition, MC can be used to define a secondary pool when two copies on physical tape are required.
Physical VOLSER ranges can be defined with a home pool at insert time. Changing the home pool of a range has no effect on existing volumes in the library. When also disabling borrow/return for that pool, this provides a method to have a specific range of volumes that are used exclusively by a specific pool.
 
Tip: Primary Pool 01 is the default private pool for TS7740/TS7700T stacked volumes.
Borrowing and returning: Out of physical stacked volume considerations
Using the concept of borrowing and returning, out-of-scratch scenarios can be automatically addressed.
With borrowing, stacked volumes can move from pool to pool and back again to the original pool. In this way, the TS7740/TS7700T can manage out-of-scratch and low scratch scenarios, which can occur within any TS7740/TS7700T from time to time.
You need at least two empty stacked volumes in the CSP to avoid any out of scratch condition. Empty pvols in other pools (regardless of the pool properties) are not considered. Ensure that non-borrowing active pools have at least two scratch volumes.
An out of stacked volume condition in one physical pool results in an out of stacked volume condition for the whole TS7740 or TS7700T cluster. Therefore, it is necessary to monitor all active pools.
 
Remember: Pools that have borrow/return enabled, and that contain no active data, eventually return all of the scratch volumes to the common scratch pool after 48 - 72 hours of inactivity.
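The borrow and return behavior can be sketched as follows. The Python fragment is a conceptual model only; the counts, pool numbers, and method names are invented, and the time-based return after inactivity is not modeled.

```python
# Sketch of the borrow/return idea: a pool that runs short of scratch cartridges
# borrows from the common scratch pool (Pool 00), and idle borrow/return pools
# eventually give the cartridges back.
class ScratchPools:
    def __init__(self, common_count):
        self.common = common_count                 # Pool 00
        self.pools = {}                            # pool number -> scratch count

    def borrow(self, pool, needed=1):
        if self.pools.get(pool, 0) >= needed:
            return True
        if self.common >= needed:                  # borrow from Pool 00
            self.common -= needed
            self.pools[pool] = self.pools.get(pool, 0) + needed
            return True
        return False                               # out-of-scratch condition

    def return_to_common(self, pool):
        self.common += self.pools.pop(pool, 0)     # idle pool returns its scratch volumes

pools = ScratchPools(common_count=10)
print(pools.borrow(pool=2))        # True - Pool 02 borrows one cartridge from Pool 00
pools.return_to_common(pool=2)
print(pools.common)                # 10
```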
Stacked volume pool properties
Logical volume pooling supports cartridge type selection. This can be used to create separate pools of 3592 tape cartridges with capacities of 128 GB - 10 TB, depending upon the type of media and tape drive technology used.
Lower capacity JJ, JK, or JL cartridges can be designated to a pool to provide consistently faster access to application data, such as hierarchical storage management (HSM) or Content Manager. Higher capacity JA, JB, JC, or JD cartridges that are assigned to a pool can address archival requirements, such as full volume dumps.
2.2.24 Logical and stacked volume management
Every time that a logical volume is modified (either by rewriting or appending to it, or by reuse of a scratch volume), the data from the previous use of this logical volume, which is on a stacked volume, becomes obsolete. The new virtual volume is placed in the cache and written to a stacked volume afterward (TS7740 or TS7700T). The previous copy on a stacked volume is invalidated, but it still uses up space on the physical tape. Auto delete of expired volumes or ejected volumes can also result in the creation of inactive space on physical tapes.
Virtual volume reconciliation
The reconciliation process periodically checks for active volume usage percentages, which remain on physical tape cartridges. This process automatically adjusts the active data values of the physical volumes, which are the primary attributes that are used for automatic physical volume reclamation processing.
The data that is associated with a logical volume is considered invalidated if any of the following conditions are true:
A host has assigned the logical volume to a scratch category. Later, the volume is selected for a scratch mount, and data is written to the volume. The older version of the volume is now invalid.
A host has assigned the logical volume to a scratch category. The category has a nonzero delete-expired data parameter value. The parameter value was exceeded, and the TS7740/TS7700T deleted the logical volume.
A host has modified the contents of the volume. This can be a complete rewrite of the volume or an append to it. The new version of the logical volume is premigrated to a separate physical location and the older version is invalidated.
The logical volume is ejected, in which case the version on physical tape is invalidated.
The pool properties change during a mount/demount sequence and a new pool is chosen.
The TS7740/TS7700T tracks the amount of active data on a physical volume. During a premigration or reclamation, the TS7700 attempts to fill the targeted volume and mark it 100% active. Although the TS7740/TS7700T tracks the percentage of active data with a granularity of 1/10 of 1%, it rounds down, so even 1 byte of inactive data drops the percentage to 99.9%. The TS7740/TS7700T tracks the time that the physical volume went from 100% full to less than 100% full by performing the following tasks (see the sketch after this list):
Checking on an hourly basis for volumes in a pool with a nonzero setting
Comparing this time against the current time to determine whether the volume is eligible for reclamation
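The following Python fragment sketches the rounding and threshold check that are described above; the byte counts and threshold value are invented examples.

```python
# Sketch of the reclamation eligibility check: the active data percentage is
# tracked in 0.1% steps and rounded down, and a volume becomes a reclaim
# candidate when it drops below the pool's reclaim threshold.
import math

def active_percent(active_bytes, written_bytes):
    pct = active_bytes / written_bytes * 100
    return math.floor(pct * 10) / 10        # 0.1% granularity, rounded down

def eligible_for_reclaim(active_bytes, written_bytes, reclaim_threshold_pct):
    return active_percent(active_bytes, written_bytes) < reclaim_threshold_pct

print(active_percent(999_999_999_999, 1_000_000_000_000))          # 99.9, even for 1 byte inactive
print(eligible_for_reclaim(200, 1000, reclaim_threshold_pct=35))   # True: only 20.0% active
```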
Physical volume reclamation
Physical volume reclamation consolidates active data and frees stacked volumes for return-to-scratch use. Reclamation is part of the internal management functions of a TS7740/TS7700T. The reclamation process is basically a tape-to-tape copy. The physical volume to be reclaimed is mounted to a physical drive, and the active logical volumes that are there are copied to another filling cartridge under control of the TS7740/TS7700T.
One reclamation task needs two physical tape drives to run. At the end of the reclaim, the source volume is empty, and it is returned to the specified reclamation pool as an empty (scratch) volume. The data that is being copied from the reclaimed physical volume does not go to the TVC. Instead, it is transferred directly from the source to the target tape cartridge. During the reclaim, the source volume is flagged as being in read-only mode.
Physical tape volumes become eligible for space reclamation when they cross the occupancy threshold level that is specified by the administrator in the home pool definitions where those tape volumes belong. This reclaim threshold is set for each pool individually according to the specific needs for that client, and is expressed in a percentage (%) of tape usage.
Volume reclamation can be concatenated with a Secure Data Erase for that volume, if required. This configuration causes the volume to be erased after the reclamation. For more information, see 2.2.25, “Secure Data Erase function” on page 55.
Consider not running reclamation during peak workload hours of the TS7740/TS7700T. This ensures that recalls and migrations are not delayed due to physical drive shortages. You must choose the best period for reclamation by considering the workload profile for that TS7740/TS7700T cluster, and inhibit reclamation during the busiest period for the system.
A physical volume that is being ejected from the library is also reclaimed in a similar way before it can be ejected. The active logical volumes that are contained in the cartridge are moved to another physical volume, according to the policies defined in the volume’s home pool, before the physical volume is ejected from the library.
An MI-initiated PVOL move also runs this reclamation process.
Reclamation can also be used to migrate older data from a pool to another while it is being reclaimed, but only by targeting a separate specific pool for reclamation.
With Release 3.3, it is now possible to deactivate the reclaim on a physical pool base by specifying a “0” value in the Reclaim Threshold.
With the introduction of heterogeneous tape drive support for migration purposes, the data from the old cartridges (for example, JA and JB) is reclaimed to the new media (for example, JC and JD). To support a faster migration, the reclaim values for the sunset media can be different from the reclaim values for the current tape media. To allow the reclaim for sunset media, at least 15 scratch cartridges of the newer tape media need to be available. For more information, see “Physical Volume Pools” on page 427.
Relationship between reclamation and the number of physical pools
The reclaim process is done on a pool basis, and each reclamation process needs two drives. If you define too many pools, it can lead to a situation where the TS7740/TS7700T is incapable of processing the reclamation for all pools in an appropriate manner. Eventually, pools can run out of space (depending on the borrow definitions), or you need more stacked volumes than planned.
The number of physical pools, physical drives, stacked volumes in the pools, and the available time tables for reclaim schedules must be considered and balanced.
You can limit the number of reclaim tasks that run concurrently by using the LI REQ setting.
2.2.25 Secure Data Erase function
Another concern is the security of old data. The TS7740/TS7700T provides physical volume erasure on a physical volume pool basis controlled by an extra reclamation policy. When Secure Data Erase is enabled, a physical cartridge is not made available as a scratch cartridge until an erasure procedure is complete. The Secure Data Erase function supports the erasure of a physical volume as part of the reclamation process. The erasure is performed by running a long erase procedure against the media.
A Long Erase operation on a TS11xx drive writes a repeating pattern from the beginning to the end of the physical tape, making all data previously present inaccessible through traditional read operations. The key point is that the erasure writes a single random pattern repeatedly in one pass, which might not be as secure as the multi-pass fixed-pattern methods that are described by the US Department of Defense (DoD).
Therefore, the logical volumes that are written on this stacked volume are no longer readable. As part of this data erase function, an extra reclaim policy is added. The policy specifies the number of days that a physical volume can contain invalid logical volume data before the physical volume becomes eligible to be reclaimed.
When a physical volume contains encrypted data, the TS7740/TS7700T is able to run a fast erase of the data by erasing the EKs on the cartridge. Basically, it erases only the portion of the tape where the key information is stored. This form of erasure is referred to as a cryptographic erase.
Without the key information, the rest of the tape cannot be read. This method significantly reduces the erasure time. Any physical volume that has a status of read-only is not subject to this function, and is not designated for erasure as part of a read-only recovery (ROR).
If you use the eject stacked volume function, the data on the volume is not erased before ejecting. The control of expired data on an ejected volume is your responsibility.
Volumes that are tagged for erasure cannot be moved to another pool until erased, but they can be ejected from the library, because such a volume is removed for recovery actions.
Using the Move function also causes a physical volume to be erased, even though the number of days that are specified has not yet elapsed. This process includes returning borrowed volumes.
2.2.26 Copy Export function
One of the key reasons to use tape is for recovery of critical operations in a disaster. If you are using a grid configuration that is designed for DR purposes, the recovery time objectives (RTO) and recovery point objectives (RPO) can be measured in minutes. If you do not require such low recovery times for all or part of your workload, the TS7740 and TS7700T provide a function called Copy Export.
The Copy Export function enables a copy of selected logical volumes that are written to secondary pools within the TS7740/TS7700T to be removed and taken offsite for DR purposes. The benefits of volume stacking, which places many logical volumes on a physical volume, are retained with this function. Because the physical volumes that are being exported are from a secondary physical pool, the primary logical volume remains accessible to the production host systems.
The following logical volumes are excluded from the export:
Volumes that are mounted during any portion of the export process
Volumes that are unable to create a valid primary or secondary pool copy
Volumes that had not completed replication into the source TS7740 or TS7700T at the start of the export process
These volumes will be candidates in the next copy export request.
The Copy Export sets can be used to restore data at a location that has equal or newer tape technology and equal or newer TS7700 Licensed Internal Code. A TS7700T Copy Export set can be restored into both TS7740 and TS7700T. A TS7740 Copy Export set can also be restored into both TS7740 and TS7700T. However, some rules apply:
TS7700T exported content that is restored to a TS7740 loses all knowledge of partitions.
TS7700T to TS7700T retains all partition information.
TS7740 exported content that is restored into a TS7700T has all content target the primary tape partition.
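The restore rules in the preceding list can be expressed as a simple lookup. The following Python fragment is only a restatement of those rules; the strings are paraphrases, not product messages.

```python
# Sketch of the Copy Export restore rules listed above, expressed as a lookup
# keyed on (source model, restore target model).
RESTORE_BEHAVIOR = {
    ("TS7700T", "TS7740"):  "content restored; all partition information is lost",
    ("TS7700T", "TS7700T"): "content restored; partition information is retained",
    ("TS7740",  "TS7700T"): "content restored; everything targets the primary tape partition",
    ("TS7740",  "TS7740"):  "content restored",
}

print(RESTORE_BEHAVIOR[("TS7740", "TS7700T")])
```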
There is an offsite reclamation process against copy-exported stacked volumes. This process does not require the movement of physical cartridges. Instead, the active logical volumes are newly written to another Copy Export stacked volume, and the original copy-exported stacked volume is marked invalid. For more information, see 12.1.5, “Reclaim process for Copy Export physical volumes” on page 743.
2.2.27 Encryption of physical tapes
The importance of data protection has become increasingly apparent with news reports of security breaches, loss, and theft of personal and financial information, and with government regulation. Encrypting the stacked cartridges minimizes the risk of unauthorized data access without excessive security management burdens or subsystem performance issues.
The encryption solution for tape virtualization consists of several components:
The encryption key manager
The TS1150, TS1140, TS1130, and TS1120 encryption-enabled tape drives
The TS7740/TS7700T
Encryption key manager
For physical tape, the TS7700 can use one of the following encryption key managers:
IBM Security Key Lifecycle Manager (formerly IBM Tivoli Key Lifecycle Manager)
IBM Security Key Lifecycle Manager for z/OS
This book uses the general term key manager for all EK managers.
 
Important: The EKM is no longer available and does not support the TS1140 and TS1150. If you need encryption support for the TS1140 or higher, you must install either IBM Security Key Lifecycle Manager or IBM Security Key Lifecycle Manager for z/OS.
IBM Security Key Lifecycle Manager replaces Tivoli Key Lifecycle Manager.
The key manager is the central point from which all EK information is managed and served to the various subsystems. The key manager server communicates with the TS7740/TS7700T and tape libraries, CUs, and Open Systems device drivers. For more information, see 4.4.7, “Planning for tape encryption in a TS7740, TS7720T, and TS7760T” on page 186.
The TS1150, TS1140, TS1130, and TS1120 encryption-enabled tape drives
The IBM TS1150, TS1140, TS1130, and TS1120 tape drives provide hardware that performs the encryption without reducing the data transfer rate.
The TS7740/TS7700T
The TS7740/TS7700T provides the means to manage the use of encryption and the keys that are used on a storage pool basis. It also acts as a proxy between the tape drives and the key manager servers, by using redundant Ethernet to communicate with the key manager servers and FICON to communicate with the drives. Encryption must be enabled in each of the tape drives.
Encryption on the TS7740/TS7700T is controlled on a storage pool basis. The SG DFSMS construct that is specified for a logical tape volume determines which storage pool is used for the primary and optional secondary copies in the TS7740/TS7700T.
The storage pools were originally created for management of physical media, and they have been enhanced to include encryption characteristics. Storage pool encryption parameters are configured through the TS7740/TS7700T MI under Physical Volume Pools.
For encryption support, all drives that are attached to the TS7740/TS7700T must be Encryption Capable, and encryption must be enabled. If TS7740/TS7700T uses TS1120 Tape Drives, they must also be enabled to run in their native E05 format. The management of encryption is performed on a physical volume pool basis. Through the MI, one or more of the 32 pools can be enabled for encryption.
Each pool can be defined to use specific EKs or the default EKs defined at the key manager server:
Specific EKs
Each pool that is defined in the TS7740/TS7700T can have its own unique EK. As part of enabling a pool for encryption, enter two key labels for the pool and an associated key mode. The two keys might or might not be the same. Two keys are required by the key manager servers during a key exchange with the drive. A key label can be up to 64 characters. Key labels do not have to be unique per pool.
The MI provides the capability to assign the same key label to multiple pools. For each key, a key mode can be specified. The supported key modes are Label and Hash. As part of the encryption configuration through the MI, you provide IP addresses for a primary and an optional secondary key manager.
Default EKs
The TS7740/TS7700T encryption supports the use of a default key. This support simplifies the management of the encryption infrastructure, because no future changes are required at the TS7740/TS7700T. After a pool is defined to use the default key, the management of encryption parameters is performed at the key manager:
 – Creation and management of encryption certificates
 – Device authorization for key manager services
 – Global default key definitions
 – Drive-level default key definitions
 – Default key changes as required by security policies
For logical volumes that contain data that is to be encrypted, host applications direct them to a specific pool that has been enabled for encryption by using the SG construct name. All data that is directed to a pool that is enabled for encryption is encrypted when it is premigrated to the physical stacked volumes, or when it is reclaimed to a stacked volume during the reclamation process. The SG construct name is bound to a logical volume when it is mounted as a scratch volume.
Through the MI, the SG name is associated with a specific pool number. When the data for a logical volume is copied from the TVC to a physical volume in an encryption-enabled pool, the TS7740/TS7700T determines whether a new physical volume needs to be mounted. If a new cartridge is required, the TS7740/TS7700T directs the drive to use encryption during the mount process.
The TS7740/TS7700T also provides the drive with the key labels specified for that pool. When the first write data is received by the drive, a connection is made to a key manager and the key that is needed to perform the encryption is obtained. Physical scratch volumes are encrypted with the keys in effect at the time of first write to BOT.
Any partially filled physical volumes continue to use the encryption settings in effect at the time that the tape was initially written from BOT. The encryption settings are static until the volumes are reclaimed and rewritten again from BOT.
Figure 2-9 illustrates that the method for communicating with a key manager is through the same Ethernet interface that is used to connect the TS7740/TS7700T to your network for access to the MI.
Figure 2-9 TS7740/TS7700T encryption
The request for an EK is directed to the IP address of the primary key manager. Responses are passed through the TS7740/TS7700T to the drive. If the primary key manager does not respond to the key management request, the optional secondary key manager IP address is used. After the TS11x0 drive completes the key management communication with the key manager, it accepts data from the TVC.
When a logical volume needs to be read from a physical volume in a pool that is enabled for encryption, either as part of a recall or reclamation operation, the TS7740/TS7700T uses the key manager to obtain the necessary information to decrypt the data.
The affinity of the logical volume to a specific EK, or the default key, can be used as part of the search criteria through the TS7700 MI.
 
Remember: If you want to use external key management for both cache and physical tapes, you must use the same external key manager instance.
2.2.28 User Management: Roles and profiles
The TS7700 offers internal user management, and also external user management through LDAP support.
You can use user management to define independent user IDs. Each user ID is assigned a role that identifies the access rights for that user. You can use this method to restrict access to specific tasks.
In R3.2, a new read-only role was introduced. Users who are assigned to this role can only view information on the MI, but cannot change any information.
Consider restricting access to specific items. In particular, access to Tape Partition management and to the LIBRARY REQUEST command should be granted carefully.
2.2.29 Security identification by using Lightweight Directory Access Protocol
Previous implementations relied on Tivoli System Storage Productivity Center to authenticate users to a client’s Lightweight Directory Access Protocol (LDAP) server. Beginning with Release 3.0 of Licensed Internal Code (LIC), both the TS7700 clusters and the TS3000 System Console (TS3000 TSSC) have native support for an LDAP server (currently, only Microsoft Active Directory (MSAD) is supported).
Starting with R3.0, when LDAP is enabled, access to the TS7700 MI is controlled by the LDAP server. The local actions that are run by the IBM SSR are also secured by the LDAP server. IBM standard users can no longer access the system without a valid LDAP user ID and password. You must have a valid account on the LDAP server, with the appropriate roles assigned to your user, to communicate with the TS7700.
If your LDAP server is not available, you cannot interact with the TS7700, whether as an operator or as an IBM SSR.
 
Important: Create at least one external authentication policy for IBM SSRs before a service event.
With R3.2, IBM RACF® can also be used to control access. All users are defined to RACF, and when access is attempted, the password is verified against the RACF database. Roles and profiles must still be maintained in the TS7700, because RACF performs only the password authentication.
Release 3.2 also introduced a change that allows specific access without the use of LDAP (IBM SSR and second-level dial-in support).
2.2.30 Grid Resiliency Functions
With R4.1.2, a new function to improve the resiliency of a grid is introduced. For a stand-alone cluster, the only available feature from this function is called local fence.
Before R4.1.2, in rare cases a stand-alone cluster could initiate a reboot without any notification to the customer. Starting with R4.1.2, the attached hosts are notified, as long as the cluster is still able to do so, before the reboot is run. The advantage of the notification is that an error reason is provided and that you can monitor for these messages. In addition, more data is collected and interpreted to trigger a local fence.
Local fence is automatically enabled, cannot be disabled by the customer, and has no parameters or options.
For more information, see the IBM TS7700 Series Grid Resiliency Improvements User’s Guide at:
2.2.31 Service preparation mode
This function is available only in a multi-cluster grid.
2.2.32 Service mode
This function is available only in a multi-cluster grid.
2.2.33 Control Unit Initiated Reconfiguration
The CUIR function is available only in a multi-cluster grid.
2.3 Multi-cluster grid configurations: Components, functions, and features
Multi-cluster grids are combinations of two to eight clusters that work together as one logical entity. TS7700D, TS7700T, and TS7740 can be combined as a hybrid grid, but you can also form a grid just from TS7700D, TS7700T, or TS7740 clusters. The configuration that is suitable for you depends on your requirements.
To enable multiple clusters to work together as a multi-cluster grid, some hardware configurations must be provided. Also, logical considerations need to be planned and implemented. The following topics are described in this section:
The base rules that apply in a multi-cluster grid
Required grid hardware
Implementation concepts for the grid
Components and features that are used in a grid
Figure 2-10 shows a four-cluster hybrid grid. The configuration consists of two TS7720s and two TS7740s. More examples are available in 2.4, “Grid configuration examples” on page 103.
Figure 2-10 TS7700D 4-cluster grid
2.3.1 Rules in a multi-cluster grid
In a multi-cluster grid, some general rules apply:
A grid configuration looks like a single tape library and tape drives to the hosts.
It is a composite library with underlying distributed libraries.
Up to eight clusters can form a grid.
Data integrity is accomplished by the concept of volume ownership.
All TS7700 models can coexist in a grid. If disk-only and tape-attached models are combined, that configuration is called a hybrid grid.
If one cluster is not available, the grid still continues to work.
Clusters can be grouped into cluster families.
Mounts, both scratch (Fast Ready) and private (non-Fast Ready), can be satisfied from any cluster in the grid, which is controlled by your implementation.
 
Remember: Seven- and eight-cluster grid configurations are available with an RPQ.
In a multi-cluster grid, some rules for virtual and logical volumes apply:
You can store a logical volume or virtual volume in the following ways:
 – Single instance in only one cluster in a grid.
 – Multiple instances (two, three, four, five, or six) in different clusters in the grid, up to the number of clusters in the grid.
 – Each TS7740/TS7700T cluster in the grid can store dual copies on physical tape. Each copy is a valid source for the virtual or logical volume.
 – Selective dual copy is still a valid option in a TS7740/TS7700T. (In an extreme case, you can end up with 12 instances of the same data spread out on six different clusters using selective dual copy.)
You control the number of instances, and the method of how the instances are generated through different copy policies.
In a multi-cluster grid, the following rules for access to the virtual and logical volumes apply:
A logical volume can be accessed from any virtual device in the system.
Any logical volume (replicated or not) is accessible from any other cluster in the grid.
Each distributed library has access to any logical volumes within the composite library.
 
Note: You can still restrict access to clusters by using host techniques (for example, HCD).
With this flexibility, the TS7700 grid provides many options for business continuance and data integrity, meeting requirements for a minimal configuration up to the most demanding advanced configurations.
2.3.2 Required grid hardware
To combine single clusters into a grid, several requirements must be met:
Each TS7700 cluster must have the Grid Enablement feature installed.
Each of the TS7700 engines must be connected to all other clusters in the grid through the grid network. Each cluster can have two or four links to the grid network.
Grid enablement
FC4015 must be installed on all clusters in the grid.
Grid network
A grid network is the client-supplied TCP/IP infrastructure that interconnects the TS7700 grid. Each cluster has two Ethernet adapters that are connected to the TCP/IP infrastructure. The single-port 10 gigabits per second (Gbps) long-wave optical fiber adapter is supported. This configuration accounts for two or four grid links, depending on the cluster configuration. See 7.1.1, “Common components for the TS7700 models” on page 246.
Earlier TS7740 clusters might still have single-port adapters for copper connections or shortwave (SW) 1 Gbps connections. A miscellaneous equipment specification (MES) is available to upgrade the single-port adapters to dual-port adapters.
Dynamic Grid Load Balancing
Dynamic Grid Load Balancing is an algorithm that is used within the TS7700. It continually monitors and records the rate at which data is processed by each network link. Whenever a new task starts, the algorithm uses the stored information to identify the link that can most quickly complete the data transfer. The algorithm also identifies degraded link performance, and sends a warning message to the host.
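The principle can be illustrated with a minimal sketch in Python (all names, weights, and thresholds here are invented for illustration; the actual TS7700 algorithm is internal to the product): a moving average of the observed rate is kept per link, new tasks go to the link that is expected to finish first, and a warning is raised when a link falls well below its nominal rate.

# Minimal illustration of dynamic load balancing across grid links.
# Names and thresholds are hypothetical; the real TS7700 algorithm is internal.
class GridLink:
    def __init__(self, name, nominal_mbps):
        self.name = name
        self.nominal_mbps = nominal_mbps      # configured link speed
        self.observed_mbps = nominal_mbps     # moving average of measured rate
        self.queued_mb = 0.0                  # data already assigned to this link

    def record_transfer(self, mb, seconds, alpha=0.3):
        # Update the moving average with the most recent measurement.
        rate = mb / seconds
        self.observed_mbps = (1 - alpha) * self.observed_mbps + alpha * rate

    def estimated_finish(self, mb):
        # Time to drain the current queue plus the new task.
        return (self.queued_mb + mb) / max(self.observed_mbps, 0.1)

def pick_link(links, task_mb, degrade_ratio=0.4):
    # Warn (as the TS7700 does via a host message) about degraded links.
    for link in links:
        if link.observed_mbps < degrade_ratio * link.nominal_mbps:
            print(f"WARNING: link {link.name} performance degraded")
    # Choose the link expected to complete the transfer first.
    best = min(links, key=lambda l: l.estimated_finish(task_mb))
    best.queued_mb += task_mb
    return best

links = [GridLink("grid0", 1000), GridLink("grid1", 1000)]
links[0].queued_mb = 2000                       # grid0 already has a backlog
links[1].record_transfer(mb=900, seconds=1.0)   # recent measurement on grid1
print(pick_link(links, task_mb=400).name)       # -> grid1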
Remote mount automatic IP failover
If a grid link fails during a remote mount, the Remote Mount IP Link Failover function attempts to reestablish the connection through an alternative link. During a failover, up to three extra links are attempted. If all configured link connections fail, the remote mount fails, resulting in a host job failure or a Synchronous mode copy break. When a remote mount or a Synchronous copy is in use and a TCP/IP link failure occurs, this intelligent failover function recovers by using an alternative link. The following restrictions apply:
Each cluster in the grid must operate by using a Licensed Internal Code level of 8.21.0.xx or later.
At least two grid connections must exist between clusters in the grid (either two or four 1 Gbps grid links, or two or four 10 Gbps grid links).
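The failover behavior can be pictured with a simple retry loop (a sketch only, with hypothetical names; the real function is implemented in the TS7700 Licensed Internal Code): the remote mount is attempted on the preferred link first, then on up to three alternative links, and the mount fails only when every configured link has been tried.

# Sketch of remote-mount link failover (hypothetical names, not product code).
class LinkDownError(Exception):
    pass

def read_over_link(link, volser):
    # Placeholder for the actual remote TVC access over a grid link.
    if not link["up"]:
        raise LinkDownError(link["name"])
    return f"data for {volser} via {link['name']}"

def remote_mount_read(links, volser, max_alternates=3):
    # Try the preferred link first, then up to three alternative links.
    failures = 0
    for link in links:
        if failures > max_alternates:
            break
        try:
            return read_over_link(link, volser)
        except LinkDownError:
            failures += 1
    # All configured links failed: the remote mount (or sync copy) fails.
    raise RuntimeError(f"remote mount of {volser} failed on all grid links")

links = [{"name": "grid0", "up": False}, {"name": "grid1", "up": True}]
print(remote_mount_read(links, "VT0001"))   # recovered over grid1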
Internet Protocol Security for grid links
When you are running the TS7700 R3.1 level of LIC, the TS7760 and the 3957-V07 and 3957-VEB models support Internet Protocol Security (IPSec) on the grid links. Use IPSec capabilities only if required by the nature of your business.
 
Tip: Enabling grid encryption significantly affects the replication performance of the TS7700 grid.
Date and Time coordination
The TS7700 cluster tracks time in relation to Coordinated Universal Time. Statistics are also reported in relation to Coordinated Universal Time. All nodes in the grid subsystem coordinate their time with one another. There is no need for an external time source, such as an NTP server, even in a grid with large distances between the clusters.
However, if the grid is not connected to an external time source, the time that is presented from the grid (VEHSTATS and so on) might not show the same time as your LPARs, which can lead to some confusion during problem determination or for reporting, because the different time stamps do not match.
Therefore, the preferred method to keep nodes synchronized is by using a Network Time Protocol (NTP) server. The NTP server can be a part of the grid wide area network (WAN) infrastructure, your intranet, or a public server on the internet (Figure 2-11).
Figure 2-11 Time coordination with NTP servers
The NTP server address is configured into the system vital product data (VPD) on a system-wide scope. Therefore, all nodes access the same NTP server. All clusters in a grid need to be able to communicate with the same NTP server that is defined in VPD. In the absence of an NTP server, all nodes coordinate time with Node 0 or the lowest cluster index designation. The lowest index designation is Cluster 0, if Cluster 0 is available. If not, it uses the next available cluster.
2.3.3 Data integrity by volume ownership
In a multi-cluster grid, only one cluster at a time can modify volume data or attributes. To manage this, the concept of ownership was introduced.
Ownership
Any logical volume, or any copies of it, can be accessed by a host from any virtual device that is participating in a common grid, even if the cluster associated with the virtual device does not have a local copy. The access is subject to volume ownership rules. At any point in time, a logical volume is owned by only one cluster. The owning cluster controls access to the data and the attributes of the volume.
 
Remember: The volume ownership protects the volume from being accessed or modified by multiple clusters simultaneously.
Ownership can change dynamically. If a cluster needs to mount a logical volume on one of its virtual devices and it is not the owner of that volume, it must obtain ownership first. When required, the TS7700 node transfers the ownership of the logical volume as part of mount processing. This action ensures that the cluster with the virtual device that is associated with the mount has ownership.
If the TS7700 clusters in a grid, and the communication paths between them, are operational, the change of ownership and the processing of logical volume-related commands are not apparent to the host.
If a TS7700 Cluster has a host request for a logical volume that it does not own, and it cannot communicate with the owning cluster, the operation against that volume fails unless more direction is given.
Ownership can also be transferred manually by an LI REQ,OTCNTL for special purposes. For more information, see the IBM TS7700 Series z/OS Host Command Line Request User’s Guide on the following website:
If a cluster is not reachable, the remaining clusters do not automatically assume or take ownership of a logical volume without being directed. This direction can be given manually, or can be automated with the Autonomic Ownership Takeover Manager (AOTM). Service outages have implied ownership takeover. The manual ownership takeover option is presented on the MI when the grid determines that a cluster cannot be reached, and AOTM is either not installed or the failure cannot be managed by AOTM.
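As a minimal sketch of the mount-time ownership logic (hypothetical names; the real protocol and AOTM involve more steps): if the mounting cluster already owns the volume it proceeds, otherwise it asks the owning cluster to transfer ownership, and if the owner cannot be reached the mount fails unless a takeover mode is in effect.

# Simplified view of volume ownership transfer at mount time
# (hypothetical names; not the actual TS7700 implementation).
class OwnershipError(Exception):
    pass

def acquire_ownership(volume, mounting_cluster, clusters, takeover_mode=None):
    owner = clusters[volume["owner"]]
    if owner["name"] == mounting_cluster:
        return "already owned"
    if owner["reachable"]:
        # Normal case: ownership is transferred as part of mount processing.
        volume["owner"] = mounting_cluster
        return "ownership transferred"
    # Owning cluster cannot be reached: fail unless a takeover mode is set
    # (manually through the MI, or automatically through AOTM / service mode).
    if takeover_mode in ("read-only", "write"):
        volume["owner"] = mounting_cluster
        return f"ownership taken over ({takeover_mode})"
    raise OwnershipError("owning cluster unreachable and no takeover enabled")

clusters = {"C0": {"name": "C0", "reachable": False},
            "C1": {"name": "C1", "reachable": True}}
vol = {"volser": "VT0001", "owner": "C0"}
print(acquire_ownership(vol, "C1", clusters, takeover_mode="read-only"))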
To support the concept of ownership, it was necessary to introduce tokens.
Tokens
Tokens are used to track changes to the ownership, data, or properties of a logical volume. The tokens are mirrored at each cluster that participates in a grid and represent the current state and attributes of the logical volume. Tokens have the following characteristics:
Every logical volume has a corresponding token.
The grid component manages updates to the tokens.
Tokens are maintained in an IBM DB2® database that is coordinated by the local hnodes.
Each cluster’s DB2 database has a token for every logical volume in the grid.
Tokens are internal data structures that are not directly visible to you. However, they can be retrieved through reports that are generated with the Bulk Volume Information Retrieval (BVIR) facility.
Tokens are part of the architecture of the TS7700. Even in a stand-alone cluster, they exist and are used in the same way as they are used in a grid configuration (with only one cluster running the updates and keeping the database). In a grid configuration, every member of the grid mirrors the token information for all logical volumes within the composite library. Token information is updated in real time at all clusters in a grid.
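Conceptually, a token can be pictured as a small record that every cluster keeps for every logical volume. The field names in the following sketch are illustrative only; the actual DB2 schema is internal to the product.

# Illustrative token record; field names are hypothetical, not the DB2 schema.
from dataclasses import dataclass

@dataclass
class VolumeToken:
    volser: str          # logical volume serial
    owner: str           # cluster that currently owns the volume
    data_level: int      # incremented whenever the volume data changes
    property_level: int  # incremented whenever attributes or constructs change
    category: str        # scratch or private category

    def update_data(self, new_owner):
        # Any data modification bumps the data level under the new owner.
        self.owner = new_owner
        self.data_level += 1

# Every cluster in the grid mirrors a token like this for every logical volume.
token = VolumeToken("VT0001", owner="C0", data_level=3, property_level=1,
                    category="PRIVATE")
token.update_data("C1")
print(token)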
Ownership takeovers
In some situations, the ownership of the volumes might not be transferable, such as when there is a cluster outage. Without the AOTM, you need to take over manually. The following options are available:
Read-only Ownership Takeover
When Read-only Ownership Takeover is enabled for a failed cluster, ownership of a volume is taken from the failed TS7700 Cluster. Only read access to the volume is allowed through the other TS7700 clusters in the grid. After ownership for a volume has been taken in this mode, any operation that attempts to modify data on that volume or change its attributes fails. The mode for the failed cluster remains in place until another mode is selected or the failed cluster is restored.
Write Ownership Takeover (WOT)
When WOT is enabled for a failed cluster, ownership of a volume is taken from the failed TS7700 Cluster. Full access is allowed through the requesting TS7700 Cluster in the grid, and all other available TS7700 clusters in the grid.
The automatic ownership takeover method that is used during a service outage is identical to WOT, but without the need for a person or AOTM to initiate it. The mode for the failed cluster remains in place until another mode is selected or the failed cluster has been restored.
Scratch mounts continue to prefer volumes that are owned by the available clusters. Only after all available candidates have been exhausted does it take over a scratch volume from the unavailable cluster.
You can set the level of ownership takeover, Read-only or Write, through the TS7700 MI.
In the service preparation mode of a TS7700 cluster, ownership takeover is automatically enabled, making it possible for the remaining clusters to gracefully take over volumes with full read and write access. The mode for the cluster in service remains in place until it is taken out of service mode.
 
Important: You cannot set a cluster in service preparation after it has already failed.
For more information about an automatic takeover, see 2.3.34, “Autonomic Ownership Takeover Manager” on page 96.
2.3.4 I/O TVC selection
All vNodes in a grid have direct access to all logical volumes in the grid. The cluster that is selected for the mount is not necessarily the cluster that is chosen for I/O TVC selection. All I/O operations that are associated with the virtual tape drive are routed to and from its vNode to the I/O TVC.
When a TVC that is different from the local TVC at the actual mount point is chosen, this is called a remote mount. The TVC is then accessed through the grid network. You have several ways to influence the TVC selection.
During the logical volume mount process, the best TVC for your requirements is selected, based on the following considerations:
Availability of the cluster
Copy Consistency policies and settings
Scratch allocation assistance (SAA) for scratch mount processing
DAA for specific mounts
Override settings
Cluster family definitions
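These considerations can be pictured as a simple scoring sketch (the weights and field names are invented for illustration; the real selection logic inside the TS7700 is more involved):

# Toy scoring of TVC candidates; weights and field names are illustrative only.
def score_tvc(candidate):
    if not candidate["available"]:
        return -1                      # unavailable clusters are never chosen
    score = 0
    if candidate["consistency_point"] in ("SYNC", "RUN"):
        score += 4                     # strong consistency points are preferred
    elif candidate["consistency_point"] == "DEFERRED":
        score += 2
    if candidate["in_cache"]:
        score += 3                     # cache-resident copies avoid recalls
    if candidate["same_family_as_mount_point"]:
        score += 2                     # cluster families are favored
    if candidate["prefer_local_override"] and candidate["is_mount_point"]:
        score += 5                     # copy policy override settings
    return score

candidates = [
    {"name": "C0", "available": True, "consistency_point": "RUN",
     "in_cache": True, "same_family_as_mount_point": True,
     "prefer_local_override": False, "is_mount_point": True},
    {"name": "C1", "available": True, "consistency_point": "DEFERRED",
     "in_cache": False, "same_family_as_mount_point": False,
     "prefer_local_override": False, "is_mount_point": False},
]
best = max(candidates, key=score_tvc)
print(best["name"])    # -> C0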
2.3.5 Copy consistency points
In a multi-cluster grid configuration, several policies and settings can be used to influence the location of data copies and when the copies are run.
Consistency point management is controlled through the MC storage construct. Using the MI, you can create MCs and define where copies are placed and when they are synchronized relative to the host job that created them. Depending on your business needs for more than one copy of a logical volume, multiple MCs, each with a separate set of definitions, can be created.
The following key questions help to determine copy management in the TS7700:
Where do you want your copies to be placed?
When do you want your copies to become consistent with the originating data?
Do you want logical volume copy mode retained across all grid mount points?
For different business reasons, data can be synchronously created in two places, copied immediately, or copied asynchronously. Immediate and asynchronous copies are pulled and not pushed within a grid configuration. The cluster that acts as the mount cluster informs the appropriate clusters that copies are required and the method that they need to use. It is then the responsibility of each target cluster to choose an optimum source and pull the data into its disk cache.
There are currently five available consistency point settings:
Sync As data is written to the volume, it is compressed and then simultaneously written or duplexed to two TS7700 locations. The mount point cluster is not required to be one of the two locations. Memory buffering is used to improve the performance of writing to two locations. Any pending data that is buffered in memory is hardened to persistent storage at both locations only when an implicit or explicit sync operation occurs. This provides a zero RPO at tape sync point granularity.
Tape workloads in IBM Z environments already assume sync point hardening through explicit sync requests or during close processing, enabling this mode of replication to be performance-friendly in a tape workload environment. When sync is used, two clusters must be defined as sync points. All other clusters can be any of the remaining consistency point options, enabling more copies to be made.
RUN The copy occurs as part of the Rewind Unload (RUN) operation, and completes before the RUN operation at the host finishes. This mode is comparable to the immediate copy mode of the PtP VTS.
Deferred The copy occurs after the rewind unload operation at the host. This mode is comparable to the Deferred copy mode of the PtP VTS. This is also called Asynchronous replication.
Time Delayed The copy occurs only after a specified time (1 hour - 379 days). If the data expires before the Time Delayed setting is reached, no copy is produced at all. For Time Delayed, you can specify after creation or after access in the MC.
No Copy No copy is made.
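As an illustration of how these settings combine, consider hypothetical MC definitions for a four-cluster grid (the construct names and this representation are invented for the example; you define the real values through the MI):

# Hypothetical Management Class definitions for a four-cluster grid.
# "S" = Sync, "R" = RUN, "D" = Deferred, "T" = Time Delayed, "N" = No Copy.
management_classes = {
    "MCSYNC": ["S", "S", "D", "N"],   # synchronous duplex at C0/C1, deferred at C2
    "MCPROD": ["R", "R", "D", "D"],   # immediate copies locally, deferred remotely
    "MCARCH": ["N", "N", "R", "T"],   # data lands only at the remote site
}

def copies_expected(mc_name):
    # Count the clusters that eventually hold a copy for a given MC.
    return sum(1 for cp in management_classes[mc_name] if cp != "N")

for name in management_classes:
    print(name, copies_expected(name), "copies")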
On each cluster in a multi-cluster grid, a Copy Consistency Point setting is specified for the local cluster, and one for each of the other clusters. The settings can be different on each cluster in the grid. When a volume is mounted on a virtual tape device, the Copy Consistency Point policy of the cluster to which the virtual device belongs is accepted, unless Retain Copy mode was turned on at the MC.
 
Remember: The mount point (allocated virtual device) and the actual TVC used might be in different clusters. The Copy Consistency Policy is one of the major parameters that are used to control the TVC.
2.3.6 Cluster family concept
In earlier releases, copy consistency points were the primary rules that were used to determine I/O TVC selection and how replication occurred. When two or more clusters are in proximity to each other, these behaviors were not always ideal. For example, remote clusters could be used for I/O TVC selection versus adjacent clusters, or copies to remote clusters could pass data across distant links more than once.
The concept of families was introduced to help with the I/O TVC selection process, and to help make distant replication more efficient. For example, two clusters are at one site, and the other two are at a remote site. When the two remote clusters need a copy of the data, cluster families enforce that only one copy of the data is sent across the long grid link.
Also, when a cluster determines where to source a volume, it gives higher priority to a cluster in its family over another family. A cluster family establishes a special relationship between clusters. Typically, families are grouped by geographical proximity to optimize the use of grid bandwidth. Family members are given higher weight when determining which cluster to prefer for TVC selection.
Figure 2-12 on page 70 illustrates how cooperative replication occurs with cluster families. Cooperative replication is used for Deferred copies only. When a cluster needs to pull a copy of a volume, it prefers a cluster within its family. The example uses Copy Consistency Points of Run, Run, Deferred, Deferred [R,R,D,D].
With cooperative replication, one of the family B clusters at the DR site pulls a copy from one of the clusters in production family A. The second cluster in family B waits for the other cluster in family B to finish getting its copy, then pulls it from its family member. This way the volume travels only once across the long grid distance.
Figure 2-12 illustrates the concept of cooperative replication.
Figure 2-12 Cluster families
Cooperative replication includes another layer of consistency. A family is considered consistent when only one member of the family has a copy of a volume. Because only one copy is required to be transferred to a family, the family is consistent after the one copy is complete. Because a family member prefers to get its copy from another family member rather than getting the volume across the long grid link, the copy time is much shorter for the family member.
Because each family member is pulling a copy of a separate volume, this process makes a consistent copy of all volumes to the family quicker. With cooperative replication, a family prefers retrieving a new volume that the family does not yet have a copy of, over making a second copy of a volume within the family. Only when fewer than 20 (or the number of configured replication tasks) copies must be sourced from outside of the family does the family begin to replicate among itself.
Second copies of volumes within a family are deferred in preference to new volume copies into the family. Without families, a source cluster attempts to keep the volume in its cache until all clusters that need a copy have received their copy. With families, a cluster’s responsibility to keep the volume in cache is released after all families that need a copy have it. This process enables PG0 volumes in the source cluster to be removed from cache sooner.
Another benefit is the improved TVC selection in cluster families. For cluster families already using cooperative replication, the TVC algorithm favors using a family member as a copy source. Clusters within the same family are favored by the TVC algorithm for remote (cross) mounts. This favoritism assumes that all other conditions are equal for all the grid members.
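A minimal sketch of family-aware source selection (illustrative only; the real cooperative replication algorithm considers more factors): a cluster that needs a copy looks first for a consistent copy within its own family, and crosses the long grid links only when no family member has one yet.

# Sketch of family-aware copy source selection (illustrative only).
def choose_copy_source(target, clusters):
    # Clusters that already have a consistent copy of the volume.
    sources = [c for c in clusters
               if c["has_copy"] and c["name"] != target["name"]]
    if not sources:
        return None
    # Prefer a source in the same family to avoid crossing the long links twice.
    same_family = [c for c in sources if c["family"] == target["family"]]
    return (same_family or sources)[0]["name"]

clusters = [
    {"name": "C0", "family": "PROD", "has_copy": True},
    {"name": "C1", "family": "PROD", "has_copy": True},
    {"name": "C2", "family": "DR",   "has_copy": True},   # first DR copy pulled
    {"name": "C3", "family": "DR",   "has_copy": False},  # waits, then copies from C2
]
print(choose_copy_source(clusters[3], clusters))   # -> C2, not a PROD cluster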
For more information about cluster families, see IBM Virtualization Engine TS7700 Series Best Practices -TS7700 Hybrid Grid Usage, found at the following website:
2.3.7 Override settings concept
With the prior generation of PtP VTS, several optional override settings influenced how an individual VTC selected a VTS to run the I/O operations for a mounted tape volume. In the existing VTS, the override settings were only available to an IBM SSR. With the TS7700, you define and set the optional override settings that influence the selection of the I/O TVC and replication responses by using the MI.
 
Note: Remember, Synchronous mode copy is not subject to copy policy override settings.
TS7700 overrides I/O TVC selection and replication response
The settings are specific to a cluster, which means that each cluster can have separate settings, if wanted. The settings take effect for any mount requests received after the settings were saved. All mounts, independent of which MC is used, use the same override settings. Mounts already in progress are not affected by a change in the settings.
The following override settings are supported:
Prefer Local Cache for Fast Ready Mount Requests
This override prefers the mount point cluster as the I/O TVC for scratch mounts if it is available and contains a valid copy consistency definition other than No Copy.
Prefer Local Cache for non-Fast Ready Mount Requests
This override prefers the mount point cluster as the I/O TVC for private mounts if it is available, contains a valid copy consistency definition other than No Copy, and contains a valid copy of the volume. If the local valid copy is only on physical tape, a recall occurs versus using a remote cache resident copy.
Force Local TVC to have a copy of the data
The default behavior of the TS7700 is to make a copy of the data only based on the definitions of the MC that is associated with the mounted volume, and to select an I/O TVC that is defined to have a copy and a valid Copy Consistency Point. If the mount vNode is associated with a cluster for which the specified MC defines a Copy Consistency Point of No Copy, a copy is not made locally and all data access is to a remote TVC.
In addition, if the mount vNode has a Copy Consistency Point of Deferred, remote RUN clusters are preferred. This override replaces the specified MC Copy Consistency Point for the local cluster with RUN, independent of its currently configured Copy Consistency Point. Furthermore, it requires that the local cluster is always chosen as the I/O TVC. If the mount type is private (non-Fast Ready), and a consistent copy is unavailable in the local TVC, a copy is run to the local TVC before mount completion. The copy source can be any participating TS7700 in the grid.
In a TS7740/TS7700T, the logical volume might have to be recalled from a stacked cartridge. If, for any reason, the vNode cluster is not able to act as the I/O TVC, a mount operation fails, even if remote TVC choices are still available when this override is enabled.
The override does not change the definition of the MC. It serves only to influence the selection of the I/O TVC or force a local copy.
Copy Count Override
This override limits the number of RUN consistency points in a multi-cluster grid that must be consistent before device end is surfaced for a RUN command. Only Copy Consistency Points of RUN are counted. For example, in a three-cluster grid, if the MC specifies Copy Consistency Points of RUN, RUN, RUN, and the override is set to two, initial status or device end is presented after at least two clusters that are configured with a RUN consistency point are consistent.
This includes the original I/O TVC if that site is also configured with a RUN consistency point. The third RUN consistency point is changed to a Deferred copy after at least two of the three RUN consistency points are consistent. The third site that has its Copy Consistency Point changed to Deferred is called the floating deferred site. A floating deferred site has not completed its copy when the Copy Count value is reached.
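The effect of the override can be sketched as follows (an illustration, not product code): device end is surfaced as soon as the configured number of RUN consistency points are consistent, and any remaining RUN copies become floating deferred copies.

# Sketch of the Copy Count Override (illustrative; not product code).
def apply_copy_count_override(run_sites_consistent, copy_count):
    # run_sites_consistent: per RUN site, whether its copy is already consistent.
    consistent = sum(1 for ok in run_sites_consistent.values() if ok)
    device_end = consistent >= copy_count
    # RUN sites that have not completed by then become "floating deferred" sites.
    floating_deferred = ([s for s, ok in run_sites_consistent.items() if not ok]
                         if device_end else [])
    return device_end, floating_deferred

# Three-cluster grid, MC = [R,R,R], override set to 2, third copy still running:
print(apply_copy_count_override({"C0": True, "C1": True, "C2": False}, 2))
# -> (True, ['C2'])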
Ignore cache preference groups for copy priority
If this option is selected, copy operations ignore the cache preference group when determining the priority of volumes that are copied to other clusters. When not set, preference group 0 volumes are preferred to enable the source cluster, which retains the volume in cache for replication purposes, to migrate the volume as quickly as possible. When set, the priority is in first-in first-out (FIFO) order.
Overrides for Geographically Dispersed Parallel Sysplex
The default behavior of the TS7700 is to follow the MC definitions and configuration characteristics to provide the best overall job performance. In certain IBM Geographically Dispersed Parallel Sysplex™ (IBM GDPS®) use cases, all I/O must be local to the mount vNode. There can be other requirements, such as DR testing, where all I/O must go only to the local TVC to ensure that the correct copy policies are implemented and that data is available where required.
In these GDPS use cases, you must set the Force Local TVC override to ensure that the local TVC is selected for all I/O. This setting includes the following options:
Prefer Local for Fast Ready Mounts
Prefer Local for non-Fast Ready Mounts
Force Local TVC to have a copy of the data
 
Consideration: Do not use the Copy Count Override in a GDPS environment.
2.3.8 Host view of a multi-cluster grid and Library IDs
As with a stand-alone cluster, the grid is represented to the host by only one composite library. However, each of the TS7700 clusters must have a unique distributed library defined. This definition enables the host to differentiate between the entire grid and each cluster within the grid. This differentiation is required for messages and certain commands that target either the grid or individual clusters within the grid.
Composite library
The composite library is the logical image of all clusters in a multi-cluster grid, and is presented to the host as a single library. The host sees a logical tape library with up to 96 CUs in a standard six-cluster grid (six clusters with 16 CUs of 16 devices each), or up to 186 CUs if all six clusters have been upgraded to support 496 drives (six clusters with 31 CUs each).
The virtual tape devices are defined for the composite library only.
Figure 2-13 illustrates the host view of a three-cluster grid configuration.
Figure 2-13 TS7700 3-cluster grid configuration
Distributed library
Each cluster in a grid is a distributed library, which consists of a TS7700. In a TS7740/TS7700T, it is also attached to a physical tape library. Each distributed library can have up to 31 3490E tape controllers per cluster. Each controller has 16 IBM 3490E tape drives, and is attached through up to four FICON channel attachments per cluster. However, the virtual drives and the virtual volumes are associated with the composite library.
There is no difference from a stand-alone definition.
2.3.9 Tape Volume Cache
In general, the same rules apply as for stand-alone clusters.
However, in a multi-cluster grid, the different TVCs from all clusters are potential candidates for containing logical volumes. The group of TVCs can act as one composite TVC to your storage cloud, which can influence the following areas:
TVC management
Out of cache resources conditions
Selection of I/O cache
2.3.10 Virtual volumes and logical volumes
There is no difference between multi-cluster grid and stand-alone cluster environments.
 
Remember: Starting with V07/VEB servers and R3.0, the maximum number of supported virtual volumes is 4,000,000 virtual volumes per stand-alone cluster or multi-cluster grid. The default maximum number of supported logical volumes is still 1,000,000 per grid. Support for extra logical volumes can be added in increments of 200,000 volumes by using FC5270.
Important: All clusters in a grid must have the same quantity of installed instances of FC5270 configured. If you have configured a different number of FC5270s in clusters that are combined to a grid, the cluster with the lowest number of virtual volumes constrains all of the other clusters. Only this number of virtual volumes is then available in the grid.
2.3.11 Mounting a scratch virtual volume
In addition to the stand-alone capabilities, you can use SAA. This function widens the standard DAA support (for specific allocations) to scratch allocations, and enables you to direct a scratch mount to a set of specific candidate clusters. For more information, see “Scratch allocation assistance” on page 77.
2.3.12 Mounting a specific virtual volume
A mount for a specific volume can be sent to any device within any cluster in a grid configuration. With no additional assistance, the mount uses the TVC I/O selection process to locate a valid version of the volume.
The following scenarios are possible:
There is a valid copy in the TVC of the cluster where the mount is placed. In this case, the mount is signaled as complete and the host can access the data immediately.
There is no valid copy in the TVC of the cluster where the mount is placed. In this case, there are further options:
 – Another cluster has a valid copy already in cache. The virtual volume is read over the grid link from the remote cluster, which is called a remote mount. No physical mount occurs. In this case, the mount is signaled as complete and the host can access the data immediately. However, the data is accessed through the grid network from a different cluster.
 – No clusters have a copy in disk cache. In this case, a TS7740 or TS7700T CP1 - CP7 is chosen to recall the volume from physical tape to disk cache. Mount completion is signaled to the host system only after the entire volume is available in the TVC.
 – No copy of the logical volume can be determined in an active cluster, in cache, or on a stacked volume. The mount fails. Clusters in service preparation mode or in service mode are considered inactive.
To optimize your environment, DAA can be used. See “Device allocation assistance” on page 76.
If the virtual volume was modified during the mount operation, it is premigrated to back-end tape (if present), and has all copy policies acknowledged. The virtual volume is transferred to all defined consistency points. If you do not specify the Retain Copy Mode, the copy policies from the mount cluster are chosen at each close process.
If the virtual volume was not modified while it was mounted, the TS7740/TS7700T does not schedule another copy operation, and the current copy of the logical volume on the original stacked volume remains active. Furthermore, copies to remote TS7700 clusters are not required if modifications were not made.
The exception is if the Retain Copy policy is not set, and the MC at the mounting cluster has different consistency points defined compared to the volume’s previous mount. If the consistency points are different, the volume inherits the new consistency points and creates more copies within the grid, if needed. Existing copies are not removed if already present. Remove any non-required copies by using the LIBRARY REQUEST REMOVE command.
2.3.13 Logical WORM support and characteristics
There is no difference between LWORM in multicluster and stand-alone cluster environments.
2.3.14 Virtual drives
From a technical perspective, there is no difference between virtual drives in a multi-cluster grid and in a stand-alone cluster. Each cluster has 256 virtual drives by default. See Table 2-1.
Table 2-1 Number of maximum virtual drives in a multi-cluster grid
Cluster type                          Number of maximum virtual drives
Stand-alone cluster                   256
Dual-cluster grid/Two-cluster grid    512
Three-cluster grid                    768
Four-cluster grid                     1024
Five-cluster grid                     1280
Six-cluster grid                      1536
Seven-cluster grid                    1792
Eight-cluster grid                    2048
With FC 5275, you can add logical control units (LCUs) of 16 drives each, up to the maximum of 496 logical drives per cluster (the default 256 drives plus 15 additional LCUs of 16 drives each). This results in the following maximum numbers of virtual drives. See Table 2-2.
Table 2-2 Number of maximum virtual drives in a multi-cluster grid with FC 5275 installed
Cluster type                          Number of maximum virtual drives
Stand-alone cluster                   496
Dual-cluster grid/Two-cluster grid    992
Three-cluster grid                    1488
Four-cluster grid                     1984
Five-cluster grid                     2480
Six-cluster grid                      2976
Seven-cluster grid                    3472
Eight-cluster grid                    3968
To support this number of virtual drives, the appropriate program temporary fixes (PTFs) for the corresponding authorized program analysis reports (APARs) must be installed; check the Preventive Service Planning (PSP) bucket for the required service.
2.3.15 Allocation assistance
Scratch and private allocations in a z/OS environment can be more efficient or more selective using the allocation assistance functions incorporated into the TS7700 and z/OS software. DAA is used to help specific allocations choose clusters in a grid that provides the most efficient path to the volume data.
DAA is enabled, by default, in all TS7700 clusters. If random allocation is preferred, it can be disabled by using the LIBRARY REQUEST command for each cluster. If DAA is disabled for the cluster, DAA is disabled for all attached hosts.
SAA was introduced in TS7700 R2.0, and is used to help direct new allocations to specific clusters within a multi-cluster grid. With SAA, clients identify which clusters are eligible for the scratch allocation and only those clusters are considered for the allocation request. SAA is tied to policy management, and can be tuned uniquely per defined MC.
SAA is disabled, by default, and must be enabled by using the LIBRARY REQUEST command before any SAA MC definition changes take effect. Also, the allocation assistance features might not be compatible with Automatic Allocation managers based on offline devices. Verify the compatibility before you introduce either DAA or SAA.
 
Important: Support for the allocation assistance functions (DAA and SAA) was first added to the job entry subsystem 2 (JES2) environment. Starting with z/OS V2R1, DAA and SAA are also available to JES3.
Device allocation assistance
DAA enables the host to query the TS7700 to determine which clusters are preferred for a private (specific) mount request before the actual mount is requested. DAA returns to the host a ranked list of clusters (the preferred cluster is listed first) where the mount must be run.
The selection algorithm orders the clusters in the following sequence:
1. Those clusters with the highest Copy Consistency Point
2. Those clusters that have the volume already in cache
3. Those clusters in the same cluster family
4. Those clusters that have a valid copy on tape
5. Those clusters without a valid copy
If the mount is directed to a cluster without a valid copy, a remote mount can be the result. Therefore, in special cases, even if DAA is enabled, remote mounts and recalls can still occur.
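The ordering in the previous list can be sketched as a sort key (field names are invented for illustration; the actual ranking is produced inside the TS7700 and returned to the host):

# Illustrative ranking of clusters for a specific (private) mount under DAA.
def daa_rank(cluster):
    # Lower tuples sort first: consistency, cache residency, family, tape copy.
    return (
        -cluster["copy_consistency_rank"],   # highest Copy Consistency Point first
        not cluster["in_cache"],             # cache-resident copies next
        not cluster["same_family"],          # then clusters in the same family
        not cluster["valid_copy_on_tape"],   # then a valid copy on tape
    )                                        # clusters without a copy sort last

clusters = [
    {"name": "C0", "copy_consistency_rank": 2, "in_cache": False,
     "same_family": True, "valid_copy_on_tape": True},
    {"name": "C1", "copy_consistency_rank": 2, "in_cache": True,
     "same_family": True, "valid_copy_on_tape": True},
]
ranked = sorted(clusters, key=daa_rank)
print([c["name"] for c in ranked])   # host tries C1 first, then C0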
Later, host processing attempts to allocate a device from the first cluster that is returned in the list. If an online non-active device is not available within that cluster, it moves to the next cluster in the list and tries again until a device is chosen. This process enables the host to direct the mount request to the cluster that results in the fastest mount, which is typically the cluster that has the logical volume resident in cache.
DAA improves a grid’s performance by reducing the number of cross-cluster mounts. This feature is important when copied volumes are treated as Preference Group 0 (removed from cache first), and when copies are not made between locally attached clusters of a common grid. With DAA, using the copy policy overrides to Prefer local TVC for Fast Ready mounts provides the best overall performance. Configurations that include the TS7760 and TS7720 deep cache dramatically increase their cache hit ratio.
Without DAA, configuring the cache management of replicated data as PG1 (prefer to be kept in cache with an LRU algorithm) is the best way to improve private (non-Fast Ready) mount performance by minimizing cross-cluster mounts. However, this performance gain includes a reduction in the effective grid cache size, because multiple clusters are maintaining a copy of a logical volume. To regain the same level of effective grid cache size, an increase in physical cache capacity might be required.
DAA (JES2) requires updates in host software (APAR OA24966 for z/OS V1R8, V1R9, and V1R10). DAA functions are included in z/OS V1R11 and later. DAA (JES3) is available starting with z/OS V2R1.
Scratch allocation assistance
Grid configurations that combine TS7760, TS7720, and TS7740 clusters are becoming more popular. Therefore, there is a growing need for a method that enables z/OS to favor particular clusters over others for a workload. For example, OAM or DFSMShsm Migration Level 2 (ML2) migration might favor a TS7760 or TS7720 with its deep cache, whereas an archive workload favors a TS7740 within the same grid configuration.
SAA functions extend the capabilities of DAA to the scratch mount requests. SAA filters the list of clusters in a grid to return to the host a smaller list of candidate clusters that are designated as scratch mount candidates. By identifying a subset of clusters in the grid as sole candidates for scratch mounts, SAA optimizes scratch mounts to a TS7700 grid.
Figure 2-14 shows the process of scratch allocation.
Figure 2-14 Scratch allocation direction to preferred cluster
A cluster is designated as a candidate for scratch mounts by using the Scratch Mount Candidate option on the MC construct, which is accessible from the TS7700 MI. Only those clusters that are specified through the assigned MC are considered for the scratch mount request.
When queried by the host that is preparing to issue a scratch mount, the TS7700 considers the candidate list that is associated with the MC, and considers cluster availability. The TS7700 then returns to the host a filtered, but unordered, list of candidate clusters suitable for the scratch mount operation.
The z/OS allocation process then randomly chooses a device from among those candidate clusters to receive the scratch mount. If all candidate clusters are unavailable or in service, all clusters within the grid become candidates. In addition, if the filtered list returns clusters that have no devices that are configured within z/OS, all clusters in the grid become candidates.
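A minimal sketch of this filtering (hypothetical names): the MC supplies the scratch mount candidate list, unusable clusters are removed, and if nothing usable remains, all clusters in the grid become candidates again.

# Sketch of SAA candidate filtering (illustrative only).
import random

def saa_candidates(mc_candidates, clusters):
    usable = [c for c in mc_candidates
              if clusters[c]["available"] and clusters[c]["devices_online"] > 0]
    # If no candidate is usable, every cluster in the grid becomes a candidate.
    return usable if usable else list(clusters)

clusters = {
    "C0": {"available": True,  "devices_online": 16},   # deep-cache cluster
    "C1": {"available": False, "devices_online": 16},   # in service
    "C2": {"available": True,  "devices_online": 0},    # no devices defined
}
candidates = saa_candidates(["C0", "C1"], clusters)
print(random.choice(candidates))   # z/OS then picks a device from these clusters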
Be aware that SAA (and therefore this behavior) influences only the mount selection for the logical volume. If the Management Class defines the unavailable cluster as the only cluster where the data can be written (TVC selection), the mount is still processed. However, the job is still unable to run because the selected TVC is unavailable. You see CBR4000I and CBR4171I messages, and a CBR4196D message requires a reply.
If either of the following events occurs, the mount enters the mount recovery process and does not use non-candidate cluster devices:
All devices in the selected cluster are busy.
Too few or no devices in the selected cluster are online.
You can use a LIBRARY REQUEST option to globally enable or disable the function across the entire multi-cluster grid. Only when this option is enabled does the z/OS software run the additional routines that are needed to obtain the candidate list of mount clusters from a certain composite library. This function is disabled by default.
All clusters in the multi-cluster grid must be at release 2.0 level before SAA is operational. A supporting z/OS APAR OA32957 is required to use SAA in a JES2 environment of z/OS. Any z/OS environment with earlier code can exist, but it continues to function in the traditional way in relation to scratch allocations. SAA is also supported in a JES3 environment, starting with z/OS V2R1.
2.3.16 Selective Device Access Control
There is no difference between SDAC in multicluster and stand-alone cluster environments. However, in a multi-cluster configuration, configure SDAC so that each plex gets a portion of each cluster’s devices to achieve high availability (HA).
2.3.17 Physical drives
In a multi-cluster grid, each TS7740/TS7700T can have different drives, media types, and Licensed Internal Code levels. The TS7740/TS7700T that is used to restore the exported data for merging or DR purposes must have compatible drive hardware and an equal or later Licensed Internal Code level than the source TS7740/TS7700T. Therefore, if you use Copy Export, ensure that the restoring TS7740/TS7700T has compatible hardware and a compatible Licensed Internal Code level.
2.3.18 Stacked volume
There is no difference between stacked volume in multicluster and stand-alone environments.
2.3.19 Selective Dual Copy function
The Selective Dual Copy function is used often in stand-alone clusters. However, you can also use it in a multi-cluster grid. There is no difference in its usage in a multicluster and a stand-alone environment.
2.3.20 General TVC management in multi-cluster grids
In multicluster configurations, the TS7700 cache resources are accessible by all participating clusters in the grid. The architecture enables any logical volume in cache to be accessed by any cluster through the common grid network. This capability results in the creation of a composite library effective cache size that is close to the sum of all grid cluster cache capacities.
To use this effective cache size, you need to manage the cache content. This is done by copy policies (how many copies of the logical volume need to be provided in the grid) and the cache management and removal policy (which data to keep preferably in the TVC). If you define your copy and removal policies in a way that every cluster maintains a copy of every logical volume, the effective cache size is no larger than a single cluster.
Therefore, you can configure your grid to take advantage of removal policies and a subset of consistency points to achieve a much larger effective capacity without losing availability or redundancy. Any logical volume that is stacked on physical tape can be recalled into the TVC, making it available to any cluster in the grid.
Replication order
Volumes that are written to an I/O TVC and assigned to PG0 are given priority in the replication queues of the peer TS7700 clusters. Therefore, copy queues within TS7700 clusters handle volumes with I/O TVC PG0 assignments before volumes that are configured as PG1 within the I/O TVC. This behavior is designed to enable volumes that are marked as PG0 to be flushed from cache as quickly as possible, and not left resident for replication purposes.
This behavior overrides a pure FIFO-ordered queue. There is a new setting in the MI under Copy Policy Override, Ignore cache Preference Groups for copy priority, to disable this function. When selected, it causes all PG0 and PG1 volumes to be treated in FIFO order.
 
Tip: These settings in the Copy Policy Override window override default TS7700 behavior, and can be different for every cluster in a grid.
Treatment of data that is not yet replicated to other clusters
Logical volumes that need to be replicated to one or more peer clusters are retained in disk cache regardless of their preference group assignments. This enables peer clusters to complete the replication process without requiring a recall. After the copy completes, the assigned preference group then takes effect. For example, those assigned as preference group 0 are then immediately migrated.
If replication is not completing and the retention backlog becomes too large, the original preference groups are recognized, enabling data that is not yet replicated to be migrated to tape. These volumes likely need to be recalled into disk cache later for replication to complete. The migration of not yet replicated data might be expected when replication is not completing due to an extended outage within the grid.
2.3.21 Expired virtual volumes and the Delete Expired function
The Delete Expired function is based on the time that a volume enters the scratch category. Each cluster in a multi-cluster grid uses the same time to determine whether a volume becomes a candidate, but each cluster independently chooses from the candidate list when it deletes data. Therefore, all clusters do not necessarily delete-expire a single volume at the same time. Instead, a volume that expires is eventually deleted on all clusters within the same day.
2.3.22 TVC management for TS7740 and TS7700T CPx in a multi-cluster grid
In addition to the TVC management features from a stand-alone cluster, you can decide the following information in a multi-cluster grid:
How copies from other clusters are treated in the cache
How recalls are treated in the cache
Copy files preferred to reside in cache for local clusters: COPYFSC
Normally, all caches in a multi-cluster grid are managed as one composite cache. This configuration increases the likelihood that a needed volume is in a TVC by increasing the overall effective cache capacity. By default, the volume on the TVC selected for I/O operations is preferred to be in the cache on that cluster. The copy that is made on the other clusters is preferred to be removed from cache.
For example, in a two-cluster grid, consider that you set up a Copy Consistency Point policy of RUN, RUN, and that the host has access to all virtual devices in the grid. After that, the selection of virtual devices that are combined with I/O TVC selection criteria automatically balances the distribution of original volumes and copied volumes across the TVCs.
The original volumes (newly created or modified) are preferred to be in cache, and the copies are preferred to be removed from cache. The result is that each TVC is filled with unique newly created or modified volumes, roughly doubling the effective amount of cache available to host operations.
This behavior is controlled by the LI REQ SETTING CACHE COPYFSC option. When this option is disabled (default), logical volumes that are copied into cache from a Peer TS7700 are managed as PG0 (prefer to be removed from cache).
Copy files preferred to reside in cache for remote clusters: COPYFSC
For a multi-cluster grid that is used for DR purposes, particularly when the local clusters are used for all I/O (remote virtual devices varied offline), the default cache management method might not be wanted. If the remote cluster of the grid is used for recovery, the recovery time is minimized by having most of the needed volumes already in cache. Using the default setting would result in the cache of the DR cluster being nearly empty, because all incoming logical volumes are copies and are treated as PG0.
Based on your requirements, you can set or modify this control through the z/OS Host Console Request function for the remote cluster:
When off, which is the default, logical volumes that are copied into the cache from a peer TS7700 are managed as PG0 (preferred to be removed from cache).
When on, logical volumes that are copied into the cache from a peer TS7700 are managed by using the actions that are defined for the SC construct associated with the volume, as defined at the TS7700 receiving the copy.
 
Note: COPYFSC is a cluster-wide control. All incoming copies to that specific cluster are treated in the same way. All clusters in the grid can have different settings.
Recalls preferred for cache removal
There is no difference from a stand-alone cluster environment.
2.3.23 TVC management for TS7760 or TS7720 in a multi-cluster grid
Compared to the TVC management possibilities of a TS7760 or TS7720 stand-alone cluster, a multi-cluster grid with TS7760/TS7720 clusters has several additional options for cache management. The following options apply to the TS7700D and to TS7700T CP0.
TS7760 and TS7720 Enhanced Removal Policies
The TS7720 Enhanced Volume Removal Policy provides tuning capabilities in grid configurations where one or more TS7760 and TS7720s are present. The tuning capabilities increase the flexibility of the subsystem effective cache in responding to changes in the host workload.
Because the TS7700D has a maximum capacity (the size of its TVC), after this cache fills, the Volume Removal Policy enables logical volumes to be automatically removed from this TS7700D TVC while a copy is retained within one or more peer clusters in the grid. When coupled with copy policies, TS7700D Enhanced Removal Policies provide various automatic data migration functions between the TS7700 clusters within the grid. This is also true for a TS7700T CP0.
In addition, when the automatic removal is run, it implies an override to the current Copy Consistency Policy in place, resulting in a lowered number of consistency points compared with the original configuration defined by the user.
When the automatic removal starts, all volumes in scratch categories are removed first, because these volumes are assumed to be unnecessary. To account for any mistake where private volumes are returned to scratch, these volumes must meet the same copy count criteria in a grid as the private volumes. The pinning option and minimum duration time criteria described next are ignored for scratch (Fast Ready) volumes.
To ensure that data is always in a TS7700D or TS7700T CP0, or is there for at least a minimal amount of time, a volume retention time can be associated with each removal policy. This volume retention time (in hours) enables volumes to remain in the TS7700D or TS7700T CP0 TVC for a certain time before the volume becomes a candidate for removal. The time can be 0 - 65,536 hours. A volume retention time of zero assumes no minimal requirement.
In addition to the volume retention time, three policies are available for each volume in a TS7700D or TS7700T CP0:
Pinned
The copy of the volume is never removed from this cluster. No volume retention time applies; it is implied to be infinite. After a pinned volume is moved to scratch, it becomes a priority candidate for removal, similar to the next two policies. Use this policy cautiously, to prevent cache overruns.
Prefer Remove: When Space is Needed Group 0 (LRU)
The copy of a private volume is removed if the following conditions exist:
 – An appropriate number of copies exist on peer clusters.
 – The pinning duration (in number of hours) has elapsed since the last access.
 – The available free space on the cluster has fallen below the removal threshold.
The order in which volumes are removed under this policy is based on their LRU access times. Volumes in Group 0 are removed before the removal of volumes in Group 1, except for any volumes in scratch categories, which are always removed first. Archive and backup data can be a good candidate for this removal group, because it is not likely accessed after it is written.
Prefer Keep: When Space is needed Group 1 (LRU)
The copy of a private volume is removed if the following conditions exist:
 – An appropriate number of copies exist on peer clusters.
 – The pinning duration (in number of hours) has elapsed since the last access.
 – The available free space on the cluster has fallen below the removal threshold.
 – Volumes with the Prefer Remove (LRU Group 0) policy have been exhausted.
The order in which volumes are removed under this policy is based on their LRU access times. Volumes in Group 0 are removed before the removal of volumes in Group 1, except for any volumes in scratch categories, which are always removed first.
Prefer Remove and Prefer Keep policies are similar to cache preference groups PG0 and PG1, except that removal treats both groups as LRU versus using their volume size. In addition to these policies, volumes that are assigned to a scratch category, and that were not previously delete-expired, are also removed from cache when the free space on a cluster falls below a threshold. Scratch category volumes, regardless of their removal policies, are always removed before any other removal candidates in descending volume size order.
Volume retention time is also ignored for scratch volumes. Only if the removal of scratch volumes does not satisfy the removal requirements are PG0 and PG1 candidates analyzed for removal. If an appropriate number of volume copies exist elsewhere, scratch removal can occur. If one or more peer copies cannot be validated, the scratch volume is not removed.
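The overall ordering can be summarized in a short sketch. The following Python fragment is a conceptual illustration only; the class, field, and function names are assumptions made for this example and are not TS7700 interfaces. It orders removal candidates as described above: eligible scratch volumes first, in descending size order, followed by Prefer Remove (Group 0) and then Prefer Keep (Group 1) volumes in LRU order, skipping pinned volumes, volumes still within their retention time, and volumes without enough validated peer copies.

from dataclasses import dataclass
from typing import List

# Illustrative model only; names and fields are assumptions for this sketch,
# not actual TS7700 interfaces.
@dataclass
class Volume:
    volser: str
    scratch: bool          # volume sits in a scratch (Fast Ready) category
    policy: str            # "PINNED", "PG0" (Prefer Remove), or "PG1" (Prefer Keep)
    size_mib: int
    hours_since_access: float
    retention_hours: int   # volume retention time from the removal policy
    peer_copies: int       # validated copies on peer clusters

def removal_candidates(volumes: List[Volume], required_copies: int) -> List[Volume]:
    """Return volumes in the order in which automatic removal would consider them."""
    def eligible(v: Volume) -> bool:
        # A copy must exist elsewhere before any removal; scratch volumes must
        # meet the same copy-count criteria as private volumes.
        return v.peer_copies >= required_copies

    # Scratch volumes are always removed first, largest first; pinning and
    # retention time are ignored for them.
    scratch = sorted((v for v in volumes if v.scratch and eligible(v)),
                     key=lambda v: v.size_mib, reverse=True)

    def private_group(policy: str) -> List[Volume]:
        # Private volumes are removed in LRU order, only after their retention
        # time has elapsed; pinned volumes never appear in either group.
        group = [v for v in volumes
                 if not v.scratch and v.policy == policy and eligible(v)
                 and v.hours_since_access >= v.retention_hours]
        return sorted(group, key=lambda v: v.hours_since_access, reverse=True)

    # Group 0 (Prefer Remove) is exhausted before Group 1 (Prefer Keep).
    return scratch + private_group("PG0") + private_group("PG1")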
Figure 2-15 shows a representation of the TS7720 cache removal priority.
Figure 2-15 TS7720 cache removal priority
Host command-line query capabilities are supported that help override automatic removal behaviors and disable automatic removal within a TS7700D cluster, or for the CP0 in a TS7700T. For more information, see the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User’s Guide on Techdocs. It is available at the following website:
The following host console requests are related:
LVOL {VOLSER} REMOVE
LVOL {VOLSER} REMOVE PROMOTE
LVOL {VOLSER} PREFER
SETTING CACHE REMOVE {DISABLE|ENABLE}
Delayed Replication in R3.1 changed the auto-removal algorithm so that removal of volumes where one or more delayed replication consistency points exist can take place only after those delayed replications have completed. If families are defined, only delayed consistency points within the same family must have completed.
This restriction prevents the removal of the only copy of a group before the delayed replications can complete. If no candidates are available for removal, any delayed replication tasks that have not had their grace period elapse replicate early, enabling candidates for removal to be created.
TS7700D and TS7700T CP0 cache full mount redirection
If the Enhanced Volume Removal Policies are not defined correctly or are disabled, a TS7700D TVC can become full. The same is true for a TS7700T CP0. Before the cache becomes full, a warning message appears. Eventually, the disk cache becomes full and the library enters the Out of Cache Resources state. For multi-cluster grid configurations where other clusters are present, an Out of Cache Resources event causes mount redirection so that an alternative TVC can be chosen.
During this degraded state, if a private volume mount is requested on the affected cluster, all TVC candidates are considered, even when the mount point cluster is in the Out of Cache Resources state. The grid function chooses an alternative TS7700 cluster with a valid consistency point and, if you have a TS7700D or TS7700T CP0, available cache space.
Scratch mounts that involve a TVC candidate that is Out of Cache Resources fail only if no other TS7700 cluster is eligible to be a TVC candidate. Private mounts are only directed to a TVC in an Out of Cache Resources state if there is no other eligible (TVC) candidate. When all TVCs within the grid are in the Out of Cache Resources state, private mounts are mounted with read-only access.
When all TVC candidates are either in the Paused, Out of Physical Scratch Resource, or Out of Cache Resources state, the mount process enters a queued state. The mount remains queued until the host issues a dismount command, or one of the distributed libraries exits the unwanted state. This behavior can be influenced by a new LI REQ,distlib,SETTING,PHYSLIB command.
Any mount that is issued to a cluster that is in the Out of Cache Resources state, and also has Copy Policy Override set to Force Local Copy, fails. The Force Local Copy setting excludes all other candidates from TVC selection.
 
Tip: Ensure that Removal Policies, Copy Consistency Policies, and threshold levels are applied to avoid an out-of-cache-resources situation.
Temporary removal threshold
This process is used in a TS7700 multi-cluster grid where automatic removal is enabled and a service outage is expected. Because automatic removal requires validation that one or more copies exist elsewhere within the grid, a cluster outage can prevent a successful check, which can lead to disk-cache-full conditions.
A temporary removal threshold is used to free enough space in the TS7700D or TS7700T CP0 cache in advance so that the cache does not fill up while another TS7700 cluster is in service. This temporary threshold is typically used when a TS7700 cluster is planned to be unavailable for a considerable amount of time.
The process is run on the TS7700 MI.
In addition, the temporary removal threshold can be used to free up space before a disaster recovery test with FlashCopy. During the disaster recovery test, no auto-removal or delete expire processing is allowed. Therefore, use the temporary removal threshold to ensure that enough free space is available in the clusters in the DR family for the usual production workload and the additional FlashCopies.
2.3.24 TVC management processes in a multi-cluster grid
The TVC management processes are the same as for stand-alone clusters. In addition to the premigration management and free-space management already described, two further processes exist:
Copy management (TS7740 and TS7700T CP1 - CP7)
This process applies only to a multi-cluster grid configuration, and becomes effective when the amount of non-replicated data retained in the TVC reaches a predefined threshold. It applies in particular to Deferred Copy mode, and when started reduces the incoming host data rate independently of premigration or free-space management. The purpose of this process is to prevent logical volumes from being migrated to physical tape before being copied to one or more other TS7700 clusters.
This is done to avoid a possible recall operation from being initiated by remote clusters in the grid. Only when replication target clusters are known to be unavailable, or when the amount of retained data to be copied becomes excessive, is this retained data migrated ahead of the copy process, which might lead to a future recall to complete the copy. This process is also called copy throttling.
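Conceptually, copy throttling can be thought of as reducing the accepted host data rate once the backlog of non-replicated data in the TVC passes a threshold. The following Python sketch illustrates the idea only; the threshold handling and scaling factors are assumptions for this example and do not reflect actual TS7700 tuning values.

def throttled_host_rate(nominal_rate_mbs: float,
                        unreplicated_gib: float,
                        threshold_gib: float) -> float:
    """Reduce the accepted host data rate once the backlog of data that still
    must be copied to peer clusters exceeds a threshold (copy throttling)."""
    if unreplicated_gib <= threshold_gib:
        return nominal_rate_mbs          # no backlog pressure, no throttling
    # The further the backlog grows past the threshold, the more the incoming
    # host rate is reduced (never below 10 percent in this sketch).
    excess_factor = unreplicated_gib / threshold_gib
    return max(nominal_rate_mbs / excess_factor, nominal_rate_mbs * 0.1)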
Copy time management
This process applies to multi-cluster grid configurations where the RUN Copy Consistency Point is used. When enabled, it limits the host input rate. It is intended to prevent any RUN copies from exceeding the missing-interrupt handler (MIH) timeout value for host jobs. If limiting the host input helps, the TS7700 enables the job to succeed before the MIH timer expires.
If limiting the host input does not help, the job changes to Deferred mode, and an alert is posted to the host console that the TS7700 has entered the Immediate-deferred state. You can modify this setting through the Host Console Request function to customize the level of throttling that is applied to the host input when this condition is detected. Because Synchronous mode copy is treated as Host I/O to the remote cluster, this is not applicable to Synchronous copies.
2.3.25 Copy Consistency Point: Copy policy modes in a multi-cluster grid
In a TS7700 Grid, you might want multiple copies of a virtual volume on separate clusters. You might also want to specify when those copies are created relative to the job that wrote the virtual volume, and this can be specified individually for each cluster.
Copy management is controlled through the MC storage construct. Using the MI, you can create MCs, and define where copies exist and when they are synchronized relative to the host job that created them.
When a TS7700 is included in a multi-cluster grid configuration, the MC definition window lists each cluster by its distributed library name, and enables a copy policy for each. For example, assume that three clusters are in a grid:
LIBRARY1
LIBRARY2
LIBRARY3
A portion of the MC definition window includes the cluster name and enables a Copy Consistency Point to be specified for each cluster. If a copy is to exist on a cluster’s TVC, you indicate a Copy Consistency Point. If you do not want a cluster to have a copy of the data, you specify the No Copy option.
As described in 2.3.5, “Copy consistency points” on page 68, you can define Sync, Run, Deferred, Time Delayed, or No Copy.
 
Note: The default MC is Deferred at all configured clusters, including the local cluster. The default settings are applied whenever a new construct is defined through the MI, or when a mount command specifies an MC that was not previously defined.
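The following sketch shows one way to picture an MC as a per-cluster list of Copy Consistency Points for the three-cluster example above. The MC names, and the Python representation itself, are illustrative assumptions only; this is not how the TS7700 stores these definitions.

# Example Management Class definitions for a three-cluster grid.
# 'S' = Synchronous, 'R' = Rewind Unload (RUN), 'D' = Deferred,
# 'T' = Time Delayed, 'N' = No Copy.  Names are illustrative only.
management_classes = {
    # Default behavior: Deferred at every configured cluster.
    "MCDEFLT": {"LIBRARY1": "D", "LIBRARY2": "D", "LIBRARY3": "D"},
    # Synchronous copy between the two production clusters,
    # deferred copy to the third.
    "MCSYNC":  {"LIBRARY1": "S", "LIBRARY2": "S", "LIBRARY3": "D"},
    # Immediate (RUN) copy at the local cluster, no copy elsewhere.
    "MCLOCAL": {"LIBRARY1": "R", "LIBRARY2": "N", "LIBRARY3": "N"},
}

def copy_targets(mc_name: str) -> list:
    """List the clusters that will hold a copy for the given Management Class."""
    return [cluster for cluster, mode in management_classes[mc_name].items()
            if mode != "N"]

print(copy_targets("MCSYNC"))   # ['LIBRARY1', 'LIBRARY2', 'LIBRARY3']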
Synchronous mode copy
To enable the synchronous mode copy (SMC), create an MC that specifies exactly two grid clusters with the Sync (S) copy mode.
Data is written to one TVC and simultaneously written to the second cluster. Unlike a RUN or DEFERRED copy, the data is not first written to the cache of the I/O TVC and then read again from that cache to produce the copy. Instead, the data is written directly, through a remote mount, to the synchronous mode copy cluster. One or both locations can be remote.
All remote writes use memory buffering to get the most effective throughput across the grid links. Only when implicit or explicit sync operations occur is all data at both locations flushed to persistent disk cache, providing a zero RPO for all data written up to that sync point. Mainframe tape operations do not require that each tape block is synchronized, enabling improved performance by hardening data only at critical sync points.
Applications that use data set-style stacking and migrations are the expected use cases for SMC, but any application that requires a zero RPO at sync-point granularity can benefit from the Synchronous mode copy feature.
 
Important: The Synchronous mode copy takes precedence over any Copy Override settings.
Meeting the zero RPO objective can be a flexible requirement for certain applications and users. Therefore, a series of extra options are provided if the zero RPO cannot be achieved. For more information, see the IBM TS7700 Series Best Practices - Synchronous Mode Copy white paper that is available at the following website:
Several new options are available with the synchronous mode copy. These options are described in the following sections.
Synchronous Deferred On Write Failure option
The default behavior of SMC is to fail a write operation if either of the clusters with the S copy mode is unavailable or becomes unavailable during the write operations.
Enable this option to allow update operations to continue to any valid consistency point in the grid. If there is a write failure, the failed S locations are set to a state of synchronous-deferred. After the volume is closed, any synchronous-deferred locations are updated to an equivalent consistency point through asynchronous replication. If the Synchronous Deferred On Write Failure option is not selected, and a write failure occurs at either of the S locations, host operations fail.
During allocation, an R or D site is chosen as the primary consistency point only when both S locations are unavailable.
Whenever a Synchronous copy enters a synchronous-deferred state, the composite library enters a Degraded state. This can be prevented by using the LI REQ DEFDEG option.
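The decision flow can be illustrated with a small sketch. The following Python fragment is a conceptual model only; the function and parameter names are assumptions for this example and do not represent TS7700 internals.

# Conceptual sketch of the Synchronous Deferred On Write Failure decision;
# function and variable names are invented for illustration.
def handle_s_location_failure(sync_deferred_on_write_failure: bool,
                              healthy_s_locations: int,
                              other_valid_consistency_points: int) -> str:
    """Decide what happens when one or both 'S' locations cannot be written."""
    if healthy_s_locations == 2:
        return "continue synchronous writes"          # no failure at all
    if not sync_deferred_on_write_failure:
        return "fail host write"                      # default behavior
    if healthy_s_locations == 1 or other_valid_consistency_points > 0:
        # Failed S locations are marked synchronous-deferred and caught up by
        # asynchronous replication after the volume is closed; the composite
        # library reports Degraded unless the LI REQ DEFDEG option says otherwise.
        return "continue, mark failed S locations synchronous-deferred"
    return "fail host write"                          # nowhere valid to write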
Open Both Copies On Private Mount option
Enable this option to open both previously written S locations when a private mount occurs. If one or both S locations are on back-end tape, the tape copies are first recalled into disk cache within those locations. The Open Both Copies On Private Mount option is useful for applications that require synchronous updates during appends. Private mounts can be affected by cache misses when this option is used. Consider these other circumstances:
If a private mount on both locations is successfully opened, all read operations use the primary location. If any read fails, the host read also fails, and no failover to the secondary source occurs unless a z/OS dynamic device reconfiguration (DDR) swap is initiated.
If a write operation occurs, both locations receive write data, and they must synchronize it to TVC disk during each implicit or explicit synchronization command.
If either location fails to synchronize, the host job either fails or enters the synchronous-deferred state, depending on whether the Synchronous Deferred On Write Failure option is enabled.
Open Both Copies On z/OS implied Private Mount option
Enable this option to use the DISP=xxxx parameter from the JCL to identify whether a volume is only being read or can be modified:
If DISP=OLD is specified, the TS7700 assumes that only a read occurred, and opens only a single copy on a private mount.
If DISP=SHR is specified, z/OS converts it to DISP=OLD, because tape does not support DISP=SHR. The mount is then treated as though it were coded with DISP=OLD.
If DISP=MOD is specified, the TS7700 assumes that an append occurs, and opens the copies on both sides.
Some applications open the virtual volume with a DISP=OLD parameter, and still append the volume. In this case, the append is successful, and a synchronous-deferred copy is produced.
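The disposition handling can be summarized as follows. The Python sketch below is a reading aid only; the function name and returned strings are invented for this example.

# Illustrative mapping of JCL disposition to the open behavior described above;
# names are invented for this sketch.
def open_mode_for_disp(disp: str) -> str:
    """Return how the TS7700 treats a private mount when the z/OS implied
    update option is enabled."""
    disp = disp.upper()
    if disp == "SHR":
        disp = "OLD"        # tape does not support DISP=SHR; z/OS converts it
    if disp == "OLD":
        return "open single copy (read assumed)"
    if disp == "MOD":
        return "open both S copies (append assumed)"
    raise ValueError(f"unexpected disposition: {disp}")

# A job that opens with DISP=OLD and still appends succeeds, but produces a
# synchronous-deferred copy rather than a synchronous one.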
 
Tip: With the introduction of the new z/OS implied update option, we advise you to use this option for DFSMShsm or equivalent products.
Rewind Unload (RUN)
If a Copy Consistency Point of RUN is defined for a cluster in the MC that is assigned to the volume, a consistent copy of the data must exist in that cluster’s TVC before command completion is indicated for the Rewind/Unload command.
If multiple clusters have a Copy Consistency Point of RUN, all of their associated TVCs must have a copy of the data before command completion is indicated for the Rewind/Unload command. These copies are produced in parallel. Options are available to override this requirement for performance tuning purposes.
Deferred
If a Copy Consistency Point of Deferred is defined, the copy to that cluster’s TVC can occur any time after the Rewind/Unload command has been processed for the I/O TVC.
Time Delayed Replication Policy in R3.1 or later
In the TS7700, all types of data can be stored. Some of this data usually has a short lifetime, and is replaced with other (more current) data. This is true for daily database backups, logs, and daily produced reports, such as generation data groups (GDGs), but also for other data. However, in certain conditions, this data is not replaced with more current content. Therefore, the actual logical volumes need to be treated as archive data (for example, GDGs).
Without the Time Delayed Replication Policy, the only choice was between replication or no copy to the target clusters. Replication to the TS7700 (especially to a TS7740 or TS7700T CP1 - CP7 in a multi-cluster hybrid grid) led to the situation where resources were being used for data that was soon to expire. Not replicating this data to the TS7700 meant that if this data had to be treated as archive, no additional copy could be created automatically.
Therefore, customers normally chose the always-copy option and accepted the processing burden of replicating data that might soon expire. With the Time Delayed Replication Policy, you can now specify when the replication is done. A deferred copy is made to all T sites after X hours have passed since volume creation or last access. The process to identify newly created T volumes runs every 15 minutes. You can specify only one T time for all Time Delayed replication target clusters in the MC. You can specify 1 - 65,535 hours.
Data that has already expired is still copied to the target clusters in these circumstances:
The TMS has not yet returned the volume to scratch.
The logical volume is scratch but not reused, and the scratch category has no expire delete definition.
The logical volume is scratch but not reused, and the scratch category has an expire delete setting, which has not yet been reached for this specific logical volume. Ensure that the expire delete setting for the scratch category and the Time Delayed Replication time are coordinated.
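The evaluation that decides whether a time-delayed copy is due can be pictured with a small sketch. The following Python fragment is illustrative only; the field and function names are assumptions for this example, and the actual TS7700 evaluation runs internally every 15 minutes.

import time

# Conceptual sketch only; names are assumptions for illustration.
def time_delayed_copy_due(creation_or_last_access_epoch: float,
                          delay_hours: int,
                          now_epoch: float = None) -> bool:
    """True when a volume has aged past the Management Class 'T' delay and a
    deferred copy to the time-delayed target clusters should be queued."""
    now = now_epoch if now_epoch is not None else time.time()
    age_hours = (now - creation_or_last_access_epoch) / 3600.0
    return age_hours >= delay_hours   # the delay can be set to 1 - 65,535 hours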
Using the Time Delayed policy, the automatic removal in the TS7700D or TS7700T CP0 can be influenced. The following rules apply:
In a grid without cluster families, all T copies need to be processed before an automatic removal can occur on any TS7700D or TS7700T CP0 in the grid.
If cluster families are defined, all T copies in the family must be processed before auto removal can occur on any TS7700D or TS7700T CP0 in the cluster family. However, a logical volume in a TS7700D or TS7700T CP0 can be removed even if all T copies in a different family have not been processed.
A TS7700D or TS7700T CP0 might run out of removal candidates, and the only candidates in sight are those delayed replications that have not yet had their time expire. In this case, the TS7700D or TS7700T CP0 detects this condition and triggers a subset of time-delayed copies to replicate early to create removal candidates. The TS7700D or TS7700T CP0 prioritizes these copies and replicates them as fast as it can. To avoid this situation, configure delay times to be early enough to provide enough removal candidates to complete production workloads.
No Copy
No copy to this cluster is performed.
For examples of how Copy Consistency Policies work in different configurations, see 2.4, “Grid configuration examples” on page 103.
A mixture of Copy Consistency Points can be defined for an MC, enabling each cluster to have a unique consistency point.
 
Tip: The Copy Consistency Point is considered for both scratch and specific mounts.
Management Class locality to a cluster
MCs for the TS7700 are created at the MI associated with a cluster. The same MC can be defined differently at each cluster, and there are valid reasons for doing so. For example, one of the functions that are controlled through MC is to have a logical volume copied to two separate physical volume pools.
You might want to have two separate physical copies of your logical volumes on one of the clusters and not on the others. Through the MI associated with the cluster where you want the second copy, specify a secondary pool when defining the MC. For the MC definition on the other clusters, do not specify a secondary pool. For example, you might want to use the Copy Export function to extract a copy of data from the cluster to take to a DR site.
 
Important: During mount processing, the Copy Consistency Point information that is used for a volume is taken from the MC definition for the cluster with which the mount vNode is associated.
Define the Copy Consistency Point definitions of an MC to be the same on each cluster to avoid confusion about where copies exist. You can devise a scenario in which you define separate Copy Consistency Points for the same MC on each of the clusters. In this scenario, the location of copies, and when the copies are consistent with the host that created the data, differs depending on which cluster processes the mount.
In these scenarios, use the Retain Copy mode option. When the Retain Copy mode is enabled against the currently defined MC, the previously assigned copy modes are retained independently of the current MC definition.
Retain Copy mode across grid mount points
Retain Copy mode is an optional setting in the Management Class where a volume’s existing Copy Consistency Points are used rather than applying the Copy Consistency Points that are defined at the mounting cluster. This setting applies to private volume mounts for reads or write appends. It is used to prevent more copies of a volume in the grid than wanted.
Figure 2-16 shows a four-cluster grid where Cluster 0 replicates to Cluster 2, and Cluster 1 replicates to Cluster 3. The wanted result is that only two copies of data remain in the grid after the volume is accessed. Later, the host wants to mount the volume that is written to Cluster 0. On systems where DAA is supported, DAA is used to determine which cluster is the best cluster from which to request the mount. DAA asks the grid from which cluster to allocate a virtual drive. The host then attempts to allocate a device from the best cluster (Cluster 0).
Figure 2-16 Four-cluster grid with DAA
 
Remember: DAA support for JES3 was added in z/OS V2R1.
On systems where DAA is not supported, 50% of the time, the host allocates to the cluster that does not have a copy in its cache. When the alternative cluster is chosen, the existing copies remain present, and more copies are made to the new Copy Consistency Points defined in the Management Class, resulting in more copies. If host allocation selects the cluster that does not have the volume in cache, one or two extra copies are created on Cluster 1 and Cluster 3 because the Copy Consistency Points indicate that the copies need to be made to Cluster 1 and Cluster 3.
For a read operation, four copies remain. For a write append, three copies are created. This process is illustrated in Figure 2-17.
Figure 2-17 Four-cluster grid without DAA, Retain Copy mode disabled
With the Retain Copy mode option set, the original Copy Consistency Points of a volume are used rather than applying the Management Class with the corresponding Copy Consistency Points of the mounting cluster. A mount of a volume to the cluster that does not have a copy in its cache results in a cross-cluster (remote) mount instead.
The cross-cluster mount uses the cache of the cluster that contains the volume. The Copy Consistency Points of the original mount are used. In this case, the result is that Cluster 0 and Cluster 2 have the copies, and Cluster 1 and Cluster 3 do not. This is shown in Figure 2-18.
Figure 2-18 Four-cluster grid without DAA, Retain Copy mode enabled
Another example of the need for Retain Copy mode is when one of the production clusters is not available. All allocations are made to the remaining production cluster. When the volume exists only in Cluster 0 and Cluster 2, the mount to Cluster 1 results in a total of three or four copies. This applies to JES2 and JES3 without Retain Copy mode enabled (Figure 2-19).
Figure 2-19 Four-cluster grid, one production cluster down, Retain Copy mode disabled
Figure 2-20 shows the situation where Retain Copy mode is enabled and one of the production clusters is down. In the scenario where the cluster that contains the volume to be mounted is down, the host allocates a device on the other cluster, in this case Cluster 1. A cross-cluster mount that uses the Cluster 2 cache occurs, and the original two copies remain. If the volume is appended to, it is changed on Cluster 2 only, and Cluster 0 receives a copy of the altered volume when it rejoins the grid. Until then, only one valid copy is available in the grid.
Figure 2-20 Four-cluster grid, one production cluster down, Retain Copy mode enabled
For more information, see the IBM Virtualization Engine TS7700 Series Best Practices - TS7700 Hybrid Grid Usage white paper at the Techdocs website:
2.3.26 TVC (I/O) selection in a multi-cluster grid
The TVC associated with one of the clusters in the grid is selected as the I/O TVC for a specific tape mount request during mount processing. The vNode that receives the mount request is referred to as the mount vNode.
The TS7700 filters based on the following elements:
Cluster availability (offline clusters, clusters in service prep, and degraded clusters are removed from consideration).
Mount type:
 – Scratch: Remove all TS7700D clusters and TS7700T CP0 partitions with out-of-cache conditions, and remove No Copy clusters.
 – Private: Remove clusters without a valid copy.
Preferences regarding the consistency point, override policies, and families.
Based on these three elements, an obvious favorite might already emerge. If not, further filtering occurs, and the remaining choices are ranked by certain performance criteria:
Cache residency
Recall times
Network latency
Host workload
The list is ordered favoring the clusters that are thought to provide the optimal performance.
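The filter-then-rank behavior can be illustrated with a short sketch. The following Python fragment is a conceptual model only; the fields and the single performance score are assumptions for this example and do not reflect the actual TS7700 algorithm.

from dataclasses import dataclass
from typing import List

# Conceptual model of I/O TVC selection; all fields and the scoring are
# invented for this sketch.
@dataclass
class Cluster:
    name: str
    available: bool
    out_of_cache: bool        # relevant for TS7700D / TS7700T CP0
    has_valid_copy: bool
    copy_mode: str            # 'S', 'R', 'D', 'T', or 'N' for this volume's MC
    performance_score: float  # higher = better (cache residency, latency, load)

def select_io_tvc(clusters: List[Cluster], scratch_mount: bool) -> Cluster:
    # Step 1: remove clusters that cannot serve the mount at all.
    candidates = [c for c in clusters if c.available]
    if scratch_mount:
        candidates = [c for c in candidates
                      if not c.out_of_cache and c.copy_mode != "N"]
    else:
        candidates = [c for c in candidates if c.has_valid_copy]
    if not candidates:
        raise RuntimeError("no eligible TVC candidate")
    # Step 2: rank the remaining candidates by expected performance.
    return max(candidates, key=lambda c: c.performance_score)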
With Release 3.3, two new LI REQ parameter settings are introduced that influence the TVC selection. You can use the SETTING2,PHYSLIB parameter to determine how a shortage or unavailability condition is treated in a TS7700T.
In addition, you can use the LI REQ parameter LOWRANK to give a cluster a lower ranking in the TVC selection. This parameter can be used under special conditions before you enter Service Mode. It influences the TVC selection for Host I/O and the copy and mount behavior. In addition, it is a persistent setting, and can be set on every cluster independently. To avoid a negative impact on your data availability, reset LOWRANK to the default after the maintenance is done.
For more information, see IBM TS7700 Series z/OS Host Command Line Request User’s Guide, found on the following website:
2.3.27 TVC handling in an unavailability condition
In a stand-alone environment with Release 3.3, you can define that TS7700T CPx partitions react as they did in prior releases and not accept further Host I/O. You can also ignore the unavailability condition and let further Host I/O processing proceed.
In a grid, you can define that the cluster is treated as degraded, which means that this cluster has a lower priority in the TVC selection. However, all TVC selection criteria are acknowledged, and if no other cluster can fulfill the selection criteria, the degraded cluster is chosen as the TVC cluster.
In addition, you can specify whether this cluster pulls copies from other clusters.
2.3.28 Remote (cross) cluster mounts
A remote (also known as cross) cluster mount is created when the I/O TVC selected is not in the cluster that owns the allocated virtual device. The logical volume is accessed through the grid network by using TCP/IP. Each I/O request from the host results in parts of the logical volume moving across the network. Logical volume data movement through the grid is bidirectional, and depends on whether the operation is a read or a write.
The amount of data that is transferred depends on many factors, one of which is the data compression ratio provided by the host FICON adapters. To minimize grid bandwidth requirements, only compressed data that is used or provided by the host is transferred across the network. Read-ahead and write buffering are also used to get the maximum performance from a remote cluster mount.
2.3.29 TVC encryption
From a technical point of view, TVC encryption is a cluster feature. Each cluster can be treated differently from the others in the multi-cluster grid. There is no difference from a stand-alone cluster.
2.3.30 Logical and stacked volume management
There is no real difference from a stand-alone environment. Each cluster is a separate entity. You can define different stacked volume pools with different rules on each distributed library.
2.3.31 Secure Data Erase
There is no difference from a stand-alone cluster.
2.3.32 Copy Export
In general, the Copy Export feature provides the same functions as in a stand-alone cluster. However, there are further considerations:
The Copy Export function is supported on all configurations of the TS7740/TS7700T, including grid configurations. In a grid configuration, each TS7740/TS7700T is considered a separate source TS7740/TS7700T.
Only the physical volumes that are exported from a source TS7740/TS7700T can be used for the recovery of a source TS7740/TS7700T. Physical volumes from more than one source TS7740/TS7700T in a grid configuration cannot be combined for recovery use.
 
Important: Ensure that scheduled Copy Export operations are always run from the same cluster for the same recovery set. Other clusters in the grid can also initiate independent copy export operations if their exported tapes are kept independent and used for an independent restore. Exports from two different clusters in the same grid cannot be joined.
Recovery that is run by the client is only to a stand-alone cluster configuration. After recovery, the Grid MES offering can be applied to re-create a grid configuration.
When a Copy Export operation is initiated, only the following logical volumes are considered for export:
 – They are assigned to the secondary pool specified in the Export List File Volume.
 – They are also on a physical volume of the pool or in the cache of the TS7700 running the export operation.
For a Grid configuration, if a logical volume is to be copied to the TS7700 that will run the Copy Export operation, but that copy has not yet completed when the export is initiated, it is not included in the current export operation. Ensure that all logical volumes that need to be included have completed replication to the cluster where the export process is run.
A service from IBM is available to merge a Copy Export set in an existing grid. Talk to your IBM SSR.
2.3.33 Encryption of physical tapes
There is no difference from a stand-alone cluster.
2.3.34 Autonomic Ownership Takeover Manager
AOTM is an optional function by which, after a TS7700 Cluster failure, one of the methods for ownership takeover is automatically enabled without operator intervention. Enabling AOTM improves data availability levels within the composite library.
AOTM uses the TS3000 TSSC associated with each TS7700 in a grid to provide an alternative path to check the status of a peer TS7700. Therefore, every TS7700 in a grid must be connected to a TSSC. To take advantage of AOTM, you must provide an IP communication path between the TS3000 TSSCs at the cluster sites. Ideally, the AOTM function uses an independent network between locations, but this is not a requirement.
With AOTM, the user-configured takeover mode is enabled if normal communication between the clusters is disrupted, and the cluster that is running the takeover can verify that the other cluster has failed or is otherwise not operational. For more information, see 9.3.11, “The Service icon” on page 508.
When a cluster loses communication with another peer cluster, it prompts the attached local TS3000 to communicate with the remote failing cluster’s TS3000 to confirm that the remote TS7700 is down. If it is verified that the remote cluster is down, the user-configured takeover mode is automatically enabled. If it cannot validate the failure, or if the system consoles cannot communicate with each other, AOTM does not enable a takeover mode. In this scenario, ownership takeover mode can be enabled only by an operator through the MI.
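The AOTM decision path can be pictured with a small sketch. The following Python fragment is a conceptual illustration only; the function and parameter names are assumptions made for this example.

# Conceptual sketch of the AOTM decision path; names are invented.
def aotm_decision(peer_reachable_over_grid: bool,
                  tssc_path_available: bool,
                  tssc_confirms_peer_down: bool,
                  configured_takeover_mode: str) -> str:
    """Return the ownership takeover action when a peer stops responding."""
    if peer_reachable_over_grid:
        return "no takeover needed"
    if tssc_path_available and tssc_confirms_peer_down:
        # The alternative TS3000 TSSC path verified a real cluster failure.
        return f"enable {configured_takeover_mode} automatically"
    # Network problem only, or the consoles cannot talk to each other:
    # takeover must be enabled manually by an operator through the MI.
    return "wait for operator decision"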
Without AOTM, an operator must determine whether one of the TS7700 clusters has failed, and then enable one of the ownership takeover modes. This process is required to access the logical volumes that are owned by the failed cluster. It is important that write ownership takeover (WOT) be enabled only when a cluster has failed, and not when there is only a problem with communication between the TS7700 clusters.
If ownership takeover is enabled in the read/write mode against a network-inaccessible cluster, and the inaccessible cluster is in fact handling host activity, volumes can be modified at both locations. This results in conflicting volume versions. When the Read Ownership Takeover (ROT) is enabled rather than read/write mode, the original owning cluster can continue to modify the volume, where peers have read-only access to an earlier version.
Therefore, manually enabling ownership takeover when only network issues are present should be limited to only those scenarios where host activity is not occurring to the inaccessible cluster. If two conflicting versions are created, the condition is detected when communications are resolved, and the volumes with conflicting versions are moved into an error state. When in this error state, the MI can be used to choose which version is most current.
Even if you do not enable AOTM, configure it to provide protection against a manual takeover mode being selected while the cluster is still functional. The additional TS3000 TSSC path is used to determine whether an unavailable cluster is still operational. This path prevents the user from forcing a cluster online when it must not be, or from enabling a takeover mode that can result in dual volume use.
2.3.35 Selective Write Protect for disaster recovery testing
This function enables clients to emulate disaster recovery events by running test jobs at a DR location within a TS7700 grid configuration, and enabling volumes only within specific categories to be manipulated by the test application. This configuration prevents any changes to production-written data. Up to 32 categories can be identified and set to be included or excluded from Write Protect Mode by using the Category Write Protect Property table.
When a cluster is write protect-enabled, all volumes that are protected cannot be modified or have their category or storage construct names modified. As in the TS7700 write protect setting, the option is at the cluster scope and configured through the MI. Settings are persistent.
Also, the new function enables any volume that is assigned to one of the categories that are contained within the configured list to be excluded from the general cluster’s write protect state. The volumes that are assigned to the excluded categories can be written to or have their attributes modified. In addition, those scratch categories that are not excluded can optionally have their Fast Ready characteristics ignored, including Delete Expire and hold processing. This enables the DR test to mount volumes as private that the production environment has since returned to scratch (they are accessed as read-only).
One exception to the write protect is those volumes in the insert category. To enable a volume to be moved from the insert category to a write protect-excluded category, the source category of insert cannot be write-protected. Therefore, the insert category is always a member of the excluded categories.
Be sure that you have enough scratch space when Expire Hold processing is enabled to prevent the reuse of production scratched volumes when you are planning for a DR test. Suspending the volumes’ Return-to-Scratch processing during the DR test is also advisable.
Because Selective Write Protect is a cluster-wide function, separate DR drills can be conducted simultaneously within one multi-cluster grid, if each cluster has its own independent client-configured settings.
2.3.36 FlashCopy for disaster recovery testing
This function builds on the TS7700's ability to provide DR testing capabilities by introducing FlashCopy consistency points within a DR location. A DR test host can use this DR family to run a DR test, while production continues on the remaining clusters of the grid.
For the DR host, the FlashCopy function provides data on a time consistent basis (Time zero). The production data continues to replicate during the entire test. The same volumes can be mounted at both sites at the same time, even with different data. To differentiate between read-only production data at time zero and fully read/write-enabled content that is created by the DR host, the selective write protect features must be used.
All access to write-protected volumes involves a snapshot from the time zero FlashCopy. Any production volumes that are not yet replicated to the DR location at the time of the snapshot cannot be accessed by the DR host, which mimics a true disaster.
Through selective write protect, a DR host can create new content to segregated volume ranges. There are 32 write exclusion categories now supported, versus the previous 16. Write protected media categories cannot be changed (by the DR host) while the Write Protection mode is enabled. This is true not only for the data, but also for the status of the volumes.
Therefore, it is not possible (by the DR host) to set production volumes from scratch to private or vice versa. When the DR site has just TS7700Ds, the flash that is initiated during the DR test is across all TS7700Ds in the DR-Family. As production returns logical volumes to scratch, deletes them, or reuses them, the DR site holds on to the old version in the flash. Therefore, return to scratch processing can now run at the production side during a test, and there is no need to defer it or use expire hold.
The TS7740 can be part of a DR family, but it has no FlashCopy capability itself. Therefore, the TS7740 can be used only for remote mounts from the TS7720 or TS7760, and the devices of the TS7740 must not be used for mounting purposes. Enablement is done by configuring DR families, and by enabling Write Protect or Flash, by using the LI REQ (Library Request) command against all clusters in a DR family.
For more information about FlashCopy setup, see Chapter 9, “Operation” on page 339. For DR testing examples, see Chapter 13, “Disaster recovery testing” on page 767.
The following items are extra notes for R3.1 FlashCopy for DR Testing:
Only TS7700 Grid configurations where all clusters are running R3.1 or later, and at least one TS7720 or TS7760 cluster exists, are supported.
Disk cache snapshot occurs to one or more TS7720 and TS7760 clusters in a DR family within seconds. TS7740 clusters do not support snapshot.
All logical volumes in a TS7700T CP0 partition, and all logical volumes from CPx kept in cache, are part of the DR-Flash.
If a TS7740 cluster is present within a DR family, an option is available enabling the TS7740 live copy to be accessed if it completed replication before time zero of the DR test. Although the initial snapshot itself does not require any extra space in cache, this might apply if the TS7720 or TS7760 has its live copy removed for some reason.
Volumes in the TS7720T that are stored in CPx partitions, and that are already migrated to physical tape, are not part of the DR-Flash. They can still be accessed if the LIVECOPY Option is enabled and the logical volume was created before time zero.
TS7720 clusters within the DR location should be increased in size to accommodate the delta space retained during the test:
 – Any volume that was deleted in production is not deleted in DR.
 – Any volume that is reused in production results in two DR copies (old at time zero and new).
Automatic removal is disabled within TS7720 clusters during DR test, requiring a pre-removal to be completed before testing.
LI REQ DR Family settings can be completed in advance, enabling a single LI REQ command to be run to initiate the flash and start DR testing.
DR access introduces its own independent ownership, and enables DR read-only volumes to be mounted in parallel to the production-equivalent volumes.
The following terminology is used for FlashCopy for DR Testing:
Live copy A real-time instance of a virtual tape within a grid that can be modified and replicated to peer clusters.
This is the live instance of a volume in a cluster that is the most current true version of the volume. It can be altered by a production host, or it can be the content that is created during a DR test.
FlashCopy A snapshot of a live copy at time zero. The content in the FlashCopy is fixed, and does not change even if the original copy is modified. A FlashCopy might not exist if a live volume was not present at time zero. In addition, a FlashCopy does not imply consistency, because the live copy might have been an obsolete or incomplete replication at time zero.
DR family A set of TS7700 clusters (most likely those at the DR site) that serve the purpose of disaster recovery. One to five clusters can be assigned to a DR family.
The DR family is used to determine which clusters are affected by a flash request or write-protect request by using the LI REQ (Library Request command).
Write Protect Mode When Write Protect Mode is enabled on a cluster, host commands fail if they are sent to logical devices in that cluster and attempt to modify a volume’s data or attributes. A FlashCopy is created on a cluster only when the cluster is in Write Protect Mode. Also, only write-protected virtual tapes are flashed. Virtual tapes that are assigned to the excluded categories are not flashed.
Time zero The time when the FlashCopy is taken within a DR family. The time zero mimics the time when a real disaster happens. Customers can establish the time zero by using LI REQ (Library Request command).
2.3.37 Grid resiliency functions
A TS7700 Grid is made up of two or more TS7700 clusters interconnected through Ethernet connections. It is designed and implemented as a business continuance solution with implied enterprise resiliency. When a cluster in the grid has a problem, the multi-cluster grid should accommodate the outage and be able to continue the operation even if the state of the grid is degraded.
Grid-wide problems might occur because a single cluster in the grid experiences a problem. When problems in one cluster cause it to be sick or unhealthy, but not completely dead (Sick But Not Dead (SBND)), the peer clusters might be greatly affected, and customer jobs can be affected far beyond a simply degraded state (long mount times, failed sync mode writes, and so on).
Grid Resiliency Improvements are the functions that identify the symptoms and make the grid more resilient when a single cluster experiences a problem, by removing the sick or unhealthy cluster from the grid, either explicitly or implicitly through different methods. By removing it, the rest of the peer clusters can treat it as “dead” and avoid further handshakes with it until it can be recovered.
The grid resiliency function is designed to detect permanent impacts. It is not designed to react to these situations:
Temporary impacts (such as small network issues)
Performance issues due to high workload
Note that due to the nature of a TS7700 grid, this isolation is not comparable to a disk-based mechanism, such as HyperSwap or similar techniques. Such techniques are based on locally installed devices, whereas TS7700 grids can span thousands of miles. Therefore, the detection can take much longer than in the disk world, and the resulting actions might also take longer.
The customer can specify different thresholds (for example, mount timing, handshake and token timings, and error counters) and other parameters to influence the level of sensitivity to events that affect performance. To avoid a false fence condition, use the defaults for the thresholds at the beginning, and adjust the parameters only if necessary.
Two different mechanisms exist in the grid:
Local Fence
As in a stand-alone environment, the cluster decides, based on hardware information, that it has suffered a SBND condition and fences itself. The function is automatically enabled after R4.1.2 is installed on the cluster, even if other clusters in the grid are not yet running on R4.1.2 or higher level. The local fence has no parameters or options and cannot be disabled.
Remote Fence
Depending on the parameters, one of the clusters in a grid might detect an unhealthy state of a peer cluster. In a grid with three or more clusters, all the clusters need to concur that a specific cluster is unhealthy for it to be fenced.
In a two cluster grid, both clusters need to agree that the same cluster is SBND. Otherwise, no remote fence occurs.
The remote fence is disabled by default. If the customer enables the remote fence action, multiple parameters need to be defined:
Primary Action:
 – ALERT: An Alert message is sent to the attached hosts, and the cluster is fenced. However, the cluster remains online and is still part of the grid, so the unhealthy situation is not solved automatically. Consider this option if you want to be notified that an SBND condition occurred, but want to run the necessary actions manually.
 – REBOOT: The SBND cluster is rebooted. If the reboot is successful, the cluster is automatically varied back online to the grid. If the reboot is not successful, the reboot action is repeated twice before the cluster remains offline. Consider this option if availability of the complete grid is the main goal, such as when the remaining grid resources cannot handle the workload during peak times.
 – REBOFF: The SBND cluster is rebooted, but stays offline. Consider this option if an analysis of the situation is always required before the cluster comes back to the grid.
 – OFFLINE,FORCE: The SBND cluster is set offline immediately with no reboot. This option provides the quickest shutdown, but the later recovery might take longer.
Secondary Action: The customer can enable a secondary option. If it is enabled and the primary option fails (for example, the primary action cannot be executed), the cluster is isolated from the grid. That means that only the grid link ports are disabled, so there is no communication between the SBND cluster and the other clusters in the grid. Be aware that no actions are taken on the virtual devices in Release 4.1.2. If virtual devices are still online to the connected IBM Z LPARs, the cluster can still accept mounts. However, this has multiple negative side effects:
 – Replication is not feasible to and from this cluster.
 – Private mounts can be routed to a cluster where a drive is available, but the ownership might not be transferable to this cluster because of the grid link isolation. Such a mount is not executed, and the job hangs.
 – Scratch mounts can be executed successfully only if ownership for scratch volumes is available. To avoid these situations, vary the devices of the isolated cluster offline as soon as possible on all attached IBM Z LPARs.
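The fence actions can be summarized with a small sketch. The following Python fragment is a reading aid only; the function name, parameters, and result strings are invented for this example and are not a TS7700 interface.

# Conceptual sketch of how a confirmed SBND condition is acted on; names and
# structure are invented for illustration only.
def remote_fence(primary_action: str, primary_succeeded: bool,
                 secondary_isolation_enabled: bool) -> str:
    if primary_succeeded:
        return {
            "ALERT":         "hosts alerted; cluster fenced but still online",
            "REBOOT":        "cluster rebooted and varied back online if healthy",
            "REBOFF":        "cluster rebooted and left offline for analysis",
            "OFFLINE,FORCE": "cluster taken offline immediately, no reboot",
        }[primary_action]
    if secondary_isolation_enabled:
        # Grid link ports are disabled; vary the isolated cluster's devices
        # offline from all attached IBM Z LPARs as soon as possible.
        return "cluster isolated from the grid (grid links disabled)"
    return "no automatic action; manual intervention required"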
In addition to the thresholds, the evaluation window and the number of consecutive occurrences can be specified.
For more information see IBM TS7700 Series Grid Resiliency Improvements User’s Guide at:
2.3.38 Service preparation mode
The transition of a cluster into service mode is called service prep. Service prep enables a cluster to be gracefully and temporarily removed as an active member of the grid. The remaining sites can acquire ownership of the volumes while the site is away from the grid. If a volume owned by the service cluster is not accessed during the outage, ownership is retained by the original cluster. Operations that target the distributed library that is entering service are completed by the site going into service before the move to service completes.
Other distributed libraries within the composite library remain available. The host device addresses that are associated with the site in service send Device State Change alerts to the host, enabling those logical devices that are associated with the service preparation cluster to enter the pending offline state.
If a cluster enters service prep, the following copy actions are processed:
All copies in flight (running currently), regardless of whether they are going to or from the cluster, are finished.
No copies from other clusters to the cluster that is entering the service mode are started.
All logical volumes that have not been copied yet, and that need to have at least one copy outside the cluster that is entering the service mode, are copied to at least one other cluster in the grid. This is true for all copies except those that are Time Delayed.
For time delayed copies, all data that should be copied in the next 8 hours to the target clusters is copied. All other data is not copied, even if the data is only in the cluster that is entering the service mode.
When service prep completes and the cluster enters service mode, nodes at the site in service mode remain online. However, the nodes are prevented from communicating with other sites. This stoppage enables service personnel to run maintenance tasks on the site’s nodes, run hardware diagnostics, and so on, without affecting other sites.
Only one service prep can occur within a composite library at a time. If a second service prep is attempted at the same time, it fails. You should put only one cluster in the service mode at the same point in time.
A site in service prep automatically cancels and reverts to an ONLINE state if any ONLINE peer in the grid experiences an unexpected outage. The last ONLINE cluster in a multicluster configuration cannot enter the service prep state. This restriction includes a stand-alone cluster. Service prep can be canceled by using the MI, or by the IBM SSR at the end of the maintenance procedure. Canceling service prep returns the subsystem to a normal state.
 
Important: We advise you not to place multiple clusters at the same time in service or service preparation. If you must put multiple clusters in service at once, wait for a cluster to be in final service mode before you start the service preparation for the next cluster.
If you use SAA, consider disabling SAA for the duration of the maintenance. This is necessary if you usually vary the drives offline to the z/OS systems before you enter service preparation mode. In this case, the cluster is not yet identified as being in service, but no devices are online to z/OS. That would cause a job to go into device allocation recovery if this cluster were the only SAA candidate.
If you have multiple SAA candidates defined, you might still consider disabling SAA. This is necessary if the number of SAA-selectable devices would otherwise not be sufficient to run all of the jobs concurrently.
After SAA is enabled again, you should restart all attached OAM address spaces to ensure that the changed SAA state is recognized by the attached z/OS. If you do not restart the OAM address spaces, the system might react as though SAA is still disabled.
2.3.39 Service mode
After a cluster completes service prep and enters service mode, it remains in this state. The cluster must be explicitly taken out of service mode by the operator or the IBM SSR.
In smaller grid configurations, put only a single cluster into service at a time to retain the redundancy of the grid. This is only a suggestion, and does not prevent the action from taking place, if necessary.
If it is necessary to put multiple clusters in service mode, it is mandatory to bring them back to normal state together. In this situation a cluster cannot come back online if another cluster is still in service mode. Using the MI, you need to select each cluster independently and select Return to normal mode. The clusters wait until all clusters in service mode are brought back to “normal mode” before they exit the service mode.
 
Tip: Ensure that you can log on to the MIs of the clusters directly. A direct logon is possible, but you cannot navigate to or from other clusters in the grid when the cluster is in service mode.
2.3.40 Control Unit Initiated Reconfiguration
CUIR is available in R4.1.2 to reduce manual intervention during microcode upgrade processes. Without CUIR, it is your obligation to ensure that all devices are varied offline in all attached z/OS LPARs. With the CUIR function, the TS7700 notifies attached IBM Z LPARs by using an unsolicited interrupt when the cluster has entered or exited the service-prep state. If enabled by the customer, the devices are varied offline automatically in the attached z/OS LPARs.
The TS7700 tracks the grouped devices for all path groups that reported CUIR support, and does not enter service until they are all varied offline.
The customer can decide whether an automatic online (AONLINE) is run when the cluster returns to an operational state from service. In that case, the IBM Z LPARs whose logical paths received the original unsolicited attention receive another unsolicited attention, requesting the LPAR to vary the devices back online.
If the customer decides not to use AONLINE, the devices need to be varied online again by using the MI. The original z/OS command (Vary Online) cannot be used to bring the devices online in the z/OS LPAR. The manual function (through the MI) also varies the drives online according to the list that was produced by CUIR. There is currently no option to vary drives online only partially (for example, to test a system) for quality-assurance tests before the cluster is varied online to all systems again.
 
Note: If a device is varied offline for CUIR reasons, and is unintentionally left in this state, the existing MVS VARY XXXX,RESET command can be used to reset the device for CUIR reasons. This command should only be used if there are devices that are left in this state, and should no longer be in this state.
CUIR has the following limitations:
Can be enabled only when all clusters in the grid are at R4.1.2 or later.
Only native LPARs with z/OS 2.2 or later and JES2 can exploit CUIR.
Only Service preparation/Service mode is currently supported. The CUIR function might be extended in the future.
The following APARs need to be installed: OA52398, OA52390, OA52376, OA52379, and OA52381.
New LI REQ commands are provided to enable or disable CUIR and AONLINE, and to get an overview of the current logical drive and path group information. The default is disabled.
For more information, see the IBM TS7700 Series Control Unit Initiated Reconfiguration (CUIR) User’s Guide at:
2.4 Grid configuration examples
Several grid configuration examples are provided. These examples describe the requirements for high availability (HA) and DR planning.
2.4.1 Homogeneous versus hybrid grid configuration
Homogeneous configurations contain only TS7700Ds, only TS7700Ts, or only TS7740s. If you have an intermix of disk-only and tape-attached models, it is a hybrid configuration. Consider the following information when you choose whether a TS7720, a TS7740, or a mixture of the two types is appropriate.
Requirement: Fast read response times and many reads
When your environment needs to process many reads in a certain amount of time, or needs fast response times, the TS7700D or the TS7700T CP0 is the best choice. The TS7740 is susceptible to disk cache misses that result in recalls, which makes the TS7740 not optimal for workloads that need the highest cache-hit read percentages.
Although TS7760 disk-only configurations can store over 1.3 PB of post-compressed content in disk cache, your capacity needs might be far too large, especially when a large portion of your workload does not demand the highest read hit percentage. This is when the introduction of a TS7700T makes sense.
Requirement: No physical tape or dark site
Some clients are looking to completely eliminate physical tape from one or more data center locations. The TS7700Ds or a hybrid configuration supports these requirements. The complete elimination of physical tape might not be the ideal configuration, because the benefits of both physical tape and deep disk cache can be achieved with hybrid configurations.
Requirement: Big data
The TS7740/TS7700T is attached to an IBM TS3500 or IBM TS4500 tape library, and can store multiple PB of data while still supporting writes at disk speeds and read hit ratios up to 90% for many workloads. Depending on the size of your tape library (the number of library frames and the capacity of the tape cartridges that are being used), you can store more than 175 PB of data without compression.
Requirement: Offsite vault of data for DR purposes with Copy Export
Some clients require an extra copy on physical tape, require a physical tape to be stored in a vault, or depend on the export of physical tape for their DR needs. For these accounts, the TS7740/TS7700T is ideal.
Requirement: Workload movement with Copy Export
In specific use cases, the data that is associated with one or more workloads must be moved from one grid configuration to another without the use of TCP/IP. Physical tape and TS7740/TS7700T Copy Export with merge (available as a service offering) provide this capability.
2.4.2 Planning for high availability or disaster recovery in limited distances
In many HA configurations, two TS7700 clusters are located within metro distance of each other. They are in one of the following situations:
The same data center within the same room
The same data center, in different rooms
Separated data centers, on a campus
Separated data centers, at a distance in the same metropolitan area
These clusters are connected through a local area network (LAN). If one of them becomes unavailable because it failed, is being serviced, or is being updated, data can be accessed through the other TS7700 Cluster until the unavailable cluster is available. The assumption is that continued access to data is critical, and no single point of failure, repair, or upgrade can affect the availability of data.
For these configurations, the multi-cluster grid can act as both an HA and DR configuration that assumes that all host and disk operations can recover at the metro distant location. However, metro distances might not be ideal for DR, because some disasters can affect an entire metro region. In this situation, a third location is ideal.
Configuring for high availability or metro distance
As part of planning a TS7700 Grid configuration to implement this solution, consider the following information:
Plan for the virtual device addresses in both clusters to be configured to the local hosts.
Plan redundant FICON attachment of both clusters to the hosts at both sites (for FICON connections longer than 10 kilometers (km), equivalent to 6.2 miles, channel extension or DWDM equipment is suggested).
Determine the appropriate Copy Consistency Points. For the workloads that require the most stringent recovery point objective (RPO), use Sync or RUN. For less critical workloads, use Deferred replication.
Design and code the DFSMS ACS routines that point to a TS7700 MC with the appropriate Copy Consistency Point definitions (a sketch follows this list).
Ensure that the AOTM is configured for an automated logical volume ownership takeover in case a cluster becomes unexpectedly unavailable within the grid configuration. Alternatively, prepare written instructions for the operators that describe how to perform the ownership takeover manually, if needed. See 2.3.34, “Autonomic Ownership Takeover Manager” on page 96.
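To illustrate the ACS routine consideration, the following is a minimal sketch of a management class ACS routine fragment. The Management Class names MCSYNC and MCDEF and the PAYROLL high-level qualifier are hypothetical, and the sketch assumes that MCSYNC is defined on the TS7700 with a Sync or RUN Copy Consistency Point and MCDEF with Deferred copies:
   PROC MGMTCLAS
     FILTLIST CRITICAL INCLUDE(PAYROLL.**)      /* Hypothetical critical workload       */
     SELECT
       WHEN (&DSN = &CRITICAL)                  /* Most stringent RPO: Sync or RUN copy */
         SET &MGMTCLAS = 'MCSYNC'
       OTHERWISE                                /* Remaining workloads: Deferred copy   */
         SET &MGMTCLAS = 'MCDEF'
     END
   END
In this sketch, only the data sets that match the PAYROLL filter receive the synchronous copy behavior, which limits the more bandwidth-intensive consistency points to the critical workload.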
2.4.3 Disaster recovery capabilities in a remote data center
A mechanical problem or human error event can make the local site’s TS7700 Cluster unavailable. Therefore, one or more grid members can be introduced, separated by larger distances, to provide business continuance or DR functions.
Depending on the distance to your DR data center, consider connecting your grid members in the DR location to the host in the local site.
No FICON attachment of the remote grid members
In this case, the only connection between the local site and the DR site is the grid network. There is no host connectivity between the local hosts and the DR site’s TS7700.
FICON attachment of the remote grid members
For distances longer than 10 km (6.2 miles), you need to introduce dense wavelength division multiplexing (DWDM) or channel extension equipment. Depending on the distance (latency), there might be a difference in read or write performance compared to the virtual devices on the local TS7700 Cluster:
The distance separating the sites can affect performance.
If the local TS7700 Cluster becomes unavailable, use this remote access to continue your operations by using a remote TS7700 Cluster.
If performance differences are a concern, consider using the virtual device addresses in the remote TS7700 Cluster only when the local TS7700 is unavailable. If that approach is an important consideration, provide operator procedures to take over logical volume ownership and to vary the virtual devices of the remote TS7700 online when they are needed.
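As an illustration of such a procedure, assume that the virtual devices of the remote cluster are defined to the local host at addresses 9000-90FF (a hypothetical range). The devices can be kept offline during normal operation and varied online only when the local cluster is unavailable, for example:
   V (9000-90FF),OFFLINE     (normal operation: remote virtual devices offline)
   V (9000-90FF),ONLINE      (local TS7700 unavailable: bring remote virtual devices online)
Ownership takeover is then enabled automatically by AOTM or manually by the operators, as described in the written procedures.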
As part of planning a TS7700 grid configuration to implement this solution, consider the following information:
Plan for the necessary WAN infrastructure and bandwidth to meet your copy requirements. You generally need more bandwidth if you are primarily using a Copy Consistency Point of SYNC or RUN, because any delays in copy time that are caused by bandwidth limitations can result in elongated job run times (a worked example follows this list).
If you have limited bandwidth available between sites, copy critical data with a consistency point of SYNC or RUN, with the rest of the data using the Deferred Copy Consistency Point. Consider introducing cluster families only for three or more cluster grids.
Depending on the distance, the latency might not support the use of RUN or SYNC at all.
Under certain circumstances, you might consider the implementation of an IBM SAN42B-R SAN Extension Switch to gain higher throughput over large distances.
Plan for host connectivity at your DR site with sufficient resources to run your critical workloads.
Design and code the DFSMS ACS routines that point to the appropriate TS7700 MC constructs to control the data that gets copied, and by which Copy Consistency Point.
Prepare procedures that your operators run when the local site becomes unusable. The procedures include several tasks, such as bringing up the DR host, varying the virtual drives online, and placing the DR TS7700 Cluster in one of the ownership takeover modes. Even if you have AOTM configured, prepare the procedure for a manual takeover.
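Returning to the bandwidth consideration earlier in this list, a simple worked example with assumed figures can help size the grid links. If the peak host write rate is 400 MBps of post-compression data and all of it is replicated with Sync or RUN to one remote cluster, the grid links must sustain roughly 400 MBps, or about 3.2 Gbps, of replication traffic during that peak, plus TCP/IP overhead. If the same data is instead copied as Deferred, it can be drained over a longer window at a lower sustained rate, at the cost of a larger RPO until the copies complete.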
2.4.4 Configuration examples
The configurations in the examples in this section are installed in the field, depending on the requirements of the clients. In all of these examples, you can also replace the TS7740 with a TS7720T, depending on the customer requirements.
Example 1: Two-cluster grid
With a two-cluster grid, you can configure the grid for DR, HA, or both.
This example is a two-site scenario where the sites are separated by 10 km (6.2 miles). Because the customer needs big data and read activity is limited, two TS7760Ts were installed, one at each site. Because of the limited distance, both clusters are FICON-attached to each host.
The client chooses to use Copy Export to store a third copy of the data in an offsite vault (Figure 2-21).
Figure 2-21 Two-cluster grid
Example 2: Three-cluster grid in two locations
In this example (Figure 2-22), one of the data center locations has several departments. The grid and the hosts are spread across the different departments. For DR purposes, the client introduced a remote site, where the third TS7740/TS7720T is installed.
The client runs many OAM and HSM workloads, so the large cache of the TS7760 provides the necessary bandwidth and response times. Also, the client wanted a third copy on physical tape, which is provided by the TS7760T in the remote location.
Figure 2-22 Three-cluster grid in two locations
Example 3: Three-cluster grid in three locations
This example is the same as configuration example 2. However, in this case, the two TS7760s and the attached hosts are spread across two data centers that are more than 10 km (6.2 miles) apart. Again, the third location is used only as a remote data store, where an existing TS7740 was used. See Figure 2-23.
Figure 2-23 Three-cluster grid in three locations
Example 4: Four-cluster grid in three locations
The setup in Figure 2-24 shows the configuration after a merge of existing grids. Before the merge, each grid was spread across only 10 km (6.2 miles). The client’s requirements changed, and a third copy in a data center at a longer distance was needed.
By merging the environments, the client can address the DR requirements and still use the existing environment.
Figure 2-24 Four-cluster grid in three locations