GDPS Global - GM
In this chapter, we discuss the capabilities and prerequisites of the GDPS Global - GM (GM) offering.
The GDPS GM offering provides a disaster recovery capability for businesses that have an RTO of as little as one hour and an RPO as low as five seconds. It is often deployed by businesses whose application and recovery sites are more than 200 km (124 miles) apart and that want integrated remote copy processing for mainframe and non-mainframe data.
The functions that are provided by GDPS GM fall into the following categories:
Protecting your data:
 – Protecting the integrity of the data on the secondary devices in the event of a disaster or suspected disaster.
 – Managing the remote copy environment through GDPS scripts and NetView panels or the web interface.
 – Optionally supporting remote copy management and consistency of the secondary volumes for Fixed Block (FB) data. Depending on your application requirements, the consistency of the FB data can be coordinated with the CKD data.
Controlling the disk resources that are managed by GDPS during normal operations, planned changes, and following a disaster:
 – Support for recovering the production environment following a disaster.
 – Support for switching your data and systems to the recovery site.
 – Support for testing recovery and restart by using a practice FlashCopy point-in-time copy of the secondary data while live production continues to run in the application site and continues to be protected with the secondary copy.
This chapter includes the following topics:
6.1, “Introduction to GDPS Global - GM”
6.2, “GDPS Global - GM configuration”
6.3, “Managing the GDPS environment”
6.4, “GDPS GM monitoring and alerting”
6.5, “Other facilities related to GDPS”
6.6, “Flexible testing and Logical Corruption Protection”
6.7, “GDPS tools for GDPS GM”
6.8, “Services component”
6.9, “GDPS GM prerequisites”
6.10, “Comparison of GDPS GM versus other GDPS offerings”
6.11, “Summary”
6.1 Introduction to GDPS Global - GM
GDPS GM is a disaster recovery solution. It is similar in various respects to GDPS XRC in that it supports virtually unlimited distances. However, the underlying IBM Global Mirror (GM), remote copy technology also supports both IBM Z CKD data and distributed data, and GDPS GM also includes support for both.
GDPS GM can be viewed as a mixture of GDPS Metro and GDPS XRC. Just as PPRC (IBM Metro Mirror) is a disk subsystem-based remote copy technology, GM is also disk-based, which means there is no requirement for a System Data Mover (SDM) system to drive the remote copy process. Also, like Metro Mirror, Global Mirror requires that the primary and secondary disk subsystems are from the same vendor.
Conversely, GDPS GM resembles GDPS XRC in that it is asynchronous and supports virtually unlimited distances between the application and recovery sites. Also, similar to GDPS XRC, GDPS GM does not provide any automation or management of the production systems. Instead, its focus is on managing the Global Mirror remote copy environment and automating and managing recovery of data and systems in case of a disaster. Like GDPS XRC, GDPS GM supports the ability to remote copy data from multiple systems and sysplexes. In contrast, each GDPS Metro installation supports remote copy for only a single sysplex.
The capabilities and features of GDPS GM are described in this chapter.
6.1.1 Protecting data integrity
Because the role of GDPS GM is to provide disaster recovery support, its highest priority is protecting the integrity of the data, CKD and FB, in the recovery site. This section discusses the support that is provided by GDPS for these various data types.
Traditional IBM Z (CKD) data
As described in 2.4.3, “Global Mirror” on page 32, Global Mirror protects the integrity of the remote-copied data by creating consistency groups, continuously or at intervals that are specified by the installation. The process is managed by the Master disk subsystem, which is determined by the GDPS GM configuration.
There are no restrictions relating to which operating systems’ data can be supported; any system that writes to CKD devices (z/OS, z/VM, z/VSE, and Linux for System z) is supported. Regardless of which systems are writing to the devices, all management control is from the z/OS system that is running the GDPS GM local controlling system, also known as the K-sys.
How frequently a consistency group can be created depends on the bandwidth that is provided between the application and recovery site disks. IBM can perform a bandwidth analysis for you to help you identify the required capacity.
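To illustrate the kind of arithmetic such an analysis involves, the following Python sketch compares a hypothetical peak write rate against a hypothetical replication link. All of the numbers are invented placeholders; a real sizing study is based on measured write-rate data (for example, from SMF/RMF) and is typically performed as part of the IBM bandwidth analysis.

```python
# Rough, illustrative estimate of Global Mirror bandwidth versus RPO.
# All input values are hypothetical; a real study uses measured peak
# write rates, not these sample numbers.

peak_write_mb_per_sec = 180.0      # assumed peak host write rate to mirrored volumes
compression_ratio = 0.6            # assumed fraction of data sent after compression
link_capacity_mb_per_sec = 125.0   # assumed usable replication bandwidth (~1 Gbps)

required_mb_per_sec = peak_write_mb_per_sec * compression_ratio

if required_mb_per_sec <= link_capacity_mb_per_sec:
    print("Link keeps up at peak: consistency groups can be formed "
          "frequently and the RPO stays in the range of seconds.")
else:
    # When the link cannot keep up, Global Mirror lets the consistency
    # point fall behind rather than pacing production writes, so the RPO grows.
    backlog_mb_per_min = (required_mb_per_sec - link_capacity_mb_per_sec) * 60
    print(f"Link is undersized at peak: the backlog grows by roughly "
          f"{backlog_mb_per_min:.0f} MB per minute until the peak subsides.")
```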
GDPS Global - GM uses devices in the primary and secondary disk subsystems to execute the commands to manage the environment. Some of these commands directly address a primary device, whereas others are directed to the LSS. To execute these LSS-level commands, you must designate at least one volume in each primary LSS as a GDPS utility device, which is the device that serves as the “go between” between GDPS and the LSS. These utility devices do not need to be dedicated devices; that is, they can be one of the devices that are being mirrored as part of your Global Mirror session. In fact, the utility devices also need to be mirrored.
Distributed (FB) data
GDPS GM provides the capability to manage a heterogeneous environment of IBM Z and distributed systems data. Through a function called Fixed Block Disk Management (FB Disk Management), GDPS GM can manage the Global Mirror remote copy configuration and FlashCopy for distributed systems data. The Global Mirror remote copy technology (see 2.4.3, “Global Mirror” on page 32) inherently provides data consistency for IBM Z and distributed systems data.
The FB devices can be in the same Global Mirror session as the CKD devices or in a separate session. If the FB devices and CKD devices are in the same session, they have the same consistency point and they must be recovered together. If they are in a different session, they have a different consistency point (the data for each session is consistent within itself, but the data for the two sessions is inconsistent with each other) and can be recovered separately.
FB Disk Management prerequisites
GDPS requires the disk subsystems that contain the FB devices to support specific architectural features. The following architectural features are supported by all IBM disk subsystems:
The ability to manage FB devices through a CKD utility device
GDPS runs on z/OS and can communicate Global Mirror commands to the disk subsystem directly over a channel connection to CKD devices only. To communicate commands to the FB LSS and devices, the architecture allows the use of a CKD utility device in the same disk subsystem as a go-between to send commands and to monitor and control the mirroring of the FB devices. GDPS needs at least one CKD utility device in each hardware cluster of the storage subsystem where FB devices are located.
The ability to send SNMP traps to report certain errors
The FB LSS and devices must communicate certain error conditions back to GDPS (for example, an abnormal state of a Global Mirror session in GDPS GM). This status is communicated to the z/OS host that is running the GDPS controlling system through an IP connection by using SNMP traps. GDPS captures these traps and drives autonomic action, such as performing a freeze for a mirroring failure.
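The following Python sketch illustrates only the general flow that is described above: a listener on the standard SNMP trap port that passes incoming notifications to a handler. It is not GDPS code; GDPS itself performs the trap capture and autonomic actions, and a real implementation would decode the ASN.1-encoded SNMP PDU with an SNMP library.

```python
import socket

# Conceptual sketch of the SNMP trap flow: the disk subsystem sends a trap
# over IP to the host that runs the GDPS controlling system, which reacts
# autonomically. This is NOT GDPS code; real traps are BER-encoded SNMP
# PDUs and need an SNMP library to decode.

TRAP_PORT = 162  # standard SNMP trap port (binding to it requires privileges)

def handle_trap(raw_pdu: bytes, source: tuple) -> None:
    # A real implementation would decode the PDU, map it to a condition
    # (for example, "Global Mirror session abnormal"), and raise an alert
    # or drive a corrective action.
    print(f"Received {len(raw_pdu)} bytes of SNMP trap data from {source[0]}")

def listen_for_traps() -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", TRAP_PORT))
        while True:
            data, addr = sock.recvfrom(65535)
            handle_trap(data, addr)

if __name__ == "__main__":
    listen_for_traps()
```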
A sample FB device GDPS GM configuration is shown in Figure 6-1. Not shown are the IP connections from the attached disks to the z/OS host where GDPS is running.
Figure 6-1 Fixed Block disk support
 
6.2 GDPS Global - GM configuration
At its most basic, a GDPS GM configuration consists of one or more production systems, an application site controlling system (K-sys), a recovery site controlling system (R-sys), primary disks, and two sets of disks in the recovery site.
The GM copy technology uses three sets of disks. For more information about how GM works and how the disks are used to provide data integrity, see 2.4.3, “Global Mirror” on page 32.
The K-sys is responsible for controlling all remote copy operations and for sending configuration information to the R-sys. In normal operations, most operator and system programmer interaction with GDPS GM is through the K-sys.
The K-sys role is related to remote copy; it does not provide any monitoring, automation, or management of systems in the application site, nor any FlashCopy support for application site disks. There is no requirement for the K-sys to be in the same sysplex as the system or systems for which it is managing data. In fact, the K-sys is placed in a monoplex on its own.
You can also include the K-sys disks in the GDPS managed GM configuration and replicate them. The K-sys does not have the isolation requirements of the controlling system in a GDPS Metro configuration.
The R-sys is primarily responsible for validating the configuration, monitoring the GDPS managed resources (such as the disks in the recovery site), and carrying out all recovery actions for test purposes or if a real disaster occurs. For more information about testing by using FlashCopy, see 6.6, “Flexible testing and Logical Corruption Protection” on page 197.
The K-sys and R-sys communicate information to each other by using a NetView-to-NetView network communication mechanism over the wide area network (WAN). K-sys and R-sys are dedicated to their roles as GDPS controlling systems.
GDPS GM can control multiple Global Mirror sessions. Each session can consist of a maximum of 17 disk subsystems (combination of primary and secondary). All the members of the same session have the same consistency point.
Typically, the data for all systems that must be recovered together is managed through one session. For example, a z/OS sysplex is an entity where the data for all systems in the sysplex must be in the same consistency group.
If you have two production sysplexes under GDPS GM control, the data for each can be managed through a separate GM session, in which case they can be recovered individually. You can also manage the entire data for both sysplexes in a single GM session, in which case if one sysplex fails and you must start recovery, you also must recover the other sysplex.
Information about which disks are to be mirrored as part of each session and the intervals at which a consistency point is to be created for each session is defined in the GDPS remote copy configuration definition file (GEOMPARM). GDPS GM uses this information to control the remote copy configuration. As with the other GDPS offerings, the NetView panel interface (or the web interface) is used as the operator interface to GDPS.
Although the panel interface and web interface support management of GM, they are primarily intended for viewing the configuration and performing some operations against single disks. GDPS scripts are intended to be used for actions against the entire configuration because this is much simpler (with multiple panel actions combined into a single script command) and less error-prone.
The actual configuration depends on your business and availability requirements, the amount of data you are remote copying, the types of data you are remote copying (only CKD or both CKD and FB), and your RPO.
Figure 6-2 shows a typical GDPS GM configuration.
Figure 6-2 GDPS GM configuration
The application site, as shown in Figure 6-2, features the following items:
z/OS systems spread across several sysplexes
A non-sysplexed z/OS system
Two distributed systems
The K-sys
The primary disks (identified by A)
The K-sys’ own disks (marked by L)
The recovery site includes the following items:
The R-sys
A CPC with the CBU feature that also contains expendable workloads that can be displaced
Two backup distributed servers
The Global Mirror secondary disks (marked by B)
The Global Mirror FlashCopy targets (marked by C)
The R-sys disks (marked by L)
Although there is great flexibility in terms of the number and types of systems in the application site, several items are fixed. Consider the following points:
All the GM primary disks and the K-sys must be in the application site1.
All the GM secondary disks, the FlashCopy targets used by GM, and the GDPS R-sys must be in the recovery site2.
The following aspects in GDPS GM differ from the other GDPS offerings:
Although the K-sys should be dedicated to its role as a controlling system, it is not necessary to provide the same level of isolation for the K-sys as that required in a GDPS Metro or GDPS HM configuration.
Because of XRC time stamping, GDPS XRC requires that all of the systems that are writing to the primary disks have a common time source (Sysplex Timer or STP). GDPS GM does not have this requirement.
If insufficient bandwidth exists for XRC operations, GDPS XRC writes to the primary disk subsystem are paced. This means that the RPO is maintained, but at the potential expense of performance of the primary devices.
With GDPS GM, if there is insufficient bandwidth, the consistency points fall behind. This means that the RPO might not be achieved, but performance of the primary devices is protected.
In both cases, if you want to protect both response times and RPO, you must provide sufficient bandwidth to handle the peak write load.
The GDPS GM code runs under NetView and System Automation, and is run in the K-sys and R-sys only.
GDPS GM multiple R-sys collocation
GDPS GM can support multiple sessions. Therefore, the same instance of GDPS GM can be used to manage GM replication and recovery for several diverse sysplexes and systems. However, there are certain cases where different instances of GDPS GM are required to manage different sessions. One example is the GDPS GM leg of a GDPS MGM configuration: in such a configuration, GDPS GM is restricted to managing only one session. Clients might have other requirements that are based on workloads or organizational structure for isolating sessions to be managed by different instances of GDPS GM.
When you have multiple instances of GDPS GM, each instance needs its own K-sys. However, the R-sys “functions” of each instance can be combined to run in the same z/OS image. Each R-sys function runs in a dedicated NetView address space in the same z/OS. Actions, such as running scripts, can be done simultaneously in these NetView instances. This ability reduces the overall cost of managing the remote recovery operations for customers that require multiple GDPS GM instances.
6.2.1 GDPS GM in a 3-site or 4-site configuration
GDPS GM can be combined with GDPS Metro (or GDPS HM) in a 3-site or 4-site configuration, where GDPS Metro (or GDPS HM) is used across two sites within metropolitan distances (or even within a single site) to provide continuous availability through Parallel Sysplex use and GDPS HyperSwap. GDPS GM provides disaster recovery in a remote region.
We call this combination the GDPS Metro Global - GM (GDPS MGM) configuration. In such a configuration, GDPS Metro and GDPS GM provide more automation capabilities.
6.2.2 Other considerations
The availability of the GDPS K-sys in all scenarios is a fundamental requirement in GDPS. The K-sys monitors the remote copy process, implements changes to the remote copy configuration, and sends GDPS configuration changes to the R-sys.
Although the main role of the R-sys is to manage recovery following a disaster or to enable DR testing, it is important that the R-sys also be available at all times. This is because the K-sys sends changes to GDPS scripts and changes to the remote copy or remote site configuration to the R-sys at the time the change is introduced on the K-sys. If the R-sys is not available when such configuration changes are made, it is possible that it might not have the latest configuration information in the event of a subsequent disaster, resulting in an impact to the recovery operation.
Also, the R-sys plays a role in validating configuration changes. If the R-sys is not available when a change is made, a change that contains errors that would have been rejected by the R-sys (had it been running) is not caught, which can affect the remote copy or recovery operation.
Because GDPS GM is in essence a disaster recovery offering rather than a continuous availability offering, it does not support the concept of site switches that GDPS Metro provides3. It is expected that a switch to the recovery site is performed in the case of a real disaster only.
If you want to move operations back to the application site, you must set up GDPS GM in the opposite direction (which means that you also need two sets of disks in the application site), or use an alternative mechanism, such as Global Copy, that is outside the control of GDPS. If you intend to switch to run production in the recovery site for an extended period, providing two sets of disks and running GDPS GM in the reverse direction is the preferable option to provide disaster recovery capability.
6.3 Managing the GDPS environment
GDPS Global - GM automation code runs in one system in the application site only (the K-sys) and it does not provide for any monitoring or management of the production systems in this site. The K-sys has the following responsibilities:
It is the primary point of GDPS GM control for operators and system programmers in normal operations.
It manages the remote copy environment. Changes to the remote copy configuration (adding new devices into a running GM session or removing devices from a running session) are driven from the K-sys.
Changes to the configuration definitions or scripts (including configuration definitions for recovery site resources and scripts that are destined to be run on the R-sys) are defined in the K-sys and automatically propagated to the R-sys.
In the recovery site, GDPS GM runs only in one system: the R-sys. However, the role and capabilities of the R-sys are different from those of the K-sys. Although both are GDPS controlling systems, there are fundamental differences between them. The R-sys has the following responsibilities:
Validate the remote copy configuration in the remote site. This is a key role. GM is a hardware replication technology. Just because the GM primary disks can communicate with the GM secondary disks over remote copy links does not mean that, in a recovery situation, systems can use these disks. The disks must be defined in that site’s I/O configuration. If some disks are missing from that configuration, recovery can fail because the systems that need those disks cannot be restarted properly.
Monitor the GDPS-managed resources in the recovery site and raise alerts for not-normal conditions. For example, GDPS uses the BCP Internal Interface (BCPii) to perform hardware actions, such as adding temporary CBU capacity to CPCs, deactivating LPARs for discretionary workloads, and activating LPARs for recovery systems. The R-sys monitors that it has BCPii connectivity to all CPCs that it must perform actions against.
Communicate status and alerts to the K-sys that is the focal management point during normal operations.
Automate reconfiguration of the recovery site (recovering the Global Mirror, taking a FlashCopy, activating CBU, activating backup partitions, and so on) for recovery testing or in the event of a true disaster.
The R-sys has no relation to any application site resources. The only connection it has to the application site is the network connection to the K-sys for exchanging configuration and status information.
6.3.1 User interfaces
The operator interface for GDPS GM is provided through NetView 3270 panels or a browser-based graphical user interface, which is also referred to as the GDPS GUI (see “GDPS graphical user interface” on page 184). In normal operations, the operators interact mainly with the K-sys, but there is also a similar set of interfaces for the R-sys.
 
Note: The GDPS GUI that is described in this chapter is new and replaces the former GDPS Web GUI that was described in previous editions of this book. The GDPS Web GUI (which was based on the NetView Web Application) was removed from the GDPS solution in V4R1.
The NetView interface for GDPS consists of two parts. The first and potentially the most important part is the Status Display Facility (SDF). GDPS sends an alert to SDF whenever there is a change to a status that GDPS does not consider “normal” and that can affect the ability to recover, and that therefore requires investigation and manual intervention.
SDF provides a dynamically updated color-coded panel that provides the status of the systems and highlights any problems in the remote copy configuration. If something changes in the environment that requires attention, the color of the associated field on the panel changes. K-sys sends alerts to the R-sys and R-sys sends alerts to K-sys so that both controlling systems are aware of any problems at all times.
During normal operations, the operators should always have a K-sys SDF panel within view so that they are immediately aware of anything that requires intervention or action. When R-sys is being used for managing testing or recovery operations, operators also should have access to the R-sys SDF panel.
The other part of the NetView interface consists of the panels that are provided by GDPS to help you manage and inspect the environment. The main GDPS panel is shown in Figure 6-3.
Figure 6-3 GDPS Main panel (K-sys)
From this panel, you can perform the following actions:
Query and control the disk remote copy configuration.
Start GDPS standard actions (the ability to control and initiate actions against LPARs).
On the K-sys, the only standard action supported is the ability to update IPL information for the recovery site LPARs.
On the R-sys, all standard actions are available.
Start GDPS scripts (Planned Actions).
Manage GDPS Health Checks.
View and refresh the definitions of the remote copy configuration.
Run GDPS monitors.
GDPS graphical user interface
The GDPS GUI is a browser-based interface that improves operator productivity. The GDPS GUI provides the same functional capability as the 3270-based panel, such as providing management capabilities for Remote Copy Management, Standard Actions, Sysplex Resource Management, SDF Monitoring, and browsing the CANZLOG, by using simple point-and-click procedures.
Advanced sorting and filtering is available in most of the views that are provided by the GDPS GUI. In addition, users can open multiple windows or tabs to allow for continuous status monitoring while performing other GDPS GM management functions.
The GDPS GUI display features the following main sections, as shown in Figure 6-4 on page 184:
1. The application header at the top of the page includes an Actions button with which various GDPS tasks can be performed, along with the Help function and the ability to log off or switch between target systems.
2. The application menu is on the left side of the window. This menu gives access to various features and functions that are available through the GDPS GUI.
3. The active window shows context-based content, depending on the selected function. This tabbed area is where you can switch context by clicking a different tab.
4. A status summary area is shown at the bottom of the display.
Note: For the remainder of this section, only the GDPS GUI is shown to highlight the various GDPS management functions. The equivalent traditional 3270 panels are not shown here.
The initial status window (known as the dashboard) of the GDPS Global - GM GUI is shown in Figure 6-4. This window provides an instant view of the status and direction of replication, and disks and systems availability. Hovering over the various icons provides more information through pop-up windows.
Figure 6-4 GDPS GUI Dashboard (initial window)
Monitoring function: Status Display Facility
GDPS also provides many monitors to check the status of disks, sysplex resources, and so on. GDPS raises an alert whenever a configuration change occurs or when something in the GDPS environment requires manual intervention. GDPS uses the Status Display Facility (SDF) that is provided by System Automation as the primary status feedback mechanism for GDPS.
GDPS provides a dynamically updated window, as shown in Figure 6-5 on page 185. A summary of all current alerts is shown at the bottom of each window.
The initial view that is presented is for the SDF trace entries, meaning that you can follow, for example, script execution. Click one of the icons that represents the other alert categories to view the different alerts that are associated with automation or remote copy in either site, or click All to see all alerts. You can sort and filter the alerts based on several fields presented, such as severity.
 
Figure 6-5 GDPS GUI SDF window (Trace entries)
Remote copy operations
Although Global Mirror is a powerful copy technology, the z/OS operator interface to it is not particularly intuitive. Use the Disk Remote Copy panels that are provided by GDPS to make it easier for operators to check and manage the remote copy environment.
For GDPS to manage the remote copy environment, you first define the configuration to GDPS in the GEOMPARM file on the K-sys. The R-sys always receives the configuration information from the K-sys and validates the remote site disk configuration.
After the configuration is known to GDPS, you can use the GUI to check that the current configuration matches the one you want. You can start, stop, pause, and resynch mirroring. These actions can be done at the device, LSS, or session level. However, we suggest that GDPS control scripts are used for actions at the session level.
Figure 6-6 shows the GM sessions status panel for GDPS GM as viewed on the K-sys. It allows you to see the status of the GM sessions and obtain information about individual LSS or device pairs if required. The panel for the R-sys is similar, except that the R-sys can perform only a limited number of actions (typically only those necessary to take corrective action) against the devices in the recovery site. Control of the GM session can be done from the K-sys only; the R-sys can control only the devices in the recovery site.
Figure 6-6 GM sessions status panel as viewed on the K-sys
Figure 6-7 shows an example of our panel that displays the LSS pairs.
Figure 6-7 Sample panel for the LSS Pairs
Figure 6-8 shows the device pairs.
Figure 6-8 Sample panel for the Device Pairs
The GUI that is provided by GDPS is not intended to be a remote copy monitoring tool. Because of the overhead that is involved in gathering information about every device in the configuration to populate the windows, GDPS gathers this data on a timed basis only, or on demand following an operator instruction.
The normal interface for finding out about remote copy problems is the Status Display Facility, which is dynamically updated if or when a problem is detected.
Standard Actions
The K-sys does not provide any management functions for any systems in the application site or in the recovery site. The R-sys manages recovery in the recovery site. As a result, the available Standard Actions vary, depending on which type of controlling system you use.
On the K-sys, the only Standard Action that is available is to define the possible IPL address and Loadparms that can be used for recovery systems (production systems when they are recovered in the recovery site) and to select the one to use in the event of a recovery action. Changes that are made on this panel are automatically propagated to the R-sys. The K-sys Standard Actions panel is shown in Figure 6-9.
Figure 6-9 GDPS GM K-sys Standard Actions panel
Because the R-sys manages the recovery if a disaster occurs (or IPL for testing purposes) of the production systems in the recovery site, it has a wider range of functions available (see Figure 6-10 on page 189). Functions are provided to activate and deactivate LPARs, IPL and reset systems, and update the IPL information for each system.
Figure 6-10 Example GDPS GM R-sys Standard Actions panel for a selected system
Standard Actions are single-step actions and are intended to affect only one resource. For example, if you want to reset an expendable test system that is running in the recovery site, deactivate the LPAR of the expendable system, activate the recovery LPAR for a production system, and then IPL the recovery system into the LPAR you just activated, you start four separate Standard Actions, one after the other. GDPS scripting, as described next, is a facility that is suited to multi-step, multi-system actions.
6.3.2 GDPS scripts
Nearly all the functions that can be started through the panels (and more) are also available from GDPS scripts. A script is a program that consists of one or more GDPS functions to provide a workflow.
In addition to the low-level functions that are available through the panels, scripts can start functions with a single command that might require multiple separate steps if performed through the panels. For example, if you have a new disk subsystem and are adding several LSSs that are populated with many devices to your Global Mirror configuration, this process can require a significant number of panel actions.
In comparison, this process can be accomplished by using a single script command. It is faster and more efficient to perform compound or complex operations by using scripts.
Scripts can be started manually through the GDPS panels or through a batch job. In GDPS GM, the only way to start the recovery of the secondary disks is through a GDPS script on the R-sys; starting a recovery directly from the mirroring panels is not supported.
Scripts are written by you to automate the handling of certain situations, both planned changes and error situations. This function is an extremely important aspect of GDPS.
Scripts are powerful because they can access the full capability of GDPS. The ability to invoke all the GDPS functions through a script provides the following benefits:
Speed
The script runs the requested actions as quickly as possible. Unlike a human, it does not need to search for the latest procedures or the commands manual.
Consistency
If you were to look into most computer rooms immediately following a system outage, what would you see? Likely, mayhem. Operators are frantically scrambling for the latest system programmer instructions. All the phones are ringing. Every manager within reach is asking when the service will be restored. And every system programmer with access is vying for control of the keyboards.
All of this chaos results in errors because humans often make mistakes when under pressure. But with automation, your well-tested procedures run in the same way, time after time, regardless of how much you shout at them.
Automatic checking of results from commands
Because the results of many GDPS commands can be complex, manual checking of results can be time-consuming and presents the risk of missing something. In contrast, scripts automatically check that the preceding command (remember that one command might itself consist of several GM commands, each run against thousands of devices) completed successfully before proceeding with the next command in the script. A generic sketch of this checking pattern follows this list.
Thoroughly tested procedures
Because scripts behave in a consistent manner, you can test your procedures over and over until you are sure they do everything that you want, in exactly the manner that you want. Also, because you must code everything and cannot assume a level of knowledge (as you might with instructions intended for a human), you are forced to thoroughly think out every aspect of the action the script is intended to undertake. Finally, because of the repeatability and ease of use of the scripts, they lend themselves more easily to frequent testing than manual procedures.
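The following Python sketch is a generic illustration of the result-checking pattern referred to under “Automatic checking of results from commands”. The step names are hypothetical placeholders and this is not GDPS script syntax, which is described in the GDPS product documentation; the point is only that each step must report success before the next step is attempted.

```python
from typing import Callable, List, Tuple

# Generic illustration of scripted, result-checked execution. The step
# names are hypothetical; each step returns True only if its action
# completed successfully.

def recover_secondary_disks() -> bool:
    # Placeholder for "recover the secondary disks" (hypothetical step).
    return True

def activate_cbu_capacity() -> bool:
    # Placeholder for "activate temporary CBU capacity" (hypothetical step).
    return True

def load_recovery_systems() -> bool:
    # Placeholder for "load the recovery systems" (hypothetical step).
    return True

STEPS: List[Tuple[str, Callable[[], bool]]] = [
    ("Recover secondary disks", recover_secondary_disks),
    ("Activate CBU capacity", activate_cbu_capacity),
    ("Load recovery systems", load_recovery_systems),
]

def run_workflow() -> None:
    for name, step in STEPS:
        print(f"Running step: {name}")
        if not step():
            # Stop at the first failure so later steps are not run against an
            # inconsistent environment -- the same principle GDPS scripts apply
            # by verifying each command before moving on.
            raise RuntimeError(f"Step failed: {name}; workflow halted")
    print("All steps completed successfully")

if __name__ == "__main__":
    run_workflow()
```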
Planned Actions
In a GDPS GM environment, all actions affecting the recovery site are considered planned actions. You can think of this as pre-planned unplanned actions. GDPS scripts can be started from the Panels (option 6 on the main GDPS panel, as shown in Figure 6-3 on page 183) and from the GUI.
A control script that is running can be stopped if necessary. Control scripts that were stopped or failed can be restarted at any step of the script. These capabilities provide a powerful and flexible workflow management framework.
An example of a planned action in GDPS GM is a script that prepares the secondary disks and LPARs for a disaster recovery test.
Such a script performs the following actions:
Recovers the disks in the disaster site, which makes the B disks consistent with the C disks. The B disks are used for the test and the C disks contain a consistent copy that ages during the test.
Activates CBU capacity in the recovery site CPCs.
Activates backup partitions that are predefined for the recovery systems (that is, the production systems that are running in the recovery site).
Activates any backup coupling facility partitions in the recovery site.
Loads the systems into the partitions in the recovery site by using the B disks.
 
When the test is complete, you run another script in the R-sys to perform the following tasks:
Reset the recovery systems that were used for the test
Deactivate the LPARs that were activated for the test.
Undo CBU on the recovery site CPCs.
Issue a message to the operators to manually shut down any open systems servers in the recovery site that were used for the test.
Bring the B disks back into sync with the C disks (which are consistent with the primary disks at the time of the start of the test).
Finally, you run a script on the K-sys to resynchronize the recovery site disks with the production disks.
Batch scripts
In addition to the ability to start GDPS scripts from the GDPS panel interfaces, a script can be started from outside of GDPS by using a batch interface. These scripts are known as batch scripts and they cannot be started from the GDPS panels or GUI. This ability is especially suited to processes that are run regularly and feature some interaction with the GDPS environment.
6.3.3 Application programming interfaces
GDPS provides two primary programming interfaces to allow other programs that are written by clients, Independent Software Vendors (ISVs), and other IBM product areas to communicate with GDPS. These APIs allow clients, ISVs, and other IBM product areas to complement GDPS automation with their own automation code. The following sections describe the APIs that are provided by GDPS.
Query Services
GDPS maintains configuration information and status information in NetView variables for the various elements of the configuration that it manages. GDPS Query Services is a facility that allows user-written REXX programs that are running under NetView to query and obtain the value of various GDPS variables. This configuration allows you to augment GDPS automation with your own automation REXX code for various purposes, such as monitoring or problem determination.
Query Services allows clients to complement GDPS automation with their own automation code. In addition to the Query Services function (which is part of the base GDPS product), GDPS provides several samples in the GDPS SAMPLIB library to demonstrate how Query Services can be used in client-written code.
RESTful APIs
As described in “Query Services” on page 191, GDPS maintains configuration information and status information about the various elements of the configuration that it manages. Query Services can be used by REXX programs to query this information.
The GDPS RESTful API also provides the ability for programs to query this information. Because it is a RESTful API, it can be used by programs that are written in various programming languages, including REXX, that are running on various server platforms.
In addition to querying information about the GDPS environment, the GDPS RESTful API allows programs that are written by clients, ISVs, and other IBM product areas to run actions against various elements of the GDPS environment. Examples of these actions include starting and stopping Global Mirror, updating the Global Mirror session parameters, and starting GDPS monitor processing. These capabilities enable clients, ISVs, and other IBM product areas to provide an even richer set of functions to complement the GDPS functions.
GDPS provides samples in the GDPS SAMPLIB library to demonstrate how the GDPS RESTful API can be used in programs.
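As an illustration of how an API of this kind is typically driven, the following Python sketch issues a query and an action request over HTTPS. The host name, URL paths, resource names, and credentials are hypothetical placeholders, not the documented GDPS RESTful API endpoints; the actual URL scheme and authentication method are described in the GDPS product publications and SAMPLIB samples.

```python
import requests

# Hypothetical host, paths, and credentials -- placeholders only, not the
# documented GDPS RESTful API. Consult the GDPS publications for the actual
# URL scheme, resource names, and authentication method.
BASE_URL = "https://ksys.example.com:9443/gdps/api/v1"
AUTH = ("gdpsuser", "password")          # replace with site credentials
VERIFY_TLS = "/path/to/ca-cert.pem"      # CA certificate for the GDPS host

def get_session_status(session_name: str) -> dict:
    """Query status information for a named Global Mirror session."""
    resp = requests.get(f"{BASE_URL}/sessions/{session_name}",
                        auth=AUTH, verify=VERIFY_TLS, timeout=30)
    resp.raise_for_status()
    return resp.json()

def start_monitor() -> None:
    """Ask GDPS to run its monitor processing (illustrative action)."""
    resp = requests.post(f"{BASE_URL}/monitors/run",
                         auth=AUTH, verify=VERIFY_TLS, timeout=30)
    resp.raise_for_status()

if __name__ == "__main__":
    print(get_session_status("GMSESS1"))
```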
6.3.4 Additional system management information
In a GDPS GM environment, the remote controlling system can use the hardware and system management actions to reconfigure the recovery site by adding temporary capacity, activating backup partitions, and IPLing production systems. This can be either for test purposes or for a real recovery. GDPS does not manage the systems or the hardware in the application site.
Most of the GDPS Standard Actions and several script commands require GDPS to communicate with the HMC. The interface GDPS uses to communicate with the HMC is called the BCP Internal Interface (BCPii). This interface allows GDPS to automate many of the HMC actions, such as LOAD, RESET, Activate or Deactivate an LPAR, and Activate or Undo CBU or OOCoD.
The GDPS LOAD and RESET Standard Actions (available through the Standard Actions panel or the SYSPLEX script statement) allow specification of a CLEAR or NOCLEAR operand. This provides the operational flexibility to accommodate client procedures.
Extensive facilities for adding temporary processing capacity to the CPCs in the recovery site are provided by the GDPS scripting capability.
 
6.4 GDPS GM monitoring and alerting
The GDPS SDF panel is described in 6.3.1, “User interfaces” on page 182. It is this panel on which GDPS dynamically displays alerts (which are color-coded based on severity) when a non-normal status or situation is detected.
Alerts can be posted as a result of an unsolicited error situation for which GDPS listens. For example, if a problem occurs with the GM session and the session suspends outside of GDPS control, GDPS is aware of this issue because the disk subsystem that is the Master for the GM session posts an SNMP alert. GDPS listens for these SNMP alerts and, in turn, posts an alert on the SDF panel that notifies the operator of the suspension event.
Alerts can also be posted as a result of GDPS periodically monitoring key resources and indicators that relate to the GDPS GM environment. If any of these monitoring items are found to be in a state deemed to be not normal by GDPS, an alert is posted on SDF.
Because the K-sys and R-sys have different roles and affect different resources, they each monitor a different set of indicators and resources.
For example, the K-sys has TCP/IP connectivity to the A disk through which the GM Master disk subsystem posts SNMP alerts about GM problems. For this reason, it is important that the TCP/IP connectivity between the K-sys and the production disk is functioning properly. The K-sys, among other things, monitors this connection to ensure that it is functional so that if there is a GM problem, the SNMP alert will reach the K-sys.
Likewise, it is the R-sys that uses the BCP Internal Interface to perform hardware actions to reconfigure the recovery site, for disaster testing or in the event of a real recovery scenario. One of the resources that is monitored by the R-sys is the BCP Internal Interface connection to all CPCs in the recovery site on which the R-sys can perform hardware operations, such as CBU or LPAR activation.
In addition to posting alerts on their own SDF panel, the K-sys and R-sys forward any alerts to the other system for posting. Because the operator is notified of R-sys alerts on the K-sys SDF panel, it is sufficient for the operator to monitor the K-sys SDF panel during normal operations if the K-sys is up and running.
If an alert is posted, the operator must investigate (or escalate, as appropriate) and take corrective action for the reported problem as soon as possible. After the problem is corrected, it is detected during the next monitoring cycle and the alert is cleared by GDPS automatically.
GDPS GM monitoring and alerting capability is intended to ensure that operations are notified and can take corrective action for any problems in their environment that can affect the ability of GDPS GM to do recovery operations. This maximizes the installation’s chance of achieving RPO and RTO commitments.
6.4.1 GDPS GM health checks
In addition to GDPS GM monitoring, GDPS provides health checks. These health checks are provided as a plug-in to the z/OS Health Checker infrastructure to check that certain settings related to GDPS adhere to GDPS preferred practices.
The z/OS Health Checker infrastructure is intended to check various settings to determine whether these settings adhere to z/OS preferred practices values. For settings that are found to be not in line with preferred practices, exceptions are raised in the Spool Display and Search Facility (SDSF).
Many products, including GDPS, provide health checks as a plug-in to the z/OS Health Checker. There are various parameter settings that are related to GDPS, such as z/OS PARMLIB settings or NetView settings, and the recommendations and preferred practices for these settings are documented in GDPS publications. If these settings do not adhere to recommendations, this issue can hamper the ability of GDPS to perform critical functions in a timely manner.
Although GDPS monitoring detects that GDPS cannot perform a particular task and raises an alert, the alert might come too late, at least for that particular instance of an incident. Changes in the client environment often necessitate adjustment of some parameter settings that are associated with z/OS, GDPS, and other products. If these adjustments are missed, GDPS can be affected.
The GDPS health checks are intended to detect such situations and avoid such incidents where GDPS is unable to perform its job because of a setting that is perhaps less than ideal.
For example, there are several address spaces that are associated with GDPS GM and preferred practices recommendations are documented for these. The GDPS code runs in the NetView address space and there are DFSMS address spaces that GDPS interfaces with to perform GM copy services operations.
GDPS recommends that these address spaces are assigned specific Workload Manager (WLM) service classes to ensure that they are dispatched in a timely manner and do not lock each other out. For example, one of the GDPS GM health checks determines if these address spaces are set up and running with the characteristics that are recommended by GDPS.
Similar to z/OS and other products that provide health checks, GDPS health checks are optional. Several preferred practices values that are checked and the frequency of the checks can be customized to cater to unique client environments and requirements.
GDPS also provides a useful interface for managing the health checks by using the GDPS panels. You can perform actions, such as activate or deactivate or run any selected health check, and view the customer overrides in effect for any preferred practices values.
Figure 6-11 shows a sample of the GDPS Health Check management panel. In this example you see that all the health checks are enabled. The status of the last run is also shown, which indicates whether the last run was successful or resulted in an exception. Any exceptions can also be viewed by using other options on the panel.
Figure 6-11 GDPS GM Health Check management panel
6.5 Other facilities related to GDPS
In this section, we describe other facilities that are provided by GDPS Global - GM that can assist in various ways.
6.5.1 GDPS GM Copy Once facility
GDPS provides a Copy Once facility to copy volumes that contain data sets that are required for recovery but whose content is not critical, so they do not need to be copied all the time. Page data sets and work volumes that contain only truly temporary data, such as sort work volumes, are primary examples. The Copy Once facility can be invoked whenever required to refresh the information about these volumes.
To restart your workload in the recovery site, you need to have these devices or data sets available (the content is not required to be up to date). If you do not remote copy all of your production volumes, you will need to either manually ensure that the required volumes and data sets are preallocated and kept up to date at the recovery site or use the GDPS Copy Once function to manage these devices.
For example, if you are not replicating your paging volumes, then you must create the volumes with the proper volume serial with required data sets in the recovery site. Then, each time you change your paging configuration in the application site, you must reflect the changes in your recovery site.
The GDPS Copy Once function provides a method of creating an initial copy of such volumes plus the ability to re-create the copy if the need arises as the result of any changes in the application site.
If you plan to use the Copy Once facility, you need to ensure that no data that needs to be continuously replicated is placed on the volumes you define to GDPS as Copy Once because these volumes will not be continuously replicated. The purpose of Copy Once is to ensure that you have a volume with the correct VOLSER, and with the data sets required for recovery allocated, available in the recovery site. The data in the data sets is not time-consistent with the data on the volumes that are continuously mirrored.
6.5.2 Global Mirror Monitor integration
GDPS provides a Global Mirror Monitor (also referred to as GM Monitor) that is fully integrated into GDPS. This function provides a monitoring and historical reporting capability for Global Mirror performance and behavior, and some autonomic capability based on performance. The GM Monitor provides the following capabilities:
Ability to view recent performance data for a Global Mirror session, for example to understand if an ongoing incident might be related to Global Mirror.
Generation of alerts and messages for Global Mirror behavior based on exceeding thresholds in a defined policy.
Ability to perform automatic actions such as pausing a GM session or resuming a previously paused session based on a defined policy.
Creation of SMF records with detailed historical Global Mirror performance and behavioral data for problem diagnosis, performance reporting, and capacity planning.
The GM Monitor function runs in the K-sys and supports both CKD and FB environments. An independent monitor can be started for each GM session in your GDPS configuration. GDPS stores the performance data collected by each active monitor. Recent data is viewable using the GDPS 3270 panels.
6.5.3 Easy Tier Heat Map Transfer
IBM DS8000 Easy Tier optimizes data placement (placement of logical volumes) across the various physical tiers of storage within a disk subsystem to optimize application performance. The placement decisions are based on learning the data access patterns, and can be changed dynamically and transparently using this data.
Global Mirror copies the data from the primary to the secondary disk subsystem. However, the Easy Tier learning information is not included in the Global Mirror scope. The secondary disk subsystems are optimized according to the workload on these subsystems, which is different than the activity on the primary (there is only write workload on the secondary whereas there is read/write activity on the primary).
Also, there is little activity on the tertiary disk (FlashCopy target disk, or FC1 disk), so it will be optimized differently than the primary disk or the secondary disk. As a result of these differences, during a recovery, the disks that you recover on (secondary or tertiary) are likely to display different performance characteristics compared to the former primary.
Easy Tier Heat Map Transfer is the DS8000 capability to transfer the Easy Tier learning from a Global Mirror primary disk to a target set of disks. With GDPS GM, the Easy Tier learning can be transferred to the secondary disk and the tertiary disk (FC1 disk) so that whichever disk you recover on can also be optimized based on this learning and will have similar performance characteristics as the former primary.
GDPS integrates support for Heat Map Transfer. The appropriate Heat Map Transfer actions (such as start/stop of the processing and reversing transfer direction) are incorporated into the GDPS managed processes. For example, if Global Mirror is temporarily suspended for a planned or unplanned secondary disk outage, Heat Map Transfer is also suspended.
6.6 Flexible testing and Logical Corruption Protection
If you want to conduct a disaster recovery test, you can use GDPS GM to prepare the B disks to be used for the test. However, during the test, remote copying must be suspended because the B disks are being used for the test and the C disks contain a consistent copy of the production disks as of the start of the test. If a real disaster were to occur during the test, the C disks would be used to give you a consistent restart point; all updates made to the production disks after the start of the test would need to be re-created, however. At the completion of the test, GDPS GM uses the Failover/Failback capability to resynchronize the A and B disks without having to do a complete copy.
GDPS GM supports an additional FlashCopy disk device, referred to as F disks or FC1 disks. F disks are additional “practice” FlashCopy target devices that might optionally be created in the recovery site. These devices might be used to facilitate stand-alone testing of your disaster recovery procedures. Disaster testing can be conducted by IPLing recovery systems on the F disk while live production continues to run in the application site and continues to be protected by the B and C disks. In addition, the F disk can be used to create a “gold” or insurance copy of the data in the event of a disaster situation. If you have this additional practice FlashCopy, you will be able to schedule disaster tests on demand much more frequently because such tests will have little or no impact on your RPO and DR capability.
For added scalability, GDPS allows the GM FC disks (C) to be defined in alternative subchannel set MSS1 or to not be defined to the R-sys at all. (For more information, see “Addressing z/OS device limits in a GDPS GM environment” on page 34.) GDPS GM also supports the use of FC1 disk without having the FC1 disk defined to the R-sys.
By combining Global Mirror with FlashCopy, you can create a usable copy of your production data to provide for on-demand testing capabilities and other nondisruptive activities. If there is a requirement to perform disaster recovery testing while maintaining the currency of the production mirror or for taking regular additional copies, perhaps once or twice a day, for other purposes, then consider installing the additional disk capacity to support F disks in your Global Mirror environment.
6.6.1 Use of space-efficient FlashCopy
As discussed in “Space-efficient FlashCopy (FlashCopy SE)” on page 40, by using space-efficient (SE) FlashCopy volumes, you might be able to lower the amount of physical storage needed, and thereby reduce the cost associated with providing a tertiary copy of the data. GDPS has support to allow FlashCopy SE volumes to be used as FlashCopy target disk volumes.
This support is transparent to GDPS; if the FlashCopy target devices defined to GDPS are space-efficient volumes, GDPS will simply use them. All GDPS FlashCopy operations with the NOCOPY option, whether through GDPS scripts or panels, can use space-efficient targets.
Because the IBM FlashCopy SE repository is of fixed size, it is possible for this space to be exhausted, thus preventing further FlashCopy activity. Consequently, we suggest using space-efficient volumes for temporary purposes, so that space can be reclaimed regularly.
GDPS GM can use SE volumes as FlashCopy targets for either the C-disk or the F-disk. In the GM context, where the C-disk is allocated on space-efficient volumes, each new Consistency Group reclaims the repository space that was used since the previous Consistency Group as the new FlashCopy is established with the C-disk. Therefore, a short Consistency Group Interval in effect satisfies the recommendation that space-efficient FlashCopy be used only for temporary data. However, if the Consistency Group Interval grows long because of constrained bandwidth or write bursts, it is possible to exhaust the available repository space. This situation causes a suspension of GM, because any subsequent FlashCopy is not possible.
Whether space-efficient volumes are appropriate for F disks depends on how you intend to use the F disks. They can be used for short-term, less-expensive testing, but they might not be suitable for an actual recovery, which is not a temporary use.
6.6.2 Creating a test copy using GM CGPause and testing on isolated disks
The most basic GM configuration requires the GM secondary disk and the GM FlashCopy on the secondary disk subsystems. If you use an additional set of practice FlashCopy disks on the same disk subsystems, while you are performing recovery testing, you will have the I/O activity for GM mirroring and also the I/O activity generated by recovery testing on the same set of secondary disk subsystems. This I/O activity from the testing can potentially affect the GM mirroring.
GDPS GM supports creating a test copy on disk subsystems isolated from the secondary disk subsystems. We call these the X-disks. The GM secondary disks are connected to the X-disks using the Global Copy (PPRC-XD) asynchronous copy technology. The GM secondary disks are the primary disks for the relationship to the X-disks.
To create a consistent test copy on the X-disks, GDPS GM uses the Consistency Group Pause (CGPause) capability of the DS8000 disk subsystem to make the GM secondary disks consistent. After the GM secondary disks are consistent, GDPS waits until all data on these disks has been replicated to the X-disks and then isolates the X-disks. GDPS then resumes the GM session.
The entire process of isolating the test copy on the X-disks takes only a short time, which means that creating the test copy has minimal impact on GM operations. With the test copy isolated on disk subsystems other than the secondary disk subsystems, any testing performed does not interfere with or affect GM replication, which continues while you test on the X-disk copy.
GDPS also supports the same technique using CGPause to create practice FlashCopy. For environments that do not support CGPause, the GM secondary disks must first be recovered to make them consistent to take the practice FlashCopy. This is a much longer disruption to the GM session when compared to creating the FlashCopy test copy using CGPause.
In summary, CGPause minimizes the interruption to the GM session when creating a test copy. Isolating the test copy on a separate set of disk subsystems (X-disk) eliminates any impact the testing operation might have on the resumed GM session.
6.6.3 Logical Corruption Protection
In addition to the use of FlashCopy technology to provide flexible testing capabilities, GDPS GM uses another technology called Safeguarded Copy (SGC) to provide a powerful solution for protecting against various types of logical data corruption, including cyber attacks and internal threats. This capability is referred to as Logical Corruption Protection (LCP). For more information about LCP, see Chapter 10, “GDPS Logical Corruption Protection and Testcopy Manager” on page 299.
6.7 GDPS tools for GDPS GM
GDPS includes tools that provide functions that are complementary to the GDPS function. The tools represent the kind of functions that many clients are likely to develop themselves to complement GDPS. Using the GDPS tools eliminates the need for you to develop similar functions yourself. The tools are provided in source code format, which means that if a tool does not completely meet your requirements, you can modify the code to tailor it to your needs.
The GDPS Distributed Systems Hardware Management Toolkit is available for GDPS GM. It provides an interface for GDPS to monitor and control distributed systems’ hardware and virtual machines (VMs) by using script procedures that can be integrated into GDPS scripts. This tool provides REXX script templates that show examples of how to monitor/control: IBM AIX® HMC, VMware ESX server, IBM BladeCenter, and stand-alone x86 servers with Remote Supervisor Adapter II (RSA) cards.
6.8 Services component
As demonstrated, GDPS touches on much more than simply remote copy. It also includes automation, disk and system recovery, testing processes, and disaster recovery processes.
Most installations do not have all these skills readily available. Also, it is extremely rare to find a team that possesses this range of skills across many implementations. However, the GDPS GM offering provides access to a global team of specialists in all the disciplines you need to ensure a successful GDPS GM implementation.
Specifically, the Services component includes some or all of the following services:
Planning to determine availability requirements, configuration recommendations, implementation and testing plans. Planning session topics include hardware and software requirements and prerequisites, configuration and implementation considerations, cross-site connectivity planning and potentially bandwidth sizing, and operation and control.
Assistance in defining Recovery Point and Recovery Time objectives.
Installation and necessary customization of NetView and System Automation.
Remote copy implementation.
GDPS GM automation code installation and policy customization.
Education and training on GDPS GM setup and operations.
Onsite implementation assistance.
Project management and support throughout the engagement.
The sizing of the Services component of each project is tailored for that project based on many factors, including what automation is already in place, whether remote copy is already in place, and so on. This means that the skills provided are tailored to the specific needs of each implementation.
6.9 GDPS GM prerequisites
For more information about GDPS GM prerequisites, see this GDPS web page.
6.10 Comparison of GDPS GM versus other GDPS offerings
So many features and functions are available in the various members of the GDPS family that recalling them all and remembering which offerings support them is sometimes difficult. Table 6-1 lists the key features and functions and indicates which are delivered by the various GDPS offerings.
Table 6-1 Supported features matrix
Feature | GDPS Metro | GDPS HM | GDPS Virtual Appliance | GDPS XRC | GDPS GM
Continuous availability | Yes | Yes | Yes | No | No
Disaster recovery | Yes | Yes | Yes | Yes | Yes
CA/DR protection against multiple failures | Yes | No | No | No | No
Continuous Availability for foreign z/OS systems | Yes with z/OS proxy | No | No | No | No
Supported distance | 200 km; 300 km (BRS configuration) | 200 km; 300 km (BRS configuration) | 200 km; 300 km (BRS configuration) | Virtually unlimited | Virtually unlimited
Zero Suspend FlashCopy support | Yes, using CONSISTENT | Yes, using CONSISTENT for secondary only | No | Yes, using Zero Suspend FlashCopy | Yes, using CGPause
Reduced impact initial copy/resync | Yes | Yes | Yes | Not applicable | Not applicable
Tape replication support | Yes | No | No | No | No
Production sysplex automation | Yes | No | Not applicable | No | No
Span of control | Both sites | Both sites (disk only) | Both sites | Recovery site | Disk at both sites; recovery site (CBU or LPARs)
GDPS scripting | Yes | No | Yes | Yes | Yes
Monitoring, alerting and health checks | Yes | Yes | Yes (except health checks) | Yes | Yes
Query Services | Yes | Yes | No | Yes | Yes
MSS support for added scalability | Yes (RS2 in MSS1, RS3 in MSS2) | Yes (secondary in MSS1) | No | No | Yes (GM FC and Primary for MGM in MSS1)
MGM 3-site and 4-site | Yes (all configurations) | Yes (3-site only and non-IR only) | No | Not applicable | Yes (all configurations)
MzGM | Yes | Yes | No | Yes | Not applicable
Fixed Block disk | Yes | Yes | No | No | Yes
z/OS equivalent function for Linux on IBM Z | Yes (Linux on IBM Z Systems running as a z/VM guest only) | No | Yes (Linux on IBM Z Systems running as a z/VM guest only) | Yes | Yes
GDPS GUI | Yes | Yes | Yes | No | Yes
6.11 Summary
GDPS GM provides automated disaster recovery capability over virtually unlimited distances for both CKD and FB devices. It does not have a requirement for a z/OS System Data Mover system as XRC does, but it does require an additional set of recovery disks when compared to GDPS XRC. It also does not provide the vendor independence that GDPS XRC provides.
The following controlling systems in a GDPS GM configuration provide different functions:
The K-sys, in the application site, is used to set up and control all remote copy operations.
The R-sys, in the recovery site, is used primarily to drive recovery in case of a disaster.
You define a set of scripts that can reconfigure the servers in the recovery site, recover the disks, and start the production systems. The powerful scripting capability allows you to perfect the actions to be taken, either for planned or unplanned changes, thus eliminating the risk of human error. Both the K-sys and R-sys monitor key indicators and resources in their span of control and alert the operator of any non-normal status so that corrective action can be taken in a timely manner to eliminate or minimize RPO and RTO impact.
 
The B disks in the recovery site can be used for disaster recovery testing. The C disks contain a consistent (although aging) copy of the production volumes. Optionally, a practice FlashCopy (F disks) can be integrated to eliminate the risk of RPO impact associated with testing on the B disks.
In addition to its DR capabilities, GDPS GM also provides a user-friendly interface for monitoring and managing the remote copy configuration.

1 The application site is where production applications whose data is to be mirrored normally run, and it is the site where the Global Mirror primary disks are located. You might also see this site referred to as the local site or the A-site.
2 The recovery site is where the mirrored copies of the production disks are located, and it is the site to which production systems are failed over in the event of a disaster. You might also see this site referred to as the remote site or the R-site.
3 Region switches are supported by GDPS MGM in an Incremental Resynch configuration.