Sysplex configurations
 
Important: This chapter is not intended as an introduction to z/OS sysplex concepts or operation. It is assumed that readers understand the basic concepts and terminology that is involved with z/OS sysplex use. For more information about Parallel Sysplex, see Chapter 1, “Introduction” on page 1.
This chapter describes the different types of sysplexes that are available and the considerations for running a sysplex under zPDT.
This chapter includes the following topics:
3.1, “zPDT sysplex configurations”
3.2, “Running a sysplex under zPDT”
3.3, “Sysplex goals”
3.4, “System logger”
3.5, “RACF sysplex usage”
3.1 zPDT sysplex configurations
zPDT can be used in Parallel Sysplex and base sysplex configurations. Both configurations feature the following key requirements:
Shared direct access storage device (DASD)
XCF communication between the multiple z/OS systems in the sysplex
Synchronized time-of-day clocks among the z/OS systems in the sysplex
Two conceptual implementations (one Parallel Sysplex, one base sysplex) of these three elements (for zPDT) are shown in Figure 3-1.
Figure 3-1 zPDT Parallel Sysplex and base sysplex
The two implementations are different in the following ways:
The Parallel Sysplex is solely within IBM z/VM in a single zPDT instance. z/VM accesses the coupling facility (CF) function¹ and emulates the required coupling channels. z/VM “owns” the emulated DASD and is configured to share them among the z/OS guests.
z/VM also provides a simulated external time reference (ETR)² clock that is used by all z/OS guests. z/OS uses global resource serialization (GRS) (using the GRS Star lock structure in the CF) to coordinate data set sharing and other serialization across the members of the sysplex.
The base sysplex configuration involves multiple Linux machines.³ A Linux file sharing function, such as a Network File System (NFS), provides coordinated DASD functions at the Linux cache level. z/OS uses GRS functions through the channel-to-channel (CTC) links among z/OS systems to coordinate data set sharing and other serialization at the z/OS level. A synchronized time-of-day function is provided by the zPDT Server Time Protocol (STP)⁴ function (which was new in zPDT GA6).
In Figure 3-1 on page 18, all of the emulated DASDs for all z/OS systems are owned by one Linux machine, which becomes the Linux file server. In practice, the emulated DASD can be spread over several Linux machines, each of which can work as file sharing servers and clients.
Although not shown in Figure 3-1 on page 18, a base sysplex can also be under z/VM because z/VM supports virtual channel-to-channel adapters. However, if you have z/VM, it is more likely that you implement a Parallel Sysplex because of the greater functionality it provides.
Conversely, it is not possible to implement a Parallel Sysplex across multiple PCs. The main reason is that the support for running the CF control code is provided by z/VM. The virtual machine is called a coupling facility virtual machine (CFVM). z/VM requires that all of the systems that are communicating with a CFVM must be in the same z/VM system. A single z/VM system cannot span PCs. Therefore, all systems in a zPDT Parallel Sysplex must be in the same PC and running in the same z/VM.
Many details are not shown in Figure 3-1 on page 18, which illustrates only the general concepts that are involved.
3.2 Running a sysplex under zPDT
There are several reasons why you might want to configure your zPDT as a sysplex. The following reasons are the most common:
Developers of products that use sysplex facilities (especially CF operations) need a platform for testing.
Developers in larger organizations often must share libraries or test data across multiple z/OS systems. Test data can be large and downloading it from larger IBM Z machines to a single shared zPDT volume (or volumes) is more attractive than downloading to multiple separate zPDT machines.
Application developers must test their programs in a sysplex environment to ensure that they do not do anything that might cause a problem in a production data sharing environment.
If multiple developers are sharing the system, a Parallel Sysplex gives you the ability to shut down one system while other developers use the other system. This ability provides more flexibility to make changes during normal working hours without disrupting other users of that environment.
Sysplex functions are not apparent to Time Sharing Option (TSO) users or normal batch jobs. However, they are apparent to systems programmers or z/OS system operators. More skills are needed to configure and operate a sysplex. In addition, configuring and operating a zPDT Parallel Sysplex requires basic z/VM administration skills. The advantages of sysplex operation must be weighed against the extra required skills.
Licenses
You must have the appropriate licenses to have a sysplex running under zPDT. Assuming you have the correct licenses for normal zPDT and z/OS use, 1090 users might need a separate license to use z/VM. In addition, 1091 users might need a separate feature (license) to use z/VM and Parallel Sysplex functions. Consult your zPDT provider for more information.
3.2.1 High-level concepts
The following high-level concepts must be understood before using a sysplex in a zPDT environment:
zPDT does not emulate coupling channels, which are needed for connections to CFs. z/VM provides the coupling channels emulation capability, which is why zPDT Parallel Sysplex systems must operate under z/VM. Multiple z/OS guests can be run under a single z/VM instance in a Parallel Sysplex configuration. Several zPDT instances cannot be linked (in the same or separate PCs) to create a Parallel Sysplex configuration.
A zPDT instance that is running only the CFCC code cannot be created. The CFCC function is available only to guests that are running under z/VM.
The zPDT product includes the following delivery streams (based on the token that is used):
 – The 1090 tokens⁵ are used by independent software vendors (ISVs) that obtained zPDT through the IBM PartnerWorld® for Developers organization. These tokens can be used with CF functions (under z/VM) to enable Parallel Sysplex operation.⁶ The same 1090 tokens are used by many IBM employees, with the same capabilities as the ISV users.
 – The 1091 tokens are used by zD&T clients. These tokens can have optional features enabled. The use of IBM Z CFs is an optional (priced) feature with 1091 tokens.
Although the zPDT sysplex description is directed to 1090 token users, it also applies to 1091 tokens that have CFs enabled. The 1090 version of zPDT operates with 1090 tokens only, and the 1091 version of zPDT operates with 1091 tokens only.⁷
A base sysplex does not involve a CF and can be created with 1090 or 1091 tokens (without extra license features). Several PCs can be connected (each running zPDT and z/OS) to form a base sysplex. If you plan on such a sysplex, see “Considerations for sharing DASD across multi-PC sysplex” on page 24 for information about the performance considerations for NFS.
The zPDT plus z/VM system that is hosting the Parallel Sysplex is limited to the number of IBM Z CPs that are allowed by one or more zPDT tokens that are on the PC, with an upper limit of eight CPs.⁸
The multiple zPDTs that are creating a base sysplex across multiple PCs are each limited by the token (or tokens) that are used by each zPDT. That is, each zPDT has its own token (or tokens) and there is no collective limit for the number of CPs present in the overall base sysplex.
This token description does not include the practical aspects of the use of a remote token license server. However, the underlying limitations of the number of CPs for a zPDT instance remain.
The importance and implementation of data set serialization across all z/OS instances that use shared DASD must be thoroughly understood. As a practical matter, this issue is handled by z/OS GRS. Regardless of whether the sysplex is a base sysplex or a Parallel Sysplex, sharing DASD across multiple z/OS images results in corrupted DASD unless GRS is properly configured and working.
A base sysplex requires time-of-day clock synchronization among the z/OS members. To provide this feature, the zPDT STP function must be started before any zPDT instances are started. (z/VM provides a simulated time-of-day function that is used if the sysplex is run under z/VM.) For more information about STP, see the Server Time Protocol chapter in IBM zPDT Guide and Reference System z Personal Development Tool, SG24-8205.
The topic of z/OS system tuning is not addressed in this book. Sysplex operation typically requires attention to various tuning tasks that are handled by z/OS systems programmers. Sysplex operation stresses various sharing functions that are related to ENQ and ENQ+RESERVE processing. IBM APAR II14297 addresses this area with suggestions that might be useful when operating on a zPDT base.
3.2.2 Planning for CFs under zPDT
For 1090 users, no other zPDT device map (devmap) statements are required to use CF functions.
For 1091 users, an extra parameter is required in the devmap, as shown in the following example:
[system]
memory 10000m
3270port 3270
cpuopt zVM_CouplingFacility
processors 3
 
Note: The z/VM directory that is provided by ADCD contains all of the definitions that are necessary for your Parallel Sysplex. The following section is provided only for your information and to help you understand the configuration that is provided by ADCD. If you are familiar with running a sysplex under z/VM, you can skip to 3.2.3, “Hardware” on page 22.
For a Parallel Sysplex, you must define at least one CF guest machine with a z/VM directory entry by using statements that are similar to the following example:
USER CFCC1 CFCC1 5000M 5000M G
XAUTOLOG CFCONSOL
OPTION CFVM TODENABLE
MACH ESA
CONSOLE 009 3215 T CFCONSOL
The CFCONSOL keyword on the CONSOLE statement in this example refers to the name of a z/VM virtual machine. In this case, any messages that are issued by the CF that is running in CFCC1 are sent to the CFCONSOL ID. This configuration allows you to consolidate the messages from all CFs to a single place. The console guest is defined as shown in the following example:
USER CFCONSOL CFCONSOL 4M 4M ABCDEFG
MACH ESA
CONSOLE 009 3215 T
The memory size for the CF (shown as 5000M in this example) depends on your requirements, including the size of your structures and how many are allocated concurrently.
The z/VM directory entry for each z/OS guest that uses a CF must contain OPTION and SPECIAL statements, as shown in the following example:
USER ADCD .....
OPTION CFUSER TODENABLE
.....
SPECIAL 1400 MSGP CFCC1 (CFCC1 is the name of a CFCC guest definition)
This SPECIAL statement defines a CF channel, which is emulated by z/VM. Also, the z/VM directory MDISK definition for volumes that contain z/OS couple data sets (CDSs) must contain a special parameter, as shown in the following example:
MDISK A9E 3390 DEVNO A9E MWV
DASDOPT WRKALLEG <==this parameter needed for couple data set volumes
For more information about the DASDOPT statement, see 4.3.6, “Configuring z/VM guests” on page 49.
3.2.3 Hardware
No special base hardware (for the Intel compatible computers that are running Linux and zPDT) or Linux release is required. However, as a practical matter, a Parallel Sysplex configuration normally requires more memory than a simple z/OS configuration under z/VM.
z/VM runs in IBM Z memory. IBM Z memory (in a zPDT environment) is virtual memory that is created by Linux. In principle, Linux virtual memory is only loosely related to the real memory of the PC. In practice, the real memory of the PC should be large enough to avoid paging and swapping by Linux.
In a simple case, base sysplex members are normal z/OS systems that often require no extra memory above what might be normal for that z/OS system’s workload.
As an overall suggestion, typical zPDT documentation states that the defined IBM Z memory (specified on the memory parameter in the devmap file) should be at least 1 GB smaller than the real PC memory for a dedicated zPDT system. In the case of a Parallel Sysplex configuration, 16 GB of real PC memory should be the minimum reasonable size, with more memory being better.⁹
With 16 GB of real memory, defining a 12 - 13 GB IBM Z environment is reasonable. This environment can be used with two z/OS guests (at 4 - 6 GB each, for example), two CFs (1 GB each), and z/VM.
It is important that the memory value is sufficiently smaller than the real memory amount to leave memory for zPDT device managers, a reasonable disk cache, and other functions. More Linux workloads require more memory, as does the use of more or larger z/OS guests. If you plan to use a memory value larger than 14 GB, see the “Alter Linux files” section in IBM zPDT Guide and Reference System z Personal Development Tool, SG24-8205.
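As a point of reference only, the following devmap [system] stanza sketches how these suggestions might be combined for a 16 GB PC that hosts a Parallel Sysplex under z/VM. The specific values are assumptions that you must adjust to your own memory, processor, and token situation:
[system]
memory 13000m                   <== leaves roughly 3 GB of the 16 GB PC for Linux, zPDT, and the disk cache
3270port 3270
cpuopt zVM_CouplingFacility     <== required for 1091 users only
processors 3
The sizes of the individual z/OS guests and CFs are then set in the z/VM directory entries (see 3.2.2, “Planning for CFs under zPDT”), not in the devmap.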
Ignoring hypervisor situations, zPDT does not partition PC memory. The base Linux system has complete control of PC memory; zPDT runs in Linux virtual memory, as with any other Linux application. However, zPDT can be a major user of memory and runs faster if there is little paging at the Linux level and ample memory for the Linux disk cache.
Discussions that refer to, for example, an 8 GB notebook that can have a 6 GB zPDT do not imply that 6 GB is somehow partitioned solely for zPDT use. Rather, it is a discussion for obtaining best performance by ensuring that memory resources are sufficient. For more information about memory use in a zPDT environment, see the Memory section in the “Function, releases, content” chapter of IBM zPDT Guide and Reference System z Personal Development Tool, SG24-8205.
In principle, a 1090-L01 zPDT system (one CP) can be used on a base PC with one processor (“core”) to run a z/VM system with several Parallel Sysplex z/OS guests. In practice, such a configuration is unworkable and might result in various z/OS timeouts.
For any significant use of a Parallel Sysplex system under z/VM, use at least a 1090-L02 model that is running on a PC with a four-way processor (four “cores”). The individual z/OS guests (under z/VM) can be defined with one logical CP each. The PC must have more cores than the number of zPDT processors in use in any instance of zPDT. For example, on a quad core PC, you should not define a zPDT instance with four processors. For more information, see IBM zPDT Guide and Reference System z Personal Development Tool, SG24-8205.
The amount of disk space that you need depends on which ADCD volumes you restore and on how many other volumes you have for your own programs and data. It does not make sense to try to skimp on disk space. At the time of this writing, a 1 TB internal hard disk drive (HDD) costs approximately $75 US. Trying to run a system with insufficient disk space wastes time, results in abends because of a lack of space, and affects your ability to use the zPDT disk versioning capability.
 
Tip: Do not try to save money by using a small HDD in your zPDT PC. The extra cost for a 1 TB HDD compared to a 500 GB HDD (or a 2 TB HDD compared to a 1 TB HDD) is minimal. Providing plenty of disk space is one of the cheapest ways to improve operability of your zPDT sysplex.
3.2.4 Performance
A zPDT Parallel Sysplex environment is intended for basic development, self-education, minor proof-of-concept work, small demonstrations, and so on. It is not intended for any type of production or stressful work. It should never be used to gauge relative performance of any software. The primary performance limitations are in the following areas:
Disk access, especially on a notebook with a single, relatively slow disk. Physical disk access is reduced by the normal Linux disk cache functions. The effectiveness of the Linux disk cache is improved if ample PC memory is available. When two or more z/OS systems are accessing the same PC disk or disks (as in a Parallel Sysplex under z/VM), I/O performance becomes a critical factor.
Memory management (real and virtual). Paging (by Linux, z/VM, and z/OS guests) must be avoided as much as possible, mostly because of disk bottlenecks. An effective Linux disk cache is essential for reasonable zPDT performance.
IBM Z CP processing power. This issue is limited by the number of PC processors that are available, their speed, and the maximum number of CPs that can be defined under zPDT. In a zPDT Parallel Sysplex configuration, the available CPs are shared among all z/OS guests. Consider the following points:
 – If your zPDT system is running zIIP-eligible work and a physical core is used to back it, you might consider adding a “free zIIP” to your devmap. For more information about the zPDT free zIIP support, see IBM zPDT Guide and Reference, SG24-8205.
 
Note: No benefit is gained by adding a zIIP if you do not have another core for it to use. For example, if you have a 4-core PC and already have three CPs defined to your zPDT system (leaving the fourth core for Linux use), adding a zIIP to your devmap delivers little, if any, extra capacity.
 – The WLM policy that is included with ADCD (up to and including the May 2020 level) assigns z/OSMF and other system tasks to a Discretionary service class. zIIP-eligible work in a discretionary service class never overflows to a general-purpose CP. This approach makes sense when z/OS is running on a real IBM Z CPC. However, in a zPDT environment, ensure that no important work is assigned to a discretionary service class.
z/VM overhead under zPDT. This overhead is greater than the z/VM overhead on a real IBM Z machine.
Allowing for these considerations, we found the performance of a Parallel Sysplex in our test environments to be reasonable when normal development workloads are involved. I/O performance, which is provided by the combination of the PC disk (or disks) and the Linux disk cache in memory, tends to be a limiting factor for more complex workloads. However, “reasonable” is subjective. We cannot predict the performance of your workloads on a sysplex system.
A zPDT base sysplex has different performance characteristics. In the simple case, only one z/OS (and no z/VM) is present in each base Linux PC. Each z/OS has the full power of its base machine and uses as many CPs as are provided in the zPDT token (or tokens) for that machine.
Considerations for sharing DASD across multi-PC sysplex
The implementation of shared DASD across multiple Linux PCs has significant performance implications. We noted the following aspects in our sample configuration:
In the simple configuration that is shown in Figure 3-1 on page 18, all the emulated 3390 DASD were in one Linux PC. Our other Linux PC required all DASD access to go through the LAN. For read operations, active 3390 tracks might be contained in the local Linux caches.
The sample configuration used NFSv4 for the shared file operation.¹⁰ zPDT (with the -shared option specified for the awsckd device manager in the devmap file) issues Linux file locks on file byte ranges that are the target of an operational CCW. NFSv4 then invalidates any matching cached ranges on other sharing machines. This configuration requires considerable communication overhead. (A sketch of the Linux side of this NFSv4 setup follows this list.)
Shared DASD performance, using NFSv4 over 1 Gb Ethernet links with all emulated DASD placed on one PC, was a concern.
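The Linux side of such an NFSv4 configuration is conceptually simple. The following fragment is a sketch only: it assumes that the emulated 3390 files are kept in a /z directory on the serving PC (W520 in our examples) and are mounted at the same path on the client PC (W510). The directory name, network range, and export options are assumptions that you must adapt to your own environment:
# /etc/exports on the serving PC (W520)
/z   192.168.1.0/24(rw,sync,no_root_squash)
# Refresh the export list on W520, then mount the share on the client PC (W510)
sudo exportfs -ra
sudo mount -t nfs4 W520:/z /z
The client must be able to resolve the name W520 (through DNS or /etc/hosts), and consistent Linux user IDs across the two machines help avoid file permission problems with the emulated volume files.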
We cannot predict whether the performance fits your requirements. However, you can try this configuration and see whether it is acceptable to you.
Some common-sense techniques can help. For example, IPLing multiple sysplex members at the same time creates a much greater stress on I/O responsiveness than is encountered during more normal operation.
An environment with more sophisticated (and more expensive) HDDs had much better performance (combined with greater complexity). For more information about our experiences in this area, see Appendix C, “Alternative disk configuration” on page 147.
NFS is not always a high-performance shared file system. It is attractive because it is widely available and simple to configure. The following methods can be used to address performance issues:
Use an alternative Linux shared file system (several are available). The disadvantage is that they are not as easy to configure as NFS, extensive documentation is not generally available, and these alternatives were not exercised by the zPDT development group.
Reduce the number of shared volumes. That is, provide each of the members of the sysplex with most or all of the standard z/OS volumes as local, unshared volumes. In the extreme case, only the volumes that are used by obviously sysplex-wide functions, such as RACF and JES2, and the volumes that contain shared development libraries and test data are shared. This arrangement might appear obvious, but implementation is not trivial.
The IBM design of sysplex capability assumes that (almost) all volumes are shared and global resource serialization (GRS) locking is used to serialize access to data sets. Among other changes, extensive GRS parameter customization (in the GRSRNLxx member of PARMLIB) is needed to make this configuration effective. GRS locking is by data set name, not by volume name. The sample implementation does not explore this approach.
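If you do decide to experiment with this approach, the tailoring is done with RNLDEF statements in a GRSRNLxx member. The following fragment is a hypothetical sketch only; the LOCAL qualifier is an invented example and is not a name that is used by the package:
/* Treat data sets whose names begin with LOCAL as local (unshared) resources */
RNLDEF RNL(EXCL) TYPE(GENERIC) QNAME(SYSDSN) RNAME(LOCAL)
/* Convert hardware RESERVEs for catalog serialization into global ENQs       */
RNLDEF RNL(CON) TYPE(GENERIC) QNAME(SYSIGGV2)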
3.3 Sysplex goals
Chapter 4, “Installing the Sysplex Extensions 2020” on page 35 describes an implementation of a base sysplex and a Parallel Sysplex system. The implementations are intended to be practical, and you might follow the same steps to create your own sysplex systems.
In comparison to previous versions of the zPDT sysplex support, this version delivers a sysplex that uses more sysplex functions. You can choose which of those functions you want to use.
One of the reasons for delivering this more robust sysplex environment is to provide software vendors and customers with a configuration with which to easily test software and applications in a realistic sysplex environment. Access to a more realistic configuration can result in fewer problems making it through to a production environment. This environment is also designed with an aim to keep the installation and management as easy and flexible as possible, which makes it easier for you to create and test various configurations.
The Sysplex Extensions 2020 are designed to adhere to IBM’s sysplex preferred practices (or at least, those guidelines that are applicable to a zPDT environment). By providing a working example of such a configuration, you can more easily determine whether configuring your systems and subsystems in this manner delivers benefits in your production environments that you are not receiving today.
3.3.1 General system characteristics
The specific examples of sysplex implementations have the following characteristics:
As far as possible, the volumes that are included in the package contain everything that you need to add the zPDT Sysplex Extensions 2020 to an ADCD system, which results in a base or Parallel Sysplex. Some manual changes are required, but those changes are kept to a minimum and are described in Chapter 4, “Installing the Sysplex Extensions 2020” on page 35.
The Sysplex Extensions 2020 package is based on the May 2020 ADCD z/OS 2.4 release. In principle, nothing is unique in this implementation that is tied to this release, but various details (such as the name of the ADCD data sets) might change from release to release.
This release of the zPDT Sysplex Extensions updates the sample CICSplex and DB2 data sharing environments. These environments are the most widespread sysplex environments among z/OS customers, so they can provide the most value to the largest number of users.
The sysplex implementations include shared consoles, SMF log streams, and RRS log streams. They do not include IBM VTAM® or IP failover or pass-through. JES2 Multiple Access Spool (MAS) (or “shared spool”) is configured, even for “normal” z/OS where it does no harm. The implementation also adds other Parallel Sysplex users, such as automatic restart manager, Sysplex Failure Manager, and Health Checker use of log streams, along with supporting documentation to help you use and evaluate them.
Future releases of the zPDT Sysplex Extensions might extend the provided Parallel Sysplex environments to include IBM IMS, IBM MQ, and IBM WebSphere® Application Server. In this version, those other subsystems can be used, but the implementation is left as an exercise for you.
For our Parallel Sysplex configuration, all DASD¹¹ is “owned” by z/VM and is shared by the z/OS guests. For the multi-PC base sysplex configuration, all DASD is placed on one of the PCs. The other system accesses the DASD through Linux shared file facilities.
Only one set of z/OS volumes is involved. It is initially loaded with different parameters to run as one of the following systems:
 – The “normal” ADCD z/OS monoplex system, with no sysplex relationships. All the standard ADCD z/OS 2.4 IPL parameters can be used in this mode. The system name is S0W1.
 – The first system in a Parallel Sysplex. IPL parameter PS¹² is used, and the system name is S0W1.¹³
 – The second system in a Parallel Sysplex. IPL parameter PS is used, and the system name is S0W2.
 – The first system in a base sysplex. IPL parameter BS is used, and the system name is S0W1.
 – The second system in a base sysplex. IPL parameter BS is used, and the system name is S0W2.
An IPL of the system must be done in consistent ways. The “normal” z/OS cannot be used at the same time that one of the sysplex configurations is being used. The S0W1 and S0W2 systems can be loaded in Parallel Sysplex mode or in base sysplex mode, but you should not load one system in one sysplex mode and the other system in a different sysplex mode concurrently.
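For example (following the IPL statement format that is described in footnote 12), with the conventional ADCD device numbers A80 (IPL volume) and A82 (IODF and master catalog volume), the loads from the zPDT command level might look like the following examples. For the Parallel Sysplex, the equivalent IPL command is issued from within each z/OS guest virtual machine under z/VM rather than at the zPDT level:
ipl a80 parm 0a82cs     <== “normal” ADCD monoplex with a JES2 cold start
ipl a80 parm 0a82st     <== a member of a base sysplex that is spread over two PCs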
z/VM 7.1 is used as the basis of the Parallel Sysplex system. There is nothing unique in our implementation that is tied to this release.
The name of our sysplex is ADCDPL because it is the name of the standard ADCD monoplex and there is no reason to change it. The name remains the same for the monoplex, base sysplex, and Parallel Sysplex.
There is no cold start and warm start distinction among the sysplex IPL parameters. Every load performs a CLPA.
A JES2 cold start can be triggered only by loading a “normal” z/OS (ADCD monoplex) and selecting an IPL parameter that produces a cold start. This parameter is IPL parameter CS in recent ADCD releases.
The sysplex configurations are mostly determined by members in PARMLIB, PROCLIB, TCPPARMS, and VTAMLST. The ADCD system provides empty libraries, such as USER.Z24B.PARMLIB and USER.Z24B.PROCLIB. These USER libraries are concatenated before the standard ADCD and SYS1 libraries. Most of our altered members are placed in the USER libraries by completing the steps that are described in Chapter 4, “Installing the Sysplex Extensions 2020” on page 35.
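The concatenation order is established by PARMLIB statements in the LOADxx member that is selected by the IPL parameter. The following fragment is a sketch only; the ADCD system data set name is an assumption and might differ in your release:
PARMLIB  USER.Z24B.PARMLIB
PARMLIB  ADCD.Z24B.PARMLIB
PARMLIB  SYS1.PARMLIB
Members are searched in the order in which the PARMLIB statements appear, so a member in USER.Z24B.PARMLIB overrides a member of the same name in the ADCD or SYS1 libraries.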
The Linux names (defined in the Linux system’s /etc/HOSTNAME file) help identify the personal computers that are described in the examples in this book. The first base sysplex machine is named W520 (and it also runs the STP server), and the second base sysplex machine is named W510. The Linux names are not meaningful in a Parallel Sysplex environment unless they are needed for external IP name resolution.
Table 3-1 lists the intended IPL parameters and IP addresses.
Table 3-1 IPL parameters and addresses
 
System                     System name  IPL parameters      IP addresses              OMVS data sets
Base ADCD z/OS             S0W1         CS, 00, and others  192.168.1.80, 10.1.1.2    ZFS.S0W1
Base sysplex system 1      S0W1         BS or ST            192.168.1.80, 10.1.1.2    ZFS.S0W1
Base sysplex system 2      S0W2         BS or ST            192.168.1.90, 10.1.1.3¹   ZFS.S0W2
Parallel Sysplex system 1  S0W1         PS                  192.168.1.80, 10.1.1.2    ZFS.S0W1
Parallel Sysplex system 2  S0W2         PS                  192.168.1.90, 10.1.1.3    ZFS.S0W2
1 We can use 10.1.1.2 for this address because it is on a different Linux machine than the other 10.1.1.2; however, the use of 10.1.1.3 avoids the confusion that duplicate addresses can cause.
Both members of the Parallel Sysplex are under one z/VM, which is running in one Linux system. The members of the base sysplex can be under z/VM (in which case the BS IPL parameter is used) or in two different Linux systems (in which case the ST IPL parameter is used). When running across two PCs, they are sharing all of the 3390 volumes via Linux file sharing facilities. For more information, see Figure 3-1 on page 18.
Figure 3-2 on page 28 shows the DASD volumes that are involved in the sysplex systems. The z/VM volumes might not be used for the base sysplex and can be omitted. As noted in Figure 3-2, there is a single IPL volume for z/OS. Different IPL parameters are used to start the different sysplex configurations that are provided (monoplex, base sysplex under z/VM, base sysplex spread over two PCs, and Parallel Sysplex).
The small numbers in Figure 3-2 are the addresses (device numbers) that we used in the zPDT devmap and in this documentation. There is no need to use these addresses in your sysplex system. The only semi-standard addresses are A80 (for the IPL volume) and A82 (for the IODF and master catalog) and these addresses are conventions that are used in ADCD documentation. If you build a sysplex that is based on a different ADCD release, your volsers will differ from those volsers that are shown in Figure 3-2.
The CF0001, DB2001, and CICS01 volumes in Figure 3-2 are volumes that are provided as part of the zPDT Sysplex Extensions 2020. The CF0001 volume contains CDSs and jobs that are related to sysplex setup. You can use other volumes for these functions. There is nothing special about the CF0001 volume except that it is mounted at an address that has the WRKALLEG option in the z/VM directory.
Figure 3-2 Total of 3390 volumes for sysplex
3.3.2 z/VM-related sysplex limitations
The following sysplex-related functions are not available when running under z/VM:
Enhanced Catalog Sharing (ECS)
This function uses a CF cache structure to eliminate the need to read the sharing subcell in the VSAM volume data set (VVDS) for catalogs that are defined to use ECS. During system initialization, the ECS code checks to see whether z/OS is running as a virtual machine under z/VM. If it is, an IEC377I message is issued and ECS is not used by that system, even if you define the ECS CF structure and enable catalogs for ECS.
BCPii
BCPii is a z/OS system function that provides the ability for a program that is running on z/OS to communicate with the Support Element and HMC of the CPC on which the system is running. This communication is a powerful capability and is used by the z/OS cross-system coupling facility (XCF) System Status Detection Partitioning Protocol to more quickly and more accurately determine the state of a system that stopped updating its time stamp in the sysplex CDS. However, because z/OS is running under the control of z/VM rather than directly on the CPC, BCPii shuts down during the system start if it determines that it is running under z/VM. Message HWI010I (BCPII DOES NOT OPERATE ON A VM GUEST) is issued and the BCPII address space ends.
3.4 System logger
The system logger component of z/OS provides generalized logging services to products or functions, such as CICS, IMS, SMF, LOGREC, OPERLOG, and Health Checker. The installation defines “log streams,” which are conceptually similar to infinitely large VSAM linear data sets. Log streams can be shared by multiple systems but often are dedicated to only one product or function.
When data (known as log blocks) is written to a log stream, it is initially placed in interim storage. To protect the availability of the data, system logger keeps two copies of the data while it is in interim storage. These copies are kept in two of the following locations:
A structure in the CF
A staging data set on DASD
A data space in z/OS
Exactly which two of these locations system logger uses for any specific log stream depends on parameters that you specify when you define the log stream.
After some time, the interim storage portion of a log stream fills up, which triggers a process known as offload. Offload consists of the following parts:
Any log blocks that are deleted by the application are physically deleted from interim storage.
Remaining log blocks are moved to offload data sets until the log stream utilization drops to an installation-specified threshold (the LOWOFFLOAD threshold). The first offload data set is allocated when the log stream is defined (even if it is never used), and more offload data sets are allocated automatically as needed.
Eventually, the older log blocks are deleted by system logger based on a retention period you specify for each log stream or by the application that owns the data.
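Log streams and their attributes (CF structure, offload thresholds, retention period, and so on) are defined with the IXCMIAPU utility. The following job is a generic sketch only: the log stream and structure names are invented for illustration (they are not names that are used by the Sysplex Extensions package), and the structure must also be defined in the CFRM policy:
//DEFLOGR  JOB (0),'DEFINE LOGSTREAM',CLASS=A,MSGCLASS=A
//DEFINE   EXEC PGM=IXCMIAPU
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DATA TYPE(LOGR) REPORT(NO)
  DEFINE STRUCTURE NAME(LOG_TEST_001) LOGSNUM(1)
         MAXBUFSIZE(65532) AVGBUFSIZE(4096)
  DEFINE LOGSTREAM NAME(TEST.EXAMPLE.LOG)
         STRUCTNAME(LOG_TEST_001)
         HIGHOFFLOAD(80) LOWOFFLOAD(20)
         RETPD(7) AUTODELETE(YES)
/*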
A critical data set is the LOGR CDS. The LOGR CDS contains the definitions of all log streams. It also contains information about the staging and offload data sets and the range of log blocks in each offload data set. From that perspective, it can be viewed as being similar to a catalog.
Based on this configuration, you can see that there is a relationship between the following components:
The LOGR CDS, which contains the log stream definitions and information about every staging and offload data set.
One or more user catalogs that contain entries for the staging and offload data sets.
The volumes that contain the staging and offload data sets.
If any of these components are unavailable, the log stream is unusable, or at a minimum, it is in a LOST DATA status.
For the Sysplex Extensions to ship a predefined log stream, it must provide the following components:
LOGR CDS that contains all of the definitions
User catalog that contains the entries for the staging and offload data sets
Staging and offload data sets
The CICS component of the Sysplex Extensions places the user catalog for the CICS log stream data sets, and the data sets themselves, on the SMS-managed volumes that are provided as part of the package. The LOGR CDS is contained on the CF0001 volume. This configuration means that the entire system logger environment for the CICSplex is provided as part of the package.
However, the data sets for the OPERLOG, LOGREC, RRS, SMF, and Health Checker log streams are not on a volume that is provided by the package. Fortunately, none of these log streams are required at IPL time, which means that the LOGR CDS that is provided by the Sysplex Extensions package does not contain the definitions for these log streams. Instead, jobs are provided to create all of the definitions. These jobs are run after the first load of the Parallel Sysplex. For more information, see 4.4, “Starting your Parallel Sysplex” on page 73.
Something that you must consider is whether you have any log streams today. Two LOGR CDSs cannot be merged. Therefore, if you move to the CDSs that are contained in this package, all of the data in your log streams is lost. You must review your log streams and determine how you can handle the loss of those log streams. For some log stream users, such as SMF, you can empty the log stream before you shut down in preparation for the move to the new CDSs. For other log streams, such as CICS or RRS, you must perform a cold start after the move.
For more information about system logger in general, see the following resources:
Systems Programmer’s Guide to: z/OS System Logger, SG24-6898
z/OS MVS Setting Up a Sysplex, SA23-1399
3.4.1 Resource Recovery Services
Resource Recovery Services (RRS) is a z/OS component that provides two-phase commit support across different components and potentially across different systems. Common users of RRS are CICS, IMS, IBM MQ, DB2, and WebSphere Application Server.
RRS uses system logger log streams to track its actions and enable it to recover if there is a failure in RRS or of the system it is running on, or even of the entire sysplex. If it cannot retrieve its log records from system logger, it might be necessary to cold start RRS, which means that it loses information about any in-flight transactions that it was managing.
Depending on the type of work you are doing, this issue might be a factor. However, an important point is that the loss of log stream data in a zPDT environment is far more likely than in a “real” system, which is true because the CF and the connected z/OS systems are all in the one PC. If that PC fails or is shut down abruptly (RRS is stopped abruptly) and you did not define the log streams to use staging data sets, all of the log blocks that were in interim storage are lost.
If this issue is a problem for you, take the following measures to reduce the likelihood of being affected by this issue:
Define the RRS log streams to use staging data sets. You achieve this definition by updating the log stream definitions to specify STG_DUPLEX(YES). A related parameter, DUPLEXMODE, controls whether a staging data set is always used for that log stream or is used only if there is a single point of failure between the z/OS system and the CF containing the structure that is associated with that log stream. However, when running a Parallel Sysplex under z/VM, the CF is always in the same failure domain as all connected systems, so a staging data set always is used for that log stream if STG_DUPLEX(YES) is specified. (A sketch of this update follows these measures.)
Always complete a controlled shutdown of the system. If the system is stopped in an orderly manner and RRS is stopped as part of that shutdown procedure, the log blocks in the CF are moved to an offload data set before RRS disconnects from the log stream. This process means that when the system is brought back up, any required log blocks are still accessible to system logger (and, therefore, to RRS).
For more information, see z/OS MVS Programming: Resource Recovery, SA23-1395.
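If you decide to add staging data sets in this way, the log stream definitions are changed with IXCMIAPU UPDATE statements, run in the same manner as the IXCMIAPU sketch that is shown earlier in 3.4, “System logger”. The following statements are a sketch only; the log stream name follows the RRS naming that is described in “Considerations for base sysplex” later in this section, and the update might not take effect until the log stream has no active connections:
  DATA TYPE(LOGR) REPORT(NO)
  UPDATE LOGSTREAM NAME(RRS.ADCDPL.RESTART)
         STG_DUPLEX(YES) DUPLEXMODE(UNCOND)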
At one time, IBM advised customers not to define the RRS archive log stream on the basis that it can contain much information that is rarely required. However, since then, the SETRRS ARCHIVELOGGING command was added, which you can use to dynamically turn the archive log stream on and off.
For that reason, the archive log stream is defined and then the SETRRS ARCHIVELOGGING,DISABLE command is run from the VTAMPS member. If you require the data in the RRS archive log stream, remove that command from the VTAMPS member or run the SETRRS ARCHIVELOGGING,ENABLE command to enable the use of that log stream again.
Considerations for base sysplex
By default, all RRSs in a sysplex are in the same RRS group, meaning that all of the RRSs write to the same set of log streams. In this case, the RRS group name matches the sysplex name.
However, if you are running in a base sysplex configuration, all log streams must be defined as DASDONLY log streams, and DASDONLY log streams cannot be shared between systems. If you want to use RRS in such an environment, you must change the command that is used to start RRS to include a unique group name (something other than the sysplex name). The base sysplex in this package uses RRS group names that match the system names.
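For example, assuming the standard RRS cataloged procedure (which accepts a GNAME parameter), the start commands in base sysplex mode might look like the following examples:
S RRS,GNAME=S0W1     <== on system S0W1
S RRS,GNAME=S0W2     <== on system S0W2
The package issues the appropriate form of this command automatically, as described later in this section.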
From a log stream perspective, the second qualifier of the log stream name is the RRS group name. Therefore, the following RRS log streams are defined in the LOGR CDS:
RRS.ADCDPL.RESTART: Restart log stream for use in Parallel Sysplex mode.
RRS.S0W1.RESTART: Restart log stream for system S0W1 when in base sysplex mode.
RRS.S0W2.RESTART: Restart log stream for system S0W2 when in base sysplex mode.
This package sets up the VTAMPS Parmlib members so that the default group name is used if the systems are in Parallel Sysplex mode. If the system is started as a base sysplex, it automatically starts RRS in each system by using the system name as the group name. This process should not be apparent to you, except that if you move back and forth between a Parallel Sysplex and a base sysplex, RRS switches back and forth between different log streams. Therefore, information about any in-flight transactions that were being managed by RRS immediately before the switch are unavailable when the system comes up in the opposite sysplex mode.
For more information about RRS, see Systems Programmer’s Guide to Resource Recovery Services (RRS), SG24-6980.
3.4.2 LOGREC
One of the common problems with the LOGREC data sets is that they are a finite size, meaning that they can fill up, often at the time when you really need the information that they contain.
Another challenge in a sysplex environment is that problems on one member of the sysplex can often be related to a problem or event on another member of the sysplex. However, if you have a separate LOGREC data set for each system, the relationship between what is happening across the sysplex might not be so obvious.
It is for these reasons that LOGREC was one of the first users of system logger. Because the log streams provide far more space than the LOGREC data sets, you do not have to worry about the data set filling exactly when you need it.
Because multiple systems can write to a single LOGREC log stream, you have a single repository for all LOGREC data, and the data is in chronological order across the whole sysplex. Therefore, a single report shows you what is happening (in the correct sequence) across all of your systems.
Before z/OS 2.2¹⁴, you specified the LOGREC data set name in the IEASYSxx member, which means that the system always came up in data set mode, and you then used the SETLOGRC command to switch to log stream mode. The IEASYSPS member that is delivered by the Sysplex Extensions results in the system coming up in data set mode. (It specifies LOGREC=SYS1.&SYSNAME..LOGREC.) If you want to run in log stream mode, update the appropriate VTAMxx Parmlib member to issue the SETLOGRC LOGSTREAM command during system initialization.
3.4.3 OPERLOG
Related to the LOGREC log stream is the OPERLOG log stream. OPERLOG is an application that records and merges the hardcopy message set from each system in a sysplex that activates the application. OPERLOG is helpful when you need a sysplex-wide view of system messages; for example, if you are investigating a system problem, it can be invaluable to also see what is happening on the other systems in the sysplex.
The OPERLOG log stream is automatically enabled at IPL time if you include the OPERLOG keyword on the HARDCOPY DEVNUM statement in the CONSOLxx member, as shown in the following example:
HARDCOPY
DEVNUM(SYSLOG,OPERLOG)
CMDLEVEL(CMDS)
ROUTCODE(ALL)
If the OPERLOG log stream is not defined (as it is not when you load the Parallel Sysplex for the first time), you are presented with the CNZ4201E OPERLOG HAS FAILED message. Job LOGROPR, which is described in 4.4, “Starting your Parallel Sysplex” on page 73, allocates the OPERLOG log stream. If you want to enable the OPERLOG log stream after the log stream is defined, run the V OPERLOG,HARDCPY command.
If you want to stop using OPERLOG and delete its log stream, complete the following steps:
1. Run the V OPERLOG,HARDCPY,OFF command on every system.
Running this command stops the system from writing any more messages to the OPERLOG log stream.
2. Wait until everyone that might be using the OPERLOG log stream from TSO logs off (or at least, exits SDSF). Each system maintains a connection to the log stream until no one is using the OPERLOG log stream on that system.
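One way to check whether a system still holds a connection is the DISPLAY LOGGER command. The log stream name that is shown here is the standard OPERLOG name:
D LOGGER,CONN,LSNAME=SYSPLEX.OPERLOG,DETAIL
The DETAIL option lists the jobs and users (for example, TSO users in SDSF) that are still connected on that system.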
For more information about the use of OPERLOG, see z/OS MVS Planning: Operations, SA23-1390.
3.4.4 System Management Facilities
The ability to write System Management Facilities (SMF) data to log streams was introduced with z/OS 1.9. The original (and still default) way of running SMF is to write all SMF records that were created by a system to one VSAM data set.
This method made sense when most installations had only one IBM MVS™ system and when there were only a few products that created SMF records. However, today there are few sites that have only a single z/OS system. There also are many products from IBM and other vendors that create SMF records. In many cases, SMF data is used by extracting a subset of SMF record types from the SMF data sets of every system, merging them, and then running reports that are based on that installation-wide subset of SMF records.
SMF’s ability to write its records to one or more log streams is a far more suitable model in these cases. For example, you can create one log stream that contains performance-related SMF records from every member of the sysplex, another log stream that contains the CICS records from every member, and yet another one that contains the security violation SMF records. Each log stream can be defined with retention and availability characteristics that are appropriate for that SMF record type.
The SMFPRMxx member that is provided by the Sysplex Extensions package contains definitions for traditional SMF VSAM data sets (SYS1.MANx) and a single SMF log stream. However, the member also disables the writing of any SMF record. So, as delivered, this sysplex does not create any SMF records. You can easily change this configuration by updating the SMFPRMxx member in USER.Z24B.PARMLIB to specify ACTIVE rather than NOACTIVE and by activating the changed member.
If you make this change, the system starts writing to the SYS1.MANx data sets. If you want to use the log stream instead, update the member to specify RECORDING(LOGSTREAM) and activate the updated member.
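The following SMFPRMxx fragment is a sketch only of what such a change might look like; the log stream names and record-type assignments are invented examples and are not definitions that are shipped with the package:
ACTIVE                            /* Start recording SMF data              */
RECORDING(LOGSTREAM)              /* Write to log streams, not SYS1.MANx   */
DEFAULTLSNAME(IFASMF.DEFAULT)     /* Catch-all log stream                  */
LSNAME(IFASMF.PERF,TYPE(70:79))   /* RMF records to a separate log stream  */
After activating a changed member with the SET SMF=xx command, you can verify the options that are in effect with the D SMF,O command.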
For more information about the use of SMF log stream mode, see the following resources:
SMF Logstream Mode: Optimizing the New Paradigm, SG24-7919
z/OS MVS System Management Facility, SA38-0667
3.5 RACF sysplex usage
The recommended approach when RACF is used in a sysplex is to share the RACF database. If sysplex data sharing is not enabled, RACF uses hardware reserve/release processing on the RACF database to serialize access. This pre-sysplex approach is known as a shared RACF database. As more systems join the sysplex, I/O to the shared RACF database increases, which leads to more contention. Updates that are made on one system must be propagated to other systems’ local buffers. This propagation requires the deletion of all database buffers because there is no way to identify which particular buffer is affected.
RACF sysplex data sharing allows RACF to invalidate only the buffers that are no longer valid, and uses the CF to provide each RACF with access to many more database buffers than traditional RACF database sharing. RACF sysplex communication, which uses XCF services for communication between RACF subsystems, is a prerequisite to RACF sysplex data sharing.
RACF sysplex communication communicates changes that you make on one system to the other RACFs in the sysplex. Specifically, the following commands are propagated to the other systems:
RVARY SWITCH
RVARY ACTIVE
RVARY INACTIVE
RVARY DATASHARE
RVARY NODATASHARE
SETROPTS RACLIST (classname)
SETROPTS RACLIST (classname) REFRESH
SETROPTS NORACLIST (classname)
SETROPTS GLOBAL (classname)
SETROPTS GLOBAL (classname) REFRESH
SETROPTS GENERIC (classname) REFRESH
SETROPTS WHEN(PROGRAM)
SETROPTS WHEN(PROGRAM) REFRESH
Activating RACF sysplex database sharing and sysplex communication requires updates to the RACF data set name table (ICHRDSNT). Because you might have your own RACF databases, we did not provide an updated table because we do not know your RACF database data set names.

1 This code is the same CFCC licensed code that is used on larger System z machines.
2 A synchronized time-of-day clock is required for all members of a sysplex.
3 Multiple zPDT instances in the same Linux machine can also be used.
4 STP is another method to provide synchronized time-of-day clocks.
5 It is more correct to refer to the licenses that are acquired through a token. However, for brevity it is referred to as a token here.
6 Another license agreement covering z/VM usage might be required.
7 This separation of license control is for zPDT releases dated 2Q13 or later.
8 Up to eight CPs can be obtained by using multiple tokens or through a special large token.
9 These numbers are minimum numbers. Much larger systems can be used.
10 NFS (as opposed to NFSv4) uses an older locking design. This package uses NFSv4 for base sysplex operation. NFS or NFSv4 might be used for the read-only DASD sharing.
11 The emulated IBM Z DASD is referred to here. Other Linux files are in normal Linux locations.
12 A zPDT IPL statement has the format ipl a80 parm 0a82xx; for example, where the two xx characters are the referenced “IPL parameter” in this description.
13 The S0W1 name is set in the delivered ADCD z/OS system. The examples in this book do not alter this name.
14 Starting with z/OS 2.2, it is possible to IPL in LOGREC log stream mode and then switch to DATASET mode.