Central processor complex system design
This chapter describes the design of the IBM z14 ZR1 processor. By understanding the processor design, users become familiar with the functions that make the z14 ZR1 server a system that accommodates a broad mix of workloads for enterprises of all sizes.
This chapter includes the following topics:
3.1 Overview
The z14 ZR1 symmetric multiprocessor (SMP) system is the next step in an evolutionary trajectory that began with the introduction of the IBM System/360 in 1964. Over time, the design was adapted to the changing requirements that were dictated by the shift toward new types of applications on which clients depend.
z14 ZR1 servers offer high levels of performance, reliability, availability, serviceability (RAS), resilience, and security. The z14 ZR1 server fits into the IBM strategy in which mainframes play a central role in creating an infrastructure for cloud, analytics, and mobile, which is underpinned by security. The z14 ZR1 server is designed so that everything around it, such as operating systems, middleware, storage, security, and network technologies that support open standards, helps you achieve your business goals.
For more information about the z14 ZR1 RAS features, see Chapter 9, “Reliability, availability, and serviceability” on page 327.
The z14 ZR1 processor includes the following features:
Ultra-high frequency, large, high-speed buffers (caches) and memory
Superscalar processor design
Out-of-order core execution
Simultaneous multithreading (SMT)
Single-instruction multiple-data (SIMD)
Flexible configuration options
The z14 ZR1 processor is the next implementation of IBM Z servers to address the ever-changing IT environment.
3.2 Design highlights
The physical packaging of z14 ZR1 server is new, with one CPC drawer that fits the 19-inch form factor rack (frame). The CPC drawer has a modular design that uses higher density single chip modules (SCM) for processors and system controller. Higher chip density addresses thermal design complexity that is related to building systems with ever-increasing capacities. The modular CPC drawer design is flexible and expandable, which offers unprecedented capacity to meet consolidation needs.
The microprocessor of the z14 ZR1 uses the same design as the z14 M0x. The difference is the frequency: 4.5 GHz for the z14 ZR1 versus 5.2 GHz for the z14 M0x.
The Processor Unit (PU) SCMs are air-cooled (versus water-cooled for z14 M0x) because the lower frequency reduces the generated heat.
z14 ZR1 servers continue the line of mainframe processors that are compatible with an earlier version. The current evolution brings the following processor design enhancements:
The processor chip is designed with 10 cores, with 5, 6, 7, 8, or 9 active cores
Pipeline optimization
Improved SMT and SIMD
Better branch prediction
Improved co-processor functionality
The z14 ZR1 processor uses 24-bit, 31-bit, and 64-bit addressing modes, multiple arithmetic formats, and multiple address spaces for robust interprocess security.
The z14 ZR1 system design has the following main objectives:
Offer a data-centric approach to information (data) security that is simple, transparent, and consumable (extensive data encryption from inception to archive, in flight and at rest).
Offer a flexible infrastructure to concurrently accommodate a wide range of operating systems and applications, from the traditional systems (for example, z/OS and z/VM) to the world of Linux, cloud, analytics, and mobile computing.
Offer state-of-the-art integration capability for server consolidation by using virtualization capabilities in a highly secure environment:
 – Logical partitioning, which allows 40 independent logical servers (logical partitions).
 – z/VM, which can virtualize hundreds to thousands of servers as independently running virtual machines (guests).
 – HiperSockets, which implement virtual LANs between logical partitions (LPARs) within the system.
 – Efficient data transfer that uses direct memory access (SMC-D), Remote Direct Memory Access (SMC-R), and reduced storage access latency for transactional environments - zHyperLink Express.
 – The IBM Z PR/SM is designed for Common Criteria Evaluation Assurance Level 5+ (EAL 5+) certification for security, so an application that is running on one partition (LPAR) cannot access another application on a different partition (essentially the same security as an air-gapped separated system).
This configuration allows for a logical and virtual server coexistence and maximizes system use and efficiency by sharing hardware resources.
Offer high-performance computing to achieve the outstanding response times that are required by new workload-type applications. This performance is achieved by high-frequency, enhanced superscalar processor technology, out-of-order core execution, large high-speed buffers (cache) and memory, an architecture with multiple complex instructions, and high-bandwidth channels.
Offer the processing capacity and scalability that are required by the most demanding applications, from the single-system and clustered-systems points of view.
Offer the capability of concurrent upgrades for processors, memory, and I/O connectivity, which prevents system outages in planned situations.
Implement a system with high availability and reliability. These goals are achieved with redundancy of critical elements and sparing components of a single system, and the clustering technology of the Parallel Sysplex environment.
Include internal and external connectivity offerings, supporting open standards, such as Gigabit Ethernet (GbE) and Fibre Channel Protocol (FCP).
Provide leading cryptographic performance. Every processor unit (PU) includes a dedicated and optimized CP Assist for Cryptographic Function (CPACF).
Optional Crypto Express features with cryptographic coprocessors provide the highest standardized security certification.1 These optional features can also be configured as Cryptographic Accelerators to enhance the performance of Secure Sockets Layer/Transport Layer Security (SSL/TLS) transactions.
Be self-managing and self-optimizing by adjusting itself when the workload changes to achieve the best system throughput. This process can be done by using the Intelligent Resource Director or the Workload Manager functions, which are assisted by HiperDispatch.
Have a balanced system design with pervasive encryption, which provides large data rate bandwidths for high-performance connectivity along with processor and system capacity, while protecting every byte that enters and exits the z14 ZR1.
The remaining sections describe the z14 ZR1 system structure, showing a logical representation of the data flow from PUs, caches, memory cards, and various interconnect capabilities.
3.3 CPC drawer design
A z14 ZR1 system has one CPC drawer, with up to 34 PUs that can be characterized for customer use, and up to 8128 GB of customer-usable memory capacity. The CPC drawer is logically divided into two clusters to improve processor and memory affinity and availability.
The following types of CPC drawer configurations are available for the z14 ZR1 system:
One PU SCM (8 PUs), 2 PCIe Fanouts, up to 2 TB memory
Two PU SCMs (16 PUs), 4 PCIe Fanouts, up to 4 TB memory
Four PU SCMs (28 PUs), 8 PCIe Fanouts, up to 8 TB memory
Four PU SCMs (34 PUs), 8 PCIe Fanouts, up to 8 TB memory
Table 3-1 z14 ZR1 features (PU and memory)

Feature            CP max   IFL max   I/O fanouts   Memory
Max30 (FC 0639)    6        30        8             64 GB - 8 TB
Max24 (FC 0638)    6        24        8             64 GB - 8 TB
Max12 (FC 0637)    6        12        4             64 GB - 4 TB
Max4 (FC 0636)     4        4         2             64 GB - 2 TB
The z14 ZR1 has up to four memory controller units (MCUs). The configuration uses five-channel redundant array of independent memory (RAIM) protection, with dual inline memory modules (DIMM) bus cyclic redundancy check (CRC) error retry.
The higher-level caches (L3 and L4) of the cache hierarchy are implemented with embedded dynamic random access memory (eDRAM). Until recently, eDRAM was considered to be too slow for this use. However, a breakthrough in technology that was made by IBM eliminated that limitation. In addition, eDRAM offers higher density, less power utilization, fewer soft errors, and better performance.
z14 ZR1 servers use CMOS Silicon-on-Insulator (SOI) 14 nm chip technology, with advanced low latency pipeline design, which creates high-speed yet power-efficient circuit designs. The PU SCM has 17 layers of metal. For more information, see 2.8.1, “Considerations” on page 58.
3.3.1 Cache levels and memory structure
The z14 ZR1 memory subsystem focuses on keeping data “closer” to the PU core. With the current processor configuration, all on-chip cache levels were increased in size.
Although L1, L2, and L3 caches are implemented on the PU SCM, the fourth cache level (L4) is implemented within the system controller (SC) SCM. One L4 cache is present in each CPC drawer, which is shared by all PU SCMs. The cache structure of the z14 ZR1 has the following characteristics:
Larger L1, L2, and L3 caches (more data closer to the core).
L1 and L2 caches use static random access memory (SRAM), and are private for each PU core.
The L2-L3 interface has a new fetch-cancel protocol and revised L2 Least Recently Used (LRU) demote handling.
L3 cache uses eDRAM and is shared by all activated cores within the PU chip. The CPC drawer has up to four L3 caches, depending on the CPC drawer feature. Therefore, the Max24 and Max30 features have four L3 caches, which results in 512 MB (4 x 128 MB) of this shared PU chip-level cache. For availability and reliability, the L3 cache now implements symbol ECC.
L4 cache also uses eDRAM and is shared by all PU chips. The L4 cache is 672 MB, is inclusive of the L3 caches, is 42-way set associative, and has a 256-byte cache line size.
In most real-world situations, several cache lines exist in multiple L3s underneath L4. The L4 does not contain the same line multiple times, but rather once with an indication of all the cores that have a copy of that line. As such, 672 MB of inclusive L4 can easily cover 512 MB of underlying L3 caches.
Main storage has up to 8 TB addressable memory in the CPC drawer, which uses 20 DIMMs.
Considerations
Cache sizes are limited by ever-diminishing cycle times because they must respond quickly without creating bottlenecks. Access to large caches costs more cycles. Instruction and data cache (L1) sizes must be limited because larger distances must be traveled to reach long cache lines. L1 access generally must complete in one cycle to avoid increased latency.
Also, the distance to remote caches as seen from the microprocessor becomes a significant factor. Although the L4 cache is rather large, several cycles are needed to travel the distance to the cache. The node-cache topology of z14 ZR1 servers is shown in Figure 3-1.
Figure 3-1 z14 ZR1 cache topology
Although large caches mean increased access latency, the new technology of CMOS 14S0 (14 nm chip lithography) and the lower cycle time allows z14 ZR1 servers to increase the size of cache levels (L1, L2, and L3) within the PU chip by using denser packaging. This design reduces traffic to and from the shared L4 cache, which is on another chip (SC chip).
Only when a cache miss occurs in L1, L2, or L3 is a request sent to L4. L4 is the coherence manager, which means that all memory fetches must be in the L4 cache before that data can be used by the processor. However, in the z14 ZR1 cache design, some lines of the L3 cache are not included in the L4 cache.
The cache structure of z14 ZR1 servers is compared with the previous generation of IBM Z servers (z13s) in Figure 3-2.
Figure 3-2 z14 ZR1 and z13s cache levels comparison
Compared to z13s, the z14 ZR1 cache design has larger L1, L2, and L3 cache sizes. In z14 ZR1 servers, more affinity exists between the memory of a partition, the L4 cache in the SC (which is accessed by the two logical clusters in the same CPC drawer), and the cores in the PU.
The access time of the private cache often occurs in one cycle. The z14 ZR1 cache level structure is focused on keeping more data closer to the PU. This design can improve system performance on many production workloads.
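The effect of this cache structure on software can be illustrated with a short, hypothetical C sketch (the function names are illustrative only): the same arithmetic behaves differently depending on whether the access pattern stays within the private L1 and L2 caches or keeps forcing fetches from the shared L3 and L4 caches or from memory.

#include <stddef.h>

/* Illustrative only. Sequential access streams through the private L1/L2
 * caches; a large stride touches a new cache line on every reference and
 * pushes the working set out toward the shared L3/L4 caches or memory. */
long sum_sequential(const long *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

long sum_strided(const long *a, size_t n, size_t stride)
{
    long s = 0;
    for (size_t i = 0; i < n; i += stride)
        s += a[i];
    return s;
}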
HiperDispatch
To help avoid latency in a high-frequency processor design, PR/SM and the dispatcher must be prevented from scheduling and dispatching a workload on any processor available, which keeps the workload in as small a portion of the system as possible. The cooperation between z/OS and PR/SM is bundled in a function called HiperDispatch. HiperDispatch uses the z14 ZR1 cache topology, which features reduced cross-cluster “help” and better locality for multi-task address spaces.
PR/SM can use dynamic PU reassignment to move processors (CPs, ZIIPs, IFLs, ICFs, SAPs, and spares) to a different chip to improve the reuse of shared caches by processors of the same partition. For more information about HiperDispatch, see 3.7, “Logical partitioning” on page 97.
3.3.2 CPC drawer topology
The z14 ZR1 CPC drawer topology with the interconnection between CP and SC is shown in Figure 1-3 on page 7. The SC regulates coherent cluster-to-cluster traffic.
3.4 Processor unit design
Processor cycle time is especially important for processor-intensive applications. Current systems design is driven by processor cycle time, although improved cycle time does not automatically mean that the performance characteristics of the system improve.
Through innovative processor design (pipeline and cache management redesigns), IBM Z processor performance continues to evolve. With the introduction of out-of-order execution, ever-improving branch prediction mechanisms, and simultaneous multithreading, processing performance was enhanced beyond the slight frequency increase (the z13s core runs at 4.3 GHz).
z14 ZR1 core frequency is 4.5 GHz, which allows the increased number of processors that share larger caches to have quick processing times for improved capacity and performance. Although the cycle time of the z14 ZR1 processor frequency was only slightly increased (4%) compared to z13s, the processor performance was increased even further (z14 ZR1 uni-processor PCI up 10% compared to z13s) through improved processor design, such as pipeline enhancements, out-of-order execution design, branch prediction, time of access to high-speed buffers (caches), and the relative nest intensity (RNI) redesigns. For more information about RNI, see 12.4, “Relative Nest Intensity” on page 401.
z13s servers introduced architectural extensions with instructions that reduce processor quiesce effects, cache misses, and pipeline disruption, and increase parallelism with instructions that process several operands in a single instruction (SIMD). The processor architecture was further developed for z14 ZR1 and includes the following features:
Optimized second-generation SMT
Enhanced SIMD instructions set
Improved Out-of-Order core execution
Improvements in branch prediction and handling
Pipeline optimization
Enhanced branch prediction structure and sequential instruction fetching
The z14 ZR1 enhanced Instruction Set Architecture (ISA) includes a set of instructions that is added to improve compiled code efficiency. These instructions optimize PUs to meet the demands of various business and analytics workload types without compromising the performance characteristics of traditional workloads.
3.4.1 Simultaneous multithreading
Aligned with industry directions, z14 ZR1 servers can process up to two simultaneous threads in a single core while sharing certain resources of the processor, such as execution units, translation lookaside buffers (TLBs), and caches. When one thread in the core is waiting for other hardware resources, the second thread in the core can use the shared resources rather than remaining idle. This capability is known as simultaneous multithreading (SMT).
SMT is supported only for Integrated Facility for Linux (IFL) and IBM Z Integrated Information Processor (zIIP) specialty engines on z14 ZR1 servers, and it requires operating system support.
An operating system with SMT support can be configured to dispatch work to a thread on a zIIP (for eligible workloads in z/OS) or an IFL (for z/VM and Linux on Z) core in single thread or SMT mode so that HiperDispatch cache optimization can be considered. For more information about operating system support, see Chapter 7, “Operating system support” on page 209.
SMT technology allows instructions from more than one thread to run in any pipeline stage at a time. SMT can handle up to four pending translations.
Each thread has its own unique state information, such as the program status word (PSW) and registers. The simultaneous threads cannot necessarily run instructions instantly and must at times compete to use certain core resources that are shared between the threads. In some cases, threads can use shared resources that are not experiencing competition.
Two threads (A and B) that are running on the same processor core on different pipeline stages and sharing the core resources are shown in Figure 3-3.
Figure 3-3 Two threads running simultaneously on the same processor core
The use of SMT provides more efficient use of the processor’s resources and helps address memory latency, which results in overall throughput gains. The active threads share core resources in space (such as data and instruction caches, TLBs, and branch history tables) and in time (pipeline slots, execution units, and address translators).
Although SMT increases the processing capacity, the performance in some cases might be superior if a single thread is used. Enhanced hardware monitoring supports measurement through CPUMF for thread usage and capacity.
For workloads that need maximum thread speed, the partition’s SMT mode can be turned off. For workloads that need more throughput to decrease the dispatch queue size, the partition’s SMT mode can be turned on.
SMT use is functionally transparent to middleware and applications. No changes are required to run them in an SMT-enabled partition.
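The following minimal example shows the z/OS side of this enablement (the parameter names are taken from z/OS documentation and are shown for illustration only; verify them for the z/OS level in use). z/OS is switched to a core-oriented processor view in LOADxx, and the zIIP thread density is then selected in IEAOPTxx:

LOADxx:    PROCVIEW CORE
IEAOPTxx:  MT_ZIIP_MODE=2

With MT_ZIIP_MODE=1, each zIIP core runs a single thread; MT_ZIIP_MODE=2 activates the second thread on each zIIP core. The LPAR must also be enabled for SMT, as described above.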
3.4.2 Single-instruction multiple-data (enhanced for z14 ZR1)
The z14 ZR1 superscalar processor has 32 vector registers and an instruction set architecture that includes a subset of 139 new instructions (known as SIMD) that were added to improve the efficiency of complex mathematical models and vector processing. These new instructions allow many operands to be processed with a single instruction. The SIMD instructions use the superscalar core to process operands in parallel.
SIMD provides the next phase of enhancements of IBM Z analytics capability. The set of SIMD instructions is a type of data parallel computing and vector processing that can decrease the amount of code and accelerate code that handles integer, string, character, and floating point data types. The SIMD instructions improve performance of complex mathematical models and allow integration of business transactions and analytic workloads on IBM Z servers.
The 32 vector registers feature 128 bits. The 139 new instructions include string operations, vector integer operations, and vector floating point operations. Each register contains multiple data elements of a fixed size. The instruction code specifies which data format to use and the size of the elements:
Byte (16 8-bit operands)
Halfword (eight 16-bit operands)
Word (four 32-bit operands)
Doubleword (two 64-bit operands)
Quadword (one 128-bit operand)
The collection of elements in a register is called a vector. A single instruction operates on all of the elements in the register. Instructions include a non-destructive operand encoding, which allows, for example, the addition of register vector A and register vector B with the result placed in register vector C (C = A + B), preserving both source operands.
A schematic representation of a SIMD instruction with 16-byte size elements in each vector operand is shown in Figure 3-4.
Figure 3-4 Schematic representation of add SIMD instruction with 16 elements in each vector
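The following minimal C sketch (illustrative only; the function names are not part of any API) shows the work that one byte-wise vector add performs across all 16 elements of two 128-bit operands. On z14 ZR1, the entire loop corresponds to a single SIMD instruction that operates on vector registers.

#include <stdint.h>

/* What a single byte-element vector add does: 16 independent 8-bit
 * additions across two 128-bit operands, produced by one instruction. */
void vector_add_bytes(const uint8_t a[16], const uint8_t b[16], uint8_t c[16])
{
    for (int i = 0; i < 16; i++)
        c[i] = (uint8_t)(a[i] + b[i]);   /* element-wise; both sources preserved */
}

With a compiler that supports GNU vector extensions and targets the z14 vector facility (for example, GCC with -march=z14), the same operation can be written so that the compiler can emit a single vector add instruction:

typedef uint8_t v16u8 __attribute__((vector_size(16)));

v16u8 add16(v16u8 a, v16u8 b)
{
    return a + b;   /* candidate for a single 16-element vector add */
}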
The vector register file overlays the floating-point registers (FPRs), as shown in Figure 3-5. The FPRs use the first 64 bits of the first 16 vector registers, which saves hardware area and power, and makes it easier to mix scalar and SIMD codes. Effectively, the core gets 64 FPRs, which can further improve FP code efficiency.
Figure 3-5 Floating point registers overlaid by vector registers
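A simple conceptual model of this overlay (not hardware code; the type and field names are illustrative) is a union in which the leftmost 64 bits of a 128-bit vector register are the bits that legacy floating point instructions see as the FPR:

#include <stdint.h>

/* Conceptual model of one of vector registers 0-15: the first 64 bits
 * double as the legacy FPR, and the remaining 64 bits are used only by
 * vector (SIMD) instructions. */
typedef union {
    uint8_t byte[16];        /* full 128-bit vector register           */
    struct {
        double   fpr;        /* bits 0-63: visible as the legacy FPR   */
        uint64_t vector_low; /* bits 64-127: vector-only portion       */
    } view;
} vreg_overlay_model;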
SIMD instructions include the following examples:
Integer byte to quadword add, sub, and compare
Integer byte to doubleword min, max, and average
Integer byte to word multiply
String find 8-bits, 16-bits, and 32-bits
String range compare
String find any equal
String load to block boundaries and load/store with length
For most operations, the condition code is not set. A summary condition code is used only for a few instructions.
z14 ZR1 SIMD features the following enhancements (compared to z13s):
Doubled vector double precision Binary Floating Point (BFP) operations throughput (2x 64b)
Added vector single precision BFP (4x 32b)
Added vector quad precision BFP (128b)
Added binary Fixed Multiply Add (FMA) operations to speed up code
Vector Single Precision/ Double Precision/ Quad Precision (SP/DP/QP) compare/min/max with programming language support
Enhanced Storage-to-Storage Binary Coded Decimal (BCD) operations
Vector load/store right-most with length
3.4.3 Out-of-Order execution
z14 ZR1 servers have an Out-of-Order core, much like the z13s. Out-of-Order yields significant performance benefits for compute-intensive applications. It does so by reordering instruction execution, which allows later (younger) instructions to be run ahead of a stalled instruction, and reordering storage accesses and parallel storage accesses. Out-of-Order maintains good performance growth for traditional applications.
Out-of-Order execution can improve performance in the following ways (a short C sketch after this list illustrates the kind of instruction independence involved):
Reordering instruction execution
Instructions stall in a pipeline because they are waiting for results from a previous instruction or the execution resource that they require is busy. In an in-order core, this stalled instruction stalls all later instructions in the code stream. In an out-of-order core, later instructions are allowed to run ahead of the stalled instruction.
Reordering storage accesses
Instructions that access storage can stall because they are waiting on results that are needed to compute the storage address. In an in-order core, later instructions are stalled. In an out-of-order core, later storage-accessing instructions that can compute their storage address are allowed to run.
Hiding storage access latency
Many instructions access data from storage. Storage accesses can miss the L1 and require 7 - 50 more clock cycles to retrieve the storage data. In an in-order core, later instructions in the code stream are stalled. In an out-of-order core, later instructions that are not dependent on this storage data are allowed to run.
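The kind of independence that the core exploits can be seen in a short, contrived C fragment (illustrative only):

/* If table[i] misses in L1/L2/L3, an in-order core stalls every later
 * instruction until the data arrives. An out-of-order core keeps running
 * the independent computation of part2 while the miss is serviced. */
long demo(const long *table, long i, long x, long y)
{
    long part1 = table[i];     /* may wait many cycles on a cache miss      */
    long part2 = x * 7 + y;    /* independent: can execute ahead of part1   */
    return part1 + part2;      /* dependent: completes after both are ready */
}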
The z14 ZR1 processor includes pipeline enhancements that benefit Out-of-Order execution. The IBM Z processor design features advanced micro-architectural innovations that provide the following benefits:
Maximized instruction-level parallelism (ILP) for a better cycles per instruction (CPI) design.
Maximized performance per watt. Two cores are added (as compared to the z13/z13s chip) at only slightly higher chip power.
Enhanced instruction dispatch and grouping efficiency.
Increased OoO resources (Global Completion Table entries, physical GPR entries, and physical FPR entries).
Improved completion rate.
Reduced cache/TLB miss penalty.
Improved execution of D-Cache store and reload and new Fixed-point divide.
New Operand Store Compare (OSC) (load-hit-store conflict) avoidance scheme.
Enhanced branch prediction structure and sequential instruction fetching.
Program results
The Out-of-Order execution does not change any program results. Execution can occur out of (program) order, but all program dependencies are honored. The same results occur as in-order (program) execution.
This implementation requires special circuitry to make execution and memory accesses appear in order to the software. The logical diagram of a z14 ZR1 core is shown in Figure 3-6 on page 75.
Figure 3-6 z14 ZR1 PU core logical diagram
Memory address generation and memory accesses can occur out of (program) order. This capability can provide a greater use of the z14 ZR1 superscalar core, and can improve system performance.
The z14 ZR1 processor unit core is a superscalar, out-of-order, SMT processor with 10 execution units. Up to six instructions can be decoded per cycle, and up to 10 instructions or operations can be started to run per clock cycle.
The execution of the instructions can occur out of program order. Memory address generation and memory accesses can also occur out of program order. Each core has special circuitry to present execution and memory accesses in order to the software. This technology results in shorter workload runtime.
Branch prediction
If the branch prediction logic of the microprocessor makes the wrong prediction, all instructions in the parallel pipelines are removed. The wrong branch prediction is expensive in a high-frequency processor design. Therefore, the branch prediction techniques that are used are important to prevent as many wrong branches as possible.
For this reason, various history-based branch prediction mechanisms are used, as shown on the in-order part of the z14 PU core logical diagram in Figure 3-6 on page 75. The branch target buffer (BTB) runs ahead of instruction cache pre-fetches to prevent branch misses in an early stage. Also, a branch history table (BHT), in combination with a pattern history table (PHT) and the use of tagged multi-target prediction technology branch prediction, offers a high branch prediction success rate.
The z14 ZR1 microprocessor improves the branch prediction throughput by using the new branch prediction and instruction fetch front end.
3.4.4 Superscalar processor
A scalar processor is a processor that is based on a single-issue architecture, which means that only a single instruction is run at a time. A superscalar processor allows concurrent (parallel) execution of instructions by adding resources to the microprocessor in multiple pipelines, each working on its own set of instructions to create parallelism.
A superscalar processor is based on a multi-issue architecture. However, when multiple instructions can be run during each cycle, the level of complexity is increased because an operation in one pipeline stage might depend on data in another pipeline stage. Therefore, a superscalar design demands careful consideration of which instruction sequences can successfully operate in a long pipeline environment.
On z14 ZR1 servers, up to six instructions can be decoded per cycle and up to 10 instructions or operations can be in execution per cycle. Execution can occur out of (program) order. These improvements also make possible the simultaneous execution of two threads in the same processor.
Many challenges exist in creating an efficient superscalar processor. The superscalar design of the PU made significant strides in avoiding address generation interlock situations. Instructions that require information from memory locations can suffer multi-cycle delays to get the needed memory content. Because high-frequency processors wait “faster” (spend processor cycles more quickly while idle), the cost of getting the information might become prohibitive.
3.4.5 Compression and cryptography accelerators on a chip
This section introduces the CPACF enhancements for z14 ZR1.
Coprocessor units
One coprocessor unit is available for compression and cryptography on each core in the chip. The compression engine uses static dictionary compression and expansion. The compression dictionary uses the L1-cache (instruction cache).
The cryptography engine is used for the CPACF, which offers a set of symmetric cryptographic functions for encrypting and decrypting clear key operations.
The coprocessors feature the following characteristics:
Each core has an independent compression and cryptographic engine.
The coprocessor was redesigned to support SMT operation and for throughput increase.
It is available to any processor type.
The owning processor is busy when its coprocessor is busy.
The location of the coprocessor on the chip is shown in Figure 3-7.
Figure 3-7 Compression and cryptography accelerators on a core in the chip
Compression enhancements
The compression features the following enhancements:
Huffman compression on top of CMPSC compression (embedded in dictionary, reuse of generators)
Order Preserving compression in B-Trees and other index structures
Faster expansion algorithms
Reduced overhead on short data
CPACF
CPACF accelerates the encrypting and decrypting of SSL/TLS transactions, virtual private network (VPN)-encrypted data transfers, and data-storing applications that do not require FIPS 140-2 level 4 security. The assist function uses a special instruction set for symmetrical clear key cryptographic encryption and decryption, and for hash operations. This group of instructions is known as the Message-Security Assist (MSA). For more information about these instructions, see z/Architecture Principles of Operation, SA22-7832.
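Applications normally benefit from CPACF transparently through standard libraries rather than by issuing MSA instructions directly. As a minimal sketch (assuming OpenSSL 1.1 or later; key and buffer handling are simplified), the portable EVP interface is all that is needed; on Linux on Z, libcrypto uses the CPACF clear-key instructions for AES and the SHA assists when the hardware provides them:

#include <openssl/evp.h>

/* AES-256-CBC encryption through the portable EVP interface. The
 * application code does not change to exploit the coprocessor. */
int encrypt_buf(const unsigned char key[32], const unsigned char iv[16],
                const unsigned char *in, int inlen, unsigned char *out)
{
    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    int len = 0, total = -1;

    if (ctx == NULL)
        return -1;
    if (EVP_EncryptInit_ex(ctx, EVP_aes_256_cbc(), NULL, key, iv) == 1 &&
        EVP_EncryptUpdate(ctx, out, &len, in, inlen) == 1) {
        total = len;
        if (EVP_EncryptFinal_ex(ctx, out + total, &len) == 1)
            total += len;          /* ciphertext length (out needs inlen + one block) */
        else
            total = -1;
    }
    EVP_CIPHER_CTX_free(ctx);
    return total;
}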
Crypto functions enhancements
The crypto functions include the following enhancements:
Reduced overhead on short data (hashing and encryption)
4x throughput for AES
Special instructions for elliptic curve crypto/RSA
New hashing algorithms; for example, SHA-3
Support for authenticated encryption (combined encryption and hashing; for example, AES-GCM)
True random number generator (for example, for session keys)
For more information about cryptographic functions on z14 ZR1 servers, see Chapter 6, “Cryptographic features” on page 173.
3.4.6 Decimal floating point accelerator
The decimal floating point (DFP) accelerator function is present on each of the microprocessors (cores) on the 10-core chip. Its implementation meets business application requirements for better performance, precision, and function.
Base 10 arithmetic is used for most business and financial computation. Floating point computation that is used for work that is typically done in decimal arithmetic involves frequent data conversions and approximation to represent decimal numbers. This process makes floating point arithmetic complex and error-prone for programmers who use it for applications in which the data is typically decimal.
Hardware DFP computational instructions provide the following features:
Data formats of 4, 8, and 16 bytes
An encoded decimal (base 10) representation for data
Instructions for running decimal floating point computations
An instruction that runs data conversions to and from the decimal floating point representation
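The rounding problem that is described earlier in this section, and that the DFP hardware avoids, can be shown in a few lines of C. The decimal part is a sketch that assumes a compiler with decimal floating point support (for example, GCC on Linux on Z, where the _Decimal types map to the DFP instructions):

#include <stdio.h>

int main(void)
{
    /* Binary floating point cannot represent 0.10 exactly, so repeated
     * addition drifts away from the intended decimal value. */
    double cents = 0.0;
    for (int i = 0; i < 10; i++)
        cents += 0.10;
    printf("double:     ten times 0.10 equals 1.0? %s\n",
           (cents == 1.0) ? "yes" : "no");          /* prints "no" */

#ifdef __STDC_DEC_FP__
    /* Decimal floating point holds exact base-10 values, and on z14 ZR1
     * these operations run on the DFP accelerator. */
    _Decimal64 dcents = 0.00DD;
    for (int i = 0; i < 10; i++)
        dcents += 0.10DD;
    printf("_Decimal64: ten times 0.10 equals 1.0? %s\n",
           (dcents == 1.00DD) ? "yes" : "no");      /* prints "yes" */
#endif
    return 0;
}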
Benefits of the DFP accelerator
The DFP accelerator offers the following benefits:
Avoids rounding issues, such as those issues that occur with binary-to-decimal conversions.
It better controls existing binary-coded decimal (BCD) operations.
Follows the standardization of the dominant decimal data and decimal operations in commercial computing, supporting the industry standardization (IEEE 754R) of decimal floating point operations. Instructions are added in support of the Draft Standard for Floating-Point Arithmetic - IEEE 754-2008, which is intended to supersede the ANSI/IEEE Standard 754-1985.
Allows COBOL programs that use zoned-decimal operations to use the z/Architecture DFP instructions.
z14 ZR1 servers have two DFP accelerator units per core, which improve the decimal floating point execution bandwidth. The floating point instructions operate on newly designed vector registers (32 new 128-bit registers).
z14 ZR1 servers include new decimal floating point-packed conversion facility support with the following benefits:
Reduces code path length because extra instructions to format conversion are no longer needed.
Operates packed data in memory by all decimal instructions without general-purpose registers, which were required only to prepare for decimal floating point packed conversion instruction.
Converting from packed can now force the input packed value to positive instead of requiring a separate OI, OILL, or load positive instruction.
Converting to packed can now force a positive zero result instead of requiring ZAP instruction.
Software support
DFP is supported in the following programming languages and products:
Release 4 and later of the High Level Assembler
C/C++, which requires z/OS V1R10 or later, with program temporary fixes (PTFs) for full support
Enterprise PL/I Release 3.7 and Debug Tool Release 8.1 or later
Java Applications that use the BigDecimal Class Library
SQL support as of Db2 Version 9 and later
3.4.7 IEEE floating point
Binary and hexadecimal floating-point instructions are implemented in z14 ZR1 servers. They incorporate IEEE standards into the system.
The z14 ZR1 core implements two other execution subunits for 2x throughput on BFP (single/double precision) operations (see Figure 3-6 on page 75).
The key point is that Java and C/C++ applications tend to use IEEE BFP operations more frequently than earlier applications. Therefore, the better the hardware implementation of this set of instructions, the better the performance of applications.
3.4.8 Processor error detection and recovery
The PU core uses a process called transient recovery as an error recovery mechanism. When an error is detected, the instruction unit tries the instruction again and attempts to recover the error. If the second attempt is unsuccessful (that is, a permanent fault exists), a relocation process is started that restores the full capacity by moving work to another PU core. Relocation under hardware control is possible because the R-unit has the full designed state in its buffer. PU error detection and recovery are shown in Figure 3-8.
Figure 3-8 PU core error detection and recovery
3.4.9 Branch prediction
Because of the ultra-high frequency of the PUs, the penalty for a wrongly predicted branch is high. Therefore, a multi-pronged strategy for branch prediction, based on gathered branch history that is combined with other prediction mechanisms, is implemented on each microprocessor.
The BHT implementation on processors provides a large performance improvement. Originally introduced on the IBM ES/9000 9021 in 1990, the BHT is continuously improved.
The BHT offers significant branch performance benefits. The BHT allows each PU core to take instruction branches that are based on a stored BHT, which improves processing times for calculation routines. In addition to the BHT, z14 ZR1 servers use the following techniques to improve the prediction of the correct branch to be run:
BTB
PHT
BTB data compression
The success rate of branch prediction contributes significantly to the superscalar aspects of z14 ZR1 servers. This success is because the architecture rules prescribe that, for successful parallel execution of an instruction stream, the correctly predicted result of the branch is essential.
The z14 ZR1 branch prediction includes the following enhancements over z13s:
Branch prediction search pipeline extended from five to six cycles to accommodate new predictors for increased accuracy/performance.
New predictors:
 – Perceptron (neural network direction predictor)
 – SSCRS (hardware-based super simple call-return stack)
Capacity increases:
 – Level 1 Branch Target Buffer (BTB1): increased from 1 K rows x 6 sets to 2 K rows x 4 sets
 – Level 2 Branch Target Buffer (BTB2): increased from 16 K rows x 6 sets to 32 K rows x 4 sets
Better power efficiency: Several structures were redesigned to maintain their accuracy while less power is used through smart access algorithms.
New static IBM IA® regions expanded from four to eight. To conserve space, prediction structures do not store full target addresses. Instead, they use the locality and limited ranges of “4gig regions” of virtual instruction addresses - IA(0:31).
3.4.10 Wild branch
When a bad pointer is used or when code overlays a data area that contains a pointer to code, a random branch results. This process causes several ABENDs, including 0C1 or 0C4. Random branches are difficult to diagnose because clues about how the system got there are not evident.
With the wild branch hardware facility (named Breaking Event Address Register - BEAR), the last address from which a successful branch instruction was run is kept. z/OS uses this information with debugging aids, such as the SLIP command, to determine from where a wild branch came. It can also collect data from that storage location. This approach decreases the number of debugging steps that are necessary when you want to know from where the branch came.
3.4.11 Translation lookaside buffer
The translation lookaside buffers (TLBs) in the instruction and data L1 caches use a secondary TLB to enhance performance.
The size of the TLB is kept as small as possible because of its short access time requirements and hardware space limitations. Because memory sizes increased significantly with the introduction of 64-bit addressing, a relatively smaller fraction of the working set can be represented in the TLB.
To increase the working set representation in the TLB without enlarging the TLB, large (1 MB) page and giant page (2 GB) support is available and can be used when appropriate. For more information, see “Large page support” on page 95.
With the enhanced DAT-2 (EDAT-2) improvements, the IBM Z servers support 2 GB page frames.
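On Linux on Z, an application can take advantage of large pages for a big buffer through the standard mmap interface, which reduces the number of TLB entries that the buffer needs. The following minimal sketch assumes that the administrator configured a large page pool (for example, through the vm.nr_hugepages sysctl); error handling is reduced to a fallback message:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* 64 MB backed by 1 MB large pages needs 64 TLB entries instead of
     * 16384 entries for 4 KB pages. */
    size_t len = 64UL * 1024 * 1024;
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (buf == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");   /* no large pages configured: use normal pages */
        return 1;
    }
    /* ... use buf ... */
    munmap(buf, len);
    return 0;
}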
z14 ZR1 TLB enhancements
The IBM z14 ZR1 switches to a logically tagged L1 directory and an inline TLB2. Each L1 cache directory entry contains the virtual address and address space control element (ASCE), so the TLB no longer must be accessed on an L1 cache hit. TLB2 is accessed in parallel with L2, which saves significant latency compared to a TLB1 miss.
The new translation engine allows up to four translations pending concurrently. Each translation step is ~2x faster, which helps level 2 guests.
3.4.12 Instruction fetching, decoding, and grouping
The superscalar design of the microprocessor allows for the decoding of up to six instructions per cycle and the execution of up to 10 instructions per cycle. Both execution and storage accesses for instruction and operand fetching can occur out of sequence.
Instruction fetching
Instruction fetching normally tries to get as far ahead of instruction decoding and execution as possible because of the relatively large instruction buffers that are available. In the microprocessor, smaller instruction buffers are used. The operation code is fetched from the I-cache and put in instruction buffers that hold prefetched data that is awaiting decoding.
Instruction decoding
The processor can decode up to six instructions per cycle. The result of the decoding process is queued and later used to form a group.
Instruction grouping
From the instruction queue, up to 10 instructions can be completed on every cycle. A complete description of the rules is beyond the scope of this publication.
The compilers and JVMs are responsible for selecting instructions that best fit with the superscalar microprocessor. They abide by the rules to create code that best uses the superscalar implementation. All IBM Z compilers and JVMs are constantly updated to benefit from new instructions and advances in microprocessor designs.
3.4.13 Extended Translation Facility
The z/Architecture instruction set includes instructions in support of the Extended Translation Facility. They are used in data conversion operations for Unicode data, which causes applications that are enabled for Unicode or globalization to be more efficient. These data-encoding formats are used in web services, grid, and on-demand environments in which XML and SOAP technologies are used. The High Level Assembler supports the Extended Translation Facility instructions.
3.4.14 Instruction set extensions
The processor supports the following instructions to support functions:
Hexadecimal floating point instructions for various unnormalized multiply and multiply add instructions.
Immediate instructions, including various add, compare, OR, exclusive-OR, subtract, load, and insert formats. The use of these instructions improves performance.
Load instructions for handling unsigned halfwords, such as those unsigned halfwords that are used for Unicode.
Cryptographic instructions, which are known as the MSA, offer the full complement of the AES, SHA-1, SHA-2, and DES algorithms. They also include functions for random number generation.
Extended Translate Facility-3 instructions, which are enhanced to conform with the current Unicode 4.0 standard.
Assist instructions that help eliminate hypervisor processor usage.
SIMD instructions, which allow the parallel processing of multiple elements in a single instruction.
3.4.15 Transactional Execution
The Transactional Execution (TX) capability, which is known in the industry as hardware transactional memory, runs a group of instructions atomically; that is, all of their results are committed or no result is committed. The execution is optimistic. The instructions are run, but previous state values are saved in a transactional memory. If the transaction succeeds, the saved values are discarded; otherwise, they are used to restore the original values.
The Transaction Execution Facility provides instructions, including declaring the beginning and end of a transaction, and canceling the transaction. TX is expected to provide significant performance benefits and scalability by avoiding most locks. This benefit is especially important for heavily threaded applications, such as Java.
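A minimal sketch of how software can use Transactional Execution (assuming GCC on Linux on Z, compiled with -mhtm; the builtin and constant names come from GCC's htmintrin.h, and a production routine would also examine the transaction diagnostic block and tune the retry policy):

#include <htmintrin.h>   /* GCC hardware transactional memory support for IBM Z */

static volatile int fallback_lock = 0;   /* 0 = free, 1 = held */

/* Optimistically update a shared counter inside a transaction. Reading the
 * fallback lock inside the transaction makes the transaction abort if
 * another thread is updating on the non-transactional path. */
void add_to_counter(volatile long *counter, long delta)
{
    for (int tries = 0; tries < 3; tries++) {
        if (__builtin_tbegin((void *)0) == _HTM_TBEGIN_STARTED) {
            if (fallback_lock != 0)
                __builtin_tabort(256);   /* lock holder active: abort and retry */
            *counter += delta;           /* speculative until the commit */
            __builtin_tend();            /* commit: all results appear at once */
            return;
        }
    }
    /* Fallback path: take the lock and update without a transaction. */
    while (__atomic_exchange_n(&fallback_lock, 1, __ATOMIC_ACQUIRE) != 0)
        ;                                /* spin until the lock is free */
    *counter += delta;
    __atomic_store_n(&fallback_lock, 0, __ATOMIC_RELEASE);
}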
3.4.16 Runtime Instrumentation
Runtime Instrumentation (RI) is a hardware facility for managed run times, such as the Java Runtime Environment (JRE). RI allows dynamic optimization of code generation as it is being run. It requires fewer system resources than the current software-only profiling, and provides information about hardware and program characteristics. RI also enhances JRE in making the correct decision by providing real-time feedback.
3.5 Processor unit functions
The PU functions are described in this section.
3.5.1 Overview
All PUs on a z14 ZR1 server are physically identical. When the system is started, one integrated firmware processor (IFP) is allocated from the pool of PUs that is available for the entire system. The other PUs can be characterized to specific functions (CP, IFL, ICF, zIIP, or SAP).
The function that is assigned to a PU is set by the Licensed Internal Code (LIC). The LIC is loaded when the system is started at power-on reset (POR) and the PUs are characterized.
Only characterized PUs include a designated function. Non-characterized PUs are considered spares. Order at least one CP, IFL, or ICF on a z14 ZR1 server.
This design brings outstanding flexibility to z14 ZR1 servers because any PU can assume any available characterization. The design also plays an essential role in system availability because PU characterization can be done dynamically, with no system outage.
For more information about software level support of functions and features, see Chapter 7, “Operating system support” on page 209.
Concurrent PU upgrades
Concurrent upgrades can be done by the LIC, which assigns a PU function to a previously non-characterized PU. The upgrade can be done concurrently through the following facilities:
Customer Initiated Upgrade (CIU) for permanent upgrades
On/Off Capacity on Demand (On/Off CoD) for temporary upgrades
Capacity BackUp (CBU) for temporary upgrades
Capacity for Planned Event (CPE) for temporary upgrades
If the PU SCMs in the CPC drawer have no available (unused) PUs, an upgrade results in a CPC feature upgrade (Max4 => Max12 => Max24 => Max30), which means the installation or replacement of PU SCMs (the maximum is four in the CPC drawer). This operation is not concurrent.
For more information about Capacity on Demand, see Chapter 8, “System upgrades” on page 281.
PU sparing
In the rare event of a PU failure, the failed PU’s characterization is dynamically and transparently reassigned to a spare PU. z14 ZR1 servers have one spare PU. PUs that are not characterized on a CPC configuration can also be used as extra spare PUs. For more information about PU sparing, see 3.5.9, “Sparing rules” on page 93.
PU pools
PUs that are defined as CPs, IFLs, ICFs, and zIIPs are grouped in their own pools from where they can be managed separately. This configuration significantly simplifies capacity planning and management for LPARs. The separation also affects weight management because CP and zIIP weights can be managed separately. For more information, see “PU weighting” on page 84.
All assigned PUs are grouped in the PU pool. These PUs are dispatched to online logical PUs. As an example, consider a z14 ZR1 server with 4 CPs, 2 IFLs, 2 zIIPs, and 1 ICF. This system has a PU pool of 9 PUs, called the pool width. Subdivision defines the following pools:
A CP pool of four CPs
An ICF pool of one ICF
An IFL pool of two IFLs
A zIIP pool of two zIIPs
PUs are placed in the pools in the following circumstances:
When the system is PORed
At the time of a concurrent upgrade
As a result of adding PUs during a CBU
Following a capacity on-demand upgrade through On/Off CoD or CIU
PUs are removed from their pools when a concurrent downgrade occurs as the result of the removal of a CBU. They are also removed through the On/Off CoD process and the conversion of a PU. When a dedicated LPAR is activated, its PUs are taken from the correct pools. This process is also the case when an LPAR logically configures a PU as on, if the width of the pool allows for it.
For an LPAR, logical PUs are dispatched from the supporting pool only. The logical CPs are dispatched from the CP pool, logical zIIPs from the zIIP pool, logical IFLs from the IFL pool, and the logical ICFs from the ICF pool.
PU weighting
Because CPs, zIIPs, IFLs, and ICFs have their own pools from where they are dispatched, they can be given their own weights. For more information about PU pools and processing weights, see the IBM Z Processor Resource/Systems Manager Planning Guide, SB10-7169.
3.5.2 Central processors
A central processor (CP) is a PU that uses the full z/Architecture instruction set. It can run z/Architecture-based operating systems (z/OS, z/VM, TPF, z/TPF, z/VSE, and Linux), the Coupling Facility Control Code (CFCC), and IBM zAware. Up to 30 PUs can be characterized as CPs, depending on the configuration.
The z14 ZR1 server can be started in LPAR (PR/SM) mode or in Dynamic Partition Manger (DPM) mode. For more information, see Appendix E, “IBM Dynamic Partition Manager” on page 451.
CPs are defined as dedicated or shared. Reserved CPs can be defined to an LPAR to allow for nondisruptive image upgrades. If the operating system in the LPAR supports the logical processor add function, reserved processors are no longer needed. Regardless of the installed CPC feature, an LPAR can have up to 170 logical CPs defined (the sum of active and reserved logical CPs). In practice, define no more CPs than the operating system supports.
All PUs that are characterized as CPs within a configuration are grouped into the CP pool. The CP pool can be seen on the Hardware Management Console (HMC) workplace. Any z/Architecture operating system, CFCCs, and appliances (Secure Service Container) can run on CPs that are assigned from the CP pool.
The z14 ZR1 server can be configured with 26 distinct capacity settings for each CP (156 subcapacity settings). The capacity settings for one CP are listed in Table 3-2.
Table 3-2 Capacity settings for one CP

CP capacity   Feature code
CP-A          1069
CP-B          1070
CP-C          1071
CP-D          1072
CP-E          1073
CP-F          1074
CP-G          1075
CP-H          1076
CP-I          1077
CP-J          1078
CP-K          1079
CP-L          1080
CP-M          1081
CP-N          1082
CP-O          1083
CP-P          1084
CP-Q          1085
CP-R          1086
CP-S          1087
CP-T          1088
CP-U          1089
CP-V          1090
CP-W          1091
CP-X          1092
CP-Y          1093
CP-Z          1094
Information about CPs in the remainder of this chapter applies to all CP capacity settings, unless indicated otherwise. For more information about granular capacity, see 2.7.5, “Model capacity identifier” on page 53.
3.5.3 Integrated Facility for Linux
An IFL is a PU that can be used to run Linux on Z, Linux guests on z/VM operating systems, and Secure Service Container (SSC). Up to 30 PUs can be characterized as IFLs, depending on the configuration. IFLs can be dedicated to a Linux, z/VM, or Secure Service Container LPAR, or can be shared by multiple Linux guests, z/VM LPARs, or SSC that are running on the same z14 server. Only z/VM, Linux on Z operating systems, appliances that are running in a Secure Service Container LPAR, and designated software products can run on IFLs. IFLs are orderable by using FC 1064.
IFL pool
All PUs that are characterized as IFLs within a configuration are grouped into the IFL pool. The IFL pool can be seen on the HMC workplace.
IFLs do not change the model capacity identifier of the z14 ZR1 server. Software product license charges that are based on the model capacity identifier are not affected by the addition of IFLs.
Unassigned IFLs
An IFL that is purchased but not activated is registered as an unassigned IFL (FC 1068). When the system is later upgraded with another IFL, the system recognizes that an IFL was purchased and is present.
3.5.4 Internal Coupling Facility
An Internal Coupling Facility (ICF) is a PU that is used to run the CFCC for Parallel Sysplex environments. Within the sum of all unassigned PUs in the CPC drawer, up to 30 ICFs can be characterized, depending on CPC drawer feature. However, the maximum number of ICFs that can be defined on a coupling facility LPAR is limited to 16. ICFs are orderable by using FC 1065.
ICFs exclusively run CFCC. ICFs do not change the model capacity identifier of the z14 server. Software product license charges that are based on the model capacity identifier are not affected by the addition of ICFs.
All ICFs within a configuration are grouped into the ICF pool. The ICF pool can be seen on the HMC workplace.
The ICFs can be used by coupling facility LPARs only. ICFs are dedicated or shared. ICFs can be dedicated to a CF LPAR, or shared by multiple CF LPARs that run on the same system. However, having an LPAR with dedicated and shared ICFs at the same time is not possible.
Coupling Thin Interrupts
With the introduction of Driver 15F (zEC12 and zBC12), the IBM z/Architecture provides a new thin interrupt class called Coupling Thin Interrupts. The capabilities that are provided by hardware, firmware, and software support the generation of coupling-related “thin interrupts” when the following situations occur:
On the coupling facility (CF) side:
 – A CF command or a CF signal (arrival of a CF-to-CF duplexing signal) is received by a shared-engine CF image.
 – The completion of a CF signal that was previously sent by the CF occurs (completion of a CF-to-CF duplexing signal).
On the z/OS side:
 – CF signal is received by a shared-engine z/OS image (arrival of a List Notification signal).
 – An asynchronous CF operation completes.
The interrupt causes the receiving partition to be dispatched by an LPAR, if it is not dispatched. This process allows the request, signal, or request completion to be recognized and processed in a more timely manner.
After the image is dispatched, “poll for work” logic in CFCC and z/OS can be used largely as is to locate and process the work. The new interrupt expedites the redispatching of the partition.
LPAR presents these Coupling Thin Interrupts to the guest partition; therefore, CFCC and z/OS both require interrupt handler support that can deal with them. CFCC also changes to relinquish control of the processor when all available pending work is exhausted, or when the LPAR undispatches it off the shared processor, whichever comes first.
CF processor combinations
A CF image can have one of the following combinations that are defined in the image profile:
Dedicated ICFs
Shared ICFs
Dedicated CPs
Shared CPs
Shared ICFs add flexibility. However, running only with shared coupling facility PUs (ICFs or CPs) is not a preferable production configuration. It is preferable for a production CF to operate by using dedicated ICFs. With CFCC Level 19 (and later; z14 ZR1 servers run CFCC level 22), Coupling Thin Interrupts are available, and dedicated engines continue to be recommended to obtain the best coupling facility performance.
The CPC on the left side of Figure 3-9 has two environments that are defined (production and test), and each has one z/OS and one coupling facility image. The coupling facility images share an ICF.
Figure 3-9 ICF options: Shared ICFs
The LPAR processing weights are used to define the amount of processor capacity of each CF image. The capped option can also be set for a test CF image to protect the production environment.
Connections between these z/OS and CF images can use internal coupling links to avoid the use of real (external) coupling links, and achieve the best link bandwidth available.
Dynamic CF dispatching
The dynamic coupling facility dispatching function features a dispatching algorithm that you can use to define a backup CF in an LPAR on the system. When this LPAR is in backup mode, it uses few processor resources. When the backup CF becomes active, only the resources that are necessary to provide coupling are allocated.
CFCC Level 19 introduced Coupling Thin Interrupts and the new DYNDISP specification. It allows more environments with multiple CF images to coexist in a server, and to share CF engines with reasonable performance. For more information, see 3.9.3, “Dynamic CF dispatching” on page 113.
Coupling Facility Processor Scalability
CF work management and dispatcher changed to allow improved efficiency as processors are added to scale up the capacity of a CF image.
CF images support up to 16 processors. To obtain sufficient CF capacity, customers might be forced to split the CF workload across more CF images. However, this change brings more configuration complexity and granularity (more, smaller CF images, more coupling links, and logical CHPIDs to define and manage for connectivity, and so on).
To improve CF processor scaling for the customer’s CF images and to make effective use of more processors as the sysplex workload increases, CF work management and dispatcher provide the following improvements (z14 ZR1):
Non-prioritized (FIFO-based) work queues, which avoid the overhead of maintaining ordered queues in the CF.
Streamlined system-managed duplexing protocol, which avoids costly latching deadlocks that can occur between the primary and secondary structures.
“Functionally specialized” ICF processors that operate for CF images with dedicated processors defined under certain conditions that realize the following benefits:
 – One “functionally specialized” processor for inspecting suspended commands
 – One “functionally specialized” processor for pulling in new commands
 – The remaining processors are non-specialized for general CF request processing
 – Avoids many inter-processor contentions that were associated with CF dispatching
3.5.5 IBM Z Integrated Information Processor
An IBM Z Integrated Information Processor (zIIP)4 reduces the standard processor (CP) capacity requirements for z/OS Java, XML system services applications, and a portion of work of z/OS Communications Server and Db2 UDB for z/OS Version 8 or later, which frees up capacity for other workload requirements.
A zIIP enables eligible z/OS workloads to have a portion of their work directed to the zIIP. The zIIPs do not increase the MSU value of the processor and therefore do not affect the IBM software license charges.
z14 ZR1 processors support SMT. z14 ZR1 servers implement two threads per core on IFLs and zIIPs. SMT must be enabled at the LPAR level and supported by the z/OS operating system. SMT was enhanced for z14 ZR1 and it is enabled for SAPs by default (no customer intervention required).
How zIIPs work
zIIPs are designed for supporting designated z/OS workloads. One of the workloads is Java code execution. When Java code must be run (for example, under control of IBM WebSphere), the z/OS JVM calls the function of the zIIP. The z/OS dispatcher then suspends the JVM task on the CP that it is running on and dispatches it on an available zIIP. After the Java application code execution is finished, z/OS redispatches the JVM task on an available CP. After this process occurs, normal processing is resumed.
This process reduces the CP time that is needed to run Java WebSphere applications, which frees that capacity for other workloads.
The logical flow of Java code that is running on a z14 ZR1 server that has a zIIP available is shown in Figure 3-10. When JVM starts running a Java program, it passes control to the z/OS dispatcher that verifies the availability of a zIIP.
Figure 3-10 Logical flow of Java code execution on a zIIP
The availability is treated in the following manner:
If a zIIP is available (not busy), the dispatcher suspends the JVM task on the CP and assigns the Java task to the zIIP. When the task returns control to the JVM, it passes control back to the dispatcher. The dispatcher then reassigns the JVM code execution to a CP.
If no zIIP is available (all busy), the z/OS dispatcher allows the Java task to run on a standard CP. This process depends on the option that is used in the OPT statement in the IEAOPTxx member of SYS1.PARMLIB.
A zIIP runs only IBM authorized code. This IBM authorized code includes the z/OS JVM in association with parts of system code, such as the z/OS dispatcher and supervisor services. A zIIP cannot process I/O or clock comparator interruptions, and it does not support operator controls, such as IPL.
Java application code can run on a CP or a zIIP. The installation can manage the use of CPs so that Java application code runs only on CPs, only on zIIPs, or on both.
Two execution options for zIIP-eligible code execution are available. These options are user-specified in IEAOPTxx and can be dynamically altered by using the SET OPT command. The following options are supported for z/OS V1R10 and later releases:
Option 1: Java dispatching by priority (IIPHONORPRIORITY=YES)
This option is the default option and specifies that CPs must not automatically consider zIIP-eligible work for dispatching on them. The zIIP-eligible work is dispatched on the zIIP engines until Workload Manager (WLM) determines that the zIIPs are overcommitted. WLM then requests help from the CPs. When help is requested, the CPs consider dispatching zIIP-eligible work on the CPs themselves based on the dispatching priority relative to other workloads. When the zIIP engines are no longer overcommitted, the CPs stop considering zIIP-eligible work for dispatch.
This option runs as much zIIP-eligible work on the zIIPs as possible, and allows it to spill over onto the CPs only when the zIIPs are overcommitted.
Option 2: Java dispatching by priority (IIPHONORPRIORITY=NO)
zIIP-eligible work runs on zIIPs only while at least one zIIP engine is online. zIIP-eligible work is not normally dispatched on a CP, even if the zIIPs are overcommitted and CPs are unused. The exception is that zIIP-eligible work can sometimes run on a CP to resolve resource conflicts.
Therefore, zIIP-eligible work does not affect the CP utilization that is used for reporting through the subcapacity reporting tool (SCRT), no matter how busy the zIIPs are.
If zIIPs are defined to the LPAR but are not online, the zIIP-eligible work units are processed by CPs in order of priority. The system ignores the IIPHONORPRIORITY parameter in this case and handles the work as though it had no eligibility to zIIPs.
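For illustration, the following example shows how the option might be coded and switched dynamically. The member suffixes (01 and 02) are arbitrary examples, and other OPT statements are omitted; the IIPHONORPRIORITY parameter and the SET OPT command are the documented z/OS mechanisms described above.

In SYS1.PARMLIB member IEAOPT01 (example suffix):
   IIPHONORPRIORITY=YES

To switch to another prepared member, for example IEAOPT02 coded with IIPHONORPRIORITY=NO, without an IPL:
   SET OPT=02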
zIIPs provide the following benefits:
Potential cost savings.
Simplification of infrastructure as a result of the colocation and integration of new applications with their associated database systems and transaction middleware, such as Db2, IMS, or CICS. Simplification can happen, for example, by introducing a uniform security environment, and by reducing the number of TCP/IP programming stacks and system interconnect links.
Prevention of processing latencies that occur if Java application servers and their database servers are deployed on separate server platforms.
The following Db2 UDB for z/OS V8 or later workloads are eligible to run in Service Request Block (SRB) mode:
Query processing of network-connected applications that access the Db2 database over a TCP/IP connection by using IBM Distributed Relational Database Architecture™ (DRDA). DRDA enables relational data to be distributed among multiple systems. It is native to Db2 for z/OS, which reduces the need for more gateway products that can affect performance and availability. The application uses the DRDA requester or server to access a remote database. IBM Db2 Connect is an example of a DRDA application requester.
Star schema query processing, which is mostly used in Business Intelligence (BI) work. A star schema is a relational database schema for representing multidimensional data. It stores data in a central fact table and is surrounded by more dimension tables that hold information about each perspective of the data. For example, a star schema query joins various dimensions of a star schema data set.
Db2 utilities that are used for index maintenance, such as LOAD, REORG, and REBUILD. Indexes allow quick access to table rows, but over time, the databases become less efficient and must be maintained as data in large databases is manipulated.
The zIIP runs portions of eligible database workloads, which helps to free computer capacity and lower software costs. Not all Db2 workloads are eligible for zIIP processing. Db2 UDB for z/OS V8 and later gives z/OS the information to direct portions of the work to the zIIP. The result is that in every user situation, different variables determine how much work is redirected to the zIIP.
On a z14 server, the following workloads can also benefit from zIIPs:
z/OS Communications Server uses the zIIP for eligible Internet Protocol Security (IPSec) network encryption workloads. This configuration requires z/OS V1R10 or later. Portions of IPSec processing use the zIIPs, specifically end-to-end encryption with IPSec. The IPSec function moves a portion of the processing from the general-purpose processors to the zIIPs. In addition, to run the encryption processing, the zIIP also handles the cryptographic validation of message integrity and IPSec header processing.
z/OS Global Mirror, formerly known as Extended Remote Copy (XRC), also uses the zIIP. Most z/OS Data Facility Storage Management Subsystem (DFSMS) system data mover (SDM) processing that is associated with z/OS Global Mirror can run on the zIIP. This configuration requires z/OS V1R10 or later releases.
The first IBM user of z/OS XML system services is Db2 V9. For Db2 V9 before the z/OS XML System Services enhancement, z/OS XML System Services non-validating parsing was partially directed to zIIPs when used as part of a distributed Db2 request through DRDA. This enhancement benefits Db2 V9 by making all z/OS XML System Services non-validating parsing eligible to zIIPs. This configuration is possible when processing is used as part of any workload that is running in enclave SRB mode.
z/OS Communications Server also allows the HiperSockets Multiple Write operation for outbound large messages (originating from z/OS) to be run by a zIIP. Application workloads that are based on XML, HTTP, SOAP, and Java, and traditional file transfer can benefit.
For BI, IBM Scalable Architecture for Financial Reporting provides a high-volume, high-performance reporting solution by running many diverse queries in z/OS batch. It can also be eligible for zIIP.
 
zIIP installation
One CP must be installed with or before any zIIP is installed. In z14 ZR1, the zIIP-to-CP ratio is 2:1, which means that up to 12 zIIPs can be characterized. The allowable number of zIIPs for each feature is listed in Table 3-3.
Table 3-3 Number of zIIPs per model
z14 ZR1 feature    Max4     Max12    Max24    Max30
Maximum zIIPs      0 - 2    0 - 8    0 - 12   0 - 12
zIIPs are orderable by using FC 1067. Up to two zIIPs can be ordered for each CP or marked CP configured in the system.
PUs that are characterized as zIIPs within a configuration are grouped into the zIIP pool. This configuration allows zIIPs to have their own processing weights, independent of the weight of parent CPs. The zIIP pool can be seen on the hardware console.
The number of permanent zIIPs plus temporary zIIPs cannot exceed twice the number of purchased CPs plus temporary CPs. Also, the number of temporary zIIPs cannot exceed the number of permanent zIIPs.
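These ordering rules can be summarized in a minimal sketch. The counts are hypothetical and this is only an illustration of the arithmetic, not an IBM configurator.

def ziip_order_is_valid(purchased_cps, temp_cps, perm_ziips, temp_ziips):
    """Illustrative check of the z14 ZR1 zIIP ordering rules described above."""
    if purchased_cps < 1:                                            # at least one CP before any zIIP
        return False
    if perm_ziips + temp_ziips > 2 * (purchased_cps + temp_cps):     # 2:1 zIIP-to-CP ratio
        return False
    if temp_ziips > perm_ziips:                                      # temporary zIIPs cannot exceed permanent zIIPs
        return False
    return True

print(ziip_order_is_valid(purchased_cps=4, temp_cps=0, perm_ziips=8, temp_ziips=0))  # True
print(ziip_order_is_valid(purchased_cps=2, temp_cps=0, perm_ziips=3, temp_ziips=4))  # False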
zIIPs and logical partition definitions
zIIPs are dedicated or shared, depending on whether they are part of an LPAR with dedicated or shared CPs. In an LPAR, at least one CP must be defined before zIIPs for that partition can be defined. The number of zIIPs that are available in the system is the number of zIIPs that can be defined to an LPAR.
 
LPAR: In an LPAR, as many zIIPs as are available can be defined together with at least one CP.
3.5.6 System assist processors
A system assist processor (SAP) is a PU that runs the channel subsystem LIC to control I/O operations. All SAPs run I/O operations for all LPARs. All models come with standard SAPs configured; the number of standard SAPs on the z14 ZR1 is two, regardless of the CPC drawer feature.
SAP configuration
A standard SAP configuration provides a well-balanced system for most environments. However, some application environments have high I/O rates, typically Transaction Processing Facility (TPF) environments. In this case, more SAPs can be ordered. Assigning more SAPs can increase the capability of the channel subsystem to run I/O operations.
Optional other orderable (extra) SAPs
The option to order more SAPs is available on all z14 ZR1 features (FC 1066). These extra SAPs increase the capacity of the channel subsystem to run I/O operations, which is suggested for TPF environments. In z14 ZR1 systems, the maximum number of optional (extra) orderable SAPs is two, regardless of the CPC drawer feature.
3.5.7 Reserved processors
Reserved processors are defined by PR/SM to allow for a nondisruptive capacity upgrade. Reserved processors are similar to spare logical processors, and can be shared or dedicated. Reserved PUs can be defined to an LPAR dynamically to allow for nondisruptive image upgrades.
Reserved processors can be dynamically configured online by an operating system that supports this function if enough unassigned PUs are available to satisfy the request. The PR/SM rules that govern logical processor activation remain unchanged.
By using reserved processors, you can define to an LPAR more logical processors than the number of CPs, IFLs, ICFs, and zIIPs that are available in the configuration. This definition makes it possible to nondisruptively configure more logical processors online after more CPs, IFLs, ICFs, and zIIPs are made available concurrently with one of the Capacity on-demand options.
The maximum number of reserved processors that can be defined to an LPAR depends on the number of logical processors that are defined. The maximum number of logical processors plus reserved processors is 170. If the operating system in the LPAR supports the logical processor add function, reserved processors are no longer needed.
Do not define more active and reserved processors than the operating system for the LPAR can support. For more information about logical processors and reserved processors and their definitions, see 3.7, “Logical partitioning” on page 97.
3.5.8 Integrated firmware processor
An integrated firmware processor (IFP) is allocated from the pool of PUs and is available for the entire system. Unlike other characterized PUs, the IFP is standard and its definition is not controlled by the client. It is a single PU that is dedicated solely to supporting the management and service operations of the following native Peripheral Component Interconnect Express (PCIe) features:
10GbE Remote Direct Memory Access (RDMA) over Converged Ethernet (RoCE) Express
25GbE and 10GbE RoCE Express2
zEnterprise Data Compression (zEDC) Express
IBM zHyperLink Express
Coupling Express Long Reach
The IFP is started at POR. The IFP supports Resource Group (RG) LIC to provide native PCIe I/O feature virtualization and service functions. For more information, see Appendix C, “Native Peripheral Component Interconnect Express” on page 419.
3.5.9 Sparing rules
On a z14 ZR1 system, one PU is reserved as a spare. The reserved spare is available to replace any characterized PU, whether it is a CP, IFL, ICF, zIIP, SAP, or IFP.
Systems with a failed PU for which no spare is available call home for a replacement. A system with a failed PU that is spared and requires an SCM to be replaced (referred to as a pending repair) can still be upgraded when sufficient PUs are available.
Transparent CP, IFL, ICF, zIIP, SAP, and IFP sparing
Depending on the feature, sparing of CP, IFL, ICF, zIIP, SAP, and IFP is transparent and does not require operating system or operator intervention.
With transparent sparing, the status of the application that was running on the failed processor is preserved. The application continues processing on a newly assigned CP, IFL, ICF, zIIP, SAP, or IFP (allocated to the spare PU) without client intervention.
Application preservation
If no spare PU is available, application preservation (z/OS only) is started. The state of the failing processor is passed to another active processor that is used by the operating system. Through operating system recovery services, the task is resumed successfully (in most cases, without client intervention).
Dynamic SAP and IFP sparing and reassignment
Dynamic recovery is provided if a failure of the SAP or IFP occurs. If the SAP or IFP fails, and if a spare PU is available, the spare PU is dynamically assigned as a new SAP or IFP. If no spare PU is available, and more than one CP is characterized, a characterized CP is reassigned as an SAP or IFP. In either case, client intervention is not required. This capability eliminates an unplanned outage and allows a service action to be deferred to a more convenient time.
3.6 Memory design
This section describes design and implementation considerations for the z14 ZR1 memory.
3.6.1 Overview
The z14 ZR1 system has only one CPC drawer. As such, not all memory upgrade scenarios are nondisruptive, and memory upgrades require a different approach than in multi-CPC drawer systems. With the z14 ZR1 plan ahead memory option, flexibility and high availability for memory upgrades can be achieved.
Concurrent memory upgrades are supported up to the physical memory that is installed.
z14 ZR1 servers can be configured with more physically installed memory than the initial client-available capacity. Memory upgrades within the physically installed capacity can be done concurrently by LIC, and no hardware changes are required. However, memory upgrades cannot be done through CBU or On/Off CoD.
Any other memory upgrade is disruptive. Therefore, plan ahead memory is important. With the plan ahead memory option, memory DIMMs are preinstalled to support a specified target planned memory size.
 
Note: The pre-planned (preinstalled, available for LICCC activation) memory amount cannot exceed 2 TB.
Physical memory upgrades require the CPC drawer to be removed and reinstalled after memory DIMMs are added or replaced. Because z14 ZR1 is a single CPC drawer system, physical memory upgrades are always disruptive.
When the total installed memory amount is larger than the customer usable memory required for a configuration, the LIC Configuration Control (LICCC) determines how much memory is used.
Memory allocation
When the system is activated (POR), PR/SM determines the total installed memory and the client-enabled memory. Later, during LPAR activation, PR/SM assigns and allocates memory to each partition according to its image profile.
Large page support
By default, page frames are allocated with a 4 KB size. z14 ZR1 servers also support large page sizes of 1 MB or 2 GB. The first z/OS release that supports 1 MB pages is z/OS V1R9. Linux on Z large pages support (1 MB) is available in SUSE Linux Enterprise Server 10 SP2 or later, Red Hat Enterprise Linux (RHEL) 5.2 or later, and Ubuntu 16.04 LTS or later.
The TLB reduces the amount of time that is required to translate a virtual address to a real address. This translation is done by the dynamic address translation (DAT) mechanism when it must find the correct page for the requested address space. Each TLB entry represents one page. As with other buffers or caches, lines are discarded from the TLB on a least recently used (LRU) basis.
The worst-case translation scenario is encountered when a TLB miss occurs and the segment table (which is needed to find the page table) and the page table (which is needed to find the entry for the particular page in question) are not in cache. This case involves two complete real memory access delays plus the address translation delay. Because the duration of a processor cycle is much shorter than the duration of a memory cycle, a TLB miss is relatively costly.
It is preferable to have addresses in the TLB. With 4 K pages, holding all of the addresses for 1 MB of storage takes 256 TLB lines. When 1 MB pages are used, it takes only one TLB line. Therefore, large page size users have a much smaller TLB footprint.
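The arithmetic behind this statement can be sketched as follows. The helper is purely illustrative; real TLB organization is more complex than a simple entry count.

def tlb_entries_needed(working_set_bytes, page_bytes):
    """Number of TLB entries needed to map a working set with a given page size."""
    return working_set_bytes // page_bytes

MB, GB = 1 << 20, 1 << 30
print(tlb_entries_needed(1 * MB, 4 * 1024))   # 256 entries with 4 KB pages
print(tlb_entries_needed(1 * MB, 1 * MB))     # 1 entry with a 1 MB page
print(tlb_entries_needed(2 * GB, 2 * GB))     # 1 entry with a 2 GB page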
Large pages allow the TLB to better represent a large working set and suffer fewer TLB misses by allowing a single TLB entry to cover more address translations.
Users of large pages are better represented in the TLB and are expected to see performance improvements in elapsed time and processor usage. These improvements occur because DAT and memory operations are part of processor busy time even though the processor waits for memory operations to complete without processing anything else in the meantime.
To overcome the processor usage that is associated with creating a 1 MB page, a process must run for some time. It also must maintain frequent memory access to keep the pertinent addresses in the TLB.
Short-running work does not overcome the processor usage. Short processes with small working sets are expected to receive little or no improvement. Long-running work with high memory-access frequency is the best candidate to benefit from large pages.
Long-running work with low memory-access frequency is less likely to maintain its entries in the TLB. However, when it does run, few address translations are required to resolve all of the memory it needs. Therefore, a long-running process can benefit even without frequent memory access.
Weigh the benefits of whether something in this category must use large pages as a result of the system-level costs of tying up real storage. A balance exists between the performance of a process that uses large pages and the performance of the remaining work on the system.
On z14 ZR1 servers, 1 MB large pages become pageable if Virtual Flash Memory5 is available and enabled. They are available only for 64-bit virtual private storage, such as virtual memory that is greater than 2 GB.
It is easy to assume that increasing the TLB size is a feasible option to deal with TLB-miss situations. However, this process is not as straightforward as it seems. As the size of the TLB increases, so does the processor usage that is involved in managing the TLB’s contents. Correct sizing of the TLB is subject to complex statistical modeling to find the optimal tradeoff between size and performance.
3.6.2 Main storage
Main storage comprises storage that is directly addressable by programs and storage that is not directly addressable by programs. Non-addressable storage includes the hardware system area (HSA).
Main storage provides the following functions:
Data storage and retrieval for PUs and I/O
Communication with PUs and I/O
Communication with and control of optional expanded storage
Error checking and correction
Main storage can be accessed by all processors, but cannot be shared between LPARs. Any system image (LPAR) must include a defined main storage size. This defined main storage is allocated exclusively to the LPAR during partition activation.
3.6.3 Hardware system area
The HSA is a reserved storage area that contains system LIC and configuration-dependent control blocks. On z14 ZR1 servers, the HSA has a fixed size of 64 GB and is not part of the client purchased memory.
The fixed size of the HSA eliminates planning for future expansion of the HSA because the hardware configuration definition (HCD) and input/output configuration program (IOCP) always reserves space for the following items:
Three channel subsystems (CSSs)
A total of 15 LPARs in the first two CSSs and 10 LPARs for the third CSS for a total of 40 LPARs per system
Subchannel set 0 with 63.75-K devices in each CSS
Subchannel set 1 with 64-K devices in each CSS
Subchannel set 2 with 64-K devices in each CSS
The HSA includes sufficient reserved space to allow for dynamic I/O reconfiguration changes to the maximum capability of the processor.
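As a rough illustration of why the HSA can be sized at a fixed value, the following sketch totals the subchannels (device numbers) that are always reserved according to the list above. Only device counts are computed; the storage consumed per subchannel is not stated here.

# Devices reserved per channel subsystem (CSS), from the list above.
SS0 = int(63.75 * 1024)   # subchannel set 0: 63.75-K devices
SS1 = 64 * 1024           # subchannel set 1: 64-K devices
SS2 = 64 * 1024           # subchannel set 2: 64-K devices
CSS_COUNT = 3

per_css = SS0 + SS1 + SS2
print(per_css)              # 196,352 devices reserved per CSS
print(CSS_COUNT * per_css)  # 589,056 devices reserved across all three CSSs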
3.6.4 Virtual Flash Memory
IBM Virtual Flash Memory (VFM) is the replacement for the Flash Express features that were available on the IBM zEC12, zBC12, IBM z13®, and IBM z13s. No application changes are required to change from IBM Flash Express to VFM.
On z14 ZR1, up to four Virtual Flash Memory features (FC 0614) can be ordered. One VFM feature has 512 GB on z14 ZR1.
3.7 Logical partitioning
The logical partitioning features are described in this section.
3.7.1 Overview
Logical partitioning is a function that is implemented by the PR/SM6. z14 ZR1 can be managed in standard or Dynamic Partition Manager (DPM) mode. DPM provides dynamic LPAR and resource management for the z14 ZR1 and uses a graphical, interactive interface to PR/SM.
HiperDispatch
PR/SM and z/OS work in tandem to use processor resources more efficiently. HiperDispatch is a function that combines the dispatcher actions and the knowledge that PR/SM has about the topology of the system.
Performance can be optimized by redispatching units of work to the same processor group, which keeps processes running near their cached instructions and data, and minimizes transfers of data ownership among processors in different PU SCMs.
The nested topology is returned to z/OS by the Store System Information (STSI) instruction. HiperDispatch uses the information to concentrate logical processors around shared caches (L3 at PU chip level, and L4 at drawer level), and dynamically optimizes the assignment of logical processors and units of work.
The z/OS dispatcher manages multiple queues, called affinity queues, with a target of eight processors per queue, which fits well onto a single PU chip. These queues are used to assign work to as few logical processors as are needed for an LPAR workload. Therefore, even if the LPAR is defined with many logical processors, HiperDispatch optimizes this number of processors to be near the required capacity.
 
Tip: z/VM V6.3 and later also support HiperDispatch, which is required for activating SMT. (z14 ZR1 supports z/VM V6.4 or newer.)
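The affinity-queue grouping that is described above can be illustrated with a minimal Python sketch. It is purely illustrative; the real z/OS dispatcher and PR/SM topology logic are far more involved, and the logical processor names are hypothetical.

def build_affinity_queues(logical_processors, target_size=8):
    """Group logical processors into affinity queues of up to target_size members."""
    return [logical_processors[i:i + target_size]
            for i in range(0, len(logical_processors), target_size)]

lps = [f"LP{n:02d}" for n in range(12)]       # an LPAR defined with 12 logical processors
for queue in build_affinity_queues(lps):
    print(queue)
# With HiperDispatch, work is concentrated on as few queues (and PU chips) as the
# workload actually needs, even though 12 logical processors are defined.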
Logical partitions
PR/SM enables z14 ZR1 servers to be started for a logically partitioned operation, supporting up to 40 LPARs. Each LPAR can run its own operating system image in any image mode, independently from the other LPARs.
An LPAR can be added, removed, activated, or deactivated at any time. Changing the number of LPARs is not disruptive and does not require a POR. Certain facilities might not be available to all operating systems because the facilities might include software corequisites.
Each LPAR has the following resources that are the same as a real CPC:
Processors
Known as logical processors, they can be defined as CPs, IFLs, ICFs, or zIIPs. They can be dedicated to an LPAR or shared among LPARs. When shared, a processor weight can be defined to provide the required level of processor resources to an LPAR. Also, the capping option can be turned on, which prevents an LPAR from acquiring more than its defined weight and limits its processor consumption.
LPARs for z/OS can have CP and zIIP logical processors. The two logical processor types can be defined as all dedicated or all shared. The zIIP support is available in z/OS.
The weight and number of online logical processors of an LPAR can be dynamically managed by the LPAR CPU Management function of the Intelligent Resource Director (IRD). These functions can be used to achieve the defined goals of this specific partition and of the overall system. The provisioning architecture of z14 ZR1 servers (as described in Chapter 8, “System upgrades” on page 281) adds another dimension to the dynamic management of LPARs.
PR/SM is enhanced to support an option to limit the amount of physical processor capacity that is used by an individual LPAR when a PU is defined as a general-purpose processor (CP) or an IFL that is shared across a set of LPARs.
This enhancement is designed to provide a physical capacity limit that is enforced as an absolute (versus relative) limit. It is not affected by changes to the logical or physical configuration of the system. This physical capacity limit can be specified in units of CPs or IFLs. The Change LPAR Controls and Customize Activation Profiles tasks on the HMC were enhanced to support this new function.
For the z/OS Workload License Charges (WLC) pricing metric and metrics that are based on it, such as Advanced Workload License Charges (AWLC), an LPAR defined capacity can be set. This defined capacity enables the soft capping function.
Workload charging introduces the capability to pay software license fees that are based on the processor use of the LPAR on which the product is running, rather than on the total capacity of the system. Consider the following points:
 – In support of WLC, the user can specify a defined capacity in millions of service units (MSUs) per hour. The defined capacity sets the capacity of an individual LPAR when soft capping is selected.
The defined capacity value is specified on the Options tab in the Customize Image Profiles window.
 – WLM keeps a four-hour rolling average of the processor usage of the LPAR. When the four-hour average processor consumption exceeds the defined capacity limit, WLM dynamically activates LPAR capping (soft capping). When the rolling four-hour average returns below the defined capacity, the soft cap is removed.
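The rolling-average behavior can be sketched as follows. The MSU samples, sampling interval, and defined capacity value are hypothetical, and the real WLM algorithm works on finer-grained service data; this is only an illustration of how the four-hour average drives soft capping on and off.

from collections import deque

DEFINED_CAPACITY_MSU = 100          # hypothetical defined capacity
WINDOW = 48                         # 4 hours of 5-minute samples

samples = deque(maxlen=WINDOW)      # rolling window of MSU-per-hour observations

def soft_cap_active(new_sample):
    """Track the 4-hour rolling average and report whether soft capping applies."""
    samples.append(new_sample)
    rolling_avg = sum(samples) / len(samples)
    return rolling_avg > DEFINED_CAPACITY_MSU

for msu in [80, 90, 120, 130, 95, 70]:
    print(msu, soft_cap_active(msu))   # capping switches on while the average exceeds 100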
For more information about WLM, see System Programmer's Guide to: Workload Manager, SG24-6472.
For more information about software licensing, see 7.8, “Software licensing” on page 277.
 
Weight settings: When defined capacity is used to define an uncapped LPAR’s capacity, carefully consider the weight settings of that LPAR. If the weight is much smaller than the defined capacity, PR/SM uses a discontinuous cap pattern to achieve the defined capacity setting. This configuration means PR/SM alternates between capping the LPAR at the MSU value that corresponds to the relative weight settings, and no capping at all. It is best to avoid this scenario and instead attempt to establish a defined capacity that is equal or close to the relative weight.
Memory
Memory (main storage) must be dedicated to an LPAR. The defined storage must be available during the LPAR activation; otherwise, the LPAR activation fails.
Reserved storage can be defined to an LPAR, which enables nondisruptive memory addition to and removal from an LPAR by using the LPAR dynamic storage reconfiguration (z/OS and z/VM). For more information, see 3.7.5, “LPAR dynamic storage reconfiguration” on page 106.
Channels
Channels can be shared between LPARs by including the partition name in the partition list of a channel-path identifier (CHPID). I/O configurations are defined by the IOCP or the HCD with the CHPID mapping tool (CMT). The CMT is an optional tool that is used to map CHPIDs onto physical channel IDs (PCHIDs). PCHIDs represent the physical location of a port on a card in a PCIe+ I/O drawer.
IOCP is available on the z/OS, z/VM, and z/VSE operating systems, and as a stand-alone program on the hardware console. For more information, see IBM Z Input/Output Configuration Program User’s Guide for ICP IOCP, SB10-7172-01.
HCD is available on the z/OS and z/VM operating systems. Consult the appropriate 3907DEVICE Preventive Service Planning (PSP) buckets before implementation.
Fibre Channel connection (FICON) channels can be managed by the Dynamic CHPID Management (DCM) function of the Intelligent Resource Director. DCM enables the system to respond to ever-changing channel requirements by moving channels from lesser-used control units to more heavily used control units, as needed.
Modes of operation
The modes of operation are listed in Table 3-4. All available mode combinations, including their operating modes and processor types, operating systems, and addressing modes, also are listed. Only the currently supported versions of operating systems are considered.
Table 3-4 z14 modes of operation
Image mode                 PU type                   Operating system                Addressing mode
z/Architecture (General)1  CP and zIIP               z/OS, z/VM                      64-bit
                           CP                        z/VSE, Linux on Z, z/TPF        64-bit
Coupling facility          ICF or CP                 CFCC                            64-bit
Linux only                 IFL or CP                 Linux on Z (64-bit), z/VM       64-bit
                                                     Linux on Z (31-bit)             31-bit
z/VM                       CP, IFL, zIIP, or ICF     z/VM                            64-bit
SSC2                       IFL or CP                 z/VSE Network Appliance3        64-bit

1 Formerly ESA/390 mode
2 Secure Service Container
3 More appliances to be announced and supported in the future
For more information about operating system support, see Chapter 7, “Operating system support” on page 209.
Logically partitioned mode
If the z14 ZR1 server runs in LPAR mode, each of the 40 LPARs can be defined to operate in one of the following image modes:
z/Architecture (General) mode to run the following systems:
 – A z/Architecture operating system, on dedicated or shared CPs
 – ESA/390 operating systems
 – A Linux on Z operating system, on dedicated or shared CPs
 – z/OS, on any of the following processor units:
 • Dedicated or shared CPs
 • Dedicated CPs and dedicated zIIPs
 • Shared CPs and shared zIIPs
 
zIIP usage: zIIPs can be defined to a z/Architecture mode or z/VM mode image, as listed in Table 3-4 on page 99. However, zIIPs are used only by z/OS. Other operating systems cannot use zIIPs, even if they are defined to the LPAR. z/VM1 supports real and virtual zIIPs to guest z/OS systems.

1 z/VM V6R4 or newer is supported on IBM z14 ZR1.
z/Architecture (General) mode is also used to run the z/TPF operating system on dedicated or shared CPs
Coupling facility mode, by loading the CFCC code into the LPAR that is defined as one of the following types:
 – Dedicated or shared CPs
 – Dedicated or shared ICFs
LINUX only mode to run the following systems:
 – A Linux on Z operating system, on either of the following types:
 • Dedicated or shared IFLs
 • Dedicated or shared CPs
 – A z/VM operating system, on either of the following types:
 • Dedicated or shared IFLs
 • Dedicated or shared CPs
z/VM mode to run z/VM on dedicated or shared CPs or IFLs, plus zIIPs and ICFs
SSC (Secure Service Container) mode, in which the LPAR can run on either of the following types:
 – Dedicated or shared CPs
 – Dedicated or shared IFLs
All LPAR modes, required characterized PUs, operating systems, and the PU characterizations that can be configured to an LPAR image are listed in Table 3-5. The available combinations of dedicated (DED) and shared (SHR) processors are also included. For all combinations, an LPAR also can include reserved processors that are defined, which allows for nondisruptive LPAR upgrades.
Table 3-5 LPAR mode and PU usage
LPAR mode                 PU type                    Operating systems                                       PUs usage
z/Architecture (General)  CPs                        z/Architecture operating systems (z/OS, z/VSE, z/TPF),  CPs DED or CPs SHR
                                                     Linux on Z
                          CPs and zIIPs              z/OS, z/VM (V6R3 and later for guest exploitation),     CPs DED or zIIPs DED, or CPs SHR or zIIPs SHR
                                                     ESA/390 operating systems1
Coupling facility         ICFs or CPs                CFCC                                                    ICFs DED or ICFs SHR, or CPs DED or CPs SHR
LINUX only                IFLs or CPs                Linux on Z, z/VM                                        IFLs DED or IFLs SHR, or CPs DED or CPs SHR
z/VM                      CPs, IFLs, zIIPs, or ICFs  z/VM (V6R4 and later)                                   All PUs must be SHR or DED
SSC2                      IFLs or CPs                IBM zAware, z/VSE Network Appliance                     IFLs DED or IFLs SHR, or CPs DED or CPs SHR

1 ESA/390 operating systems cannot be IPL’ed on z14 ZR1. Limited support for z/VM guests.
2 Secure Service Container
Dynamically adding or deleting a logical partition name
Dynamically adding or deleting an LPAR name is the ability to add or delete LPARs and their associated I/O resources to or from the configuration without a POR.
The extra channel subsystem and multiple image facility (MIF) image ID pairs (CSSID/MIFID) can be assigned later to an LPAR for use (or later removed). This process can be done through dynamic I/O commands by using the HCD. At the same time, required channels must be defined for the new LPAR.
 
Partition profile: Cryptographic coprocessors are not tied to partition numbers or MIF IDs. They are set up with Adjunct Processor (AP) numbers and domain indexes. These numbers are assigned to a partition profile of a specific name. The client assigns these AP numbers and domains to the partitions and continues to have the responsibility to clear them out when their profiles change.
Adding logical processors to a logical partition
Logical processors can be concurrently added to an LPAR by defining them as reserved in the image profile and later configuring them online to the operating system by using the appropriate console commands. Logical processors also can be concurrently added to a logical partition dynamically by using the Support Element (SE) “Logical Processor Add” function under the CPC Operational Customization task. This SE function allows the initial and reserved processor values to be dynamically changed. The operating system must support the dynamic addition of these resources. In z/OS, this support is available since Version 1 Release 10 (z/OS V1.10), while z/VM supports this addition since z/VM V5.4, and z/VSE since V4.3.
Adding a crypto feature to a logical partition
You can plan the addition of Crypto Express6S(5S) features to an LPAR on the crypto page in the image profile by defining the Cryptographic Candidate List, and the Usage and Control Domain indexes, in the partition profile. By using the Change LPAR Cryptographic Controls task, you can add crypto adapters dynamically to an LPAR without an outage of the LPAR. Also, dynamic deletion or moving of these features does not require pre-planning. Support is provided in z/OS, z/VM, z/VSE, Secure Service Container (based on appliance requirements), and Linux on Z.
LPAR dynamic PU reassignment
The system configuration is enhanced to optimize the PU-to-CPC drawer assignment of physical processors dynamically. The initial assignment of client-usable physical processors to PU SCMs can change dynamically to better suit the LPAR configurations that are in use.
Swapping of specialty engines and general processors with each other, with spare PUs, or with both, can occur as the system attempts to compact LPAR configurations into physical configurations that span the least number of PU SCMs.
LPAR dynamic PU reassignment can swap client processors of different types between PU SCMs. For example, reassignment can swap an IFL on PU SCM 1 with a CP on PU SCM 2. Swaps can also occur between PU SCMs within different PU clusters and can include spare PUs. The goal is to pack the LPAR onto fewer PU chips, based on the z14 ZR1 PU SCM topology. The effect of this process is evident in dedicated and shared LPARs that use HiperDispatch.
LPAR dynamic PU reassignment is transparent to operating systems.
LPAR group capacity limit (LPAR group absolute capping)
The group capacity limit feature allows the definition of a group of LPARs on a z14 ZR1 system, and limits the combined capacity usage by those LPARs. This process allows the system to manage the group so that the group capacity limits in MSUs per hour are not exceeded. To use this feature, you must be running z/OS V1.10 or later in all LPARs in the group.
PR/SM and WLM work together to enforce the capacity that is defined for the group and the capacity that is optionally defined for each individual LPAR.
LPAR absolute capping
Absolute capping is a logical partition control that was made available with zEC12 and is supported on z14 ZR1 servers. With this support, PR/SM and the HMC are enhanced to support a new option to limit the amount of physical processor capacity that is used by an individual LPAR when a PU is defined as a general-purpose processor (CP), zIIP, or an IFL processor that is shared across a set of LPARs.
Unlike traditional LPAR capping, absolute capping is designed to provide a physical capacity limit that is enforced as an absolute (versus relative) value that is not affected by changes to the virtual or physical configuration of the system.
Absolute capping provides an optional maximum capacity setting for logical partitions that is specified in the absolute processors capacity (for example, 5.00 CPs or 2.75 IFLs). This setting is specified independently by processor type (namely CPs, zIIPs, and IFLs) and provides an enforceable upper limit on the amount of the specified processor type that can be used in a partition.
Absolute capping is ideal for processor types and operating systems that the z/OS WLM cannot control. Absolute capping is not intended as a replacement for defined capacity or group capacity for z/OS, which are managed by WLM.
Absolute capping can be used with any z/OS, z/VM, or Linux on Z LPAR that is running on an IBM Z server. If specified for a z/OS LPAR, it can be used concurrently with defined capacity or group capacity management for z/OS. When used concurrently, the absolute capacity limit becomes effective before other capping controls.
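A simple sketch of how an absolute cap differs from a weight-derived share follows. The values are hypothetical, and PR/SM enforcement is continuous and far more sophisticated; the sketch only shows that the absolute cap is an upper bound that does not change when the weight-derived share changes.

def effective_limit(weight_share, absolute_cap=None):
    """Return the effective limit for one processor type: the absolute cap wins when it is lower."""
    if absolute_cap is None:
        return weight_share
    return min(weight_share, absolute_cap)

# An LPAR whose relative weight would allow 6.4 CPs, capped absolutely at 5.00 CPs:
print(effective_limit(weight_share=6.4, absolute_cap=5.00))   # 5.0
# The same idea for IFLs, capped at 2.75 IFLs:
print(effective_limit(weight_share=3.1, absolute_cap=2.75))   # 2.75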
Dynamic Partition Manager mode
Dynamic Partition Manager (DPM) is an IBM Z server operation mode that provides a simplified approach to create and manage virtualized environments, which reduces the barriers of its adoption for new and existing customers.
The implementation provides built-in integrated capabilities that allow advanced virtualization management on IBM Z servers. With DPM, you can use your Linux and virtualization skills while taking advantage of the full value of IBM Z hardware, robustness, and security in a workload optimized environment.
DPM provides facilities to define and run virtualized computing systems by using a firmware-managed environment that coordinates the physical system resources that are shared by the partitions. The partitions’ resources include processors, memory, network, storage, crypto, and accelerators.
DPM provides a new mode of operation for IBM Z servers that provides the following services:
Facilitates defining, configuring, and operating PR/SM LPARs in a similar way to how someone performs these tasks on another platform.
Lays the foundation for a general IBM Z new user experience.
DPM is not another hypervisor for IBM Z servers. DPM uses the PR/SM hypervisor infrastructure and provides an intelligent interface that allows customers to define, use, and operate the platform virtualization without IBM Z experience or skills. For more information about DPM, see Appendix E, “IBM Dynamic Partition Manager” on page 451.
3.7.2 Storage operations
In z14 ZR1 servers, memory can be assigned as main storage supporting up to 40 LPARs. Before you activate an LPAR, main storage must be defined to the LPAR. All installed storage can be configured as main storage. Each individual z/OS LPAR can be defined with a maximum of 4 TB of main storage. z/VM V6R4 supports 2 TB of main storage.
Main storage can be dynamically assigned to expanded storage and back to main storage as needed without a POR.
Memory cannot be shared between system images. It is possible to dynamically reallocate storage resources for z/Architecture LPARs that run operating systems that support dynamic storage reconfiguration (DSR). This process is supported by z/OS and z/VM. z/VM, in turn, virtualizes this support to its guests. For more information, see 3.7.5, “LPAR dynamic storage reconfiguration” on page 106.
Operating systems that run as guests of z/VM can use the z/VM capability of implementing virtual memory to guest virtual machines. The z/VM dedicated real storage can be shared between guest operating systems.
The z14 ZR1 storage allocation and usage possibilities, depending on the image mode, are listed in Table 3-6.
Table 3-6 Main storage definition and usage possibilities
Image mode                Architecture mode (addressability)   Maximum main storage: Architecture   Maximum main storage: z14 ZR1 definition
z/Architecture (General)  z/Architecture (64-bit)              16 EB                                4 TB
Coupling facility         CFCC (64-bit)                        1.5 TB                               1 TB
Linux only                z/Architecture (64-bit)              16 EB                                2 TB
z/VM                      z/Architecture (64-bit)              16 EB                                2 TB
SSC1                      z/Architecture (64-bit)              16 EB                                2 TB

1 Secure Service Container
The following modes are provided:
z/Architecture mode
In z/Architecture (General, formerly ESA/390 or ESA/390-TPF) mode, storage addressing is 64-bit, which allows for virtual addresses up to 16 exabytes (16 EB). The 64-bit architecture theoretically allows a maximum of 16 EB to be used as main storage. However, the current main storage limit for LPARs is 8 TB for z14 ZR1. The operating system that runs in z/Architecture mode must support the real storage. Currently, z/OS supports up to 4 TB7 of real storage (z/OS V2R1 and later releases).
CF mode
In CF mode, storage addressing is 64 bit for a CF image that runs at CFCC Level 12 or later. This configuration allows for an addressing range up to 16 EB. However, the current z14 ZR1 definition limit for CF LPARs is 1 TB of storage. The following CFCC levels are supported in a Sysplex with IBM z14 ZR1:
 – CFCC Level 23, available on z14 (Driver level 36)
 – CFCC Level 22, available on z14 ZR1 (Driver level 32)
 – CFCC Level 21, available on z13 and z13s (Driver Level 27)
 – CFCC Level 20, available for z13 servers with Driver Level 22
 
Restriction: z14 ZR1 does not support direct coupling connectivity to zEC12/zBC12 systems.
Expanded storage cannot be defined for a CF image. Only IBM CFCC can run in CF mode.
Linux only mode
In Linux only mode, storage addressing can be 31 bit or 64 bit, depending on the operating system architecture and the operating system configuration.
Only Linux and z/VM operating systems can run in Linux only mode. Linux on Z 64-bit distributions (SUSE Linux Enterprise Server 10 and later, Red Hat RHEL 5 and later, and Ubuntu 16.04 LTS and later) use 64-bit addressing and operate in z/Architecture mode. z/VM also uses 64-bit addressing and operates in z/Architecture mode.
z/VM mode
In z/VM mode, certain types of processor units can be defined within one LPAR. This feature increases flexibility and simplifies systems management by allowing z/VM to run the following tasks in the same z/VM LPAR:
 – Manage guests to operate Linux on Z on IFLs
 – Operate z/VSE and z/OS on CPs
 – Offload z/OS system software processor usage, such as Db2 workloads on zIIPs
 – Provide an economical Java execution environment under z/OS on zIIPs
Secure Service Container (SSC) mode
In SSC mode, storage addressing is 64 bit for an embedded product. This configuration allows for an addressing range up to 16 EB. However, the current z14 ZR1 definition limit for LPARs is 8 TB of storage (physical memory limit).
Currently, the z/VSE Network Appliance (available on z14 ZR1) runs in an SSC LPAR.
3.7.3 Reserved storage
Reserved storage can be optionally defined to an LPAR, which allows a nondisruptive image memory upgrade for this partition. Reserved storage can be defined to central and expanded storage, and to any image mode except CF mode.
An LPAR must define an amount of main storage and optionally (if not a CF image), an amount of expanded storage. Main storage and expanded storage can have the following storage sizes defined:
The initial value is the storage size that is allocated to the partition when it is activated.
The reserved value is additional storage capacity, beyond the initial storage size, that an LPAR can acquire dynamically. The reserved storage sizes that are defined to an LPAR do not need to be available when the partition is activated. They are predefined storage sizes that allow a storage increase, from an LPAR point of view.
Without the reserved storage definition, an LPAR storage upgrade is a disruptive process that requires the following steps:
1. Partition deactivation.
2. An initial storage size definition change.
3. Partition activation.
The extra storage capacity for an LPAR upgrade can come from the following sources:
Any unused available storage
Another partition that features released storage
A memory upgrade
A concurrent LPAR storage upgrade uses DSR. z/OS uses the reconfigurable storage unit (RSU) definition to add or remove storage units in a nondisruptive way.
z/VM V5R48 and later releases support the dynamic addition of memory to a running LPAR by using reserved storage. It also virtualizes this support to its guests. Removing storage from the z/VM LPAR is disruptive. Removing memory from a z/VM guest is not disruptive to the z/VM LPAR.
SLES 11 and later supports concurrent add and remove.
3.7.4 Logical partition storage granularity
Granularity of main storage for an LPAR depends on the largest main storage amount that is defined for initial or reserved main storage, as listed in Table 3-7.
Table 3-7 Logical partition main storage granularity (z14)
Largest main storage amount                 Main storage granularity
Main storage amount <= 512 GB               1 GB
512 GB < main storage amount <= 1 TB        2 GB
1 TB < main storage amount <= 2 TB          4 GB
2 TB < main storage amount <= 4 TB          8 GB
4 TB < main storage amount <= 8 TB          16 GB
LPAR storage granularity information is required for LPAR image setup and for z/OS RSU definition. LPARs are limited to a maximum size of 8 TB of main storage. However, the maximum amount of memory that is supported by z/OS V2.3 at the time of this writing is 4 TB; for z/VM V6R4 and V7R1, the limit is 2 TB.
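The granularity rule in Table 3-7 can be expressed as a small helper, with sizes in GB. This is only an illustrative restatement of the table, not a configuration tool.

def main_storage_granularity_gb(largest_main_storage_gb):
    """Return the LPAR main storage granularity (GB) per Table 3-7."""
    if largest_main_storage_gb <= 512:
        return 1
    if largest_main_storage_gb <= 1024:      # up to 1 TB
        return 2
    if largest_main_storage_gb <= 2048:      # up to 2 TB
        return 4
    if largest_main_storage_gb <= 4096:      # up to 4 TB
        return 8
    if largest_main_storage_gb <= 8192:      # up to 8 TB
        return 16
    raise ValueError("LPARs are limited to 8 TB of main storage")

print(main_storage_granularity_gb(768))   # 2 GB granularity
print(main_storage_granularity_gb(3072))  # 8 GB granularity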
3.7.5 LPAR dynamic storage reconfiguration
Dynamic storage reconfiguration on z14 ZR1 servers allows an operating system that is running on an LPAR to add (nondisruptively) its reserved storage amount to its configuration. This process can occur only if unused storage exists. This unused storage can be obtained when another LPAR releases storage, or when a concurrent memory upgrade occurs.
With dynamic storage reconfiguration, the unused storage need not be continuous.
When an operating system that is running on an LPAR assigns a storage increment to its configuration, PR/SM determines whether any free storage increments are available. PR/SM then dynamically brings the storage online.
PR/SM dynamically takes offline a storage increment and makes it available to other partitions when an operating system that is running on an LPAR releases a storage increment.
3.8 Intelligent Resource Director
Intelligent Resource Director (IRD) is a z14 ZR1 and IBM Z capability that is used by z/OS only. IRD is a function that optimizes processor and channel resource utilization across LPARs within a single IBM Z server.
This feature extends the concept of goal-oriented resource management. It does so by grouping system images that are on the same z14 ZR1 or Z servers that are running in LPAR mode, and in the same Parallel Sysplex, into an LPAR cluster. This configuration allows WLM to manage resources (processor and I/O) across the entire cluster of system images and not only in one single image.
An LPAR cluster is shown in Figure 3-11. It contains three z/OS images and one Linux image that is managed by the cluster. Included as part of the entire Parallel Sysplex is another z/OS image and a CF image. In this example, the scope over which IRD has control is the defined LPAR cluster.
Figure 3-11 IRD LPAR cluster example
IRD features the following characteristics:
IRD processor management
WLM dynamically adjusts the number of logical processors within an LPAR and the processor weight that is based on the WLM policy. The ability to move the processor weights across an LPAR cluster provides processing power where it is most needed, based on WLM goal mode policy.
The processor management function is automatically deactivated when HiperDispatch is active. However, the LPAR weight management function remains active with IRD with HiperDispatch.
For more information about HiperDispatch, see 3.7, “Logical partitioning” on page 97.
HiperDispatch manages the number of logical CPs in use. It adjusts the number of logical processors within an LPAR to achieve the optimal balance between CP resources and the requirements of the workload.
HiperDispatch also adjusts the number of logical processors. The goal is to map the logical processor to as few physical processors as possible. This configuration uses the processor resources more efficiently by trying to stay within the local cache structure. Doing so makes efficient use of the advantages of the high-frequency microprocessors, and improves throughput and response times.
Dynamic channel path management (DCM)
DCM moves FICON channel bandwidth between disk control units to address current processing needs. z14 ZR1 servers support DCM within a channel subsystem.
Channel subsystem priority queuing
This function on z14 ZR1 and Z servers allows the priority queuing of I/O requests in the channel subsystem and the specification of relative priority among LPARs. When running in goal mode, WLM sets the priority for an LPAR and coordinates this activity among clustered LPARs.
For more information about implementing LPAR processor management under IRD, see z/OS Intelligent Resource Director, SG24-5952.
3.9 Clustering technology
Parallel Sysplex is the clustering technology that is used with z14 ZR1 servers. The components of a Parallel Sysplex as implemented within the z/Architecture are shown in Figure 3-12, which is one of many possible Parallel Sysplex configurations.
Figure 3-12 Sysplex hardware overview
Figure 3-12 shows a z14 M0x system that contains multiple z/OS sysplex partitions and an internal CF (CF02), a z14 ZR1 system that contains a stand-alone CF (CF01), and a z13/z13s system that contains multiple z/OS sysplex partitions.
STP over coupling links provides time synchronization to all systems. The selection of the appropriate CF link technology (Integrated Coupling Adapter, Coupling Express Long Reach, 1x InfiniBand, or 12x InfiniBand) depends on the system configuration and the physical distance between systems. The ISC-3 coupling link is not supported since z13 servers. For more information about link technologies, see “Coupling links” on page 153.
 
Important: New for z14 ZR1, the z14 ZR1 supports only PCIe-based coupling technology (ICA SR and Coupling Express Long Reach). As a consequence, the z14 ZR1 cannot be connected to a zEC12 or zBC12 server directly; therefore, they cannot be part of the same Parallel Sysplex cluster.
Parallel Sysplex technology is an enabling technology that allows highly reliable, redundant, and robust IBM Z technology to achieve near-continuous availability. A Parallel Sysplex makes up one or more (z/OS) operating system images that are coupled through one or more Coupling Facilities. The images can be combined to form clusters.
A correctly configured Parallel Sysplex cluster maximizes availability in the following ways:
Continuous (application) availability: Changes can be introduced, such as software upgrades, one image at a time, while the remaining images continue to process work. For more information, see Parallel Sysplex Application Considerations, SG24-6523.
High capacity: 2 - 32 z/OS images in a sysplex.
Dynamic workload balancing: Because it is viewed as a single logical resource, work can be directed to any similar operating system image in a Parallel Sysplex cluster that has available capacity.
Systems management: The architecture provides the infrastructure to satisfy client requirements for continuous availability. It also provides techniques for achieving simplified systems management consistent with this requirement.
Resource sharing: Several base (z/OS) components use the CF shared storage. This configuration enables sharing of physical resources with significant improvements in cost, performance, and simplified systems management.
Single system image: The collection of system images in the Parallel Sysplex is displayed as a single entity to the operator, user, and database administrator. A single system image ensures reduced complexity from operational and definition perspectives.
N-1 support: Multiple hardware generations (normally three) are supported in the same Parallel Sysplex. This configuration provides for a gradual evolution of the systems in the Parallel Sysplex without having to change all of them simultaneously. Similarly, software support for multiple releases or versions is supported. However, a direct coupling connection between a z14 ZR1 and N-2 IBM Z servers (zEC12/zBC12) is not supported because the z14 ZR1 does not have InfiniBand coupling links.
Through state-of-the-art cluster technology, the power of multiple images can be harnessed to work together on common workloads. The IBM Z Parallel Sysplex cluster takes the commercial strengths of the platform to improved levels of system management, competitive price for performance, scalable growth, and continuous availability.
3.9.1 Coupling Facility Control Code
The LPAR that is running the Coupling Facility Control Code (CFCC) can be on z14, z14 ZR1, z13, z13s, zEC12, and zBC12 systems. For more information about CFCC requirements for supported systems, see “Coupling facility and CFCC considerations” on page 240.
 
Consideration: z14 ZR1 servers cannot coexist in the same sysplex with zEC12/zBC12 and previous systems. The introduction of z14 ZR1 servers into existing installations might require more planning.
CFCC Level 23
CFCC level 23 is delivered on the z14 ZR1 with driver level 36. CFCC Level 23 introduces the following enhancements:
Asynchronous cross-invalidate (XI) of CF cache structures. Requires PTF support for z/OS and explicit data manager support (Db2 V12 with PTFs):
 – Instead of performing XI signals synchronously on every cache update request that causes them, data managers can “opt in” for the CF to perform these XIs asynchronously (and then sync them up with the CF at or before transaction completion). Data integrity is maintained if all XI signals complete by the time transaction locks are released.
 – Results in faster completion of cache update CF requests, especially with cross-site distance that is involved.
 – Provides improved cache structure service times and coupling efficiency
Coupling Facility hang detect enhancements provide a significant reduction in failure scope and client disruption (CF-level to structure-level), with no loss of FFDC collection capability:
 – When a hang is detected, the CF confines the scope of the failure in most cases to “structure damage” for the single CF structure that the hung command was processing against, captures diagnostics with a non-disruptive CF dump, and continues operating without aborting or rebooting the CF image.
Coupling Facility ECR granular latching
 – With this support, most CF list and lock structure ECR processing no longer uses structure-wide latching. Instead, it serializes its execution by using the normal structure object latches that all mainline commands use.
 – Eliminates the performance degradation caused by structure-wide latching.
 – A small number of “edge conditions” in ECR processing still require structure-wide latching to be used to serialize them.
 – Cache structure ECR processing continues to require and use structure-wide latches for its serialization.
z14 ZR1 servers with CFCC Level 23 require z/OS V1R13 or later, and z/VM V6R4 or later for virtual guest coupling.
CFCC Level 22
CFCC level 22 is delivered on the z14 ZR1 servers with driver level D32. CFCC Level 22 introduces the following enhancements:
CF Enhancements:
 – CF structure encryption.
CF Structure encryption is transparent to CF-using middleware and applications, while CF users are unaware of and not involved in the encryption. All data and adjunct data that flows between z/OS and the CF is encrypted. The intent is to encrypt all data that might be sensitive.
Internal control information and related request metadata are not encrypted, including locks and lock structures.
z/OS generates the required structure-related encryption keys and does much of the key management automatically by using CFRM, with secure, protected keys (never clear keys). The secure keys are maintained in the CFRM couple data set.
CF Asynchronous Duplexing for Lock Structures:
 – New asynchronous duplexing protocol for lock structures:
 • z/OS sends command to primary CF only
 • Primary CF processes command and returns result
 • Primary CF forwards description of required updates to secondary CF
 • Secondary CF updates secondary structure instance asynchronously
 – Provided for lock structures only:
 • z/OS V2.2 SPE with PTFs for APAR OA47796
 • Db2 V12 with PTFs
 • Lock structures are the most performance-sensitive structures for duplexing
 – Benefit/Value:
 • Db2 locking receives performance similar to simplex operations
 • Reduces CPU and CF link overhead
 • Avoids the overhead of synchronous protocol handshakes on every update
 • Duplexing failover much faster than log-based recovery
 – Targeted at multi-site clients who run split workloads at distance, making the duplexing of lock structures across sites practical.
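The following Python sketch is a conceptual model of the asynchronous duplexing flow that is described above; it is not z/OS, CFCC, or Db2 code, and all names are hypothetical. The primary CF processes the lock request and returns the result immediately, while the description of the update is forwarded to the secondary CF and applied asynchronously, off the request path.

# Conceptual sketch only: asynchronous duplexing of a lock structure.
# All names are hypothetical; this is not z/OS or CFCC code.
import queue
import threading

forward_queue = queue.Queue()   # primary -> secondary update descriptions

class SecondaryCF:
    def __init__(self):
        self.locks = {}
        threading.Thread(target=self.apply_updates, daemon=True).start()

    def apply_updates(self):
        # Applies forwarded updates asynchronously, off the request path.
        while True:
            name, holder = forward_queue.get()
            self.locks[name] = holder
            forward_queue.task_done()

class PrimaryCF:
    def __init__(self):
        self.locks = {}

    def obtain_lock(self, name, holder):
        # 1. The primary CF processes the command ...
        self.locks[name] = holder
        # 2. ... forwards a description of the update to the secondary CF ...
        forward_queue.put((name, holder))
        # 3. ... and returns the result to z/OS without waiting for the secondary.
        return "granted"

secondary = SecondaryCF()
primary = PrimaryCF()

# z/OS sends the request to the primary CF only.
print(primary.obtain_lock("LOCK.A", holder="DB2A"))
forward_queue.join()            # for the demo only: wait until the secondary caught up
print(secondary.locks)

Contrast this flow with synchronous duplexing, where the request does not complete until both structure instances have been updated and both CFs have responded.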
CF Processor Scalability:
 – CF work management and dispatcher changes allow improved efficiency as processors are added to scale up the capacity of a CF image.
 – Functionally specialized ICF processors are used for CF images that have more than a threshold number of dedicated processors defined for them (see the sketch after this list):
 • One functionally specialized processor for inspecting suspended commands.
 • One functionally specialized processor for pulling in new commands.
 • The remaining processors are non-specialized for general CF request processing.
 – Avoids much of the inter-processor contention that was associated with CF dispatching.
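The following short Python sketch illustrates the dispatching idea only; it is not CFCC code, and the threshold value is an assumption chosen for the example (the actual threshold is not stated in this section). Above the threshold, two processors take the specialized roles that are listed above and the remainder perform general CF request processing.

# Conceptual sketch only: role assignment for a CF image's dedicated processors.
# The threshold value is an illustrative assumption, not a documented CFCC value.
SPECIALIZATION_THRESHOLD = 4    # hypothetical number of dedicated processors

def assign_roles(dedicated_processors):
    if dedicated_processors <= SPECIALIZATION_THRESHOLD:
        # Below the threshold, all processors do general CF request processing.
        return ["general"] * dedicated_processors
    # Above the threshold, two processors take the specialized roles.
    roles = ["inspect-suspended-commands", "pull-in-new-commands"]
    roles += ["general"] * (dedicated_processors - len(roles))
    return roles

for n in (2, 6):
    print(n, "dedicated processors:", assign_roles(n))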
Enable systems management applications to collect valid CF LPAR information through z/OS BCPii:
 – System Type (CFCC)
 – System Level (CFCC LEVEL)
 – Dynamic Dispatch settings to indicate the CF state (dedicated, shared, and thin interrupt), which are useful when investigating functional or performance problems
z14 ZR1 systems with CFCC Level 22 require z/OS V1R13 with PTFs or later, and z/VM V6R4 or later for guest virtual coupling.
To support an upgrade from one CFCC level to the next, different levels of CFCC can be run concurrently while the CF LPARs are running on different servers. CF LPARs that run on the same server share the CFCC level.
z14 ZR1 servers (CFCC level 22) can coexist in a sysplex with CFCC levels 20 and 21.
The CFCC is implemented by using the active wait technique. This technique means that the CFCC is always running (processing or searching for service) and never enters a wait state.
This setting also means that the CF Control Code uses all the processor capacity (cycles) that are available for the CF LPAR. If the LPAR that is running the CFCC includes only dedicated processors (CPs or ICFs), all processor capacity (cycles) can be used. However, this configuration can be an issue if the LPAR that is running the CFCC also includes shared processors. Therefore, enable dynamic dispatching on the CF LPAR.
Starting with CFCC Level 19 and Coupling Thin Interrupts, shared-processor CF can provide more consistent CF service time and acceptable usage in a broader range of configurations. For more information, see 3.9.3, “Dynamic CF dispatching” on page 113.
 
Performance consideration: Dedicated processor CF still provides the best CF image performance for production environments.
CF structure sizing changes are expected when moving from CFCC Level 17 (or earlier) to CFCC Level 20 or later. Review the CF structure size by using the CFSizer tool.
For more information about the recommended CFCC levels, see the current exception letter that is published on Resource Link (login required).
3.9.2 Coupling Thin Interrupts
CFCC Level 19 introduced Coupling Thin Interrupts to improve performance in environments that share CF engines. Although dedicated engines are preferable to obtain the best CF performance, Coupling Thin Interrupts can help facilitate the use of a shared pool of engines, which helps to lower hardware acquisition costs.
The interrupt causes a shared logical processor CF partition to be dispatched by PR/SM (if it is not already dispatched), which allows the request or signal to be processed in a more timely manner. The CF relinquishes control when work is exhausted or when PR/SM takes the physical processor away from the logical processor.
The use of Coupling Thin Interrupts is controlled by the new DYNDISP specification.
You can experience improved or more consistent CF response times when CFs with shared engines are used. This improvement can allow more environments with multiple CF images to coexist in a server and share CF engines with reasonable performance.
The response time for asynchronous CF requests can also be improved as a result of the use of Coupling Thin Interrupts on the z/OS host system, regardless of whether the CF is using shared or dedicated engines.
3.9.3 Dynamic CF dispatching
Dynamic CF dispatching uses the following process on a CF (a conceptual sketch follows the list):
1. If no work is available, the CF enters a timed wait state.
2. After the elapsed time, the CF wakes up to check whether any new work is available (that is, whether any requests are in the CF receiver buffer).
3. If no work exists, the CF sleeps again, for a longer period.
4. If new work is available, the CF enters the normal active wait until no more work is available. After all work is complete, the process starts again.
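The following Python sketch is a conceptual model of this polling loop; it is not CFCC code, and the sleep intervals and the work source are illustrative assumptions. The CF backs off with progressively longer sleeps while it is idle and returns to normal active wait while work is present.

# Conceptual sketch only: the dynamic CF dispatching polling loop.
# Durations and the work source are illustrative assumptions, not CFCC values.
import time
from collections import deque

receiver_buffer = deque(["request-1", "request-2"])   # stand-in for the CF receiver buffer

def dynamic_cf_dispatch(cycles=5, initial_sleep=0.01, max_sleep=0.08):
    sleep_time = initial_sleep
    for _ in range(cycles):
        if receiver_buffer:
            # New work: enter normal active wait until no more work remains.
            while receiver_buffer:
                print("processing", receiver_buffer.popleft())
            sleep_time = initial_sleep          # restart the back-off cycle
        else:
            # No work: sleep, then wake up and check again after a longer wait.
            print("no work; sleeping for", sleep_time, "seconds")
            time.sleep(sleep_time)
            sleep_time = min(sleep_time * 2, max_sleep)

dynamic_cf_dispatch()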
With the introduction of Coupling Thin Interrupt support, which is used only when the CF partition uses shared engines and the DYNDISP=THININTERRUPT parameter is specified, the CFCC code was changed to handle these interrupts correctly. CFCC was also changed to voluntarily relinquish control of the processor whenever it runs out of work to do. It relies on Coupling Thin Interrupts to dispatch the image again in a timely fashion when new work (or new signals) arrives at the CF to be processed.
This capability allows ICF engines to be shared by several CF images. In this environment, it provides faster and far more consistent CF service times. It can also provide performance that is reasonably close to dedicated-engine CF performance if the shared CF engines are not heavily used.
The introduction of thin interrupts allows a CF to run by using a shared processor while maintaining good performance. The shared engine is allowed to be undispatched when no more work exists, as in the past. The new thin interrupt now causes the shared processor to be dispatched when a command or duplexing signal is presented to the shared engine.
This function saves processor cycles and is an excellent option for a production backup CF or a test environment CF. It is activated by using the CFCC DYNDISP ON command.
The CPs can run z/OS operating system images and CF images. For software charging reasons, generally use only ICF processors to run CF images.
Dynamic CF dispatching is shown in Figure 3-13.
Figure 3-13 Dynamic CF dispatching (shared CPs or shared ICF PUs)
For more information about CF configurations, see Coupling Facility Configuration Options, GF22-5042.
3.10 Virtual Flash Memory
Flash Express is not supported on z14 ZR1. This feature was replaced by IBM Z Virtual Flash Memory (zVFM), FC 0614.
3.10.1 IBM Z Virtual Flash Memory overview
Virtual Flash Memory (VFM) is an IBM solution to replace the external zFlash Express feature with support that is based on main memory.
The “storage class memory” that is provided by Flash Express adapters is replaced with capacity that is allocated from main memory (VFM).
VFM is designed to help improve availability and handling of paging workload spikes when z/OS V2.1, V2.2, or V2.3 is running. With this support, z/OS is designed to help improve system availability and responsiveness by using VFM across transitional workload events, such as market openings and diagnostic data collection. z/OS is also designed to help improve processor performance by supporting middleware use of pageable large (1 MB) pages.
VFM can also be used in CF images to provide extended capacity and availability for workloads that use IBM WebSphere MQ Shared Queues structures. The use of VFM can help availability by reducing latency from paging delays that can occur at the start of the workday or during other transitional periods. It is also designed to eliminate delays that can occur when collecting diagnostic data during failures.
3.10.2 VFM feature
A VFM feature (FC 0614) provides 512 GB of memory on the z14 ZR1. A maximum of four VFM features can be ordered per z14 ZR1 server.
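As a quick check of the capacity arithmetic, the following sketch uses only the figures that are quoted in this section:

# VFM capacity on z14 ZR1, using the figures from this section.
VFM_FEATURE_SIZE_GB = 512          # FC 0614 provides 512 GB per feature
MAX_VFM_FEATURES = 4               # at most four features per z14 ZR1 server

max_vfm_gb = VFM_FEATURE_SIZE_GB * MAX_VFM_FEATURES
print(max_vfm_gb, "GB =", max_vfm_gb // 1024, "TB")   # 2048 GB = 2 TB

This 2 TB total matches the per-LPAR maximum that is referenced in the tip in 3.10.3.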
3.10.3 VFM administration
The VFM allocation and definition information for all partitions is viewed through the Storage Information panel under the Operational Customization panel.
 
Tip: For LPARs that require VFM, the maximum amount (2 TB for z14 ZR1) should be defined for every LPAR where a supporting OS might run, even if the initial amount is zero. Defining the maximum amount of supported VFM allows VFM to be added dynamically later without requiring an LPAR reactivation.
VFM is much simpler to manage (an HMC task), and no hardware repair and verify actions (no cables and no adapters) are needed. Also, because this feature is part of internal memory, VFM is protected by RAIM and ECC, and it can provide better performance because no I/O to an attached adapter occurs.
 
Note: Use cases for Flash Express did not change (for example, z/OS paging and CF shared queue overflow). Instead, they transparently benefit from the change in the hardware implementation. No plan-ahead option is available for VFM.
 

1 Federal Information Processing Standard (FIPS) 140-2, Security Requirements for Cryptographic Modules.
2 The CPC drawer always includes one System Controller Single Chip Module (SC SCM).
3 In addition to optional SMT support for zIIPs and IFLs, z14 introduced SMT as default for SAPs (not user controllable).
4 IBM z Systems Application Assist Processors (zAAPs) are not available on z14 ZR1 servers. A zAAP workload is dispatched to available zIIPs (zAAP on zIIP capability).
5 Virtual Flash Memory replaced IBM zFlash Express for z14. No carry forward of zFlash Express exists.
6 PR/SM: Processor Resource/Systems Manager.
7 1 TB for z/OS V1R13.
8 z14 ZR1 supports z/VM 6.4 or newer.