CICS TS for z/OS V5.3
The CICS Transaction Server (TS) for z/OS V5.3 release introduces a significant number of performance improvements. Included in the CICS V5.3 performance report are the following subject areas:
Key performance benchmarks that are presented as a comparison against the CICS TS V5.2 release.
An outline of improvements made regarding the threadsafe characteristics of the CICS run time.
Details of the changes that are made to performance-critical CICS initialization parameters, and the effect of these updates.
Description of all the updated statistics and monitoring fields.
Benchmarks that document improvements in XML and JavaScript Object Notation (JSON) web services.
A description of how CICS can protect itself from unconstrained resource demand from inbound HTTP requests.
High-level views of new functionality that was introduced in the CICS V5.3 release, including performance benchmark results where appropriate.
This chapter includes the following topics:
7.1 Introduction
When the results were compiled for this chapter, the workloads were run on an IBM z13™ model NE1 (machine type 2964). A maximum of 32 dedicated central processors (CPs) were available on the measured logical partition (LPAR), with a maximum of 4 dedicated CPs available to the LPAR that was used to simulate users. These LPARs are configured as part of a Parallel Sysplex. An internal coupling facility was co-located on the same central processor complex (CPC) as the measurement and driving LPARs. They were connected by using internal coupling peer (ICP) links. An IBM System Storage DS8870 (machine type 2424) was used to provide external storage.
This chapter presents the results of several performance benchmarks when run in a CICS TS for z/OS V5.3 environment. Unless otherwise stated in the results, the CICS V5.3 environment was the code that was available at general availability (GA) time. Several of the performance benchmarks are presented in the context of a comparison against CICS TS V5.2. The CICS TS V5.2 environment contained all PTFs that were issued before 10 March 2015. All LPARs used z/OS V2.1.
For more information about performance terms that are used in this chapter, see Chapter 1, “Performance terminology” on page 3. For more information about the test methodology that was used, see Chapter 2, “Test methodology” on page 11. For more information about the workloads that were used, see Chapter 3, “Workload descriptions” on page 19.
Where reference is made to an LSPR processor equivalent, the indicated machine type and model can be found in the Large Systems Performance Reference (LSPR) document. For more information about obtaining and using LSPR data, see 1.3, “Large Systems Performance Reference” on page 6.
7.2 Release-to-release comparisons
This section describes some of the results from a selection of regression workloads that are used to benchmark development releases of CICS TS. For more information about the use of regression workloads, see Chapter 3, “Workload descriptions” on page 19.
7.2.1 Data Systems Workload dynamic routing
The Data Systems Workload (DSW) dynamic routing workload is used in 7.6, “Low-level CICS optimizations” on page 117 to demonstrate several performance benefits that are combined to reduce the overall CPU cost per transaction. For more information about a comparison between CICS TS V5.2 and CICS TS V5.3 performance, see 7.6, “Low-level CICS optimizations” on page 117.
7.2.2 RTW threadsafe
This section presents the performance figures for the threadsafe variant of the Relational Transactional Workload (RTW), as described in 3.3, “Relational Transactional Workload” on page 23.
Table 7-1 lists the results of the RTW threadsafe workload that uses the CICS TS V5.2 release. Table 7-2 lists the same figures for the CICS TS V5.3 release.
Table 7-1 Performance results for CICS TS V5.2 with RTW threadsafe workload
ETR          CICS CPU     CPU per transaction (ms)
 333.49       45.83%      1.374
 499.64       68.29%      1.367
 713.32       98.79%      1.385
 996.24      138.84%      1.394
1241.42      173.42%      1.397
Table 7-2 Performance results for CICS TS V5.3 with RTW threadsafe workload
ETR          CICS CPU     CPU per transaction (ms)
 334.12       46.29%      1.385
 500.50       69.16%      1.382
 714.30       98.77%      1.383
 997.32      139.06%      1.394
1242.71      175.74%      1.414
The average CPU per transaction figure for CICS TS V5.2 is calculated to be 1.383 ms. The CICS TS V5.3 figure is calculated to be 1.392 ms. The difference between these two figures is 0.6%, which is within our measurement accuracy of ±1%; therefore, the performance of the two releases is considered to be equivalent.
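For example, the CICS TS V5.2 average is calculated from Table 7-1 as (1.374 + 1.367 + 1.385 + 1.394 + 1.397) ÷ 5 = 1.383 ms; the CICS TS V5.3 average is calculated in the same way from Table 7-2.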
These figures are shown in Figure 7-1.
Figure 7-1 Plot of CICS TS V5.2 and V5.3 performance results for RTW threadsafe workload
As shown in Figure 7-1 on page 110, the lines are straight, which indicates linear scaling as transaction throughput increases. The lines also are overlaid, which indicates equivalent performance when the releases are compared.
7.3 Improvements in threadsafety
All new CICS API commands in CICS V5.3 are threadsafe, and some system programming interface (SPI) commands were made threadsafe in this release. Specific functional areas were also improved to reduce task control block (TCB) switches.
7.3.1 Threadsafe API and SPI commands
The following new CICS API commands are threadsafe:
REQUEST PASSTICKET
CHANNEL commands:
 – DELETE CHANNEL
 – QUERY CHANNEL
The WRITE OPERATOR CICS API command was made threadsafe.
For more information about CICS API commands, see the “CICS command summary” topic in the IBM Knowledge Center at this website:
The following CICS SPI commands were made threadsafe:
INQUIRE RRMS
INQUIRE STORAGE
INQUIRE STREAMNAME
INQUIRE SUBPOOL
INQUIRE TASK LIST
INQUIRE TSPOOL
INQUIRE UOWENQ
PERFORM SECURITY REBUILD
PERFORM SSL REBUILD
ENQMODEL commands:
 – INQUIRE ENQMODEL
 – SET ENQMODEL
 – DISCARD ENQMODEL
JOURNALMODEL commands:
 – INQUIRE JOURNALMODEL
 – DISCARD JOURNALMODEL
JOURNALNAME commands:
 – INQUIRE JOURNALNAME
 – SET JOURNALNAME
 – DISCARD JOURNALNAME
TCLASS commands:
 – INQUIRE TCLASS
 – SET TCLASS
TCPIP commands:
 – INQUIRE TCPIP
 – SET TCPIP
TCPIPSERVICE commands:
 – INQUIRE TCPIPSERVICE
 – SET TCPIPSERVICE
 – DISCARD TCPIPSERVICE
TDQUEUE commands:
 – INQUIRE TDQUEUE
 – SET TDQUEUE
 – DISCARD TDQUEUE
TRANCLASS commands:
 – INQUIRE TRANCLASS
 – SET TRANCLASS
 – DISCARD TRANCLASS
TSMODEL commands:
 – INQUIRE TSMODEL
 – DISCARD TSMODEL
TSQUEUE / TSQNAME commands:
 – INQUIRE TSQUEUE / TSQNAME
 – SET TSQUEUE / TSQNAME
UOW commands:
 – INQUIRE UOW
 – SET UOW
WEB commands:
 – INQUIRE WEB
 – SET WEB
For more information about CICS SPI commands, see the “System commands” topic in the IBM Knowledge Center at this website:
7.3.2 Optimizations for SSL support
Several TCB switches were removed for inbound requests that use SSL. For more information about this and other improvements in CICS web support, see IBM CICS Performance Series: Web Services Performance in CICS TS V5.3, REDP-5322, which is available at this website:
7.3.3 Offloading authentication requests to open TCBs
RACF APAR OA43999 introduced the Enhanced Password Algorithm, which applies to z/OS V1.12, V1.13, and V2.1. This RACF APAR implements the following support:
Accept more special characters within passwords
Allow stronger encryption of passwords
Define users with a password phrase and no password
Expire a password without changing it
Clean up password history
For more information about the new function APAR, see the following IBM support website:
If the APARs are installed, CICS starts a new callable service IRRSPW00 for password authentication. This service is used for the following authentication operations:
Basic authentication requests
EXEC CICS VERIFY PASSWORD API command
EXEC CICS VERIFY PHRASE API command
EXEC CICS SIGNON API command
The IRRSPW00 service runs on open TCBs or switches to an L8 TCB, which reduces contention on the resource-owning (RO) TCB.
 
Note: The ability to perform authentication requests on an open TCB was also made available to CICS TS V4.2 in APAR PI21865, and CICS TS V5.1 and V5.2 in APAR PI21866.
7.4 Changes to system initialization parameters
Several performance-related CICS system initialization (SIT) parameters were changed in the CICS TS V5.3 release. This section describes the changes to the SIT parameters that have the most effect on CICS performance. All comparisons to previous limits or default values refer to CICS TS V5.2.
7.4.1 Storage protection
Storage protection (SIT parameter STGPROT) is now enabled by default. For more information about storage protection, see the “The storage protection global option” topic in the IBM Knowledge Center at this website:
7.4.2 Internal trace table size
The default size for the internal trace table (SIT parameter TRTABSZ) increased to 12 MB. For more information about the internal trace facility, see the “Internal trace table” topic in the IBM Knowledge Center at this website:
Storage for the internal trace table is allocated outside of any CICS DSA. In CICS releases since CICS TS V4.2, the internal trace table is allocated in 64-bit virtual storage.
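For reference, the following fragment shows how these two values can be set explicitly as SIT overrides. The fragment is illustrative only; TRTABSZ is specified in kilobytes, so 12288 KB corresponds to 12 MB.
STGPROT=YES,
TRTABSZ=12288,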
7.5 Enhanced instrumentation
The CICS TS V5.3 release continues the expansion of the information that is reported by the CICS monitoring and statistics components. This section describes the extra fields that are now available in the CICS monitoring and statistics SMF records.
For more information about changes in monitoring fields across a range of CICS releases, see the “Changes to CICS monitoring” topic in the IBM Knowledge Center at this website:
7.5.1 DFHCICS performance group
The number of named counter server GET requests (field NCGETCT) field was added to the DFHCICS performance group. This field shows the total number of requests to a named counter server to satisfy EXEC CICS GET COUNTER and EXEC CICS GET DCOUNTER API commands that are issued by the user task.
For more information about counters that are available in the DFHCICS performance group, see the “Performance data in group DFHCICS” topic in the IBM Knowledge Center at this website:
7.5.2 DFHTASK performance group
The dispatcher allocate pthread wait time (field DSAPTHWT) field was added to the DFHTASK performance group. This field shows the elapsed time that the transaction waited for a Liberty pthread to be allocated during links to Liberty programs.
For more information about counters that are available in the DFHTASK performance group, see the “Performance data in group DFHTASK” topic in the IBM Knowledge Center at this website:
7.5.3 DFHTEMP performance group
The following fields were added to the DFHTEMP performance group:
Number of shared temporary storage GET operations (field TSGETSCT)
Number of temporary storage GET requests from shared temporary storage that are issued by the user task.
Number of shared temporary storage PUT operations (field TSPUTSCT)
Number of temporary storage PUT requests to shared temporary storage that are issued by the user task.
The total temporary storage operations (field TSTOTCT) field in the DFHTEMP performance group was updated. This field is the sum of the temporary storage read queue (TSGETCT), read queue shared (TSGETSCT), write queue auxiliary (TSPUTACT), write queue main (TSPUTMCT), write queue shared (TSPUTSCT), and delete queue requests that are issued by the user task.
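That is: TSTOTCT = TSGETCT + TSGETSCT + TSPUTACT + TSPUTMCT + TSPUTSCT + delete queue requests.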
For more information about counters that are available in the DFHTEMP performance group, see the “Performance data in group DFHTEMP” topic in the IBM Knowledge Center at this website:
7.5.4 DFHWEBB performance group
The following fields were added to the DFHWEBB performance group:
JSON request body length (field WBJSNRQL)
For JSON web service applications, the JSON message request length.
JSON response body length (field WBJSNRPL)
For JSON web service applications, the JSON message response length.
For more information about counters that are available in the DFHWEBB performance group, see the “Performance data in group DFHWEBB” topic in the IBM Knowledge Center at this website:
7.5.5 Monitoring domain global statistics
The following fields were added to the collected monitoring domain statistics:
Total transaction CPU time (field MNGCPUT)
The total transaction CPU time that is accumulated for the CICS dispatcher managed TCB modes that are used by the transactions that completed during the interval.
Total transaction CPU time on CP (field MNGTONCP)
The total transaction CPU time on a standard processor that is accumulated by the CICS dispatcher managed TCB modes that are used by the transactions that completed during the interval.
Total transaction CPU offload on CP (field MNGOFLCP)
The total transaction CPU time that was consumed on a standard processor but was eligible for offload to a specialty processor (zIIP or zAAP), accumulated for the CICS dispatcher-managed TCB modes that are used by the transactions that completed during the interval.
A sample DFHSTUP report that contains the new fields is shown in Example 7-1.
Example 7-1 Sample CICS TS V5.3 DFHSTUP monitoring domain global statistics report fragment
 
Average user transaction resp time. . : 00:00:00.001256
Peak user transaction resp time . . . : 00:00:00.061583
Peak user transaction resp time at. . : 11/24/2015 22:25:58.7568
Total transaction CPU time. . . . . . : 00:00:14.192698
Total transaction CPU time on CP. . . : 00:00:14.192698
Total transaction CPU offload on CP . : 00:00:00.000000
For more information about monitoring domain statistics, see the “Monitoring domain: global statistics” topic in the IBM Knowledge Center at this website:
7.5.6 TCP/IP global statistics
The following fields were added to TCP/IP global statistics:
Performance tuning for HTTP connections (field SOG_SOTUNING)
Indicates whether performance tuning for HTTP connections occurs.
Socket listener has paused listening for HTTP connections (field SOG_PAUSING_HTTP_LISTENING)
Indicates whether the listener paused listening for HTTP connection requests because the number of tasks in the region reached the limit for accepting new HTTP connection requests.
Number of times socket listener notified at task accept limit (field SOG_TIMES_AT_ACCEPT_LIMIT)
The number of times the listener was notified that the number of tasks in the region reached the limit for accepting new HTTP connection requests.
Last time socket listener paused listening for HTTP connections (field SOG_TIME_LAST_PAUSED_HTTP_LISTENING)
The last time the socket listener paused listening for HTTP connection requests because the number of tasks in the region reached the limit for accepting new HTTP connection requests.
Region stopping HTTP connection persistence (field SOG_STOPPING_PERSISTENCE)
Indicates whether the region is stopping HTTP connection persistence because the number of tasks in the region exceeded the limit.
Number of times region stopped HTTP connection persistence (field SOG_TIMES_STOPPED_PERSISTENT)
The number of times the region took action to stop HTTP connection persistence because the number of tasks in the region exceeded the limit.
Last time stopped HTTP connection persistence (field SOG_TIME_LAST_STOPPED_PERSISTENT)
The last time the region took action to stop HTTP connection persistence because the number of tasks in the region exceeded the limit.
Number of persistent connections made non-persistent (field SOG_TIMES_MADE_NON_PERSISTENT)
The number of times a persistent HTTP connection was made non-persistent because the number of tasks in the region exceeded the limit.
Number of times disconnected an HTTP connection at max uses (field SOG_TIMES_CONN_DISC_AT_MAX)
The number of times a persistent HTTP connection was disconnected because the number of uses exceeded the limit.
For more information about performance tuning for HTTP connections and a sample DFHSTUP report, see 7.13, “HTTP flow control” on page 140. For more information about TCP/IP global statistics, see the “TCP/IP: Global statistics” topic in the IBM Knowledge Center at this website:
7.5.7 URIMAP global statistics
The direct attach count (field WBG_URIMAP_DIRECT_ATTACH) field was added to URIMAP global statistics. This field shows the number of requests that are processed by a directly attached user task.
The direct attach count statistics field was added in support of the web optimizations, as described in 7.7, “Web support and web service optimization” on page 121. For more information about URIMAP global statistics, see the “URIMAP definitions: Global statistics” topic in the IBM Knowledge Center at this website:
7.6 Low-level CICS optimizations
The CICS TS V5.3 release includes the following low-level optimizations that can provide a performance benefit to many workloads:
Use of the store clock fast (STCKF) hardware instruction that was introduced by the IBM System z9 processor.
Storage alignment of some key CICS control blocks to improve the interaction between the CICS TS run time and the hardware cache subsystem.
Use of hardware instructions to pre-fetch data into the processor cache, which reduces the number of CPU cycles that are wasted while waiting for data.
A reduction in lock contention through tuning the CICS Monitoring Facility algorithms.
More efficient algorithms that are used for multiregion operation (MRO) session management.
More tuning of other internal procedures.
These improvements in efficiency have particular benefit for CICS trace, CICS monitoring, and for MRO connections that have high session counts.
The remainder of this section describes the results of performance benchmarks that use the DSW workload. For this performance benchmark, two TOR regions were configured to dynamically route transactions to four AOR regions by using CICSPlex System Manager. Each AOR function-shipped file control requests to an FOR, where VSAM data is accessed in Local Shared Resources (LSR) mode. For more information about the workload, see 3.2, “Data Systems Workload” on page 20.
The following configurations were tested to show the relative benefits of the improvements in each of the monitoring, trace, and MRO session management components:
Monitoring and trace enabled
Monitoring disabled, trace enabled
Monitoring enabled, trace disabled
Monitoring and trace disabled
Monitoring and trace disabled with low numbers of MRO sessions
Comparisons are made between CICS TS V5.2 and CICS TS V5.3.
7.6.1 Monitoring and trace enabled
For this scenario, performance class monitoring was enabled by using MN=ON and MNPER=ON. Internal trace was enabled with INTTR=ON. All other trace-related SIT parameters used their default values. Figure 7-2 shows the benchmark results for this configuration that uses CICS TS V5.2 and V5.3.
Figure 7-2 DSW performance results with monitoring and trace enabled
The average CPU per transaction for CICS TS V5.2 was 0.702 ms, and the equivalent value for V5.3 was 0.643 ms. For this workload, a reduction of 0.059 ms per transaction represents a decrease of 8%.
The straight lines in the plot indicate that both configurations scale linearly as the transaction rate increases.
7.6.2 Monitoring disabled, trace enabled
This scenario extends the scenario that is described in 7.6.1, “Monitoring and trace enabled” on page 118 by disabling performance class monitoring. Performance class monitoring was disabled by using the SIT parameter MN=OFF. Internal trace was enabled by using INTTR=ON, and all other trace-related SIT parameters used their default values. Figure 7-3 on page 119 shows the results of the benchmark for CICS TS V5.2 and V5.3.
Figure 7-3 DSW performance results with monitoring disabled and trace enabled
Average CPU per transaction for CICS TS V5.2 was 0.625 ms, and the equivalent value for V5.3 was 0.593 ms. A reduction of 0.032 ms per transaction represents a decrease of 5% for this workload.
7.6.3 Monitoring enabled, trace disabled
In this scenario, the configuration is the mirror of the scenario that is described in 7.6.2, “Monitoring disabled, trace enabled” on page 118. Performance class monitoring was enabled by using MN=ON and MNPER=ON. Internal trace was disabled with INTTR=OFF and all other trace-related SIT parameters used their default values. Figure 7-4 shows the benchmark results for CICS TS V5.2 and V5.3.
Figure 7-4 DSW performance results with monitoring enabled and trace disabled
Average CPU per transaction for CICS TS V5.2 was 0.486 ms, and the equivalent value for V5.3 was 0.440 ms. A reduction of 0.046 ms per transaction represents a decrease of 9% for this workload.
7.6.4 Monitoring and trace disabled
In this scenario, performance class monitoring and trace were disabled. Performance class monitoring was disabled by using MN=OFF. Internal trace was disabled by setting INTTR=OFF and all other trace-related SIT parameters used their default values. Figure 7-5 shows the benchmark results for CICS TS V5.2 and V5.3.
Figure 7-5 DSW performance results with monitoring and trace disabled
Average CPU per transaction for CICS TS V5.2 was 0.447 ms, and the equivalent value for V5.3 was 0.428 ms. A reduction of 0.019 ms per transaction represents a decrease of 4% for this workload.
7.6.5 Monitoring and trace disabled with low numbers of MRO sessions
The final scenario isolates the performance improvements in CICS that are not directly related to monitoring, trace, or MRO session management. Performance class monitoring was disabled by using MN=OFF. Internal trace was disabled with INTTR=OFF and all other trace-related SIT parameters used their default values. All MRO connections were configured to have a minimal number of sessions defined. Figure 7-6 on page 121 shows the benchmark results for CICS TS V5.2 and V5.3.
Figure 7-6 DSW performance results with monitoring and trace disabled and low session count
Average CPU per transaction for CICS TS V5.2 was 0.438 ms, and the equivalent value for V5.3 was 0.431 ms. A reduction of 0.007 ms per transaction represents a decrease of 2% for this workload.
7.6.6 Low-level CICS optimizations conclusions
Each scenario demonstrated a reduction in CPU usage per transaction for this workload. Where a workload uses any combination of performance class monitoring, trace, or many MRO sessions, the benefits that are realized in CICS V5.3 can be significant.
Even workloads that do not use these facilities can achieve a reduction in CPU use, as described in 7.6.5, “Monitoring and trace disabled with low numbers of MRO sessions” on page 120.
7.7 Web support and web service optimization
In CICS TS V5.3, the pipeline processing of HTTP requests is streamlined so that an intermediate web attach task (CWXN transaction) is no longer required in most situations. Removing the intermediate web attach task reduces CPU and memory overheads for most types of SOAP and JSON-based HTTP CICS web services.
The socket listener task (CSOL transaction) is optimized to attach user transactions directly for fast-arriving HTTP requests. The web attach task is bypassed, which reduces the CPU time that is required to process each request.
There also is a benefit for inbound HTTPS requests, where SSL support is provided by the Application Transparent Transport Layer Security (AT-TLS) feature of IBM z/OS Communications Server. In CICS, TCPIPSERVICE resources define the association between ports and CICS services, including CICS web support. These resources can be configured as AT-TLS aware and obtain security information from AT-TLS.
Performance is also improved for HTTPS requests where SSL support is provided by CICS. Although these requests still require the CWXN transaction, the number of TCB change mode operations was reduced.
For more information about the CPU savings that were achieved for an HTTP web services workload in several configuration scenarios, see IBM CICS Performance Series: Web Services Performance in CICS TS V5.3, REDP-5322, which is available at this website:
7.8 Java workloads
Optimizations to the thread and TCB management mechanisms in CICS TS V5.3 provide a benefit to Java applications that are hosted in OSGi JVM servers, and WebSphere Application Server Liberty JVM servers.
This section presents a comparison between CICS TS V5.2 and V5.3 when Java workloads are run.
7.8.1 Java workload configuration
The hardware and software that was used for the benchmarks is described in 7.1, “Introduction” on page 109. The measurement LPAR was configured with three GCPs and one zIIP, which resulted in an LSPR equivalent processor of 2964-704. The driving LPAR was configured with three GCPs, which resulted in an LSPR equivalent processor of 2964-703.
To minimize variance in the performance results that might be introduced by the Just-In-Time compiler (JIT), the workload was run at a constant transaction rate for 20 minutes to provide a warm-up period. The request rate was increased every 5 minutes, with the mean CPU usage per request calculated by using the final minute of data from the 5-minute interval. CPU usage data was collected by using IBM z/OS Resource Measurement Facility (RMF).
All configurations used a single CICS region with one installed JVMSERVER resource with a configured maximum of 25 threads. CICS TS V5.2 and CICS TS V5.3 used Java 7.1 SR3 (64-bit) and IBM WebSphere Application Server Liberty V8.5.5.7.
 
Note: IBM WebSphere Application Server Liberty V8.5.5.7 support for CICS V5.1 and V5.2 is provided by CICS APAR PI50345.
For database access, all workload configurations accessed DB2 V10 by using the JDBC type 2 driver.
7.8.2 Java servlet workload
The Java servlet application is hosted in a CICS JVM server that uses the embedded IBM WebSphere Application Server Liberty server. The workload is driven through HTTP requests by using IBM Workload Simulator for z/OS, as described in section 2.4, “Driving the workload” on page 14. The servlet application accesses VSAM data by using the JCICS API and accesses DB2 by using the JDBC API. For more information about the workload, see 3.4, “Liberty servlet with JDBC and JCICS access” on page 24.
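To illustrate the style of application under test, the following sketch shows a minimal Liberty servlet that combines JDBC and JCICS access. The sketch is illustrative only: the JNDI name jdbc/sample, the SQL statement, and the temporary storage queue name DEMOQ are hypothetical, and the JCICS usage should be verified against the JCICS Javadoc for the release in use.
// Illustrative sketch of a Liberty servlet that combines JDBC and JCICS access.
// The JNDI name, SQL statement, and queue name are hypothetical examples.
import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.naming.InitialContext;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.sql.DataSource;
import com.ibm.cics.server.TSQ;

@WebServlet("/account")
public class AccountServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        try {
            // JDBC access to DB2 through a Liberty-defined data source (type 2 driver)
            DataSource ds = (DataSource) new InitialContext().lookup("jdbc/sample");
            String balance = "not found";
            try (Connection con = ds.getConnection();
                 PreparedStatement ps = con.prepareStatement(
                         "SELECT BALANCE FROM ACCOUNTS WHERE ID = ?")) {
                ps.setString(1, req.getParameter("id"));
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        balance = rs.getString(1);
                    }
                }
            }
            // JCICS access to CICS temporary storage from the same request
            TSQ auditQueue = new TSQ();
            auditQueue.setName("DEMOQ");
            auditQueue.writeItem(("Queried account " + req.getParameter("id")).getBytes());
            resp.getWriter().println("Balance: " + balance);
        } catch (Exception e) {
            throw new ServletException(e);
        }
    }
}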
Both configurations used the following JVM options:
-Xgcpolicy:gencon
-Xcompressedheap
-XXnosuballoc32bitmem
-Xmx200M
-Xms200M
-Xmnx60M
-Xmns60M
-Xmox140M
-Xmos140M
The results of the benchmark are shown in Figure 7-7.
Figure 7-7 Comparing overall CPU utilization for Java servlet workload with CICS TS V5.2 and V5.3
Figure 7-7 shows that the new thread management mechanism for the CICS Liberty JVM server reduces CPU costs and improves scalability, with V5.3 maintaining a consistent cost per request at higher request rates than V5.2.
The chart in Figure 7-8 presents the same data as Figure 7-7 on page 123, but broken into usage that is non-eligible for offload and usage that is eligible for offload to a zIIP engine.
Figure 7-8 Comparing offload-eligible CPU utilization for Java workload with CICS TS V5.2 and V5.3
The chart in Figure 7-8 shows better scalability for the non-eligible component of the CPU usage. The chart also shows that the overall reduction in CPU usage that is shown in Figure 7-7 on page 123 is achieved by reducing the amount of zIIP-eligible CPU.
7.8.3 Java OSGi workload
The Java OSGi workload is composed of several applications. This mixture of applications includes some of the JCICS sample applications as described in the “The JCICS example programs” topic in the IBM Knowledge Center at this website:
The CICS BUNDLE JDBC, “Hello World”, and temporary storage queue (TSQ) examples were modified to include Java programming to simulate extra business logic, such as creating and manipulating strings, generating random numbers, and performing mathematical operations on these numbers.
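The following sketch illustrates the kind of simulated business logic that was added: string manipulation, random number generation, and simple mathematical operations. The class and method names are hypothetical and are not part of the shipped JCICS samples.
// Hypothetical sketch of the simulated business logic that was added to the
// modified samples: string manipulation, random numbers, and arithmetic.
import java.util.Random;

public class SimulatedBusinessLogic {

    private static final Random RANDOM = new Random();

    // Build a record key by concatenating and padding string fragments
    public static String buildKey(String prefix, int sequence) {
        StringBuilder key = new StringBuilder(prefix);
        key.append('-').append(String.format("%08d", sequence));
        return key.toString();
    }

    // Generate random values and perform mathematical operations on them
    public static double score(int iterations) {
        double total = 0.0;
        for (int i = 0; i < iterations; i++) {
            double value = RANDOM.nextDouble() * 100.0;
            total += Math.sqrt(value) + Math.log1p(value);
        }
        return total / iterations;
    }
}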
The workload is driven by running CICS transactions at a simulated console by using IBM Workload Simulator for z/OS, as described in section 2.4, “Driving the workload” on page 14.
An overview of the workload is shown in Figure 7-9.
Figure 7-9 Overview of OSGi Java workload
Both configurations used the following JVM options:
-Xgcpolicy:gencon
-Xcompressedheap
-XXnosuballoc32bitmem
-Xmx100M
-Xms100M
-Xmnx70M
-Xmns70M
-Xmox30M
-Xmos30M
The benchmark results are shown in Figure 7-10.
Figure 7-10 Comparing overall CPU utilization for Java OSGi workload with CICS TS V5.2 and V5.3
The chart in Figure 7-10 shows a slight reduction in overall CPU usage per transaction because of the improved TCB management.
The chart in Figure 7-11 shows the same data as Figure 7-10, but broken into usage that is non-eligible for offload and usage that is eligible for offload to a zIIP engine.
Figure 7-11 Comparing offload-eligible CPU utilization for OSGi workload with CICS TS V5.2 and V5.3
Both configurations scale well, with the ratio of eligible to non-eligible work remaining consistent between the V5.2 and V5.3 releases.
 
7.9 Java 8 performance
Every new release of Java provides more scope for performance improvements, the magnitude of which depends on the application. This section describes the effects of varying the Java release within a CICS environment for various workloads.
A JVM server in CICS TS for z/OS V5.3 can use Java 7.0, Java 7.1, or Java 8 as the runtime environment. A single CICS region can host multiple JVM server instances, with a different Java runtime version used in each instance.
7.9.1 Improvements in Java 7.0, Java 7.1, and Java 8
Java 7.0 uses hardware instructions that were introduced in the IBM zEnterprise 196 (z196) and the IBM zEnterprise EC12 (zEC12) machines. When running on a zEC12, the JVM also uses the new transactional memory capabilities of the hardware.
Java 7.1 extends the zEC12 exploitation by using technologies, such as IBM z Systems Data Compression (zEDC) for zip acceleration. Java 7.1 SR3 introduces improved zIIP-offload characteristics, which can reduce cost for Java applications in CICS.
Java 8 introduces the use of hardware instructions that were introduced in the IBM z13 machine. Java 8 also uses technologies such as single instruction multiple data (SIMD) instructions and improved cryptographic performance through Crypto Express5S and the CP Assist for Cryptographic Function (CPACF).
The IBM Java Crypto Engine (JCE) in Java 8 SR1 automatically detects and uses an on-core hardware cryptographic accelerator that is available through the CPACF. It also uses the SIMD vector engine that is available in the IBM z13 to provide industry-leading security performance. CPACF instructions are used to accelerate the following cryptographic functions:
Symmetric key algorithms (AES, 3DES, and DES with CBC, CFB, and OFB modes)
Hashing (SHA1 and SHA2)
Optimized routines accelerate the popular P256 NIST Elliptic Curve (ECC) Public Key Agreement. SIMD instructions are used in these routines to further enhance performance.
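No application changes are required to benefit from this acceleration. The following illustrative snippet uses only standard JCE APIs; when it runs on Java 8 on an IBM z13, AES and SHA-2 operations such as these can be accelerated transparently by CPACF.
// Illustrative use of standard JCE APIs. The application code is unchanged;
// hardware acceleration through CPACF is applied transparently when available.
import java.security.MessageDigest;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class CryptoSample {
    public static void main(String[] args) throws Exception {
        byte[] data = "payload to protect".getBytes("UTF-8");

        // AES in CBC mode (a CPACF-accelerated symmetric algorithm)
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        SecretKey key = keyGen.generateKey();
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] encrypted = cipher.doFinal(data);

        // SHA-256 hashing (a CPACF-accelerated digest)
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);

        System.out.println("Encrypted length: " + encrypted.length
                + ", digest length: " + digest.length);
    }
}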
Java 8 SR2 also introduces the same improved zIIP-offload characteristics as seen in Java 7.1 SR3.
7.9.2 Java performance benchmarks in CICS
The following workloads were used to examine the behavior of Java applications in a CICS environment:
OSGi JVM server with a mixture of applications that use JDBC and JCICS calls to access DB2, VSAM data, and CICS temporary storage.
Liberty servlet application that uses JDBC and JCICS calls to access DB2 and VSAM data.
Liberty JSON-based web service that uses z/OS Connect.
For performance testing, the following Java runtime environment levels were used:
Java 7.0 SR9
Java 7.1 SR3
Java 8 SR2
7.9.3 Java 8 and OSGi applications
This workload uses the configuration as described in 7.8.3, “Java OSGi workload” on page 124. Several applications provide a mixture of operations, including JDBC access, VSAM access, string manipulation, and mathematical operations.
Figure 7-12 shows the average cost per transaction for each of the Java versions under test when the mixed OSGi application workload is run.
Figure 7-12 Comparing Java versions for OSGi JVM server workload
The chart shows a slight improvement in zIIP eligibility in Java 7.1 when compared to Java 7.0, but with no reduction in overall CPU per transaction.
Java 8 improves the Java 7.1 benchmark result by reducing the overall cost per transaction (from 1.40 ms to 1.29 ms) and reducing the amount of non-eligible CPU (from 0.73 ms to 0.68 ms). The improvements in the Java 8 environment are achieved by improvements to the JIT compiler and Java class library changes.
7.9.4 Java 8 and Liberty servlet applications
This workload uses the configuration as described in 3.4, “Liberty servlet with JDBC and JCICS access” on page 24. In all, 200 simulated web clients accessed the Java application at a rate of approximately 2,500 requests per second.
Figure 7-13 on page 129 shows the cost per request for each of the Java 7.0, Java 7.1, and Java 8 run times when the CICS Liberty servlet application is run.
Figure 7-13 Comparing Java versions for JDBC and JCICS servlet workload
No significant differences in total CPU per request are observed for this workload when comparing Java 7.0, Java 7.1, and Java 8. zIIP eligibility is slightly improved when Java 8 is used.
7.9.5 Java 8 and z/OS Connect applications
The z/OS Connect application that is described in 7.12, “z/OS Connect for CICS” on page 134 was used to compare the effects of the supported Java versions. A small JSON request and response was used, which contained 32 bytes of user data for each HTTP flow. The data was transmitted by using SSL with persistent connections.
The results of the benchmark comparing the three Java versions are shown in Figure 7-14.
Figure 7-14 Comparison of Java versions for a z/OS Connect workload
Java 7.1 provides a reduction in overall CPU per request by reducing the amount of non-eligible CPU that is used.
Java 8 further improves on the Java 7.1 result through a reduction in non-eligible and overall CPU cost for each request. The use of persistent SSL connections means that most of the performance improvements are achieved because of the increased AES performance.
As the transmitted document size increases, the SSL payload size increases. Increasing the size of the SSL payload allows an application to achieve greater performance benefits when compared to Java 7.0 or Java 7.1.
7.10 Simultaneous multithreading with Java workloads
The zIIP processors in a z13 system can run up to two threads simultaneously in a single core while sharing certain processor resources, such as execution units and caches. This capability is known as simultaneous multithreading (SMT). The use of SMT to run two threads concurrently is known as SMT mode 2.
This section describes SMT, the methods that are used to measure the effectiveness of the technology, and the results of a Java benchmark in CICS to demonstrate the increased capacity that is available when SMT is enabled.
7.10.1 Introduction to SMT
SMT technology allows instructions from more than one thread to run in any pipeline stage at a time. Each thread has its own unique state information, such as program status word (PSW) and registers. The simultaneous threads cannot necessarily run instructions instantly and at times must compete to use certain core resources that are shared between the threads. In some cases, threads can use shared resources that are not experiencing competition.
Generally, SMT mode 2 can run more threads over the same period on a single core. This increased core usage leads to greater core capacity and a higher throughput of work. Figure 7-15 shows how SMT increases the capacity of a single core by enabling the simultaneous running of two threads.
Figure 7-15 Demonstrating increased capacity by enabling SMT mode 2
Although each of the threads that are shown in Figure 7-15 on page 130 can take longer to run, the capability of SMT to run both simultaneously means that more threads can complete during a specific period, which increases the overall thread execution rate of a single core. Running more threads in a specific time increases the system throughput.
For more information about SMT, see IBM z13 Technical Guide, SG24-8251, which is available at this website:
7.10.2 Measuring SMT performance
IBM z/OS RMF fully supports the extra performance information that is available when operating in SMT mode 2.
The IIP service times that are found in an RMF Workload Activity report are normalized by an SMT capacity factor (CF) when zIIP processors are in SMT mode 2. The CF is the ratio of work performed with SMT mode 2 enabled, when compared to SMT disabled. The normalization process reflects the increased ability of a zIIP in SMT mode 2 to perform more work.
RMF provides key metrics in the Multi-threading Analysis section of a CPU Activity report when zIIP processors are in SMT mode 2. The following terms are used when describing the workload performance:
MAX CF reports the maximum CF: The ratio of the maximum amount of work the zIIPs performed with multithreading enabled compared to disabled.
The MAX CF value can be 0.0 - 2.0, with typical values 1.1 - 1.4.
CF reports the average capacity factor: The ratio of the average amount of work the zIIPs performed with multithreading enabled compared to disabled.
The CF value can be 0.0 - 2.0, with typical values 1.0 - 1.4.
AVG TD reports the average thread density: The average number of running threads while the core is busy.
The AVG TD value can be 1.0 - 2.0.
Figure 7-16 shows an extract of an RMF CPU Activity report. The average CF for the zIIP processors is highlighted for use in a later calculation.
Figure 7-16 Extract of RMF CPU Activity report
In the IBM z13 hardware, SMT mode 2 is available for zIIP processors only; therefore, the MODE and CF values for general CPs are always 1.
Figure 7-17 shows an extract of an RMF Workload Activity report. The IIP service time and IIP APPL % figures are highlighted for use in a later calculation.
Figure 7-17 Extract of RMF Workload Activity report
The APPL% IIP value is the amount of actual zIIP resource used. The APPL% IIP value is not normalized and shows how busy the processors are. The LPAR that was used for this benchmark was configured with three dedicated CPs and two dedicated zIIPs. Therefore, the maximum value for APPL% CP is 300%, and the maximum value for APPL% IIP is 200%.
The SERVICE TIME IIP value is the normalized zIIP time, factored by the CF. Note the relationship between SERVICE TIME IIP and APPL% IIP in the following equation:
APPL% IIP = ( SERVICE TIME IIP ÷ ( interval × CF ) ) × 100
The reports that are shown in Figure 7-16 on page 131 and Figure 7-17 are extracted from an RMF report with an interval of 60 seconds; therefore, the highlighted values that are shown in Figure 7-16 on page 131 and Figure 7-17 can be used in the previous equation, as shown in the following equation:
APPL% IIP = ( 140.240 ÷ ( 60 × 1.303 ) ) × 100 = 179.38%
The slight discrepancy between the calculated and reported APPL% IIP values occurs because other values in the report are rounded.
7.10.3 CICS throughput improvement
A z/OS Connect workload was used to demonstrate the change in CPU utilization when running Java in CICS with SMT mode 2 disabled and enabled. Figure 7-18 on page 133 shows a comparison of a Java-based workload when running with the two SMT configurations.
Figure 7-18 Comparing a z/OS Connect workload with SMT mode disabled and SMT mode 2
The chart that is shown in Figure 7-18 plots the sum of the APPL% CP and APPL% IIP values from the RMF Workload Activity report.
Comparing the SMT-1 total and SMT-2 total lines, it can be seen that the total CPU cost is lower with SMT mode 2 enabled and the maximum throughput is increased.
The plot lines that show the amount of work that was not eligible to be offloaded to a System z Integrated Information Processor (zIIP) remain constant between the two configurations. The performance benefits are achieved through increased zIIP capacity.
7.11 Reporting of CPU time to z/OS Workload Manager
Mobile Workload Pricing is an IBM Software Pricing Option that was announced in May 2014. It offers a discount on MSUs consumed by transactions that originated on a mobile device. To use this discount, customers need a process that is agreed upon by IBM to identify (tag and track) their mobile-sourced transactions and their use.
Before CICS TS V5.3, the identification and accumulation of CPU time for certain transaction types required CICS Performance class monitoring to be active. The collection of high-volume SMF data in a production environment can introduce significant overhead.
z/OS Workload Manager (WLM) APAR OA47042 introduces enhancements to simplify the identification and reporting of mobile-sourced transactions and their processor consumption. For more information about updates to WLM, see the following APAR website:
The associated APAR OA48466 is available for IBM z/OS RMF, which provides support for the new WLM function that is provided by APAR OA47042. For more information about the updates to RMF, see the following APAR website:
The CICS TS V5.3 release introduces support for the new functions that were introduced by WLM APAR OA47042. CPU time is reported to WLM on a per-transaction basis, which enables a granular approach to transaction CPU tracking without the requirement for CMF data.
No configuration changes are required in CICS to use the WLM updates. CPU information is reported to WLM when CICS detects that the WLM support for Mobile Workload Pricing is installed, with no additional CPU overhead in the CICS region.
7.12 z/OS Connect for CICS
IBM z/OS Connect is software that enables systems that run on z/OS to better participate in today’s mobile computing environment. z/OS Connect for CICS enables CICS programs to be called with a JSON interface.
z/OS Connect is distributed with CICS to enable connectivity, such as between mobile devices and CICS programs. The CICS embedded version of z/OS Connect is a set of capabilities that are used to enable CICS programs as JSON web services. z/OS Connect is an alternative to the JSON capabilities of the Java-based pipeline. The two technologies are broadly equivalent. Most JSON web services can be redeployed from one environment to the other without application or WSBind file changes. However, the URI and security configuration can be different in each environment.
7.12.1 CICS TS V5.3 performance enhancement
A significant performance enhancement in the CICS TS V5.3 release is the introduction of a JSON parser that is implemented in native (non-Java) code.
The parser implementation that is used by CICS is controlled by the java_parser attribute of the provider_pipeline_json XML element in the pipeline configuration file. For more information about the provider_pipeline_json element, see the “The <provider_pipeline_json> element” topic in the IBM Knowledge Center at this website:
A sample XML configuration file for the z/OS Connect pipeline handler is supplied in the following location relative to the CICS installation root directory:
./samples/pipelines/jsonzosconnectprovider.xml
Example 7-2 shows a pipeline file that uses the CICS JVMSERVER resource that is named DFHWLP and specifies the use of the native parser.
Example 7-2 Sample pipeline configuration file that specifies the native parser implementation
<provider_pipeline_json java_parser="no">
<jvmserver>DFHWLP</jvmserver>
</provider_pipeline_json>
This section provides a performance comparison when various JSON request and response sizes for the Java and native parser implementations are used. In all configurations, SSL was used.
The methodology and applications that were used to produce the performance test results for z/OS Connect in CICS were similar to the methodology and applications that were used when testing the JSON support in CICS TS V5.2. For more information, see 6.7, “JSON support” on page 92. To expand the workload, an extra request and response size of 64 KB was added.
7.12.2 Varying payload sizes by using Java parser
By using z/OS Connect for CICS with the default Java parser, CPU usage was measured for a range of payload sizes. The CPU cost per request is shown in Figure 7-19 for a range of request and response size combinations. Total CPU cost per request is broken into non-zIIP-eligible and zIIP-eligible components.
Figure 7-19 CPU comparison for various request and response payloads by using the Java parser
It is clear that the CPU cost per request depends on the size of the JSON documents that were received or transmitted. A significant fraction of the CPU cost incurred for larger JSON documents is zIIP-eligible.
7.12.3 Comparing Java and native parsers
By using a medium-sized JSON request and response, the CPU usage was compared for the Java and native parsers. The scenario used a 4 KB request and 4 KB response. The result of this comparison is shown in Figure 7-20. As per Figure 7-19 on page 135, the CPU usage is broken into non-zIIP-eligible and zIIP-eligible components.
Figure 7-20 Comparing Java and native parsers for a medium-sized request and response
The chart in Figure 7-20 shows that for a medium-size request and response, the overall CPU cost per request is reduced with the native parser. Use of the native parser slightly increases the amount of non-zIIP-eligible CPU time from 0.33 ms to 0.39 ms per request.
7.12.4 Comparing Java and native parsers with varying request sizes
Extending the test scenario that is described in 7.12.3, “Comparing Java and native parsers” on page 136, various request sizes were used. Each request returns a response of 32 bytes. The following request sizes were tested:
32 bytes (as shown in Figure 7-21 on page 137)
4 KB (as shown in Figure 7-22 on page 137)
64 KB (as shown in Figure 7-23 on page 138)
Figure 7-21 Comparing Java and native parsers for 32-byte request with 32-byte response
For the 32-byte request with 32-byte response scenario, there is no significant difference in CPU usage or zIIP eligibility between the Java and native parsers.
Figure 7-22 Comparing Java and native parsers for 4 KB request with 32-byte response
For the 4 KB request with 32-byte response scenario, the use of the native parser results in a reduction in total CPU per request, from 1.41 ms to 1.09 ms. However, the native parser uses slightly more non-zIIP-eligible CPU time, which increases from 0.29 ms to 0.33 ms.
Figure 7-23 Comparing Java and native parsers for 64 KB request with 32-byte response
The 64 KB request with 32-byte response scenario shows a significant reduction in total CPU usage per request. Overall CPU usage per request reduces from 18.62 ms to 13.31 ms, but non-zIIP-eligible usage increases from 0.41 ms to 1.39 ms per request.
The charts that are shown in Figure 7-21 on page 137, Figure 7-22 on page 137, and Figure 7-23 show that the native parser reduces overall CPU usage for each JSON request. As expected, the largest performance gains are realized with large request sizes. Also as expected, the native parser uses more non-zIIP-eligible CPU than the Java parser.
7.12.5 Comparing Java and native parsers with varying response sizes
The benchmark was further modified such that various response sizes were used. Each request was 64 KB and the performance of the Java and native parsers were compared. The following response sizes were tested:
32 B (as shown in Figure 7-23 on page 138)
4 KB (as shown in Figure 7-24 on page 139)
64 KB (as shown in Figure 7-25 on page 139)
In contrast to 7.12.4, “Comparing Java and native parsers with varying request sizes” on page 136, this section describes scenarios in which the response sizes were modified. With an invariant request size, the benefit of the native parser is expected to remain constant across all scenarios because the parser operates on the incoming request only.
Performance results for a 64 KB request with 32-byte response are described in 7.12.4, “Comparing Java and native parsers with varying request sizes” on page 136. Figure 7-23 on page 138 shows the native parser reducing the overall CPU by 5.31 ms, but increasing the non-zIIP-eligible CPU by 0.98 ms per request.
Figure 7-24 Comparing Java and native parsers for 64 KB request with a 4 KB response
With a 64 KB request and a 4 KB response, the native parser reduces overall CPU usage by 5.34 ms. The native parser uses 0.98 ms more non-zIIP-eligible CPU.
Figure 7-25 Comparing Java and native parsers for 64 KB request with a 64 KB response
For the 64 KB request and 64 KB response scenario, the native parser again reduces overall CPU usage by 5.11 ms. The native parser again increases non-zIIP-eligible CPU usage by 0.98 ms per request.
7.12.6 Native parser conclusion
The native parser can provide a significant reduction in overall CPU usage per request. The potential reduction in CPU usage is determined by the size of the inbound request. The native parser is not implemented by using Java; therefore, CPU usage by the native parser cannot be offloaded to a specialty engine.
The performance improvements in CICS TS V5.3 apply to the JSON parser component only; therefore, they have no effect on the CPU costs that are involved in producing a JSON response.
7.13 HTTP flow control
CICS TS V5.3 introduces the ability to enable performance tuning for HTTP connections, which protects CICS from unconstrained resource demand. TCP/IP flow control in CICS TS V5.3 applies to HTTP connections only. When enabled, it provides the following capabilities:
A pacing mechanism that prevents a CICS region from continuing to accept HTTP requests after the region reaches its throughput capacity.
An opportunity to rebalance persistent connections on a periodic basis.
If HTTP flow control is enabled and the region becomes overloaded, CICS temporarily stops listening for new HTTP connection requests. If overloading continues, CICS closes HTTP persistent connections and marks all new HTTP connections as non-persistent. These actions prevent an oversupply of new HTTP work from being received and queued within CICS, and allow feedback to TCP/IP port sharing and Sysplex Distributor. This feedback promotes balanced sharing of the workload with other regions that share the IP endpoint and allows the overloaded CICS region to recover more quickly.
7.13.1 Server accept efficiency fraction
CICS HTTP flow control is implemented by queuing new HTTP connection requests in the TCP/IP socket backlog. Queuing requests in the TCP/IP socket backlog affects the server accept efficiency fraction (SEF).
 
Note: When ports that are managed by CICS are used, it is the CICS address space that is accepting connections; therefore, CICS is the server application (in IBM z/OS Communications Server terminology).
SEF is a measure (calculated at intervals of approximately 1 minute) of the efficiency of the server application in accepting new connection setup requests and managing its backlog queue. The SEF value is reported as a percentage. A value of 100% indicates that the server application is successfully accepting all its new connection setup requests. A value of 0% indicates that the server application is not responding to new connection set up requests. The SEF field is only available for a connection that is in listen state.
The netstat command can display the SEF value for an IP socket. This command is available in the TSO and z/OS UNIX System Services environments. The following sample commands produce the same output when inquiring about the state of port 4025:
TSO environment
NETSTAT ALL (PORT 4025
z/OS UNIX System Services environment
netstat -A -P 4025
Example 7-3 shows a fragment of the output that is produced by the netstat command.
Example 7-3 Sample fragment of the output of the netstat command
ReceiveBufferSize: 0000065536 SendBufferSize: 0000065536
ConnectionsIn: 0000008574 ConnectionsDropped: 0000000000
MaximumBacklog: 0000001024 ConnectionFlood: No
CurrentBacklog: 0000000000
ServerBacklog: 0000000000 FRCABacklog: 0000000000
CurrentConnections: 0000001464 SEF: 100
When the SHAREPORTWLM option in a port definition is used, the SEF value is used to modify the IBM Workload Manager for z/OS server-specific weights, which influences how new connection setup requests are distributed to the servers sharing this port.
When the SHAREPORT option in a port definition is used, the SEF value is used to weight the distribution of new connection setup requests among the SHAREPORT servers.
Whether SHAREPORT or SHAREPORTWLM is specified, the SEF value is reported back to the sysplex distributor to be used as part of the target server responsiveness fraction calculation, which influences how new connection setup requests are distributed to the target servers.
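For reference, the following TCP/IP profile fragment sketches how a group of CICS regions might share a listening port with WLM-weighted distribution. The port number and job name pattern are hypothetical, and the exact syntax should be verified against the PORT statement topic that is referenced below.
PORT
   4025 TCP CICSP* SHAREPORTWLM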
For more information about the configuration of ports in IBM z/OS Communications Server, see the “PORT statement” topic in the IBM Knowledge Center at this website:
7.13.2 Flow control configuration
The behavior of HTTP flow control is specified by using the new system initialization parameter SOTUNING, which can be set to one of the following values:
YES
Performance tuning for HTTP connections occurs to protect CICS from unconstrained resource demand. YES is the default value.
520
No performance tuning occurs.
 
Note: If sharing IP endpoints, ensure that all regions have the same SOTUNING value or uneven loading might occur.
For more information about the SOTUNING SIT parameter, see the “SOTUNING” topic in the IBM Knowledge Center at this website:
7.13.3 Flow control operation
When a CICS region reaches the maximum task limit (MXT), it stops accepting new HTTP connections and incoming requests are queued in the backlog for the TCP/IP socket. When the MXT condition is relieved, CICS starts accepting new connections again.
In the case of persistent connections (that is, connections that were already accepted and remain open), work can continue to be received even after MXT is reached. In this situation, when the number of active transactions in CICS plus the number of queued requests in the IP socket reaches 110% of the MXT value, client connections are actively disconnected to route work away from the overloaded CICS region.
When actively disconnecting clients, the current request is permitted to complete and then the connection is closed. New connection requests are made non-persistent until the region drops below 95% of the MXT value.
In addition to these mechanisms, CICS also disconnects a client connection every 1,000 requests. This disconnection rate gives an opportunity for rebalancing the connection when the client reconnects.
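The following sketch (written as Java for readability) summarizes these decision points. The 110% and 95% thresholds and the 1,000-request reconnection interval are taken from the description above; the structure is illustrative only and does not represent the actual CICS implementation.
// Illustrative sketch (not CICS internals) of the HTTP flow-control decisions
// that are described above.
public class HttpFlowControlSketch {

    private final int mxt;          // maximum task limit (MXT) for the region
    private boolean overloaded;     // set at 110% of MXT, cleared below 95%

    public HttpFlowControlSketch(int mxt) {
        this.mxt = mxt;
    }

    // While the region is at MXT, new connection requests are left queued in
    // the TCP/IP socket backlog rather than being accepted.
    boolean acceptNewConnections(int activeTasks) {
        return activeTasks < mxt;
    }

    // Persistent connections can still deliver work at MXT. When active
    // transactions plus queued requests reach 110% of MXT, clients are
    // actively disconnected and new connections are made non-persistent
    // until the load drops below 95% of MXT.
    boolean makeConnectionsNonPersistent(int activeTasks, int queuedRequests) {
        if (activeTasks + queuedRequests >= (mxt * 110) / 100) {
            overloaded = true;
        } else if (activeTasks < (mxt * 95) / 100) {
            overloaded = false;
        }
        return overloaded;
    }

    // Independently of load, a connection is disconnected after every 1,000
    // requests so that it can be rebalanced when the client reconnects.
    boolean disconnectForRebalance(long requestsOnThisConnection) {
        return requestsOnThisConnection > 0 && requestsOnThisConnection % 1000 == 0;
    }
}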
7.13.4 CICS statistics enhancements
The CICS TCP/IP global statistics were enhanced to provide information about how incoming work is processed and the effect that flow control has on HTTP requests. Example 7-4 shows an extract of a sample DFHSTUP report for an HTTP workload where flow control is enabled.
Example 7-4 Extract of sample TCP/IP global statistics report produced by CICS TS V5.3 DFHSTUP
 
Performance tuning for HTTP connections . . . . . . . . . . . . . : Yes
Socket listener has paused listening for HTTP connections . . . . : Yes
Number of times socket listener notified at task accept limit . . : 25672
Last time socket listener paused listening for HTTP connections . : 10/15/2015 11:13:26.3862
Region stopping HTTP connection persistence . . . . . . . . . . . : Yes
Number of times region stopped HTTP connection persistence. . . . : 0
Last time stopped HTTP connection persistence. . . . . . . . . . : --/--/---- --:--:--:----
Number of persistent HTTP connections made non-persistent . . . . : 52554
Number of times disconnected an HTTP connection at max uses . . . : 0
 
For more information about available CICS statistics fields, see 7.5.6, “TCP/IP global statistics” on page 116.
7.13.5 Comparison of SOTUNING options
Table 7-3 on page 142 lists CICS statistics reports, comparing a workload that is running in CICS with SOTUNING=YES to the same workload that is running in a CICS system with SOTUNING=520 for the same period. The workload in this case consisted of a simple HTTP web application where each inbound request made a new TCP/IP connection.
Table 7-3 Extract of CICS statistics reported values with SOTUNING=520 and SOTUNING=YES
CICS statistic                          SOTUNING=520       SOTUNING=YES
Number of completed transactions        101,538            105,117
Peak queued transactions                2,193              3
Peak active transactions                150                150
Times stopped accepting new sockets     n/a                26,674
Number of times at MXT limit            1 (continuously)   29,418
CPU used                                62.11 s            60.86 s
CPU per transaction                     0.611 ms           0.578 ms
EDSA used                               121 MB             60 MB
Although this example is an extreme case, it demonstrates that, in terms of CPU and EDSA usage, it is more effective to queue work outside of CICS by preventing new connections from being accepted.
In the SOTUNING=520 case, the CICS region reached MXT and remained there for the entire measurement interval. In the SOTUNING=YES case, the CICS region repeatedly moved in and out of the MXT condition as it stopped accepting new work and then resumed accepting work when the number of tasks dropped below MXT.
7.14 High transaction rate performance study
To demonstrate many of the performance improvements that were introduced in the CICS V5.3 release, a performance study was undertaken to drive a high rate of transactions through a CICS configuration. The study consisted of the following workloads:
The first workload runs on a single z13 LPAR with 18 CPs up to a rate of 174,000 CICS transactions per second.
The second workload runs on a single z13 LPAR with 26 CPs up to a rate of 227,000 CICS transactions per second.
For more information about the full results of this study, see IBM CICS Performance Series: CICS TS V5.3 Benchmark on IBM z13, REDP-5320, which is available at this website:
 