Chapter 10. Modern batch workloads

This chapter describes the IBM CICS Transaction Server for z/OS (CICS TS) Feature Pack for Modern Batch, which can be installed in CICS TS V4.2 or later. It enables the WebSphere batch environment to schedule and manage batch applications in CICS.

First, we explain the need for a modern batch environment. Then, we review the types of workloads that it is suitable for and the architecture of the solution. The key design considerations when building a batch application to run in CICS TS are highlighted.

This chapter covers the following topics:

•10.1, “Business pressures on traditional batch processing”

•10.2, “WebSphere Java batch and batch container services”

•10.3, “Introduction to CICS batch support”

•10.4, “Running batch applications in CICS”

•10.5, “Reasons to run a batch application in CICS”

•10.6, “Benefits of running batch jobs within CICS”

•10.7, “Implications of running batches in CICS”

10.1 Business pressures on traditional batch processing

Business models have changed. The need now is to have near-real-time access to data. Therefore, those who still use the traditional batch model are under constant pressure to change to a real-time model. Some of the reasons are described in the following sections.

10.1.1 The “dedicated batch” window is disappearing

There was a time when “batch” and “online” processing were separate from one another. Online processing used to be stopped so that batch processing could have access to the system resources to complete its work.

But those days are behind us. Online processing is becoming a 24x7 operation. This is because client access is increasingly global, with people from different time zones seeking access at all times of the day and night. There might be times when online processing is greatly reduced, but in most cases, it is never stopped altogether. This means that batch and online processing must work at the same time. The window of time available for dedicated batch processing is shrinking. Figure 10-1 shows the batch processing today and the shrinking batch window.

Figure 10-1 Batch processing time available today

The advent of mobile devices means that client access is now even more frequent than before. The transaction work that flows back to information processing systems because of mobile device activity is increasing. Mobile device activity may occur at any time of night or day, so it is truly a 24x7 world for online processing.

But batch processing has not gone away, and there are still requirements to do batch work. What is evident is that batch and online work must be processed at the same time, in a manner that lets both work cooperatively.

10.1.2 The value of shared services

It is not just the batch window that is shrinking. The cost pressures on maintaining the batch and online transaction processing (OLTP) environments are increasing, too. Having separate infrastructures for online and batch processing requires separate computers, separate tools, and separate support staff. Figure 10-2 shows the efficiencies of consolidation of batch and OLTP staff.

Figure 10-2 Efficiencies through consolidation

The trend is in the direction of convergence. Online and batch processing are both information processing, and, as such, can and should be handled in a cooperative, converged manner. This offers efficiencies in infrastructure, staff, development tools, and, potentially, Java assets that can be shared between OLTP and batch processes.

10.1.3 Java for batch processing

Some readers might ask “Why use Java when COBOL works well?”

Batch assets written in COBOL are still very useful. To the extent that Java is relevant, it is to complement COBOL, not as a total replacement.

However, there are business pressures that make Java an attractive batch solution:

•Skills

Java skills are simply more plentiful than COBOL skills. Many organizations have very good Java development skills. It makes sense to leverage those Java skills for batch work as well.

•Tools

Today’s development tools are powerful and quite sophisticated. Acquiring them also represents an investment in technology and an investment in skills. It makes good sense to leverage that investment across multiple information processing disciplines.

•CICS Transaction Server Value Unit Edition (VUE)

A Java batch workload running in a CICS JVM server is eligible for CICS VUE pricing. This attractive pricing model is a financial motivation for running batches within CICS.

•Specialty engines

COBOL runs on general-purpose processors, whereas Java workloads are eligible to run on specialty engines, such as System z Integrated Information Processors (zIIPs). Specialty engines provide a way to lower overall System z acquisition and licensing costs, so leveraging them for batch work is desirable.

•Cooperative processing

Online processing runtime infrastructures are often designed around Java. For example, CICS Transaction Server for z/OS is a powerful online processing run time. There is an investment in maintaining that online infrastructure. That investment can be leveraged for batch work also.

Therefore, using Java for batch processing is an area of growing interest and is already in use in many large processing operations.

10.1.4 Conflicting needs of CICS applications and z/OS batch applications

It is common for an online CICS application to update resources. For some resources, such as VSAM files, CICS locks each record that is being updated and logs its image, both to prevent other applications from making conflicting updates and to be able to restore records in case of a failure. In these cases, the resources need to be opened exclusively by CICS.

A traditional z/OS batch application can be defined as a job written in job control language (JCL) that is submitted to the z/OS job entry subsystem (JES) for execution and does not need user input to complete. For example, it reads all records from a VSAM input file and, for each one, updates a VSAM master file and creates a summary report. However, the master file may be opened for exclusive use by CICS and thus be unavailable to the batch application.

In this scenario, there are several choices:

•Close the master file in CICS and start the batch application, which starts by making a backup of the master file.

Then, process all of the records, make the updates, delete the backup, and re-open the master file in CICS. If the batch application fails, the backup of the master file is restored, and either the issue is fixed immediately and the batch application is re-started or the issue is fixed later.

This choice results in a period of time, referred to as the batch window, during which the master file and the online applications that use that file are not available.

•Code the batch application to send a request to an online CICS application to make each update to the resources locked by the online application.

If each request is committed individually, for example, by using the nontransactional External CICS Interface (EXCI), data integrity issues arise if the batch application fails. This is because, if the batch application is restarted, some updates are executed twice. If all requests are committed together, for example, by using transactional EXCI, many records can be locked by the online application for an extended time, causing unacceptable delays for other applications.

•For VSAM resources, use record-level sharing (RLS) to maintain record locks for the batch application. However, the batch application is unlikely to maintain its own recoverable logs due to the complexity of writing them. Therefore, if the batch application fails, the records already updated are not restored, and that leads to data integrity issues.

•For VSAM resources, use Data Facility Storage Management Subsystem (DFSMS) transactional VSAM services to lock and log record images before updates. However, unless the application implements its own checkpoints, many records can be locked for an extended period, causing unacceptable delays for other applications.

In addition, as companies provide their services across more locations and time zones and customers require services at times of day to suit them, there is a growing need for online applications to be available continuously, 24x7. Also over time, batch applications are expected to process an increasing amount of data and there is a need to drive down costs. Therefore, in the event of the batch application failing, it is unacceptable to restart it from the beginning. Instead there is a requirement to restart from a frequent checkpoint.

10.2 WebSphere Java batch and batch container services

In this section, we describe the batch environment, WebSphere Java batch, and the batch container framework.

10.2.1 Definition of a batch environment

The batch environment is a managed environment for batch applications that are scheduled, process large amounts of information, and may take many hours to complete. The batch environment provides a powerful failover model based on checkpoint and restart scenarios. This is essential to efficiently manage, run, and restart batch applications, in particular when batch application resources are shared with online transaction processing.

The batch environment has two primary components:

•The job scheduler is responsible for determining when and where to dispatch a job, monitoring the job, and reporting back to its caller.

•Endpoints (batch containers) are where the batch application runs. Jobs are dispatched to an endpoint from the job scheduler. The job runs and, upon completion, the job log and return code are provided back to the job scheduler.

10.2.2 CICS functions

CICS is a modern general-purpose transaction processing environment for online applications that start as a result of a request received through a terminal, web service, or message. It typically processes a small amount of information within subseconds. CICS provides the following capabilities:

•Administration, security, and transaction facilities, such as authorization, data integrity, workload management, logging, tracing, debugging, statistics, and monitoring

•API and development tools, such as named counter and XML conversion

•Shared access to resources, such as temporary storage, data tables, IBM DB2 databases, IBM Information Management System (IMS), and Virtual Storage Access Method (VSAM) data sets or files

•Communications, such as web services, WebSphere Optimized Local Adapters, IBM MQ, HTTP, and sockets

10.2.3 WebSphere Java batch

First, it’s necessary to understand the high-level architecture of IBM WebSphere Java batch. WebSphere Application Server software provides what is known as a “Java EE” (Java Enterprise Edition) runtime. Java EE is an industry-standard specification for an application server that provides a long list of standard application specifications.

Part of the design of this Java EE runtime is the concept of a container. Containers are simply runtime functions that provide managed services to the programs that run in the containers. Container-managed services mean that the applications can focus on their core business logic and not have to implement common functions. WebSphere Application Server already had web and Enterprise JavaBeans (EJB) containers. The WebSphere Java batch function adds a third container, the batch container.

Like other containers, the batch container provides services to the batch applications that run in the container. The CICS TS Feature Pack for Modern Batch extends the WebSphere Java batch management and execution realm by allowing CICS TS to participate as a WebSphere Java batch endpoint server. Figure 10-3 shows a summary of some of the services provided by the WebSphere Java batch container.

Figure 10-3 WebSphere batch container

The next thing to understand is the job management and execution model provided by WebSphere Java batch. WebSphere Java batch separates several key elements of batch processing, as shown in Figure 10-4:

•Job submission: This is done through a defined interface called the Job Management Console (JMC). It provides a view of the batch environment and allows you to submit and manage jobs.

•Job description: The job description is specified in an XML file called xJCL. This avoids hardcoding job properties in the application code. The job properties file is used to tell the job submission function what the job is and how to run it. We will describe xJCL later in the chapter.

•Job dispatching: The job dispatching function signals to the endpoint to begin execution of the batch code named in the job declaration file (xJCL). The job dispatching function interprets the xJCL, dispatches the job to the endpoint where the batch application resides, and provides the ability to stop and restart jobs.

•Job execution: The execution of the job takes place in an endpoint, which is a batch container where the Java batch application is deployed.

•Job development and deployment: Java batch applications implement the batch logic, and they are deployed to the batch containers. The development libraries and tools assist in the creation of the batch applications.

Figure 10-4 Job Management and Execution Model

In the sections that follow, we describe the job control language, the integration with enterprise schedulers, and the batch middleware framework that are supported in the CICS TS Feature Pack for Modern Batch.

10.2.4 Job control language

The job control language for modern batch is an XML file called xJCL. It describes the steps that are included in the batch job and the Java class files that are used in each batch step.

xJCL is a job declaration file. Conceptually, xJCL is just like normal JCL: it describes the job to be run and the context in which the job is to operate. The difference is that xJCL is written in XML. The xJCL file tells the batch run time that a job invocation is being requested, which job to run, and the details about that job, as regular JCL does. Figure 10-5 shows a trimmed version of an actual xJCL file.

Figure 10-5 xJCL file snippet

10.2.5 Integration with enterprise schedulers

The point of integration on z/OS is still with the enterprise scheduler submitting JCL. Figure 10-6 shows how the integration of enterprise schedulers and the job dispatcher function happens.

Figure 10-6 Integration with enterprise schedulers

The job dispatcher function is hosted in a WebSphere Application Server and has several interfaces. The one used for this integration is a message-driven bean (MDB) interface. The “glue” between enterprise scheduler JCL submission and the dispatcher is a supplied program that uses messaging (IBM MQ or the service integration bus, or SIBus, of WebSphere Application Server) to submit Java batch jobs to WebSphere Java batch. That “glue” utility is known as WSGRID. The enterprise scheduler sees the batch job as a WSGRID invocation.

10.2.6 Checkpoint and job restart services

Checkpoint commit and rollback is a function of the batch container. This function is abstracted away from the batch job and is handled by the batch container. It relies on the CICS sync point API to do this. These key items are important to note:

•Checkpoint interval (record or time) is specified in the xJCL file.

•As each checkpoint interval is reached, the container commits the records processed since the previous checkpoint.

•In the event of a failure, the job may be restarted at the last good checkpoint.

Figure 10-7 shows the concepts of checkpoint processing.

Figure 10-7 Checkpoint processing
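The checkpoint behavior described in this section can be illustrated with a minimal sketch in plain Java. It is not the batch container's implementation and does not use the CICS sync point API: the class CheckpointSketch, its fields, and the failAt parameter are hypothetical names introduced only for illustration, and an in-memory list stands in for committed updates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of record-based checkpointing; not the WebSphere or CICS API. */
final class CheckpointSketch {
    private int lastCheckpoint = 0;                              // stands in for the persisted restart position
    private final List<String> committed = new ArrayList<>();   // stands in for committed updates

    int lastCheckpoint() { return lastCheckpoint; }
    List<String> committed() { return committed; }

    /**
     * Processes records from the last checkpoint onward and commits every
     * {@code interval} records. A failure is simulated at index {@code failAt}
     * (pass -1 for a run without failure).
     */
    void run(List<String> input, int interval, int failAt) {
        List<String> pending = new ArrayList<>();                // uncommitted work since the last checkpoint
        for (int i = lastCheckpoint; i < input.size(); i++) {
            if (i == failAt) {
                // Pending work is simply dropped, which models a rollback.
                throw new IllegalStateException("simulated failure at record " + i);
            }
            pending.add(input.get(i).toUpperCase());
            if (pending.size() == interval) {
                committed.addAll(pending);                       // commit the unit of work
                pending.clear();
                lastCheckpoint = i + 1;                          // record the restart position
            }
        }
        committed.addAll(pending);                               // final commit at the end of the step
        lastCheckpoint = input.size();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
        CheckpointSketch job = new CheckpointSketch();
        try {
            job.run(input, 3, 4);                                // fails after one checkpoint
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage() + "; restarting from record " + job.lastCheckpoint());
        }
        job.run(input, 3, -1);                                   // restart resumes at the last checkpoint
        System.out.println(job.committed());
    }
}
```

In this sketch, the restarted run reprocesses only the records after the last good checkpoint, so no committed update is applied twice. A smaller interval means less work to repeat after a failure and fewer records held in an uncommitted state at any time, at the cost of more frequent commits.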

10.2.7 Data record read and write support services

The batch container provides batch data streams (BDS) for externalizing data access from the job step. This provides a way of abstracting the data read and write logic away from the batch step code.

Several batch data streams are provided:

•Read and write byte data from file

•Read and write text file

•Read from a VSAM key-sequenced data set (KSDS) as input data to the job

•Update records in a VSAM KSDS

•Retrieve data from a database using a JDBC connection

•Write data to a database using a JDBC connection

The Feature Pack for Modern Batch also provides access to the full set of Java class library for CICS (JCICS) APIs.
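To show what abstracting data access away from the step code means in practice, the following sketch uses two hypothetical interfaces, RecordSource and RecordSink. They are assumptions made for illustration and are not the actual batch data stream interfaces supplied by the batch container. The step logic depends only on the abstractions, so the same step could be fed from a file, a VSAM data set, or a database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of step logic decoupled from data access; real BDS interfaces differ. */
final class DataStreamSketch {
    /** Assumed input abstraction: the step does not know where records come from. */
    interface RecordSource {
        boolean hasNext();
        String next();
    }

    /** Assumed output abstraction: the step does not know where records go. */
    interface RecordSink {
        void write(String record);
    }

    /** The step sees only the two abstractions, never a file, VSAM, or JDBC API. */
    static int runStep(RecordSource in, RecordSink out) {
        int count = 0;
        while (in.hasNext()) {
            out.write(in.next().trim().toUpperCase());
            count++;
        }
        return count;
    }

    /** In-memory source used here in place of a file or data set. */
    static RecordSource fromList(List<String> records) {
        final Iterator<String> it = records.iterator();
        return new RecordSource() {
            @Override public boolean hasNext() { return it.hasNext(); }
            @Override public String next() { return it.next(); }
        };
    }

    public static void main(String[] args) {
        List<String> written = new ArrayList<>();
        int n = runStep(fromList(Arrays.asList(" alpha", "beta ")), written::add);
        System.out.println(n + " records written: " + written);
    }
}
```

Swapping the in-memory source for another implementation changes nothing in runStep, which is the property that the batch data stream model is designed to provide.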

10.2.8 Job resiliency services

The batch container provides services for the batch job to skip records where a data read or write operation throws an exception. It can also retry job steps for an unhandled exception.

Skip-record processing

This service provides a way of tolerating data read or write errors so that the job can continue. The objective is to provide a mechanism to survive the odd data exception rather than stop the job. Figure 10-8 shows skip-record processing.

Figure 10-8 Skip-record processing
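The idea behind skip-record processing can be sketched as follows. This is an illustration only: the class SkipRecordSketch and the skipLimit parameter are hypothetical, and the assumption that a limit bounds the number of skipped records is ours rather than a statement about how the batch container is configured.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of skip-record processing; not the batch container's implementation. */
final class SkipRecordSketch {
    /** Outcome of a step: the total of the valid records and the number of records skipped. */
    static final class Result {
        final long sum;
        final int skipped;
        Result(long sum, int skipped) { this.sum = sum; this.skipped = skipped; }
    }

    /**
     * Sums numeric records. A record that cannot be parsed is skipped instead of
     * failing the job, until {@code skipLimit} records have been skipped.
     */
    static Result sumWithSkips(List<String> records, int skipLimit) {
        long sum = 0;
        int skipped = 0;
        for (String record : records) {
            try {
                sum += Long.parseLong(record.trim());
            } catch (NumberFormatException e) {
                if (skipped == skipLimit) {
                    throw e;          // too many bad records: stop the step
                }
                skipped++;            // tolerate the odd bad record and continue
            }
        }
        return new Result(sum, skipped);
    }

    public static void main(String[] args) {
        Result r = sumWithSkips(Arrays.asList("10", "oops", " 5 "), 1);
        System.out.println("sum=" + r.sum + ", skipped=" + r.skipped);
    }
}
```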

Retry-step processing

This service provides a way of retrying the job step in the event of an exception. It operates at a higher level than skip-record processing, at the “invoke batch step” level: the batch container falls back to the last good checkpoint and restarts from there. If the retry is successful, the job continues and your processing completes. Figure 10-9 depicts retry-step processing. The xJCL provides the following information to the container:

•How many retry steps may be attempted

•What exceptions to consider for retry-step processing

•Alternatively, what exceptions to exclude from retry-step processing

Figure 10-9 Retry-step processing
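The three pieces of information listed above map naturally onto a small retry loop. The sketch below is an assumption-laden illustration: the class RetryStepSketch, the method runWithRetry, and its parameters are hypothetical and are not part of the batch container API. For brevity, it reruns the whole step rather than falling back to a checkpoint.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical sketch of retry-step processing; not the batch container's implementation. */
final class RetryStepSketch {
    /**
     * Runs the step and reruns it when it throws an exception whose type is in
     * {@code retryable}, up to {@code retryLimit} retries. Any other exception,
     * or an exception after the limit is reached, is propagated to the caller.
     */
    static <T> T runWithRetry(Callable<T> step, int retryLimit,
                              List<Class<? extends Exception>> retryable) throws Exception {
        int retries = 0;
        while (true) {
            try {
                return step.call();
            } catch (Exception e) {
                boolean eligible = retryable.stream().anyMatch(type -> type.isInstance(e));
                if (!eligible || retries == retryLimit) {
                    throw e;
                }
                retries++;            // a real container would restart from the last checkpoint here
            }
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] attempts = {0};
        List<Class<? extends Exception>> retryable = Arrays.asList(IOException.class);
        String result = runWithRetry(() -> {
            if (++attempts[0] < 3) {
                throw new IOException("transient problem");
            }
            return "done";
        }, 5, retryable);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

An exclusion list, the alternative mentioned in the last bullet, would invert the eligibility test: every exception is retried unless its type appears in the list.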

10.3 Introduction to CICS batch support

In this section, we describe CICS support for modern batch and how batch processing fits within the CICS environment.

10.3.1 CICS support for modern batch

The CICS TS Feature Pack for Modern Batch installs in the CICS region and presents to a WebSphere Java batch dispatcher (dispatcher) as another endpoint to which it can dispatch work. That provides Java batch management for the WebSphere Java batch runtime model, with CICS being an endpoint.

The Feature Pack for Modern Batch is configured to communicate over the network with the dispatcher, telling the dispatcher that the feature pack is present and which Java batch applications are deployed there. When an xJCL file is submitted to the dispatcher for a Java batch program deployed in the feature pack, the dispatcher communicates across the network to invoke the batch program and monitor its progress. This allows a CICS region to become a job endpoint for a WebSphere Java batch dispatching server, which puts batch logic much closer to the CICS data, as shown in Figure 10-10.

Figure 10-10 CICS TS Feature Pack for Modern Batch

10.4 Running batch applications in CICS

In this section, we review a solution that resolves the conflicting needs of batch and online applications. The solution is to develop a batch application that uses the batch programming model and to run that application in CICS.

These are the key behavioral aspects of such a batch application:

•The batch application shares access to resources with online applications.

•The batch application creates regular checkpoints to free up transactional resources so that online applications are not blocked from completing for excessive amounts of time.

Note: The endpoint provides this checkpoint capability on behalf of the applications.

•If the batch application fails, it can be restarted from its most recent checkpoint.

•Both batch and online applications run concurrently.

The batch application can be divided into job steps that execute in parallel against different subsets of the input data, to shorten the overall elapsed time to process the job.
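The effect of dividing a job into steps that run in parallel against different subsets of the input can be sketched conceptually as follows. Plain JDK threads stand in for parallel job steps. The sketch does not show how the batch environment schedules such steps, and the class ParallelStepsSketch and the method sumInParallel are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: each partition of the input is processed by its own "step". */
final class ParallelStepsSketch {
    static long sumInParallel(List<Integer> records, int partitions) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int size = (records.size() + partitions - 1) / partitions;   // records per partition, rounded up
            for (int start = 0; start < records.size(); start += size) {
                final List<Integer> slice =
                        records.subList(start, Math.min(start + size, records.size()));
                parts.add(pool.submit(() -> {
                    long subtotal = 0;
                    for (int value : slice) {
                        subtotal += value;
                    }
                    return subtotal;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();                                     // wait for each partition to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> records = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println("total=" + sumInParallel(records, 3));
    }
}
```

The partitions must not overlap, and the work on each one must be independent of the others. Otherwise, the parallel steps contend for the same records and the elapsed-time benefit is lost.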

10.4.1 WebSphere batch environment architecture

The CICS TS Feature Pack for Modern Batch provides an endpoint called the batch container, which runs in a Java virtual machine (JVM) in the CICS address space. The job scheduler interacts with the batch container to start, stop, and manage batch applications. These components are required for the batch implementation within CICS TS:

•WebSphere Application Server 8.5 or later

•CICS Transaction Server 4.2 or above

•CICS TS Feature Pack for Modern Batch

Figure 10-11 depicts this architecture and the interaction among the components.

Figure 10-11 CICS provides an endpoint for the WebSphere batch environment

When started in CICS, the batch container loads a configuration file that details which batch applications it can run, how to connect to the job scheduler, and how the job scheduler can connect to it. The batch container registers with the job scheduler and informs the scheduler of the batch applications it can run. Then, it periodically sends status information to the scheduler, which states that it is still active and available for work.

The starting and processing of a job are detailed by using the numbers in the diagram in Figure 10-11:

1. JCL is submitted on z/OS to request that the batch application start. The JCL runs the WSGRID program and passes the location of an xJCL document to it.

2. The WSGRID program connects to the job scheduler and passes the xJCL location. Alternate interfaces are provided to start a batch application, including a console, command-line interface, and programmatic API.

3. The job scheduler examines the xJCL to establish the name of the batch application to dispatch and uses the information published to it from all endpoints to select which endpoint should run the batch application. The job scheduler chooses a batch container hosted in CICS. It then connects to the batch container, passing the xJCL to it.

4. The batch application runs within the CICS batch container.

When the batch application is running in the batch container in CICS, it can use Java APIs, such as JDBC, or the CICS Java APIs (JCICS) to access CICS resources and services, including VSAM files. It also uses the APIs to call existing CICS programs written in other languages, such as COBOL, C/C++, PL/I, and Assembler.

As the job runs, the batch container takes checkpoints, which enable the batch application to be restarted from the last successful checkpoint in the event of a failure. When the batch application completes, the batch container notifies the job scheduler.
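The registration and dispatch flow described in this section can be sketched as a small lookup table. This is a simplified illustration, not the job scheduler's implementation: the class DispatchSketch, its methods, and the endpoint and application names are hypothetical, and choosing the first registered endpoint is an assumption, because the actual selection policy is not described here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of endpoint registration and job dispatch; not the job scheduler's code. */
final class DispatchSketch {
    /** Maps each batch application name to the endpoints that reported they can run it. */
    private final Map<String, List<String>> endpointsByApplication = new HashMap<>();

    /** Called when an endpoint registers and publishes the applications it can run. */
    void register(String endpoint, List<String> applications) {
        for (String application : applications) {
            endpointsByApplication
                    .computeIfAbsent(application, key -> new ArrayList<>())
                    .add(endpoint);
        }
    }

    /** Picks an endpoint for the application named in the xJCL (here: the first one registered). */
    Optional<String> selectEndpoint(String application) {
        return endpointsByApplication
                .getOrDefault(application, Collections.<String>emptyList())
                .stream()
                .findFirst();
    }

    public static void main(String[] args) {
        DispatchSketch scheduler = new DispatchSketch();
        scheduler.register("CICS-REGION-A", Arrays.asList("PostingsJob"));
        scheduler.register("WAS-ENDPOINT-B", Arrays.asList("PostingsJob", "ReportJob"));
        System.out.println(scheduler.selectEndpoint("PostingsJob").orElse("no endpoint"));
        System.out.println(scheduler.selectEndpoint("UnknownJob").orElse("no endpoint"));
    }
}
```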

10.5 Reasons to run a batch application in CICS

There are various reasons to run batch applications in CICS:

•When you can reduce costs by taking advantage of the CICS VUE pricing model

CICS modern batch workload is eligible to run in a CICS VUE region and take advantage of the pricing model. This would substantially reduce the processing costs.

•When there is pressure to reduce or eliminate the batch window

If CICS is used for online transaction processing and those online applications need to be available for longer each day, it can make sense to run batch jobs in CICS.

•When CICS has opened resources exclusively that are needed for batch processing

If your batch needs access to resources that are opened exclusively by CICS, it can make sense to run the batch application under the control of CICS.

•When the batch application does not need exclusive resource access

If the batch application runs at the same time as online applications, updates to the resources being used by the batch job can be made by the online applications. To work in this environment, the batch application needs to be tolerant of these changes.

•When you want to reuse CICS business logic in batch

Reuse of existing business logic between online applications and batch helps to reduce duplication and can make it quicker and easier to develop new applications. It is likely that there is significant business logic contained within existing CICS online applications. This logic can be invoked using the JCICS equivalent of the EXEC CICS LINK API.

10.6 Benefits of running batch jobs within CICS

The following benefits can result from running batch jobs in CICS:

•Online CICS applications can be available closer to 24 hours a day.

By running online applications and batch applications in parallel, there is less need to take the CICS managed resources offline.

•Capabilities provided by the batch container simplify application development.

Capabilities such as checkpointing, recovery to last checkpoint after a failure, logging, and trace are provided by the batch container, removing the need for the application developer to provide these capabilities.

•A common batch programming model is used.

The batch environment provides a common batch programming model across runtimes and platforms. Therefore, any developers who are skilled in batch environment application development should be able to write a batch application to run in CICS.

•People with Java skills are readily available.

Java is a well-known, popular, modern language. Batch job steps and batch data streams are written in Java.

•Modern batch within CICS is eligible for Value Unit Edition pricing.

CICS modern batch workload is eligible to run in a CICS VUE region and take advantage of the pricing model. This substantially reduces the processing costs.

•Java functions are included.

Functions such as email, PDF file generation, and XML processing are readily available for Java. This can make it easy to develop batch functions that might be more challenging to implement in languages such as COBOL or PL/I.

10.7 Implications of running batches in CICS

Running a batch job under the control of CICS means that the job has different behavioral characteristics. These are some of the implications of running a batch job in CICS.

•Batch jobs might take longer to run.

Traditional z/OS batch jobs are typically optimized to run as quickly as possible. When a batch job is moved to CICS, it might have to share resources with online applications. These resources can be CPU resources, files, or databases. Therefore, it is possible that the batch job might take longer to complete. At the same time, with the removal of the batch window, it might be that the batch job can start earlier or complete later. For batch jobs with hard time deadlines, investigation is required to understand whether the job can complete in the time required.

•Implications regarding online application performance need to be considered.

It is important that the batch processing does not negatively affect the performance of online applications. A user who is invoking a CICS transaction might expect a response time of tenths of a second. If the batch application locks too many resources at a time, it could affect the performance of the online applications. The checkpoint interval of the batch job can be adjusted to change the number of updates made within a checkpoint. If the batch job consumes too much of the CPU resources, this can also affect the performance of the online applications.

•Data being updated by batch processing can be changed by online applications.

Batch applications that rely on a set of records to remain consistent may not work when run in parallel with online applications, because the online applications can update records that the batch job has read or is about to read. This needs to be considered on a per application basis to determine whether it is likely to be an issue.

•Traditional batch jobs need refactoring.

If you plan to move an existing batch job to CICS, the job needs refactoring to fit into the batch programming model.

10.8 Summary

Batch processing has proved to be an efficient, manageable and reliable method for bulk processing of updates to data. As businesses expand, the volume of data to be processed also expands. At the same time, the growth of online transactional workloads suggests that a new paradigm is needed to enable the two processing styles to work better together. The CICS batch container provides this new capability.
