CHAPTER 1

Tasks and Scheduling

With rising numbers of cores and increasing processor speed in computers nowadays, it is crucial for applications to be able to use this computational power. Let’s compare a processor to yourself at work. Imagine you get stuck with a task, and only your experienced teammate can help you with it. But he’s at lunch. You could switch to a different task or wait for him. But when you decide to wait for your teammate’s help and study your social account for a moment, your progress at work is blocked. You are not productive at all until your mentor comes back. What if he doesn’t return? Your employer wouldn’t be happy.

Now imagine an application that needs to access the file system, a database, or a third-party system via Hypertext Transfer Protocol (HTTP). The application’s action is a blocking operation: the CPU is doing nothing while waiting for an input/output (I/O) response. At the same time, new requests are coming in and waiting to be processed. Such a scenario is a textbook example of a performance bottleneck.

This situation can be solved by diverting the CPU’s attention to a separate thread of execution. The stack and context of the current thread are parked (so the processor can continue the computation later, as if the thread had never been interrupted), and the CPU works on a different thread for a short amount of time. This context switching between threads allows the CPU to focus on tasks that are ready to be processed instead of tasks that are waiting for I/O responses. This concept is commonly called multitasking, multithreading, or concurrent processing.

Threads are valuable for applications that

  • Are I/O intensive (for example, applications storing and reading data from a database)
  • Are serving requests from multiple clients (for example, web applications)
  • Want to use the full power of all CPU cores

Multithreading covers immediate parallel execution. It is commonly triggered by user action, for example, by sending a request from a web browser or by an action in the application’s graphical user interface (GUI). But some processing can’t rely on external triggering. This is where another important concept of concurrent programming comes into play: task scheduling. Execution can be planned to start at a certain time, once or repeatedly. Examples include background batch processing and housekeeping during quiet hours.

The ability to create maintainable, concurrent code has become an important part of a developer’s knowledge base. However, this is hard. The evolution of languages has led to various abstractions that can help with this implementation. It is crucial to understand how these abstractions can be applied, to achieve the best performance.

All of the projects from Spring’s enterprise portfolio rely on multithreading. That is why you need to understand Spring constructs for threading and scheduling. This chapter also highlights their benefits against standard Java APIs. But it does not cover complicated parts such as synchronization, locking, deadlocks, or sharing resources.

Multithreading in Java

Developing asynchronous concurrent applications isn’t trivial. Early Java versions provided threading abstractions that were hard to maintain. Let’s quickly gather the basic concepts and skip the less important parts.

Java SE

In the Java world, the primary element of multithreading is the java.lang.Runnable interface (shown in Listing 1-1).

A Java developer implements this interface and places the concurrent logic into the Runnable.run() method. This method takes no parameters and returns no value.

For controlling concurrent execution, the java.lang.Thread class can be used. When the method Thread.start() is executed, the Java virtual machine (JVM) kicks off a new thread, while the current thread also continues processing—kind of. The CPU switches its attention between these two tasks, and the developer can’t predict at which point the attention of the processor will be given to a particular piece of code. If the CPU has multiple cores, both threads can be running simultaneously, leveraging the best allocation of computational power.
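As a minimal sketch of these two building blocks (the class names, thread names, and shared counter are hypothetical and not taken from the listings), the following code starts a few threads that each run the same Runnable and then waits for them to finish:

```java
import java.util.concurrent.atomic.AtomicInteger;

class RunnableSketch {

    // Hypothetical task: the concurrent logic lives in run().
    static class CountingTask implements Runnable {
        private final AtomicInteger counter;

        CountingTask(AtomicInteger counter) {
            this.counter = counter;
        }

        @Override
        public void run() {
            counter.incrementAndGet();
        }
    }

    // Starts the given number of threads, waits for all of them,
    // and returns how many times the task body was executed.
    static int runInThreads(int threads) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new CountingTask(counter), "worker-" + i);
            workers[i].start(); // the JVM kicks off a new thread here
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("Tasks completed: " + runInThreads(3));
    }
}
```

The order in which the workers run is not predictable, which is why the sketch counts completions instead of printing from inside the threads.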

Thread creation is a CPU- and memory-expensive operation. Also, direct java.lang.Thread usage turned out to be too low level and often led to unmaintainable code. Therefore, Java SE 5 introduced the new java.util.concurrent package, which reuses threads in so-called thread pools. The main interface, java.util.concurrent.Executor, is shown in Listing 1-2.

The basic concept is to wrap the concurrent behavior into a class implementing the java.lang.Runnable interface and pass it to a thread pool for execution. The thread pool (an implementation of java.util.concurrent.Executor) takes care of executing the concurrent code in a parallel thread of execution.
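The following sketch illustrates this idea under a few assumptions: the class name and the helper method are hypothetical, and a fixed pool of two threads is chosen arbitrarily. The calling code depends only on the Executor interface, so any implementation can be passed in.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ExecutorSketch {

    // Hands a Runnable to the given executor and reports which thread ran it.
    static String threadNameUsedBy(Executor executor) {
        CountDownLatch done = new CountDownLatch(1);
        String[] threadName = new String[1];
        executor.execute(() -> {
            threadName[0] = Thread.currentThread().getName();
            done.countDown();
        });
        try {
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("Task did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return threadName[0];
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println("Caller thread: " + Thread.currentThread().getName());
            System.out.println("Task ran on:   " + threadNameUsedBy(pool));
        } finally {
            pool.shutdown(); // let the JVM exit once the pool is idle
        }
    }
}
```

Because Executor declares a single method, even a trivial implementation such as Runnable::run (which runs the task in the calling thread) satisfies the same contract.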

A crucial subinterface of java.util.concurrent.Executor introduced with Java SE 5 concurrency utilities is java.util.concurrent.ExecutorService, shown in Listing 1-3.

ExecutorService provides two groups of operations on top of the thread pool. One group is used to terminate thread execution. Here are the main players from this category:

  • shutdown: The thread pool doesn’t accept new tasks for execution.
  • isShutdown: Indicates whether the thread pool is scheduled for shutdown (some threads can still be running).
  • isTerminated: Indicates whether the thread pool has terminated (all tasks have finished after shutdown).
  • awaitTermination: Blocks the current thread until all tasks are finished after a shutdown request, or until the given time-out elapses.
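The four operations above can be observed together in a small sketch. The class name, the pool size of two, and the 200-millisecond task are assumptions made only for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

class ShutdownSketch {

    // Returns a short trace of the pool state around an orderly shutdown.
    static String shutdownTrace() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        StringBuilder trace = new StringBuilder();

        // One task that is still running when shutdown is requested.
        pool.execute(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        pool.shutdown(); // no new tasks; already submitted tasks continue
        trace.append("isShutdown=").append(pool.isShutdown());

        try {
            pool.execute(() -> { });
            trace.append(", late task accepted");
        } catch (RejectedExecutionException e) {
            trace.append(", late task rejected");
        }

        try {
            // Blocks the caller until the pool is done or 5 seconds pass.
            boolean finished = pool.awaitTermination(5, TimeUnit.SECONDS);
            trace.append(", finishedInTime=").append(finished);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        trace.append(", isTerminated=").append(pool.isTerminated());
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(shutdownTrace());
    }
}
```

The rejection of the late task relies on the default rejection policy of the pools created by Executors, which throws RejectedExecutionException.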

The second group of ExecutorService methods is designed for submitting tasks for execution. As you can see from the declaration, they have complicated-looking signatures. Newcomers are java.util.concurrent.Future<V> and java.util.concurrent.Callable<V>. They are designed for use cases that java.lang.Runnable can’t handle, such as returning a value from a thread and propagating exceptions into the caller thread.

As its name suggests, Future wraps a value returned later by the concurrent logic. The value type is specified by generic type V. It will be filled in the future by a forked thread. The caller thread can grab it by calling Future.get(). But the caller thread will be blocked until the value is filled by the forked thread. Optionally, we can pass a time-out to Future.get(long timeout, TimeUnit unit) when it’s not suitable to block the caller for a long time.

java.util.concurrent.Callable is the interface designed for task declaration, similar to java.lang.Runnable. But it provides more possibilities than java.lang.Runnable, because its method can return a value and throw an exception. See Listing 1-4.
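To make the interplay of Callable and Future more tangible, here is a minimal sketch. The class name, the helper method, and the time-out values are hypothetical. It shows the three outcomes a caller can observe: a returned value, an exception carried over from the task, and a time-out.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class CallableSketch {

    // Runs the task on its own single-thread pool and describes the outcome.
    static String describe(Callable<String> task, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = pool.submit(task);
            // Blocks the caller until the value is available or the time-out elapses.
            return "value: " + future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            // The exception thrown inside call() is wrapped as the cause.
            return "failed: " + e.getCause().getMessage();
        } catch (TimeoutException e) {
            return "timed out";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            pool.shutdownNow(); // also interrupts a task that is still running
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(() -> "done", 5000));
        System.out.println(describe(() -> {
            throw new UnsupportedOperationException("simulated error");
        }, 5000));
        System.out.println(describe(() -> {
            Thread.sleep(2000); // call() may throw checked exceptions
            return "late";
        }, 100));
    }
}
```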

The thread-pool abstraction java.util.concurrent.ExecutorService has various implementations. Most notable is the class java.util.concurrent.ThreadPoolExecutor, which has various constructors to configure its properties. Thread-pool properties can be used for fine-grained tuning of thread pools. The most important properties of java.util.concurrent.ThreadPoolExecutor are as follows:

  • corePoolSize: The number of threads to keep in the pool
  • maximumPoolSize: The maximum number of threads to allow to run concurrently
  • keepAliveTime with unit: The keep-alive time-out for threads that exceed corePoolSize
  • workQueue: The queue that holds submitted tasks waiting for execution when there aren’t enough worker threads in the pool
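As a sketch of how these properties map onto a constructor call (the concrete numbers and the bounded queue are arbitrary choices for illustration, not recommendations), a pool could be created like this:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ThreadPoolConfigSketch {

    // Creates a pool with explicitly chosen properties.
    static ThreadPoolExecutor newBoundedPool() {
        return new ThreadPoolExecutor(
                2,                             // corePoolSize
                4,                             // maximumPoolSize
                30, TimeUnit.SECONDS,          // keepAliveTime with unit
                new ArrayBlockingQueue<>(10)); // workQueue holding up to 10 waiting tasks
    }

    // Reads the configuration back from the pool.
    static String describe(ThreadPoolExecutor pool) {
        return "core=" + pool.getCorePoolSize()
                + ", max=" + pool.getMaximumPoolSize()
                + ", keepAlive=" + pool.getKeepAliveTime(TimeUnit.SECONDS) + "s"
                + ", queueCapacity=" + pool.getQueue().remainingCapacity();
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newBoundedPool();
        System.out.println(describe(pool));
        pool.shutdown();
    }
}
```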

We can create thread pools directly via java.util.concurrent.ThreadPoolExecutor constructors, but a more convenient way is to create them via the factory class java.util.concurrent.Executors, as shown in Listing 1-5.

Executors contains many more factory methods that enable powerful control over your thread pools, but in most cases the preceding methods should be enough. The first three factory methods use java.util.concurrent.ThreadPoolExecutor as a thread-pool implementation. Each factory creates it with different constructor parameters.

newFixedThreadPool(int nThreads) creates a thread pool of a fixed size. The desired size of the pool is passed into this factory method as a parameter. This pool starts executing a submitted task immediately, on a new thread, if the nThreads limit has not been reached. When the pool contains nThreads busy threads, any subsequent tasks submitted for execution wait in the queue until one of the running threads finishes its work or is interrupted. So when an unexpected peak occurs, tasks are queued rather than overloading the whole application.

newSingleThreadExecutor() creates a thread pool with only one thread. If this single worker thread is currently running, subsequent submitted tasks will be queued.

newCachedThreadPool() creates a thread pool that reuses finished threads. It creates new ones only when there isn’t an idle one in the pool. This pool is handy if we need to serve a lot of short tasks. Because creation of a thread is a CPU- and memory-intensive operation, we may benefit from reusing the same thread instances.

The last two factory methods shown in Listing 1-5 were introduced in Java SE 8. They use java.util.concurrent.ForkJoinPool as their underlying thread-pool implementation. Introduced in Java SE 7, this implementation uses a unique algorithm of CPU load distribution between threads called work-stealing. Idle threads can “steal” work from busy threads rather than waiting silently.

Executors.newWorkStealingPool() creates a thread pool with a size equal to the number of cores on the machine it’s currently running on. Thus it tries to use CPU power most effectively, because having more running threads than cores leads to frequent context switching, which isn’t a resource-free operation. But this type of thread pool isn’t suitable for I/O-bound processing. The reason is related to the number of threads again. If threads are mostly waiting on I/O operations and there aren’t other threads in the pool able to keep CPU cores busy, a lot of CPU cycles are wasted on waiting. A much better fit for such I/O-bound operations is a fixed thread pool with a number of threads bigger than the number of available cores, or a cached thread pool.

Executors.newWorkStealingPool(int parallelism) creates a pool that doesn’t take into account the number of cores on the current machine, but rather optimizes the pool to the given parallelism level. So the pool size is optimized toward the number of threads specified by parallelism. Unless you are a skilled user of ForkJoinPool, use the default parallelism, equal to the number of available cores (Executors.newWorkStealingPool() without the parallelism parameter).
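The factory methods discussed above can be compared with a short sketch. The class name and the describe helper are hypothetical, and the casts rely on what the OpenJDK implementation happens to return, which the ExecutorService return type does not guarantee.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ThreadPoolExecutor;

class ExecutorsFactorySketch {

    // Summarizes the pool behind an ExecutorService, where it can be inspected.
    static String describe(ExecutorService pool) {
        if (pool instanceof ThreadPoolExecutor) {
            ThreadPoolExecutor tpe = (ThreadPoolExecutor) pool;
            return "ThreadPoolExecutor core=" + tpe.getCorePoolSize()
                    + " max=" + tpe.getMaximumPoolSize();
        }
        if (pool instanceof ForkJoinPool) {
            return "ForkJoinPool parallelism=" + ((ForkJoinPool) pool).getParallelism();
        }
        // newSingleThreadExecutor() returns a wrapper that hides its settings.
        return pool.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        ExecutorService[] pools = {
            Executors.newFixedThreadPool(3),
            Executors.newSingleThreadExecutor(),
            Executors.newCachedThreadPool(),
            Executors.newWorkStealingPool(),   // parallelism = available cores
            Executors.newWorkStealingPool(2)   // explicit parallelism
        };
        for (ExecutorService pool : pools) {
            System.out.println(describe(pool));
            pool.shutdown();
        }
    }
}
```

The cached pool reports a core size of zero and a practically unlimited maximum, which matches its behavior of creating threads on demand and reusing idle ones.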

Note: The fork/join algorithm is beyond the scope of this book. To become familiar with this algorithm’s special features, the Oracle tutorial for fork/join pools provides a good explanation: http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html. Javadoc documentation for java.util.concurrent.ForkJoinPool and java.util.concurrent.ForkJoinTask provides a detailed explanation of why this type of pool isn’t suitable for I/O-bound operations.

The parameters of a thread pool (especially its size) can significantly improve or diminish performance of parallel processing. It is important to understand the nature of concurrent tasks processed by a thread pool. Various considerations affect thread-pool configuration. For example, a developer needs to understand whether parallel tasks are CPU bound, I/O bound, or mixed.

If we are dealing with CPU-bound operations, we want to keep the number of threads as close to the number of cores on the machine as possible. This avoids frequent switching between threads causing unnecessary overhead when there is enough processing to keep the CPU busy.

If we are processing I/O-intensive operations, we generally want to have a larger number of threads available, because we don’t want the CPU to waste cycles on blocking I/O operations. But the number of threads can’t be too large, because the overhead of switching threads can decrease performance. Balance needs to be found.

Balance needs to be found also for mixed types of processing. There isn’t any guidance for tuning configuration when I/O- and CPU-intensive operations are mixed within one thread pool. A developer may consider splitting the tasks across separate thread pools if it helps. But such splitting doesn’t make sense for ForkJoinPool, because it is optimized toward the number of cores on the current machine. Therefore, we likely want to have only one instance of it reserved for processing CPU-intensive operations.

Guessing the right thread-pool configuration up front is unreliable. It needs to be backed up by performance testing, or at least by benchmarking or micro-benchmarking of certain modules. Performance testing with the environment as close as possible to that experienced in production, and with a similar load as expected in production, will lead to more accurate results than benchmarking modules in isolation.

Note: If you want to dive deeper into thread-pool sizing alongside other crucial aspects of thread pools, this article by Brian Goetz provides a short, decent introduction to this complicated topic: www.ibm.com/developerworks/library/j-jtp0730/index.html.
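As a starting point for such measurements, a commonly cited rule of thumb (an assumption here, not something this chapter prescribes) sizes a pool at roughly N × (1 + WT/ST) threads, where N is the number of cores, WT is the average time a task spends waiting, and ST is the average time it spends computing. The sketch below only evaluates this formula; the class and method names are hypothetical, and the result should be treated as a first guess to validate by testing.

```java
class PoolSizingSketch {

    // Rule-of-thumb estimate: cores * (1 + waitTime / serviceTime).
    // waitTime and serviceTime must use the same unit (for example, milliseconds).
    static int suggestedThreads(int cores, double waitTime, double serviceTime) {
        if (cores < 1 || waitTime < 0 || serviceTime <= 0) {
            throw new IllegalArgumentException("cores >= 1, waitTime >= 0, serviceTime > 0 required");
        }
        long threads = Math.round(cores * (1 + waitTime / serviceTime));
        return (int) Math.max(1, threads);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // CPU-bound work: no waiting, so the estimate equals the core count.
        System.out.println("CPU-bound: " + suggestedThreads(cores, 0, 10));
        // I/O-bound work: 90 ms waiting for every 10 ms of computation.
        System.out.println("I/O-bound: " + suggestedThreads(cores, 90, 10));
    }
}
```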

Implementations of the java.util.concurrent.Executor interface effectively made the java.lang.Thread class obsolete, because they abstract out the need for low-level thread handling and provide the same flexibility.

After all this theory, it’s time to show an example. See Listing 1-6.

The class SimpleTask is our logic that should be executed concurrently. It implements the java.util.concurrent.Callable interface with the generic type String to specify a return value type passed from the concurrent logic to the caller thread. It also contains various statements that highlight these features of the java.util.concurrent package:

  • Generates a random value to simulate an error in the form of UnsupportedOperationException thrown from the concurrent logic.
  • Simulates a blocking I/O operation with Thread.sleep(SIMULATE_IO) call. It blocks concurrent logic for 1 second.
  • Returns a String value in the case of success.

But notice that this mechanism doesn’t allow asynchronous code to accept parameters. SimpleTask is called from the class shown in Listing 1-7.

This class takes two properties via the constructor. First, executorService represents the thread pool used for managing concurrent execution. We are using an instance of java.util.concurrent.ExecutorService, so we can execute this concurrent processing against different thread pools. The second constructor parameter specifies the number of tasks to execute on the thread pool.

The concurrent execution itself happens in the executeTasks() method. At the beginning, it creates a collection of Future instances to gather results from concurrent tasks. After that, it saves the execution start time into the variable start. Subsequently, in the for loop, it submits one SimpleTask after another to the thread pool for execution. Future results are gathered in the results collection. At this stage, the thread pool starts executing tasks in parallel.

When all the tasks are submitted for execution and our work is being processed in the background, two commands are included that can be used to shut down the thread pool politely (without interrupting the currently running threads). After calling executorService.shutdown(), the thread pool rejects further task submissions. The awaitTermination call then blocks the current thread until the last task on the thread pool finishes its execution.

Finally, we loop through results of concurrent task executions and log them to the console. Each result is of type java.util.concurrent.Future; therefore, the call result.get() waits for the thread to finish, if it’s still running. This is not our case because we already did that to all threads by calling awaitTermination. You might notice that the behavior would be practically the same without the awaitTermination call, because the reading of results would wait for threads also.

Calling result.get() can carry exceptions from parallel logic; therefore, we catch them and log the error message to the console. The ability to return values and bubble up exceptions from concurrent logic is a big advantage introduced in Java SE 5 with the java.util.concurrent package.
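The submit-then-collect pattern described above can be sketched as follows. This is not the chapter’s listing: the class name is hypothetical, and the tasks fail deterministically for odd IDs instead of randomly, so that the outcome is reproducible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CollectResultsSketch {

    // Submits taskCount tasks, then gathers their values or error messages in order.
    static List<String> runAll(ExecutorService pool, int taskCount) {
        List<Future<String>> futures = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            final int id = i;
            Callable<String> task = () -> {
                if (id % 2 == 1) { // simulated failure for odd IDs
                    throw new UnsupportedOperationException("task " + id + " failed");
                }
                return "task " + id + " ok";
            };
            futures.add(pool.submit(task));
        }
        pool.shutdown(); // no further submissions; running tasks continue

        List<String> outcomes = new ArrayList<>();
        for (Future<String> future : futures) {
            try {
                outcomes.add(future.get()); // waits if the task is still running
            } catch (ExecutionException e) {
                outcomes.add("error: " + e.getCause().getMessage());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                outcomes.add("interrupted");
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        runAll(Executors.newFixedThreadPool(2), 4).forEach(System.out::println);
    }
}
```

The sketch leaves out awaitTermination and relies on Future.get() to wait for each task, which is the equivalence the text points out.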

Last, we write to the console the elapsed time of parallel processing. This allows us to compare various thread-pool implementations. The remaining class is the main Java concurrency example highlighted in Listing 1-8.

The main class specifies the number of tasks to execute as 10. The main method first creates a fixed thread pool with 10 fixed threads. Then we create SimpleTaskExecutor to execute 10 tasks on this thread pool. Finally, we again execute 10 tasks, but this time on ForkJoinPool. When we run this small application, the console output may look like Listing 1-9.

Notice the big difference in the elapsed times of the same concurrent processing on different thread pools. As you may remember, our concurrent task (Listing 1-6) contains a delay, simulating a blocking I/O operation 1,000 milliseconds long. The fixed thread pool (with pool size 10) is able to execute 10 given tasks in parallel and straight away after submission. So after the blocking I/O delay of 1 second, all threads finish, and the whole parallel execution takes only a little bit more than 1 second. In fact, the CPU was waiting mostly for the blocking I/O operation to finish.

The second pool, ForkJoinPool, uses only four threads for execution, because the machine it was executed on has four CPU cores. So after four threads are already running, further submitted tasks wait in the queue for a free thread-pool worker. At any one time, only four tasks can be served. So simple math explains why the second parallel execution on ForkJoinPool took 3 seconds: 10 tasks on four workers need three rounds of 1 second each.
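The arithmetic can be checked with a small experiment. To keep the result independent of the core count of the machine, this sketch uses fixed pools of 10 and 4 threads rather than ForkJoinPool; that substitution, the class name, and the method names are assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolSizeTimingSketch {

    // Number of sequential rounds needed: ceil(taskCount / poolSize).
    static int expectedBatches(int poolSize, int taskCount) {
        return (taskCount + poolSize - 1) / poolSize;
    }

    // Runs taskCount sleeping tasks on a fixed pool and measures the total time.
    static long elapsedMillis(int poolSize, int taskCount, long sleepMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        long start = System.nanoTime();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(sleepMillis); // simulated blocking I/O
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        for (int poolSize : new int[] {10, 4}) {
            System.out.println("Pool size " + poolSize
                    + ": expected rounds=" + expectedBatches(poolSize, 10)
                    + ", elapsed ms=" + elapsedMillis(poolSize, 10, 1000));
        }
    }
}
```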

Such a scenario is an example of how important it is to understand the nature of tasks being executed by our thread pool. Generally, we want to increase thread-pool size if I/O-bound operations are being processed (but not too much). Or we can align the thread-pool size as close as possible to the available core count in the case of CPU-bound operations. But this isn’t a general rule for every case. Performance testing or benchmarking should apply. Bear in mind that you can have various thread pools for various purposes in your application, so it’s crucial to understand the behavior of your whole application under load.

Java EE

In the Java Platform, Enterprise Edition (Java EE), use of Java SE concurrency APIs is considered bad practice. Until Java EE 7, enterprise developers couldn’t use any specific concurrent API that would be managed by the container (concurrency should be handled exclusively by the application container). Because Java EE standards didn’t provide any concurrency support, various mechanisms specific to application containers were created:

  • CommonJ
    • WebSphere
    • WebLogic
  • Java EE Connector Architecture (JCA) work managers
    • JBoss (now WildFly)
    • GlassFish

Java EE 7 finally introduced Concurrency Utilities that allowed application developers to use managed concurrency APIs.

Note: Java EE 7 Concurrency Utilities are beyond the scope of this book. Please refer to the Java EE 7 tutorial for more information: http://docs.oracle.com/javaee/7/tutorial/concurrency-utilities.htm#GKJIQ8.

Task Scheduling

Some use cases require scheduling of tasks for later execution. For this purpose, Java has the interface java.util.concurrent.ScheduledExecutorService, shown in Listing 1-10.

The important mechanisms are the last two methods. They facilitate recurring scheduling, whereby a task is triggered at certain time intervals. This is most commonly used for batch processing of jobs that doesn’t involve user interaction. Two important scheduling configurations are available:

  • Fixed Delay: Specifies the time between the end of the previous execution and the start of the next one (think of it as a constant gap between scheduled executions). The next start time is hard to predict because it depends on the end time of the previous execution.
  • Fixed Rate: Specifies the interval between the start times of consecutive executions, so the gap between the end of one execution and the start of the next varies. If an execution takes longer than the scheduled period, the start of the next execution is delayed until the previous one finishes. So executions of the same task never overlap.

With this support, we can schedule implementations of the java.lang.Runnable and java.util.concurrent.Callable<V> interfaces. But notice that we are not able to schedule java.util.concurrent.Callable<V> for recurring scheduling.
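The difference between the two configurations can be made visible by recording when each execution starts. In the following sketch, the class name, the helper method, and the chosen intervals are hypothetical; the exact offsets printed will vary from run to run.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class RateVersusDelaySketch {

    // Returns the start offsets (ms since scheduling) of the first `runs` executions.
    static List<Long> startOffsets(boolean fixedRate, long intervalMillis,
                                   long workMillis, int runs) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        List<Long> offsets = Collections.synchronizedList(new ArrayList<Long>());
        CountDownLatch done = new CountDownLatch(runs);
        long origin = System.nanoTime();

        Runnable task = () -> {
            if (done.getCount() == 0) {
                return; // enough samples collected; ignore any extra run
            }
            offsets.add(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - origin));
            try {
                Thread.sleep(workMillis); // simulated work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            done.countDown();
        };

        if (fixedRate) {
            scheduler.scheduleAtFixedRate(task, 0, intervalMillis, TimeUnit.MILLISECONDS);
        } else {
            scheduler.scheduleWithFixedDelay(task, 0, intervalMillis, TimeUnit.MILLISECONDS);
        }
        try {
            done.await(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        scheduler.shutdownNow(); // stop the recurring schedule
        return new ArrayList<>(offsets);
    }

    public static void main(String[] args) {
        // 500 ms interval, 200 ms of work per execution, first 4 executions.
        System.out.println("Fixed rate starts:  " + startOffsets(true, 500, 200, 4));
        System.out.println("Fixed delay starts: " + startOffsets(false, 500, 200, 4));
    }
}
```

With a fixed rate, consecutive starts should be about one interval apart; with a fixed delay, they should be about one interval plus the work time apart.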

Implementation of a scheduled thread pool is java.util.concurrent.ScheduledThreadPoolExecutor. It can be created via its constructor or via a factory method provided by the class java.util.concurrent.Executors. The signature of this factory method is shown in Listing 1-11.

Let’s explore how we would use this mechanism. Listing 1-12 highlights a simple task as an implementation of the java.lang.Runnable interface.

This task simply outputs the time when it was started to the console. Listing 1-13 shows how we schedule this task.

ScheduledThreadPoolExecutor with only one worker thread is created via the Executors.newScheduledThreadPool(1) factory method call. Subsequently, a SimpleTask instance is scheduled at a fixed rate of 1 second with no initial delay. If we run the scheduleTask() method of SimpleScheduler, console output would look like Listing 1-14.

SimpleTask outputs the current time to the console every second. Because we are using recurring scheduling in this example, this execution would continue further—potentially until the JVM of our application would be stopped forcibly.
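If the recurring execution should end without stopping the JVM, the handle returned by the scheduling call can be used to cancel it. The sketch below is one possible way to do that; the class name, the method name, and the use of a latch to wait for a few executions are assumptions for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class StopRecurringTaskSketch {

    // Lets a fixed-rate task run at least `runs` times, then cancels it.
    // Returns the number of executions that actually happened.
    static int runThenCancel(int runs, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        AtomicInteger executions = new AtomicInteger();
        CountDownLatch enough = new CountDownLatch(runs);

        ScheduledFuture<?> handle = scheduler.scheduleAtFixedRate(() -> {
            executions.incrementAndGet();
            enough.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);

        try {
            enough.await(30, TimeUnit.SECONDS);
            handle.cancel(false);  // no further executions; do not interrupt a running one
            scheduler.shutdown();  // release the worker thread
            scheduler.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return executions.get();
    }

    public static void main(String[] args) {
        System.out.println("Executions before cancel: " + runThenCancel(3, 1000));
    }
}
```

The returned count can be slightly higher than requested, because one more execution may start between the last countdown and the cancel call.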

But such limited scheduling often isn’t suitable for enterprise applications. Fortunately, a popular library called Quartz Scheduler is commonly used for complex scheduling. It also provides enterprise features such as distributed transaction propagation, job persistence, and clustering features. But most notable are these scheduling features:

  • Indicate a certain time of the day (with millisecond precision)
  • Indicate certain days of the week/month/year
  • Exclude days specified by a given calendar (useful for excluding public holidays)
  • Specify an explicit number of executions
  • Repeat until date/time

Multithreading with Spring

You might be wondering where Spring comes into this game. Spring is a well-known provider of fine-grained abstractions on top of common Java technologies. It removes a lot of boilerplate code by providing three main abstractions used for executing and scheduling tasks:

  • TaskExecutor
  • Trigger
  • TaskScheduler

org.springframework.core.task.TaskExecutor

The purpose of this interface is to provide an abstraction for executing asynchronous tasks. It extends the java.util.concurrent.Executor interface, but doesn’t add anything to the API contract of the pure Java SE abstraction. See Listing 1-15.

TaskExecutor abstracts away the Java SE 5 Executor interface so it can be used as a superinterface for non-Java SE 5 thread pools. TaskExecutor is an umbrella for these thread executors:

  • SimpleAsyncTaskExecutor: Creates a new thread for each task. Supports a limited number of threads (unlimited by default). When this limit is reached, execution of subsequent tasks is blocked until some of the threads finish execution.
  • SyncTaskExecutor: Task execution is performed synchronously in the calling thread. Mainly intended for testing purposes.
  • ConcurrentTaskExecutor: Wraps the provided java.util.concurrent.Executor and exposes it as TaskExecutor. It is useful when the developer wants to create a custom Java SE 5 thread pool and use it in a Spring environment. Not very common. In most cases, you would probably want to use ThreadPoolTaskExecutor.
  • SimpleThreadPoolTaskExecutor: Subclass of org.quartz.simpl.SimpleThreadPool (Quartz Scheduler’s implementation of a thread pool). It is useful because it can understand Spring’s life-cycle callbacks.
  • ThreadPoolTaskExecutor: Wrapper for java.util.concurrent.ThreadPoolExecutor. Enables you to create configurable thread pools with parameters such as corePoolSize, maxPoolSize, keepAliveSeconds, and queueCapacity.

There are also some Java EE implementations of TaskExecutor. These implementations act as a bridge between the Spring environment and multithreading support of the enterprise application container where the configuration of thread pools is located:

  • org.springframework.jca.work.WorkManagerTaskExecutor: Adapter for the JCA 1.5 javax.resource.spi.work.WorkManager interface.
  • org.springframework.scheduling.commonj.WorkManagerTaskExecutor: Adapter for the CommonJ commonj.work.WorkManager interface.
  • org.springframework.jca.work.glassfish.GlassFishWorkManagerTaskExecutor: Adapter for the GlassFish JCA WorkManager.
  • org.springframework.jca.work.jboss.JBossWorkManagerTaskExecutor: Adapter for the JBoss JCA WorkManager.

All these task executors can be declared as standard Spring beans via the @Bean annotation or the <bean/> tag. An example is shown later in this chapter, in the “Configuring Asynchronous Tasks” section.
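As a hedged configuration fragment, a ThreadPoolTaskExecutor could be declared as a bean like this. The configuration class name, the bean name ioTaskExecutor, the thread-name prefix, and all numeric values are hypothetical; the code assumes a Spring Framework dependency on the classpath.

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.TaskExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class ExecutorConfigurationSketch {

    // Hypothetical bean; the values are placeholders, not recommendations.
    @Bean
    public TaskExecutor ioTaskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(4);       // threads kept in the pool
        executor.setMaxPoolSize(8);        // upper limit once the queue is full
        executor.setKeepAliveSeconds(30);  // idle time-out for extra threads
        executor.setQueueCapacity(100);    // waiting tasks before the pool grows
        executor.setThreadNamePrefix("io-");
        return executor;                   // Spring initializes the bean itself
    }
}
```

Declaring the bean with the TaskExecutor type keeps the rest of the application independent of the concrete pool implementation.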

org.springframework.scheduling.Trigger

The Trigger interface, shown in Listing 1-16, is designed for determining the next execution time based on a previous execution. It is used by Spring for scheduling.

It has two implementations:

  • CronTrigger: Allows triggering based on Cron expressions.
  • PeriodicTrigger: Allows you to specify Fixed Rate or Fixed Delay triggers. For switching between them, it exposes a flag via PeriodicTrigger.setFixedRate(boolean fixedRate). The default triggering is Fixed Delay.

Note: Cron is used within Unix systems for job scheduling. A Cron expression has six parts separated by spaces. In order, the parts specify the minute, hour, day of month, month, day of week, and year. The year is not mandatory, so commonly only five parts are used. But the shortest interval that the original Cron syntax can express is 1 minute. Therefore, Spring uses a Quartz Scheduler style of Cron syntax, which adds a first part for seconds. Quartz accepts up to seven parts, with the year again optional; Spring’s CronTrigger expects the six parts from seconds through day of week. With powerful placeholders for each part, it is possible to specify complicated patterns for scheduling (for example, 0 */10 9-17 * * MON-FRI represents triggering every 10 minutes during work hours on weekdays). Understanding Cron expressions isn’t in the scope of this book, but it’s important to know that Spring integrates this powerful triggering mechanism. For more information about the Quartz flavor of Cron expressions, go to www.quartz-scheduler.org/documentation/quartz-2.1.x/tutorials/crontrigger.

Task Scheduler

org.springframework.scheduling.TaskScheduler, shown in Listing 1-17, is Spring’s abstraction for scheduling tasks.

As you can see from the interface declaration, Spring can schedule one-time or recurring jobs based on a fixed rate and fixed delay approach. It can also use the Trigger interface for complex Cron-based scheduling.
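To give a rough idea of how a TaskScheduler and a Trigger work together, the following sketch schedules a task with a Cron expression outside of a Spring context. The class name and the every-two-seconds expression are hypothetical, and the code assumes a Spring Framework dependency on the classpath; it is an illustration rather than the configuration style used later in this chapter.

```java
import java.time.LocalTime;

import org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler;
import org.springframework.scheduling.support.CronTrigger;

public class TaskSchedulerSketch {

    public static void main(String[] args) throws InterruptedException {
        // ThreadPoolTaskScheduler is one TaskScheduler implementation.
        ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
        scheduler.setPoolSize(1);
        scheduler.initialize(); // needed when the scheduler is not a Spring bean

        // Six-part Spring Cron expression: every 2 seconds.
        scheduler.schedule(
                () -> System.out.println("Triggered at " + LocalTime.now()),
                new CronTrigger("*/2 * * * * *"));

        Thread.sleep(7000);   // let the task fire a few times
        scheduler.shutdown(); // stop scheduling and release the thread
    }
}
```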

Although it’s important to remember how TaskExecutor and Trigger interfaces work, we don’t need to use them directly. Spring provides powerful annotations and XML tags for configuring multithreading and scheduling behavior. So TaskExecutor and Trigger are used under the hood.

Configuring Asynchronous Tasks

For enabling multithreading support, Spring provides the class-level annotation @EnableAsync. A class annotated with @EnableAsync should be used in conjunction with the @Configuration annotation. So the @EnableAsync annotation basically switches on Spring asynchronous support.

The @Async method-level annotation is designed to define asynchronous logic. Such tasks will be executed in a separate thread. This annotation provides much wider possibilities when the developer designs asynchronous APIs. Earlier sections about the Callable<T> interface explained that Java SE APIs can return a value and bubble up exceptions from asynchronous code. Passing parameters into asynchronous logic is a unique feature of the @Async annotation that Spring provides. But methods annotated by @Async have two important limitations:

  • The method must return void or Future<T> (similar to Callable<T>).
  • The method can’t be used with life-cycle callbacks (for example, @PostConstruct).

@Async annotation has one optional parameter, which specifies the name of the task executor bean used for executing asynchronous logic. Let’s explore use of these annotations in an example. Listing 1-18 shows logic that will be executed in a separate thread.

The class AsyncTask is annotated with the @Component annotation, which specifies that the AsyncTask class is a candidate for component scanning by Spring. This class has only one method, call. First it reads the name of the current thread. Then it simulates a blocking I/O operation, which is 1 second long. Finally, it simulates an error if the given parameter is odd, or returns the name of the thread if the given parameter is even. This behavior is in place to highlight possibilities of this API.

What makes this method special is the @Async annotation: the method will be executed in a separate thread. Notice that it can take any number of parameters, as opposed to the Callable<T> interface. To be able to return a value into the caller thread, it wraps the result into AsyncResult<String>. AsyncResult<T> is Spring’s implementation of the Future interface.

Listing 1-19 shows how we can call asynchronous logic.

The AsyncTask bean is injected into the Caller class via constructor injection. The method kickOffAsyncTasks executes this asynchronous logic. It first takes the number of tasks to execute as a parameter and allocates the collection for gathering Future<String> results. We track the start and end time of execution and write it to the console. The execution of each thread is started by calling results.add(asyncTask.call(idx)), which calls the asynchronous method, passes a parameter to it, and stores the Future<String> object for retrieving return values later. An interesting aspect is the ability to pass parameters to asynchronous logic. It is possible because Spring’s proxy mechanism for AsyncTask handles the multithreading behavior and propagates the parameters. By contrast, the multithreading example in Listing 1-6 is based on java.util.concurrent.Callable<T>, whose call() method takes no parameters, so values can’t be passed to the thread directly. This is a big advantage of Spring’s multithreading support.

After the first for loop, all threads are submitted for execution. Finally, we loop through the results collection and output return values to the console. Calling result.get() would wait for the thread to finish its job. In addition, a try-catch block handles errors propagated from asynchronous logic.

You might ask where and how the thread pool is involved in this setup. Let’s explore it in Listing 1-20.

This Spring configuration scans the package for Spring beans where the Caller class is located. So with Caller, the AsyncTask bean is registered into the Spring IoC container. The first method, customTaskExecutor, registers the thread-pool bean into the context. java.util.concurrent.Executor is used as a bean type to allow for variability of thread-pool implementation. So it’s obvious that we can use any of the executors that Spring provides as well as the standard Java SE java.util.concurrent thread-pool implementations. In this case, java.util.concurrent.ForkJoinPool is created with the help of the java.util.concurrent.Executors factory class. This pool creates a number of worker threads equal to the number of cores of the machine it’s running on (which is four in this example).

This bean is registered with the name customTaskExecutor; Spring uses the name of the bean registration method as the bean name when it isn’t configured explicitly. Notice that this name is the same as the parameter of the @Async annotation in the class AsyncTask in Listing 1-18. So this bean will be used for executing the AsyncTask.call method.

The second method in this class is main. The signature of this method should be familiar to every Java developer. Yes, it’s the famous main method that turns this class into a command-line utility. The array args can take arguments from the command line, but we don’t use this feature in this case. What happens in the body of this method is interesting. It uses a feature of a relatively new project from the Spring portfolio called Spring Boot. We can think of the Spring Boot project as a convention-over-configuration wrapper for the Spring Framework. It can significantly simplify and reduce the Spring configuration needed for building modern Java applications, but at the same time it doesn’t reduce the variability of Spring configuration features. Starting a new Spring project without it wouldn’t make sense nowadays.

Note  The Spring Boot project is beyond the scope of this book, because it’s not part of Enterprise Integration with Spring certification. If you want to learn more about this useful project, take a look at the “Getting Started” section of the Spring Boot reference documentation: http://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#getting-started.

SpringApplication.run will kick off the application with the given Spring configuration and command-line arguments. It is used to simply bootstrap our asynchronous application. This method returns the created Spring context instance, which is used to find the Caller bean and explicitly execute asynchronous code. The possible console output after running this application is shown in Listing 1-21.

Notice we used java.util.concurrent.ForkJoinPool as a thread-pool implementation. It uses parallelism equal to the number of CPU cores provided. We can think about parallelism as the core size of the pool. So this thread pool will have four worker threads, because four cores are available on the machine. We can find this fact in Listing 1-21. Fork/join pool worker threads are named with the suffix -worker-X, where X specifies the ID of the thread.
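The relationship between the fork/join pool, the core count, and the worker thread names can be checked with a few lines of plain Java. The sketch below assumes that Executors.newWorkStealingPool() is the factory method in question and that it returns a ForkJoinPool, which holds for OpenJDK but is not promised by the Javadoc, hence the instanceof check. The class and method names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;

class WorkStealingPoolInfo {

    // Returns the pool's parallelism, or -1 if the pool is not a ForkJoinPool.
    static int parallelism() {
        ExecutorService pool = Executors.newWorkStealingPool();
        try {
            return pool instanceof ForkJoinPool
                    ? ((ForkJoinPool) pool).getParallelism()
                    : -1;
        } finally {
            pool.shutdown();
        }
    }

    // Runs one task in the pool and reports the name of the thread that ran it.
    static String workerThreadName() {
        ExecutorService pool = Executors.newWorkStealingPool();
        try {
            Callable<String> whoAmI = () -> Thread.currentThread().getName();
            return pool.submit(whoAmI).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("cores:       " + Runtime.getRuntime().availableProcessors());
        System.out.println("parallelism: " + parallelism());
        System.out.println("worker name: " + workerThreadName());
    }
}
```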

An interesting statistic is the elapsed time of the whole execution. It took more than 3 seconds, although the simulated blocking I/O operation in AsyncTask from Listing 1-18 takes only 1 second. Why does the overall execution take 3 seconds? The reason, again, is that the wrong type of thread pool was used for this kind of asynchronous logic. With 10 tasks and only four worker threads, the 1-second tasks run in three waves (four, four, and two), so the whole run needs about 3 seconds. When we are dealing with blocking I/O operations, it is a good idea to make sure the size of the pool is bigger than the number of CPU cores, so the CPU doesn’t sit idle while threads wait for a slow blocking I/O operation. Let’s show what would happen if we use the thread pool from Listing 1-22.

This code uses Spring’s org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor implementation of the thread pool. We set the core pool size to 10, the same as the number of tasks being executed asynchronously. Listing 1-23 shows possible output for this thread-pool configuration.

Now the overall execution took much less time than with the fork/join pool. This clearly shows how crucial it is for the developer to understand asynchronous behavior and tune the thread-pool parameters accordingly.
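The effect of pool size on blocking work can be reproduced in isolation. The following sketch is an assumption-laden simplification: it swaps both the fork/join pool and ThreadPoolTaskExecutor for a plain fixed thread pool, simulates blocking I/O with Thread.sleep, and uses a 200-millisecond block instead of 1 second to keep the run short. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class BlockingPoolSizing {

    // Simulates a blocking I/O call.
    static void block(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs taskCount blocking tasks on a pool of poolSize threads
    // and returns the wall-clock time until all of them are done.
    static long elapsedMillis(int poolSize, int taskCount, long blockMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            long start = System.nanoTime();
            List<Future<?>> results = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                Runnable task = () -> block(blockMillis);
                results.add(pool.submit(task));
            }
            for (Future<?> result : results) {
                result.get(); // wait for completion
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("4 threads:  " + elapsedMillis(4, 10, 200) + " ms");
        System.out.println("10 threads: " + elapsedMillis(10, 10, 200) + " ms");
    }
}
```

With four threads the ten tasks need three waves, so the elapsed time is roughly three times the blocking time; with ten threads all tasks block at once and the run takes roughly one blocking period.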

XML Configuration for Task Execution

Spring’s Java-based configuration is a modern younger brother of XML configuration. If you need to use XML configuration for legacy reasons, Spring allows you to define asynchronous behavior with the task namespace. To replicate Listing 1-23 with XML configuration, use the XML in Listing 1-24.

Our AsyncTask and Caller beans are located in the package net.lkrnac.book.eiws.chapter01.async.task, where they are component scanned. So their @Component annotations are still needed for registration into the Spring IoC container. The <task:executor/> tag creates a ThreadPoolTaskExecutor instance as the thread-pool implementation. If we need to use a different implementation, it needs to be specified with a standard bean definition. The @Async annotation on AsyncTask no longer needs the custom task executor name as a parameter, because the thread-pool reference is configured by the executor parameter of the <task:annotation-driven/> tag. We use the customTaskExecutor instance defined by the <task:executor/> tag.

The <task:executor/> tag has various parameters:

  • pool-size: This optional parameter can have two forms:
    • A single value specifies the number of threads to keep in the pool, even if they are idle. In this form, the pool has a fixed size: the core pool size and the maximum pool size are the same.
    • A range of values, separated by a dash (for example, 5-25). The first value specifies the core pool size. The second value specifies the maximum pool size. If pool-size isn’t specified at all, the default core pool size is 1 and the default maximum pool size is Integer.MAX_VALUE.
  • queue-capacity: Specifies the capacity of ThreadPoolTaskExecutor’s BlockingQueue. The queue stores submitted tasks when there are no free threads in the pool. When the queue is full and the pool has already grown to its maximum size, the task executor rejects further task submissions. The default queue capacity is Integer.MAX_VALUE. This value is often not desirable, because an OutOfMemoryError can occur if a lot of tasks are queued. So it’s always a good idea to limit queue-capacity.
  • rejection-policy: Specifies the rejection behavior when the queue capacity is reached. Options are as follows.
    • ABORT: The thread pool throws TaskRejectedException when an additional task is submitted. This is the default behavior.
    • CALLER_RUNS: The rejected task is executed synchronously in the caller thread. This keeps the caller thread busy, so it can’t submit further tasks until the rejected one is done. In the meantime, some threads from the thread pool may finish and free some capacity.
    • DISCARD: The submitted task is silently discarded.
    • DISCARD_OLDEST: The task at the head of the queue (the one waiting longest) is discarded, and the submission of the new task is retried.
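The interplay of pool size, queue capacity, and rejection policy can be observed with java.util.concurrent.ThreadPoolExecutor, the Java SE class whose parameters these settings correspond to. The sketch below is not Spring configuration. It assumes a pool of exactly one thread and a queue of capacity 1, so the third submission has nowhere to go, and it compares only the plain-Java counterparts of ABORT and CALLER_RUNS. In plain Java the abort policy throws RejectedExecutionException rather than Spring’s TaskRejectedException. The class and method names are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class RejectionPolicyDemo {

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Submits three tasks to a pool with one thread and a queue of one.
    // Reports what happened to the third task.
    static String submitThree(RejectedExecutionHandler policy) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1,                          // core and maximum pool size
                0L, TimeUnit.MILLISECONDS,     // keep-alive for extra threads
                new ArrayBlockingQueue<>(1),   // bounded queue (capacity 1)
                policy);
        CountDownLatch release = new CountDownLatch(1);
        String[] ranOn = new String[1];
        try {
            pool.execute(() -> await(release)); // occupies the only worker
            pool.execute(() -> { });            // fills the queue
            try {
                // No free thread, no queue slot: the policy decides.
                pool.execute(() -> ranOn[0] = Thread.currentThread().getName());
            } catch (RejectedExecutionException e) {
                return "rejected";
            }
            return ranOn[0] == null ? "queued or discarded" : "ran on " + ranOn[0];
        } finally {
            release.countDown(); // let the blocked worker finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(submitThree(new ThreadPoolExecutor.AbortPolicy()));
        System.out.println(submitThree(new ThreadPoolExecutor.CallerRunsPolicy()));
    }
}
```

With the caller-runs policy the third task executes on the submitting thread itself, which is why that policy slows down the producer instead of losing work.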

Configuring Scheduled Tasks

Scheduling configuration is enabled by the @EnableScheduling class-level annotation. It should be used with the @Configuration annotation. A class annotated with @EnableScheduling and @Configuration can be imported by the @Import annotation into another Spring configuration class. The second option is to component scan such an annotated class.

The @Scheduled method-level annotation defines logic that should be scheduled for recurring execution. The annotated method can’t take any parameters or return any value. The annotation can have these parameters:

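  • fixedDelay: Numeric value in milliseconds, measured from the end of one execution to the start of the next
  • fixedRate: Numeric value in milliseconds, measured between the starts of successive executions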
  • cron: String Cron expression
  • initialDelay: Delay first start in milliseconds

Exactly one of the parameters fixedDelay, fixedRate, or cron must be specified. Otherwise, registering such a bean into the Spring context would fail. The last parameter, initialDelay, is optional. It would be useful when we don’t want to schedule a first iteration immediately but rather want to defer this execution to a later time. Listing 1-25 shows an example of this annotation.

ScheduledTask is a Spring bean with one method annotated by the @Scheduled annotation. In this case, we use the fixed-rate scheduling method with 1-second intervals. This method simply prints the current time to the console. The Spring configuration looks like Listing 1-26.

This example uses the @EnableScheduling annotation to enable the scheduling feature. It also uses a component scan to register Spring beans from the current package and its subpackages. The last notable aspect of this class is usage of Spring Boot to simply execute this configuration as an executable JAR application. When we run this Java application, we should see output similar to Listing 1-27.

Our job is scheduled for execution each second. This application would have this behavior until we terminate it forcibly.
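The difference between fixed-rate and fixed-delay scheduling is easiest to see when the task itself takes noticeable time. The sketch below uses Java SE’s ScheduledExecutorService rather than Spring’s @Scheduled annotation, so it only illustrates the two timing semantics. It assumes a task that takes 80 milliseconds and an interval of 100 milliseconds; the class and method names are hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class RateVersusDelay {

    static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Schedules a task that takes taskMillis and counts how often it
    // starts within observeMillis.
    static int countRuns(boolean fixedRate, long intervalMillis,
                         long taskMillis, long observeMillis) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        Runnable task = () -> {
            runs.incrementAndGet();
            sleep(taskMillis);
        };
        if (fixedRate) {
            // Interval is measured from start to start.
            scheduler.scheduleAtFixedRate(
                    task, 0, intervalMillis, TimeUnit.MILLISECONDS);
        } else {
            // Interval is measured from the end of one run to the next start.
            scheduler.scheduleWithFixedDelay(
                    task, 0, intervalMillis, TimeUnit.MILLISECONDS);
        }
        sleep(observeMillis);
        scheduler.shutdownNow(); // the job would otherwise run forever
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("fixed rate:  " + countRuns(true, 100, 80, 1000));
        System.out.println("fixed delay: " + countRuns(false, 100, 80, 1000));
    }
}
```

In the same observation window, the fixed-rate variant starts more often, because the fixed-delay variant adds the task’s own duration to every cycle.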

XML Configuration for Scheduling

It is also possible to configure scheduling with XML by using the task namespace, which has various tags for scheduling support. The first one is <task:scheduler>, used to configure the ThreadPoolTaskScheduler thread pool that will be driving scheduling. <task:scheduler> has one mandatory parameter, id, which specifies the name of the scheduler bean and the prefix for worker thread names. The second parameter, pool-size, is optional and specifies the core size of the thread-pool implementation used for scheduling. When it’s not explicitly specified, the default value is 1, indicating that a single thread will be used to execute scheduled logic.

The second top-level element for scheduling support is <task:scheduled-tasks>. It can have one or more <task:scheduled> subelements, where each subelement specifies one scheduled task. <task:scheduled> can have one of these parameters:

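  • fixed-delay: Numeric value in milliseconds, measured from the end of one execution to the start of the next
  • fixed-rate: Numeric value in milliseconds, measured between the starts of successive executions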
  • cron: String Cron expression

One of them has to be specified. The next parameter is the optional initial-delay. It may be used to delay initial iteration by a given number of milliseconds. The last two parameters, ref and method, are mandatory. They specify a reference to a Spring bean and the name of the method that will be scheduled. This method has to be void and can’t accept parameters.

Also, the top-level element, <task:scheduled-tasks>, has one optional parameter, scheduler, used to define the scheduler bean reference. If it isn’t specified, a single-threaded executor is used for scheduling tasks defined as subelements. Listing 1-28 shows the configuration.

Summary

This chapter described Spring’s APIs for multithreading and scheduling and compared them with pure Java SE multithreading support. It is important to bear in mind that multithreading is a double-edged sword. Throwing a pool of threads at the problem can slow computation, because switching context between threads can become more expensive than the computation itself. This can happen especially when there are many more running threads than CPU cores in the system. In addition, the multithreaded logic could be too simple to be worth the cost of context switching. So figuring out the correct level of parallelism isn’t a trivial job and requires a good understanding of the problem and the surrounding environment. Benchmarking and accurate performance testing are always a good idea when problems are being solved by multithreading.

Note  In addition to multithreading, you can use other approaches to solve problems of blocking I/O operations. But these are beyond the scope of this book. For more information, you can research Java Futures and Reactive Extensions (for example, see the RxJava library at https://github.com/ReactiveX/RxJava) and Actor systems (for example, Spring Reactor at https://github.com/reactor/reactor or Akka at http://akka.io/). A good example of an aggressive nonblocking I/O approach is the event loop in Node.js (https://nodejs.org/) that naturally evolved from the web browser’s event loop.
