195. Parallel processing of streams

In a nutshell, parallel processing of a stream consists of three steps:

  1. Splitting the elements of a stream into multiple chunks
  2. Processing each chunk in a separate thread
  3. Joining the results of processing in a single result

These three steps take place behind the scenes via the common ForkJoinPool, as we discussed in Chapter 10, Concurrency – Thread Pools, Callables, and Synchronizers and Chapter 11, Concurrency – Deep Dive.
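To make the split-process-join mechanics concrete, here is a minimal sketch of the same idea expressed manually with a RecursiveTask (the class name, threshold, and data are illustrative, not from the original text): chunks below a threshold are summed sequentially, larger chunks are split in two, forked, and their partial results joined.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Double> {

    private static final int THRESHOLD = 10_000;
    private final double[] numbers;
    private final int start;
    private final int end;

    SumTask(double[] numbers, int start, int end) {
        this.numbers = numbers;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Double compute() {
        if (end - start <= THRESHOLD) {
            // chunk is small enough: process it sequentially in this thread
            double sum = 0;
            for (int i = start; i < end; i++) {
                sum += numbers[i];
            }
            return sum;
        }
        // split the chunk in two, fork one half, compute the other,
        // then join the partial results into a single result
        int middle = (start + end) / 2;
        SumTask left = new SumTask(numbers, start, middle);
        SumTask right = new SumTask(numbers, middle, end);
        left.fork();
        return right.compute() + left.join();
    }

    public static void main(String[] args) {
        double[] numbers = new double[1_000_000];
        Arrays.fill(numbers, 1.0);
        double sum = ForkJoinPool.commonPool()
            .invoke(new SumTask(numbers, 0, numbers.length));
        System.out.println(sum); // 1000000.0
    }
}
```

This is essentially what a parallel stream does for us automatically, with the splitting driven by the source's Spliterator instead of a hand-written threshold.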

As a rule of thumb, parallel processing can be applied only to stateless (the state of an element doesn't affect another element), non-interfering (the data source is not affected), and associative (the result is not affected by the order of operands) operations.

Let's assume that our problem is to sum the elements of a list of doubles:

Random rnd = new Random();
List<Double> numbers = new ArrayList<>();

for (int i = 0; i < 1_000_000; i++) {
  numbers.add(rnd.nextDouble());
}

We can also do this directly as a stream:

DoubleStream.generate(() -> rnd.nextDouble()).limit(1_000_000)

In a sequential approach, we can do this as follows:

double result = numbers.stream()
  .reduce((a, b) -> a + b).orElse(-1d);

This operation will probably take place on a single core behind the scenes (even if our machine has more cores), as shown in the following diagram:

This problem is a good candidate for leveraging parallelization, so we can call parallelStream() instead of stream(), as follows:

double result = numbers.parallelStream()
  .reduce((a, b) -> a + b).orElse(-1d);

Once we call parallelStream(), Java will take action and process the stream using multiple threads. Parallelization can be done via the parallel() method as well:

double result = numbers.stream()
  .parallel()
  .reduce((a, b) -> a + b).orElse(-1d);

This time, the processing takes place via a fork/join, as shown in the following diagram (there is one thread for each available core):

In the context of reduce(), parallelization can be depicted as follows:

By default, the Java ForkJoinPool will use as many threads as there are available processors, which can be obtained like so:

int noOfProcessors = Runtime.getRuntime().availableProcessors();

We can adjust the number of threads globally (all parallel streams will use it) as follows (note that this system property must be set before the common ForkJoinPool is first used):

System.setProperty(
  "java.util.concurrent.ForkJoinPool.common.parallelism", "10");

Alternatively, we can set the number of threads for a single parallel stream as follows:

ForkJoinPool customThreadPool = new ForkJoinPool(5);

double result = customThreadPool.submit(
  () -> numbers.parallelStream()
    .reduce((a, b) -> a + b)).get().orElse(-1d);

Choosing the number of threads is an important decision. Determining the optimal number for a given environment is not an easy task and, in most scenarios, the default setting (number of threads = number of processors) is the most suitable.

Even if a problem is a good candidate for parallelization, that doesn't mean parallel processing is a silver bullet. The decision to go parallel should be made only after benchmarking and comparing sequential versus parallel processing. Most commonly, parallel processing performs better on huge datasets.
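As a rough illustration of such a comparison, here is a naive timing sketch (for serious measurements, a proper harness such as JMH should be used instead; the warm-up loop and iteration counts here are my own choices):

```java
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.DoubleStream;

class NaiveBenchmark {
    public static void main(String[] args) {
        Random rnd = new Random();
        List<Double> numbers = DoubleStream.generate(rnd::nextDouble)
            .limit(1_000_000)
            .boxed()
            .collect(Collectors.toList());

        // a few warm-up runs so the JIT compiler kicks in first
        for (int i = 0; i < 5; i++) {
            numbers.stream().reduce((a, b) -> a + b);
            numbers.parallelStream().reduce((a, b) -> a + b);
        }

        long start = System.nanoTime();
        numbers.stream().reduce((a, b) -> a + b);
        long sequential = System.nanoTime() - start;

        start = System.nanoTime();
        numbers.parallelStream().reduce((a, b) -> a + b);
        long parallel = System.nanoTime() - start;

        System.out.println("sequential: " + sequential / 1_000_000 + " ms");
        System.out.println("parallel:   " + parallel / 1_000_000 + " ms");
    }
}
```

Which variant wins depends on the dataset size, the cost per element, and the machine, which is exactly why benchmarking should precede the decision.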

Do not fall into the trap of thinking that a larger number of threads results in faster processing. Avoid something like the following (these numbers are just indicators for a machine with 8 cores):

5 threads (~40 ms)
20 threads (~50 ms)
100 threads (~70 ms)
1000 threads (~ 250 ms)