205. Work-stealing thread pool

Let's focus on the packing process, which should be implemented via a work-stealing thread pool. To start, let's discuss what a work-stealing thread pool is, and let's do it via a comparison with a classic thread pool. The following diagram depicts how a classic thread pool works:

So, a classic thread pool relies on a single internal inbound queue to store tasks. Each thread dequeues a task and executes it. This is suitable for cases when the tasks are time-consuming and their number is relatively low. On the other hand, if there are many small tasks (tasks that require only a small amount of time to execute), there will be a lot of contention on the shared queue. This is not good, and even if this is a lock-free queue, the problem is not entirely solved.

In order to reduce contention and increase performance, a thread pool can rely on a work-stealing algorithm and a queue per thread. In this case, there is a central inbound queue for all tasks, and an extra queue (known as the local task queue) for each thread (worker thread), as in the following diagram:

So, each thread will dequeue tasks from the central queue and enqueue them in its own queue. Each thread has its own local queue of tasks. Further, when a thread wants to process a task, it simply dequeues a task from its own local queue. As long as its local queue is not empty, the thread will continue to process tasks from it without bothering other threads (no contention with other threads). When its local queue is empty (as in the case of Thread 2 in the preceding diagram), it tries to steal (via the work-stealing algorithm) tasks from the local queues that belong to other threads (for example, Thread 2 steals tasks from Thread 3). If it doesn't find anything to steal, it accesses the shared central inbound queue.

Each local queue is actually a deque (short for double-ended queue), therefore it can be accessed efficiently from both ends. The owning thread sees its deque as a stack, meaning that it will enqueue (add new tasks) and dequeue (take tasks for processing) from only one end. On the other hand, when a thread tries to steal from the queue of another thread, it accesses the opposite end (for example, Thread 2 steals from Thread 3's queue at the other end). So, tasks are processed from one end and stolen from the other end.
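As a rough illustration of this owner/thief asymmetry, here is a deliberately simplified sketch of a per-thread deque (the class name is made up for this example). Real work-stealing pools use sophisticated lock-free deques rather than coarse synchronization; this sketch only shows which end each party touches:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only: production pools use lock-free deques.
public class LocalQueueSketch {

    private final Deque<Runnable> deque = new ArrayDeque<>();

    // Owner thread: push and pop at the same end (stack-like, LIFO).
    public synchronized void push(Runnable task) { deque.addFirst(task); }
    public synchronized Runnable pop() { return deque.pollFirst(); }

    // Thief thread: steal from the opposite end.
    public synchronized Runnable steal() { return deque.pollLast(); }

    public static void main(String[] args) {
        LocalQueueSketch q = new LocalQueueSketch();
        q.push(() -> System.out.println("task A"));
        q.push(() -> System.out.println("task B"));

        // The owner pops the most recently pushed task (B),
        // while a thief would steal the oldest one (A).
        q.pop().run();   // task B
        q.steal().run(); // task A
    }
}
```

Because owner and thief operate on opposite ends, they rarely compete for the same task, which is exactly why contention stays low.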

If two threads try to steal from the same local queue, there is contention, but normally it should be insignificant.

What we've just described is the fork/join framework introduced in JDK 7 and exemplified in the The fork/join framework section. Starting with JDK 8, the Executors class was enriched with a work-stealing thread pool that uses the number of available processors as its target parallelism level. This is available via Executors.newWorkStealingPool() and Executors.newWorkStealingPool(int parallelism).

Let's see the source code of this thread pool:

public static ExecutorService newWorkStealingPool() {
    return new ForkJoinPool(Runtime.getRuntime().availableProcessors(),
        ForkJoinPool.defaultForkJoinWorkerThreadFactory,
        null, true);
}

So, internally, this thread pool instantiates ForkJoinPool via the following constructor:

public ForkJoinPool(int parallelism,
    ForkJoinPool.ForkJoinWorkerThreadFactory factory,
    Thread.UncaughtExceptionHandler handler,
    boolean asyncMode)

We have the parallelism level set to availableProcessors(), the default thread factory for creating new threads, the Thread.UncaughtExceptionHandler passed as null, and asyncMode set to true. Setting asyncMode to true enables the local First In First Out (FIFO) scheduling mode for tasks that are forked and never joined. This mode may be more suitable than the default one (locally stack-based, LIFO) in programs that rely on worker threads to process only event-style asynchronous tasks.
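If we call the ForkJoinPool constructor ourselves with the same arguments that newWorkStealingPool() uses, we can inspect the resulting configuration (this is just a demonstration of the constructor shown above, not something newWorkStealingPool() requires us to do):

```java
import java.util.concurrent.ForkJoinPool;

public class AsyncModePool {

    public static void main(String[] args) {
        // Same arguments that newWorkStealingPool() passes internally.
        ForkJoinPool pool = new ForkJoinPool(
            Runtime.getRuntime().availableProcessors(),
            ForkJoinPool.defaultForkJoinWorkerThreadFactory,
            null,   // no uncaught exception handler
            true);  // asyncMode = true -> local FIFO scheduling

        System.out.println(pool.getAsyncMode());   // true
        System.out.println(pool.getParallelism()
            == Runtime.getRuntime().availableProcessors()); // true

        pool.shutdown();
    }
}
```

By contrast, ForkJoinPool.commonPool() and the other ForkJoinPool constructors default to asyncMode = false, which is the stack-based mode preferred by classic recursive fork/join computations.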

Nevertheless, don't forget that the local task queues and the work-stealing algorithm come into play only if the worker threads schedule new tasks in their own local queues. Otherwise, ForkJoinPool is just a ThreadPoolExecutor with extra overhead.

When we work directly with ForkJoinPool, we can instruct tasks to explicitly schedule new tasks during execution using ForkJoinTask (typically, via RecursiveTask or RecursiveAction).
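For example, a RecursiveTask can fork subtasks during execution, and each fork() pushes the subtask onto the current worker's local deque, which is what makes it available for stealing (the class and threshold below are illustrative choices, not part of the API):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative task: sums the numbers in [from, to] by splitting the range.
public class RangeSum extends RecursiveTask<Long> {

    private static final int THRESHOLD = 1_000; // arbitrary cutoff for this sketch
    private final long from;
    private final long to;

    public RangeSum(long from, long to) {
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (long i = from; i <= to; i++) {
                sum += i;
            }
            return sum;
        }
        long mid = (from + to) / 2;
        RangeSum left = new RangeSum(from, mid);
        RangeSum right = new RangeSum(mid + 1, to);
        left.fork(); // schedule the left half on this worker's local deque
        return right.compute() + left.join();
    }

    public static void main(String[] args) {
        long sum = ForkJoinPool.commonPool().invoke(new RangeSum(1, 100_000));
        System.out.println(sum); // 5000050000
    }
}
```

It is exactly these forked subtasks, sitting in local deques, that idle workers steal from.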

But since newWorkStealingPool() is a higher-level abstraction over ForkJoinPool, we cannot instruct tasks to explicitly schedule new tasks during execution. Therefore, newWorkStealingPool() will decide internally how to work based on the tasks that we pass. We can try a comparison between newWorkStealingPool(), newCachedThreadPool(), and newFixedThreadPool(), and see how they perform in two scenarios:

  • For a large number of small tasks
  • For a small number of time-consuming tasks
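In both scenarios, tasks reach the work-stealing pool through the regular ExecutorService API. A minimal sketch of submitting a batch of small tasks might look like this (the task bodies are placeholders for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WorkStealingDemo {

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newWorkStealingPool();

        // Placeholder small tasks: square the numbers 1..10.
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            final int n = i;
            tasks.add(() -> n * n);
        }

        // invokeAll() blocks until all tasks complete.
        int sum = 0;
        for (Future<Integer> f : pool.invokeAll(tasks)) {
            sum += f.get();
        }
        System.out.println(sum); // 385

        pool.shutdown();
    }
}
```

Note that, unlike newFixedThreadPool(), the returned pool makes no guarantee about the order in which submitted tasks are executed.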

Let's take a look at the solutions for both these scenarios in the next sections.
