Challenges with the WebFlux processing model

WebFlux differs significantly from Web MVC. Since there is no blocking I/O in the system, only a few Thread instances are needed to process all requests. Processing events simultaneously does not require more Thread instances than there are processors/cores in the system.

This is because WebFlux is built on top of Netty, where the default number of Thread instances is Runtime.getRuntime().availableProcessors() multiplied by two.

Although the use of non-blocking operations allows the processing of results asynchronously (see diagram 6.11), so we can scale better, utilize CPUs more efficiently, spend CPU cycles on actual processing, and reduce wastage on context switching, the asynchronous non-blocking processing model has its own pitfalls. First of all, it is important to understand that CPU-intensive tasks should be scheduled on separate Thread or ThreadPool instances. This problem does not apply to the thread-per-connection model, or a similar model in which a thread pool has a high number of workers, because in those cases each connection already has a dedicated worker. Developers with extensive experience of such models often forget about this and execute a CPU-intensive task on the main thread. A mistake like this comes at a high cost and can impact overall performance: the main thread is busy with processing and does not have time to accept or process new connections:

Diagram 6.20. CPU-intensive work in the single-processor environment

As we can see from the preceding diagram, even if the whole request processing line consists of white bars (which means there is no blocking I/O), we may stall the processing by running a hard computation that steals processing time from other requests.

To solve this problem, we should delegate long-running work to a separate pool of processors or, in the case of a single-process node, delegate work to a different node. For example, we may organize an efficient event loop (https://en.wikipedia.org/wiki/Event_loop), where one Thread accepts connections and then delegates the actual processing to a different pool of workers/nodes:
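The delegation described above can be sketched with plain JDK primitives. This is a minimal illustration, not WebFlux's or Netty's actual internals: a single "event loop" thread accepts the request and immediately hands the expensive computation off to a worker pool sized to the number of cores (all class and method names here are hypothetical).

```java
import java.util.concurrent.*;

public class EventLoopSketch {
    // Single "accept" thread, as in an event loop; it never runs heavy work itself.
    private static final ExecutorService EVENT_LOOP = Executors.newSingleThreadExecutor();
    // Dedicated pool for CPU-intensive tasks, sized to the available cores.
    private static final ExecutorService WORKERS =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    static long handleRequest(long n) throws Exception {
        return CompletableFuture
                .supplyAsync(() -> n, EVENT_LOOP)                    // "accept" the request
                .thenApplyAsync(EventLoopSketch::heavySum, WORKERS)  // delegate the computation
                .get();
    }

    // Simulated CPU-intensive task: sum of 0..n-1.
    private static long heavySum(long n) {
        long acc = 0;
        for (long i = 0; i < n; i++) {
            acc += i;
        }
        return acc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handleRequest(1_000_000));
        EVENT_LOOP.shutdown();
        WORKERS.shutdown();
    }
}
```

Because the event-loop thread only submits the task and never executes heavySum itself, it stays free to accept further connections while the workers churn through the computation.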

Diagram 6.21. Netty-like non-blocking server architecture

Another common mistake in asynchronous, non-blocking programming is the use of blocking operations. One of the tricky parts of web application development is generating a unique UUID:

UUID requestUniqueId = java.util.UUID.randomUUID();

The problem here is that #randomUUID() uses SecureRandom. Typical crypto-strength random number generators use a source of entropy that is external to the application. It might be a hardware random number generator, but more commonly it is accumulated randomness that is harvested by the operating system in normal operation.

In this context, the notion of randomness means events such as mouse movements, electrical power changes, and other random events that might be collected by the system at runtime.

The problem is that sources of entropy have a rate limit. If this rate is exceeded, on some systems the syscall that reads entropy will stall until enough entropy has been made available. The number of threads also has a huge impact on the performance of UUID generation. That can be explained by looking at the implementation of SecureRandom#nextBytes(byte[] bytes), which generates the random numbers for UUID.randomUUID():

synchronized public void nextBytes(byte[] bytes) {
    secureRandomSpi.engineNextBytes(bytes);
}

As we can see, #nextBytes is synchronized, which leads to significant performance loss when accessed by different threads.
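One common way to reduce that lock contention (a sketch, not the only resolution) is to give each thread its own SecureRandom instance via ThreadLocal, so threads no longer serialize on the synchronized method of a single shared object. The class and method names below are illustrative.

```java
import java.security.SecureRandom;

public class PerThreadRandom {
    // Each thread gets its own SecureRandom, so no thread waits on the
    // synchronized nextBytes(...) of a shared instance.
    private static final ThreadLocal<SecureRandom> RANDOM =
            ThreadLocal.withInitial(SecureRandom::new);

    static byte[] randomBytes(int length) {
        byte[] bytes = new byte[length];
        RANDOM.get().nextBytes(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        System.out.println(randomBytes(16).length);
    }
}
```

Note that this only removes the Java-level lock contention; the underlying OS entropy source is still shared, so a stalling entropy syscall can remain a bottleneck.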

To learn more about the resolution of SecureRandom, please see the following Stack Overflow answer: https://stackoverflow.com/questions/137212/how-to-solve-slow-java-securerandom.
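Since a potentially stalling call like UUID.randomUUID() should never run on an event-loop thread, another option is to offload it to a dedicated executor and hand the caller a future instead. The following is a hedged sketch; the pool name and size are illustrative, not a prescribed configuration.

```java
import java.util.UUID;
import java.util.concurrent.*;

public class OffloadedUuid {
    // Dedicated pool for potentially blocking calls, kept away from the
    // event-loop threads (size chosen arbitrarily for illustration).
    private static final ExecutorService BLOCKING_POOL = Executors.newFixedThreadPool(4);

    // Wraps the potentially blocking randomUUID() call in a future, so the
    // calling thread is never stalled waiting for entropy.
    static CompletableFuture<UUID> nextRequestId() {
        return CompletableFuture.supplyAsync(UUID::randomUUID, BLOCKING_POOL);
    }

    public static void main(String[] args) throws Exception {
        UUID id = nextRequestId().get();
        System.out.println(id);
    }
}
```

The caller composes on the returned CompletableFuture rather than waiting, which keeps the event loop free even when the entropy source is slow.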

As we have learned, WebFlux uses a few threads to process a huge number of requests asynchronously and in a non-blocking fashion. We have to be careful when using methods that, at first glance, look free of I/O operations but in fact hide specific interactions with the OS. Without proper attention to such methods, we may dramatically decrease the entire system's performance. It is, therefore, crucial to use only non-blocking operations with WebFlux. However, such a requirement puts a lot of limitations on reactive system development. For example, the whole Java Development Kit was designed for imperative, synchronous interaction between components of the Java ecosystem. Consequently, many blocking operations have no non-blocking, asynchronous analogs, which complicates the development of non-blocking, reactive systems. While WebFlux gives us higher throughput and lower latency, we have to pay close attention to all the operations and libraries with which we are working.

Also, in cases where complex computations are the central operation of our service, the uncomplicated thread-based processing model is preferable to the non-blocking, asynchronous one. Similarly, if all interactions with the I/O are blocking, we do not gain as many benefits as we do with non-blocking I/O. Moreover, the complexity of non-blocking and asynchronous algorithms for event processing may be redundant, so a straightforward threading model in Web MVC would be more efficient than the WebFlux one.

Nevertheless, for cases where there are no such limitations or specific use cases, and we have a lot of I/O interaction, the non-blocking and asynchronous WebFlux will shine brightly.
