Understanding the processing models in WebFlux and Web MVC

First of all, in order to understand the impact of different processing models on system throughput and latency, we will recap how incoming requests are processed in Web MVC and WebFlux.

As mentioned earlier, Web MVC is built on top of blocking I/O. This means that the Thread processing each incoming request may be blocked while reading the request body from the I/O:

Diagram 6.10. Blocking request and response processing

In the preceding example, all requests are queued and processed sequentially by one Thread. The black bars indicate blocking read/write operations from/to the I/O. As we may also notice, the actual processing time (white bars) is much smaller than the time spent on blocking operations. From this simple diagram, we can conclude that the Thread is used inefficiently: the time it spends waiting on I/O could instead be spent accepting and processing other requests in the queue.
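The single-threaded blocking model can be sketched in plain Java. This is a simplified illustration, not actual Web MVC internals: one thread reads each "request body" with a blocking `java.io` stream before any processing starts, so queued requests make no progress while the read waits.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class BlockingProcessing {

    // Reads the whole body, blocking the calling thread until all bytes arrive
    static String readBody(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Queue of incoming "requests"; in a real server these would be sockets
        List<InputStream> requests = List.of(
                new ByteArrayInputStream("request-1".getBytes(StandardCharsets.UTF_8)),
                new ByteArrayInputStream("request-2".getBytes(StandardCharsets.UTF_8)));

        // One thread handles requests strictly sequentially: while readBody()
        // waits on I/O (the black bars), no other request makes progress
        for (InputStream request : requests) {
            String body = readBody(request);          // blocking read
            System.out.println("processed " + body);  // actual work (white bar)
        }
    }
}
```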

In contrast, WebFlux is built on top of a non-blocking API, which means that no I/O operation requires blocking the Thread. This efficient technique for accepting and processing requests is depicted in the following diagram:

Diagram 6.11. Asynchronous non-blocking request processing

As we can see in the preceding diagram, the setup is identical to the previous blocking I/O case. On the left side of the diagram, there is a queue of requests and, in the middle, there is a processing timeline. This time, however, the processing timeline has no black bars, which means that even if there are not enough bytes coming from the network to continue processing a request, we can always switch to another request without blocking the Thread. Compared to the blocking example, instead of waiting while the request body is collected, the Thread is used efficiently to accept new connections. The underlying operating system may then notify us that, for example, the request body has been collected, and a processor may pick it up without blocking. In that case, we have optimal CPU utilization. Similarly, writing the response does not require blocking and allows us to write to the I/O in a non-blocking fashion. The only difference is that the system notifies us when it is ready to write a part of the data to the I/O without blocking.
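The readiness-notification idea can be sketched with `CompletableFuture` as a stand-in for the real WebFlux/Netty machinery (an assumption made purely for illustration): the "read" completes asynchronously, so the accepting thread only registers a continuation and immediately moves on to the next request.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class NonBlockingProcessing {

    // Models "the OS notifies us when the body has been collected":
    // the future completes asynchronously; the caller's thread never blocks
    static CompletableFuture<String> readBodyAsync(String requestId) {
        return CompletableFuture.supplyAsync(() -> requestId + "-body");
    }

    public static void main(String[] args) {
        List<String> requests = List.of("request-1", "request-2", "request-3");

        // The accepting thread only registers continuations and moves on,
        // leaving it free to accept new connections immediately
        List<CompletableFuture<String>> responses = requests.stream()
                .map(id -> readBodyAsync(id)                     // non-blocking "read"
                        .thenApply(body -> "processed " + body)) // runs when data is ready
                .collect(Collectors.toList());

        // Only here, for the sake of the demo, do we wait for all results
        responses.forEach(response -> System.out.println(response.join()));
    }
}
```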

The previous example shows that WebFlux utilizes one Thread much more efficiently than Web MVC and may therefore process many more requests in the same period of time. One might still argue, however, that Java has multi-threading, so we can fully utilize the CPU by running the proper number of Thread instances. Therefore, to process requests faster and achieve comparable CPU utilization with blocking Web MVC, we may use several worker Threads instead of one, or even a Thread-per-connection model:

Diagram 6.12. Thread per connection Web MVC model

As we can see from the preceding diagram, the multithreading model allows queued requests to be processed faster and gives the illusion that the system accepts, processes, and responds to almost all of the requests simultaneously.
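The Thread-per-connection idea can be sketched with a plain `ExecutorService` (again, a simplified illustration rather than actual servlet-container internals): each queued connection gets its own worker, so the blocking reads now overlap in time instead of running one after another.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadPerConnection {

    // Handles one request: the sleep stands in for a blocking body read
    static String handle(String requestId) throws InterruptedException {
        Thread.sleep(10);               // blocking I/O wait
        return "processed " + requestId; // actual processing
    }

    public static void main(String[] args) throws Exception {
        List<String> requests = List.of("request-1", "request-2", "request-3");

        // One worker Thread per queued connection, as in the classic
        // thread-per-connection model
        ExecutorService workers = Executors.newFixedThreadPool(requests.size());
        List<Future<String>> results = new ArrayList<>();
        for (String id : requests) {
            results.add(workers.submit(() -> handle(id)));
        }
        for (Future<String> result : results) {
            System.out.println(result.get()); // all three finish at roughly the same time
        }
        workers.shutdown();
    }
}
```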

However, this design has its flaws. As we learned from the Universal Scalability Law, when a system has shared resources such as CPU or memory, scaling the number of parallel workers may eventually decrease its performance. In this case, when the processing of user requests involves too many Thread instances, performance degrades because of the coherence cost of coordinating them.
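This effect can be made concrete with Gunther's Universal Scalability Law, C(N) = N / (1 + α(N − 1) + βN(N − 1)), where α models contention for shared resources and β models the coherence cost. The coefficient values below are made up purely for illustration; with them, relative capacity grows at first, then falls as workers are added:

```java
public class UslDemo {

    // Universal Scalability Law: relative capacity for N parallel workers.
    // alpha models contention (shared resources such as CPU or memory),
    // beta models coherence (cross-worker synchronization)
    static double capacity(int n, double alpha, double beta) {
        return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1));
    }

    public static void main(String[] args) {
        double alpha = 0.05, beta = 0.002; // hypothetical, illustrative coefficients
        for (int n : new int[] {1, 8, 16, 32, 64, 128}) {
            System.out.printf("N=%3d -> relative capacity %.2f%n",
                    n, capacity(n, alpha, beta));
        }
        // With these coefficients, capacity peaks and then the coherence term
        // beta * N * (N - 1) dominates, so adding workers reduces throughput
    }
}
```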
