Introduction to Akka Streams

The word stream is heavily overloaded in modern computing and carries different meanings depending on the context. In Java, for instance, streaming has at different times meant an abstraction over blocking IO, an abstraction over non-blocking IO, and, later, a way to express data processing queries.

In essence, a stream in computing is just a flow of data or instructions. Usually, the content of a stream is not loaded into memory in full. This ability to process practically unlimited amounts of information on devices with limited memory is a motivating factor behind the recent rise in the popularity of streams.

The definition of a stream as a flow implies that it has a source and a destination for its data elements. In code, this is naturally expressed by having code on one side of the flow emit data and code on the other side consume it. The emitting side is usually called the producer and the receiving side the consumer. Usually, some portion of the data is held in memory, having already been emitted by the producer but not yet ingested by the consumer.

This aspect of streams brings up the next idea: it should be possible to manipulate data in flight with code placed in between, the same way a water heater plugged in between the inlet and the tap turns cold water into hot water. Interestingly, neither the producer nor the consumer knows about the water heater. If the water flow increases in intensity, we can easily imagine plugging in another heater or replacing the existing one with a more powerful model. The water heater becomes a property of the flow in the sense that the quantity of water received by the consumer depends on the amount emitted by the producer, but the temperature depends on the properties of the pipe system or, in essence, of the flow.

This is the basic idea behind using streams: a stream is usually seen as a combination of producer, consumer, and transformation steps in between. In a streaming scenario, the producer and consumer become less interesting and the main focus shifts to the transformation steps in the middle. For the sake of modularity and code reuse, defining many tiny transformations is usually considered to be a preferable approach.
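The following is a minimal sketch of this idea in plain Java, not in Akka Streams. The class name PipelineSketch, the HEAT and LABEL steps, and the 40-degree increment are illustrative assumptions. The producer is a java.util.stream.Stream, the consumer is a list that collects whatever arrives, and the two tiny transformation steps sit in between without either end knowing about them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Stream;

class PipelineSketch {
    // Two tiny, reusable transformation steps (hypothetical names and values).
    static final Function<Integer, Integer> HEAT = coldTemp -> coldTemp + 40;
    static final Function<Integer, String> LABEL = temp -> temp + "C";

    // Connects a producer to a consumer through the transformation steps.
    static List<String> run(Stream<Integer> producer) {
        List<String> received = new ArrayList<>(); // the consumer side
        producer.map(HEAT).map(LABEL).forEachOrdered(received::add);
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run(Stream.of(10, 12, 15))); // prints [50C, 52C, 55C]
    }
}
```

The number of elements received equals the number emitted, while their values depend only on the steps in the middle. Swapping HEAT for a different function changes the result without touching the producer or the consumer.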

Depending on how data is transferred between the parts of the stream, we distinguish between pushing and pulling elements of the stream.

With the push model, it is the producer that controls the process. Data is pushed into the stream as soon as it becomes available, and the rest of the stream is expected to absorb it. Naturally, it is not always possible to consume data that is produced at an unpredictable rate. In streaming, this is dealt with by dropping data or by using buffers. Dropping data is sometimes appropriate but more often undesirable. Buffers have limited size and can therefore fill up if data is produced faster than it is consumed over a long period of time. A full buffer leads, yet again, to memory overflow or the need to drop data. Clearly, with the push model, the combination of a fast producer and a slow consumer is a problem.
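The fast-producer problem can be illustrated with a small, single-threaded simulation in plain Java. This is only a sketch under stated assumptions: the names PushSketch, DroppingBuffer, and simulate are hypothetical, the producer is assumed to push two elements in every round while the consumer takes one, and the buffer is assumed to drop new elements once it is full.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class PushSketch {
    /** A bounded buffer that drops incoming elements when it is full. */
    static final class DroppingBuffer {
        private final Queue<Integer> queue = new ArrayDeque<>();
        private final int capacity;
        private int dropped = 0;

        DroppingBuffer(int capacity) { this.capacity = capacity; }

        // Called by the producer as soon as data becomes available.
        void push(int element) {
            if (queue.size() < capacity) queue.add(element);
            else dropped++; // no room left: the element is lost
        }

        // Called by the consumer at its own pace; returns null if empty.
        Integer take() { return queue.poll(); }

        int dropped() { return dropped; }
    }

    /** Producer pushes two elements per round, consumer takes one. */
    static int simulate(int capacity, int rounds) {
        DroppingBuffer buffer = new DroppingBuffer(capacity);
        int next = 0;
        for (int round = 0; round < rounds; round++) {
            buffer.push(next++);
            buffer.push(next++);
            buffer.take();
        }
        return buffer.dropped();
    }

    public static void main(String[] args) {
        System.out.println(simulate(4, 10)); // prints 7
    }
}
```

With a capacity of 4, the buffer fills up in the fourth round and one element is lost in every round after that. A larger buffer only postpones the moment at which dropping starts.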

With the pull model, it is the consumer that drives the process. It tries to read data from the stream as soon as it needs it. If there is data, it is taken. If there is none, the consumer can either wait for it or try again later. Usually, both options are less than ideal. Waiting is usually done by blocking and retrying by polling, which means excessive consumption of resources and delays between the moment the data becomes available and the moment it is consumed. Evidently, the pull model is not optimal in the case of a slow producer and a fast consumer.
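The cost of polling can be made visible with a similar simulation in plain Java. Again, this is a sketch and not a real implementation: the names PullSketch and emptyPolls are hypothetical, time is modeled as discrete ticks, the producer is assumed to emit one element every ticksPerElement ticks (which must be at least 1), and the consumer is assumed to poll on every tick.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class PullSketch {
    /** Returns how many polls found no data before all elements were consumed. */
    static int emptyPolls(int elements, int ticksPerElement) {
        Queue<Integer> source = new ArrayDeque<>();
        int consumed = 0;
        int emptyPolls = 0;
        for (int tick = 1; consumed < elements; tick++) {
            // Slow producer: one element becomes available every ticksPerElement ticks.
            if (tick % ticksPerElement == 0) source.add(tick);
            // Fast consumer: tries to pull on every tick.
            Integer next = source.poll();
            if (next == null) emptyPolls++; // wasted attempt, nothing to read yet
            else consumed++;
        }
        return emptyPolls;
    }

    public static void main(String[] args) {
        System.out.println(emptyPolls(5, 3)); // prints 10
    }
}
```

To consume five elements that arrive every third tick, the consumer polls fifteen times and ten of those attempts are wasted. Polling less often would reduce the waste but increase the delay between an element becoming available and being consumed.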

This dichotomy led to the creation of a dynamic pull-push concept named Reactive Streams, and to an initiative of the same name started in 2013 by engineers at Lightbend (then called Typesafe), Netflix, and Pivotal.
