Broadcasting along functional pipes

In a data processing pipeline, we may encounter a situation where several functions can be fused together into a single loop. This is achieved through broadcasting, which can be conveniently enabled by using the dot notation. Fusing the loops in this way can make a substantial performance difference for data-intensive applications, because it avoids allocating temporary arrays between the steps.

Consider a scenario where two vectorized functions have already been defined, as follows:

add1v(xs) = [x + 1 for x in xs]
mul2v(xs) = [2x for x in xs]

The add1v function takes a vector and increments all the elements by 1. Likewise, the mul2v function takes a vector and multiplies every element by 2. Now, we can combine the functions to create a new one that takes a vector and sends it down the pipe to add1v and subsequently mul2v:

add1mul2v(xs) = xs |> add1v |> mul2v

However, the add1mul2v function is not optimal from a performance perspective. The reason for this is that each operation must run to completion before its result is passed to the next function. The intermediate result, although needed only temporarily, must be allocated in memory.

As depicted in the preceding diagram, besides the input vector and the output vector, an intermediate vector must be allocated to hold the results from the add1v function.
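To make the intermediate allocation concrete, the pipe can be written out one stage at a time. This is only an illustrative sketch; the variable names tmp, out, and result are introduced here and are not part of the original example.

```julia
add1v(xs) = [x + 1 for x in xs]
mul2v(xs) = [2x for x in xs]
add1mul2v(xs) = xs |> add1v |> mul2v

xs = [1, 2, 3]

# The pipe behaves like nested calls evaluated one stage at a time.
tmp = add1v(xs)    # intermediate vector: [2, 3, 4]
out = mul2v(tmp)   # final vector: [4, 6, 8]

# The piped version produces the same final vector.
result = add1mul2v(xs)
```

Here tmp is a full vector that exists only to feed mul2v, which is the allocation the fused version avoids.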

To avoid allocating the intermediate results, we can utilize broadcasting. Let's create another set of functions that operate on individual elements rather than arrays, as follows:

add1(x) = x + 1
mul2(x) = 2x

Our original problem still requires taking a vector, adding 1, and multiplying by 2 for every element. So, we can define such a function using the dot notation, as follows:

add1mul2(xs) = xs .|> add1 .|> mul2

The dot character right before the pipe operator indicates that the elements in xs will be broadcast to the add1 and mul2 functions, fusing the whole operation into a single loop. The data flow now looks more like the following:
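As a rough sketch of this data flow (not the original diagram), the fused operation behaves much like a single hand-written loop. The function add1mul2_loop below is a hypothetical equivalent introduced purely for illustration, and it assumes the output has the same element type as the input.

```julia
add1(x) = x + 1
mul2(x) = 2x

add1mul2(xs) = xs .|> add1 .|> mul2

# Hypothetical hand-written equivalent: one loop and one output array.
# Assumes the result has the same element type as the input.
function add1mul2_loop(xs)
    out = similar(xs)
    for i in eachindex(xs)
        tmp = add1(xs[i])    # the intermediate result is a single element
        out[i] = mul2(tmp)
    end
    return out
end

xs = [1, 2, 3]
a = add1mul2(xs)         # [4, 6, 8]
b = add1mul2_loop(xs)    # [4, 6, 8]
c = mul2.(add1.(xs))     # the same computation written with dotted calls
```

All three forms should give the same vector; the dotted forms simply leave the loop for the compiler to generate.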

Here, the intermediate result becomes a single integer, eliminating the need for the temporary array. To appreciate the performance improvement we get from broadcasting, we can run a performance benchmark for the two functions, as shown in the following screenshot:

As you can see, the broadcasting version ran twice as fast as the vectorized version in this scenario.
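One way to observe the difference without a timing harness is to compare how much memory each version allocates. The sketch below uses the @allocated macro from Base; the helper allocated_bytes and the input size of one million elements are assumptions chosen for illustration, and the exact figures will vary by machine and Julia version.

```julia
add1v(xs) = [x + 1 for x in xs]
mul2v(xs) = [2x for x in xs]
add1mul2v(xs) = xs |> add1v |> mul2v

add1(x) = x + 1
mul2(x) = 2x
add1mul2(xs) = xs .|> add1 .|> mul2

# Hypothetical helper: bytes allocated by a single call of f(xs).
function allocated_bytes(f, xs)
    f(xs)                      # warm-up call so compilation is not measured
    return @allocated f(xs)
end

xs = collect(1:1_000_000)

bytes_vectorized = allocated_bytes(add1mul2v, xs)
bytes_broadcast  = allocated_bytes(add1mul2, xs)

println("vectorized: ", bytes_vectorized, " bytes")
println("broadcast:  ", bytes_broadcast, " bytes")
```

Under these assumptions, the vectorized version should report roughly twice the allocation, since it builds both the intermediate vector and the output vector. For wall-clock timings, the @btime macro from the BenchmarkTools package is a common choice.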

In the next section, we will review some considerations about using functional pipes.
