Broadcasting

Broadcasting in NumPy denotes the ability to determine a common, compatible shape for two arrays of different shapes. For instance, when adding a vector (one-dimensional array) and a scalar (zero-dimensional array), the scalar is extended to a vector so that the addition can be carried out. We will first review this mechanism from a mathematical point of view, and then give the precise rules for broadcasting in NumPy.

Mathematical view

Broadcasting is often performed in mathematics, mainly implicitly. Examples are expressions such as f(x) + C or f(x) + g(y). We'll give an explicit description of that technique in this section.

We have in mind the very close relationship between functions and NumPy arrays, as described in section Mathematical preliminaries of Chapter 4, Linear Algebra - Arrays.

Constant functions

One of the most common examples of broadcasting is the addition of a function and a constant; if C is a scalar, one often writes:

$$f + C$$

This is an abuse of notation since one should not be able to add functions and constants. Constants are, however, implicitly broadcast to functions. The broadcast version of the constant C is the function $\bar{C}$ defined by:

$$\bar{C}(x) := C \quad \forall x$$

Now it makes sense to add two functions together:

$$f + \bar{C}$$

We are not being pedantic for the sake of it, but because a similar situation may arise for arrays, as in the following code:

from numpy import * # the snippets in this section use NumPy names directly
vector = arange(4) # array([0, 1, 2, 3])
vector + 1.        # array([1.,2.,3.,4.])

In this example, everything happens as if the scalar 1. had been converted to an array of the same length as vector, that is, array([1.,1.,1.,1.]), and then added to vector.
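The implicit conversion can be made visible with a small sketch. It assumes NumPy is imported as np (rather than with the star import used elsewhere in this section) and uses np.broadcast_to to build the extended scalar explicitly; the variable names are only illustrative.

```python
import numpy as np

vector = np.arange(4)                           # array([0, 1, 2, 3])
extended = np.broadcast_to(1.0, vector.shape)   # read-only view of shape (4,)

explicit = vector + extended   # addition of two arrays of equal shape
implicit = vector + 1.0        # NumPy broadcasts the scalar automatically
print(np.array_equal(explicit, implicit))
```

Both additions give the same array; the explicit version only spells out what NumPy does behind the scenes.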

This example is exceedingly simple, so we proceed to show less obvious situations.

Functions of several variables

A more intricate example of broadcasting arises when building functions of several variables. Suppose, for instance, that we were given two functions of one variable, f and g, and that we want to construct a new function F according to the formula:

$$F(x, y) = f(x) + g(y)$$

This is clearly a valid mathematical definition. We would like to express this definition as the sum of two functions in two variables defined as:

$$\bar{f}(x, y) := f(x), \qquad \bar{g}(x, y) := g(y)$$

and now we may simply write:

$$F = \bar{f} + \bar{g}$$

The situation is similar to that arising when adding a column matrix and a row matrix:

C = arange(2).reshape(-1,1) # column
R = arange(2).reshape(1,-1) # row
C + R                       # valid addition: array([[0, 1], [1, 2]])

This is especially useful when sampling functions of two variables, as shown in section Typical examples.
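The mathematical construction can also be mimicked with plain Python functions. The following is only a sketch: math.cos and math.sin are arbitrary stand-ins for f and g, and the names f_bar and g_bar are chosen here for illustration.

```python
import math

f = math.cos   # stand-in for f
g = math.sin   # stand-in for g

def f_bar(x, y):
    """f viewed as a function of two variables; y is ignored."""
    return f(x)

def g_bar(x, y):
    """g viewed as a function of two variables; x is ignored."""
    return g(y)

def F(x, y):
    """Sum of two functions that now share the same arguments."""
    return f_bar(x, y) + g_bar(x, y)

print(F(0.3, 0.7))
```

Once both functions take the same two arguments, adding them is an ordinary pointwise sum.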

General mechanism

We have seen how to add a function and a scalar and how to build a function of two variables from two functions of one variable. Let us now focus on the general mechanism that makes this possible. The general mechanism consists of two steps: reshaping and extending.

First, the function g is reshaped to a function $\tilde{g}$ that takes two arguments. One of these arguments is a dummy argument, which we take to be zero, as a convention:

$$\tilde{g}(0, y) := g(y)$$

Mathematically, the domain of definition of $\tilde{g}$ is now $\{0\} \times \mathbb{R}$ (taking $g$ to be defined on $\mathbb{R}$). Then the function f is reshaped in a similar way:

$$\tilde{f}(x, 0) := f(x)$$

Now both $\tilde{f}$ and $\tilde{g}$ take two arguments, although one of them is always zero. We proceed to the next step, extending. It is the same step that converted a constant into a constant function (refer to the constant function example).

The function $\tilde{f}$ is extended to:

$$\bar{f}(x, y) := \tilde{f}(x, 0) \quad \forall y$$

The function $\tilde{g}$ is extended to:

$$\bar{g}(x, y) := \tilde{g}(0, y) \quad \forall x$$

Now the function of two variables F, which was sloppily defined by F(x,y) = f(x) + g(y), may be defined without reference to its arguments:

$$F := \bar{f} + \bar{g}$$

For example, let us describe the preceding mechanism for constants. A constant is a scalar, that is, a function of zero arguments. The reshaping step is thus to define the function of one (empty) variable:

$$\tilde{C}(0) := C$$

Now the extension step proceeds simply by:

$$\bar{C}(x) := \tilde{C}(0) \quad \forall x$$
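The two steps, reshaping and extending, can be carried out by hand on sampled functions. The sketch below assumes NumPy is imported as np; the sample points, the choice of cos and sin for f and g, and all variable names are illustrative assumptions.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 3)
y = np.linspace(0.0, 1.0, 4)
fx = np.cos(x)   # samples of f, shape (3,)
gy = np.sin(y)   # samples of g, shape (4,)

# Step 1, reshaping: add a dummy axis of length 1.
f_tilde = fx.reshape(-1, 1)   # shape (3, 1)
g_tilde = gy.reshape(1, -1)   # shape (1, 4)

# Step 2, extending: repeat along the dummy axis (a view, no data is copied).
f_bar = np.broadcast_to(f_tilde, (3, 4))
g_bar = np.broadcast_to(g_tilde, (3, 4))

F = f_bar + g_bar             # same result as f_tilde + g_tilde
print(F.shape)
```

In practice the extension step is never written out, since adding the two reshaped arrays triggers it automatically.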

Conventions

The last ingredient is the convention on how to add the extra arguments to a function, that is, how the reshaping is automatically performed. By convention, a function is automatically reshaped by adding zeros on the left.

For example, if a function g of two arguments has to be reshaped to three arguments, the new function would be defined by:

$$\tilde{g}(0, x, y) := g(x, y)$$
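For arrays, the same convention means that a missing axis is added on the left of the shape. A minimal sketch, assuming NumPy is imported as np and using arbitrary example arrays g and h:

```python
import numpy as np

g = np.arange(6).reshape(2, 3)   # shape (2, 3)
h = np.zeros((4, 2, 3))          # shape (4, 2, 3)

manual = g[np.newaxis, :, :]     # explicit reshaping to (1, 2, 3)
total = g + h                    # automatic: (2, 3) -> (1, 2, 3) -> (4, 2, 3)
print(manual.shape, total.shape)
```

The automatic sum agrees with the one obtained after inserting the new axis by hand.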

Broadcasting arrays

We now repeat the observation that arrays are merely functions of several variables (refer to section Mathematical preliminaries in Chapter 4, Linear Algebra - Arrays). Array broadcasting thus follows exactly the same procedure as explained above for mathematical functions. Broadcasting is done automatically in NumPy.

In the following figure (Figure 5.1), we show what happens when adding a matrix of shape (4, 3) to a matrix of shape (1, 3). The second matrix is extended to the shape (4, 3), which is also the shape of the result:

Figure 5.1: Broadcasting between a matrix and a vector.

The broadcasting problem

When NumPy is given two arrays with different shapes, and is asked to perform an operation that would require the two shapes to be the same, both arrays are broadcast to a common shape.

Suppose the two arrays have shapes s1 and s2. This broadcasting is performed in two steps:

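  1. If the shape s1 is shorter than the shape s2, then ones are added on the left of the shape s1 until both shapes have the same length. This is a reshaping.
  2. When the shapes have the same length, each array is extended along its axes of length 1 to match the other shape (if possible). This is possible only if, on every axis, the two lengths are equal or one of them is 1.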
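The two steps can be written down as a small function acting on shapes only. This is a sketch of the rule, not NumPy's actual implementation; broadcast_shape is a hypothetical helper introduced here, and its result is compared with the shape reported by np.broadcast.

```python
import numpy as np

def broadcast_shape(s1, s2):
    """Return the common shape of s1 and s2, or raise ValueError."""
    # Step 1, reshaping: pad the shorter shape with ones on the left.
    ndim = max(len(s1), len(s2))
    p1 = (1,) * (ndim - len(s1)) + tuple(s1)
    p2 = (1,) * (ndim - len(s2)) + tuple(s2)
    # Step 2, extending: an axis of length 1 stretches to the other length.
    result = []
    for a, b in zip(p1, p2):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"shapes {s1} and {s2} cannot be broadcast")
    return tuple(result)

print(broadcast_shape((3,), (4, 3)))
print(np.broadcast(np.empty((3,)), np.empty((4, 3))).shape)
```

Shapes such as (3,) and (3, 4) make the helper raise a ValueError, which mirrors the error NumPy raises for incompatible operands.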

Suppose we want to add a vector of shape (3, ) to a matrix of shape (4, 3). The vector needs to be broadcast. The first operation is a reshaping; the shape of the vector is converted from (3, ) to (1, 3). The second operation is an extension; the shape is converted from (1, 3) to (4, 3).

For instance, suppose a vector v of size n is to be broadcast to the shape (m, n):

  1. v is automatically reshaped to (1, n).
  2. v is extended to (m, n).

To demonstrate this we consider a matrix defined by:

M = array([[11, 12, 13, 14],
           [21, 22, 23, 24],
           [31, 32, 33, 34]])

and vector given by:

v = array([100, 200, 300, 400])

Now we may add M and v directly:

M + v # works directly

The result is this matrix:

array([[111, 212, 313, 414],
       [121, 222, 323, 424],
       [131, 232, 333, 434]])

Shape mismatch

It is not possible to automatically broadcast a vector v of length n to the shape (n,m). This is illustrated in the following figure:

[Figure: shape mismatch between a vector of shape (n,) and a matrix of shape (n, m)]

The broadcasting will fail, because the shape (n,) may not be automatically broadcast to the shape (n, m). The solution is to manually reshape v to the shape (n,1). The broadcasting will now work as usual (by extension only):

M + v.reshape(-1,1)

Here is another example. Define a matrix by:

M = array([[11, 12, 13, 14],
           [21, 22, 23, 24],
           [31, 32, 33, 34]])

and a vector by:

v = array([100, 200, 300])

Now automatic broadcasting will fail, because automatic reshaping does not work:

M + v # shape mismatch error

The solution is thus to take care of the reshaping manually. What we want in that case is to append a 1 on the right of the shape, that is, transform the vector into a column matrix. The broadcasting then works directly:

M + v.reshape(-1,1)

For the shape parameter -1, refer to section Accessing and changing the shape of Chapter 4, Linear Algebra - Arrays. The result is this matrix:

array([[111, 112, 113, 114],
       [221, 222, 223, 224],
       [331, 332, 333, 334]])
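The failure and its fix can be reproduced in a few lines. The sketch assumes NumPy is imported as np; the flag failed and the name column are only there for illustration.

```python
import numpy as np

M = np.array([[11, 12, 13, 14],
              [21, 22, 23, 24],
              [31, 32, 33, 34]])
v = np.array([100, 200, 300])

try:
    M + v                    # (3,) is reshaped to (1, 3), which clashes with (3, 4)
    failed = False
except ValueError:
    failed = True

column = v.reshape(-1, 1)    # shape (3, 1); v[:, np.newaxis] is equivalent
result = M + column          # each row of M is shifted by one entry of v
print(failed, result.shape)
```

Only the extension step is needed once v is a column, because both operands already have two axes.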

Typical examples

Let us examine some typical examples where broadcasting may come in handy.

Rescale rows

Suppose M is an n × m matrix, and we want to multiply each row by a coefficient. The coefficients are stored in a vector coeff with n components. In that case, automatic reshaping will not work, and we have to execute:

rescaled = M*coeff.reshape(-1,1)

Rescale columns

The setup is the same here, but we would like to rescale each column with a coefficient stored in a vector coeff of length m. In this case, automatic reshaping will work:

rescaled = M*coeff

Obviously, we may also do the reshaping manually and achieve the same result with:

rescaled = M*coeff.reshape(1,-1)
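A small numerical example may help to keep the two cases apart. It assumes NumPy is imported as np; the matrix and the coefficient vectors are arbitrary illustrative values with n = 2 and m = 3.

```python
import numpy as np

M = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]
row_coeff = np.array([10, 100])     # one coefficient per row (length n)
col_coeff = np.array([1, 2, 3])     # one coefficient per column (length m)

rows_rescaled = M * row_coeff.reshape(-1, 1)   # [[0, 10, 20], [300, 400, 500]]
cols_rescaled = M * col_coeff                  # [[0, 2, 6], [3, 8, 15]]
print(rows_rescaled)
print(cols_rescaled)
```

The row case needs the manual reshape because a vector of length n would otherwise be aligned with the last axis, which has length m.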

Functions of two variables

Suppose u and v are vectors and we want to form the matrix W with elements $w_{ij} = u_i + v_j$. This would correspond to the function F(x, y) = x + y. The matrix W is merely defined by:

W = u.reshape(-1,1) + v

If the vectors u and v are [0, 1] and [0, 1, 2] respectively, the result is:

array([[0, 1, 2],
       [1, 2, 3]])

More generally, suppose that we want to sample the function w(x, y) := cos(x) + sin(2y). Supposing that the vectors x and y are defined, the matrix w of sampled values is obtained by:

w = cos(x).reshape(-1,1) + sin(2*y)

Note that this is very frequently used in combination with ogrid. The vectors obtained from ogrid are already conveniently shaped for broadcasting. This allows for the following elegant sampling of the function cos(x) + sin(2y):

x,y = ogrid[0:1:3j,0:1:3j]
# x has shape (3,1) and y has shape (1,3); both contain the values of linspace(0,1,3)
w = cos(x) + sin(2*y)
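As a check, the reshape variant and the ogrid variant can be compared with an explicit double loop. The sketch assumes NumPy is imported as np and uses three sample points per axis; the variable names are illustrative.

```python
import numpy as np

x = np.linspace(0, 1, 3)
y = np.linspace(0, 1, 3)

# Manual reshaping of the x samples into a column.
w_reshape = np.cos(x).reshape(-1, 1) + np.sin(2 * y)

# ogrid delivers a column and a row directly.
xg, yg = np.ogrid[0:1:3j, 0:1:3j]   # shapes (3, 1) and (1, 3)
w_ogrid = np.cos(xg) + np.sin(2 * yg)

# Reference: evaluate the function entry by entry.
w_loop = np.array([[np.cos(xi) + np.sin(2 * yj) for yj in y] for xi in x])
print(np.allclose(w_reshape, w_ogrid), np.allclose(w_reshape, w_loop))
```

All three arrays have shape (3, 3), with the row index following x and the column index following y.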

The syntax of ogrid needs some explanation. First, ogrid is not a function. It is an instance of a class with a __getitem__ method (refer to section Attributes in Chapter 8, Classes). That is why it is used with brackets instead of parentheses.

The two commands are equivalent:

x,y = ogrid[0:1:3j, 0:1:3j]
x,y = ogrid.__getitem__((slice(0, 1, 3j),slice(0, 1, 3j)))
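To see what the bracket syntax hands over to __getitem__, one can use a tiny class that simply returns its key. ShowKey is a hypothetical helper written for this illustration and is not part of NumPy; the sketch assumes NumPy is imported as np.

```python
import numpy as np

class ShowKey:
    """Return whatever is passed between the brackets."""
    def __getitem__(self, key):
        return key

key = ShowKey()[0:1:3j, 0:1:3j]
print(key)   # a tuple of two slice objects

x1, y1 = np.ogrid[0:1:3j, 0:1:3j]
x2, y2 = np.ogrid.__getitem__(key)   # same call, written out explicitly
print(np.array_equal(x1, x2), np.array_equal(y1, y2))
```

The colon notation inside the brackets is turned into slice objects by Python itself, before ogrid ever sees it.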

The stride parameter in the preceding example is a complex number. This indicates that it gives the number of points instead of the step size. The rules for the stride parameter might be confusing at first glance:

  • If the stride is a real number, then it defines the size of the steps between start and stop, and stop is not included in the list.
  • If the stride is a complex number s, then the integer part of s.imag defines the number of points between start and stop, and stop is included in the list.
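Both stride rules can be tried on a one-dimensional grid. The sketch assumes NumPy is imported as np; the interval and the strides are arbitrary example values.

```python
import numpy as np

by_step = np.ogrid[0:1:0.25]   # real stride: step size 0.25, stop excluded
by_count = np.ogrid[0:1:5j]    # complex stride: 5 values, stop included
print(by_step)
print(by_count)
```

The first array stops at 0.75, while the second one also contains the end point 1.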

The output of ogrid is a tuple with two arrays, which can be used for broadcasting. For example:

x,y = ogrid[0:1:3j, 0:1:3j]

gives:

array([[ 0. ],
       [ 0.5],
       [ 1. ]])
array([[ 0. ,  0.5,  1. ]])

which is equivalent to:

x,y = ogrid[0:1.5:.5, 0:1.5:.5]