Performance and Vectorization

The performance of your Python code often comes down to the difference between interpreted code and compiled code. Python is an interpreted programming language: basic Python code is executed by the interpreter without first being compiled to machine code. With a compiled language, the code is translated to machine instructions before execution.

The benefits of an interpreted language are many, but interpreted code cannot compete with compiled code for speed. To make your code faster, you can write some parts in a compiled language like FORTRAN, C, or C++. This is what NumPy and SciPy do.

For this reason, it is best to use functions in NumPy and SciPy over interpreted versions whenever possible. NumPy array operations such as matrix multiplication, matrix-vector multiplication, matrix factorization, scalar products, and so on are much faster than any pure Python equivalent. Consider the simple case of scalar products. The following pure Python implementation of the scalar product is much slower than the compiled NumPy function, dot(a,b) (more than 100 times slower for arrays with about 100 elements):

def my_prod(a,b):
    val = 0
    for aa,bb in zip(a,b):
        val += aa*bb
    return val

Measuring the speed of your functions is an important aspect of scientific computing. Refer to section Measuring execution time in Chapter 13, Testing, for details on measuring execution times.
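
As a rough way to check the comparison above, the following sketch times my_prod against NumPy's dot with the standard-library timeit module. The array length, the repetition count, the seed, and the variable names are assumptions chosen for illustration; actual timings depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

def my_prod(a, b):
    val = 0
    for aa, bb in zip(a, b):
        val += aa*bb
    return val

rng = np.random.default_rng(0)   # seeded generator, for reproducibility
a = rng.random(100)
b = rng.random(100)

# Both versions should agree up to rounding error
same = np.isclose(my_prod(a, b), np.dot(a, b))

# Time 1000 calls of each version (illustrative repetition count)
t_loop = timeit.timeit(lambda: my_prod(a, b), number=1000)
t_dot = timeit.timeit(lambda: np.dot(a, b), number=1000)
print(f"results agree: {same}")
print(f"pure Python: {t_loop:.4f} s, NumPy dot: {t_dot:.4f} s")
```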

Vectorization

To improve performance, one often has to vectorize the code. Replacing for loops and other slower parts of the code with NumPy slicing, operations, and functions can give significant improvements. For example, the simple addition of a scalar to a vector by iterating over the elements is very slow:

for i in range(len(v)):
    w[i] = v[i] + 5

By contrast, using NumPy's addition is much faster:

w = v + 5
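
A minimal sketch of the two variants side by side, assuming a small example vector v (the data and the names w_loop and w_vec are illustrative). Note that the loop version needs the result array to be allocated beforehand, while the vectorized expression creates it.

```python
import numpy as np

v = np.arange(5, dtype=float)   # example vector (assumed data)

# Loop version: the result array must exist before it can be filled
w_loop = np.empty_like(v)
for i in range(len(v)):
    w_loop[i] = v[i] + 5

# Vectorized version: the scalar 5 is added to every element at once
w_vec = v + 5

print(np.array_equal(w_loop, w_vec))   # prints True
```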

Using NumPy slicing can also give significant speed improvements over iterating with for loops. To demonstrate this, let us consider forming the average of neighbors in a two-dimensional array:

def my_avg(A):
    m,n = A.shape
    B = A.copy()
    for i in range(1,m-1):
        for j in range(1,n-1):
            B[i,j] = (A[i-1,j] + A[i+1,j] + A[i,j-1] + A[i,j+1])/4
    return B

def slicing_avg(A):
    # work on a copy, as my_avg does, so that the input array is not modified
    B = A.copy()
    B[1:-1,1:-1] = (A[:-2,1:-1] + A[2:,1:-1] +
                    A[1:-1,:-2] + A[1:-1,2:])/4
    return B

These functions both assign each interior element the average of its four neighbors; the elements on the boundary are left unchanged. The second version, using slicing, is much faster.
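
One way to gain confidence in such a rewrite is to compare the two versions on random data. The sketch below does this under a few assumptions: the array size and seed are arbitrary, and the slicing version shown here writes into a copy so that the input array is left untouched.

```python
import numpy as np

def my_avg(A):
    m, n = A.shape
    B = A.copy()
    for i in range(1, m-1):
        for j in range(1, n-1):
            B[i, j] = (A[i-1, j] + A[i+1, j] + A[i, j-1] + A[i, j+1])/4
    return B

def slicing_avg(A):
    B = A.copy()   # copy, so that A itself is not modified
    B[1:-1, 1:-1] = (A[:-2, 1:-1] + A[2:, 1:-1] +
                     A[1:-1, :-2] + A[1:-1, 2:])/4
    return B

rng = np.random.default_rng(1)   # seeded generator, for reproducibility
A = rng.random((50, 60))         # illustrative test array

print(np.allclose(my_avg(A), slicing_avg(A)))   # prints True
```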

Besides replacing for loops and other slower constructions with NumPy functions, there is a useful function called vectorize (refer to section Functions acting on arrays in Chapter 4, Linear Algebra - Arrays). It takes a function and creates a vectorized version that applies the function to all elements of an array.

Consider the following example for vectorizing a function:

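def my_func(x):
    y = x**3 - 2*x + 5
    if y>0.5:
        return y-0.5
    else:
        # return a float in both branches: vectorize infers the output
        # type from the result for the first element of the array
        return 0.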

Applying this by iterating over an array is very slow:

for i in range(len(v)):
    v[i] = my_func(v[i])

Instead, use vectorize to create a new function, like this:

my_vecfunc = vectorize(my_func)

This function can then be applied to the array directly:

v = my_vecfunc(v)

The vectorized option is much faster (around 10 times faster with arrays of length 100).
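
NumPy's documentation describes vectorize as provided primarily for convenience rather than performance, with an implementation that is essentially a for loop. When the branches of a function can be written as array expressions, a fully array-based formulation is an alternative. The following is a sketch under stated assumptions, not the approach taken in the text above: my_func_where is a hypothetical helper name, the test vector is arbitrary, and the otypes argument is passed to vectorize to fix the output type explicitly.

```python
import numpy as np

def my_func(x):
    y = x**3 - 2*x + 5
    if y > 0.5:
        return y - 0.5
    else:
        return 0.0

def my_func_where(v):
    # Hypothetical helper: evaluate the polynomial on the whole array,
    # then select y - 0.5 or 0 element by element
    y = v**3 - 2*v + 5
    return np.where(y > 0.5, y - 0.5, 0.0)

v = np.linspace(-3, 3, 100)   # illustrative input, includes both branches

# otypes makes the output type explicit instead of inferring it
my_vecfunc = np.vectorize(my_func, otypes=[float])

print(np.allclose(my_vecfunc(v), my_func_where(v)))   # prints True
```

Here np.where evaluates both candidate results for every element and then picks one, which is harmless for cheap expressions like these but worth keeping in mind when one branch is expensive or undefined for some inputs.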
