V

Timing Code

If you’re running Python in an IPython instance (e.g., Jupyter Notebook, Jupyter Lab, or IPython directly), you have access to “magic” commands that allow you to easily perform non-Python tasks.

Magic commands are called with % (line magics) or %% (cell magics). In a Jupyter Notebook, %timeit times a single line of code and %%timeit times the entire cell.
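Magic commands only work inside IPython. As a rough plain-Python counterpart, the standard library's timeit module can time a statement in a similar way. The following is a minimal sketch, not what %timeit does internally; the loop count of 100,000 is an arbitrary choice for illustration, and avg_2 is the same small function defined later in this appendix.

```python
import timeit


def avg_2(x, y):
    """Average of two numbers (same function as in this appendix)."""
    return (x + y) / 2


# Assumption for illustration: a fixed loop count. %timeit picks one for you.
n_loops = 100_000

# timeit.timeit calls the function n_loops times and returns total seconds.
total_seconds = timeit.timeit(lambda: avg_2(10, 20), number=n_loops)

# Divide by the loop count to get an average time per call, in microseconds.
per_call_us = total_seconds / n_loops * 1e6
print(f"{per_call_us:.3f} microseconds per loop ({n_loops:,} loops)")
```

The printed number will vary from machine to machine and from run to run, which is why %timeit repeats the measurement several times and reports a spread.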

Let’s time the different vectorization methods from Chapter 5.

import pandas as pd
import numpy as np
import numba


def avg_2(x, y):
  return (x + y) / 2


@np.vectorize
def v_avg_2_mod(x, y):
  """Calculate the average, unless x is 20
  Same as before, but we are using the vectorize decorator
  """
  if x == 20:
    # np.nan works across NumPy versions (the np.NaN alias was removed in NumPy 2.0)
    return np.nan
  else:
    return (x + y) / 2

@numba.vectorize
def v_avg_2_numba(x, y):
  """Calculate the average, unless x is 20
  Using the numba decorator.
  """
  # we now have to add type information to our function
  if int(x) == 20:
    # np.nan works across NumPy versions (the np.NaN alias was removed in NumPy 2.0)
    return np.nan
  else:
    return (x + y) / 2

df = pd.DataFrame({"a": [10, 20, 30], "b": [20, 30, 40]})
print(df)
    a   b
0  10  20
1  20  30
2  30  40

Now we can time the different methods.

%%timeit
avg_2(df['a'], df['b'])
67.1 µs ± 12.7 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%%timeit
v_avg_2_mod(df['a'], df['b'])
16.6 µs ± 1.05 µs per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

%%timeit
v_avg_2_numba(df['a'].values, df['b'].values)
3.92 µs ± 632 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

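The first method, plain arithmetic on the Series, is the slowest in these timings, and it isn’t even as flexible as the custom functions we created, since it has no special case for x equal to 20. If you are working with mathematical calculations, you can get performance benefits from changing the library you are using, as the Numba version shows. Otherwise, using vectorize() can also help you write more readable apply code.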
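To make the “mean ± std. dev. of 7 runs” part of the timing output more concrete, here is a hedged sketch that builds a similar summary with the standard library. It assumes 7 runs of 10,000 loops each to mirror the first result above, and it uses the population standard deviation; both choices are assumptions made for illustration, not a description of how %timeit is implemented.

```python
import statistics
import timeit


def avg_2(x, y):
    """Average of two numbers (same function as in this appendix)."""
    return (x + y) / 2


n_runs = 7        # independent runs, mirroring "7 runs" in the output above
n_loops = 10_000  # calls per run; assumed here, %timeit chooses it for you

# Each element is the total time in seconds for one run of n_loops calls.
run_totals = timeit.repeat(lambda: avg_2(10, 20), repeat=n_runs, number=n_loops)

# Convert each run to a per-loop time, then summarize across the runs.
per_loop = [total / n_loops for total in run_totals]
mean_s = statistics.mean(per_loop)
std_s = statistics.pstdev(per_loop)  # assumption: population std. dev.

print(
    f"{mean_s * 1e9:.1f} ns ± {std_s * 1e9:.1f} ns per loop "
    f"(mean ± std. dev. of {n_runs} runs, {n_loops:,} loops each)"
)
```

A large spread relative to the mean, such as the 12.7 µs on 67.1 µs in the first result, suggests the measurement is noisy and that small differences between methods should be read with caution.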
