Improving performance using Python extensions

A common complaint among Python and pandas users is that the ease of use and expressiveness of the language and library come at a significant cost in performance, especially in numeric computing.

According to the programming benchmarks site, Python is often slower than compiled languages such as C/C++ for many algorithms and data structure operations, for example binary tree operations. In the following reference, Python 3 ran 104x slower than the fastest C++ implementation of an n-body simulation: http://bit.ly/1dm4JqW.

So, how can we solve this legitimate yet vexing problem? We can mitigate the slowness while keeping what we like about Python (clarity and productivity) by writing the performance-sensitive parts of our code, such as numeric processing and algorithms, in C/C++ and calling them from our Python code through a Python extension module: http://docs.python.org/2/extending/extending.html

Python extension modules enable us to make calls out to user-defined C/C++ code or library functions from Python, thus enabling us to boost our code performance but still benefit from the ease of using Python.

To help us understand what a Python extension module is, consider what happens in Python when we import a module. An import statement imports a module, but what does this really mean? There are three possibilities, which are as follows:

  • Some Python extension modules are linked to the interpreter when it is built.
  • An import causes Python to load a .pyc file into memory. The .pyc files contain Python bytecode. For example, consider the following command:
    In [3]: import pandas
            pandas.__file__
    Out[3]: '/usr/lib/python2.7/site-packages/pandas/__init__.pyc'

  • The import statement causes a Python extension module to be loaded into memory. The .so (shared object) file consists of machine code. For example, consider the following command:
    In [4]: import math
            math.__file__
    Out[4]: '/usr/lib/python2.7/lib-dynload/math.so'
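
The three cases above can be told apart programmatically. The following is a minimal sketch, assuming Python 3 (the transcripts above are from Python 2.7); the helper name module_kind is hypothetical and introduced only for illustration. It uses sys.builtin_module_names for modules linked into the interpreter and the loader classes in importlib.machinery for the other two cases:

```python
import importlib
import importlib.machinery
import sys


def module_kind(name):
    """Classify how a module is provided (hypothetical helper for illustration)."""
    # Case 1: compiled into the interpreter binary itself.
    if name in sys.builtin_module_names:
        return "built-in"
    module = importlib.import_module(name)
    loader = getattr(module.__spec__, "loader", None)
    # Case 3: machine code loaded from a shared object (.so / .pyd).
    if isinstance(loader, importlib.machinery.ExtensionFileLoader):
        return "extension"
    # Case 2: Python source or bytecode (.py / .pyc).
    python_loaders = (
        importlib.machinery.SourceFileLoader,
        importlib.machinery.SourcelessFileLoader,
    )
    if isinstance(loader, python_loaders):
        return "python"
    return "other"


for name in ("sys", "json", "math"):
    print(name, "->", module_kind(name))
```

Whether math is reported as built-in or as an extension depends on how the interpreter was built, so the result can differ between platforms.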
    

We will focus on the third possibility. Even though we are dealing with a binary shared object compiled from C, we can import it as a Python module. This shows the power of Python extensions: applications can import modules from Python bytecode or machine code, and the interface is the same. Cython and SWIG are two popular tools for writing extensions in C and C++. In writing an extension, we wrap C/C++ machine code and turn it into a Python extension module that behaves like pure Python code. In this brief discussion, we will only focus on Cython, as it was designed specifically for Python.

Cython is a superset of Python that was designed to significantly improve Python's performance by allowing us to call externally compiled code in C/C++ as well as declare types on variables.

The cython command generates an optimized C/C++ source file from a Cython source file; a C/C++ compiler then compiles this source into a Python extension module. Cython offers built-in support for NumPy and combines C's performance with Python's usability.

We will give a quick demonstration of how we can use Cython to significantly speed up our code. Let's define a simple Fibonacci function:

In [17]: def fibonacci(n):
             a, b = 1, 1
             for i in range(n):
                 a, b = a + b, a
             return a
In [18]: fibonacci(100)
Out[18]: 927372692193078999176L
In [19]: %timeit fibonacci(100)
         100000 loops, best of 3: 18.2 µs per loop

Using IPython's %timeit magic, we see that a call takes about 18.2 µs per loop.
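
Outside IPython, a comparable measurement can be taken with the standard library's timeit module. The following is a minimal sketch, assuming Python 3; the loop count of 10000 is an arbitrary choice for illustration, and the absolute timing will differ from the figure above depending on the machine and interpreter version:

```python
import timeit


def fibonacci(n):
    a, b = 1, 1
    for _ in range(n):
        a, b = a + b, a
    return a


number = 10000  # calls per timing run (arbitrary choice)
# Run three timing rounds and keep the best, as %timeit reports "best of 3".
best = min(timeit.repeat(lambda: fibonacci(100), repeat=3, number=number))
per_call_us = best / number * 1e6
print("%d loops, best of 3: %.2f us per loop" % (number, per_call_us))
```

Taking the minimum of several runs reduces the influence of other processes competing for the CPU.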

Let's now rewrite the function in Cython, specifying types for the variables by using the following steps:

  1. First, we load the Cython magic extension into IPython as follows (newer IPython and Cython releases load it with %load_ext Cython instead):
    In [22]: %load_ext cythonmagic
    
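  2. Next, we rewrite our function in Cython, specifying types for our variables:
    In [24]: %%cython
             def cfibonacci(int n):
                 cdef int i, a, b
                 # Initialize a and b as in the pure Python version
                 a, b = 1, 1
                 # Note: a C int has a fixed width, so large results overflow
                 for i in range(n):
                     a, b = a + b, a
                 return a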
    
  3. Let's time our new Cython function:
    In [25]: %timeit cfibonacci(100)
             1000000 loops, best of 3: 321 ns per loop
    
    In [26]: 18.2/0.321
    Out[26]: 56.69781931464174
    
  4. Thus, we can see that the Cython version is 57x faster than the pure Python version!
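
One caveat when comparing the two versions: a Python integer has arbitrary precision, whereas a C int has a fixed width, so the typed version cannot represent fibonacci(100). The following pure Python sketch emulates this under two stated assumptions: that int is 32 bits wide, and that overflow wraps around in two's complement (in C, signed overflow is formally undefined behavior, so wraparound is only what is commonly observed). The helper names wrap_int32 and fibonacci_int32 are hypothetical and exist only for this illustration:

```python
def fibonacci(n):
    a, b = 1, 1
    for _ in range(n):
        a, b = a + b, a
    return a


def wrap_int32(x):
    """Reduce x to the range of a signed 32-bit integer (assumed wraparound)."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def fibonacci_int32(n):
    """Same loop as fibonacci, with every sum truncated to 32 bits."""
    a, b = 1, 1
    for _ in range(n):
        a, b = wrap_int32(a + b), a
    return a


# Find the first n for which the 32-bit emulation and the exact result disagree.
first_bad = next(n for n in range(200) if fibonacci_int32(n) != fibonacci(n))
print("results diverge from n =", first_bad)
print("exact fibonacci(100): ", fibonacci(100))
print("32-bit fibonacci(100):", fibonacci_int32(100))
```

For a like-for-like benchmark, it is safer to time both versions on an input whose result fits in the chosen C type, or to declare a wider type in the Cython version.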

For more references on writing Python extensions using Cython/SWIG or other options, please refer to the following references:
