Profiling Python code using line_profiler

So far, only cProfile has given us comprehensive information about the performance of all functions in the asa.py file. However, what if you want to drill down further and understand the performance of each line of the Python code? Robert Kern's line_profiler module enables you to do just this, and it is exactly the level of detail that we want for this chapter.

Getting ready

The installation and setup of the line profiler is a little bit more complicated than usual, so we will discuss this in the next recipe.

How to do it…

The following steps introduce profiling with the line_profiler module:

  1. To use the line_profiler module, we must first install it using the pip command:
    (sudo) pip install line_profiler 
    
  2. Next, we want to grab the kernprof.py Python script from the website (http://pythonhosted.org/line_profiler/) and place it in the directory where we are running asa.py.
  3. Open the asa.py script in your favorite editor and decorate the function that you wish to profile line by line. Note that you can profile multiple functions if you so desire. In this case, we decorate the main workhorse function, calculate_asa, as follows:
    @profile
    def calculate_asa(atoms, probe, n_sphere_point=960):
    
  4. From the command line (not the Python interpreter), run the following:
    python kernprof.py -l asa.py "1R0R.pdb"
    
  5. Once the run completes, a new output file, asa.py.lprof, appears in the current directory.
  6. To view the results, type the following:
    python -m line_profiler asa.py.lprof
    
  7. The line_profiler module provides a treasure trove of information, as the following screenshot shows:
    [Screenshot: line_profiler output for calculate_asa]
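
One practical wrinkle with step 3 is that the @profile name is only defined when the script runs under kernprof, so running asa.py with the plain interpreter raises a NameError. The following is a minimal sketch of a common workaround, assuming Python 3 (where the module is called builtins) and relying on kernprof injecting profile into that namespace. The body of calculate_asa and the probe value shown here are placeholders for illustration, not the real code from asa.py.

```python
import builtins

# kernprof -l injects a `profile` decorator into builtins; fall back to a
# no-op decorator when the script is run with the plain interpreter.
try:
    profile = builtins.profile
except AttributeError:
    def profile(func):
        return func


@profile
def calculate_asa(atoms, probe, n_sphere_point=960):
    # Placeholder body for illustration only; the real function lives in asa.py.
    return [0.0 for _ in atoms]


if __name__ == "__main__":
    # Hypothetical arguments, used only to show that the call still works.
    print(calculate_asa(["atom1", "atom2"], probe=1.4))
```

With this guard in place, the same file can be run both with and without kernprof, so the decorator does not have to be removed between profiling runs.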

How it works…

Note that the line_profiler module is reporting back detailed information for each of the lines, referenced by line number, in each profiled function (in this case, calculate_asa), including the all-important percentage of total time (% Time).

From this output, we see that the lines of code in the innermost loop (78-81) of the calculate_asa function are eating up over half of the execution time, and it is here that we should focus our optimization effort. Also notice that the many variable assignments on lines 70-72 are chewing up time as well.
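
To make the % Time column concrete, here is a rough sketch of how such a percentage can be derived: each line's share is its recorded time divided by the total time recorded inside the function. The line numbers and timings below are made up for illustration and are not taken from the profiler output, and percent_time is a hypothetical helper, not part of line_profiler.

```python
# Hypothetical per-line timings (line number -> seconds); not real measurements.
line_times = {70: 4.0, 71: 3.0, 72: 3.0, 78: 10.0, 79: 12.0, 80: 9.0, 81: 9.0}


def percent_time(times):
    """Return each line's share of the total recorded time, in percent."""
    total = sum(times.values())
    return {line: 100.0 * t / total for line, t in times.items()}


shares = percent_time(line_times)

# Add up the share of a contiguous block of lines, such as an inner loop.
inner_loop = sum(shares[line] for line in range(78, 82))

for line, pct in sorted(shares.items()):
    print(f"{line:>4} {pct:5.1f}")
print(f"lines 78-81: {inner_loop:.1f}% of total")
```

Summing the shares of a block of lines in this way is a quick check on whether a loop really dominates the run before you spend time optimizing it.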

There's more…

Notice that the total execution time is a whopping 309.418 seconds. Profiling line by line adds a great deal of overhead and slows execution down by a factor of five. Since optimizing is highly iterative, we do not want to wait 5 minutes for each run to complete. Thus, we want to find some way of decreasing the execution time so that each instrumented profiling run completes much faster.

Given the nature of this code, the easiest fix is to decrease the number of "sphere points" that the code loops over. The default value is 960, and if we lower it, we should expect much shorter execution times. Changing this value to 60 and rerunning the code under the Unix time command, we see that it executes in 11 seconds.
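
The effect of lowering the number of sphere points can be sketched with a toy stand-in. The toy_asa function below is hypothetical and is not the real calculate_asa; it only assumes that the work grows with the number of atoms multiplied by n_sphere_point, as nested loops over atoms and sphere points would. The atom count of 200 is arbitrary.

```python
import time


def toy_asa(n_atoms, n_sphere_point=960):
    """Hypothetical stand-in for calculate_asa: the work grows with
    n_atoms * n_sphere_point, mimicking nested loops."""
    checks = 0
    for _ in range(n_atoms):
        for _ in range(n_sphere_point):
            checks += 1  # the real code would test one sphere point here
    return checks


for n in (960, 60):
    start = time.perf_counter()
    checks = toy_asa(200, n_sphere_point=n)
    elapsed = time.perf_counter() - start
    print(f"n_sphere_point={n}: {checks} point checks in {elapsed:.4f} s")
```

In this toy, the work drops by exactly 960/60 = 16 times. The real script also has fixed costs, such as reading the PDB file, so its speedup may well differ.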

See also
