Only cProfile gave us comprehensive information about the performance of all functions in the asa.py file. However, what happens if you want to drill down further and understand the performance of each line in the Python code? Robert Kern's line_profiler module enables you to do just this, and this is exactly the level of detail that you want for this chapter.
The installation and setup of the line profiler is a little bit more complicated than usual, so we will discuss this in the next recipe.
The steps that are listed will introduce you to profiling with the line_profiler module:
1. To use the line_profiler module, we must first install it using the pip command:
   (sudo) pip install line_profiler
2. Next, download the kernprof.py Python script from the website (http://pythonhosted.org/line_profiler/) and place it in the directory where we are running asa.py.
3. Open the asa.py script in your favorite editor, and decorate the function that we wish to profile line by line. Note that you can profile multiple functions if you so desire. In this case, we decorate the main workhorse function, which is calculate_asa, as follows:
   @profile
   def calculate_asa(atoms, probe, n_sphere_point=960):
4. From the command line, run the line profiler:
   python kernprof.py -l asa.py "1R0R.pdb"
5. This produces the asa.py.lprof output file in the current directory. To view the results, run:
   python -m line_profiler asa.py.lprof
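One practical wrinkle is that the @profile decorator only exists when the script runs under kernprof, so a decorated script raises a NameError when run directly with python. The following is a minimal sketch of a common workaround, assuming Python 3 and assuming that kernprof -l makes profile available through the builtins module. The function count_close_pairs is a hypothetical stand-in for calculate_asa and is not taken from asa.py.

```python
import builtins

# Under "kernprof -l", a `profile` decorator is injected into builtins.
# When the script runs directly, fall back to a decorator that does nothing.
try:
    profile = builtins.profile
except AttributeError:
    def profile(func):
        return func


@profile
def count_close_pairs(points, cutoff):
    """Hypothetical stand-in for calculate_asa: count point pairs closer than cutoff."""
    close = 0
    for i, (x1, y1) in enumerate(points):
        for x2, y2 in points[i + 1:]:
            if (x1 - x2) ** 2 + (y1 - y2) ** 2 < cutoff ** 2:
                close += 1
    return close
```

With this guard in place, the same file can be run with or without the profiler, so the decorator does not need to be removed between profiling runs.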
The line_profiler module provides the treasure trove of information displayed in the following screenshot:
Note that the line_profiler module reports detailed information for each of the lines, referenced by line number, in each profiled function (in this case, calculate_asa), including the all-important percentage of total time (% Time).
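For quick experiments, the same per-line report can also be produced from inside Python instead of through kernprof. The sketch below is an alternative to the command-line workflow of this recipe, not a replacement for it. It assumes that the line_profiler package is importable and uses its LineProfiler class. The names toy_sum_of_squares and profile_lines are hypothetical helpers introduced only for illustration.

```python
import io


def toy_sum_of_squares(n):
    """Small illustrative workload; not part of asa.py."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_lines(func, *args):
    """Return (result, report_text); report_text is None if line_profiler is missing."""
    try:
        from line_profiler import LineProfiler
    except ImportError:
        return func(*args), None
    profiler = LineProfiler()
    wrapped = profiler(func)          # wrap the function so its lines are timed
    result = wrapped(*args)
    buffer = io.StringIO()
    profiler.print_stats(stream=buffer)  # same table that the .lprof viewer prints
    return result, buffer.getvalue()


result, report = profile_lines(toy_sum_of_squares, 1000)
if report is not None:
    print(report)
```

The printed table has one row per source line of the wrapped function, with the hit count, the time spent, and the % Time column described above.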
From this output, we see that the lines of code in the innermost loop (lines 78–81) of the calculate_asa function are eating up over half of the execution time, and it is here that we should focus our optimization effort. Also, notice that there are a lot of variable assignments, lines 70–72, that are chewing up time as well.
Also, notice that the total execution time is a whopping 309.418 seconds. Profiling line by line adds a great deal of overhead to the execution and slows it down by a factor of five. Since optimizing is highly iterative, we do not want to wait 5 minutes for each run to complete. Thus, we want to find some way of decreasing the execution time so that each instrumented profiling run gets completed much faster.
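To get a feel for where this overhead comes from, the sketch below runs a small loop twice: once normally, and once with a Python-level trace function that is called for every executed line. This is only an analogy, under the assumption that any per-line hook adds work to each line; line_profiler uses its own compiled hook, so its actual slowdown will differ. The names busy_loop and run_traced are hypothetical.

```python
import sys
import time


def busy_loop(n):
    """Illustrative workload with a tight loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def run_traced(func, *args):
    """Run func with a per-line trace hook and count the line events seen."""
    line_events = 0

    def tracer(frame, event, arg):
        nonlocal line_events
        if event == "line":
            line_events += 1
        return tracer  # keep tracing inside the new frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the previous hook
    return result, line_events


start = time.perf_counter()
plain_result = busy_loop(100_000)
plain_seconds = time.perf_counter() - start

start = time.perf_counter()
traced_result, line_events = run_traced(busy_loop, 100_000)
traced_seconds = time.perf_counter() - start

print(f"plain:  {plain_seconds:.4f} s")
print(f"traced: {traced_seconds:.4f} s over {line_events} line events")
```

The exact timings depend on the machine and the Python version, but the traced run does extra work for each of the line events it counts, which is the same kind of cost that makes instrumented profiling runs slow.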
Given the nature of this code, the easiest fix would be to decrease the number of "sphere points" that the code loops over. The default value is 960, and if we lower the value, we should expect much lower execution times. After changing this value to 60 and rerunning the code under the Unix time command, we see that the code executes in 11 seconds.
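The reason this knob is so effective can be sketched with a little arithmetic. The code below assumes that the innermost loop runs once per atom per sphere point, so the loop count scales linearly with n_sphere_point. The golden-section spiral used here to place points on a unit sphere is one common way to generate such test points and is an assumption, not necessarily what asa.py does. The atom count is a made-up number for illustration.

```python
import math


def generate_sphere_points(n):
    """Spread n points roughly evenly over a unit sphere (golden-section spiral)."""
    points = []
    increment = math.pi * (3.0 - math.sqrt(5.0))
    offset = 2.0 / n
    for k in range(n):
        y = k * offset - 1.0 + offset / 2.0
        r = math.sqrt(1.0 - y * y)
        phi = k * increment
        points.append((math.cos(phi) * r, y, math.sin(phi) * r))
    return points


def inner_loop_iterations(n_atoms, n_sphere_point):
    """Lower bound on inner-loop passes: one per atom per sphere point."""
    return n_atoms * n_sphere_point


n_atoms = 1000  # hypothetical atom count, for illustration only
full = inner_loop_iterations(n_atoms, 960)
reduced = inner_loop_iterations(n_atoms, 60)
print(f"{full} vs {reduced} iterations, a factor of {full // reduced}")
```

Under this assumption, dropping from 960 to 60 points cuts the inner-loop work by a factor of 16. The measured run time will not shrink by exactly that factor, because reading the input file and other per-atom work do not depend on the number of sphere points.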
The line_profiler module at http://pythonhosted.org/line_profiler/