Plucking the low-hanging (optimization) fruit

When optimizing, always perform just enough to get the job done so as to not introduce new bugs. In this recipe, we introduce some simple optimizations that require minimal modification to the source code and test how effective they are at decreasing the execution time.

Getting ready

To prepare, change the directory to the source directory and open asa.py in an editor.

How to do it…

The following steps walk you through establishing a new performance baseline for the code and implementing several small optimizations:

  1. In the main function, edit the line that sets the number of sphere points, changing it from 960 to 60:
    n_sphere = 60
    
  2. At the command line, use the Unix time command to benchmark the new code configuration and compute the new result:
    time python asa.py 1R0R.pdb
    12114.7 angstrom squared.
    python asa.py 1R0R.pdb  11.05s user 0.04s system 97% cpu 11.334 total
    
  3. Let's copy asa.py into a new file as a local code-versioning system. You can also just use Git if you are comfortable using it to roll back to previous versions of the code. At the command line, enter the following command:
    cp asa.py asa_v1.py
    
  4. Next, we will make several small code changes to test simple optimizations. In the calculate_asa function, change the original version:
    for point in sphere_points:
    is_accessible = True
    test_point.x = point[0]*radius + atom_i.pos.x
    test_point.y = point[1]*radius + atom_i.pos.y
    test_point.z = point[2]*radius + atom_i.pos.z
    

    To:

    atom_i_pos_x = atom_i.pos.x
    atom_i_pos_y = atom_i.pos.y
    atom_i_pos_z = atom_i.pos.z
    n_accessible_point = 0
    for point in sphere_points:
    is_accessible = True
    test_point.x = point[0]*radius + atom_i_pos_x
    test_point.y = point[1]*radius + atom_i_pos_y
    test_point.z = point[2]*radius + atom_i_pos_z
    
  5. Next, we eliminate one of the math.sqrt functions in the find_neighbor_indices function, transforming the original code:
    dist = pos_distance(atom_k.pos, atom_i.pos)
    if dist < radius + atom_i.radius:
    

    Into:

    dist = pos_distance_sq(atom_k.pos, atom_i.pos)
    if dist < (radius + atom_i.radius)**2:
    
  6. Now, let's use the Unix time command again to see the end result:
    time python asa_v1.py 1R0R.pdb
    12114.7 angstrom squared.
    python asa_v1.py 1R0R.pdb  9.42s user 0.03s system 99% cpu 9.500 total
    
  7. Check whether the output returned in the optimized code, 12114.7, is the same value that was computed previously (which it is).

How it works…

With this first optimization effort, we tried to make only very small tweaks to the code and not introduce any new libraries or radical restructuring.

The first trick was to use local variables inside the loops to remove unneeded dereferencing. In the original code, , we must locate the pos attribute of the atom_i object for each point in sphere_points and then access the x attribute of the vector3d.Vector3d object:

for point in sphere_points:
…
test_point.x = point[0]*radius + atom_i.pos.x

We perform this same lookup for each iteration of this loop despite the fact that atom_i has not changed. The simple solution is to pull this lookup to just before the loop and cache the value in a local variable used inside the loop. This is shown in the new version of the code as follows:

atom_i_pos_x = atom_i.pos.x
…
for point in sphere_points:
…
test_point.x = point[0]*radius + atom_i_pos_x

The second optimization we used was to remove an unnecessary call to the square root. Remember, square roots are painfully expensive in comparison to other mathematical operations that can be tested:

In [1]: %timeit sqrt(29111)
10000000 loops, best of 3: 116 ns per loop
In [2]: %timeit 29111 + 372.1
10000000 loops, best of 3: 30.9 ns per loop

This factor of approximately 4x differential is actually on the low side and we have seen square root calculations require ten times the time of additions and multiplications.

In the original code, a square root was concealed in the pos_distance function call. We then test the result against the sum of radius and atom_i.radius:

dist = pos_distance(atom_k.pos, atom_i.pos)
if dist < radius + atom_i.radius:

Assuming that the smallest possible dist value is greater than 1, we can square both sides of the inequality and remove the need for the original square root:

dist = pos_distance_sq(atom_k.pos, atom_i.pos)
if dist < (radius + atom_i.radius)**2:

The two simple changes in the preceding code resulted in an approximately 15 percent decrease in the execution time of the code. We can take these two ideas and roll them out to all other locations in asa.py. We can also implement a number of other simple changes such as preallocating lists that could offer additional performance increases. However, our target is to get the code execution time down from 60 seconds to 1 second, and it is unlikely that these little changes will yield the improvement we need.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.166.167