Appendix A

Process-Based Parallelism using Joblib

A.1 Introduction to Process-Based Parallelism

Running a Python program launches a single Python process that executes its instructions. If we need to process multiple images, for example, with the same instructions, that process handles the images one after another. With 8 images that each take one minute to process, the total processing time is 8 minutes. However, a typical modern computer has multiple cores, each of which can run a separate Python process, so it is preferable to process the images in parallel and use all the available cores. This can be achieved using Python’s joblib module. If a computer has 8 cores and we start 8 Python processes, the processing described above can ideally be completed in about 1 minute, a speedup of 8X. Modern servers often have 12 or more cores, so joblib can yield a considerable speedup.
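
The arithmetic in this paragraph can be sketched in a few lines. The helper name ideal_wall_time is hypothetical, and the model assumes tasks of equal duration, one worker per core, and no startup or communication overhead, so measured speedups will generally be lower.

```python
import math

def ideal_wall_time(n_tasks, minutes_per_task, n_workers):
    """Best-case wall time when equal tasks are spread evenly over workers."""
    # Each worker handles at most ceil(n_tasks / n_workers) tasks in sequence.
    rounds = math.ceil(n_tasks / n_workers)
    return rounds * minutes_per_task

serial = ideal_wall_time(8, 1, 1)    # one process handles all 8 images
parallel = ideal_wall_time(8, 1, 8)  # 8 processes, one image each
print(serial, parallel, serial / parallel)
```

With 8 images and only 3 workers, the same model gives 3 minutes, because one worker must process a third image after the others have finished.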

A.2 Introduction to Joblib

The joblib module ([Job20]) is designed to perform process-based parallelism. It has other features, but we will limit the discussion to parallelism. The joblib mechanism for parallelization is built around a single class called Parallel. This simple mechanism makes it easy to convert existing serial code to parallel code without significant effort from the programmer.

An instance of the Parallel class is called with a generator expression [PC18]. A generator expression in Python produces a generator object, a kind of iterator, which we can iterate over to fetch one value at a time. The function that needs to be parallelized has to be wrapped with joblib’s delayed function, which can also be applied as a decorator.
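
To make the idea of a generator concrete, the following plain-Python sketch (no joblib involved) shows that a generator expression computes its values only when they are requested and can be consumed only once. The variable names are chosen purely for illustration.

```python
def cube(x):
    return x * x * x

# Creating the generator expression computes nothing yet.
gen = (cube(i) for i in range(3))

first = next(gen)      # computes cube(0) only
rest = list(gen)       # consumes the remaining values: cube(1), cube(2)
exhausted = list(gen)  # empty, because the generator is already used up

print(first, rest, exhausted)
```

Because a generator is exhausted after one pass, a generator stored in a variable has to be recreated before it can be iterated over a second time.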

We will discuss a few examples that compute the cubes of the integers from 0 to 9 to demonstrate the syntax of joblib parallel code. We will then conclude by processing images in parallel using joblib.

A.3 Parallel Examples

In the next three examples, we will parallelize the same functionality. The task to parallelize is defined in a function called cube, which takes an argument x and returns its cube. In all cases, we import the class Parallel and the decorator delayed from joblib. The parameter n_jobs determines the number of parallel processes. A value of −1 indicates that the number of parallel processes will be equal to the number of cores. A value of 1 means the tasks run one after another with no parallelism.

When running the examples below, we recommend opening the process monitor in your operating system, such as Task Manager on Windows, Activity Monitor on Mac, or top on Linux, and noticing that the number of new Python processes corresponds to the value of n_jobs.
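
The same observation can be made from within Python by asking each task which process it ran in. This is a minimal sketch: the helper name worker_pid and the choice of n_jobs=2 are assumptions made for illustration, and the exact process IDs differ on every run.

```python
import os
from joblib import Parallel, delayed

def worker_pid(_):
    # Runs inside whichever process joblib assigns to the task.
    return os.getpid()

# n_jobs=2 is an arbitrary choice; 8 tasks are shared among the workers.
pids = Parallel(n_jobs=2)(delayed(worker_pid)(i) for i in range(8))

print("parent process:", os.getpid())
print("processes that ran the tasks:", sorted(set(pids)))
```

With n_jobs=2, the second line typically lists two process IDs that differ from the parent's, matching what the process monitor shows.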

In the first example, the cube function is wrapped with delayed inside the generator expression itself. The generator expression produces each of the values 0, 1, …, 9 and passes it to the wrapped cube function.

from joblib import Parallel, delayed
def cube(x):
    return x*x*x
Parallel(n_jobs=-1)(delayed(cube)(i) for i in range(10))

In the second example, we apply delayed to the cube function using the more familiar @ decorator syntax above the function definition, which makes the generator expression cleaner.

from joblib import Parallel, delayed
 
@delayed
def cube(x):
    return x*x*x
 
Parallel(n_jobs=-1)(cube(i) for i in range(10))

In the third example, we use the decorated cube function and assign the generator expression to a variable. We then pass this generator to the Parallel instance.

from joblib import Parallel, delayed
 
@delayed
def cube(x):
    return x*x*x
 
gen = (cube(i) for i in range(10))
 
Parallel(n_jobs=-1)(gen)

These three mechanisms produce the same result, but the authors find the third method more readable.
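
When run as a script, the examples above discard the list that Parallel returns. The sketch below, which uses the form from the first example, stores that list so the result can be inspected. The variable name results and the choice of n_jobs=2 are assumptions made for illustration.

```python
from joblib import Parallel, delayed

def cube(x):
    return x * x * x

# Parallel returns a list of results in the same order as the inputs.
results = Parallel(n_jobs=2)(delayed(cube)(i) for i in range(10))
print(results)
# → [0, 1, 8, 27, 64, 125, 216, 343, 512, 729]
```

The results come back in input order even though the individual tasks may finish in a different order.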

In the last example, we will discuss a more realistic parallelization case: the sigmoid correction that we discussed in Chapter 5. The sigmoid function takes a folder and a file name as input, reads the image using OpenCV, performs sigmoid correction, and stores the corrected image as a new file in the same folder. The function is decorated with @delayed so that it can be run in parallel. The generator expression (called gen in the example) iterates over a list of file names and calls the sigmoid function for each image. While the code is executing, open the process monitor for your operating system and you will notice multiple Python processes running.

import os
import cv2
from skimage.exposure import adjust_sigmoid
from joblib import Parallel, delayed

@delayed
def sigmoid(folder, file_name):
    # Read the image, apply sigmoid correction, and save the result
    # in the same folder with a 'sigmoid_' prefix.
    path = os.path.join(folder, file_name)
    img = cv2.imread(path)
    img1 = adjust_sigmoid(img, gain=15)
    output_path = os.path.join(folder,
        'sigmoid_' + file_name)
    cv2.imwrite(output_path, img1)

folder = 'input'
file_names = ['angiogram1.png', 'sem2.png',
    'hequalization_input.png']
# One delayed call per file; n_jobs=-1 uses all available cores.
gen = (sigmoid(folder, file_name) for file_name in
    file_names)
Parallel(n_jobs=-1)(gen)
print("Processing completed.")