Concurrency

A system's concurrency is the degree to which it is able to perform work simultaneously instead of sequentially. An application written to be concurrent can, in general, execute more units of work in a given time than one written to be sequential or serial.

When we make a serial application concurrent, we make it better utilize the existing compute resources in the system—CPU and/or RAM—at a given time. Concurrency, in other words, is the cheapest way of scaling an application inside a single machine in terms of the cost of compute resources.

Concurrency can be achieved using different techniques. The common ones include the following:

  • Multithreading: The simplest form of concurrency is to rewrite the application to perform tasks in parallel in different threads. A thread is the smallest sequence of programmed instructions that can be managed independently by a scheduler. A program can consist of any number of threads. By distributing tasks across multiple threads, a program can execute more work simultaneously. All threads run inside the same process.
  • Multiprocessing: Another way to scale up a program concurrently is to run it in multiple processes instead of a single process. Multiprocessing involves more overhead than multithreading in terms of message passing and shared memory. However, programs that perform a lot of CPU-intensive computation can benefit more from multiple processes than from multiple threads.
  • Asynchronous Processing: In this technique, operations are performed asynchronously, with no specific ordering of tasks with respect to time. Asynchronous processing usually picks tasks from a queue and schedules them to execute at a future time, often receiving the results via callback functions or special future objects. Asynchronous processing usually happens in a single thread.
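The three techniques above can be sketched on a trivial task. The following is a minimal illustration, not a benchmark: the function names are our own, and it assumes Python 3.7+ for asyncio.run.

```python
import threading
import multiprocessing
import asyncio

def square(n):
    return n * n

# Multithreading: run tasks in separate threads of the same process.
def run_threads(numbers):
    results = {}
    def worker(n):
        results[n] = square(n)
    threads = [threading.Thread(target=worker, args=(n,)) for n in numbers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Asynchronous processing: coroutines scheduled on one thread's event loop.
async def async_square(n):
    await asyncio.sleep(0)   # yield control back to the event loop
    return n * n

def run_async(numbers):
    async def main():
        values = await asyncio.gather(*(async_square(n) for n in numbers))
        return dict(zip(numbers, values))
    return asyncio.run(main())

# Multiprocessing: distribute tasks to a pool of worker processes.
def run_processes(numbers):
    with multiprocessing.Pool(2) as pool:
        return dict(zip(numbers, pool.map(square, numbers)))

if __name__ == "__main__":
    print(run_threads([1, 2, 3]))
    print(run_async([1, 2, 3]))
    print(run_processes([1, 2, 3]))
```

All three produce the same results for this toy workload; they differ in where the work runs—threads of one process, one thread's event loop, or separate processes.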

There are other forms of concurrent computing, but in this chapter, we will focus our attention on only these three.

Python, especially Python 3, has built-in support for all these types of concurrent computing techniques in its standard library. For example, it supports multithreading via its threading module, and multiple processes via its multiprocessing module. Asynchronous execution support is available via the asyncio module. A form of concurrent processing that combines asynchronous execution with threads and processes is available via the concurrent.futures module.

In the coming sections we will take a look at each of these in turn with sufficient examples.

Note

The asyncio module is available only in Python 3.

Concurrency versus parallelism

We will take a brief look at the concept of concurrency and its close cousin, parallelism.

Both concurrency and parallelism are about executing work simultaneously rather than sequentially. However, in concurrency, the two tasks need not be executed at the exact same time; instead, they just need to be scheduled to be executed simultaneously. Parallelism, on the other hand, requires that both the tasks execute together at a given moment in time.

To take a real-life example, let's say you are painting two exterior walls of your house. You have employed just one painter, and you find that he is taking a lot more time than you thought. You can solve the problem in these two ways:

  1. Instruct the painter to paint a few coats on one wall before switching to the other wall, and do the same there. Assuming he is efficient, he will make progress on both walls simultaneously (though never painting both at the same moment), and achieve the same degree of finish on both walls in a given time. This is a concurrent solution.
  2. Employ one more painter. Instruct the first painter to paint the first wall, and the second painter to paint the second wall. This is a parallel solution.

Two threads performing bytecode computations on a single-core CPU do not exactly perform parallel computation, as the CPU can execute only one thread at a time. However, they are concurrent from a programmer's perspective, since the CPU scheduler switches rapidly between the threads so that they appear to run in parallel.

However, on a multi-core CPU, two threads can perform parallel computations at any given time on different cores. This is true parallelism.

Parallel computation requires that compute resources increase at least linearly with the scale of the work. Concurrent computation, on the other hand, can be achieved by using multitasking techniques, where work is scheduled and executed in batches, making better use of existing resources.

Note

In this chapter, we will use the term concurrent uniformly to indicate both types of execution. In some places, it may indicate concurrent processing in the traditional way, and in some others, it may indicate true parallel processing. Use the context to disambiguate.

Concurrency in Python – multithreading

We will start our discussion of concurrent techniques in Python with multithreading.

Python supports multiple threads in programming via its threading module. The threading module exposes a Thread class, which encapsulates a thread of execution. Along with this, it also exposes the following synchronization primitives:

  • A Lock object, which is useful for synchronized, protected access to shared resources, and its reentrant cousin, RLock
  • A Condition object, which is useful for threads to synchronize while waiting for arbitrary conditions
  • An Event object, which provides a basic signaling mechanism between threads
  • A Semaphore object, which allows synchronized access to limited resources
  • A Barrier object, which allows a fixed set of threads to wait for each other, synchronize to a particular state, and proceed
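As a minimal sketch of the most common of these primitives, the following uses a Lock to protect a shared counter incremented from several threads; without the lock, the read-modify-write could interleave and lose updates:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        # The with-statement acquires the lock on entry and releases it
        # on exit, making the read-modify-write below atomic across threads.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 — guaranteed only because of the lock
```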

Thread objects in Python can be combined with the synchronized Queue class in the queue module for implementing thread-safe producer/consumer workflows.
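A minimal producer/consumer sketch using queue.Queue follows; the sentinel convention for stopping the consumer is our own choice, not something the module prescribes:

```python
import queue
import threading

q = queue.Queue(maxsize=5)   # bounded queue: producers block when it is full
SENTINEL = None              # signals the consumer that no more items will come

def producer(items):
    for item in items:
        q.put(item)          # blocks if the queue is full
    q.put(SENTINEL)

def consumer(results):
    while True:
        item = q.get()       # blocks until an item is available
        if item is SENTINEL:
            break
        results.append(item * 2)

results = []
p = threading.Thread(target=producer, args=(range(10),))
c = threading.Thread(target=consumer, args=(results,))
p.start()
c.start()
p.join()
c.join()

print(results)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

Because Queue handles all locking internally, neither thread needs an explicit Lock to share data safely.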
