Threads are another way to start activities running at the same time. They sometimes are called “lightweight processes,” and they are run in parallel like forked processes, but all run within the same single process. For applications that can benefit from parallel processing, threads offer big advantages for programmers:
Because all threads run within the same process, they don’t generally incur a big startup cost to copy the process itself. The costs of both copying forked processes and running threads can vary per platform, but threads are usually considered less expensive in terms of performance overhead.
Threads can be noticeably simpler to program too, especially when some of the more complex aspects of processes enter the picture (e.g., process exits, communication schemes, and “zombie” processes covered in Chapter 10).
Also because threads run in a single process, every thread shares the same global memory space of the process. This provides a natural and easy way for threads to communicate -- by fetching and setting data in global memory. To the Python programmer, this means that global (module-level) variables and interpreter components are shared among all threads in a program: if one thread assigns a global variable, its new value will be seen by other threads. Some care must be taken to control access to shared global objects, but they are still generally simpler to use than the sorts of process communication tools necessary for forked processes we’ll meet later in this chapter (e.g., pipes, streams, signals, etc.).
Perhaps most importantly, threads are more portable than forked
processes. At this writing, the os.fork call is not supported on
Windows at all, but threads are. If you want to run parallel tasks
portably in a Python script today, threads are likely your best bet.
Python's thread tools automatically account for any platform-specific
thread differences, and provide a consistent interface across all
operating systems.
Using threads is surprisingly easy in Python. In fact, when a program
is started it is already running a thread -- usually called the
“main thread” of the process. To start new, independent
threads of execution within a process, we either use the Python
thread
module to run a function call in a spawned
thread, or the Python threading
module to manage
threads with high-level objects. Both modules also provide tools for
synchronizing access to shared objects with locks.
Since the basic thread
module is a bit simpler than the more advanced threading module
covered later in this section, let’s look at some of its
interfaces first. This module provides a
portable interface to whatever threading system
is available on your platform: its interfaces work the same on
Windows, Solaris, SGI, and any system with an installed
“pthreads” POSIX threads implementation (including
Linux). Python scripts that use the Python thread module work on all
of these platforms without changing their source code.
Let’s start off by experimenting with a script that demonstrates the main thread interfaces. The script in Example 3-5 spawns threads until you reply with a “q” at the console; it’s similar in spirit to (and a bit simpler than) the script in Example 3-1, but goes parallel with threads, not forks.
Example 3-5. PP2E\System\Threads\thread1.py

# spawn threads until you type 'q'
import thread

def child(tid):
    print 'Hello from thread', tid

def parent():
    i = 0
    while 1:
        i = i+1
        thread.start_new(child, (i,))
        if raw_input() == 'q':
            break

parent()
There are really only two thread-specific lines in this script: the
import of the thread
module, and the thread
creation call. To start a thread, we simply call the
thread.start_new
function, no matter what platform
we’re programming on.[24]
This call takes a function object and an arguments tuple, and starts
a new thread to execute a call to the passed function with the passed
arguments. It's almost like the built-in
apply
function (and, like apply,
it also accepts an optional keyword arguments dictionary), but in this
case, the function call begins running in parallel with the rest of the program.
Operationally speaking, the thread.start_new
call
itself returns immediately with no useful value, and the thread it
spawns silently exits when the function being run returns (the return
value of the threaded function call is simply ignored). Moreover, if
a function run in a thread raises an uncaught exception, a stack
trace is printed and the thread exits, but the rest of the program
continues.
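As a quick sketch of this behavior in current Python, where the basic thread module survives under the name _thread: the spawn call returns at once, the spawned function's return value is discarded, and its side effects on shared globals remain visible.

```python
# Sketch of start_new's behavior, assuming Python 3's _thread module
# (the modern name for the thread module used in this chapter).
import _thread, time

results = []                      # a shared global: visible to the thread

def child(tid):
    results.append(tid)           # side effects on globals survive; the
    return tid * 2                # return value below is simply ignored

ident = _thread.start_new_thread(child, (1,))   # returns immediately
while not results:                # crude wait; better tools appear later
    time.sleep(0.01)
print(results)                    # → [1]
```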
In practice, though, it’s almost trivial to use threads in a Python script. Let’s run this program to launch a few threads; it can be run on both Linux and Windows this time, because threads are more portable than process forks:
C:\...\PP2E\System\Threads>python thread1.py
Hello from thread 1

Hello from thread 2

Hello from thread 3

Hello from thread 4
q
Each message here is printed from a new thread, which exits almost as
soon as it is started. To really understand the power of threads
running in parallel, we have to do something more long-lived in our
threads. The good news is that threads are both easy and fun to play
with in Python. Let’s mutate the fork-count
program of the prior section to use threads. The script in Example 3-6 starts 10 copies of its
counter
running in parallel threads.
Example 3-6. PP2E\System\Threads\thread-count.py

##################################################
# thread basics: start 10 copies of a function
# running in parallel; uses time.sleep so that
# main thread doesn't die too early--this kills
# all other threads on both Windows and Linux;
# stdout shared: thread outputs may be intermixed
##################################################

import thread, time

def counter(myId, count):             # this function runs in threads
    for i in range(count):
        #time.sleep(1)
        print '[%s] => %s' % (myId, i)

for i in range(10):                   # spawn 10 threads
    thread.start_new(counter, (i, 3)) # each thread loops 3 times

time.sleep(4)
print 'Main thread exiting.'          # don't exit too early
Each parallel copy of the counter
function simply
counts from zero up to two here. When run on Windows, all 10 threads
run at the same time, so their output is intermixed on the standard
output stream:
C:\...\PP2E\System\Threads>python thread-count.py
...some lines deleted...
[5] => 0
[6] => 0
[7] => 0
[8] => 0
[9] => 0
[3] => 1
[4] => 1
[1] => 0
[5] => 1
[6] => 1
[7] => 1
[8] => 1
[9] => 1
[3] => 2
[4] => 2
[1] => 1
[5] => 2
[6] => 2
[7] => 2
[8] => 2
[9] => 2
[1] => 2
Main thread exiting.
In fact, these threads’ output is mixed arbitrarily, at least on Windows -- it may even be in a different order each time you run this script. Because all 10 threads run as independent entities, the exact ordering of their overlap in time depends on nearly random system state at large at the time they are run.
If you care to make this output a bit more coherent, uncomment (that
is, remove the #
before) the
time.sleep(1)
call in the
counter
function and rerun the script. If you do,
each of the 10 threads now pauses for one second before printing its
current count value. Because of the pause, all threads check in at
the same time with the same count; you’ll actually have a
one-second delay before each batch of 10 output lines appears:
C:\...\PP2E\System\Threads>python thread-count.py
...some lines deleted...
[7] => 0
[6] => 0 pause...
[0] => 1
[1] => 1
[2] => 1
[3] => 1
[5] => 1
[7] => 1
[8] => 1
[9] => 1
[4] => 1
[6] => 1 pause...
[0] => 2
[1] => 2
[2] => 2
[3] => 2
[5] => 2
[9] => 2
[7] => 2
[6] => 2
[8] => 2
[4] => 2
Main thread exiting.
Even with the sleep synchronization active, though, there’s no telling in what order the threads will print their current count. It’s random on purpose -- the whole point of starting threads is to get work done independently, in parallel.
Notice that this script sleeps for four seconds at the end. It turns out that, at least on my Windows and Linux installs, the main thread cannot exit while any spawned threads are running; if it does, all spawned threads are immediately terminated. Without the sleep here, the spawned threads would die almost immediately after they are started. This may seem ad hoc, but isn’t required on all platforms, and programs are usually structured such that the main thread naturally lives as long as the threads it starts. For instance, a user interface may start an FTP download running in a thread, but the download lives a much shorter life than the user interface itself. Later in this section, we’ll see different ways to avoid this sleep with global flags, and will also meet a “join” utility in a different module that lets us wait for spawned threads to finish explicitly.
One of the nice things about threads is that they automatically come with a cross-task communications mechanism: shared global memory. For instance, because every thread runs in the same process, if one Python thread changes a global variable, the change can be seen by every other thread in the process, main or child. This serves as a simple way for a program’s threads to pass information back and forth to each other -- exit flags, result objects, event indicators, and so on.
The downside to this scheme is that our threads must sometimes be careful to avoid changing global objects at the same time -- if two threads change an object at once, it’s not impossible that one of the two changes will be lost (or worse, will corrupt the state of the shared object completely). The extent to which this becomes an issue varies per application, and is sometimes a nonissue altogether.
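The safe side of this scheme is easy to sketch. Here is a hedged example in modern Python, using the higher-level threading module covered later in this section: a spawned thread leaves its result in a module-level global, and the main thread reads it after the worker finishes.

```python
# Sketch of cross-thread communication through shared global memory,
# assuming Python 3's threading module (introduced later in this section).
import threading

result = []                       # module-level: visible to all threads

def worker(n):
    result.append(n * 2)          # a change made here is seen globally

t = threading.Thread(target=worker, args=(21,))
t.start()
t.join()                          # wait, then read what the thread left
print(result[0])                  # → 42
```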
But even things that aren’t obviously at risk may be at risk.
Files and streams, for example, are shared by all threads in a
program; if multiple threads write to one stream at the same time,
the stream might wind up with interleaved, garbled data. Here’s
an example: if you edit Example 3-6, comment-out the
sleep call in counter
, and increase the per-thread
count
parameter from 3 to 100, you might
occasionally see the same strange results on Windows that I did:
C:\...\PP2E\System\Threads>python thread-count.py | more
...more deleted...
[5] => 14
[7] => 14
[9] => 14
[3] => 15
[5] => 15
[7] => 15
[9] => 15
[3] => 16 [5] => 16 [7] => 16 [9] => 16
[3] => 17
[5] => 17
[7] => 17
[9] => 17
...more deleted...
Because all 10 threads are trying to write to
stdout
at the same time, once in a while the
output of more than one thread winds up on the same line. Such an
oddity in this script isn’t exactly going to crash the Mars
Lander, but it’s indicative of the sorts of clashes in time
that can occur when our programs go parallel. To be robust, thread
programs need to control access to shared global items like this such
that only one thread uses it at once.[25]
Luckily, Python’s thread
module comes with
its own easy-to-use tools for synchronizing access to shared objects
among threads. These tools are based on the concept of a
lock -- to change a shared object, threads
acquire a lock, make their changes, and then
release the lock for other threads to grab. Lock
objects are allocated and processed with simple and portable calls in
the thread
module, and are automatically mapped to
thread locking mechanisms on the underlying platform.
For instance, in Example 3-7, a lock object created
by thread.allocate_lock
is acquired and released
by each thread around the print
statement that
writes to the shared standard output stream.
Example 3-7. PP2E\System\Threads\thread-count-mutex.py

##################################################
# synchronize access to stdout: because it is
# shared global, thread outputs may be intermixed
##################################################

import thread, time

def counter(myId, count):
    for i in range(count):
        mutex.acquire()
        #time.sleep(1)
        print '[%s] => %s' % (myId, i)
        mutex.release()

mutex = thread.allocate_lock()
for i in range(10):
    thread.start_new_thread(counter, (i, 3))

time.sleep(6)
print 'Main thread exiting.'
Python guarantees that only one thread can acquire a lock at any
given time; all other threads that request the lock are blocked until
a release call makes it available for acquisition. The net effect of
the additional lock calls in this script is that no two threads will
ever execute a print
statement at the same point
in time -- the lock ensures mutually exclusive access to the stdout
stream.
Hence, the output of this script is the same as the original
thread-count.py, except that standard output
text is never munged by overlapping prints.
Incidentally, uncommenting the time.sleep
call in
this version’s counter
function makes each
output line show up one second apart. Because the sleep occurs while
a thread holds the lock, all other threads are blocked while the lock
holder sleeps. One thread grabs the lock, sleeps one second and
prints; another thread grabs, sleeps, and prints, and so on. Given 10
threads counting up to 3, the program as a whole takes 30 seconds (10
x 3) to finish, with one line appearing per second. Of course,
that assumes that the main thread sleeps at least that long too; to
see how to remove this assumption, we need to move on to the next
section.
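Incidentally, in more recent Pythons, lock objects also work as context managers, so the acquire/release pair can be written as a with statement that releases the lock even if the protected code raises an exception. A minimal sketch (note that any sleep belongs outside the with block, so other threads aren't stalled while the lock holder pauses):

```python
# Hedged modern sketch: a lock as a context manager, with the pause
# moved outside the critical section to keep threads running in parallel.
import threading, time

mutex = threading.Lock()
lines = []                                # stand-in for the shared stream

def counter(myId, count):
    for i in range(count):
        with mutex:                       # acquire; auto-release on exit
            lines.append('[%s] => %s' % (myId, i))
        time.sleep(0.001)                 # sleep *outside* the lock

threads = [threading.Thread(target=counter, args=(i, 3)) for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(len(lines))                         # → 12: 4 threads x 3 lines each
```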
Thread module locks are surprisingly useful. They can form the basis of higher-level synchronization paradigms (e.g., semaphores), and can be used as general thread communication devices.[26] For example, Example 3-8 uses a global list of locks to know when all child threads have finished.
Example 3-8. PP2E\System\Threads\thread-count-wait1.py

##################################################
# uses mutexes to know when threads are done
# in parent/main thread, instead of time.sleep;
# lock stdout to avoid multiple prints on 1 line;
##################################################

import thread

def counter(myId, count):
    for i in range(count):
        stdoutmutex.acquire()
        print '[%s] => %s' % (myId, i)
        stdoutmutex.release()
    exitmutexes[myId].acquire()    # signal main thread

stdoutmutex = thread.allocate_lock()
exitmutexes = []
for i in range(10):
    exitmutexes.append(thread.allocate_lock())
    thread.start_new(counter, (i, 100))

for mutex in exitmutexes:
    while not mutex.locked(): pass
print 'Main thread exiting.'
A lock’s locked
method can be used to check
its state. To make this work, the main thread makes one lock per
child, and tacks them onto a global exitmutexes
list (remember, the threaded function shares global scope with the
main thread). On exit, each thread acquires its lock on the list, and
the main thread simply watches for all locks to be acquired. This is
much more accurate than naively sleeping while child threads run, in
hopes that all will have exited after the sleep.
But wait -- it gets even simpler: since threads share global
memory anyhow, we can achieve the same effect with a simple global
list of integers, not locks. In Example 3-9, the module’s namespace (scope) is
shared by top-level code and the threaded function as
before -- name exitmutexes
refers to the same
list object in the main thread and all threads it spawns. Because of
that, changes made in a thread are still noticed in the main thread
without resorting to extra locks.
Example 3-9. PP2E\System\Threads\thread-count-wait2.py

####################################################
# uses simple shared global data (not mutexes) to
# know when threads are done in parent/main thread;
####################################################

import thread
stdoutmutex = thread.allocate_lock()
exitmutexes = [0] * 10

def counter(myId, count):
    for i in range(count):
        stdoutmutex.acquire()
        print '[%s] => %s' % (myId, i)
        stdoutmutex.release()
    exitmutexes[myId] = 1    # signal main thread

for i in range(10):
    thread.start_new(counter, (i, 100))

while 0 in exitmutexes: pass
print 'Main thread exiting.'
The main threads of both of the last two scripts fall into busy-wait
loops at the end, which might become significant performance drains
in tight applications. If so, simply add a
time.sleep
call in the wait loops to insert a
pause between end tests and free up the CPU for other tasks. Even
threads must be good citizens.
Both of the last two counting thread scripts produce roughly the same
output as the original thread-count.py -- albeit without stdout
corruption, and with different random ordering of output lines. The
main difference is that the main thread exits immediately after (and
no sooner than!) the spawned child threads:
C:\...\PP2E\System\Threads>python thread-count-wait2.py
...more deleted...
[2] => 98
[6] => 97
[0] => 99
[7] => 97
[3] => 98
[8] => 97
[9] => 97
[1] => 99
[4] => 98
[5] => 98
[2] => 99
[6] => 98
[7] => 98
[3] => 99
[8] => 98
[9] => 98
[4] => 99
[5] => 99
[6] => 99
[7] => 99
[8] => 99
[9] => 99
Main thread exiting.
Of course, threads are for much more than counting. We’ll put shared global data like this to more practical use in a later chapter, to serve as completion signals from child processing threads transferring data over a network, to a main thread controlling a Tkinter GUI user interface display (see Section 11.4 in Chapter 11).
The standard Python library comes with
two thread modules -- thread
, the basic
lower-level interface illustrated thus far, and
threading
, a higher-level interface based on
objects. The threading
module internally uses the
thread
module to implement objects that represent
threads and common synchronization tools. It is loosely based on a
subset of the Java language’s threading model, but differs in
ways that only Java programmers would notice.[27]
Example 3-10
morphs our counting threads example one last time to demonstrate this
new module’s interfaces.
Example 3-10. PP2E\System\Threads\thread-classes.py

#######################################################
# uses higher-level Java-like threading module object
# join method (not mutexes or shared global vars) to
# know when threads are done in parent/main thread;
# see library manual for more details on threading;
#######################################################

import threading

class mythread(threading.Thread):          # subclass Thread object
    def __init__(self, myId, count):
        self.myId  = myId
        self.count = count
        threading.Thread.__init__(self)
    def run(self):                         # run provides thread logic
        for i in range(self.count):        # still synch stdout access
            stdoutmutex.acquire()
            print '[%s] => %s' % (self.myId, i)
            stdoutmutex.release()

stdoutmutex = threading.Lock()             # same as thread.allocate_lock()
threads = []
for i in range(10):
    thread = mythread(i, 100)              # make/start 10 threads
    thread.start()                         # start run method in a thread
    threads.append(thread)

for thread in threads:
    thread.join()                          # wait for thread exits
print 'Main thread exiting.'
The output of this script is the same as that shown for its ancestors
earlier (again, randomly distributed). Using the
threading
module is largely a matter of
specializing classes. Threads in this module are implemented with a
Thread
object -- a Python class which we
customize per application by providing a run
method that defines the thread’s action. For example, this
script subclasses Thread
with its own
mythread
class;
mythread
’s run
method is
what will be executed by the Thread
framework when
we make a mythread
and call its
start
method.
In other words, this script simply provides methods expected by the
Thread
framework. The advantage of going this more
coding-intensive route is that we get a set of additional
thread-related tools from the framework “for free.” The
Thread.join
method used near the end of this
script, for instance, waits until the thread exits (by default); we
can use this method to prevent the main thread from exiting too
early, rather than the time.sleep
calls and global
locks and variables we relied on in earlier threading examples.
The example script also uses threading.Lock
to
synchronize stream access (though this name is just a synonym for
thread.allocate_lock
in the current
implementation). Besides Thread
and
Lock
, the threading
module also
includes higher-level objects for synchronizing access to shared
items (e.g., Semaphore
,
Condition
, Event
), and more;
see the library manual for details. For more examples of threads and
forks in general, see the following section and the examples in Part III.
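As a small taste of those higher-level tools, here is a hedged sketch of threading.Event: one thread blocks on the event until another sets it, replacing the polling loops used earlier.

```python
# Sketch of threading.Event as a one-shot signal between threads
# (assumes Python 3; see the library manual for Semaphore, Condition).
import threading

ready = threading.Event()
log = []

def waiter():
    ready.wait()                  # block until the event is set
    log.append('woke up')

t = threading.Thread(target=waiter)
t.start()
log.append('signaling')           # runs before set(), so it logs first
ready.set()                       # wake the waiting thread
t.join()
print(log)                        # → ['signaling', 'woke up']
```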
[24] This call is also
available as thread.start_new_thread
, for
historical reasons. It’s possible that one of the two names for
the same function may become deprecated in future Python releases,
but both appear in this text’s examples.
[25] For a more detailed explanation of this phenomenon, see The Global Interpreter Lock and Threads.
[26] They cannot,
however, be used to directly synchronize processes. Since processes
are more independent, they usually require locking mechanisms that
are more long-lived and external to programs. In Chapter 14, we'll meet an
fcntl.flock
library call that allows scripts to
lock and unlock files, and so is ideal as a cross-process locking
tool.
[27] But in
case this means you: Python’s lock and condition variables are
distinct objects, not something inherent in all objects, and
Python’s Thread
class doesn’t have all
the features of Java’s. See Python’s library manual for
further details.