Chapter 3. The Python Programming Language

I just want to go on the record as being completely opposed to computer languages. Let them have their own language and soon they’ll be off in the corner plotting with each other!

Dr. Steven D. Majewski

A key requirement for automated instrumentation is the ability to describe what needs to be done in terms that a computer, or some other type of automated control system, can execute. While the term “programming” might immediately come to mind for some readers, there are actually many ways to do this, some of which don’t even involve a programming language (at least, not in the conventional sense). However, in this book we will be using Python, along with a smattering of C, to create software for automated instrumentation.

This chapter is intended to give you a basic introduction to Python. In the next chapter I’ll introduce the C programming language, which we’ll use to create extensions for Python that will allow you to interface with a vendor’s driver, or create modules for handling computation-intensive chores. This chapter is not intended as an in-depth tutorial or reference for Python; there are many other excellent books available that can fill those roles (refer to the references at the end of this chapter for suggested reading). There is also an extensive collection of documents available at the official Python website, ranging from beginner’s tutorials to advanced topics.

Python was chosen as the primary programming language for this book for several reasons: it’s relatively easy to learn; it doesn’t require a separate compilation step, so one can execute programs simply by loading them (or just typing them in, if you’re brave enough); and it is powerful and full-featured. Python is also unusual in that it supports three different programming models (procedural, object-oriented, and functional) simultaneously. To begin, we will generally be using the procedural paradigm. Later, when we start working with graphical user interface (GUI) designs and extensions written in C, we will encounter situations where it will be necessary to put aside the purely procedural approach and more fully embrace objects by creating our own.

However, as we will see shortly, Python is inherently object-oriented. Even variables are actually objects, so even though Python doesn’t really force the OO paradigm on the programmer, you will still be working with objects. If you’re not clear on what “procedural” and “object-oriented” mean, please see the sidebar below.

Installing Python

The first step is to install Python. In this book we will be using version 2.6 (not 3.x). For the Windows environment, either the freely available ActiveState distribution, which can be found at http://www.activestate.com/activepython/, or the distribution from python.org is fine. Both include a nice help and reference tool tailored to Windows. If you are running Linux, you should try to use your package manager (synaptic, apt-get, rpm, or whatever) to install version 2.6.

If you need to build and install Python from the source code, see this page for more information:

http://docs.python.org/using/unix.html#getting-and-installing-the-latest-version-of-python

The Python Programming Language

Now that you have (hopefully) at least installed Python, we can take a quick tour through some of the main features of the language.

Python is an interpreted language. More accurately, it is a bytecode compiled interpreted language. What this means is that Python performs a single-pass conversion of program text into a compact binary pseudolanguage referred to as bytecode. This is what is actually executed by the interpreter, which is itself a form of virtual computer that uses the bytecode as its instruction set. This approach is common with modern interpreted languages, and if the virtual machine and its instruction set are well designed and optimized, program execution can approach some respectable speeds. Python is highly optimized internally and demonstrates good execution speeds. It will never be as fast as a compiled language that is converted into the raw binary machine language used by the underlying physical processor itself, but for most applications the speed difference is of little concern. This is particularly true when one considers that nowadays the typical processor (the CPU, or central processing unit) in an average PC is running at between 1 and 3 gigahertz (GHz). Way back in time when a CPU running at a speed of 30 megahertz (MHz) or so was considered fast, code efficiency and program execution speed were much bigger concerns.
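To make the bytecode step a little more concrete, here is a minimal sketch that uses the standard dis module to look at what the interpreter actually executes. The function name add_five is made up for illustration, and the exact instructions printed will differ from one Python version to another.

```python
import dis

def add_five(x):
    # A trivial, hypothetical function to give the compiler something to work on
    return x + 5

# The compiled bytecode lives in the function's code object
raw_bytecode = add_five.__code__.co_code
print("bytecode length: %d bytes" % len(raw_bytecode))

# dis.dis() prints a human-readable listing of the bytecode instructions
dis.dis(add_five)
```

The listing is not something one normally needs to read, but it does show that the source text has been converted into a compact instruction stream before it is run.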

If you are new to Python, or even if you aren’t, the book Python Pocket Reference by Mark Lutz (O’Reilly) is highly recommended. It provides a terse, cut-to-the-chase description of the primary features and capabilities of Python, and it is well organized and actually very readable. It is also small, so you can literally put it into a pocket and have it at hand when needed. Several other excellent books on Python are listed in the suggested reading list at the end of this chapter.

The Python Command Line

How you will start the Python interpreter in interactive mode depends on which operating system you are using. For Windows, the usual method is to first open a command prompt window (this is sometimes erroneously called a “DOS box,” but Windows hasn’t had a real DOS box for a long time). At the prompt (which may look different than what is shown here), type in the following command:

C:\> python

You should see something like this (assuming you’ve installed the ActiveState distribution, but the standard Python distribution is almost identical):

ActivePython 2.6.4.8 (ActiveState Software Inc.) based on
Python 2.6.4 (r264:75706, Nov  3 2009, 13:23:17) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

The procedure is similar for a Linux (or BSD, or Solaris) system. Open a shell window (it shouldn’t matter if the shell is csh, ksh, bash, or whatever) and enter python at the prompt. Assuming that Python has been installed correctly, you will see the startup message.

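The >>> is Python’s command prompt, waiting for you to give it something to do. To exit from the Python command line on a Windows machine, use Ctrl-Z followed by Enter, and on a Linux system use Ctrl-D. Typing quit() or exit() also works on either system; typing just “quit” without the parentheses only prints a reminder of how to exit.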

The Python command line is a great way to explore and experiment. You can get help for just about everything by using the built-in help facility. Just typing help(), with no arguments, results in the following display:

>>> help()

Welcome to Python 2.6!  This is the online help utility.

If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://docs.python.org/tutorial/.

Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules.  To quit this help utility and
return to the interpreter, just type "quit".

To get a list of available modules, keywords, or topics, type "modules",
"keywords", or "topics".  Each module also comes with a one-line summary
of what it does; to list the modules whose summaries contain a given word
such as "spam", type "modules spam".

help>

As the help display states, the tutorial material found on the official website is indeed a good place to get a feel for what Python looks like and how to use it. This chapter takes a somewhat different approach to the language, however, by introducing the reader to the concept of data objects first, and reserving things like operators and statements until a little later. I feel that the underlying object-oriented nature of the language is important enough to be dealt with first, because when creating even trivial programs in Python one will quickly encounter situations that will require the use of some of the capabilities embedded in each type of data object.

Over the years I have observed that when tutorial material on Python attempts to ignore or downplay the fundamental OO nature of the language, the result is often full of statements like “Oh, and by the way...” and “It is also like this, but we won’t worry about that here...” Rather than trying to avoid the topic, we will just deal with it head-on. Having a good understanding of what is going on under the hood helps make it a lot easier to comprehend what is happening when things work correctly, and a whole lot easier to have some idea of what to look for when they don’t. If you’re new to Python, it would probably be a good idea to read through both this section and Python’s online tutorial.

Command-Line Options and Environment

The manpage (manual page) for Python is very informative, but unfortunately it is hard to get at if you only have a Windows machine. On a Linux system, simply type man python at a shell prompt (actually, if Python was installed correctly, this should work on any Unix-ish-type system).

On Windows, you can ask Python for some abbreviated help at the command line by typing:

C:\> python -h

What you get back should look something like this:

usage: python [option] ... [-c cmd | -m mod | file | -] [arg] ...
Options and arguments (and corresponding environment variables):
-B     : don't write .py[co] files on import; also PYTHONDONTWRITEBYTECODE=x
-c cmd : program passed in as string (terminates option list)
-d     : debug output from parser; also PYTHONDEBUG=x
-E     : ignore PYTHON* environment variables (such as PYTHONPATH)
-h     : print this help message and exit (also --help)
-i     : inspect interactively after running script; forces a prompt even
         if stdin does not appear to be a terminal; also PYTHONINSPECT=x
-m mod : run library module as a script (terminates option list)
-O     : optimize generated bytecode slightly; also PYTHONOPTIMIZE=x
-OO    : remove doc-strings in addition to the -O optimizations
-Q arg : division options: -Qold (default), -Qwarn, -Qwarnall, -Qnew
-s     : don't add user site directory to sys.path; also PYTHONNOUSERSITE
-S     : don't imply 'import site' on initialization
-t     : issue warnings about inconsistent tab usage (-tt: issue errors)
-u     : unbuffered binary stdout and stderr; also PYTHONUNBUFFERED=x
         see man page for details on internal buffering relating to '-u'
-v     : verbose (trace import statements); also PYTHONVERBOSE=x
         can be supplied multiple times to increase verbosity
-V     : print the Python version number and exit (also --version)
-W arg : warning control; arg is action:message:category:module:lineno
-x     : skip first line of source, allowing use of non-Unix forms of #!cmd
-3     : warn about Python 3.x incompatibilities that 2to3 cannot trivially fix
file   : program read from script file
-      : program read from stdin (default; interactive mode if a tty)
arg ...: arguments passed to program in sys.argv[1:]

Other environment variables:
PYTHONSTARTUP: file executed on interactive startup (no default)
PYTHONPATH   : ';'-separated list of directories prefixed to the
               default module search path.  The result is sys.path.
PYTHONHOME   : alternate <prefix> directory (or <prefix>;<exec_prefix>).
               The default module search path uses <prefix>lib.
PYTHONCASEOK : ignore case in 'import' statements (Windows).
PYTHONIOENCODING: Encoding[:errors] used for stdin/stdout/stderr.

You will probably not have much need for the majority of the option switches, but occasionally they do come in handy (especially the -i, -tt, and -v switches). The environment variables, particularly PYTHONHOME, are important, and should be set initially according to the installation directions supplied with the distribution of Python that you are using.
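From inside a running program, the effect of the command line and the environment can be inspected with the standard sys and os modules. The following is only a small sketch; what it prints depends entirely on how Python was started and on how your system is configured.

```python
import os
import sys

# sys.argv[0] is the script name; anything after it came from the command line
script_args = sys.argv[1:]
print("arguments: %r" % (script_args,))

# PYTHONPATH may not be set at all, so supply a fallback for display
python_path = os.environ.get("PYTHONPATH", "(not set)")
print("PYTHONPATH: %s" % python_path)

# sys.path is the module search path actually in effect
for entry in sys.path:
    print("  %s" % entry)
```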

Objects in Python

Generally speaking, everything in Python is an object, including data variables. Assigning a literal value creates a new object, and so does a function definition. If you’re not familiar with object-oriented concepts, don’t worry too much about it for now (see the sidebar Procedural and Object-Oriented Programming for a nutshell overview). Hopefully it will become clear as we go along. For now, we just want to show what types of objects one can expect to find in Python; we’ll look at how they are used later.

Table 3-1 lists the various object types most commonly encountered in Python. The type class name is what one would expect to be returned by the built-in type() function, or to appear in the message if an error involving a type mismatch occurs.

Table 3-1. Object types

Object type       Type class name  Description
----------------  ---------------  ---------------------------------------------
Character         str              Single-byte character, used in strings (Python
                                   has no separate character type; a character is
                                   a string of length 1)
Integer           int              Signed integer, at least 32 bits (platform
                                   dependent)
Float             float            Double-precision (64-bit) number
Long integer      long             Arbitrarily large integer
Complex           complex          Contains both the real and imaginary parts
Character string  str              Ordered (array) collection of byte characters
List              list             Ordered collection of objects
Dictionary        dict             Collection of mapped key/value pairs
Tuple             tuple            Similar to a list but immutable
Function          function         A Python function object
Object instance   instance         An instance of a particular class
Object method     instancemethod   A method of an object
Class object      classobj         A class definition
File              file             A disk file object

We will touch on all of these before we’re finished: we’ll start with numeric data and work up to things like lists, tuples, and dictionaries.
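As a quick way to see some of these type names for yourself, the sketch below asks a handful of sample objects what they are. The sample values and the function double are arbitrary choices made for illustration; only types that behave the same way across Python versions are shown.

```python
def double(x):
    # Hypothetical function, included so there is a function object to inspect
    return x * 2

samples = [5, 5.5, 3 + 4j, "text", [1, 2], {"key": "value"}, (1, 2), double]

# type(obj).__name__ gives the bare type class name as a string
type_names = [type(item).__name__ for item in samples]

for item, name in zip(samples, type_names):
    print("%-20r %s" % (item, name))
```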

Data Types in Python

If you’ve done any programming in a language like Pascal or C, you are probably familiar with the notion of a variable. It’s a binary value stored in a particular memory location. Python is different, however, and this is where things start to get interesting. Python provides the usual numeric data types, such as integers, floats, and so on. It also has a complex type, which encapsulates both the real and imaginary parts of a complex number. The key thing is in how Python implements variables.

Numeric data as objects

When a variable is assigned a literal value in Python, what actually happens is that an object is created, the literal value is assigned to it (it becomes an attribute of the object), and then it is “bound” to a name. Objects usually have a special method called a constructor that handles the details of creating (instantiating) a new object in memory and initializing it. Conversely, an object may also have a destructor method to remove it from memory when the program is finished with it. In Python, the removal of an object is usually handled automatically in a process called garbage collection.

Here’s an example of how Python creates a new data object:

>>> some_var = 5

This statement instantiates a new object of type int with a value attribute of 5, and then binds the name some_var to it (we’ll see how name binding works shortly). One could also type the following and get the same result:

>>> some_var = int(5)

In this case, we are explicitly telling Python the object type we want (an integer) by calling the int class constructor and passing it the literal value to be assigned when the new object is instantiated. It is important to note that this is not a “cast” in a C or C++ sense; it is an instantiation of an int object that encapsulates the integer value 5.

This way of doing things may seem a bit odd at first, but one gets used to it fairly quickly. Also, most of the time you can safely ignore the fact that variables are actually objects, and just treat them as you might treat a variable in C or C++:

>>> var_one = 5
>>> var_two = 10
>>> var_one + var_two
15

You can also query an object to see what type it is:

>>> type(some_var)
<type 'int'>

Although I just stated that int() is not a cast, it can be used as something akin to that by letting the data objects do the type conversion themselves when a new object is created:

>>> float_var = 5.5
>>> int_var = int(float_var)
>>> print int_var
5

Notice that the fractional part of float_var vanished as a result of the conversion.
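It is worth being clear that the conversion truncates rather than rounds. The following sketch, using arbitrary sample values, shows the difference and one way to get nearest-integer behavior when that is what is wanted.

```python
positive = int(5.9)         # truncates to 5; it does not round up
negative = int(-5.9)        # truncation is toward zero, giving -5
from_text = int("42")       # a string of digits can be converted as well

# Round first if the nearest integer is wanted
rounded = int(round(5.9))

# A string holding a floating-point representation is rejected outright
try:
    int("5.5")
    accepted = True
except ValueError:
    accepted = False
```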

Octal and hexadecimal integer notation is also supported, and works as in C:

Octal integer

Use a leading 0, as in 0157.

Hexadecimal integer

Use a leading 0x, as in 0x3FE.

Octal and hexadecimal values don’t have their own type classes. This is because when a value written in either format is assigned to a Python variable, it is converted to its integer equivalent:

>>> foo_hex = 0x2A7
>>> print foo_hex
679

This is equivalent to writing:

>>> foo_hex = int("2A7",16)
>>> print foo_hex
679
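The same int() constructor handles other bases, and the conversion can be run in the opposite direction when a value needs to be displayed. This is a small sketch with arbitrary values; it uses string arguments rather than literals so that it behaves the same way on any Python version.

```python
from_hex = int("2A7", 16)      # 679, matching the example above
from_octal = int("157", 8)     # 1*64 + 5*8 + 7 = 111
from_binary = int("1011", 2)   # 8 + 2 + 1 = 11

# Going the other way produces a string, not a number
as_hex_text = hex(679)         # lowercase digits with a 0x prefix
as_formatted = "%X" % 679      # uppercase digits, no prefix
```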

So exactly what is a “data object”? In Python, things like variable names reside in what is called a namespace. There are various levels of namespaces, from the local namespace of a function or method to the global namespace of the Python interpreter’s execution environment. For now, we won’t worry too much about them; we’ll just work with the concept of a local namespace.

Variable names do not have any value, other than the string that makes up the name. They are more like handles or labels that we can attach to things that do have values—namely, objects. Figure 3-1 shows how this works.

Typically objects have methods, or internal functions, that operate on the data encapsulated within them. Python’s data objects are no exception. If we create an integer data object, we can ask Python to describe the object to us using the help() function, like this:

>>> int_var = 5
>>> help(int_var)
Help on int object:

class int(object)
 |  int(x[, base]) -> integer
 |
 |  Convert a string or number to an integer, if possible.  A floating point
 |  argument will be truncated towards zero (this does not include a string
 |  representation of a floating point number!)  When converting a string, use
 |  the optional base.  It is an error to supply a base when converting a
 |  non-string.  If base is zero, the proper base is guessed based on the
 |  string content.  If the argument is outside the integer range a
 |  long object will be returned instead.
 |
 |  Methods defined here:
 |
 |  __abs__(...)
 |      x.__abs__() <==> abs(x)
 |
 |  __add__(...)
 |      x.__add__(y) <==> x+y
 |
 |  __and__(...)
 |      x.__and__(y) <==> x&y
 |
 |  __cmp__(...)
 |      x.__cmp__(y) <==> cmp(x,y)
 |
 |  __coerce__(...)
 |      x.__coerce__(y) <==> coerce(x, y)
 |
 |  __div__(...)
 |      x.__div__(y) <==> x/y
 |
 |  __divmod__(...)
 |      x.__divmod__(y) <==> divmod(x, y)
 |
 |  __float__(...)
 |      x.__float__() <==> float(x)
 |
 |  __floordiv__(...)
 |      x.__floordiv__(y) <==> x//y
 |
 |  __format__(...)
 |
 |  __getattribute__(...)
 |      x.__getattribute__('name') <==> x.name
 |
 |  __getnewargs__(...)
 |
-- More  --

There are more internal methods, and you can peruse them if you are so inclined (just press the space bar for another screenful, Return for another line, or q to return to the prompt), but the main point here is that in Python, data objects “know” how to manipulate their internal data using the built-in methods for a particular class. In other words, the Python interpreter handles the details of converting a statement like this:

5 + 5

into the bytecode equivalent of this, internally:

int(5).__add__(int(5))

and then executing it.
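Conceptually, then, an operator is just another way of spelling a method call. The sketch below shows three equivalent ways to add two integers, followed by a made-up class (Millivolts is purely hypothetical) that supplies its own __add__ method so that the + operator works on its instances.

```python
import operator

direct = 5 + 5
via_method = (5).__add__(5)        # parentheses keep "5." from parsing as a float
via_operator = operator.add(5, 5)  # the function form of the + operator

class Millivolts(object):
    """Hypothetical wrapper type, used only to illustrate __add__."""
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Called when the + operator is applied to two Millivolts objects
        return Millivolts(self.value + other.value)

total = Millivolts(250) + Millivolts(100)
print(total.value)
```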

Figure 3-1. Numeric data objects

The fact that variables in Python really are objects does take a little getting used to. But it is a powerful feature of the language, and because you can selectively ignore this feature it is possible to create what look like procedural programs, when in reality Python is all about objects.

Sequence objects

Python provides three data types for ordered collections of data objects: lists (arrays), strings, and tuples (list-like objects). These are also known as sequence objects. The “sequence” part refers to the fact that each of these data objects may contain zero or more references to other data objects in an ordered sequence. All except for the string type allow their member elements to be any valid Python object. All have methods for manipulating their data; some methods are common to all sequence objects, and some are unique to a particular type. Table 3-2 lists the three sequence types and some of their properties.

Table 3-2. Sequence objects

Type     Mutable?        Delimiters
-------  --------------  ----------
List     Yes             []
String   No, immutable   '' or ""
Tuple    No, immutable   ()

Python sequence objects are either mutable (changeable) or immutable (unchangeable). A list object, for example, is mutable in that its data can be modified. A string, on the other hand, is not mutable. One cannot replace, remove, or insert characters into a string directly. A string object is an immutable collection of character values that is treated as a read-only array of byte-sized data objects.

Note

Actually, this applies only to single-byte character encodings such as ASCII; Unicode text may require more than one byte per character (in UTF-8, any character outside the ASCII range occupies two or more bytes). In this book we’ll only be working with characters in the ASCII range, which UTF-8 encodes as single bytes (see Chapter 12 for more on ASCII and the UTF-8 character encoding standard).

In order to make a change to a string, one must create a new string that incorporates the changes. The original string object remains untouched, even if the same variable name is reused for the new string object (which “unbinds” the original string object; unbound objects tend to evaporate through the process of garbage collection, but that’s a low-level detail we don’t really need to worry about).
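Here is a minimal sketch of what that looks like in practice. The variable names are arbitrary; the point is that the attempt to modify the string in place fails, while building a new string and rebinding the name succeeds and leaves the original object intact.

```python
original = "instrument"
alias = original            # a second name bound to the same string object

try:
    original[0] = "I"       # strings do not support item assignment
    modified_in_place = True
except TypeError:
    modified_in_place = False

# Build a new string instead, and rebind the name to it
original = "I" + original[1:]

print(original)   # the new string object
print(alias)      # the old string object, still unchanged
```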

Lists

A list is Python’s closest equivalent to an array, but it has a few tricks that the arrays in C and Pascal never learned how to do. A list is an ordered sequence, and any element in the list may be replaced with something different. New elements are appended to a list using its append method (there is also a pop method, which removes and returns an element, so a list can serve as a stack or a queue as well), and the contents of a list can be sorted in place. Each element in a list is actually a reference to an object, just as a numeric data variable name is a reference to a numeric data object. In fact, a list can contain references to any valid Python object. Consider the following:

>>> import random
>>> alist = []
>>> alist.append(4)
>>> alist.append(55.89)
>>> alist.append('a short string')
>>> alist.append(random.random)

alist now contains four elements, which are composed of an integer, a floating-point value, a string, and a reference to a method from Python’s random module called, appropriately enough, random (we’ll discuss the import statement in more detail later). We can examine each member element of alist to verify this:

>>> alist[0]
4
>>> alist[1]
55.890000000000001
>>> alist[2]
'a short string'
>>> alist[3]
<built-in method random of Random object at 0x00A29D28>

If we want a random number, all we have to do in order to invoke random() is treat alist[3] as if it were a function by appending the expected parentheses:

>>> alist[3]()
0.87358651337544713

We can change a particular element in alist simply by assigning it a new value:

>>> alist[2]
'a short string'
>>> alist[2] = 'a better string'
>>> alist[2]
'a better string'

Figure 3-2 shows what is going on inside the alist object.

Figure 3-2. List object internal organization
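The pop method mentioned earlier deserves a quick illustration. In this sketch (the list name and values are made up), pop() with no argument takes an item from the end of the list, while pop(0) takes one from the front.

```python
readings = []
readings.append(1.2)
readings.append(3.4)
readings.append(5.6)

newest = readings.pop()     # removes and returns the last item (stack behavior)
oldest = readings.pop(0)    # removes and returns the first item (queue behavior)

readings.insert(0, 0.0)     # insert() places an item at a given index
print(readings)
```

Removing items from the front of a long list is relatively slow, so this is best treated as a convenience for short lists rather than a general-purpose queue.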

We can use a list object to demonstrate Python’s underlying OO nature by entering the following at the Python prompt and observing the results:

>>> list_name = []
>>> list_name.append(0)
>>> list_name.append(1)
>>> list_name
[0, 1]
>>> var_one = list_name
>>> var_two = list_name
>>> var_one
[0, 1]
>>> var_two
[0, 1]
>>> list_name[0] = 9
>>> var_one
[9, 1]
>>> var_two
[9, 1]

Because the names var_one and var_two both refer to the list object initially bound to the name list_name, when list_name is altered the change in the list object is “seen” by both of the other variable names.
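When an independent copy is what you actually want, a new list object has to be created explicitly. The sketch below contrasts the two cases; the names shared and independent are arbitrary.

```python
list_name = [0, 1]
shared = list_name              # another name for the same list object
independent = list(list_name)   # a new list object with the same contents

list_name[0] = 9

same_object = shared is list_name            # both names refer to one object
separate_object = independent is list_name   # the copy is a different object

print(shared)        # reflects the change
print(independent)   # does not
```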

Like most every other object in Python, a list has a collection of methods. These include the indexing methods we’ve already seen, but there are more. Lists can be concatenated and are appended end-to-end in the order specified, like so:

>>> alist1 = [1,2,3,4,5]
>>> alist2 = [6,7,8,9,10]
>>> alist1 + alist2
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

To find the index offset of a particular item in a list, we can use the index() method:

>>> alist2.index(8)
2

We can also reverse the order of a list:

>>> alist1.reverse()
>>> alist1
[5, 4, 3, 2, 1]

And we can sort a list:

>>> slist = [8,22,0,5,16,99,14,-6,42,66]
>>> slist.sort()
>>> slist
[-6, 0, 5, 8, 14, 16, 22, 42, 66, 99]

Notice in the last two examples that the list itself is modified “in place.” That is, a new object is not created as a result of reversing or sorting a list. Lists are mutable.
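If the original order needs to be preserved, the built-in sorted() and reversed() functions produce new objects instead of modifying the list. A brief sketch using the same data as above:

```python
slist = [8, 22, 0, 5, 16, 99, 14, -6, 42, 66]

ordered_copy = sorted(slist)               # new list; slist is untouched
descending = sorted(slist, reverse=True)   # new list, largest value first
backwards = list(reversed(slist))          # new list in reverse order

# The in-place method returns None, which is a common source of surprises
result = slist.sort()
```

Writing something like slist = slist.sort() is therefore a mistake, since it rebinds the name to None.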

Strings

Strings are ordered sequences of byte-value characters. Strings are immutable, meaning that (unlike in C or C++) they cannot be altered in place by using an index and treating them like arrays. In order to modify a string, one must create a new string object. The contents of a string can, however, be referenced using an index into the string.

Here are some string examples:

>>> astr1 = 'This is a short string.'
>>> astr2 = "This is another short string."
>>> astr3 = "This string has 'embedded' single-quote characters."
>>> astr4 = """This is an example
... of a multi-line
... string.
... """
>>>

Although one cannot change the contents of a string using an index value, the data can be read using an index, and Python provides the ability to extract specific parts of a string (or “slices,” as they are called). The result is a new string object. The following line will read the first four characters of the string variable astr1, starting at index 0 and stopping just before index 4 (the character at the ending index is not included):

>>> print astr1[0:4]
This

We could also eliminate the 0 in the index range and just let it be assumed:

>>> print astr1[:4]
This

This form tells Python to extract everything from the start of the string up to, but not including, index 4. We can also extract everything from index 4 to the end of the string:

>>> print astr1[4:]
 is a short string.

Or we can get something from the middle of the string:

>>> print astr1[10:15]
short

Figure 3-3 shows how indexing works in Python.

Figure 3-3. String indexing
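Slices also accept negative index values, which count backward from the end of the string, and an optional third number that sets the step size. A short sketch using the same string as above:

```python
astr1 = 'This is a short string.'

last_char = astr1[-1]         # the final character
last_word = astr1[-7:]        # the last seven characters
without_period = astr1[:-1]   # everything except the final character
every_other = astr1[0:4:2]    # indexes 0 and 2 of the first four characters
backwards = astr1[::-1]       # a step of -1 walks the string in reverse

print(last_word)
print(backwards)
```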

String objects also incorporate a set of methods that perform operations such as capitalization, centering, and counting the occurrences of particular characters, among other things. Methods that transform the text return a new string object.

As with lists, concatenation uses the + operator:

>>> str_cat = astr1 + " " + astr2
>>> print str_cat
This is a short string. This is another short string.

The result is, as you might expect by now, a new string object. Fortunately, Python incorporates garbage collection, and objects that are no longer bound to a name, as in the following situation, are quietly whisked away; their memory is returned to a shared pool for reuse. This is a good thing, as otherwise memory could quickly fill up with abandoned data objects:

>>> the_string = "This is the string."
>>> the_string = the_string[0:4]
>>> the_string
'This'

In this case, the name the_string is initially bound to the string object containing "This is the string.". When a section of the initial string object is pulled out, a new object is created and the name is reassigned to it. The original object, no longer bound, disappears. However, if an object is shared between two or more names, it will persist so long as one name is bound to it. This can come in handy when creating objects that need to hang around for the life of a program.

Other string methods allow you to left- or right-align a string, replace a word in a string, or convert the case of the characters in a string. Here are some examples.

The upper() method converts all alphabetic characters in a string to uppercase:

>>> print astr1.upper()
THIS IS A SHORT STRING.

find() returns the index of the first character in the search pattern string:

>>> print astr1.find('string')
16

The replace() method substitutes the new string for the search pattern:

>>> print astr1.replace('string', 'line')
This is a short line.

The rjust() method (and its counterpart, ljust()) justifies a string in a field, the width of which is the method’s argument:

>>> print astr1.rjust(30)
       This is a short string.

The default fill character is a space, but one can specify an alternative as a second argument:

>>> print astr1.rjust(30,'.')
.......This is a short string.

You can get a listing of the various string methods available by typing help(str) at the Python prompt.
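A few more of these methods are sketched below, again using astr1. Note that count() returns an integer and split() returns a list, whereas center() and join() return new strings.

```python
astr1 = 'This is a short string.'

s_count = astr1.count('s')          # occurrences of lowercase 's'
centered = astr1.center(27, '*')    # pad to 27 characters, equally on both sides
words = astr1.split()               # break the string apart at whitespace
rejoined = '-'.join(words)          # glue the pieces together with hyphens

print(centered)
print(rejoined)
```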

Tuples

The tuple is an interesting data object with many uses. Like a list object, it is an ordered set that may contain zero or more items, but unlike the list, it is immutable. Once created, a tuple cannot be directly modified. Tuples are typically referred to by the number of items they contain. For example, a 2-tuple has, as you might expect, two data objects. A shorthand way of referring to a tuple of any size is to say “n-tuple.” Even a 0-tuple is possible in Python; it isn’t particularly interesting or useful, except perhaps as a placeholder, but Python will let you create one if you really want to.

Whereas lists in Python employ square brackets as delimiters, tuples use parentheses, like this:

>>> tuple2 = (1,2)
>>> tuple2
(1, 2)

The contents of a tuple can be accessed using an index, just as with lists and strings:

>>> tuple4 = (9, 22.5, 0x16, 0)
>>> tuple4
(9, 22.5, 22, 0)
>>> tuple4[2]
22
>>> tuple4[0]
9

Like lists and strings, tuples may be concatenated (with a new tuple as the result):

>>> tuple2
(1, 2)
>>> tuple4
(9, 22.5, 22, 0)
>>> tuple6 = tuple2 + tuple4
>>> tuple6
(1, 2, 9, 22.5, 22, 0)

In this case, we can see that a new tuple object is created.

A tuple cannot be sorted in place, but it can be counted. To find out how many times a particular value or object occurs, we can use the count() method:

>>> tpl = (0, 0, 2, 2, 6, 0, 3, 2, 1, 0)
>>> tpl.count(0)
4
>>> tpl.count(2)
3
>>> tpl.count(6)
1

Since the contents of a tuple are actually references to objects, a tuple can contain any mix of valid Python objects, just like a list object.
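Two things that come up often with tuples are unpacking, where each item is bound to its own name in a single statement, and working around immutability when a sorted version is needed. The following sketch uses a made-up reading tuple for illustration.

```python
reading = (22.5, "degC", 1024)   # hypothetical value, units, and raw count

# Unpacking binds one name to each item, in order
value, units, raw_counts = reading

try:
    reading[0] = 23.0            # tuples do not support item assignment
    changed = True
except TypeError:
    changed = False

tpl = (0, 0, 2, 2, 6, 0, 3, 2, 1, 0)
sorted_tpl = tuple(sorted(tpl))  # sorted() returns a list; convert it back
print(sorted_tpl)
```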

Mapped objects—dictionaries

Python’s dictionary is a unique data object. Instead of an ordered set of data elements, a dictionary contains data in the form of a set of unordered key/value pairs. That is, each data element has an associated key that uniquely identifies it. It is Python’s one and only mapped data object.

Like any other Python data object, a dictionary can be passed as an argument to a function or method, and returned as well. It can be a data element in a tuple or list, and its values can be any valid Python object type. Keys are more restricted: a key must be hashable, which in practice means an immutable object such as an integer, a string, or a tuple whose own items are immutable.

To create a dictionary object, we can initialize it with a set of keys and associated values:

>>> dobj = {0:"zero", 1:"one", "food":"eat", "spam":42}
>>> dobj
{0: 'zero', 1: 'one', 'food': 'eat', 'spam': 42}

To get the value associated with a particular key, we can use what looks like indexing, but is not:

>>> dobj[0]
'zero'
>>> dobj[1]
'one'

If we try a key that isn’t in the dictionary, Python complains:

>>> dobj[2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 2

But so long as it’s a valid key, we will get a valid value back:

>>> dobj["spam"]
42
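To avoid the KeyError, a program can either test for the key first with the in operator or catch the exception. Both approaches are sketched below, along with a tuple used as a key; the (3, 1) key and its value are invented for illustration.

```python
dobj = {0: "zero", 1: "one", "food": "eat", "spam": 42}

has_two = 2 in dobj          # membership test on the keys
has_spam = "spam" in dobj

try:
    value = dobj[2]
except KeyError:
    value = "no such key"

# A tuple works as a key; (3, 1) is a made-up example (say, slot and channel)
dobj[(3, 1)] = "analog input"
print(dobj[(3, 1)])
```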

Dictionaries incorporate a set of powerful methods for manipulating their data. Table 3-3 contains a list of what’s available, and we’ll look at a few in detail.

Table 3-3. Dictionary methods

Method

Description

clear()

Removes all items from a dictionary.

copy()

Performs a “shallow” copy of a dictionary.

get()

Returns the data associated with a key, or a default value if no matching key is found.

has_key()

Returns True if a specified key is in a dictionary, and False otherwise.

items()

Returns a list of a dictionary’s key/value pairs as 2-tuples.

iteritems()

Iterates over the key/value pairs in a dictionary.

iterkeys()

Iterates over the keys in a dictionary.

itervalues()

Iterates over the values in a dictionary.

keys()

Returns a list of the keys in a dictionary.

pop()

Pops off a specific item by key and removes it from the dictionary.

popitem()

Removes an arbitrary key/value pair from the dictionary and returns it as a 2-tuple.

setdefault()

Returns the value for a key if the key exists; otherwise, inserts the key with the given default value and returns that default.

update()

Updates the values with the values from another dictionary. Replaces values in matching keys.

values()

Returns a list of the values in a dictionary.

Note that there is no append() method like the one available for lists. To add a new item to a dictionary, one simply assigns a value to a new key:

>>> dobj[99] = "agent"
>>> dobj
{0: 'zero', 1: 'one', 99: 'agent', 'food': 'eat', 'spam': 42}

Notice that the new key and its associated data are inserted in the dictionary at an arbitrary location. A dictionary is not a sequence object, and data is accessed using keys, so it really is unimportant where it is actually located amongst the other key/value pairs in the data object.

This technique can also be used to modify an existing key’s value:

>>> dobj[1] = "the big one"
>>> dobj
{0: 'zero', 1: 'the big one', 99: 'agent', 'food': 'eat', 'spam': 42}

A safer way to fetch a value from a dictionary is to use the get() method:

>>> dobj.get(99)
'agent'

If we attempt to get a value for a key that doesn’t exist, get() will by default return the special value of None. At the Python command line, this doesn’t show anything:

>>> dobj.get(256)

We can specify a default return value of our choosing, if we so desire, like this:

>>> dobj.get(256,"Nope")
'Nope'

Dictionaries are useful for keeping global data (such as parameters) in one convenient place, and the ability to return a default value allows a program to use predefined parameter values if no externally supplied values are available.
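As a minimal sketch of that idea, the following uses get(), update(), setdefault(), and pop() on a parameter dictionary. The parameter names and values are made up for illustration; they are not taken from any real instrument.

```python
# Hypothetical acquisition parameters; names and values are illustrative only.
defaults = {"sample_rate": 1000, "channels": 4, "gain": 1.0}
supplied = {"gain": 2.5}          # e.g., values read from a configuration file

# get() falls back to the predefined value when a key was not supplied.
gain = supplied.get("gain", defaults["gain"])
rate = supplied.get("sample_rate", defaults["sample_rate"])

# update() merges the supplied values over a copy of the defaults.
params = defaults.copy()
params.update(supplied)

# setdefault() returns the existing value, or stores and returns the default.
timeout = params.setdefault("timeout", 5.0)

# pop() removes a key and returns its value.
channels = params.pop("channels")
```

Copying the defaults before calling update() leaves the original defaults dictionary untouched, which is usually what one wants for a table of fallback values.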

There may be times when we want to get a list of what’s in a dictionary. The items() method returns all of a dictionary’s key/value pairs as a list object of 2-tuples:

>>> dobj.items()
[(0, 'zero'), (1, 'the big one'), (99, 'agent'), ('food', 'eat'), ('spam', 42)]

If we want a list of the keys, we can get one using the keys() method:

>>> dobj.keys()
[0, 1, 99, 'food', 'spam']

Finally, if we are only interested in the values, the values() method comes in handy:

>>> dobj.values()
['zero', 'the big one', 'agent', 'eat', 42]
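Because a dictionary's ordering should not be relied upon, a common approach when predictable output is needed is to sort the keys first. Here is a small sketch; the dictionary contents are invented for the example.

```python
# Hypothetical measurement ranges keyed by unit name.
ranges = {"volts": 10, "amps": 2, "ohms": 1000}

# sorted() on a dictionary yields its keys in sorted order.
names = sorted(ranges)

report = []
for name in names:
    report.append("%s=%d" % (name, ranges[name]))

# The key/value pairs can be sorted the same way (tuples sort by first item).
pairs = sorted(ranges.items())
```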

That should be enough on dictionaries for now. We will see other interesting ways to use dictionaries and the other Python data types later, but in the meantime, feel free to experiment with the Python command line. Trying out new things is one of the best ways to learn about them.

Expressions

In this book, we’re going to use a mathematical-type definition of an expression. That is, an expression is a well-formed sequence of variables and mathematical or logical symbols that does not contain an equals (assignment) symbol but will evaluate to a valid logical or numerical value. A statement (which we will look at shortly) does specify an assignment or some other action, and statements may contain expressions.

Expressions make use of various operators, such as addition, subtraction, comparison, and so on. Expressions may be simple, such as:

a + b

or they may be compound expressions, as in:

((a + b) * c) ** z

Parentheses are used to indicate order of evaluation. In the previous example, the multiplication operator (*) has a higher precedence than addition (+), and exponentiation (**) has a higher precedence than multiplication, so without the parentheses the expression would be evaluated like this:

a + b * c**z

which is clearer if we put the implied parentheses back in:

a + (b * (c**z))

This is definitely not what was wanted in the original expression.

Expressions may contain things other than operators. For example, assume there is a function called epow() that will return the value of e raised to the power of some number or the result of some expression. An expression could contain a call to this function and use it to create a new value:

n + epow(x - (2 * y))

This would be the equivalent of writing n + e^(x - 2y) in standard mathematical notation.
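To try this expression out, one could stand in for epow() with the exp() function from the standard math module. The epow() wrapper below is only an assumption made for this sketch; it is not a built-in Python function.

```python
import math

def epow(value):
    # Hypothetical helper matching the epow() described in the text.
    return math.exp(value)

n, x, y = 1.0, 4.0, 2.0

# x - (2 * y) is 0.0 here, and e raised to the power 0 is 1.0.
result = n + epow(x - (2 * y))
```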

Operators

Now that we’ve seen the data types Python supports and what an expression is, we can look at the various things one can do with them using operators. Python provides a full set of arithmetic, logical, and comparison operators. It also includes operators for bitwise operations, membership tests, and identity tests, and it provides various augmented assignment operators.

Arithmetic operators

Python provides the usual four basic arithmetic operators: addition, subtraction, multiplication, and division. It also has two operators that are not found in some other languages: exponent and floor division. Table 3-4 lists Python’s arithmetic operators.

Table 3-4. Arithmetic operators

Operator

Description

+

Addition

-

Subtraction

*

Multiplication

/

Division

%

Modulus

**

Exponent

//

Floor division

When dealing with a mix of numeric data types, Python will automatically “promote” all of the operands to the highest-level type, and then perform the indicated operation. The type priorities are:

complex
float
long
int

This means that if an expression contains a floating-point value but no complex values, the result will be a floating-point value. If an expression contains a long and no floating-point or complex values, the result will be a long. If an expression contains a complex value, the result will be complex. So, if one has an expression that looks like this:

5.0 * 5

the result will be a floating-point value:

25.0
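The promotion rules can be checked directly with type(). This short sketch leaves out the long type so that it behaves the same way on newer Python versions, where int and long have been merged.

```python
a = 5 * 5            # int with int stays an integer
b = 5.0 * 5          # the int is promoted to float
c = 5.0 + (2 + 0j)   # the float is promoted to complex

kinds = (type(a), type(b), type(c))
```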

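As I mentioned, Python also has a division operator called “floor division.” It returns the quotient rounded down to the nearest whole value. With integer operands the result is an integer; if either operand is a float, the result is returned as a float. Note that in Python 2 the ordinary / operator also performs floor division when both operands are integers, which is why the first two results below are the same. In Python, the behavior of // is like this: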

>>> 5/2
2
>>> 5//2
2
>>> 5.0/2
2.5
>>> 5.0//2
2.0
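One detail that is easy to miss is that floor division rounds toward negative infinity rather than toward zero, which matters when negative values are involved. A brief sketch:

```python
q1 = 7 // 2        # 3
q2 = -7 // 2       # -4: rounded down, not truncated toward zero
q3 = 7.0 // 2      # 3.0: a float operand gives a float result
r = -7 % 2         # 1: the remainder takes the sign of the divisor

# divmod() returns the floor quotient and the remainder together.
quotient, remainder = divmod(-7, 2)
```

The quotient and remainder always satisfy quotient * divisor + remainder == dividend, which is a handy sanity check.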

Logical operators

Python’s logical operators, shown in Table 3-5, act on the truth values of any object.

Table 3-5. Logical operators

Operator

Description

and

Logical AND

or

Logical OR

not

Logical NOT

Python provides the built-in constants True and False for use in logical expressions. Note that any of the following are also considered to be False:

  • The None object

  • Zero (any numeric type)

  • An empty sequence object (list, tuple, or string)

  • An empty dictionary

All other values are considered to be True. It is also common to find 1 and 0 acting as true and false values.
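The built-in bool() function shows the truth value Python assigns to any object. The sketch below also illustrates that and and or return one of their operands rather than a strict True or False, which is why or is sometimes used to supply a fallback value.

```python
falsy = [None, 0, 0.0, "", [], (), {}]
truthy = [1, -1, 0.5, "0", [0], (None,), {"k": 0}]

falsy_results = [bool(v) for v in falsy]     # every entry is False
truthy_results = [bool(v) for v in truthy]   # every entry is True

# or returns the first true operand; and returns the first false operand.
name = "" or "default"
first = [] and "unused"
```

Note that the string "0" and the list [0] are true: they are not empty, even though what they contain looks like zero.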

Comparison operators

Comparison operators evaluate two operands and determine the relationship between them in terms of equality, inequality, and magnitude (see Table 3-6).

Table 3-6. Comparison operators

Operator

Description

==

True if a equals b, else False

!=

True if a does not equal b, else False

<>

Same as !=

>

True if a is greater than b, else False

<

True if a is less than b, else False

>=

True if a is greater than or equal to b, else False

<=

True if a is less than or equal to b, else False

Python expressions that use comparison operators always return a logical true or false.

Bitwise operators

Python’s AND, OR, and XOR operators map across bit-to-bit between the operands; they do not perform arithmetic operations. The bitwise operators are listed in Table 3-7.

Table 3-7. Bitwise operators

Operator

Description

&

Binary AND

|

Binary OR

^

Binary XOR

~

Binary one’s complement

<<

Binary left shift

>>

Binary right shift

The AND operation returns a 1 only in those bit positions where both operands have a 1, whereas the OR will “merge” the bits of both operands, as shown in Figure 3-4.

Figure 3-4. Python bitwise AND and OR operators

The bitwise operators are useful when there is a need to set a particular bit (OR) or test for a bit with a value of 1 (AND). The XOR operator returns the bitwise difference between two operands, as shown in the truth table in Figure 3-5.

Figure 3-5. Python bitwise XOR operator

The one’s complement operator changes the value of each bit to its inverse. That is, a binary value of 00101100 becomes 11010011.
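In instrumentation work these operators are typically applied to status or control words. The following is a sketch only; the bit assignments READY and ERROR are hypothetical and do not correspond to any particular device. Because Python integers are not limited to a fixed width, the result of ~ is masked to 8 bits to reproduce the binary example above.

```python
READY = 0x01      # hypothetical status bits
ERROR = 0x04

status = 0x00
status = status | READY             # set a bit
status |= ERROR                     # set another bit
after_set = status                  # 0x05

is_ready = (status & READY) != 0    # test a bit
status = status & ~ERROR            # clear a bit
after_clear = status                # 0x01

status = status ^ READY             # toggle a bit
after_toggle = status               # 0x00

# One's complement of 00101100, masked to 8 bits, is 11010011.
inverted = ~0x2C & 0xFF
```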

The binary shift operators work by shifting the contents of a data object left or right by the number of bit positions specified by the righthand operand. The effect is the equivalent of multiplication by 2**n for a left shift or division by 2**n for a right shift (where n is the number of bit positions shifted). For example:

>>> 2 << 1
4
>>> 2 << 2
8
>>> 2 << 3
16
>>> 16 >> 2
4
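Shifts are often combined with AND and OR to build masks and to pull fields out of a data word. A small sketch, using an arbitrary 8-bit value chosen for illustration:

```python
bit5_mask = 1 << 5                  # a mask with only bit 5 set

word = 0xA7                         # binary 1010 0111
high_nibble = (word >> 4) & 0x0F    # upper four bits
low_nibble = word & 0x0F            # lower four bits

# Reassemble the original word from its two halves.
rebuilt = (high_nibble << 4) | low_nibble
```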

Assignment operators

As we’ve already seen, assignment in Python involves more than just stuffing some data into a memory location. An assignment is equivalent to instantiating a new data object. Python’s assignment operators are listed in Table 3-8.

Table 3-8. Assignment operators

Operator

Description

=

Simple assignment

+=

Add and assignment (augmented assignment)

-=

Subtract and assignment (augmented assignment)

*=

Multiply and assignment (augmented assignment)

/=

Divide and assignment (augmented assignment)

%=

Modulus and assignment (augmented assignment)

**=

Exponent and assignment (augmented assignment)

//=

Floor division and assignment (augmented assignment)

In addition to the simple assignment operator, Python provides a set of augmented assignment operators as corollaries to each of the arithmetic operators. An augmented assignment first performs the operation and then assigns the result back to the name on the lefthand side of the operator. For example:

>>> a = 1
>>> a += 1
>>> a
2

Membership operators

The membership operators are used to determine whether a value or object exists (in), or doesn’t (not in), within a sequence or dictionary object (see Table 3-9). Note that when used with a dictionary only the keys are tested, not the values.

Table 3-9. Membership operators

Operator

Description

in

Result is True if x is a member of y, else False

not in

Result is True if x is not a member of y, else False

One way to use the in operator would be like this:

if x in some_list:
    DoSomething(x, some_list)

In this case, the function DoSomething() will only be called if x is in some_list. Conversely, one could test to see if something is not in an object:

if x not in some_dict:
    some_dict[x] = new_value

If the key x does not already exist in the dictionary, it will be added along with a value.
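The following sketch shows the membership operators applied to a dictionary, a list, a tuple, and a string. The dictionary contents are invented for the example.

```python
limits = {"low": 0, "high": 10}

key_found = "low" in limits            # True: "low" is a key
value_as_key = 0 in limits             # False: 0 is a value, not a key
value_found = 0 in limits.values()     # True: search the values explicitly

in_list = 3 in [1, 2, 3]               # True
not_in_tuple = "x" not in ("a", "b")   # True
substring = "ab" in "cable"            # True: strings test for substrings

# Add a key only if it is not already present.
if "step" not in limits:
    limits["step"] = 1
```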

Identity operators

Python’s identity operators (shown in Table 3-10) are used to determine if one name refers to the same object as another name (is), or if it does not (is not).

Table 3-10. Identity operators

Operator

Description

is

Result is True if x and y refer to the same object, else False

is not

Result is True if x and y do not refer to the same object, else False

The identity operators are handy when attempting to determine if an object is available for a particular operation. An is expression will evaluate to True if the variable names on either side of the operator refer to the same object. An is not expression will evaluate to True if the variable names on either side of the operator do not refer to the same object.

Here is a (nonexecutable) example:

def GetFilePath(name):
    global pathParse

    if pathParse is None:
        pathParse = FileUtil.PathParse()

    file_path = pathParse(name)
    if len(file_path) > 1:
        return file_path
    else:
        return None

The global name pathParse would be initialized (at the start of the module) to None, but for this function it should refer to an object of the class PathParse in the FileUtil module. If it does not (i.e., it is None), it is instantiated. If the function attempted to use pathParse while it was still None, the call would fail.
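Since the example above cannot be run as is, here is a small executable sketch of the difference between identity (is) and equality (==). The variable names are arbitrary.

```python
a = [1, 2, 3]
b = a              # a second name bound to the same list object
c = [1, 2, 3]      # a separate list with equal contents

same_object = a is b               # True
equal_values = a == c              # True
different_objects = a is not c     # True

b.append(4)        # the change is visible through a as well
a_after = list(a)

# Comparing against None is the most common use of the identity operators.
handle = None
handle_missing = handle is None    # True
```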

Operator precedence

We already saw some of the precedence characteristics of operators in the earlier discussion of expressions, but now let’s take a closer look. Table 3-11 lists Python’s operators in order of precedence, from lowest to highest.

Table 3-11. Operator precedence

Precedence

Operator

Lowest

or

.

and

.

not x

.

in, not in, is, is not, <, <=, >, >=, <>, !=, ==

.

|

.

^

.

&

.

<<, >>

.

+, -

.

*, /, //, %

.

+x, -x, ~x

Highest

**

Parentheses are used to force the order of evaluation, as was shown earlier. If you can’t remember how the evaluation order works, or the default order isn’t what you want, use parentheses as necessary to get the desired result. Using parentheses for clarity is never a bad thing.
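A few cases from the table are worth trying by hand. The sketch below also shows that exponentiation groups from right to left, a detail that is not stated in the table.

```python
r1 = 2 + 3 * 4       # 14: * binds tighter than +
r2 = (2 + 3) * 4     # 20: parentheses force the addition first
r3 = -2 ** 2         # -4: ** binds tighter than the unary minus
r4 = 2 ** 3 ** 2     # 512: evaluated as 2 ** (3 ** 2)
r5 = 1 | 2 & 3       # 3: & binds tighter than |
r6 = not 1 == 2      # True: the comparison is evaluated before not
r7 = 1 + 1 << 2      # 8: + binds tighter than <<
```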

Statements

A typical program is composed of statements, comments, and whitespace (blank lines, spaces, tabs, etc.). Statements are composed of keywords and optional expressions, and specify an action. A statement might be a simple assignment:

>>> some_var = 5

Or it could be a compound set of control statements, such as an if-else construct:

>>> if some_var < 10:
...     print "Yes"
...     print "Indeed"
... else:
...     print "Sorry"
...     print "Nope"
...
Yes
Indeed

Python is also interesting for what it doesn’t have. Those with experience in other languages may notice that there is no “switch” or “case” statement. Python’s if-elif-else construct is usually used for this purpose. There is also nothing that looks like the structure data type in C. Dictionaries and lists can be used to emulate a structure, but it’s often not necessary. Python also does not have a “do”, as in do-until or do-while. It does have a for statement, but it doesn’t work in the way that a C programmer might expect.
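As a sketch of how one might stand in for a switch statement, here are two equivalent approaches: an if-elif-else chain and a dictionary lookup. The state codes and names are hypothetical.

```python
def describe(code):
    # An if-elif-else chain doing the job of a C switch statement.
    if code == 0:
        return "idle"
    elif code == 1:
        return "running"
    elif code == 2:
        return "fault"
    else:
        return "unknown"

# The same mapping expressed as a dictionary lookup with a default.
STATES = {0: "idle", 1: "running", 2: "fault"}

def describe_by_table(code):
    return STATES.get(code, "unknown")
```

The dictionary form tends to be easier to extend when the list of cases is long, while the if-elif-else form is more flexible when each case needs its own logic.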

Indentation

When talking about program structure, one often refers to blocks of statements. A block can be defined as a set of one or more statements that are logically associated. Unlike C and some other languages, Python does not use special characters or reserved words to denote how statements are logically grouped into blocks. It uses indentation. For example, in C, one could write the if-else shown above like this:

if (some_var < 10) {
    printf("Yes\n");
    printf("Indeed\n");
}
else {
    printf("Sorry\n");
    printf("Nope\n");
}

The curly braces tell the C compiler how the statements are grouped, and C does not care how much or how little each statement is indented—in C that’s considered to be “whitespace,” and the compiler ignores it. In Python, however, the indentation is essential, as it tells the interpreter how the code is structured and which statements are logically associated. The amount of indentation is not critical so long as it is consistent. The recommended amount is four spaces for each level, and no tabs (tabs are generally considered somewhat evil because they don’t always move between different editors gracefully—one editor might interpret tabs as four spaces, whereas another might translate tabs to eight spaces).

Some people have issues with Python’s use of indentation to denote blocks of code, and for those with extensive experience in C or C++ it does seem rather odd at first (although it is by no means a new idea in computer science). The advantages claimed for indentation are that it helps to enforce a consistent style across different programs by different authors and that it improves readability. Some people find that using comments such as #endif, #endfor, and #endwhile helps to make large sections of code with multiple levels of indentation easier to read, but we won’t get into that discussion here.

Comments

In Python, a comment is denoted by a # character (sometimes called a hash), and a comment can appear anywhere on a line. The interpreter ignores everything following the hash. Use comments liberally to document your programs, but make the comments worthwhile. A comment like this:

a += 1    # increment by one

isn’t very useful (although comments like it still show up quite often), but a comment like this:

if (a + 1) > maxval:     # do not increment past limit

can help to dispel mystery.

Keywords

Python utilizes 31 distinct reserved keywords, listed in Table 3-12.

Table 3-12. Python’s keywords

and

elif

if

print

as

else

import

raise

assert

except

in

return

break

exec

is

try

class

finally

lambda

while

continue

for

not

with

def

from

or

yield

del

global

pass

 

We will examine some of the more commonly used keywords in the remainder of this chapter. Others will be introduced as necessary when we start developing some larger and more complex programs.

Simple statements

In Python, a simple statement (see Table 3-13) is one that consists of an assignment or keyword in a single line; there are no other components. The statement may have more than one expression, however.

Table 3-13. Simple statements

Keyword

Description

assert

assert <expression>; if <expression> is not true, an exception will be raised

Assignment (=)

Creates a new data object and assigns (binds) it to a name

Augmented assignment

See Table 3-8

pass

The null operation; when executed, nothing happens

del

Removes the binding between a name or list of names and any associated objects

print

Sends output to the standard output (stdout)

return

May return an optional literal value or the result of an expression

yield

Only used in generator functions

raise

Raises an exception

break

Used in for and while loops to terminate loop execution

continue

Used in for and while loops to force the loop to jump back to the top and immediately start a new cycle

import

Specifies an external module to be included in the current namespace

global

Specifies a list of names that are to be treated as global variables within the current module

exec

Supports dynamic execution of Python code

I have intentionally skipped the del, exec, raise, and yield statements in the following subsections because they really won’t come into play for what we want to do in this book. A discussion of the import statement is deferred until later in this chapter, in the section titled Importing Modules.

assert

The assert statement is typically used to determine if some condition has been met. If not, an exception is raised. It is heavily used in unit testing and sometimes for catching off-nominal conditions (although there are other ways to do this).
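A minimal sketch of assert in use follows. The function scale_reading() is a made-up helper for this example. Keep in mind that assertions are disabled when Python is run with the -O option, so they should not be the only line of defense against bad input.

```python
def scale_reading(raw, full_scale):
    # Hypothetical helper: convert a raw count to a fraction of full scale.
    assert full_scale > 0, "full_scale must be positive"
    return float(raw) / full_scale

ok_value = scale_reading(512, 1024)

# A failed assertion raises AssertionError carrying the optional message.
try:
    scale_reading(512, 0)
    failed = False
    message = ""
except AssertionError as err:
    failed = True
    message = str(err)
```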

Assignment

The assignment statement (=) is probably the most basic form of Python statement. As we’ve already seen, an assignment is essentially equivalent to instantiating an object of some type and binding it to a name. We’ve already made extensive use of assignment in the previous sections, so we won’t belabor it any more here.

Augmented assignment

Augmented assignment statements are very useful and show up quite often in Python programs. Because an assignment in Python is a statement rather than an expression, you cannot embed an augmented assignment in an expression. In other words, this won’t work:

if (a += 1) > maxval:

But this will:

if (a + 1) > maxval:

In an augmented assignment, the arithmetic operation is performed first, followed by the assignment. For a list of Python’s augmented assignment operators, see Table 3-8.

pass

The pass statement is a no-op statement that does nothing. It is typically used as a placeholder when a statement is required syntactically. One often finds pass statements in methods of a top-level class that are intended to be overridden by methods in a child class. They may also appear in “callback” functions or methods that don’t really need to do anything, but must have a statement to be syntactically complete.
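Both uses can be sketched in a few lines. The class and function names here are hypothetical and exist only for illustration.

```python
class Instrument(object):
    def reset(self):
        pass            # placeholder; child classes supply the real behavior

class Voltmeter(Instrument):
    def reset(self):
        self.range = "auto"

def on_event(event):
    pass                # a callback that intentionally ignores its argument

base_result = Instrument().reset()   # does nothing and returns None
meter = Voltmeter()
meter.reset()
```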

print

print writes the values of one or more objects to stdout, unless stdout has been redirected or the output of print is itself redirected. If print is given an object that is not a string, it will attempt to convert the data to string form. By default, print appends a newline (\n) to the end of the output, but this can be suppressed.

return

The return statement is used to return control from a function or method back to the original caller. The return statement may optionally pass data back to the caller, and the data can be any valid Python object. As mentioned earlier, a function may return a tuple instead of just a single value, which makes it possible to return both a status code and a data value (or more). While it is possible to return a list or a dictionary, this can be problematic in large programs with large and complex data objects because of the inherently opaque nature of these data types. I’ll have more to say about this kind of unintentional obfuscation in a later section.
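Returning a status and a value together might look like the following sketch. The function read_channel() and its data are assumptions made for the example.

```python
def read_channel(readings, channel):
    # Hypothetical helper: returns (status, value) as a 2-tuple.
    if channel not in readings:
        return (False, None)
    return (True, readings[channel])

readings = {0: 1.25, 1: 3.3}

# The caller can unpack the tuple directly into separate names.
ok, value = read_channel(readings, 1)
bad_ok, bad_value = read_channel(readings, 7)
```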

break

The break statement may occur only within a for or while loop. It will terminate the nearest enclosing loop construct and skip the else statement, if there is one.

continue

The continue statement may occur only within a for or while loop. continue forces the loop to return to the for or while statement at the start of the loop. Any subsequent statements past the continue are skipped. If a continue causes control to pass out of a try construct with a finally statement, the finally is executed before the next iteration of the loop. We’ll look at the try-except construct in more detail shortly.

global

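The global statement is used inside a function or method to declare that one or more names refer to module-level variables, so that assignments within the function modify the module-level objects. Without a global declaration, a function can read a module-level name (provided it does not assign to a name of its own with the same spelling), but any assignment creates a new local variable instead.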
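The effect is easiest to see with a module-level counter. This sketch assumes the code sits at the top level of a module or script; the names are arbitrary.

```python
error_count = 0             # module-level name

def record_error():
    global error_count      # assignments now affect the module-level name
    error_count += 1

def peek():
    return error_count      # reading needs no global declaration

def shadow():
    error_count = 100       # no global: this creates a local variable
    return error_count

record_error()
record_error()
seen = peek()
local_value = shadow()
```

After these calls the module-level error_count is 2; the assignment inside shadow() never touches it.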

Compound statements

Compound statements are composed of groups of statements that are logically related and control the execution of other statements. We will take a look at the if, while, for, and try statements, but we will skip the with statement and save the def and class statements until the next section. Table 3-14 lists Python’s compound statements.

Table 3-14. Compound statements

Keyword

Description

if

Conditional test with optional alternate tests or terminal case

while

Executes loop repeatedly while initial condition is True

for

Iterates over elements of an iterable object (e.g., a list, string, or tuple)

try

Defines exception handling for a group of statements

with

Used with context managers

def

Declares a user-defined function or method

class

Declares a user-defined class

The if statement

Python’s if statement behaves as one would expect. Following the keyword if is an expression that will evaluate to either True or False. In its simplest form it is just an if statement and a block of one or more subordinate statements:

if <expression>:
    statement
    (more statements as necessary)

To specify an alternative action, one would use the else statement:

if <expression>:
    statement
    (more statements as necessary)
else:
    statement
    (and yet more statements if necessary)

To create a series of possible outcomes, the elif statement (a compression of “else if”) is used. It is like an if and requires an expression, but it can only appear after an if, never by itself:

if <expression>:
    statement
    (more statements as necessary)
elif <expression>:
    statement
    (more statements as necessary)
else:
    statement
    (and yet more statements if necessary)

The while statement

The while statement repeats a block of statements as long as a control expression is true:

while <expression>:
    statement
    (more statements as necessary)
else:
    statement
    (and yet more statements if necessary)

The else block is executed if the loop terminates normally (i.e., the control expression evaluates to False) and a break statement was not encountered. In the following example, the loop is controlled using a Boolean variable, which is initialized to True and then assigned the value of False from within the loop:

>>> loop_ok = True
>>> loop_cnt = 10
>>> while loop_ok:
...     print "%d Loop is OK" % loop_cnt
...     loop_cnt -= 1
...     if loop_cnt < 0:
...         loop_ok = False
... else:
...     print "%d Loop no longer OK" % loop_cnt
...
10 Loop is OK
9 Loop is OK
8 Loop is OK
7 Loop is OK
6 Loop is OK
5 Loop is OK
4 Loop is OK
3 Loop is OK
2 Loop is OK
1 Loop is OK
0 Loop is OK
-1 Loop no longer OK

The else statement is completely optional.

The continue and break statements may also be used to cause a loop to re-cycle through the while statement immediately or terminate and exit, respectively. If a break statement is used to terminate a loop, the else statement is also skipped. No statements following a continue statement will be evaluated.
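The interaction between break and the else block can be seen in a small search loop. The function first_above() below is a made-up example, not part of any library.

```python
def first_above(readings, threshold):
    # Returns the index of the first reading above threshold, or -1.
    index = 0
    while index < len(readings):
        if readings[index] > threshold:
            break               # leaves the loop and skips the else block
        index += 1
    else:
        return -1               # the loop ran to completion without a break
    return index

found = first_above([0.2, 0.4, 0.9, 1.3], 1.0)
not_found = first_above([0.2, 0.4], 1.0)
```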

The for statement

Python does have a for statement, but not in the sense that one would expect to find in some other languages. In Python, the for statement is used to iterate through a sequence of values. The for statement also includes an optional else statement, just as the while statement does, and it behaves in the same way:

for some_var in <sequence>:
    statement
    (more statements as necessary)
else:
    statement
    (and yet more statements if necessary)

One way to specify a sequence of integer values is to use the built-in function range(), like this:

>>> for i in range(0,5):
...     print i
...
0
1
2
3
4

Another place where for comes in handy is when dealing with a sequence object such as a list:

>>> alist = [1,2,3,4,5,6,7,8,9,10]
>>> for i in alist:
...     print i
...
1
2
3
4
5
6
7
8
9
10

The values that for traverses don’t have to be integers. They could just as well be a set of strings in a tuple:

>>> stuple = ("this","is","a","4-tuple")
>>> for s in stuple:
...     print s
...
this
is
a
4-tuple

Like the while statement, the for statement supports the continue and break statements, and these work as one might expect.
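Here is a sketch that uses continue, break, and else together in a single for loop. The function name and the data are invented for the example.

```python
def sum_until_negative(values):
    # Sums values, skipping None entries and stopping at the first negative.
    total = 0
    for v in values:
        if v is None:
            continue            # skip a missing sample and go to the next one
        if v < 0:
            break               # stop; the else block below is skipped
        total += v
    else:
        return (total, True)    # every item was processed
    return (total, False)       # the loop was cut short by break

complete = sum_until_negative([1, None, 2, 3])
partial = sum_until_negative([1, 2, -1, 5])
```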

The try statement

The try statement is used to trap and handle exceptions, and it is similar to the try-catch found in C++ or Java. It is very useful for creating robust Python applications by allowing the program designer to implement an alternative to the default approach the Python interpreter takes when an error occurs (which is usually to generate what is called a traceback message and then terminate). The full form of the try-except construct looks like this:

try:
    statement
    (more statements as necessary)
except <exception, err_info>:
    statement
    (more statements as necessary)
else:
    statement
    (more statements as necessary)
finally:
    statement
    (and yet more statements if necessary)

The use of a specific exception type (<exception>) is optional, and if it is not given, any exception will invoke the statements in the except block. One way to find out what happened to cause the exception is to use the base class Exception and specify a variable for Python to write the exception information into:

try:
    f = open(fname, "r")
except Exception, err:
    print "File open failed: %s" % str(err)

In this case, if the file open fails the program won’t terminate. Instead, a message will be printed to stdout with some information about why the open statement failed.

The else block is executed only if no exception occurred in the try block. The finally block is always executed on the way out of the try statement, whether or not an exception occurred, and even if the try block is exited by a break, continue, or return statement. Refer to the Python documentation for more information about the try statement and exception handling in Python.
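The order in which the four blocks run can be traced with a short sketch. The function safe_divide() is hypothetical, and the sketch uses the "except ... as err" spelling (available from Python 2.6 onward) rather than the comma form shown earlier, so that it also runs on newer Python versions.

```python
def safe_divide(num, den, log):
    # Records which blocks ran by appending short notes to the log list.
    try:
        result = num / float(den)
    except ZeroDivisionError as err:
        log.append("error: %s" % err)
        result = None
    else:
        log.append("ok")        # runs only when no exception occurred
    finally:
        log.append("done")      # runs in every case
    return result

good_log = []
good = safe_divide(1, 4, good_log)

bad_log = []
bad = safe_divide(1, 0, bad_log)
```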

Strings

The ability to create strings of formatted data is used extensively in many Python programs, and the programs we will encounter in this book are no exception. Python’s string objects provide a rich set of methods, and when they are combined with string formatting Python can generate output with formatted columns, left- or right-justified fields, and specific representations of various data types. Strings are important enough to merit a separate section.

String quotes

A string literal is quoted using one of the following forms:

'A single-quote string.'
"A double-quote string."
'''This is a multiline string using triple single quotes.
It is a medium-length string. '''
"""This is a multiline string with triple double quotes containing many
characters along with some punctuation, and it is a very long string indeed."""

Triple-quoted strings can span more than one line, and newline (\n) characters are inserted into the string automatically to preserve the original formatting.
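A quick way to confirm this is to compare a triple-quoted string with the equivalent single-line literal, as in this small sketch:

```python
text = """first line
second line"""

# The line break inside the triple quotes is stored as a newline character.
same = text == "first line\nsecond line"
newline_count = text.count("\n")
lines = text.splitlines()
```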

String methods

The string type provides numerous methods, some of which we have already seen. Table 3-15 is a complete list (not including the Unicode methods) as given in the Python 2.6 documentation.

Table 3-15. String methods

capitalize

lower

center

lstrip

count

partition

decode

replace

encode

rfind

endswith

rindex

expandtabs

rjust

find

rpartition

format

rsplit

index

rstrip

isalnum

split

isalpha

splitlines

isdigit

startswith

islower

strip

isspace

swapcase

istitle

title

isupper

translate

join

upper

ljust

zfill

Some of these get a lot more use than others, but it’s good to have some idea of what’s available. For the methods we don’t cover here, refer to the Python documentation. Also, remember that the form:

new_string = "string text".method()

works just as well as:

new_string = string_var.method()

Also keep in mind that there needs to be a target name for the new string object created as a result of invoking the method (strings are not mutable); otherwise, the modified string data will simply vanish.
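The point about needing a target name is easy to demonstrate. In this sketch the first call to strip() has no effect on name, because its result is never bound to anything:

```python
name = "  voltage  "

name.strip()                      # the new string is created and discarded
unchanged = name                  # still has its leading and trailing spaces

cleaned = name.strip().upper()    # method calls can be chained
```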

Table 3-16 lists 14 commonly used string methods. Other less commonly used methods will be described as the need arises. In the following descriptions I will use the convention used in the Python documentation to indicate (required) and [optional] parameters.

Table 3-16. Commonly used string methods

Method

Description

capitalize()

Returns a copy of the string with just its first character capitalized.

center(width[,fillchar])

Returns a copy of the text in the string centered in a new string of length width. If fillchar is specified, the new string will be padded to width length on either side of the string text with the fillchar character. The default (if fillchar is omitted) is to use the space character.

count(sub[,start[,end]])

Counts the number of non-overlapping occurrences of the substring sub. The arguments start and end may be used to specify a range within the original string.

find(sub[,start[,end]])

Locates the first occurrence of the substring sub within the original string and returns an index value. If the substring is not found, -1 is returned.

isalnum()

Returns True if all of the characters in the string are alphanumeric (0..9, A..Z, a..z); otherwise, returns False.

isalpha()

Returns True if all of the characters in the string are alphabetic (A..Z, a..z); otherwise, returns False.

isdigit()

Returns True if all of the characters in the string are numeric (0..9), and False otherwise.

islower()

Returns True if all of the alphabetic characters in the string are lowercase (a..z) and there is at least one alphabetic character; otherwise, returns False.

isspace()

Returns True if the string consists of nothing but whitespace characters (space, tab, newline, etc.); otherwise, returns False.

ljust(width[,fillchar])

Returns a new left-justified string of length width. If fillchar is specified, the string will be padded with the specified character. The default pad character is a space. If width is less than the length of the original string, the original is returned unchanged.

lower()

Returns a copy of the original string with all alphabetic characters converted to lowercase.

rjust(width[,fillchar])

Returns a new right-justified string of length width. If fillchar is specified, the string will be padded with the specified character. The default pad character is a space. If width is less than the length of the original string, the original is returned unchanged.

split([sep[,maxsplit]])

Returns a list whose elements are the words from the string using sep as the delimiter. sep may itself be a string. If sep is not specified, all whitespace between words is used as the delimiter (the whitespace can be any length so long as it is contiguous). If maxsplit is given, at most maxsplit splits are performed, so the list will contain no more than maxsplit + 1 items.

upper()

Returns a copy of the original string with all alphabetic characters converted to uppercase.
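Several of these methods are shown at work in the following sketch. The sample strings are arbitrary.

```python
line = "ch1  3.30   OK"

fields = line.split()                 # split on runs of whitespace
two = line.split(None, 1)             # at most one split: two items
parts = "a,b,c,d".split(",", 2)       # at most two splits: three items

title = "run".center(9, "*")
left = "ch1".ljust(6, ".")
right = "42".rjust(6)

pos = "banana".find("na")             # index of the first match
missing = "banana".find("x")          # -1 when there is no match
hits = "banana".count("ana")          # matches are counted without overlap
```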

String formatting

There are basically two ways to format a string with variable data in Python. The first is to use concatenation, which we saw earlier. The second is to make use of Python’s string formatting capability. Which method is most appropriate depends on what you are trying to accomplish. While concatenation is relatively easy with simple strings, it doesn’t provide for a lot of control over things like the number of decimal places, and data in strings with lots of embedded characters can be cumbersome when using concatenation. Consider the following example:

>>> data1 = 5.05567
>>> data2 = 34.678
>>> data3 = 0.00296705087
>>> data4 = 0
>>> runid = 1
>>> outstr1 = "Run "+str(runid)+": "+str(data1)+" "+str(data2)
>>> outstr2 = " "+str(data3)+" : "+str(data4)
>>> outstr = outstr1 + outstr2
>>> outstr
'Run 1: 5.05567 34.678 0.00296705087 : 0'

There is an easier way. Python employs string formatting placeholders that are very similar to those used in the C sprintf() function. By using special formatting codes, one can specify where data is to be inserted into a string and how the data will appear in string form. Here is a string created using formatting placeholders with the same data as above:

>>> outstr = "Run %d: %2.3f %2.3f %2.3f : %d" % (runid, data1, data2, data3, data4)
>>> outstr
'Run 1: 5.056 34.678 0.003 : 0'

Notice that the variables to be used in the string are enclosed in parentheses, forming an n-tuple. If the parentheses are omitted, the % operator is applied to the first value only (it binds more tightly than the comma), and an error will result:

>>> "%d %d %d" % 1, 2, 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not enough arguments for format string

The syntax for a string format placeholder is:

%[(name)][flags][width][.precision]type_code

Each placeholder can have an optional name assigned to it as the first item after the % (the parentheses are required). Following that are optional flags for justification, leading spaces, a sign character, and 0 fill. Next is an optional width value that specifies the minimum amount of space to allow for the data. If the data contains a decimal part, the value of the .precision field specifies the number of decimal places to use. Finally, a type code specifies what kind of data to expect (string, integer, long, floating point, etc.). Table 3-17 lists the available flags. Table 3-18 summarizes the various type codes available.
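
The pieces of the placeholder syntax can be tried out one at a time. The following is a small sketch with made-up values; note that named placeholders take their data from a dictionary on the right side of the % operator rather than from a tuple.

```python
reading = 3.14159   # illustrative values only
count = 42

named = "%(name)s = %(val).2f" % {"name": "pi", "val": reading}  # 'pi = 3.14'
padded = "%06d" % count      # 0 flag with width 6    -> '000042'
signed = "%+d" % count       # + flag                 -> '+42'
left = "%-6d|" % count       # - flag with width 6    -> '42    |'
hexed = "%#x" % 255          # alternate form of x    -> '0xff'
sci = "%.3e" % 12345.678     # precision 3 with e     -> '1.235e+04'
one = "%d items" % (count,)  # a one-item tuple needs the trailing comma
```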

Table 3-17. String format placeholder flags

Flag

Meaning

#

Use “alternate form” formatting (see the notes in Table 3-18).

0

Pad numeric values with a leading zero.

-

Left-adjust (overrides 0 flag if both are specified).

(a space)

Insert a space before positive numbers.

+

Precede the values with a sign character (+ or -). Overrides the “space” flag.

Table 3-18. String format placeholder type codes

Type code

Meaning

Notes

d

Signed integer decimal.

 

i

Signed integer decimal.

 

o

Signed octal value.

The alternate form prepends a leading zero to the number if one is not already present.

u

Obsolete type.

Identical to d.

x

Signed hexadecimal (lowercase).

The alternate form prepends 0x if not already present.

X

Signed hexadecimal (uppercase).

The alternate form prepends 0X if not already present.

e

Floating-point exponential format (lowercase).

The alternate form always uses a decimal point even if no digits follow it.

E

Floating-point exponential format (uppercase).

Alternate form same as e.

f

Floating-point decimal format (lowercase).

Alternate form same as e.

F

Floating-point decimal format (uppercase).

Alternate form same as e.

g

Floating-point format. Uses lowercase exponential format if exponent is less than –4 or not less than precision, and decimal format otherwise.

The alternate form always contains a decimal and trailing zeros are not removed.

G

Floating-point format. Uses uppercase exponential format if exponent is less than –4 or not less than precision, and decimal format otherwise.

Same as g.

c

Single character (accepts an integer or single-character string).

 

r

String (converts any Python object using repr()).

 

s

String (converts any Python object using str()).

 

%

No argument is converted, results in a % character in the result.

 

String methods can be applied along with string formatting in the same statement. This may look a bit odd, but it’s perfectly valid:

>>> "%d %d".ljust(20) % (2, 5)
'2 5               '
>>> "%d %d".rjust(20) % (2, 5)
'               2 5'
>>>

Because there is no assignment of the string object to a name, Python just prints it out immediately after applying the formatting.

Lastly, Python provides a set of so-called escape characters for use with strings. These are special two-character codes composed of a backslash followed by a character, as shown in Table 3-19.

Table 3-19. String escape sequences

Escape sequence

Description

ASCII

\'

Single quote

'

\"

Double quote

"

\\

Single backslash

\

\a

ASCII bell

BEL

\b

ASCII backspace

BS

\f

ASCII formfeed

FF

\n

ASCII linefeed

LF

\r

ASCII carriage return

CR

\t

ASCII horizontal tab

TAB

\v

ASCII vertical tab

VT

A backslash character (\) may also be used for line continuation if it is the last character on a line, followed immediately by a newline (LF or CRLF). This causes the newline to be ignored by the interpreter, so it treats the line and the subsequent line as a single line of code.
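
A short sketch of both uses of the backslash, escape sequences inside strings and line continuation in code, might look like this (the strings are arbitrary examples):

```python
two_lines = "Line 1\nLine 2"   # \n embeds a newline character
columns = "a\tb"               # \t embeds a horizontal tab
quoted = 'It\'s "quoted"'      # \' allows a single quote inside single quotes
path = "C:\\samples"           # \\ yields one literal backslash

# The backslash must be the very last character on the continued line
total = 1 + 2 + \
        3 + 4
```

Each escape sequence counts as a single character, so len(path) is 10 even though eleven characters were typed between the quotes.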

Program Organization

So far we’ve been doing things at Python’s command prompt. Now we’ll look at how to create program modules with functions, classes, and methods.

Scope

Earlier I obliquely referred to the notion of scope without actually defining what it is. Let’s do that now.

As I’ve already mentioned, Python utilizes the concept of namespaces as collections of names that are bound to objects. Actually, a namespace is more like a dictionary object, where the names are the keys and the objects they reference are the values. There are three levels of namespaces in Python: local, global, and built-ins. Figure 3-6 shows the namespace scopes in a Python module.

When a name is referenced in a function or method, a search is first made of the local namespace, including enclosing functions. Next, the global namespace is searched. Finally, the built-in namespace is searched. If the name cannot be found, Python will raise an exception. Figure 3-7 shows the namespace search hierarchy.
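
The search order can be seen in a small sketch like the one below, where a single expression pulls names from all of the levels. The function and variable names are invented for this illustration.

```python
limit = 10                     # lives in the module's global namespace

def outer():
    scale = 2                  # local to outer(), enclosing for inner()

    def inner(value):
        offset = 1             # local to inner()
        # offset: local, scale: enclosing function,
        # limit: global, min: built-in
        return min(value * scale + offset, limit)

    return inner(3), inner(20)

results = outer()              # (7, 10)

# A name found in none of the namespaces raises NameError
try:
    undefined_name
    lookup_failed = False
except NameError:
    lookup_failed = True
```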

Figure 3-6. Python’s namespaces

Local scope

The local scope is the namespace of a particular function, class, or method. In other words, any variables defined within a function are local to that function and are not visible outside of it. The local scope also includes the nearest surrounding function (if any). We will look at nested functions shortly.

Figure 3-7. Namespace search hierarchy

Class objects introduce yet another namespace into the local context. In a class object, any variables defined within the namespace of the class are accessible to any method within the class by prefixing the name with self, like this:

self.some_var

The data variable attributes and methods of an object instance of a class are visible outside of the object and may be accessed using the “dot notation” we’ve already seen:

SomeObj = SomeClass()
SomeObj.var_name = value

This will assign a value to the attribute var_name in the object instance SomeObj. If var_name does not exist, it will be created in the object’s context. This leads us to an interesting observation: Python objects do not have truly private data or methods in the sense that they cannot be accessed from outside of the object. Everything is accessible, although some things are not as readily available as others. You can prefix the name of a function, class, or variable with a leading underscore to prevent it from being included in a wildcard import, but that doesn’t hide it. Using two leading underscore characters will “mangle” the object’s name, but even then it is still accessible if you know how. So, while nothing is really hidden, the onus is on the programmer to be polite and not look.
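
To make the point about "private" data concrete, here is a minimal sketch using a hypothetical Sensor class. It shows an attribute created from outside the object, and a double-underscore attribute that is hidden under its plain name but still reachable under its mangled name.

```python
class Sensor(object):
    def __init__(self):
        self.name = "probe"    # ordinary attribute
        self._cal = 1.02       # single underscore: internal by convention only
        self.__raw = 512       # double underscore: stored as _Sensor__raw

s = Sensor()
s.gain = 4                     # did not exist before; created on assignment

try:
    s.__raw                    # the unmangled name is not found from outside
    visible = True
except AttributeError:
    visible = False

mangled = s._Sensor__raw       # 512: still accessible if you know how
```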

If you’re not sure exactly what this means, don’t worry about it for now. We’ll address objects in more detail later, when we start building user interfaces for our instrumentation applications.

Global scope

The global scope is the namespace of the enclosing module. Functions can read the module’s global variables, but an assignment inside a function creates a local variable instead unless the global statement is used. The following example, named globals.py, illustrates this:

# globals.py

var1 = 0
var2 = 1

def Function1():
    var1 = 1
    var2 = 2

    print var1, var2

def Function2():
    global var1, var2

    print var1, var2

    var1 = 3
    var2 = 4

    print var1, var2

To try it out, we will need to load it using the import statement. This tells Python to read the module and populate the command line’s namespace using what it finds there:

>>> import globals

Once globals is imported, we can use the help() function to see what is inside:

>>> help(globals)
Help on module globals:

NAME
    globals

FILE
    globals.py

FUNCTIONS
    Function1()

    Function2()

DATA
    var1 = 0
    var2 = 1

If we execute Function1, we can verify that the global instances of var1 and var2 are not changed:

>>> globals.var1
0
>>> globals.var2
1
>>> globals.Function1()
1 2
>>> globals.var1
0
>>> globals.var2
1

However, Function2 will change the values assigned to var1 and var2:

>>> globals.Function2()
0 1
3 4
>>> globals.var1
3
>>> globals.var2
4

If a function assigns values to variables with names identical to those in the global namespace, the global statement must be used if the names are referenced before the assignments are made. This example, called globals2.py, illustrates this:

# globals2.py

var1 = 0
var2 = 1

def Function1():
    print var1, var2

def Function2():
    var1 = 1
    var2 = 2

    print var1, var2

def Function3():
    print var1, var2

    var1 = 1
    var2 = 2

    print var1, var2

Observe what happens when we execute the three functions:

>>> import globals2
>>> globals2.Function1()
0 1
>>> globals2.Function2()
1 2
>>> globals2.Function3()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "globals2.py", line 14, in Function3
    print var1, var2
UnboundLocalError: local variable 'var1' referenced before assignment

Function1() succeeded because there was no conflict between its local variables and the module’s global variables. In Function2() the local variables var1 and var2 are defined within the function, so again, there is no problem. However, Function3() causes Python to emit an error message. In this case the use of the global names is blocked because identical names have already been placed into the function’s local namespace, but the names aren’t yet bound to an object containing a value when the print statement is invoked. Hence the UnboundLocalError exception. If the print statement were preceded by a global statement, the error would not have occurred.
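
The effect of adding the global statement can be checked with a condensed sketch such as the following. The function names here are illustrative and are not part of globals2.py.

```python
var1 = 0

def function3():
    # var1 is assigned further down, so Python treats it as local
    # for the whole function body, including this first reference
    observed = var1
    var1 = 1
    return observed

def function3_fixed():
    global var1                # now every use of var1 means the module's var1
    observed = var1
    var1 = 1
    return observed

try:
    function3()
    failed = False
except UnboundLocalError:
    failed = True

before = function3_fixed()     # 0, the module-level value
after = var1                   # 1, changed by function3_fixed()
```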

Built-in scope

The built-in namespace is the Python runtime environment. It includes things like abs(), print, and various exception names. If you want a list of the built-in names, just type dir(__builtins__) at the Python prompt. I won’t list the output here because it’s rather large (144 names at least).

Modules and packages

A Python source code file is called a module. It is a collection of statements composed of variable definition statements, import statements, directly executable statements, function definition statements, and class definition statements, with the variables and methods that go with them.

Modules are contained within packages. A package is, in effect, a directory that contains one or more modules. Packages may contain other packages. Figure 3-8 shows this graphically.

A module is an object, and as we’ve already seen, it has its own namespace. A module also has attributes just like any other Python object. A module’s attributes include the functions, classes, methods, and variables defined in its namespace.
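
Because a module is an ordinary object, it can be examined like one. This short sketch uses the standard math module as a convenient example:

```python
import math

kind = type(math).__name__           # 'module'
name = math.__name__                 # 'math'
has_sqrt = "sqrt" in dir(math)       # functions show up as attributes
root = getattr(math, "sqrt")(16.0)   # same as math.sqrt(16.0)
has_pi = "pi" in vars(math)          # vars() exposes the module's namespace
```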

Functions, classes, and methods

The def statement is used to define both functions within modules and methods within classes:

def SomeName (parameters):
    """ docstring goes here.
    """
    local_var = value

    statement...
    statement...
    more statements...

When used to define a function, the def statement begins at the leftmost column and all of the function’s statements are indented relative to it. When used to define a method in a class, the def statement is indented relative to the class statement.

Figure 3-8. Packages and modules

Functions and methods may be nested. When this is done, the internal functions are not accessible from outside of the enclosing function. Here is a rather contrived example of nested functions called subfuncs.py:

#subfuncs.py

def MainFunc():
    def SubFunc1():
        print "SubFunc1"
    def SubFunc2():
        print "SubFunc2"
    def SubFunc3():
        def SubSubFunc1():
            print "SubSubFunc1"
        def SubSubFunc2():
            print "SubSubFunc2"
        SubSubFunc1()
        SubSubFunc2()
    SubFunc1()
    SubFunc2()
    SubFunc3()

We can only execute the function MainFunc(); none of the other functions nested within it are directly accessible from outside of the scope of MainFunc(). If you import subfuncs and try to get help on it, this is all you will see:

>>> import subfuncs
>>> help(subfuncs)
Help on module subfuncs:
NAME
    subfuncs
FILE
    subfuncs.py
FUNCTIONS
    MainFunc()

However, if we execute MainFunc() we can see that the subfunctions do get executed:

>>> import subfuncs
>>> subfuncs.MainFunc()
SubFunc1
SubFunc2
SubSubFunc1
SubSubFunc2

The class statement defines a class object, which in turn is used to create object instances of the class. The following class defines a timer object that may be used to get elapsed times during program execution:

import time

class TimeDelta:
    def __init__(self):
        self.tstart = 0
        self.tlast  = 0
        self.tcurr  = 0

        self.Reset()

    def GetDelta(self):
        """ Returns time since last call to GetDelta(). """
        self.tcurr = time.clock()
        delta = self.tcurr - self.tlast
        self.tlast = self.tcurr
        return delta

    def GetTotal(self):
        """ Returns time since object created. """
        return time.clock() - self.tstart

    def Reset(self):
        """ Initializes time attributes. """
        self.tstart = time.clock()
        self.tlast = self.tstart

Objects of this class can be instantiated in the code wherever one might want to check on elapsed times, and multiple occurrences may exist simultaneously. This would be rather awkward to do if TimeDelta was a function in a module, but as a class each instance can maintain its own data for when it was started and when it was last checked.
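
Here is a hedged usage sketch with two independent timers. So that it can run on its own, the class is repeated in condensed form, and time.time() is substituted for time.clock() as an assumption of this sketch, because time.clock() was removed in Python 3.8.

```python
import time

class TimeDelta(object):
    """ Condensed copy of the timer class, using time.time(). """
    def __init__(self):
        self.Reset()

    def GetDelta(self):
        tcurr = time.time()
        delta = tcurr - self.tlast
        self.tlast = tcurr
        return delta

    def GetTotal(self):
        return time.time() - self.tstart

    def Reset(self):
        self.tstart = time.time()
        self.tlast = self.tstart

overall = TimeDelta()      # tracks the whole run
step = TimeDelta()         # tracks each stage separately

time.sleep(0.05)           # stand-in for the first stage of work
first = step.GetDelta()
time.sleep(0.05)           # stand-in for the second stage
second = step.GetDelta()
total = overall.GetTotal()
```

Each instance keeps its own tstart and tlast, so calling GetDelta() on step has no effect on overall.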

Docstrings

Docstrings are used to document modules, classes, methods, and functions. A multiline string that appears at the start of a module, function, class, or method is seen by Python as a docstring, and it is stored in the object’s internal __doc__ variable. This is what you are seeing when you type help() for a specific function at the command-line prompt.

The following example shows how docstrings are used. The pass statement has been used so that we can import this code and use help() to display the embedded documentation:

#docstrings.py

""" Module level docstring.

    This describes the overall purpose and features of the module.
    It should not go into detail about each function or class as
    each of those objects has its own docstring.
"""

def Function1():
    """ A function docstring.

        Describes the purpose of the function, its inputs (if any)
        and what it will return (if anything).
    """
    pass

class Class1:
    """ Top-level class docstring.

        Like the module docstring, this is a general high-level
        description of the class. The methods and variable
        attributes are not described here.
    """

    def Method1():
        """ A method docstring.

            Similar to a function docstring.
        """
        pass

    def Method2():
        """ A method docstring.

            Similar to a function docstring.
        """
        pass

When the help() function is used on this module, the following output is the result:

>>> import docstrings
>>> help(docstrings)
Help on module docstrings:

NAME
    docstrings - Module level docstring.

FILE
    docstrings.py

DESCRIPTION
    This describes the overall purpose and features of the module.
    It should not go into detail about each function or class as
    each of those objects has its own docstring.

CLASSES
    Class1

    class Class1
     |  Top-level class docstring.
     |
     |  Like the module docstring, this is a general high-level
     |  description of the class. The methods and variable
     |  attributes are not described here.
     |
     |  Methods defined here:
     |
     |  Method1()
     |      A method docstring.
     |
     |      Similar to a function docstring.
     |
     |  Method2()
     |      A method docstring.
     |
     |      Similar to a function docstring.

FUNCTIONS
    Function1()
        A function docstring.

        Describes the purpose of the function, its inputs (if any)
        and what it will return (if anything).
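
The same text that help() displays can be read directly from the __doc__ attribute. A minimal sketch, using a made-up function and class:

```python
def read_channel(ch):
    """ Return a dummy reading for channel ch.

        Hypothetical function used only to show where docstrings live.
    """
    return 0.0

class Instrument(object):
    """ Placeholder instrument class. """

    def reset(self):
        """ Restore default settings. """
        pass

func_doc = read_channel.__doc__
class_doc = Instrument.__doc__
method_doc = Instrument.reset.__doc__

# The first line of a docstring serves as its one-line summary
first_line = func_doc.strip().splitlines()[0]
```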

Importing Modules

Python modules can bring in functionality from other modules by using the import statement. When a module is imported Python will first check to see if the module has already been imported, and if it has it will refer to the existing objects by including their names in the current namespace. Otherwise, it will load the indicated module, scan it, and add the imported names to the current namespace. Note that “current namespace” may refer to the local namespace of a function, class, or method, or it might be the global namespace of a module.
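
One way to observe this behavior is through sys.modules, the dictionary in which the interpreter records modules that have already been loaded. The sketch below uses the standard json and string modules purely as examples.

```python
import sys
import json                  # first import: the module is loaded and recorded

cached = "json" in sys.modules
first = sys.modules["json"]

import json                  # second import: the existing object is reused
same_object = json is first

def helper():
    import string            # the name is bound in helper()'s local namespace
    return "string" in locals()

local_only = helper()
```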

Statements in a Python module that are not within a function or method will be executed immediately when the module is loaded. This means that any import statements, assignments, def or class statements, or other code will be executed at load time. Code within a function or method is executed only when it is called, although an object for it is created when the def or class statement is processed.

Import methods

The import statement comes in several different forms. This is the most common, and safest, form:

import module

The objects in module are added to the current namespace as references of the form module.function() or module.class(). To access data attributes within a module, the notation module.variable is used.

A variation on this is the aliased import form:

import module as alias

This is identical to the import module statement, except now the alias can be used to reference objects in module. This is handy when a module has a long name. For example:

import CommonReturnCodes as RetCodes

One can also specify what to import from a module:

from module import somename

This form imports a specific function, class, or data attribute from a module. The function or attribute somename can then be used without a module prefix.

The wildcard form imports everything from the external module and adds it to the current namespace:

from module import *

The wildcard import is generally considered to be a bad idea except in special cases, such as when importing a module that has been specifically designed to be used in this fashion and contains only unique names that are unlikely to conflict with existing names. It is considered problematic because it imports everything from the imported module unless special precautions are taken. If the imported module happens to have attributes with the same names as those in the current module, the current names will be overwritten.
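
The import forms can be compared side by side using standard library modules. In this sketch an explicit from-import of a single name stands in for the wildcard form to show how an existing name is silently replaced; a wildcard import would do the same for every public name in the module.

```python
import math                    # plain import: names need the module prefix
import os.path as osp          # aliased import
from math import sqrt          # a specific name, usable without a prefix

root_a = math.sqrt(9.0)
root_b = sqrt(9.0)
joined = osp.join("data", "run1.txt")   # hypothetical path components

e = "my own e"                 # a name already in use in this module
from math import e             # rebinds e without any warning
```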

There is, as one might expect from Python, a way to control what is exported by using single or double leading underscore characters for attribute names. An attribute name of the form:

_some_name

will not be included in a wildcard import, but it can still be referenced using the module prefix notation. A double leading underscore of the form:

__some_name

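is also skipped by a wildcard import. When the double underscore is used for an attribute inside a class definition, Python goes a step further and “mangles” the name into the form _ClassName__some_name, which is about as close as Python gets to data hiding. The attribute can still be accessed from outside the class, but the mangled name makes it more difficult to get at.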

Import processing

Because Python executes any import statements that are not within the scope of a function immediately when a module is imported, it will descend through the import statements in each module in a depth-first fashion until all imports have been processed. Figure 3-9 shows graphically how this works.

Figure 3-9. Module import sequence

The import sequence in Figure 3-9 is indicated by numbers in circles. Module A imports module B, which imports Module C, which in turn imports modules G and H. Module D and the modules it imports will be next in line after module H is processed.

Cyclic imports

One drawback to Python’s import scheme is that it is possible to create situations where modules end up importing each other, directly or indirectly, and the import fails partway through. This is called a cyclic import. Consider the diagram in Figure 3-10.

Figure 3-10. Cyclic import situation

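Here we have a situation where Module A imports Module B, which in turn imports Modules C and D. However, Module C attempts to import Module A, which is still waiting for Module B to finish importing Module C so that Module B can move on to Modules D and E. Python does not start loading Module A a second time; it hands Module C the partially initialized Module A instead. Any name that Module A defines after its import of Module B does not exist yet, so an attempt by Module C to use one at import time fails with an ImportError or AttributeError.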

One sure way to avoid cyclic imports is to remember the rule “Never import up, only down.” This means that modules should be imported hierarchically, and also that modules should be architected such that there is no need to import from a higher-level module. A typical mistake made by many newcomers to Python is to place a set of pseudoconstants (assignments to names with values that don’t change) in a module with other related functionality, and then import the entire module solely to gain access to the pseudoconstant objects. Things like pseudoconstants that are referenced by more than one module should go into their own module, which can then be imported when needed without worrying about causing a cyclic import situation.
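
A cycle is easy to reproduce in a throwaway experiment. The sketch below writes two tiny modules (the names cyc_demo_a and cyc_demo_b are hypothetical) into a temporary directory, imports one of them, and records what the interpreter does when the second module reaches back into the first before it has finished loading.

```python
import os
import shutil
import sys
import tempfile

# Two modules that import each other; cyc_demo_b uses a name that
# cyc_demo_a has not defined yet at the time cyc_demo_b runs
sources = {
    "cyc_demo_a.py": "import cyc_demo_b\nVALUE = 1\n",
    "cyc_demo_b.py": "import cyc_demo_a\nCOPY = cyc_demo_a.VALUE\n",
}

tmpdir = tempfile.mkdtemp()
for fname, text in sources.items():
    f = open(os.path.join(tmpdir, fname), "w")
    f.write(text)
    f.close()

sys.path.insert(0, tmpdir)
try:
    import cyc_demo_a
    outcome = "imported"
except (ImportError, AttributeError) as err:
    # cyc_demo_a was only partially initialized when cyc_demo_b used it
    outcome = type(err).__name__
finally:
    sys.path.remove(tmpdir)
    shutil.rmtree(tmpdir)
```

Moving VALUE into a third module that both of the others import is the kind of restructuring that removes the cycle.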

Loading and Running a Python Program

The following example is a complete Python program that contains no function or class definitions; it is what is commonly referred to as a “script.” It will generate a PGM format image file consisting of random data. The result looks like an old-style TV screen tuned to an empty channel: it’s a lot of “snow.” The main point here is to get a look at what a small Python program looks like. Any image viewer capable of handling PGM files should be able to load and display the image. ImageJ, a free tool from http://rsbweb.nih.gov/ij/, works quite well for this, and http://netpbm.sourceforge.net has information about the PGM image format.

Executing this program doesn’t require that you start the Python interpreter first. Just run python from the command line with the program filename as its only parameter, like this:

C:\samples> python pgmrand.py

or, on Linux:

/home/jmh/samples/% python pgmrand.py

The prompt will most likely look different on your system (unless you’re keeping your Python samples in a directory called “samples”).

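If you are using Linux and want to run the program directly, without typing python in front of its name, you’ll need to make the file executable and put the following line at the top of the program file: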

#! /usr/bin/python

On some systems you may need to modify this to point to where Python is actually installed. A likely alternate location is /usr/local/bin/python.

Here’s the source code:

""" Generates an 8 bpp "image" of random pixel values.

The sequence of operations used to create the PGM output file is as follows:
    1. Create the PGM header, consisting of:
        ID string (P5)
        Image width
        Image height
        Image data size (in bits/pixel)
    2. Generate height x width bytes of random values
    3. Write the header and data to an output file
"""
import random as rnd   # use import alias for convenience

rnd.seed()        # seed the random number generator

# image parameters are hardcoded in this example
width  = 256
height = 256
pxsize = 255      # specify an 8 bpp image

# create the PGM header
hdrstr = "P5\n%d\n%d\n%d\n" % (width, height, pxsize)

# create a list of random values from 0 to 255
pixels = []
for i in range(0,width):
    for j in range(0,height):
        # generate random values of powers of 2
        pixval = 2**rnd.randint(0,8)
        # some values will be 256, so fix them
        if pixval > pxsize:
            pixval = pxsize
        pixels.append(pixval)

# convert array to character values
outpix = "".join(map(chr,pixels))

# append the "image" to the header
outstr = hdrstr + outpix

# and write it out to the disk (binary mode so that Windows
# does not translate the newline characters in the header)
FILE = open("pgmtest.pgm","wb")
FILE.write(outstr)
FILE.close()

The string join() method and the map() function are used to create an output string that is written to the image file.

It would be a worthwhile exercise to review the program and look up the things that don’t immediately make sense to you. The only really “tricky” part is the use of the string join() method and the map() function to create the output string. This was done because Python does not have a native byte type, but its built-in chr() function converts an integer from 0 to 255 into a one-character string. If one wants an array of bytes, one way to get it is to scan through a list of integers, convert each one with chr(), and then join the resulting characters using an empty string as the separator (the "" in the "".join(map(chr,pixels)) statement). Note that all the parameters one might want to change to experiment with the output file are hardcoded in this example.
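
The join/map/chr idiom is easier to follow on a very short list. In this sketch the six values are arbitrary, and the bytearray line is an alternative (available from Python 2.6 onward) that is not used by pgmrand.py.

```python
pixels = [80, 53, 10, 0, 128, 255]     # arbitrary sample values

# chr() turns each integer into a one-character string,
# and join() glues the characters together with nothing in between
as_chars = "".join(map(chr, pixels))

# ord() reverses chr(), recovering the original integers
back = [ord(c) for c in as_chars]

# Alternative: a bytearray holds the same values as raw bytes
raw = bytearray(pixels)
```

The first two values, 80 and 53, are the character codes for "P5", the same ID string that begins the PGM header.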

Basic Input and Output

In order to be generally useful, a program must have some means to input data and output results. Python provides several ways to achieve both objectives using the console, the command line, and file objects. Later on we will examine things like serial ports, USB interfaces, network sockets, and data acquisition hardware, but for now let’s look at what can be done with Python as it comes right out of the box.

User input

Getting user input from stdin (standard input) is straightforward. Python provides the raw_input() function for just this purpose.

The module getInfo.py contains a simple example of how raw_input() can be used:

# getInfo.py

def ask():
    uname = raw_input("What is your name? ")
    utype = raw_input("What kind of being are you? ")
    uhome = raw_input("What planet are you from? ")
    print ""
    print "So, %s, you are a %s from %s." % (uname, utype, uhome)
    uack = raw_input("Is that correct? ")
    if uack[0] in ('y', 'Y'):
        print "Cool. Welcome."
    else:
        print "OK, whatever."

To see how this works, we can import the module getInfo and then call its function ask():

>>> import getInfo
>>> getInfo.ask()
What is your name? zifnorg
What kind of being are you? Zeeble
What planet are you from? Arcturus III

So, zifnorg, you are a Zeeble from Arcturus III.
Is that correct? y
Cool. Welcome.

The raw_input() function accepts an optional prompt string and always returns the data from stdin as a string. If the program is looking for a numeric value, it will need to be converted. A safe way to do this is by using the try-except construct. Here is getInfo2.py with the try-except modification:

def ask2():
    uname = raw_input("What is your name? ")
    utype = raw_input("What kind of being are you? ")
    uhome = raw_input("What planet are you from? ")
    getgumps = True
    while (getgumps):
        intmp = raw_input("How many mucklegumps do you own? ")
        try:
            ugumps = int(intmp)
        except ValueError:
            print "Sorry, you need to enter an integer number."
            continue
        else:
            getgumps = False
    print ""
    print "So, %s, you are a %s from %s, with %d mucklegumps." \
        % (uname, utype, uhome, ugumps)
    uack = raw_input("Is that correct? ")
    if uack[0] in ('y', 'Y'):
        print "Cool. Welcome."
    else:
        print "OK, whatever."

Before we move on, there are a few things to consider about this simple function. First, it will only accept an integer value for the number of “mucklegumps.” Strings and floats will be rejected. Secondly, there is no way for the user to gracefully abort the input process. This could be easily handled by checking for a special character (a ., for example), or just detecting null input (just pressing the Enter key with no input). Speaking of null input, if the user does press the Enter key in response to the last question, Python will raise an exception:

Is that correct? <enter>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "getInfo2.py", line 18, in ask2
    if uack[0] in ('y', 'Y'):
IndexError: string index out of range

The expression in the if statement is attempting to match whatever is in uack[0] with either of the values in the 2-tuple ('y', 'Y'), and just pressing Enter returns a zero-length string, which causes the exception. Using a try-except here will prevent this from happening:

    uack = raw_input("Is that correct? ")
    try:
        if uack[0] in ('y', 'Y'):
            print "Cool. Welcome."
        else:
            print "OK, whatever."
    except IndexError:
        print "Fine. Have a nice day."

When dealing with user input (that is, whatever a human being types in response to a prompt), one must always be aware of possible input errors or exceptions. Humans can, and often will, type in erroneous data, values that are out of range, unexpected words or phrases, or even nothing at all. Users are unpredictable, so building in safeguards to catch bad input values is always a good idea.
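
One way to keep such safeguards manageable is to move the checking into small helper functions that take the raw text and never raise an exception themselves. The two helpers below are hypothetical and are only one possible arrangement; the text passed to them would come from raw_input().

```python
def parse_count(text):
    """ Return text as an int, or None if it is not a valid integer. """
    try:
        return int(text.strip())
    except ValueError:
        return None

def is_yes(text):
    """ True if the reply starts with y or Y; an empty reply counts as no. """
    # Slicing with [:1] returns "" for an empty string instead of
    # raising IndexError the way text[0] would
    return text.strip()[:1] in ("y", "Y")

good = parse_count(" 12 ")    # 12
bad = parse_count("3.5")      # None: floats are rejected
empty = parse_count("")       # None: so is pressing Enter alone
```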

Command-line parameters

Program parameters entered at the command line are captured by the operating system and passed to the program via the Python interpreter as a list. The first item in the list (at index 0) is always the name of the program itself. Python’s included sys module makes this list available as sys.argv.

This simple program (argshow.py) will print out all the items from the command-line parameter list:

import sys

print "%d items in argument list\n" % len(sys.argv)

i = 1
for arg in sys.argv:
    print "%d: %s" % (i, arg)
    i += 1

And here is what happens when we run it:

C:\samples> python argshow.py 1 2 3 4 -h -v
7 items in argument list

1: argshow.py
2: 1
3: 2
4: 3
5: 4
6: -h
7: -v

Python also provides tools for detecting specific arguments and extracting values from command-line parameters, which we won’t cover at this point. We will see them in action in later chapters.

Files

Python has a basic built-in object type for dealing with files that provides methods to read and write data from and to a disk file, among other actions. We’ve already seen a little bit of it with the pgmrand.py script we looked at earlier.

The built-in open() function is used to create an instance of a file object:

>>> fname = "test1.txt"
>>> fmode = "w"
>>> f = open(fname, fmode)

Of course, you could also write:

f = open("test1.txt", "w")

and get the same result.

Once we have a file object, we can write something to it using its write() method:

>>> f.write("Test line 1\n")
>>> f.write("Test line 2\n")
>>> f.close()

The resulting file should now contain two lines of text:

Test line 1
Test line 2

Notice that the strings to be written to the file end with \n (the code for a newline character). The file write() method does not append a newline to the end of a string like print does, so it must be explicitly included in the string.

Table 3-20 lists the most commonly encountered file modes.

Table 3-20. File I/O modes

| Mode | Meaning |
|------|---------|
| r | Read |
| rb | Read binary |
| w | Write |
| wb | Write binary |
| a | Append |
| ab | Append binary |
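
The practical difference between the write and append modes is easiest to see by using both on the same file. This is a small sketch, assuming a scratch file in the system temporary directory; the filename modes_demo.txt is arbitrary.

```python
import os
import tempfile

# A scratch file in the system's temporary directory (name is arbitrary).
fname = os.path.join(tempfile.gettempdir(), "modes_demo.txt")

f = open(fname, "w")      # "w" creates the file, or empties an existing one
f.write("first\n")
f.close()

f = open(fname, "a")      # "a" keeps what is there and adds to the end
f.write("second\n")
f.close()

f = open(fname, "r")      # "r" opens the file for reading only
contents = f.read()
f.close()

os.remove(fname)          # clean up the scratch file
```

Had the second open() used "w" instead of "a", the file would have held only the second line.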

Table 3-21 lists some commonly used file object methods. For a description of the other file object methods that are available, refer to the Python documentation.

Table 3-21. File methods

| Method | Description |
|--------|-------------|
| close() | Close a file. |
| flush() | Flush the internal buffer. |
| read([size]) | Read at most size bytes from the file. |
| readline([size]) | Read an entire line from the file. |
| write(str) | Write a string to the file. |
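
As a rough illustration of how the methods in Table 3-21 fit together, the following sketch writes two lines to a scratch file and reads them back in pieces. The filename is arbitrary, and the variable names are chosen only for this example.

```python
import os
import tempfile

fname = os.path.join(tempfile.gettempdir(), "methods_demo.txt")

f = open(fname, "w")
f.write("Test line 1\n")
f.write("Test line 2\n")
f.flush()                  # push buffered data out to the operating system
f.close()

f = open(fname, "r")
first_line = f.readline()  # one whole line, including its trailing newline
next_four = f.read(4)      # at most 4 characters of what follows
the_rest = f.read()        # with no size given, everything that is left
f.close()

os.remove(fname)           # clean up the scratch file
```

Each read picks up where the previous one stopped, so the three results together account for the whole file.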

Console output using print

We’ve already seen Python’s print statement in action. Its primary purpose is to send output to whatever is currently defined as stdout (standard output). The print statement transparently converts numeric values to strings for console output, and the string formatting discussed earlier works with it to create nicely formatted output.
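
The phrase “whatever is currently defined as stdout” can be taken literally: print writes to the object that sys.stdout refers to, and any object with a write() method will do. The sketch below is one way to demonstrate this, assuming a hypothetical LineCollector class that stores what it is given. The parenthesized print call with a single string works under both Python 2 and Python 3.

```python
import sys

class LineCollector(object):
    """Hypothetical stand-in for a file: it only provides write()."""
    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)

collector = LineCollector()

saved_stdout = sys.stdout
sys.stdout = collector          # print now writes to the collector
try:
    print("This is a test.")
finally:
    sys.stdout = saved_stdout   # always restore the real stdout

# print sends the text and the trailing newline as separate writes,
# so the pieces are joined back together here.
captured = "".join(collector.chunks)
```

Nothing appears on the console while the collector is in place; the text ends up in captured instead.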

Redirecting print

By default, the output of print is sent to whatever is currently defined as stdout. By using the “chevron” (>>) operator this behavior can be modified, and print can send output to any object that provides a write() method. Typically this would be a file object, as shown here:

>>> datastr = "This is a test."
>>> f = open("testfile.txt", "w")
>>> print >> f,datastr
>>> f.close()

Hints and Tips

Here is a semirandom collection of observations that may prove useful to you.

Module global variables

It is usually a good idea to initialize module global variables at the start of a module file. Attempting to check a global variable that does not yet exist will result in an exception, and taking care of this beforehand can save some aggravation later.
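
A short sketch of the difference follows. The names port_name, sample_count, and last_reading are hypothetical and stand for any module-level state.

```python
# Module globals, initialized once at the top of the file.
port_name = None
sample_count = 0

def port_is_open():
    # Safe: port_name always exists, even before a port has been opened.
    return port_name is not None

def read_unset_global():
    # last_reading is never assigned at module level, so looking it up
    # raises NameError at the moment this line runs.
    try:
        return last_reading
    except NameError:
        return "NameError"
```

Checking port_name simply yields False, while checking last_reading raises the exception described above.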

Latent defects

Because Python does not execute the internal statements (i.e., the body) of functions or methods when a module is imported, only the def statement, it is always possible for bugs to be lurking there that will not become apparent until the code is invoked. In such situations the try statement is a powerful ally, but it is not a cure-all. Good unit testing is key to detecting and removing such defects before they can cause problems.
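
Here is a contrived sketch of such a latent defect. The function scale_reading and the undefined name zero_offset are made up for the example; the point is only that the def statement is accepted and the error stays hidden until the body runs.

```python
def scale_reading(raw, gain):
    # Bug: "zero_offset" is never defined anywhere. The def statement
    # above is still accepted without complaint when the module loads.
    return raw * gain + zero_offset

def test_scale_reading():
    # A unit test exercises the body, which is what exposes the defect.
    try:
        scale_reading(2.0, 10.0)
    except NameError:
        return "defect found"
    return "no defect"

result = test_scale_reading()
```

Importing a module containing scale_reading would succeed; only the call inside the test brings the NameError to light.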

Deferred imports

Sometimes you may encounter code where the original author worked around a cyclic import by placing the import statement for the problematic module inside a function or method, instead of at the top of the module file. While this is syntactically allowed in Python, it is considered to be bad form, and it’s a sure sign that someone didn’t think the design through before sitting down at the keyboard and hammering away at it. However, when dealing with legacy code (or just poorly written code) it may not be possible to avoid using this trick. Use it sparingly, only when you really have to, and test it thoroughly.
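
For reference, this is roughly what a deferred import looks like. A real cyclic import needs at least two files to reproduce, so in this sketch the standard json module merely stands in for the problematic module, and format_report is a hypothetical function name.

```python
def format_report(data):
    # Deferred import: the module is loaded the first time this function
    # is called, not when the enclosing module is imported. The standard
    # json module stands in for the module that would cause the cycle.
    import json
    return json.dumps(data, sort_keys=True)
```

Note that a missing or broken module would go unnoticed here until format_report() is first called, which is one reason thorough testing matters with this pattern.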

Dictionaries as function parameters

Although Python allows any data object to be used as a parameter to a function or method, resist the temptation to use dictionary objects unless you have a compelling reason to do so. If a dictionary object is used as a parameter, document it in detail and try to avoid altering its structure dynamically as it gets passed from function to function. Code that dynamically alters the structure of a shared dictionary object can be very difficult to understand and a nightmare to debug. It could even be considered a form of obfuscation, albeit (hopefully) unintentional. The same common-sense rationale applies to lists.
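
When a dictionary parameter is justified, one defensive approach is to document the expected keys in the docstring and work on a copy rather than on the caller’s object. The sketch below assumes a hypothetical settings dictionary; the key names and default values are invented for illustration.

```python
def apply_defaults(settings):
    """Return a copy of settings with any missing defaults filled in.

    settings is a dictionary with these (hypothetical) keys:
        "port"     -- string, name of the serial port
        "baud"     -- integer, defaults to 9600 if absent
        "timeout"  -- float, seconds, defaults to 1.0 if absent

    The caller's dictionary is never modified.
    """
    result = dict(settings)            # shallow copy; leave the original alone
    result.setdefault("baud", 9600)
    result.setdefault("timeout", 1.0)
    return result

original = {"port": "COM1"}
complete = apply_defaults(original)
```

After the call, original still holds only the "port" key, so other functions that share it see the structure they expect.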

Function return values

Tuples are a handy way to return more than one value from a function. For example, a function could return a 2-tuple containing a status code and a data value. The caller examines the status code first and uses the data value only if the status is OK.
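
A minimal sketch of this pattern is shown here. The status codes OK and BAD_INPUT and the function parse_reading are hypothetical names chosen for the example.

```python
OK = 0
BAD_INPUT = 1     # hypothetical status codes chosen for this sketch

def parse_reading(text):
    """Return (status, value); value is None unless status is OK."""
    try:
        return (OK, float(text))
    except ValueError:
        return (BAD_INPUT, None)

# Unpack the 2-tuple, then check the status before touching the data.
status, value = parse_reading("3.25")
if status == OK:
    scaled = value * 2
```

Unpacking the tuple directly into two names keeps the calling code short while still forcing the status check to be explicit.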

Think of modules as objects

Of course, in Python a module actually is an object (everything is, as you may recall), but the tendency seems to be to treat a module as something akin to a source code module in C or C++. One can achieve some neat and tidy data encapsulation using just a module with nothing in it but assignment statements to associate names with values. Here is part of a module that contains nothing but event ID values for use with a wxPython GUI, which we will get to in a later chapter:

# ResourceIDs.py

import  wx

# File
idFileSave                          = wx.NewId()
idFileSaveAs                        = wx.NewId()
idFileNew                           = wx.NewId()
idFileOpen                          = wx.NewId()
idFileOpenGroup                     = wx.NewId()
idFileClose                         = wx.NewId()
idFileCloseAll                      = wx.NewId()
idFilePrint                         = wx.NewId()
idFilePrintPreview                  = wx.NewId()
idFilePrintSetup                    = wx.NewId()

The wxPython package includes a function called NewId() that returns a new ID number each time it is called. When ResourceIDs is imported, every statement is evaluated and a value is assigned to each name. To use these one simply imports the module (perhaps using an alias, as shown here):

import ResourceIDs as rID

event_id = rID.idFileSave

This comes in handy in large programs, especially those that employ a GUI with lots of event ID names. A data-only module can also be imported from any other module without worrying about creating a cyclic import, provided that it does not itself import anything else (except perhaps system-level modules). If the attribute names in a data-only module are unique (using, say, a special prefix on each name), it could also be safely imported using the wildcard import style.
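
To experiment with the idea without installing wxPython, something like the following sketch could serve as the body of a data-only module. The helper _next_id() is a hypothetical stand-in for wx.NewId(), built on itertools.count, and the starting value of 100 is arbitrary.

```python
# Sketch of a data-only module that does not depend on wxPython.
# _next_id is a hypothetical stand-in for wx.NewId(): every call
# hands back a fresh integer.
import itertools

_counter = itertools.count(100)   # the starting value is arbitrary

def _next_id():
    return next(_counter)

# Each assignment runs once, when the module is first imported.
idFileSave   = _next_id()
idFileSaveAs = _next_id()
idFileNew    = _next_id()
```

Because the helper names begin with an underscore, a wildcard import of such a module would pick up only the id names and skip the helpers.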

Use docstrings and descriptive comments

Once upon a time, a physics professor told me: “Document everything you do in the lab like you were going to get struck with amnesia tomorrow.” Sage advice, to be sure, but many people are loath to spend the time necessary to include docstrings and descriptive comments in their code. This is silly, because no one can be expected to remember exactly what something does or why it’s even there 12 months or more down the road (some might say that even a couple of months is a stretch). It also says something about how the author of the software feels about those who might come along later and try to fix or maintain the code.
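
As a small illustration of what this can look like in practice, here is a hypothetical conversion function with a docstring and one descriptive comment. The function name, parameters, and default values are placeholders for the sketch, not those of any particular device.

```python
def counts_to_volts(counts, vref=5.0, bits=10):
    """Convert a raw ADC count to a voltage.

    counts -- integer reading from the converter
    vref   -- reference voltage in volts
    bits   -- converter resolution in bits

    Returns the voltage as a float. The default values are
    placeholders for this sketch, not those of any particular device.
    """
    # Full scale is 2**bits - 1 counts, so a 10-bit converter tops out at 1023.
    full_scale = (2 ** bits) - 1
    return (float(counts) / full_scale) * vref
```

The docstring is available at runtime through help(counts_to_volts) or counts_to_volts.__doc__, so the explanation travels with the code.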

Coding style

The document known as “PEP-8” (available from http://www.python.org) contains some suggested coding style guidelines. You may not agree with all of it, but you should at least read it and be familiar with it. There is a lot of good advice there. In any case, you should attempt to arrive at some type of consistent style for your code, if for no other reason than that it improves readability and makes things a whole lot easier when there is a need to revisit old code.

Python Development Tools

A good development environment can make the difference between success and frustration. The development environment must, at a minimum, provide some way to create and edit Python source code as a standard ASCII text file. Additional tools, such as debuggers, automatic documentation generators, and version control are all good, but one could get by without them if absolutely necessary. Fortunately this isn’t necessary, given that there are a lot of excellent FOSS (Free and Open Source Software) tools available, and some very good and inexpensive commercial tools as well.

In this section we’ll take a brief look at what is available, with a primary emphasis on FOSS tools. Which tools you use is largely a matter of preference, and most people have (or develop over time) their own preferences and work habits. The important thing to take away here is that there are many paths available; pick tools that feel right to you and that suit the job at hand.

Editors and IDEs

At the very least, you will need a text editor or integrated development environment (IDE) of some sort for entering and editing Python source code. You may also want to use the editor for your C source code when writing extensions (which we will delve into in Chapter 5), so it may be a good idea to pick something that’s language-neutral, or perhaps language-aware with syntax highlighting.

The primary difference between an editor and an IDE lies in how much one can accomplish from within the tool itself. An editor typically allows you to do just one thing: editing. An IDE, on the other hand, lets you do much more—from editing, to compiling, debugging, and perhaps even metrics and version control. An IDE is intended to be an environment the developer doesn’t need to leave until either it’s time to quit and go home, or the program is complete.

With some editors there is also the capability to launch another program from within the tool and then capture and display program output, but this is usually more of an add-on capability, not something that is inherently part of the editor tool, and some editors support this capability better than others. A full-featured IDE incorporates all of this functionality in some form or another, although some IDEs also require functionality from external tools and applications. In other words, the line between an editor with lots of bells and whistles and an IDE is sometimes blurry.

Using an IDE with Python is probably not necessary (although there are a few available), since it is not a compiled language, and most of what happens with Python is either happening at the command line or within the Python application’s GUI (if it uses one).

Editors

If you think you would prefer to use a standalone editor (which is what I use, by the way), there are several excellent packages to choose from. Table 3-22 lists a few of the more popular ones to consider.

Table 3-22. Short list of text editors

| Name | OS | FOSS? | Pros | Cons |
|------|----|-------|------|------|
| Emacs | Linux, Windows, others | Yes | Supports sophisticated editing functions, scripting, syntax highlighting, and multiwindow displays. | Has a somewhat steep learning curve and uses some nonintuitive multikey commands that must be memorized. |
| vi/vim | Linux, Windows, others | Yes | The basic functions are easy to learn, and vi is very widespread across different Linux- and Unix-like platforms. vim also provides a GUI interface in addition to the conventional command-line operation. | Learning the more complex and sophisticated functionality can be a slog. Nonintuitive key combinations and codes are a holdover from the days of mainframes, minicomputers, and terminals. |
| nano | Linux | Yes | Very simple. Provides some syntax highlighting. | Based on the Pico editor and its Control-key commands. Limited capabilities. |
| Slickedit | Linux, Windows, others | No ($$$) | Lots of features, full GUI interface, programmable macros, and syntax highlighting. Capable of emulating other editors. | Lots of knobs and dials to learn; may be overkill for most development tasks. Rather hefty price tag. |
| UltraEdit | Linux, Windows | No ($) | Very easy to learn with a full GUI interface. Multiple tabbed text windows, programmable macros, and syntax highlighting. | Has lots of features that the average developer will probably never use. Requires some effort to figure out how to adjust the default settings and disable some unnecessary defaults. It costs money (but not a whole lot). |

This is only a partial list, and there are other editors available, including some good FOSS ones. If you don’t already have a favorite editor (or even if you do), it would probably be worthwhile to try to compare what’s available for your development platform. But, a word of caution: some people seem to become rather attached to a particular editor, even to the point of being somewhat fanatical about it. This is particularly apparent in the Emacs versus vi debate that has been going on now for well over 20 years (refer to http://en.wikipedia.org/wiki/Editor_war for details). Just keep an open mind, select the right tool for the job, and see the editor war for what it really is: free entertainment.

IDE tools

An IDE attempts to integrate everything a programmer might need into a single tool. The first popular and low-cost IDE for the PC was Borland’s Turbo Pascal, written by Anders Hejlsberg and released in 1983 by the company Philippe Kahn cofounded. Most modern IDEs provide a text editor for source code, an interface to a compiler or interpreter, tools to automate the build process, perhaps some support for version control, and a debugger of some sort. In other words, it’s a one-stop shopping experience for software development. Not every IDE will provide all the functionality we’ve listed here, but at the very least you should expect a text editor and the ability to run external tools and applications such as a compiler, interpreter, and debugger. In this sense even editors such as UltraEdit and Emacs (listed in Table 3-22) could be used as IDEs (and often are, actually). Table 3-23 lists some readily available IDE tools suitable for use with Python.

Table 3-23. Short list of IDEs

| Name | OS | FOSS? | Pros | Cons |
|------|----|-------|------|------|
| Boa | Any that Python and wxPython support | Yes | Excellent tool for creating and maintaining wxPython GUI components and applications. Includes a decent editor and a basic Python debugger. | Targeted for the wxPython GUI add-on package. It does a lot but isn’t as full-featured as a dedicated editor. |
| Idle | Any that Python supports | Yes | Provided with Python and coded entirely in Python. Provides multiple editing windows, function/method lists, a Python shell window, and a rudimentary debugger. | Idle’s multiple editing windows are free-floating, and it is sometimes annoying trying to track down a particular window. |
| Eclipse (with PyDev) | Linux, Windows, others | Yes | A very flexible multilanguage IDE written in Java. Additional functionality and language support are provided by plug-in modules such as PyDev for Python development. | A rather steep learning curve and a project/package model for capturing project components that may not be suitable for everyone. |
| PythonWin | Windows | Yes | Provided with the ActiveState Python distribution. Includes most of the same capabilities as Idle. | Specifically for the Windows platform. |
| WingIDE | Linux, Windows, others | No ($$) | Lots of functionality specifically geared toward Python development and debugging. | Python-specific, although the editor can, of course, be used with other languages. The interface can be somewhat busy and cluttered, so spending time with the configuration is usually necessary. |

Debuggers

Debuggers allow a software developer to see inside the software, so to speak, while it is running. While one could perhaps argue that a debugger is seldom, if ever, actually necessary, debuggers can save a lot of time and quickly expose serious problems in a program. However, as with any addictive substance, a debugger may be good in moderation, but it can develop into a serious dependency problem if one is not careful.

What, exactly, can one do with a debugger? For starters, a debugger allows the developer to set “breakpoints” in the code by selecting a particular line in the source listing. When the program execution reaches that point, it is halted and the local variables may be examined. A debugger also provides the ability to step through the code, one line at a time. If the debugger supports the concept of a “watch,” specific variables may be selected and their values displayed to the developer at breakpoints or while stepping through the code.

A debugger is, by necessity, language-specific—there is no “one size fits all” debugger currently available, although there are some “shells” that provide a similar interface across several languages.

For Python, the Boa, Idle, Eclipse, and WingIDE tools provide capable debuggers. A standalone Python debugger, Winpdb, is also available, and Python itself ships with an integrated command-line debugger, pdb.
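
As a brief taste of pdb, a call to pdb.set_trace() behaves like a breakpoint placed in the source. The sketch below guards that call with a hypothetical debug parameter so that the function runs normally unless debugging is requested; the function itself is made up for the example.

```python
import pdb

def average(values, debug=False):
    """Return the mean of values; optionally stop in pdb first."""
    total = 0.0
    for v in values:
        total += v
    if debug:
        # Acts as a breakpoint: execution halts here and pdb's prompt
        # appears, where "p total" prints a variable, "n" steps to the
        # next line, and "c" continues running.
        pdb.set_trace()
    return total / len(values)

result = average([1.0, 2.0, 3.0])   # debug is off, so this runs straight through
```

Calling average([1.0, 2.0, 3.0], debug=True) from an interactive session would stop at the pdb prompt just before the return statement. An entire script can also be run under the debugger with python -m pdb followed by the script name.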

Summary

This concludes our brief tour of Python. You should now have a general feel for what Python looks like and what it is capable of. I have intentionally glossed over many aspects of the language, because, after all, this book is not a tutorial on Python. As I stated going in, there are many excellent books available that can provide copious amounts of detail, and the official Python website is the authoritative source of all things Python. As we go along we will encounter other features of the language, and we will examine them when the need arises.

Suggested Reading

If you would like to get deeper into the realm of Python programming, the following books would be good places to start:

Python in a Nutshell, 2nd ed. Alex Martelli, O’Reilly Media, 2006.

A compact reference that’s very handy to have on the desk when you’re working with Python. Well organized and easy to use, this is an essential reference work when you need to look up something in a hurry and want more than a pocket reference, but less than a massive tome.

Programming Python, 3rd ed. Mark Lutz, O’Reilly Media, 2006.

A comprehensive introduction to Python and a massive reference, this 1,600-page book covers everything from string methods to GUI programming. The one book anyone working with Python should have.

In addition to the URL references already provided in this chapter, there are numerous other online resources available for Python, including the following:

http://diveintopython3.org

This site hosts the complete text of Mark Pilgrim’s book Dive Into Python, also available as a PDF download. The book takes a learn-by-doing approach and uses numerous examples to illustrate key concepts and techniques.

http://effbot.org

Fredrik Lundh’s blog site. Here you can find hundreds of articles on Python, downloadable and viewable books, and some software to examine and try. The articles are well written and interesting to browse, and they are useful for the insights they provide into the language and its uses.
