I just want to go on the record as being completely opposed to computer languages. Let them have their own language and soon they’ll be off in the corner plotting with each other!
A key requirement for automated instrumentation is the ability to describe what needs to be done in terms that a computer, or some other type of automated control system, can execute. While the term “programming” might immediately come to mind for some readers, there are actually many ways to do this, some of which don’t even involve a programming language (at least, not in the conventional sense). However, in this book we will be using Python, along with a smattering of C, to create software for automated instrumentation.
This chapter is intended to give you a basic introduction to Python. In the next chapter I’ll introduce the C programming language, which we’ll use to create extensions for Python that will allow you to interface with a vendor’s driver, or create modules for handling computation-intensive chores. This chapter is not intended as an in-depth tutorial or reference for Python; there are many other excellent books available that can fill those roles (refer to the references at the end of this chapter for suggested reading). There is also an extensive collection of documents available at the official Python website, ranging from beginner’s tutorials to advanced topics.
Python was chosen as the primary programming language for this book for several reasons: it’s relatively easy to learn; it doesn’t require a separate compilation step, so one can execute programs simply by loading them (or just typing them in, if you’re brave enough); and it is powerful and full-featured. Python is also unusual in that it supports three different programming models (procedural, object-oriented, and functional) simultaneously. To begin, we will generally be using the procedural paradigm. Later, when we start working with graphical user interface (GUI) designs and extensions written in C, we will encounter situations where it will be necessary to put aside the purely procedural approach and more fully embrace objects by creating our own.
However, as we will see shortly, Python is inherently object-oriented. Even variables are actually objects, so even though Python doesn’t really force the OO paradigm on the programmer, you will still be working with objects. If you’re not clear on what “procedural” and “object-oriented” mean, please see the sidebar below.
The first step is to install Python. In this book we will be using version 2.6 (not 3.x). For the Windows environment, either the freely available ActiveState distribution, which can be found at http://www.activestate.com/activepython/, or the distribution from python.org is fine. Both include a nice help and reference tool tailored to Windows. If you are running Linux, you should try to use your package manager (synaptic, apt-get, rpm, or whatever) to install version 2.6.
If you need to build and install Python from the source code, see this page for more information:
http://docs.python.org/using/unix.html#getting-and-installing-the-latest-version-of-python
Now that you have (hopefully) at least installed Python, we can take a quick tour through some of the main features of the language.
Python is an interpreted language. More accurately, it is a bytecode compiled interpreted language. What this means is that Python performs a single-pass conversion of program text into a compact binary pseudolanguage referred to as bytecode. This is what is actually executed by the interpreter, which is itself a form of virtual computer that uses the bytecode as its instruction set. This approach is common with modern interpreted languages, and if the virtual machine and its instruction set are well designed and optimized, program execution can approach some respectable speeds. Python is highly optimized internally and demonstrates good execution speeds. It will never be as fast as a compiled language that is converted into the raw binary machine language used by the underlying physical processor itself, but for most applications the speed difference is of little concern. This is particularly true when one considers that nowadays the typical processor (the CPU, or central processing unit) in an average PC is running at between 1 and 3 gigahertz (GHz). Way back in time when a CPU running at a speed of 30 megahertz (MHz) or so was considered fast, code efficiency and program execution speed were much bigger concerns.
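One way to see the bytecode for yourself is the standard library's dis module. The following is a minimal sketch rather than anything used later in the book: the function name add_offset is made up for illustration, and the instructions printed will differ from one Python version to the next.

```python
import dis


def add_offset(reading, offset):
    """Hypothetical helper: apply a calibration offset to a reading."""
    return reading + offset


# Disassemble the function to list the bytecode instructions that the
# interpreter's virtual machine executes when the function is called.
dis.dis(add_offset)

# The compiled form is stored on the function's code object as raw bytes.
raw_bytecode = add_offset.__code__.co_code
print(len(raw_bytecode) > 0)
```

The listing printed by dis.dis() is for human eyes only; the interpreter works directly from the bytes held in co_code.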
If you are new to Python, or even if you aren’t, the book Python Pocket Reference by Mark Lutz (O’Reilly) is highly recommended. It provides a terse, cut-to-the-chase description of the primary features and capabilities of Python, and it is well organized and actually very readable. It is also small, so you can literally put it into a pocket and have it at hand when needed. Several other excellent books on Python are listed in the suggested reading list at the end of this chapter.
How you will start the Python interpreter in interactive mode depends on which operating system you are using. For Windows, the usual method is to first open a command prompt window (this is sometimes erroneously called a “DOS box,” but Windows hasn’t had a real DOS box for a long time). At the prompt (which may look different than what is shown here), type in the following command:
C:\> python
You should see something like this (assuming you’ve installed the ActiveState distribution, but the standard Python distribution is almost identical):
ActivePython 2.6.4.8 (ActiveState Software Inc.) based on
Python 2.6.4 (r264:75706, Nov 3 2009, 13:23:17) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
The procedure is similar for a Linux (or BSD, or Solaris) system. Open a shell window (it shouldn’t matter if the shell is csh, ksh, bash, or whatever) and enter python at the prompt. Assuming that Python has been installed correctly, you will see the startup message.
The >>> is Python’s command prompt, waiting for you to give it something to do. To exit from the Python command line on a Windows machine, use Ctrl-Z followed by Enter, and on a Linux system use Ctrl-D. Typing “quit” by itself will not work; it only prints a reminder of how to exit, although calling quit() with the parentheses will end the session.
The Python command line is a great way to explore and experiment. You can get help for just about everything by using the built-in help facility. Just typing help(), with no arguments, results in the following display:
>>> help()
Welcome to Python 2.6! This is the online help utility.
If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://docs.python.org/tutorial/.
Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules. To quit this help utility and
return to the interpreter, just type "quit".
To get a list of available modules, keywords, or topics, type "modules",
"keywords", or "topics". Each module also comes with a one-line summary
of what it does; to list the modules whose summaries contain a given word
such as "spam", type "modules spam".
help>
As the help display states, the tutorial material found on the official website is indeed a good place to get a feel for what Python looks like and how to use it. This chapter takes a somewhat different approach to the language, however, by introducing the reader to the concept of data objects first, and reserving things like operators and statements until a little later. I feel that the underlying object-oriented nature of the language is important enough to be dealt with first, because when creating even trivial programs in Python one will quickly encounter situations that will require the use of some of the capabilities embedded in each type of data object.
Over the years I have observed that when tutorial material on Python attempts to ignore or downplay the fundamental OO nature of the language, the result is often full of statements like “Oh, and by the way...” and “It is also like this, but we won’t worry about that here...” Rather than trying to avoid the topic, we will just deal with it head-on. Having a good understanding of what is going on under the hood helps make it a lot easier to comprehend what is happening when things work correctly, and a whole lot easier to have some idea of what to look for when they don’t. If you’re new to Python, it would probably be a good idea to read through both this section and Python’s online tutorial.
The manpage (manual page) for Python is very informative, but unfortunately it is hard to get at if you only have a Windows machine. On a Linux system, simply type man python at a shell prompt (actually, if Python was installed correctly, this should work on any Unix-ish-type system).
On Windows, you can ask Python for some abbreviated help at the command line by typing:
C:\> python -h
What you get back should look something like this:
usage: python [option] ... [-c cmd | -m mod | file | -] [arg] ...
Options and arguments (and corresponding environment variables):
-B     : don't write .py[co] files on import; also PYTHONDONTWRITEBYTECODE=x
-c cmd : program passed in as string (terminates option list)
-d     : debug output from parser; also PYTHONDEBUG=x
-E     : ignore PYTHON* environment variables (such as PYTHONPATH)
-h     : print this help message and exit (also --help)
-i     : inspect interactively after running script; forces a prompt even
         if stdin does not appear to be a terminal; also PYTHONINSPECT=x
-m mod : run library module as a script (terminates option list)
-O     : optimize generated bytecode slightly; also PYTHONOPTIMIZE=x
-OO    : remove doc-strings in addition to the -O optimizations
-Q arg : division options: -Qold (default), -Qwarn, -Qwarnall, -Qnew
-s     : don't add user site directory to sys.path; also PYTHONNOUSERSITE
-S     : don't imply 'import site' on initialization
-t     : issue warnings about inconsistent tab usage (-tt: issue errors)
-u     : unbuffered binary stdout and stderr; also PYTHONUNBUFFERED=x
         see man page for details on internal buffering relating to '-u'
-v     : verbose (trace import statements); also PYTHONVERBOSE=x
         can be supplied multiple times to increase verbosity
-V     : print the Python version number and exit (also --version)
-W arg : warning control; arg is action:message:category:module:lineno
-x     : skip first line of source, allowing use of non-Unix forms of #!cmd
-3     : warn about Python 3.x incompatibilities that 2to3 cannot trivially fix
file   : program read from script file
-      : program read from stdin (default; interactive mode if a tty)
arg ...: arguments passed to program in sys.argv[1:]

Other environment variables:
PYTHONSTARTUP: file executed on interactive startup (no default)
PYTHONPATH   : ';'-separated list of directories prefixed to the
               default module search path. The result is sys.path.
PYTHONHOME   : alternate <prefix> directory (or <prefix>;<exec_prefix>).
               The default module search path uses <prefix>\lib.
PYTHONCASEOK : ignore case in 'import' statements (Windows).
PYTHONIOENCODING: Encoding[:errors] used for stdin/stdout/stderr.
You will probably not have much need for the majority of the option switches, but occasionally they do come in handy (especially the -i, -tt, and -v switches). The environment variables, particularly PYTHONHOME, are important, and should be set initially according to the installation directions supplied with the distribution of Python that you are using.
Generally speaking, everything in Python is an object, including data variables. An assignment is equivalent to creating a new object, and so is a function definition. If you’re not familiar with object-oriented concepts, don’t worry too much about it for now (see the sidebar Procedural and Object-Oriented Programming for a nutshell overview). Hopefully it will become clear as we go along. For now, we just want to show what types of objects one can expect to find in Python; we’ll look at how they are used later.
Table 3-1 lists the various object types most commonly encountered in Python. The type class name is what one would expect to be returned by the built-in type() function, or to appear in an error message when a type mismatch occurs.
Object type | Type class name | Description |
Character | | Single-byte character, used in strings |
Integer | | Signed integer, 32 bits |
Float | | Double-precision (64-bit) number |
Long integer | | Arbitrarily large integer |
Complex | | Contains both the real and imaginary parts |
Character string | | Ordered (array) collection of byte characters |
List | | Ordered collection of objects |
Dictionary | | Collection of mapped key/value pairs |
Tuple | | Similar to a list but immutable |
Function | | A Python function object |
Object instance | | An instance of a particular class |
Object method | | A method of an object |
Class object | | A class definition |
File | | A disk file object |
We will touch on all of these before we’re finished: we’ll start with numeric data and work up to things like lists, tuples, and dictionaries.
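To see type class names for yourself, you can ask type() about a handful of objects. This is only an illustrative sketch: the samples list is made up for the purpose, and the name reported for a built-in function such as len differs from that of a function you define yourself.

```python
# Hypothetical sample objects, one for each of several common built-in types.
samples = [5, 5.5, 3 + 4j, 'text', [1, 2], {'key': 'value'}, (1, 2), len]

for obj in samples:
    # type() returns the object's type; __name__ is its short class name.
    print('%-20r %s' % (obj, type(obj).__name__))
```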
If you’ve done any programming in a language like Pascal or C, you are probably familiar with the notion of a variable. It’s a binary value stored in a particular memory location. Python is different, however, and this is where things start to get interesting. Python provides the usual numeric data types, such as integers, floats, and so on. It also has a complex type, which encapsulates both the real and imaginary parts of a complex number. The key thing is in how Python implements variables.
When a variable is assigned a literal value in Python, what actually happens is that an object is created, the literal value is assigned to it (it becomes an attribute of the object), and then it is “bound” to a name. Objects usually have a special method called a constructor that handles the details of creating (instantiating) a new object in memory and initializing it. Conversely, an object may also have a destructor method to remove it from memory when the program is finished with it. In Python, the removal of an object is usually handled automatically in a process called garbage collection.
Here’s an example of how Python creates a new data object:
>>> some_var = 5
This statement instantiates a new object of type int with a value attribute of 5, and then binds the name some_var to it (we’ll see how name binding works shortly). One could also type the following and get the same result:
>>> some_var = int(5)
In this case, we are explicitly telling Python the object type we want (an integer) by calling the int class constructor and passing it the literal value to be assigned when the new object is instantiated. It is important to note that this is not a “cast” in a C or C++ sense; it is an instantiation of an int object that encapsulates the integer value 5.
This way of doing things may seem a bit odd at first, but one gets used to it fairly quickly. Also, most of the time you can safely ignore the fact that variables are actually objects, and just treat them as you might treat a variable in C or C++:
>>> var_one = 5
>>> var_two = 10
>>> var_one + var_two
15
You can also query an object to see what type it is:
>>> type(some_var)
<type 'int'>
Although I just stated that int() is not a cast, it can be used as something akin to that by letting the data objects do the type conversion themselves when a new object is created:
>>> float_var = 5.5
>>> int_var = int(float_var)
>>> print int_var
5
Notice that the fractional part of float_var vanished as a result of the conversion.
Octal and hexadecimal integer notations are also supported, and work as in C:
Octal: use a leading 0, as in 0157.
Hexadecimal: use a leading 0x, as in 0x3FE.
Octal and hexadecimal values don’t have their own type classes. This is because when a value written in either format is assigned to a Python variable, it is converted to its integer equivalent:
>>> foo_hex = 0x2A7
>>> print foo_hex
679
This is equivalent to writing:
>>> foo_hex = int("2A7",16)
>>> print foo_hex
679
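The conversions described above can be collected into one short script. This is a sketch under a few assumptions: the variable names are illustrative, and the octal value is built with int() and an explicit base because the leading-zero literal form shown earlier is specific to Python 2.

```python
# int() truncates toward zero; it does not round.
print(int(5.5))      # 5
print(int(-5.5))     # -5

# Strings are converted according to the base given as the second argument.
hex_value = int("2A7", 16)
octal_value = int("157", 8)
print(hex_value)     # 679
print(octal_value)   # 111

# hex() goes the other way and returns a string.
print(hex(hex_value))
```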
So exactly what is a “data object”? In Python, things like variable names reside in what is called a namespace. There are various levels of namespaces, from the local namespace of a function or method to the global namespace of the Python interpreter’s execution environment. For now, we won’t worry too much about them; we’ll just work with the concept of a local namespace.
Variable names do not have any value, other than the string that makes up the name. They are more like handles or labels that we can attach to things that do have values—namely, objects. Figure 3-1 shows how this works.
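Name binding can be checked directly with the is operator, which tests whether two names refer to the same object. The following is a small sketch with made-up variable names; a float is used because very small integers may be shared internally by the interpreter, which would muddy the demonstration.

```python
some_var = 5.5          # create a float object and bind the name some_var to it
other_var = some_var    # bind a second name to the same object; nothing is copied

same_object = other_var is some_var
print(same_object)      # True: two names, one object

some_var = 6.5          # rebind some_var to a different object
print(other_var)        # 5.5: the original object is still bound to other_var
print(other_var is some_var)   # False: the names now refer to different objects
```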
Typically objects have methods, or internal functions, that operate on the data encapsulated within them. Python’s data objects are no exception. If we create an integer data object, we can ask Python to describe the object to us using the help() function, like this:
>>> int_var = 5
>>> help(int_var)
Help on int object:

class int(object)
 |  int(x[, base]) -> integer
 |
 |  Convert a string or number to an integer, if possible. A floating point
 |  argument will be truncated towards zero (this does not include a string
 |  representation of a floating point number!) When converting a string, use
 |  the optional base. It is an error to supply a base when converting a
 |  non-string. If base is zero, the proper base is guessed based on the
 |  string content. If the argument is outside the integer range a
 |  long object will be returned instead.
 |
 |  Methods defined here:
 |
 |  __abs__(...)
 |      x.__abs__() <==> abs(x)
 |
 |  __add__(...)
 |      x.__add__(y) <==> x+y
 |
 |  __and__(...)
 |      x.__and__(y) <==> x&y
 |
 |  __cmp__(...)
 |      x.__cmp__(y) <==> cmp(x,y)
 |
 |  __coerce__(...)
 |      x.__coerce__(y) <==> coerce(x, y)
 |
 |  __div__(...)
 |      x.__div__(y) <==> x/y
 |
 |  __divmod__(...)
 |      x.__divmod__(y) <==> divmod(x, y)
 |
 |  __float__(...)
 |      x.__float__() <==> float(x)
 |
 |  __floordiv__(...)
 |      x.__floordiv__(y) <==> x//y
 |
 |  __format__(...)
 |
 |  __getattribute__(...)
 |      x.__getattribute__('name') <==> x.name
 |
 |  __getnewargs__(...)
 |
-- More --
There are more internal methods, and you can peruse them if you are so inclined (just press the space bar for another screenful, Return for another line, or q to return to the prompt), but the main point here is that in Python, data objects “know” how to manipulate their internal data using the built-in methods for a particular class. In other words, the Python interpreter handles the details of converting a statement like this:
5 + 5
into the bytecode equivalent of this, internally:
int(5).__add__(int(5))
and then executing it.
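You can confirm the equivalence by calling the method yourself. This is only a sketch of the idea (the names a and b are arbitrary, and the interpreter's real dispatch involves more steps than a single method call), but each line below produces the same sum.

```python
import operator

a = 5
b = 5

print(a + b)               # the usual operator form
print(a.__add__(b))        # calling the int object's method directly
print(int.__add__(a, b))   # calling the method through the class
print(operator.add(a, b))  # the standard library's function form of "+"
```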
The fact that variables in Python really are objects does take a little getting used to. But it is a powerful feature of the language, and because you can selectively ignore this feature it is possible to create what look like procedural programs, when in reality Python is all about objects.
Python provides three data types for ordered collections of data objects: lists (arrays), strings, and tuples (list-like objects). These are also known as sequence objects. The “sequence” part refers to the fact that each of these data objects may contain zero or more references to other data objects in an ordered sequence. All except for the string type allow their member elements to be any valid Python object. All have methods for manipulating their data; some methods are common to all sequence objects, and some are unique to a particular type. Table 3-2 lists the three sequence types and some of their properties.
Type | Mutable? | Delimiters |
List | Yes | [ ] |
String | No, immutable | ' ' or " " |
Tuple | No, immutable | ( ) |
Python sequence objects are either mutable (changeable) or immutable (unchangeable). A list object, for example, is mutable in that its data can be modified. A string, on the other hand, is not mutable. One cannot replace, remove, or insert characters into a string directly. A string object is an immutable collection of character values that is treated as a read-only array of byte-sized data objects.
Actually, this applies only to single-byte characters, such as the ASCII characters that make up the one-byte subset of UTF-8; other Unicode characters require more than a single byte each. In this book we’ll only be working with single-byte characters in the UTF-8 encoding (see Chapter 12 for more on ASCII and the UTF-8 character encoding standard).
In order to make a change to a string, one must create a new string that incorporates the changes. The original string object remains untouched, even if the same variable name is reused for the new string object (which “unbinds” the original string object; unbound objects tend to evaporate through the process of garbage collection, but that’s a low-level detail we don’t really need to worry about).
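A short experiment makes the point. In this sketch the variable names label and original are made up; the attempt to assign to an index raises a TypeError, while building a new string works and leaves the first object alone.

```python
label = "volts"

# Strings do not support item assignment, so this raises TypeError.
try:
    label[0] = "V"
except TypeError as err:
    print("TypeError: %s" % err)

# Keep a second name bound to the original object, then build a new string.
original = label
label = "V" + label[1:]

print(label)      # Volts
print(original)   # volts: the first string object was never modified
```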
A list is Python’s closest equivalent to an array, but it has a few tricks that the arrays in C and Pascal never learned how to do. A list is an ordered sequence, and any element in the list may be replaced with something different. New elements are appended to a list using its append method (there is also a pop method, which means a list can serve as a stack or a queue as well), and the contents of a list can be sorted in place. Each element in a list is actually a reference to an object, just as a numeric data variable name is a reference to a numeric data object. In fact, a list can contain references to any valid Python object. Consider the following:
>>> import random
>>> alist = []
>>> alist.append(4)
>>> alist.append(55.89)
>>> alist.append('a short string')
>>> alist.append(random.random)
alist now contains four elements, which are composed of an integer, a floating-point value, a string, and a reference to a method from Python’s random module called, appropriately enough, random (we’ll discuss the import statement in more detail later). We can examine each member element of alist to verify this:
>>> alist[0]
4
>>> alist[1]
55.890000000000001
>>> alist[2]
'a short string'
>>> alist[3]
<built-in method random of Random object at 0x00A29D28>
If we want a random number, all we have to do in order to invoke random() is treat alist[3] as if it were a function by appending the expected parentheses:
>>> alist[3]()
0.87358651337544713
We can change a particular element in alist simply by assigning it a new value:
>>> alist[2]
'a short string'
>>> alist[2] = 'a better string'
>>> alist[2]
'a better string'
Figure 3-2 shows what is going on inside the alist object.
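The pop method mentioned earlier is what lets a list stand in for a stack or a simple queue. The following is a minimal sketch with a made-up list of work items; it relies only on the standard behavior of pop() with and without an index.

```python
pending = ['first', 'second', 'third']   # hypothetical work items

# pop() with no argument removes and returns the last element (stack behavior).
last_in = pending.pop()

# pop(0) removes and returns the first element (queue behavior).
first_in = pending.pop(0)

print(last_in)    # third
print(first_in)   # first
print(pending)    # ['second']
```

Removing from the front of a long list is comparatively slow because the remaining elements have to shift down; for heavy queue use, the deque type in the standard collections module is worth a look.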
We can use a list object to demonstrate Python’s underlying OO nature by entering the following at the Python prompt and observing the results:
>>> list_name = []
>>> list_name.append(0)
>>> list_name.append(1)
>>> list_name
[0, 1]
>>> var_one = list_name
>>> var_two = list_name
>>> var_one
[0, 1]
>>> var_two
[0, 1]
>>> list_name[0] = 9
>>> var_one
[9, 1]
>>> var_two
[9, 1]
Because the names var_one and var_two both refer to the list object initially bound to the name list_name, when list_name is altered the change in the list object is “seen” by both of the other variable names.
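When an independent copy is wanted rather than a second name for the same object, one common approach is to pass the list to the list() constructor. This is a sketch with illustrative names; note that the copy is shallow, so the elements themselves are still shared references.

```python
original = [0, 1]
alias = original            # another name for the same list object
copy_of = list(original)    # a new list object holding the same elements

original[0] = 9

print(alias)     # [9, 1]: the alias sees the change
print(copy_of)   # [0, 1]: the copy does not
print(alias is original, copy_of is original)
```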
Like most every other object in Python, a list has a collection of methods. These include the indexing methods we’ve already seen, but there are more. Lists can be concatenated and are appended end-to-end in the order specified, like so:
>>> alist1 = [1,2,3,4,5]
>>> alist2 = [6,7,8,9,10]
>>> alist1 + alist2
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
To find the index offset of a particular item in a list, we can use the index() method:
>>> alist2.index(8)
2
We can also reverse the order of a list:
>>> alist1.reverse()
>>> alist1
[5, 4, 3, 2, 1]
And we can sort a list:
>>> slist = [8,22,0,5,16,99,14,-6,42,66]
>>> slist.sort()
>>> slist
[-6, 0, 5, 8, 14, 16, 22, 42, 66, 99]
Notice in the last two examples that the list itself is modified “in place.” That is, a new object is not created as a result of reversing or sorting a list. Lists are mutable.
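If the original order needs to be preserved, the built-in sorted() function offers an alternative that returns a new list instead. The sketch below uses a made-up list named readings to contrast the two approaches.

```python
readings = [8, 22, 0, 5, 16]

# sorted() builds and returns a new list; the original is left alone.
ordered = sorted(readings)
print(ordered)     # [0, 5, 8, 16, 22]
print(readings)    # [8, 22, 0, 5, 16]

# sort() rearranges the list in place and returns None.
result = readings.sort()
print(result)      # None
print(readings)    # [0, 5, 8, 16, 22]
```

Because sort() returns None, writing something like readings = readings.sort() is a common slip that throws the list away.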
Strings are ordered sequences of byte-value characters. Strings are immutable, meaning that (unlike in C or C++) they cannot be altered in place by using an index and treating them like arrays. In order to modify a string, one must create a new string object. The contents of a string can, however, be referenced using an index into the string.
Here are some string examples:
>>> astr1 = 'This is a short string.'
>>> astr2 = "This is another short string."
>>> astr3 = "This string has 'embedded' single-quote characters."
>>> astr4 = """This is an example
... of a multi-line
... string.
... """
>>>
Although one cannot change the contents of a string using an index value, the data can be read using an index, and Python provides the ability to extract specific parts of a string (or “slices,” as they are called). The result is a new string object. The following line will read the first four characters of the string variable astr1, starting at position zero and stopping just before position four (the end index itself is not included):
>>> print astr1[0:4]
This
We could also eliminate the 0 in the index range and just let it be assumed:
>>> print astr1[:4]
This
This form tells Python to extract everything from the start of the string up to the fourth position. We can also extract everything from the fourth position to the end of the line:
>>> print astr1[4:]
is a short string.
Or we can get something from the middle of the string:
>>> print astr1[10:15]
short
Figure 3-3 shows how indexing works in Python.
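Slices also accept negative indexes, which count back from the end of the string, and an optional third value that sets the step. The following sketch reuses the text of astr1 from the examples above; nothing here goes beyond standard slice syntax.

```python
astr1 = 'This is a short string.'

print(len(astr1))     # 23 characters in all
print(astr1[-1])      # the last character: .
print(astr1[-7:])     # the last seven characters: string.
print(astr1[5:7])     # characters at positions 5 and 6: is
print(astr1[::-1])    # a step of -1 walks the string backward

# A string split at any position can be rejoined from its two slices.
print(astr1[:4] + astr1[4:] == astr1)
```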
String objects also incorporate a set of methods that perform operations such as capitalization, centering, and counting the occurrences of particular characters, among other things. Methods that transform the text return a new string object.
As with lists, concatenation uses the + operator:
>>> str_cat = astr1 + " " + astr2
>>> print str_cat
This is a short string. This is another short string.
The result is, as you might expect by now, a new string object. Fortunately, Python incorporates garbage collection, and objects that are no longer bound to a name, as in the following situation, are quietly whisked away; their memory is returned to a shared pool for reuse. This is a good thing, as otherwise memory could quickly fill up with abandoned data objects:
>>> the_string = "This is the string."
>>> the_string = the_string[0:4]
>>> the_string
'This'
In this case, the name the_string is initially bound to the string object containing "This is the string." When a section of the initial string object is pulled out, a new object is created and the name is reassigned to it. The original object, no longer bound, disappears. However, if an object is shared between two or more names, it will persist so long as one name is bound to it. This can come in handy when creating objects that need to hang around for the life of a program.
Other string methods allow you to left- or right-align a string, replace a word in a string, or convert the case of the characters in a string. Here are some examples.
The upper() method converts all alphabetic characters in a string to uppercase:
>>> print astr1.upper()
THIS IS A SHORT STRING.
find() returns the index of the first character in the search pattern string:
>>> print astr1.find('string')
16
The replace() method substitutes the new string for the search pattern:
>>> print astr1.replace('string', 'line')
This is a short line.
The rjust() method (and its counterpart, ljust()) justifies a string in a field, the width of which is the method’s argument:
>>> print astr1.rjust(30)
       This is a short string.
The default fill character is a space, but one can specify an alternative as a second argument:
>>> print astr1.rjust(30,'.')
.......This is a short string.
You can get a listing of the various string methods available by typing help(str) at the Python prompt.
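A few more of the string methods are sketched below, again using the text of astr1. The variable names are illustrative only; the methods themselves (count, center, split, and find) are standard parts of the str type.

```python
astr1 = 'This is a short string.'

# count() returns how many times a substring occurs.
s_count = astr1.count('s')
print(s_count)     # 4

# center() pads both sides to the requested width with the fill character.
centered = astr1.center(29, '*')
print(centered)    # ***This is a short string.***

# split() breaks the string on whitespace and returns a list of words.
words = astr1.split()
print(words)

# find() returns -1 when the search pattern is not present.
missing = astr1.find('long')
print(missing)     # -1
```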
The tuple is an interesting data object with many uses. Like a list object, it is an ordered set that may contain zero or more items, but unlike the list, it is immutable. Once created, a tuple cannot be directly modified. Tuples are typically referred to by the number of items they contain. For example, a 2-tuple has, as you might expect, two data objects. A shorthand way of referring to a tuple of any size is to say “n-tuple.” Even a 0-tuple is possible in Python; it isn’t particularly interesting or useful, except perhaps as a placeholder, but Python will let you create one if you really want to.
Whereas lists in Python employ square brackets as delimiters, tuples use parentheses, like this:
>>> tuple2 = (1,2)
>>> tuple2
(1, 2)
The contents of a tuple can be accessed using an index, just as with lists and strings:
>>> tuple4 = (9, 22.5, 0x16, 0)
>>> tuple4
(9, 22.5, 22, 0)
>>> tuple4[2]
22
>>> tuple4[0]
9
Like lists and strings, tuples may be concatenated (with a new tuple as the result):
>>> tuple2
(1, 2)
>>> tuple4
(9, 22.5, 22, 0)
>>> tuple6 = tuple2 + tuple4
>>> tuple6
(1, 2, 9, 22.5, 22, 0)
In this case, we can see that a new tuple object is created.
A tuple cannot be sorted in place, but its contents can be counted. To find out how many times a particular value or object occurs, we can use the count() method:
>>> tpl = (0, 0, 2, 2, 6, 0, 3, 2, 1, 0)
>>> tpl.count(0)
4
>>> tpl.count(2)
3
>>> tpl.count(6)
1
Since the contents of a tuple are actually references to objects, a tuple can contain any mix of valid Python objects, just like a list object.
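The sketch below pulls together a few tuple behaviors: the error raised on an attempted modification, unpacking a tuple into separate names, and obtaining sorted contents by way of a new object. The tuple named sample and the other names are made up for illustration.

```python
sample = (3, 4.5, 'mV')   # hypothetical channel number, value, and units

# Tuples do not support item assignment, so this raises TypeError.
try:
    sample[0] = 10
except TypeError as err:
    print("TypeError: %s" % err)

# Unpacking binds each element of the tuple to its own name.
channel, value, units = sample
print(channel, value, units)

# sorted() accepts a tuple but returns a new list; tuple() converts it back.
ordered = tuple(sorted((8, 22, 0, 5)))
print(ordered)            # (0, 5, 8, 22)
```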
Python’s dictionary is a unique data object. Instead of an ordered set of data elements, a dictionary contains data in the form of a set of unordered key/value pairs. That is, each data element has an associated key that uniquely identifies it. It is Python’s one and only mapped data object.
Like any other Python data object, a dictionary can be passed as an argument to a function or method, and returned as well. It can be a data element in a tuple or list, and its values can be any valid Python object type. Keys are more restricted: a key must be an immutable object, such as an integer, a string, or a tuple whose own contents are immutable.
To create a dictionary object, we can initialize it with a set of keys and associated values:
>>> dobj = {0:"zero", 1:"one", "food":"eat", "spam":42}
>>> dobj
{0: 'zero', 1: 'one', 'food': 'eat', 'spam': 42}
To get the value associated with a particular key, we can use what looks like indexing, but is not:
>>> dobj[0]
'zero'
>>> dobj[1]
'one'
If we try a key that isn’t in the dictionary, Python complains:
>>> dobj[2]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 2
But so long as it’s a valid key, we will get a valid value back:
>>> dobj["spam"]
42
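A program can guard against a missing key either by testing for it with the in operator or by catching the KeyError. The following sketch rebuilds dobj so that it stands alone; the variable name value is an arbitrary choice.

```python
dobj = {0: "zero", 1: "one", "food": "eat", "spam": 42}

# The in operator tests for a key without raising an error.
print("spam" in dobj)    # True
print(2 in dobj)         # False

# Alternatively, attempt the lookup and handle the failure.
try:
    value = dobj[2]
except KeyError as err:
    value = None
    print("no such key: %r" % (err.args[0],))
```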
Dictionaries incorporate a set of powerful methods for manipulating their data. Table 3-3 contains a list of what’s available, and we’ll look at a few in detail.
Method | Description |
clear() | Removes all items from a dictionary. |
copy() | Performs a “shallow” copy of a dictionary. |
get() | Returns the data associated with a key, or a default value if no matching key is found. |
has_key() | Returns True if the dictionary contains the given key, and False otherwise. |
items() | Returns a list of a dictionary’s key/value pairs as 2-tuples. |
iteritems() | Iterates over the key/value pairs in a dictionary. |
iterkeys() | Iterates over the keys in a dictionary. |
itervalues() | Iterates over the values in a dictionary. |
keys() | Returns a list of the keys in a dictionary. |
pop() | Pops off a specific item by key and removes it from the dictionary. |
popitem() | Pops off an arbitrary key/value pair and removes it from the dictionary. |
setdefault() | Returns the value for a key if the key is present; otherwise sets the key to the supplied default value and returns that default. |
update() | Updates the values with the values from another dictionary. Replaces values in matching keys. |
values() | Returns a list of the values in a dictionary. |
Note that there is no append() method like the one available for lists. To add a new item to a dictionary, one simply assigns a value to a new key:
>>> dobj[99] = "agent"
>>> dobj
{0: 'zero', 1: 'one', 99: 'agent', 'food': 'eat', 'spam': 42}
Notice that the new key and its associated data are inserted in the dictionary at an arbitrary location. A dictionary is not a sequence object, and data is accessed using keys, so it really is unimportant where it is actually located amongst the other key/value pairs in the data object.
This technique can also be used to modify an existing key’s value:
>>> dobj[1] = "the big one"
>>> dobj
{0: 'zero', 1: 'the big one', 99: 'agent', 'food': 'eat', 'spam': 42}
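Several of the other dictionary methods change a dictionary's contents as well. The sketch below exercises update(), pop(), setdefault(), copy(), and clear() on a made-up dictionary named settings; the keys and values are purely illustrative.

```python
settings = {"gain": 1, "offset": 0.5}   # hypothetical instrument settings

# update() replaces values for matching keys and adds any new keys.
settings.update({"gain": 10, "samples": 100})

# pop() removes a key and returns the value that was stored under it.
removed = settings.pop("offset")
print(removed)     # 0.5

# setdefault() adds a key only when it is missing, and returns its value.
rate = settings.setdefault("rate", 1000)   # key was missing, so it is added
gain = settings.setdefault("gain", 99)     # key exists, so 10 is kept
print(rate, gain)

# copy() makes a shallow copy that survives clearing the original.
snapshot = settings.copy()
settings.clear()
print(sorted(snapshot.items()))
print(settings)    # {}
```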
A safer way to fetch a value from a dictionary is to use the get() method:
>>> dobj.get(99)
'agent'
If we attempt to get a value for a key that doesn’t exist, get() will by default return the special value of None. At the Python command line, this doesn’t show anything:
>>> dobj.get(256)
We can specify a default return value of our choosing, if we so desire, like this:
>>> dobj.get(256,"Nope")
'Nope'
Dictionaries are useful for keeping global data (such as parameters) in one convenient place, and the ability to return a default value allows a program to use predefined parameter values if no externally supplied values are available.
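As a minimal sketch of that idea, the following fragment keeps a set of predefined parameter values in one dictionary and falls back to them when no externally supplied value exists. The parameter names, the values, and the helper get_param() are all hypothetical and are used only for illustration.

```python
# Hypothetical default settings for an instrument run; the names and
# values are illustrative only.
DEFAULTS = {"sample_rate": 100, "channels": 4, "port": "COM1"}

def get_param(params, name):
    """Return an externally supplied value, or the predefined default."""
    return params.get(name, DEFAULTS.get(name))

# Only one value was supplied externally.
user_params = {"sample_rate": 250}

rate = get_param(user_params, "sample_rate")   # the supplied value is used
chans = get_param(user_params, "channels")     # falls back to the default
missing = get_param(user_params, "gain")       # unknown everywhere, so None
```

Because get() never raises a KeyError, a caller can check for None instead of wrapping every lookup in error handling.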
There may be times when we want to get a list of what’s in a dictionary. The items() method returns all of a dictionary’s key/value pairs as a list object of 2-tuples:
>>> dobj.items()
[(0, 'zero'), (1, 'the big one'), (99, 'agent'), ('food', 'eat'), ('spam', 42)]
If we want a list of the keys, we can get one using the keys() method:
>>> dobj.keys()
[0, 1, 99, 'food', 'spam']
Finally, if we are only interested in the values, the values() method comes in handy:
>>> dobj.values()
['zero', 'the big one', 'agent', 'eat', 42]
That should be enough on dictionaries for now. We will see other interesting ways to use dictionaries and the other Python data types later, but in the meantime, feel free to experiment with the Python command line. Trying out new things is one of the best ways to learn about them.
In this book, we’re going to use a mathematical-type definition of an expression. That is, an expression is a well-formed sequence of variables and mathematical or logical symbols that does not contain an equals (assignment) symbol but will evaluate to a valid logical or numerical value. A statement (which we will look at shortly) does specify an assignment or some other action, and statements may contain expressions.
Expressions make use of various operators, such as addition, subtraction, comparison, and so on. Expressions may be simple, such as:
a + b
or they may be compound expressions, as in:
((a + b) * c) ** z
Parentheses are used to indicate order of evaluation. In the previous example, the multiplication operator (*) has a higher precedence than addition (+), and exponentiation (**) has a higher precedence than multiplication, so without the parentheses the expression would be evaluated like this:
a + b * c**z
which is clearer if we put the implied parentheses back in:
a + (b * (c**z))
This is definitely not what was wanted in the original expression.
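To make the difference concrete, here is a short sketch that evaluates both forms with some arbitrarily chosen values (the numbers are assumptions picked only to keep the arithmetic easy to follow):

```python
a, b, c, z = 1, 2, 3, 2

grouped = ((a + b) * c) ** z     # (3 * 3) ** 2
ungrouped = a + b * c ** z       # 1 + (2 * (3 ** 2))
```

With these values, grouped is 81 and ungrouped is 19, so dropping the parentheses silently changes the result.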
Expressions may contain things other than operators. For example, assume there is a function called epow() that will return the value of e raised to the power of some number or the result of some expression. An expression could contain a call to this function and use it to create a new value:
n + epow(x - (2 * y))
This would be the equivalent of writing n + e^(x − 2y) in standard mathematical notation.
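The text only assumes that epow() exists. One possible way to define it, shown here purely as an illustration, is as a thin wrapper around math.exp() from the standard library; the values assigned to n, x, and y are arbitrary.

```python
import math

def epow(value):
    """Return e raised to the power of value."""
    return math.exp(value)

n, x, y = 1.0, 3.0, 1.0
result = n + epow(x - (2 * y))   # 1.0 + e ** 1.0
```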
Now that we’ve seen the data types Python supports and what an expression is, we can look at the various things one can do with them using operators. Python provides a full set of arithmetic, logical, and comparison operators. It also includes operators for bitwise operations, membership tests, and identity tests, and it provides various augmented assignment operators.
Python provides the usual four basic arithmetic operators: addition, subtraction, multiplication, and division. It also has two operators that are not found in some other languages: exponent and floor division. Table 3-4 lists Python’s arithmetic operators.
Operator | Description |
+ | Addition |
- | Subtraction |
* | Multiplication |
/ | Division |
% | Modulus |
** | Exponent |
// | Floor division |
When dealing with a mix of numeric data types, Python will automatically “promote” all of the operands to the highest-level type, and then perform the indicated operation. The type priorities, from highest to lowest, are:
complex > float > long > int
This means that if an expression contains a floating-point value but no complex values, the result will be a floating-point value. If an expression contains a long and no floating-point or complex values, the result will be a long. If an expression contains a complex value, the result will be complex. So, if one has an expression that looks like this:
5.0 * 5
the result will be a floating-point value:
25.0
As I mentioned, Python also has a division operator called “floor division.” It returns the quotient rounded down to the nearest whole value. If either operand is a float the result is returned as a float; if both operands are integers the result is an integer. In Python 2, the behavior of // compared with / is like this:
>>> 5/2
2
>>> 5//2
2
>>> 5.0/2
2.5
>>> 5.0//2
2.0
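One detail worth checking for yourself is what “rounded down” means for negative values. The following sketch uses arbitrary numbers and shows that // rounds toward negative infinity rather than toward zero. It also uses the built-in divmod() function to get the quotient and remainder in one step.

```python
q_pos = 7.0 // 2      # 3.0
q_neg = -7.0 // 2     # -4.0, not -3.0: the result is rounded down
q_int = 7 // 2        # 3 (integer operands give an integer result)

# divmod() returns the floor-division quotient and the remainder together.
quotient, remainder = divmod(7, 2)
```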
Python’s logical operators, shown in Table 3-5, act on the truth values of any object.
Python provides the keywords True and False for use in logical expressions. Note that any of the following are also considered to be False:
The None object
Zero (any numeric type)
An empty sequence object (list, tuple, or string)
An empty dictionary
All other values are considered to be True. It is also common to find 1 and 0 acting as true and false values.
Comparison operators evaluate two operands and determine the relationship between them in terms of equality, inequality, and magnitude (see Table 3-6).
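Operator | Description |
== | Equal to |
!= | Not equal to |
<> | Same as != |
< | Less than |
> | Greater than |
<= | Less than or equal to |
>= | Greater than or equal to |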
Python expressions that use comparison operators always return a logical true or false.
Python’s bitwise AND, OR, and XOR operators work bit by bit across the operands; they do not perform arithmetic operations. The bitwise operators are listed in Table 3-7.
Operator | Description |
& | Binary AND |
| (vertical bar) | Binary OR |
^ | Binary XOR |
~ | Binary one’s complement |
<< | Binary left shift |
>> | Binary right shift |
The AND operation will return only those bits that are true (1) in both operands, whereas the OR will “merge” the bits of both operands, as shown in Figure 3-4.
The bitwise operators are useful when there is a need to set a particular bit (OR) or test for a bit with a value of 1 (AND). The XOR operator returns the bitwise difference between two operands, as shown in the truth table in Figure 3-5.
The one’s complement operator changes the value of each bit to its inverse. That is, a binary value of 00101100 becomes 11010011.
The binary shift operators work by shifting the contents of a data object left or right by the number of bit positions specified by the righthand operand. The effect is the equivalent of multiplication by 2^n for a left shift or division by 2^n for a right shift (where n is the number of bit positions shifted). For example:
>>> 2 << 1
4
>>> 2 << 2
8
>>> 2 << 3
16
>>> 16 >> 2
4
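The following sketch pulls the bitwise operators together in the kind of bit manipulation that comes up when working with hardware registers. The register layout and the bit names are hypothetical. Note one assumption that matters in practice: Python integers are not fixed-width, so the result of ~ is masked with 0xFF here to keep it within eight bits.

```python
# Hypothetical bit assignments for an 8-bit control register.
ENABLE_BIT = 0x01   # bit 0
READY_BIT = 0x04    # bit 2
FAULT_BIT = 0x80    # bit 7

reg = 0x00
reg = reg | ENABLE_BIT               # set bit 0
reg = reg | FAULT_BIT                # set bit 7, reg is now 0x81

is_ready = (reg & READY_BIT) != 0    # test bit 2 (not set)
is_fault = (reg & FAULT_BIT) != 0    # test bit 7 (set)

reg = reg & (~FAULT_BIT & 0xFF)      # clear bit 7, reg is now 0x01
toggled = reg ^ READY_BIT            # flip bit 2, giving 0x05
inverted = ~0x2C & 0xFF              # 00101100 becomes 11010011 (0xD3)
```

Without the 0xFF mask, ~0x2C evaluates to -45, because Python treats the value as a signed integer of unlimited width.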
As we’ve already seen, assignment in Python involves more than just stuffing some data into a memory location. An assignment is equivalent to instantiating a new data object. Python’s assignment operators are listed in Table 3-8.
Operator | Description |
= | Simple assignment |
+= | Add and assignment (augmented assignment) |
-= | Subtract and assignment (augmented assignment) |
*= | Multiply and assignment (augmented assignment) |
/= | Divide and assignment (augmented assignment) |
%= | Modulus and assignment (augmented assignment) |
**= | Exponent and assignment (augmented assignment) |
//= | Floor division and assignment (augmented assignment) |
In addition to the simple assignment operator, Python provides a set of augmented assignment operators as corollaries to each of the arithmetic operators. An augmented assignment first performs the operation and then assigns the result back to the name on the lefthand side of the operator. For example:
>>> a = 1
>>> a += 1
>>> a
2
The membership operators are used to determine whether a value or object exists (in), or doesn’t (not in), within a sequence or dictionary object (see Table 3-9). Note that when used with a dictionary only the keys are tested, not the values.
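Operator | Description |
in | Result is True if the value is found in the object, otherwise False. |
not in | Result is True if the value is not found in the object, otherwise False. |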
One way to use the in operator would be like this:
if x in some_list:
    DoSomething(x, some_list)
In this case, the function DoSomething() will only be called if x is in some_list. Conversely, one could test to see if something is not in an object:
if x not in some_dict:
    some_dict[x] = new_value
If the key x does not already exist in the dictionary, it will be added along with a value.
Python’s identity operators (shown in Table 3-10) are used to determine if one name refers to the same object as another name (is), or if it does not (is not).
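Operator | Description |
is | Result is True if both names refer to the same object, otherwise False. |
is not | Result is True if the names do not refer to the same object, otherwise False. |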
The identity operators are handy when attempting to determine if an object is available for a particular operation. An is expression will evaluate to True if the variable names on either side of the operator refer to the same object. An is not expression will evaluate to True if the variable names on either side of the operator do not refer to the same object.
Here is a (nonexecutable) example:
def GetFilePath(name):
    global pathParse
    if pathParse is None:
        pathParse = FileUtil.PathParse()
    file_path = pathParse(name)
    if len(file_path) > 1:
        return file_path
    else:
        return None
The global name pathParse would be initialized (at the start of the module) to None, but for this function it should refer to an object of the class PathParse in the FileUtil module. If it does not (i.e., it is None), it is instantiated. If the function attempts to use pathParse with a value of None, it will fail.
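For readers who want something they can actually run, here is a self-contained variation on the same pattern. The PathParser class is a hypothetical stand-in for the FileUtil helper (it simply splits a path on forward slashes), and the names used are not taken from any real module.

```python
class PathParser(object):
    """Hypothetical stand-in for a path-parsing helper class."""
    def __call__(self, name):
        return name.split("/")

path_parser = None   # module-level name, instantiated on first use

def get_file_path(name):
    global path_parser
    if path_parser is None:          # identity test against None
        path_parser = PathParser()
    parts = path_parser(name)
    if len(parts) > 1:
        return parts
    return None

first = get_file_path("data/run1.txt")
parser_after_first = path_parser
second = get_file_path("readme")
```

After the second call, path_parser still refers to the object created by the first call, which is what the is None test is there to guarantee.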
We already saw some of the precedence characteristics of operators in the earlier discussion of expressions, but now let’s take a closer look. Table 3-11 lists Python’s operators in order of precedence, from lowest to highest.
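Precedence | Operator |
Lowest | or |
. | and |
. | not x |
. | in, not in, is, is not, <, <=, >, >=, <>, !=, == |
. | | (bitwise OR) |
. | ^ |
. | & |
. | <<, >> |
. | +, - |
. | *, /, //, % |
. | +x, -x, ~x |
Highest | ** |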
Parentheses are used to force the order of evaluation, as was shown earlier. If you can’t remember how the evaluation order works, or the default order isn’t what you want, use parentheses as necessary to get the desired result. Using parentheses for clarity is never a bad thing.
A typical program is composed of statements, comments, and whitespace (blank lines, spaces, tabs, etc.). Statements are composed of keywords and optional expressions, and specify an action. A statement might be a simple assignment:
>>> some_var = 5
Or it could be a compound set of control statements, such as an if-else construct:
>>> if some_var < 10:
...     print "Yes"
...     print "Indeed"
... else:
...     print "Sorry"
...     print "Nope"
...
Yes
Indeed
Python is also interesting for what it doesn’t have. Those with experience in other languages may notice that there is no “switch” or “case” statement. Python’s if-elif-else construct is usually used for this purpose. There is also nothing that looks like the structure data type in C. Dictionaries and lists can be used to emulate a structure, but it’s often not necessary. Python also does not have a “do”, as in do-until or do-while. It does have a for statement, but it doesn’t work in the way that a C programmer might expect.
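The two substitutions mentioned above can be sketched briefly. The status codes and the field names below are made up for the example: an if-elif-else chain takes the place of a C switch, and a dictionary plays the role of a C struct.

```python
def describe_status(code):
    """Map a hypothetical status code to text, the way a C switch might."""
    if code == 0:
        return "idle"
    elif code == 1:
        return "running"
    elif code == 2:
        return "fault"
    else:
        return "unknown"    # plays the role of the "default" case

# A dictionary standing in for a C struct with three fields.
reading = {"channel": 3, "value": 1.25, "units": "V"}
summary = "%d: %.2f %s" % (reading["channel"], reading["value"], reading["units"])
```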
When talking about program structure, one often refers to blocks of statements. A block can be defined as a set of one or more statements that are logically associated. Unlike C and some other languages, Python does not use special characters or reserved words to denote how statements are logically grouped into blocks. It uses indentation. For example, in C, one could write the if-else shown above like this:
if (some_var < 10) {
    printf("Yes\n");
    printf("Indeed\n");
} else {
    printf("Sorry\n");
    printf("Nope\n");
}
The curly braces tell the C compiler how the statements are grouped, and C does not care how much or how little each statement is indented—in C that’s considered to be “whitespace,” and the compiler ignores it. In Python, however, the indentation is essential, as it tells the interpreter how the code is structured and which statements are logically associated. The amount of indentation is not critical so long as it is consistent. The recommended amount is four spaces for each level, and no tabs (tabs are generally considered somewhat evil because they don’t always move between different editors gracefully—one editor might interpret tabs as four spaces, whereas another might translate tabs to eight spaces).
Some people have issues with Python’s use of indentation to denote blocks of code, and for those with extensive experience in C or C++ it does seem rather odd at first (although it is by no means a new idea in computer science). The advantages claimed for indentation are that it helps to enforce a consistent style across different programs by different authors and that it improves readability. Some people find that using comments such as #endif, #endfor, and #endwhile helps to make large sections of code with multiple levels of indentation easier to read, but we won’t get into that discussion here.
In Python, a comment is denoted by a # character (sometimes called a hash), and a comment can appear anywhere on a line. The interpreter ignores everything from the hash to the end of the line. Use comments liberally to document your programs, but make the comments worthwhile. A comment like this:
a += 1 # increment by one
isn’t very useful (although they still show up quite often), but a comment like this:
if (a + 1) > maxval: # do not increment past limit
can help to dispel mystery.
Python utilizes 31 distinct reserved keywords, listed in Table 3-12.
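and | as | assert | break |
class | continue | def | del |
elif | else | except | exec |
finally | for | from | global |
if | import | in | is |
lambda | not | or | pass |
print | raise | return | try |
while | with | yield |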
We will examine some of the more commonly used keywords in the remainder of this chapter. Others will be introduced as necessary when we start developing some larger and more complex programs.
In Python, a simple statement (see Table 3-13) is one that consists of an assignment or keyword in a single line; there are no other components. The statement may have more than one expression, however.
Keyword | Description |
assert | Tests whether a condition is true and raises an exception if it is not |
Assignment (=) | Creates a new data object and assigns (binds) it to a name |
Augmented assignment | See Table 3-8 |
pass | The null operation; when executed, nothing happens |
del | Removes the binding between a name or list of names and any associated objects |
print | Sends output to the standard output (stdout) |
return | May return an optional literal value or the result of an expression |
yield | Only used in generator functions |
raise | Raises an exception |
break | Used in for and while loops |
continue | Used in for and while loops |
import | Specifies an external module to be included in the current namespace |
global | Specifies a list of names that are to be treated as global variables within the current module |
exec | Supports dynamic execution of Python code |
I have intentionally skipped the del, exec, raise, and yield statements in the following subsections because they really won’t come into play for what we want to do in this book. A discussion of the import statement is deferred until later in this chapter, in the section titled “Importing Modules.”
The assert statement is typically used to determine if some condition has been met. If not, an exception is raised. It is heavily used in unit testing and sometimes for catching off-nominal conditions (although there are other ways to do this).
The assignment statement (=) is probably the most basic form of Python statement. As we’ve already seen, an assignment is essentially equivalent to instantiating an object of some type and binding it to a name. We’ve already made extensive use of assignment in the previous sections, so we won’t belabor it any more here.
Augmented assignment statements are very useful and show up quite often in Python programs. Because an assignment of any type is a statement that creates a new data object and binds it to a name, rather than an expression that yields a value, you cannot have an augmented assignment in an expression. In other words, this won’t work:
if (a += 1) > maxval:
But this will:
if (a + 1) > maxval:
In an augmented assignment, the arithmetic operation is performed first, followed by the assignment. For a list of Python’s augmented assignment operators, see Table 3-8.
The pass statement is a no-op statement that does nothing. It is typically used as a placeholder when a statement is required syntactically. One often finds pass statements in methods of a top-level class that are intended to be overridden by methods in a child class. They may also appear in “callback” functions or methods that don’t really need to do anything, but must have a statement to be syntactically complete.
print writes the values of one or more objects to stdout, unless stdout has been redirected or the output of print is itself redirected. If print is given an object that is not a string, it will attempt to convert the data to string form. By default, print appends a newline (\n) to the end of the output, but this can be suppressed.
The return statement is used to return control from a function or method back to the original caller. The return statement may optionally pass data back to the caller, and the data can be any valid Python object. As mentioned earlier, a function may return a tuple instead of just a single value, which makes it possible to return both a status code and a data value (or more). While it is possible to return a list or a dictionary, this can be problematic in large programs with large and complex data objects because of the inherently opaque nature of these data types. I’ll have more to say about this kind of unintentional obfuscation in a later section.
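Returning a status together with a data value might look like the following sketch. The function name read_value() and its behavior are assumptions chosen for illustration: it tries to convert a string to a floating-point number and reports whether it succeeded.

```python
def read_value(text):
    """Convert text to a float and return a (status, value) tuple."""
    try:
        return (True, float(text))
    except ValueError:
        return (False, None)

# The caller unpacks the 2-tuple into two names.
ok, value = read_value("3.5")
bad_ok, bad_value = read_value("abc")
```

The caller can test the status first and only then use the value, which avoids confusing a legitimate result such as 0.0 with a failure.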
The break statement may occur only within a for or while loop. It will terminate the nearest enclosing loop construct and skip the else statement, if there is one.
The continue statement may occur only within a for or while loop. continue forces the loop to return to the for or while statement at the start of the loop. Any subsequent statements past the continue are skipped. If a continue causes control to pass out of a try construct with a finally statement, the finally is executed before the next iteration of the loop. We’ll look at the try-except construct in more detail shortly.
Compound statements are composed of groups of statements that are logically related and control the execution of other statements. We will take a look at the if, while, for, and try statements, but we will skip the with statement and save the def and class statements until the next section. Table 3-14 lists Python’s compound statements.
Keyword | Description |
if | Conditional test with optional alternate tests or terminal case |
while | Executes loop repeatedly while initial condition is True |
for | Iterates over elements of an iterable object (e.g., a list, string, or tuple) |
try | Defines exception handling for a group of statements |
with | Used with context managers |
def | Declares a user-defined function or method |
class | Declares a user-defined class |
Python’s if statement behaves as one would expect. Following the keyword if is an expression that will evaluate to either True or False. In its simplest form it is just an if statement and a block of one or more subordinate statements:
if <expression>:
    statement
    (more statements as necessary)
To specify an alternative action, one would use the else statement:
if <expression>:
    statement
    (more statements as necessary)
else:
    statement
    (and yet more statements if necessary)
To create a series of possible outcomes, the elif statement (a compression of “else if”) is used. It is like an if and requires an expression, but it can only appear after an if, never by itself:
if <expression>:
    statement
    (more statements as necessary)
elif <expression>:
    statement
    (more statements as necessary)
else:
    statement
    (and yet more statements if necessary)
The while statement repeats a block of statements as long as a control expression is true:
while <expression>:
    statement
    (more statements as necessary)
else:
    statement
    (and yet more statements if necessary)
The else block is executed if the loop terminates normally (i.e., the control expression evaluates to False) and a break statement was not encountered. In the following example, the loop is controlled using a Boolean variable, which is initialized to True and then assigned the value of False from within the loop:
>>> loop_ok = True
>>> loop_cnt = 10
>>> while loop_ok:
...     print "%d Loop is OK" % loop_cnt
...     loop_cnt -= 1
...     if loop_cnt < 0:
...         loop_ok = False
... else:
...     print "%d Loop no longer OK" % loop_cnt
...
10 Loop is OK
9 Loop is OK
8 Loop is OK
7 Loop is OK
6 Loop is OK
5 Loop is OK
4 Loop is OK
3 Loop is OK
2 Loop is OK
1 Loop is OK
0 Loop is OK
-1 Loop no longer OK
The else statement is completely optional.
The continue and break statements may also be used to cause a loop to re-cycle through the while statement immediately or terminate and exit, respectively. If a break statement is used to terminate a loop, the else statement is also skipped. No statements following a continue statement will be evaluated during that pass through the loop.
Python does have a for statement, but not in the sense that one would expect to find in some other languages. In Python, the for statement is used to iterate through a sequence of values. The for statement also includes an optional else statement, just as the while statement does, and it behaves in the same way:
for some_var in <sequence>:
    statement
    (more statements as necessary)
else:
    statement
    (and yet more statements if necessary)
One way to specify a sequence of integer values is to use the built-in function range(), like this:
>>> for i in range(0,5):
...     print i
...
0
1
2
3
4
Another place where for comes in handy is when dealing with a sequence object such as a list:
>>> alist = [1,2,3,4,5,6,7,8,9,10]
>>> for i in alist:
...     print i
...
1
2
3
4
5
6
7
8
9
10
The values that for traverses don’t have to be integers. They could just as well be a set of strings in a tuple:
>>> stuple = ("this","is","a","4-tuple")
>>> for s in stuple:
...     print s
...
this
is
a
4-tuple
Like the while statement, the for statement supports the continue and break statements, and these work as one might expect.
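As a small illustration of how continue, break, and the optional else block interact in a for loop, consider the following sketch. The function name and the idea of using None to mark a missing reading are assumptions made for the example.

```python
def first_negative(values):
    """Return the first negative number in values, or None if there is none."""
    for v in values:
        if v is None:
            continue      # skip missing readings and move to the next item
        if v < 0:
            found = v
            break         # leaves the loop and skips the else block
    else:
        found = None      # runs only when the loop ended without a break
    return found
```

Calling first_negative([1, None, -2, -3]) returns -2, because the loop stops at the first match. Calling it with a list that contains no negative values runs the else block, and the function returns None.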
The try statement is used to trap and handle exceptions, and it is similar to the try-catch found in C++ or Java. It is very useful for creating robust Python applications by allowing the program designer to implement an alternative to the default approach the Python interpreter takes when an error occurs (which is usually to generate what is called a traceback message and then terminate). The full form of the try-except construct looks like this:
try:
    statement
    (more statements as necessary)
except <exception, err_info>:
    statement
    (more statements as necessary)
else:
    statement
    (more statements as necessary)
finally:
    statement
    (and yet more statements if necessary)
The use of a specific exception type (<exception>) is optional, and if it is not given, any exception will invoke the statements in the except block. One way to find out what happened to cause the exception is to use the base class Exception and specify a variable for Python to write the exception information into:
try:
    f = open(fname, "r")
except Exception, err:
    print "File open failed: %s" % str(err)
In this case, if the file open fails the program won’t terminate. Instead, a message will be printed to stdout with some information about why the open() call failed.
The else statement is executed if there was no exception. The finally block is always executed on the way out of the try statement, whether or not an exception occurred, and even if a break or continue statement in the try block of statements causes control to leave it. Refer to the Python documentation for more information about the try statement and exception handling in Python.
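The order in which the blocks run is easiest to see by recording it. The following sketch uses a hypothetical function, safe_divide(), that appends the name of each block to a list as it executes. It is written so that it runs under both Python 2.6 and later versions.

```python
def safe_divide(a, b, log):
    """Divide a by b, recording in the list log which blocks were executed."""
    try:
        result = a / b
    except ZeroDivisionError:
        log.append("except")
        result = None
    else:
        log.append("else")
    finally:
        log.append("finally")
    return result

log_ok = []
r_ok = safe_divide(6.0, 3.0, log_ok)      # no exception is raised

log_bad = []
r_bad = safe_divide(6.0, 0.0, log_bad)    # raises ZeroDivisionError
```

After these calls, log_ok holds "else" followed by "finally", and log_bad holds "except" followed by "finally". In both cases the finally block runs last.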
The ability to create strings of formatted data is used extensively in many Python programs, and the programs we will encounter in this book are no exception. Python’s string objects provide a rich set of methods, and when they are combined with string formatting Python can generate output with formatted columns, left- or right-justified fields, and specific representations of various data types. Strings are important enough to merit a separate section.
A string literal is quoted using one of the following forms:
'A single-quote string.'
"A double-quote string."
'''This is a multiline string using triple single quotes.
It is a medium-length string.
'''
"""This is a multiline string with triple double quotes
containing many characters along with some punctuation,
and it is a very long string indeed."""
Triple-quoted strings can span more than one line, and \n (newline) characters are inserted into the string automatically to preserve the original formatting.
The string type provides numerous methods, some of which we have already seen. Table 3-15 is a complete list (not including the Unicode methods) as given in the Python 2.6 documentation.
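capitalize() | center() | count() | decode() |
encode() | endswith() | expandtabs() | find() |
format() | index() | isalnum() | isalpha() |
isdigit() | islower() | isspace() | istitle() |
isupper() | join() | ljust() | lower() |
lstrip() | partition() | replace() | rfind() |
rindex() | rjust() | rpartition() | rsplit() |
rstrip() | split() | splitlines() | startswith() |
strip() | swapcase() | title() | translate() |
upper() | zfill() |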
Some of these get a lot more use than others, but it’s good to have some idea of what’s available. For the methods we don’t cover here, refer to the Python documentation. Also, remember that the form:
new_string = "string text".method()
works just as well as:
new_string = string_var.method()
Also keep in mind that there needs to be a target name for the new string object created as a result of invoking the method (strings are not mutable); otherwise, the modified string data will simply vanish.
Table 3-16 lists 14 commonly used string methods. Other less commonly used methods will be described as the need arises. In the following descriptions I will use the convention used in the Python documentation to indicate (required) and [optional] parameters.
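Method | Description |
capitalize() | Returns a copy of the string with just its first character capitalized. |
center(width[, fillchar]) | Returns a copy of the text in the string centered in a new string of length width. Padding uses fillchar, or spaces if it is not given. |
count(sub[, start[, end]]) | Counts the number of non-overlapping occurrences of the substring sub, optionally within the range start to end. |
find(sub[, start[, end]]) | Locates the first occurrence of the substring sub and returns its index, or -1 if it is not found. |
isalnum() | Returns True if all characters in the string are alphanumeric and the string is not empty; otherwise returns False. |
isalpha() | Returns True if all characters in the string are alphabetic and the string is not empty; otherwise returns False. |
isdigit() | Returns True if all characters in the string are digits and the string is not empty; otherwise returns False. |
islower() | Returns True if all cased characters in the string are lowercase and there is at least one cased character; otherwise returns False. |
isspace() | Returns True if all characters in the string are whitespace and the string is not empty; otherwise returns False. |
ljust(width[, fillchar]) | Returns a new left-justified string of length width. Padding uses fillchar, or spaces if it is not given. |
lower() | Returns a copy of the original string with all alphabetic characters converted to lowercase. |
rjust(width[, fillchar]) | Returns a new right-justified string of length width. Padding uses fillchar, or spaces if it is not given. |
split([sep[, maxsplit]]) | Returns a list whose elements are the words from the string using sep as the delimiter (whitespace if sep is not given). At most maxsplit splits are made if maxsplit is given. |
upper() | Returns a copy of the original string with all alphabetic characters converted to uppercase. |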
There are basically two ways to format a string with variable data in Python. The first is to use concatenation, which we saw earlier. The second is to make use of Python’s string formatting capability. Which method is most appropriate depends on what you are trying to accomplish. While concatenation is relatively easy with simple strings, it doesn’t provide for a lot of control over things like the number of decimal places, and data in strings with lots of embedded characters can be cumbersome when using concatenation. Consider the following example:
>>> data1 = 5.05567
>>> data2 = 34.678
>>> data3 = 0.00296705087
>>> data4 = 0
>>> runid = 1
>>> outstr1 = "Run "+str(runid)+": "+str(data1)+" "+str(data2)
>>> outstr2 = " "+str(data3)+" : "+str(data4)
>>> outstr = outstr1 + outstr2
>>> outstr
'Run 1: 5.05567 34.678 0.00296705087 : 0'
There is an easier way. Python employs string formatting placeholders that are very similar to those used in the C sprintf() function. By using special formatting codes, one can specify where data is to be inserted into a string and how the data will appear in string form. Here is a string created using formatting placeholders with the same data as above:
>>> outstr = "Run %d: %2.3f %2.3f %2.3f : %d" % (runid, data1, data2, data3, data4)
>>> outstr
'Run 1: 5.056 34.678 0.003 : 0'
Notice that the variables to be used in the string are enclosed in parentheses—it is an n-tuple. If the parentheses are omitted, only the first variable name is evaluated and an error will result:
>>> "%d %d %d" % 1, 2, 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not enough arguments for format string
The syntax for a string format placeholder is:
%[(name)][flags][width][.precision]type_code
Each placeholder can have an optional name assigned to it as the first item after the % (the parentheses are required). Following that are optional flags for justification, leading spaces, a sign character, and 0 fill. Next is an optional width value that specifies the minimum amount of space to allow for the data. If the data contains a decimal part, the value of the .precision field specifies the number of decimal places to use. Finally, a type code specifies what kind of data to expect (string, integer, long, floating point, etc.). Table 3-17 lists the available flags. Table 3-18 summarizes the various type codes available.
Flag | Meaning |
# | Use “alternate form” formatting (see the notes in Table 3-18). |
0 | Pad numeric values with leading zeros. |
- | Left-adjust (overrides 0 if both are given). |
(a space) | Insert a space before positive numbers. |
+ | Precede the values with a sign character (+ or -). |
Type code | Meaning | Notes |
d | Signed integer decimal. | |
i | Signed integer decimal. | |
o | Signed octal value. | The alternate form prepends a leading zero to the number if one is not already present. |
u | Obsolete type. | Identical to d. |
x | Signed hexadecimal (lowercase). | The alternate form prepends 0x if not already present. |
X | Signed hexadecimal (uppercase). | The alternate form prepends 0X if not already present. |
e | Floating-point exponential format (lowercase). | The alternate form always uses a decimal point even if no digits follow it. |
E | Floating-point exponential format (uppercase). | Alternate form same as e. |
f | Floating-point decimal format (lowercase). | Alternate form same as e. |
F | Floating-point decimal format (uppercase). | Alternate form same as e. |
g | Floating-point format. Uses lowercase exponential format if exponent is less than –4 or not less than the precision, and decimal format otherwise. | The alternate form always contains a decimal and trailing zeros are not removed. |
G | Floating-point format. Uses uppercase exponential format if exponent is less than –4 or not less than the precision, and decimal format otherwise. | Same as g. |
c | Single character (accepts an integer or single-character string). | |
r | String (converts any Python object using repr()). | |
s | String (converts any Python object using str()). | |
% | No argument is converted, results in a % character in the result. | |
String methods can be applied along with string formatting in the same statement. This may look a bit odd, but it’s perfectly valid:
>>> "%d %d".ljust(20) % (2, 5)
'2 5               '
>>> "%d %d".rjust(20) % (2, 5)
'               2 5'
Because there is no assignment of the string object to a name, Python just prints it out immediately after applying the formatting.
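The placeholder syntax and the flags in Table 3-17 are easier to absorb with a few concrete cases. The following sketch uses made-up data, and the dictionary keys are hypothetical. It shows named placeholders taking their values from a dictionary, along with several of the flags.

```python
reading = {"run": 7, "volts": 3.14159, "label": "ch1"}

named = "Run %(run)d: %(volts).2f V" % reading   # named placeholders
padded = "%05d" % 42                             # zero fill to a width of 5
left = "%-8s|" % "ch1"                           # left-justified in 8 columns
signed = "%+.1f" % 2.5                           # explicit sign character
hexval = "%#x" % 255                             # alternate form adds 0x
percent = "%d%%" % 50                            # literal percent character
```

With named placeholders the righthand operand is a dictionary rather than a tuple, and keys that the format string doesn’t mention (such as label here) are simply ignored.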
Lastly, Python provides a set of so-called escape characters for use with strings. These are special two-character codes composed of a backslash followed by a character, as shown in Table 3-19.
Escape sequence | Description | ASCII |
\' | Single quote | ' |
\" | Double quote | " |
\\ | Single backslash | \ |
\a | ASCII bell | BEL |
\b | ASCII backspace | BS |
\f | ASCII formfeed | FF |
\n | ASCII linefeed | LF |
\r | ASCII carriage return | CR |
\t | ASCII horizontal tab | TAB |
\v | ASCII vertical tab | VT |
A backslash character (\) may also be used for line continuation if it is the last character on a line, followed immediately by a newline (LF or CRLF). This causes the newline to be ignored by the interpreter, so it treats the line and the subsequent line as a single line of code.
So far we’ve been doing things at Python’s command prompt. Now we’ll look at how to create program modules with functions, classes, and methods.
Earlier I obliquely referred to the notion of scope without actually defining what it is. Let’s do that now.
As I’ve already mentioned, Python utilizes the concept of namespaces as collections of names that are bound to objects. Actually, a namespace is more like a dictionary object, where the names are the keys and the objects they reference are the values. There are three levels of namespaces in Python: local, global, and built-ins. Figure 3-6 shows the namespace scopes in a Python module.
When a name is referenced in a function or method, a search is first made of the local namespace, including enclosing functions. Next, the global namespace is searched. Finally, the built-in namespace is searched. If the name cannot be found, Python will raise an exception. Figure 3-7 shows the namespace search hierarchy.
The local scope is the namespace of a particular function, class, or method. In other words, any variables defined within a function are local to that function and are not visible outside of it. The local scope also includes the nearest surrounding function (if any). We will look at nested functions shortly.
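The search order can be demonstrated with a few small functions. The names in this sketch are arbitrary. Each function returns a value found at a different level: the enclosing function’s scope, the global namespace, and the built-in namespace.

```python
limit = 10                # global name

def outer():
    limit = 5             # local to outer(), hides the global name
    def inner():
        return limit      # found in the enclosing function's scope
    return inner()

def uses_global():
    return limit          # no local binding, so the global is used

def uses_builtin():
    return len("abc")     # len is found in the built-in namespace
```

Calling outer() returns 5, while uses_global() returns 10. The assignment inside outer() created a new local name and left the global limit untouched.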
Class objects introduce yet another namespace into the local context. In a class object, any variables defined within the namespace of the class are accessible to any method within the class by prefixing the name with self, like this:
self.some_var
The data variable attributes and methods of an object instance of a class are visible outside of the object and may be accessed using the “dot notation” we’ve already seen:
SomeObj = SomeClass()
SomeObj.var_name = value
This will assign a value to the attribute var_name in the object instance SomeObj. If var_name does not exist, it will be created in the object’s context. This leads us to an interesting observation: Python objects do not have truly private data or methods in the sense that they cannot be accessed from outside of the object. Everything is accessible, although some things are not as readily available as others. You can prefix the name of a function, class, or variable with a leading underscore to prevent it from being included in a wildcard import, but that doesn’t hide it. Using two leading underscore characters will “mangle” the object’s name, but even then it is still accessible if you know how. So, while nothing is really hidden, the onus is on the programmer to be polite and not look.
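As a rough illustration of how little is actually hidden, consider the following sketch. The Sensor class and its attributes are invented for this example; the only behavior relied upon is that a double-underscore attribute assigned inside a class is stored under a name of the form _ClassName__attribute.

```python
class Sensor(object):           # hypothetical class, for illustration only
    def __init__(self):
        self.reading = 1.0      # ordinary attribute
        self._scale = 2.0       # single underscore: "internal" by convention only
        self.__offset = 0.5     # double underscore: stored as _Sensor__offset

s = Sensor()
s.units = "V"                   # created on the fly in the instance's namespace
print("%s %s %s" % (s.reading, s._scale, s.units))

# The unmangled name does not exist on the instance...
print(hasattr(s, "__offset"))

# ...but the mangled name is still reachable if you know how to spell it.
print(s._Sensor__offset)
```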
If you’re not sure exactly what this means, don’t worry about it for now. We’ll address objects in more detail later, when we start building user interfaces for our instrumentation applications.
The global scope is the namespace of the enclosing module. Functions cannot modify the module’s global variables unless the global statement is used. The following example, named globals.py, illustrates this:
# globals.py
var1 = 0
var2 = 1

def Function1():
    var1 = 1
    var2 = 2
    print var1, var2

def Function2():
    global var1, var2
    print var1, var2
    var1 = 3
    var2 = 4
    print var1, var2
To try it out, we will need to load it using the import statement. This tells Python to read the module and populate the command line’s namespace using what it finds there:
>>> import globals
Once globals is imported, we can use the help() function to see what is inside:
>>> help(globals)
Help on module globals:

NAME
    globals

FILE
    globals.py

FUNCTIONS
    Function1()

    Function2()

DATA
    var1 = 0
    var2 = 1
If we execute Function1, we can verify that the global instances of var1 and var2 are not changed:
>>> globals.var1
0
>>> globals.var2
1
>>> globals.Function1()
1 2
>>> globals.var1
0
>>> globals.var2
1
However, Function2 will change the values assigned to var1 and var2:
>>> globals.Function2()
0 1
3 4
>>> globals.var1
3
>>> globals.var2
4
If a function assigns values to variables with names identical to those in the global namespace, the global statement must be used if the names are referenced before the assignments are made. This example, called globals2.py, illustrates this:
# globals2.py
var1 = 0
var2 = 1

def Function1():
    print var1, var2

def Function2():
    var1 = 1
    var2 = 2
    print var1, var2

def Function3():
    print var1, var2
    var1 = 1
    var2 = 2
    print var1, var2
Observe what happens when we execute the three functions:
>>> import globals2
>>> globals2.Function1()
0 1
>>> globals2.Function2()
1 2
>>> globals2.Function3()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "globals2.py", line 14, in Function3
    print var1, var2
UnboundLocalError: local variable 'var1' referenced before assignment
Function1() succeeded because there was no conflict between its local variables and the module’s global variables. In Function2() the local variables var1 and var2 are defined within the function, so again, there is no problem. However, Function3() causes Python to emit an error message. In this case the use of the global names is blocked because identical names have already been placed into the function’s local namespace, but the names aren’t yet bound to an object containing a value when the print statement is invoked. Hence the UnboundLocalError exception. If the print statement were preceded by a global statement, the error would not have occurred.
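The following sketch puts the failing pattern and the repaired one side by side. The function names broken() and fixed() are my own, and the functions return values instead of printing them so that the snippet runs the same way under Python 2 and newer interpreters.

```python
var1 = 0
var2 = 1

def broken():
    # var1 is assigned later in this function, so Python treats it as a
    # local name everywhere in the body, including on the next line.
    seen = var1
    var1 = 1
    return seen

def fixed():
    global var1, var2           # refer to the module-level names instead
    before = (var1, var2)
    var1 = 1
    var2 = 2
    return before, (var1, var2)

try:
    broken()
except UnboundLocalError as err:
    print("UnboundLocalError: %s" % err)

print(fixed())
```

Note that fixed() really does rebind the module's globals, so calling it changes what every other function in the module will see.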
The built-in namespace is the Python runtime environment. It includes things like abs(), print, and various exception names. If you want a list of the built-in names, just type dir(__builtins__) at the Python prompt. I won’t list the output here because it’s rather large (144 names at least).
A Python source code file is called a module. It is a collection of statements composed of variable definition statements, import statements, directly executable statements, function definition statements, and class definition statements, with the variables and methods that go with them.
Modules are contained within packages. A package is, in effect, a directory that contains one or more modules. Packages may contain other packages. Figure 3-8 shows this graphically.
A module is an object, and as we’ve already seen, it has its own namespace. A module also has attributes just like any other Python object. A module’s attributes include the functions, classes, methods, and variables defined in its namespace.
The def statement is used to define both functions within modules and methods within classes:
def SomeName (parameters):
    """ docstring goes here.
    """
    local_var = value
    statement...
    statement...
    more statements...
When used to define a function, the def statement begins at the leftmost column and all of the function’s statements are indented relative to it. When used to define a method in a class, the def statement is indented relative to the class statement.
Functions and methods may be nested. When this is done, the internal functions are not accessible from outside of the enclosing function. Here is a rather contrived example of nested functions called subfuncs.py:
#subfuncs.py
def MainFunc():
    def SubFunc1():
        print "SubFunc1"
    def SubFunc2():
        print "SubFunc2"
    def SubFunc3():
        def SubSubFunc1():
            print "SubSubFunc1"
        def SubSubFunc2():
            print "SubSubFunc2"
        SubSubFunc1()
        SubSubFunc2()
    SubFunc1()
    SubFunc2()
    SubFunc3()
We can only execute the function MainFunc(); none of the other functions nested within it are directly accessible from outside of the scope of MainFunc(). If you import subfuncs and try to get help on it, this is all you will see:
>>> import subfuncs
>>> help(subfuncs)
Help on module subfuncs:

NAME
    subfuncs

FILE
    subfuncs.py

FUNCTIONS
    MainFunc()
However, if we execute MainFunc() we can see that the subfunctions do get executed:
>>> import subfuncs
>>> subfuncs.MainFunc()
SubFunc1
SubFunc2
SubSubFunc1
SubSubFunc2
The class statement defines a class object, which in turn is used to create object instances of the class. The following class defines a timer object that may be used to get elapsed times during program execution:
import time

class TimeDelta:
    def __init__(self):
        self.tstart = 0
        self.tlast = 0
        self.tcurr = 0
        self.Reset()

    def GetDelta(self):
        """ Returns time since last call to GetDelta(). """
        self.tcurr = time.clock()
        delta = self.tcurr - self.tlast
        self.tlast = self.tcurr
        return delta

    def GetTotal(self):
        """ Returns time since object created. """
        return time.clock() - self.tstart

    def Reset(self):
        """ Initializes time attributes. """
        self.tstart = time.clock()
        self.tlast = self.tstart
Objects of this class can be instantiated in the code wherever one might want to check on elapsed times, and multiple occurrences may exist simultaneously. This would be rather awkward to do if TimeDelta were a function in a module, but as a class each instance can maintain its own data for when it was started and when it was last checked.
Docstrings are used to document modules, classes, methods, and functions. A string literal (typically triple-quoted, and often spanning several lines) that appears as the first statement of a module, function, class, or method is seen by Python as a docstring, and it is stored in the object’s internal __doc__ variable. This is what you are seeing when you type help() for a specific function at the command-line prompt.
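A quick way to convince yourself that a docstring is just an attribute is to look at __doc__ directly. The two functions in this sketch are placeholders invented for the purpose.

```python
def read_channel(channel):
    """Return a reading from the given channel.

    Hypothetical function; it exists only to carry a docstring.
    """
    return 0.0

def undocumented():
    pass

# help() formats this same text for display.
print(read_channel.__doc__)

# A function without a docstring has __doc__ set to None.
print(undocumented.__doc__)
```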
The following example shows how docstrings are used. The pass statement has been used so that we can import this code and use help() to display the embedded documentation:
#docstrings.py
""" Module level docstring.

This describes the overall purpose and features of the module.
It should not go into detail about each function or class as
each of those objects has its own docstring.
"""

def Function1():
    """ A function docstring.

    Describes the purpose of the function, its inputs (if any)
    and what it will return (if anything).
    """
    pass

class Class1:
    """ Top-level class docstring.

    Like the module docstring, this is a general high-level
    description of the class. The methods and variable
    attributes are not described here.
    """
    def Method1():
        """ A method docstring.

        Similar to a function docstring.
        """
        pass

    def Method2():
        """ A method docstring.

        Similar to a function docstring.
        """
        pass
When the help() function is used on this module, the following output is the result:
>>> import docstrings
>>> help(docstrings)
Help on module docstrings:

NAME
    docstrings - Module level docstring.

FILE
    docstrings.py

DESCRIPTION
    This describes the overall purpose and features of the module.
    It should not go into detail about each function or class as
    each of those objects has its own docstring.

CLASSES
    Class1

    class Class1
     |  Top-level class docstring.
     |
     |  Like the module docstring, this is a general high-level
     |  description of the class. The methods and variable
     |  attributes are not described here.
     |
     |  Methods defined here:
     |
     |  Method1()
     |      A method docstring.
     |
     |      Similar to a function docstring.
     |
     |  Method2()
     |      A method docstring.
     |
     |      Similar to a function docstring.

FUNCTIONS
    Function1()
        A function docstring.

        Describes the purpose of the function, its inputs (if any)
        and what it will return (if anything).
Python modules can bring in functionality from other modules by using the import statement. When a module is imported Python will first check to see if the module has already been imported, and if it has it will refer to the existing objects by including their names in the current namespace. Otherwise, it will load the indicated module, scan it, and add the imported names to the current namespace. Note that “current namespace” may refer to the local namespace of a function, class, or method, or it might be the global namespace of a module.
Statements in a Python module that are not within a function or method will be executed immediately when the module is loaded. This means that any import statements, assignments, def or class statements, or other code will be executed at load time. Code within a function or method is executed only when it is called, although an object for it is created when the def or class statement is processed.
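To watch load-time execution happen, the following sketch writes a small throwaway module to a temporary directory and imports it twice. The module name loadtime_demo and everything in it are invented for this example.

```python
import os
import sys
import tempfile

# Source for a throwaway module (hypothetical name: loadtime_demo.py).
source = (
    "print('module body running')\n"
    "counter = 0\n"
    "def bump():\n"
    "    print('bump() body running')\n"
    "    return counter + 1\n"
)
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "loadtime_demo.py"), "w") as f:
    f.write(source)
sys.path.insert(0, tmpdir)          # let the import statement find it

import loadtime_demo                # the module body runs now
import loadtime_demo as again       # already loaded: the body is not run again

print(again is loadtime_demo)       # both names refer to one module object
print(loadtime_demo.bump())         # only now does the function body execute
```

The message from the module body appears once, at the first import, while the message inside bump() does not appear until the function is called.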
The import statement comes in several different forms. This is the most common, and safest, form:
import module
The objects in module are added to the current namespace as references of the form module.function() or module.class(). To access data attributes within a module, the notation module.variable is used.
A variation on this is the aliased import form:
import module as alias
This is identical to the import module statement, except now the alias can be used to reference objects in module. This is handy when a module has a long name. For example:
import CommonReturnCodes as RetCodes
One can also specify what to import from a module:
from module import somename
This form imports a specific function, class, or data attribute from a module. The function or attribute somename can then be used without a module prefix.
The wildcard form imports everything from the external module and adds it to the current namespace:
from module import *
The wildcard import is generally considered to be a bad idea except in special cases, such as when importing a module that has been specifically designed to be used in this fashion and contains only unique names that are unlikely to conflict with existing names. It is considered problematic because it imports everything from the imported module unless special precautions are taken. If the imported module happens to have attributes with the same names as those in the current module, the current names will be overwritten.
There is, as one might expect from Python, a way to control what is exported by using single or double leading underscore characters for attribute names. An attribute name of the form:
_some_name
will not be included in a wildcard import, but it can still be referenced using the module prefix notation. A double leading underscore of the form:
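__some_name
is about as close as Python gets to data hiding. At the module level it is excluded from a wildcard import just like the single-underscore form. When it is used for an attribute inside a class, it can still be accessed from outside, but its external name is “mangled” (to _ClassName__some_name) to make it more difficult to get at.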
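The effect of a leading underscore on a wildcard import can be checked with a small experiment. The module wild_demo and its contents are hypothetical, and the wildcard import is run in a scratch dictionary (via exec) purely so that the imported names can be listed afterward.

```python
import os
import sys
import tempfile

# A throwaway module (hypothetical name: wild_demo.py) with two kinds of names.
source = (
    "public_name = 1\n"
    "_some_name = 2\n"
)
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "wild_demo.py"), "w") as f:
    f.write(source)
sys.path.insert(0, tmpdir)

# Run the wildcard import in a scratch namespace so we can inspect the result.
scratch = {}
exec("from wild_demo import *", scratch)
imported = sorted(name for name in scratch if name != "__builtins__")
print(imported)                     # the underscore-prefixed name is absent

# The "hidden" name is still reachable using the module prefix notation.
import wild_demo
print(wild_demo._some_name)
```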
Because Python executes any import statements that are not within the scope of a function immediately when a module is imported, it will descend through the import statements in each module in a depth-first fashion until all imports have been processed. Figure 3-9 shows graphically how this works.
The import sequence in Figure 3-9 is indicated by numbers in circles. Module A imports module B, which imports Module C, which in turn imports modules G and H. Module D and the modules it imports will be next in line after module H is processed.
One drawback to Python’s import scheme is that it is possible to create situations where imports cannot complete properly. This is called a cyclic import. Consider the diagram in Figure 3-10.
Here we have a situation where Module A imports Module B, which in turn imports Modules C and D. However, Module C attempts to import Module A, which is still waiting for Module B to finish importing Module C so that Module B can move on to Modules D and E. Python does not actually hang at this point; instead, Module C receives a partially initialized Module A, and any attempt to use a name that Module A has not yet defined fails with an exception.
One sure way to avoid cyclic imports is to remember the rule “Never import up, only down.” This means that modules should be imported hierarchically, and also that modules should be architected such that there is no need to import from a higher-level module. A typical mistake made by many newcomers to Python is to place a set of pseudoconstants (assignments to names with values that don’t change) in a module with other related functionality, and then import the entire module solely to gain access to the pseudoconstant objects. Things like pseudoconstants that are referenced by more than one module should go into their own module, which can then be imported when needed without worrying about causing a cyclic import situation.
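One possible layout that follows the “only down” rule is sketched below. The three module names (retcodes_demo, driver_demo, and app_demo) and their contents are invented for illustration; the snippet writes them to a temporary directory so that it can be run as-is.

```python
import os
import sys
import tempfile

# Three throwaway modules; all of the names here are hypothetical.
files = {
    # Lowest level: pseudoconstants only, imports nothing.
    "retcodes_demo.py": "OK = 0\nTIMEOUT = 1\n",
    # Middle level: imports only "down".
    "driver_demo.py": (
        "import retcodes_demo as rc\n"
        "def read():\n"
        "    return rc.OK, 3.3\n"
    ),
    # Top level: imports the driver and the shared constants, never the reverse.
    "app_demo.py": (
        "import retcodes_demo as rc\n"
        "import driver_demo\n"
        "def run():\n"
        "    status, value = driver_demo.read()\n"
        "    if status == rc.OK:\n"
        "        return value\n"
        "    return None\n"
    ),
}
tmpdir = tempfile.mkdtemp()
for name, text in files.items():
    with open(os.path.join(tmpdir, name), "w") as f:
        f.write(text)
sys.path.insert(0, tmpdir)

import app_demo
print(app_demo.run())
```

Because retcodes_demo imports nothing, both of the other modules can import it freely without any risk of closing a loop.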
The following example is a complete Python program that contains no function or class definitions—it is what is commonly referred to as a “script.” It will generate a PGM format image file consisting of random data. The result looks like an old-style TV screen tuned to an empty channel—it’s a lot of “snow.” The main point here is to get a look at what a small Python program looks like. Any image viewer capable of handling PGM files should be able to load and display the image (ImageJ, a free tool from http://rsbweb.nih.gov/ij/, works quite well for this, and check out http://netpbm.sourceforge.net for information about the PGM image format.)
Executing this program doesn’t require that you start the Python interpreter first. Just run python from the command line with the program filename as its only parameter, like this:
C:\samples> python pgmrand.py
or, on Linux:
/home/jmh/samples/% python pgmrand.py
The prompt will most likely look different on your system (unless you’re keeping your Python samples in a directory called “samples”).
If you are using Linux, you’ll probably need to put the following line at the top of the program file:
#! /usr/bin/python
On some systems you may need to modify this to point to where Python is actually installed. A likely alternate location is /usr/local/bin/python.
Here’s the source code:
""" Generates an 8 bpp "image" of random pixel values.

The sequence of operations used to create the PGM output file is
as follows:
    1. Create the PGM header, consisting of:
        ID string (P5)
        Image width
        Image height
        Image data size (in bits/pixel)
    2. Generate height x width bytes of random values
    3. Write the header and data to an output file
"""
import random as rnd        # use import alias for convenience

rnd.seed()                  # seed the random number generator

# image parameters are hardcoded in this example
width = 256
height = 256
pxsize = 255                # specify an 8 bpp image

# create the PGM header
hdrstr = "P5\n%d %d\n%d\n" % (width, height, pxsize)

# create a list of random values from 0 to 255
pixels = []
for i in range(0,width):
    for j in range(0,height):
        # generate random values of powers of 2
        pixval = 2**rnd.randint(0,8)
        # some values will be 256, so fix them
        if pixval > pxsize:
            pixval = pxsize
        pixels.append(pixval)

# convert array to character values
outpix = "".join(map(chr,pixels))

# append the "image" to the header
outstr = hdrstr + outpix

# and write it out to the disk
FILE = open("pgmtest.pgm","w")
FILE.write(outstr)
FILE.close()
It would be a worthwhile exercise to review the program and look up the things that don’t immediately make sense to you. The only really “tricky” part is the use of the string join() method and the map() function to create the output string. This was done because Python does not have a native byte type, but it does have the chr() function, which converts an integer to a one-character string. If one wants an array of bytes, one way to get it is to scan through a list of integers, convert each to a one-character string with chr(), and then join the results using an empty string as the separator (the "" in the "".join(map(chr,pixels)) statement). Note that all the parameters one might want to change to experiment with the output file are hardcoded in this example.
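The join-and-map idiom is easier to follow on a tiny list. In this sketch the three values are simply the ASCII codes for the start of a PGM header; under Python 2, which the listings in this chapter target, the resulting string doubles as a byte array.

```python
values = [80, 53, 10]               # ASCII codes for 'P', '5', and linefeed

# chr() turns each integer into a one-character string...
chars = list(map(chr, values))
print(chars)

# ...and join() glues them together, with the empty string as the separator.
packed = "".join(chars)
print(repr(packed))

# ord() reverses the conversion, which makes a handy sanity check.
print([ord(c) for c in packed] == values)
```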
In order to be generally useful, a program must have some means to input data and output results. Python provides several ways to achieve both objectives using the console, the command line, and file objects. Later on we will examine things like serial ports, USB interfaces, network sockets, and data acquisition hardware, but for now let’s look at what can be done with Python as it comes right out of the box.
Getting user input from stdin (standard input) is straightforward. Python provides the raw_input() function for just this purpose.
The module getInfo.py contains a simple example of how raw_input() can be used:
# getInfo.py
def ask():
    uname = raw_input("What is your name? ")
    utype = raw_input("What kind of being are you? ")
    uhome = raw_input("What planet are you from? ")
    print ""
    print "So, %s, you are a %s from %s." % (uname, utype, uhome)
    uack = raw_input("Is that correct? ")
    if uack[0] in ('y', 'Y'):
        print "Cool. Welcome."
    else:
        print "OK, whatever."
To see how this works, we can import the module getInfo and then call its function ask():
>>> import getInfo
>>> getInfo.ask()
What is your name? zifnorg
What kind of being are you? Zeeble
What planet are you from? Arcturus III

So, zifnorg, you are a Zeeble from Arcturus III.
Is that correct? y
Cool. Welcome.
The raw_input() function accepts an optional prompt string and always returns the data from stdin as a string. If the program is looking for a numeric value, it will need to be converted. A safe way to do this is by using the try-except construct. Here is getInfo2.py with the try-except modification:
def ask2():
    uname = raw_input("What is your name? ")
    utype = raw_input("What kind of being are you? ")
    uhome = raw_input("What planet are you from? ")
    getgumps = True
    while (getgumps):
        intmp = raw_input("How many mucklegumps do you own? ")
        try:
            ugumps = int(intmp)
        except:
            print "Sorry, you need to enter an integer number."
            continue
        else:
            getgumps = False
    print ""
    print "So, %s, you are a %s from %s, with %d mucklegumps." % (uname, utype, uhome, ugumps)
    uack = raw_input("Is that correct? ")
    if uack[0] in ('y', 'Y'):
        print "Cool. Welcome."
    else:
        print "OK, whatever."
Before we move on, there are a few things to consider about this simple function. First, it will only accept an integer value for the number of “mucklegumps.” Strings and floats will be rejected. Secondly, there is no way for the user to gracefully abort the input process. This could be easily handled by checking for a special character (a ., for example), or just detecting null input (just pressing the Enter key with no input).
Speaking of null input, if the user does press the Enter key in response to the last question, Python will raise an exception:
Is that correct? <enter>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "getInfo2.py", line 18, in ask2
    if uack[0] in ('y', 'Y'):
IndexError: string index out of range
The expression in the if statement is attempting to match whatever is in uack[0] with either of the values in the 2-tuple ('y', 'Y'), and just pressing Enter returns a zero-length string, which causes the exception. Using a try-except here will prevent this from happening:
uack = raw_input("Is that correct? ")
try:
    if uack[0] in ('y', 'Y'):
        print "Cool. Welcome."
    else:
        print "OK, whatever."
except:
    print "Fine. Have a nice day."
When dealing with user input (that is, whatever a human being types in response to a prompt), one must always be aware of possible input errors or exceptions. Humans can, and often will, type in erroneous data, values that are out of range, unexpected words or phrases, or even nothing at all. Users are unpredictable, so building in safeguards to catch bad input values is always a good idea.
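One way to package these safeguards is a small helper that keeps asking until it gets something usable. This is only a sketch: the function name ask_int(), its range limits, and the choice of a period as the quit character are all my own assumptions. The read parameter exists so that the function can be exercised without a keyboard; it defaults to raw_input() under Python 2 and to input() on newer interpreters.

```python
def ask_int(prompt, read=None, low=0, high=1000, quit_char="."):
    """Prompt until an in-range integer is entered.

    Returns the integer, or None if the user enters quit_char or nothing.
    'read' is any function that takes a prompt and returns a string.
    """
    if read is None:
        try:
            read = raw_input        # Python 2
        except NameError:
            read = input            # newer interpreters
    while True:
        text = read(prompt).strip()
        if text in ("", quit_char):
            return None             # graceful abort
        try:
            value = int(text)
        except ValueError:
            print("Sorry, you need to enter an integer number.")
            continue
        if low <= value <= high:
            return value
        print("Please enter a value from %d to %d." % (low, high))

# Simulate a user who makes two mistakes before getting it right.
answers = iter(["lots", "-5", "42"])
print(ask_int("How many mucklegumps do you own? ", read=lambda p: next(answers)))
```

Catching only ValueError, instead of using a bare except, means that other problems (such as the user pressing Ctrl-C) are not silently swallowed.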
Program parameters entered at the command line are captured by the operating system and passed to the program via the Python interpreter as a list. The first item in the list (at index 0) is always the name of the program itself. Python’s included sys module makes this list available as sys.argv.
This simple program (argshow.py) will print out all the items from the command-line parameter list:
import sys

print "%d items in argument list\n" % len(sys.argv)

i = 1
for arg in sys.argv:
    print "%d: %s" % (i, arg)
    i += 1
And here is what happens when we run it:
C:\samples> python argshow.py 1 2 3 4 -h -v
7 items in argument list
1: argshow.py
2: 1
3: 2
4: 3
5: 4
6: -h
7: -v
Python also provides tools for detecting specific arguments and extracting values from command-line parameters, which we won’t cover at this point. We will see them in action in later chapters.
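In the meantime, a little list slicing goes a long way. The following sketch separates the program name, any dash-prefixed flags, and the remaining parameters. The function name split_args() is hypothetical, and a hand-built list stands in for sys.argv so the snippet can be run anywhere.

```python
import sys

def split_args(argv):
    """Separate the program name, dash-prefixed flags, and other parameters."""
    program = argv[0]               # index 0 is always the program itself
    flags = [a for a in argv[1:] if a.startswith("-")]
    values = [a for a in argv[1:] if not a.startswith("-")]
    return program, flags, values

# A stand-in for sys.argv, matching the argshow.py run shown earlier.
sample = ["argshow.py", "1", "2", "3", "4", "-h", "-v"]
program, flags, values = split_args(sample)
print("%s %s %s" % (program, flags, values))

# In a real program the list would come from the interpreter:
# program, flags, values = split_args(sys.argv)
```

This naive approach treats anything that starts with a dash as a flag, so it would misclassify a negative number; that is one of the reasons dedicated argument-parsing tools exist.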
Python has a basic built-in object type for dealing with files that provides methods to read and write data from and to a disk file, among other actions. We’ve already seen a little bit of it with the pgmrand.py script we looked at earlier.
The built-in open() function is used to create an instance of a file object:
>>> fname = "test1.txt"
>>> fmode = "w"
>>> f = open(fname, fmode)
Of course, you could also write:
f = open("test1.txt", "w")
and get the same result.
Once we have a file object, we can write something to it using its write() method:
>>> f.write("Test line 1\n")
>>> f.write("Test line 2\n")
>>> f.close()
The resulting file should now contain two lines of text:
Test line 1
Test line 2
Notice that the strings to be written to the file end with a \n (the code for a newline character). The file write() method does not append a newline to the end of a string like print does, so it must be explicitly included in the string.
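To see what happens when the newline is left out, this sketch writes three strings and then reads the file back. It uses a temporary directory so it won’t clutter your samples directory; the text written is arbitrary.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test1.txt")

f = open(path, "w")
f.write("Test line 1\n")        # newline supplied explicitly
f.write("Test line 2")          # no newline here...
f.write(" (continued)\n")       # ...so this lands on the same line
f.close()

f = open(path, "r")
lines = f.readlines()           # each element keeps its trailing newline
f.close()
print(lines)
```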
Table 3-20 lists the most commonly encountered file modes.
Table 3-21 lists some commonly used file object methods. For a description of the other file object methods that are available, refer to the Python documentation.
We’ve already seen Python’s print statement in action. Its primary purpose is to send output to whatever is currently defined as stdout (standard output). The print statement is capable of handling conversions between numeric types and strings for console output in a transparent fashion. The string formatting discussed earlier works with the print statement to create nicely formatted output.
By default, the output of print is sent to whatever is currently defined as stdout. By using the “chevron” (>>) operator this behavior can be modified, and print can send output to any object that provides a write() method. Typically this would be a file object, as shown here:
>>> datastr = "This is a test."
>>> f = open("testfile.txt", "w")
>>> print >> f, datastr
>>> f.close()
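The target does not have to be a real file. The sketch below sends output to a made-up Collector class whose only qualification is a write() method. Note that this example is written for newer interpreters, where print is a function and the redirection is spelled with the file= keyword instead of the chevron; under Python 2.6 or later the same form is available after from __future__ import print_function.

```python
class Collector(object):
    """Hypothetical stand-in for a file: anything with write() will do."""
    def __init__(self):
        self.chunks = []
    def write(self, text):
        self.chunks.append(text)

log = Collector()
datastr = "This is a test."

# Equivalent of the Python 2 statement:  print >> log, datastr
print(datastr, file=log)

# print supplied the trailing newline, just as it does on the console.
print(repr("".join(log.chunks)))
```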
Here is a semirandom collection of observations that may prove useful to you.
It is usually a good idea to initialize module global variables at the start of a module file. Attempting to check a global variable that does not yet exist will result in an exception, and taking care of this beforehand can save some aggravation later.
Because Python does not execute the internal statements (i.e., the body) of functions or methods when a module is imported, only the def statement, it is always possible for bugs to be lurking there that will not become apparent until the code is invoked. In such situations the try statement is a powerful ally, but it is not a cure-all. Good unit testing is key to detecting and removing such defects before they can cause problems.
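Here is a small demonstration of a bug hiding inside a function body. The misspelled name gain_factor is deliberate, and the function names are invented for the example.

```python
def scale(reading):
    # 'gain_factor' is a deliberate mistake: the name is never defined.
    # Python accepts the def statement without complaint.
    return reading * gain_factor

print("def executed; no error yet")

try:
    scale(2.5)                  # the bug only surfaces when the body runs
except NameError as err:
    print("caught at call time: %s" % err)

# Even a trivial test that calls the function exposes the problem early.
def test_scale():
    try:
        scale(1.0)
    except NameError:
        return False
    return True

print("test passed? %s" % test_scale())
```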
Sometimes you may encounter code where the original author attempted to resolve a cyclic import by deferring the import of the problematic module by placing the import statement within a function or method, instead of at the top of the module file. While this is syntactically allowed in Python it is considered to be bad form, and it’s a sure sign that someone didn’t think the design through before sitting down at the keyboard and hammering away at it. However, when dealing with legacy code (or just poorly written code) it may not be possible to avoid using this trick. Use it sparingly, only when you really have to, and test it thoroughly.
Although Python allows any data object to be used as a parameter to a function or method, resist the temptation to use dictionary objects unless you have a compelling reason to do so. If a dictionary object is used as a parameter, document it in detail and try to avoid altering its structure dynamically as it gets passed from function to function. Code that dynamically alters the structure of a shared dictionary object can be very difficult to understand and a nightmare to debug. It could even be considered a form of obfuscation, albeit (hopefully) unintentional. The same common-sense rationale applies to lists.
Tuples are a handy way to return more than one value from a function. For example, one could return a tuple containing both a status code value and a data value by using a 2-tuple. To see if the function succeeded one would examine the status code, and if it is OK one would then get the data value.
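A minimal sketch of the status-plus-value pattern might look like the following. The status codes and the parse_reading() function are hypothetical.

```python
OK, BAD_INPUT = 0, 1                # hypothetical status codes

def parse_reading(text):
    """Return a (status, value) 2-tuple; value is None on failure."""
    try:
        return OK, float(text)
    except ValueError:
        return BAD_INPUT, None

status, value = parse_reading("3.30")   # unpack both items at once
if status == OK:
    print("reading: %s" % value)

status, value = parse_reading("n/a")
if status != OK:
    print("could not parse input")
```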
Of course, in Python a module actually is an object (everything is, as you may recall), but the tendency seems to be to treat a module as something akin to a source code module in C or C++. One can achieve some neat and tidy data encapsulation using just a module with nothing in it but assignment statements to associate names with values. Here is part of a module that contains nothing but event ID values for use with a wxPython GUI, which we will get to in a later chapter:
# ResourceIDs.py
import wx

# File
idFileSave = wx.NewId()
idFileSaveAs = wx.NewId()
idFileNew = wx.NewId()
idFileOpen = wx.NewId()
idFileOpenGroup = wx.NewId()
idFileClose = wx.NewId()
idFileCloseAll = wx.NewId()
idFilePrint = wx.NewId()
idFilePrintPreview = wx.NewId()
idFilePrintSetup = wx.NewId()
The wxPython package includes a function called NewId() that automatically assigns a new ID number each time it is called. When ResourceIDs is imported, every statement is evaluated and a value is assigned to each data object. To use these one simply imports the module (perhaps using an alias, as shown here):
import ResourceIDs as rID

event_id = rID.idFileSave
This comes in handy in large programs, especially those that employ a GUI with lots of event ID names. A data-only module can also be imported from any other module without worrying about creating a cyclic import, provided that it does not itself import anything else (except perhaps system-level modules). If the attribute names in a data-only module are unique (using, say, a special prefix on each name), it could also be safely imported using the wildcard import style.
Once upon a time, a physics professor told me: “Document everything you do in the lab like you were going to get struck with amnesia tomorrow.” Sage advice, to be sure, but many people are loath to spend the time necessary to include docstrings and descriptive comments in their code. This is silly, because no one can be expected to remember exactly what something does or why it’s even there 12 months or more down the road (some might say that even a couple of months is a stretch). It also says something about how the author of the software feels about those who might come along later and try to fix or maintain the code.
The document known as “PEP-8” (available from http://www.python.org) contains some suggested coding style guidelines. You may not agree with all of it, but you should at least read it and be familiar with it. There is a lot of good advice there. In any case, you should attempt to arrive at some type of consistent style for your code, if for no other reason than that it improves readability and makes things a whole lot easier when there is a need to revisit old code.
A good development environment can make the difference between success and frustration. The development environment must, at a minimum, provide some way to create and edit Python source code as a standard ASCII text file. Additional tools, such as debuggers, automatic documentation generators, and version control are all good, but one could get by without them if absolutely necessary. Fortunately this isn’t necessary, given that there are a lot of excellent FOSS (Free and Open Source Software) tools available, and some very good and inexpensive commercial tools as well.
In this section we’ll take a brief look at what is available, with a primary emphasis on FOSS tools. It doesn’t really matter what tools you use, and most people have (or develop over time) their own preferences and work habits. The important thing to take away here is that there are many paths available, and choosing the right one is simply a matter of picking the tools that feel right and selecting the right tools for the job.
At the very least, you will need a text editor or integrated development environment (IDE) of some sort for entering and editing Python source code. You may also want to use the editor for your C source code when writing extensions (which we will delve into in Chapter 5), so it may be a good idea to pick something that’s language-neutral, or perhaps language-aware with syntax highlighting.
The primary difference between an editor and an IDE lies in how much one can accomplish from within the tool itself. An editor typically allows you to do just one thing: editing. An IDE, on the other hand, lets you do much more—from editing, to compiling, debugging, and perhaps even metrics and version control. An IDE is intended to be an environment the developer doesn’t need to leave until either it’s time to quit and go home, or the program is complete.
With some editors there is also the capability to launch another program from within the tool and then capture and display program output, but this is usually more of an add-on capability, not something that is inherently part of the editor tool, and some editors support this capability better than others. A full-featured IDE incorporates all of this functionality in some form or another, although some IDEs also require functionality from external tools and applications. In other words, the line between an editor with lots of bells and whistles and an IDE is sometimes blurry.
Using an IDE with Python is probably not necessary (although there are a few available), since it is not a compiled language, and most of what happens with Python is either happening at the command line or within the Python application’s GUI (if it uses one).
If you think you would prefer to use a standalone editor (which is what I use, by the way), there are several excellent packages to choose from. Table 3-22 lists a few of the more popular ones to consider.
Name | OS | FOSS? | Pros | Cons |
Emacs | Linux, Windows, others | Yes | Supports sophisticated editing functions, scripting, syntax highlighting, and multiwindow displays. | Has a somewhat steep learning curve and uses some nonintuitive multikey commands that must be memorized. |
vi/vim | Linux, Windows, others | Yes | The basic functions are easy to learn, and vi is very widespread across different Linux- and Unix-like platforms. vim also provides a GUI in addition to the conventional command-line operation. | Learning the more complex and sophisticated functionality can be a slog. Nonintuitive key combinations and codes are a holdover from the days of mainframes, minicomputers, and terminals. |
nano | Linux | Yes | Very simple. Provides some syntax highlighting. | Based on the Pico editor and its Control-key commands. Limited capabilities. |
SlickEdit | Linux, Windows, others | No ($$$) | Lots of features, full GUI, programmable macros, and syntax highlighting. Capable of emulating other editors. | Lots of knobs and dials to learn; may be overkill for most development tasks. Rather hefty price tag. |
UltraEdit | Linux, Windows | No ($) | Very easy to learn with a full GUI. Multiple tabbed text windows, programmable macros, and syntax highlighting. | Has lots of features that the average developer will probably never use. Requires some effort to figure out how to adjust the default settings and disable some unnecessary defaults. It costs money (but not a whole lot). |
This is only a partial list, and there are other editors available, including some good FOSS ones. If you don’t already have a favorite editor (or even if you do), it would probably be worthwhile to compare what’s available for your development platform. But a word of caution: some people become rather attached to a particular editor, even to the point of being somewhat fanatical about it. This is particularly apparent in the Emacs versus vi debate that has been going on now for well over 20 years (refer to http://en.wikipedia.org/wiki/Editor_war for details). Just keep an open mind, select the right tool for the job, and see the editor war for what it really is: free entertainment.
An IDE attempts to integrate everything a programmer might need into a single tool. The first popular and low-cost IDE for the PC was Borland’s Turbo Pascal, first released in 1983 by the company Philippe Kahn cofounded. Most modern IDEs provide a text editor for source code, an interface to a compiler or interpreter, tools to automate the build process, perhaps some support for version control, and a debugger of some sort. In other words, it’s a one-stop shopping experience for software development. Not every IDE will provide all the functionality we’ve listed here, but at the very least you should expect a text editor and the ability to run external tools and applications such as a compiler, interpreter, and debugger. In this sense even editors such as UltraEdit and Emacs (listed in Table 3-22) could be used as IDEs (and often are, actually). Table 3-23 lists some readily available IDE tools suitable for use with Python.
Name | OS | FOSS? | Pros | Cons |
Boa | Any that Python and wxPython support | Yes | Excellent tool for creating and maintaining wxPython GUI components and applications. Includes a decent editor and a basic Python debugger. | Targeted for the wxPython GUI add-on package. It does a lot but isn’t as full-featured as a dedicated editor. |
Idle | Any that Python supports | Yes | Provided with Python and coded entirely in Python. Provides multiple editing windows, function/method lists, a Python shell window, and a rudimentary debugger. | Idle’s multiple editing windows are free-floating, and it is sometimes annoying trying to track down a particular window. |
Eclipse (with PyDev) | Linux, Windows, others | Yes | A very flexible multilanguage IDE written in Java. Additional functionality and language support are provided by plug-in modules such as PyDev for Python development. | A rather steep learning curve and a project/package model for capturing project components that may not be suitable for everyone. |
PythonWin | Windows | Yes | Provided with the ActiveState Python distribution. Includes most of the same capabilities as Idle. | Specifically for the Windows platform. |
WingIDE | Linux, Windows, others | No ($$) | Lots of functionality specifically geared toward Python development and debugging. | Python-specific, although the editor can, of course, be used with other languages. The interface can be somewhat busy and cluttered, so spending time with the configuration is usually necessary. |
Debuggers allow a software developer to see inside the software, so to speak, while it is running. While one could perhaps argue that a debugger is seldom, if ever, actually necessary, a good one can save a lot of time and quickly expose serious problems in a program. However, as with any addictive substance, a debugger may be good in moderation, but it can develop into a serious dependency problem if one is not careful.
What, exactly, can one do with a debugger? For starters, a debugger allows the developer to set “breakpoints” in the code by selecting a particular line in the source listing. When the program execution reaches that point, it is halted and the local variables may be examined. A debugger also provides the ability to step through the code, one line at a time. If the debugger supports the concept of a “watch,” specific variables may be selected and their values displayed to the developer at breakpoints or while stepping through the code.
A debugger is, by necessity, language-specific—there is no “one size fits all” debugger currently available, although there are some “shells” that provide a similar interface across several languages.
For Python, the Boa, Idle, Eclipse, and WingIDE tools provide capable debuggers. A standalone Python debugger, Winpdb, is also available, and Python itself ships with an integrated command-line debugger, pdb.
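As a rough illustration of what halting, examining variables, and stepping look like in practice, the following sketch drives pdb from a script instead of the keyboard. It assumes Python 3; the function scale_reading and the helper run_scripted_session are hypothetical names invented for this example. Normally you would type the same commands (p, next, continue) at the (Pdb) prompt yourself.

```python
import io
import pdb


def scale_reading(raw, gain=2.0, offset=0.5):
    """Hypothetical conversion routine we want to look inside."""
    scaled = raw * gain
    result = scaled + offset
    return result


def run_scripted_session(commands):
    """Run scale_reading(10) under pdb, feeding it canned commands.

    The debugger halts at the first line of the function, just as it
    would at a breakpoint, and then reads one command per line.
    """
    script = io.StringIO("\n".join(commands) + "\n")
    output = io.StringIO()
    debugger = pdb.Pdb(
        stdin=script,    # commands come from here instead of the keyboard
        stdout=output,   # prompts and replies are collected here
        nosigint=True,   # don't install a Ctrl-C handler for this demo
        readrc=False,    # ignore any personal .pdbrc file
    )
    value = debugger.runcall(scale_reading, 10)
    return value, output.getvalue()


value, transcript = run_scripted_session([
    "p raw",      # examine an argument at the halt point
    "next",       # execute one line: scaled = raw * gain
    "p scaled",   # examine the newly assigned local variable
    "continue",   # let the function run to completion
])
print(transcript)
print("returned:", value)
```

For interactive use, the more common approach is simply to start a script under the debugger with python -m pdb followed by the script name, and then enter the commands by hand.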
This concludes our brief tour of Python. You should now have a general feel for what Python looks like and what it is capable of. I have intentionally glossed over many aspects of the language, because, after all, this book is not a tutorial on Python. As I stated going in, there are many excellent books available that can provide copious amounts of detail, and the official Python website is the authoritative source of all things Python. As we go along we will encounter other features of the language, and we will examine them when the need arises.
If you would like to get deeper into the realm of Python programming, the following books would be good places to start:
A compact reference that’s very handy to have on the desk when you’re working with Python. Well organized and easy to use, this is an essential reference work when you need to look up something in a hurry and want more than a pocket reference, but less than a massive tome.
A comprehensive introduction to Python and a massive reference, this 1,600-page book covers everything from string methods to GUI programming. The one book anyone working with Python should have.
In addition to the URL references already provided in this chapter, there are numerous other online resources available for Python, including the following:
This site hosts the complete text of Mark Pilgrim’s book Dive Into Python, also available as a PDF download. The book takes a learn-by-doing approach and uses numerous examples to illustrate key concepts and techniques.
Fredrik Lundh’s blog site. Here you can find hundreds of articles on Python, downloadable and viewable books, and some software to examine and try. The articles are well written and interesting to browse, and they are useful for the insights they provide into the language and its uses.