As mentioned, os is the larger of the two core system modules. It
contains all of the usual operating-system calls you may have used in
your C programs and shell scripts. Its calls deal with directories,
processes, shell variables, and the like. Technically, this module
provides POSIX tools (POSIX being a portable standard for
operating-system calls) along with platform-independent directory
processing tools as the nested module os.path. Operationally, os
serves as a largely portable interface to your computer’s system
calls: scripts written with os and os.path can usually be run
unchanged on any platform.
In fact, if you read the os module’s source code, you’ll notice that
it really just imports whatever platform-specific system module you
have on your computer (e.g., nt, mac, posix). See the os.py file in
the Python source library directory; it simply runs a
from ... import * statement to copy all names out of a
platform-specific module. By always importing os rather than
platform-specific modules, though, your scripts are mostly immune to
platform implementation differences. On some platforms, os includes
extra tools available just for that platform (e.g., low-level process
calls on Unix); by and large, though, it is as cross-platform as is
technically feasible.
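One way to see this arrangement for yourself is to ask os which platform module it selected. The following is a minimal sketch, assuming a current Python 3 interpreter rather than the older release used for this chapter's sessions; the variable names are made up for illustration.

```python
import os

# os.name identifies the platform-specific module that os drew its
# calls from (for example 'posix' or 'nt').
platform_module = os.name

# os.path is bound to a matching path module chosen the same way.
path_module = os.path.__name__

print(platform_module, path_module)
```

The exact values printed depend on the machine running the script, which is the point: the import of os is the same everywhere, and the platform choice happens behind it.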
Let’s take a quick look at the basic interfaces in os. As a preview,
Table 3-1 summarizes some of the most commonly used tools in the os
module organized by functional area.
Table 3-1. Commonly used os module tools
Tasks | Tools |
---|---|
Shell variables | os.environ |
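Running programs | os.system, os.popen, os.startfile |
Spawning processes | os.fork, os.pipe, os.execlp, os.spawnv, os.waitpid |
Descriptor files, locks | os.open, os.read, os.write |
File processing | os.remove, os.rename, os.mkfifo, os.mkdir, os.rmdir |
Administrative tools | os.getcwd, os.chdir, os.getpid |
Portability tools | os.sep, os.pathsep, os.curdir, os.path.split, os.path.join |
Pathname tools | os.path.exists, os.path.isdir, os.path.getsize |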
If you inspect this module’s attributes interactively, you get a huge list of names that will vary per Python release, will likely vary per platform, and isn’t incredibly useful until you’ve learned what each name means (I’ve removed most of this list to save space—run the command on your own):
>>> import os
>>> dir(os)
['F_OK', 'O_APPEND', 'O_BINARY', 'O_CREAT', 'O_EXCL', 'O_NOINHERIT',
'O_RANDOM', 'O_RDONLY', 'O_RDWR', 'O_SEQUENTIAL', 'O_SHORT_LIVED',
'O_TEMPORARY', 'O_TEXT', 'O_TRUNC', 'O_WRONLY', 'P_DETACH', 'P_NOWAIT',
...
...10 lines removed here...
...
'popen4', 'putenv', 'read', 'remove', 'removedirs', 'rename', 'renames',
'rmdir', 'sep', 'spawnl', 'spawnle', 'spawnv', 'spawnve', 'startfile',
'stat', 'stat_float_times', 'stat_result', 'statvfs_result', 'strerror',
'sys', 'system', 'tempnam', 'times', 'tmpfile', 'tmpnam', 'umask',
'unlink', 'unsetenv', 'urandom', 'utime', 'waitpid', 'walk', 'write']
Besides all of these, the nested os.path
module exports even more tools,
most of which are related to processing file and directory names
portably:
>>> dir(os.path)
['__all__', '__builtins__', '__doc__', '__file__', '__name__', 'abspath',
'altsep', 'basename', 'commonprefix', 'curdir', 'defpath', 'devnull', 'dirname',
'exists', 'expanduser', 'expandvars', 'extsep', 'getatime', 'getctime', 'getmtime',
'getsize', 'isabs', 'isdir', 'isfile', 'islink', 'ismount', 'join', 'lexists',
'normcase', 'normpath', 'os', 'pardir', 'pathsep', 'realpath', 'sep', 'split',
'splitdrive', 'splitext', 'splitunc', 'stat', 'supports_unicode_filenames', 'sys',
'walk']
Just in case those massive listings aren’t quite enough to go on,
let’s experiment interactively with some of the simpler os tools.
Like sys, the os module comes with a collection of informational and
administrative tools:
>>> os.getpid()
-510737
>>> os.getcwd()
'C:\\PP3rdEd\\Examples\\PP3E\\System'
>>> os.chdir(r'c:\temp')
>>> os.getcwd()
'c:\\temp'
As shown here, the os.getpid
function gives the calling
process’s process ID (a unique system-defined identifier for a
running program), and os.getcwd
returns the current working directory. The current working directory
is where files opened by your script are assumed to live, unless
their names include explicit directory paths. That’s why earlier I
told you to run the following command in the directory where
more.py lives:
C:\...\PP3E\System>python more.py more.py
The input filename argument here is given without an explicit
directory path (though you could add one to page files in another
directory). If you need to run in a different working directory,
call the os.chdir
function to
change to a new directory; your code will run relative to the new
directory for the rest of the program (or until the next os.chdir
call). This chapter will have
more to say about the notion of a current working directory, and its
relation to module imports when it explores script execution
context.
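The effect of the current working directory on relative filenames can be sketched as follows. This is an illustration only, assuming a current Python 3 interpreter; the scratch directory and the name note.txt are invented for the demo.

```python
import os
import tempfile

start_dir = os.getcwd()              # remember where we started
work_dir = tempfile.mkdtemp()        # scratch directory made just for this demo

os.chdir(work_dir)                   # relative names now resolve inside work_dir
with open('note.txt', 'w') as f:     # no directory path given: lands in the cwd
    f.write('spam\n')
created_in_work_dir = os.path.exists(os.path.join(work_dir, 'note.txt'))

os.chdir(start_dir)                  # restore the original working directory
print(created_in_work_dir)

os.remove(os.path.join(work_dir, 'note.txt'))   # clean up the scratch files
os.rmdir(work_dir)
```

Saving and restoring the starting directory, as done here, keeps a chdir call from surprising code that runs later in the same program.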
The os
module also
exports a set of names designed to make cross-platform programming
simpler. The set includes platform-specific settings for path and
directory separator characters, parent and current directory
indicators, and the characters used to terminate lines on the
underlying computer:
>>> os.pathsep, os.sep, os.pardir, os.curdir, os.linesep
(';', '\\', '..', '.', '\r\n')
os.sep is whatever character is used to separate directory components
on the platform on which Python is running; it is automatically
preset to \ on Windows, / for POSIX machines, and : on the Mac.
Similarly, os.pathsep provides the character that separates
directories on directory lists: a colon (:) for POSIX and a
semicolon (;) for DOS and Windows.
By using such attributes when composing and decomposing
system-related strings in our scripts, the scripts become fully
portable. For instance, a call of the form dirpath.split(os.sep)
will correctly split platform-specific directory names into
components, even though dirpath may look like dir\dir on Windows,
dir/dir on Linux, and dir:dir on Macintosh. As previously mentioned,
on Windows you can usually use forward slashes rather than backward
slashes when giving filenames to be opened; but these portability
constants allow scripts to be platform neutral in directory
processing code.
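As a small illustration of splitting and rebuilding a path with os.sep, consider the sketch below. It assumes a current Python 3 interpreter, and the helper name split_components is hypothetical.

```python
import os

def split_components(dirpath):
    """Split a directory path on the host platform's separator."""
    return dirpath.split(os.sep)

# Build a path with the local separator, then take it apart again.
parts = ['dir1', 'dir2', 'file.txt']
path = os.sep.join(parts)

print(path)
print(split_components(path))
```

Because both the join and the split use os.sep, the round trip works the same way whatever separator the platform happens to use.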
The nested module os.path provides a large set of directory-related
tools of its own. For example, it includes portable functions for
tasks such as checking a file’s type (isdir, isfile, and others),
testing file existence (exists), and fetching the size of a file by
name (getsize):
>>> os.path.isdir(r'C:\temp'), os.path.isfile(r'C:\temp')
(True, False)
>>> os.path.isdir(r'C:\config.sys'), os.path.isfile(r'C:\config.sys')
(False, True)
>>> os.path.isdir('nonesuch'), os.path.isfile('nonesuch')
(False, False)
>>> os.path.exists(r'c:\temp\data.txt')
False
>>> os.path.getsize(r'C:\autoexec.bat')
260
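The same tests can be tried on any platform without depending on particular Windows files. The following sketch assumes a current Python 3 interpreter; the scratch directory and filenames are created only for the demonstration.

```python
import os
import tempfile

scratch = tempfile.mkdtemp()                     # a directory we know exists
data_file = os.path.join(scratch, 'data.txt')
with open(data_file, 'w') as f:
    f.write('spam')                              # four characters, four bytes

dir_checks = (os.path.isdir(scratch), os.path.isfile(scratch))
file_checks = (os.path.isdir(data_file), os.path.isfile(data_file))

missing = os.path.join(scratch, 'nonesuch')      # never created
missing_checks = (os.path.isdir(missing), os.path.isfile(missing),
                  os.path.exists(missing))
size = os.path.getsize(data_file)

print(dir_checks, file_checks, missing_checks, size)

os.remove(data_file)                             # clean up the scratch files
os.rmdir(scratch)
```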
The os.path.isdir
and
os.path.isfile
calls tell us
whether a filename is a directory or a simple file; both return
False
if the named file does not
exist. We also get calls for splitting and joining directory path
strings, which automatically use the directory name conventions on
the platform on which Python is running:
>>> os.path.split(r'C:\temp\data.txt')
('C:\\temp', 'data.txt')
>>> os.path.join(r'C:\temp', 'output.txt')
'C:\\temp\\output.txt'
>>> name = r'C:\temp\data.txt'                  # Windows paths
>>> os.path.basename(name), os.path.dirname(name)
('data.txt', 'C:\\temp')
>>> name = '/home/lutz/temp/data.txt'           # Unix-style paths
>>> os.path.basename(name), os.path.dirname(name)
('data.txt', '/home/lutz/temp')
>>> os.path.splitext(r'C:\PP3rdEd\Examples\PP3E\PyDemos.pyw')
('C:\\PP3rdEd\\Examples\\PP3E\\PyDemos', '.pyw')
os.path.split separates a filename from its directory path, and
os.path.join puts them back together, all in entirely portable
fashion using the path conventions of the machine on which they are
called. The basename and dirname calls here return the second and
first items returned by a split simply as a convenience, and
splitext strips the file extension (after the last .). The normpath
call comes in handy if your paths become a jumble of Unix and Windows
separators:
>>> mixed
'C:\\temp\\public/files/index.html'
>>> os.path.normpath(mixed)
'C:\\temp\\public\\files\\index.html'
>>> print os.path.normpath(r'C:\temp\\sub\.\file.ext')
C:\temp\sub\file.ext
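To see how these path tools fit together on whatever platform you are using, here is a short sketch. It assumes a current Python 3 interpreter, and the path names are invented for illustration.

```python
import os

name = os.path.join('temp', 'data.txt')          # separator chosen for this platform
head, tail = os.path.split(name)                 # directory part, file part
root, ext = os.path.splitext(name)               # path without extension, extension

# dirname and basename give the same two pieces that split returns.
same_as_split = (os.path.dirname(name), os.path.basename(name)) == (head, tail)

rebuilt = os.path.join(head, tail)               # join undoes split

# normpath tidies redundant '.' and 'sub/..' steps out of a path.
tidy = os.path.normpath(os.path.join('temp', '.', 'sub', '..', 'data.txt'))

print(head, tail, ext, rebuilt, tidy)
```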
This module also has an abspath
call that portably returns the
full directory pathname of a file; it accounts for adding the
current directory, ..
parents,
and more:
>>> os.getcwd()
'C:\\PP3rdEd\\cdrom\\WindowsExt'
>>> os.path.abspath('temp')                     # expand to full pathname
'C:\\PP3rdEd\\cdrom\\WindowsExt\\temp'
>>> os.path.abspath(r'..\examples')             # relative paths expanded
'C:\\PP3rdEd\\cdrom\\examples'
>>> os.path.abspath(r'C:\PP3rdEd\chapters')     # absolute paths unchanged
'C:\\PP3rdEd\\chapters'
>>> os.path.abspath(r'C:\temp\spam.txt')        # ditto for filenames
'C:\\temp\\spam.txt'
>>> os.path.abspath('')                         # empty string means the cwd
'C:\\PP3rdEd\\cdrom\\WindowsExt'
Because filenames are relative to the current working
directory when they aren’t fully specified paths, the os.path.abspath
function helps if you want
to show users what directory is truly being used to store a file. On
Windows, for example, when GUI-based programs are launched by
clicking on file explorer icons and desktop shortcuts, the execution
directory of the program is the clicked file’s home directory, but
that is not always obvious to the person doing the clicking;
printing a file’s abspath
can
help.
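A script that wants to report where a file will really be stored might do something like the following. This is a minimal sketch, assuming a current Python 3 interpreter; the filename spam.txt is hypothetical and is never actually created.

```python
import os

cwd = os.getcwd()
full = os.path.abspath('spam.txt')       # relative name: the cwd is prepended
parent = os.path.abspath(os.pardir)      # '..' resolved against the cwd
empty = os.path.abspath('')              # empty string means the cwd itself

print('File will be saved as:', full)
```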
The os module is also the place where we run shell commands from
within Python scripts. This concept is intertwined with others we
won’t cover until later in this chapter, but since this is a key
concept employed throughout this part of the book, let’s take a quick
first look at the basics here. Two os functions allow scripts to run
any command line that you can type in a console window:
os.system
Runs a shell command from a Python script
os.popen
Runs a shell command and connects to its input or output streams
To understand the scope of these calls, we first need to define a few terms. In this text, the term shell means the system that reads and runs command-line strings on your computer, and shell command means a command-line string that you would normally enter at your computer’s shell prompt.
For example, on Windows, you can start an MS-DOS console
window and type DOS commands there—commands such as dir
to get a directory listing, and
type
to view a file, names of
programs you wish to start, and so on. DOS is the system shell,
and commands such as dir
and
type
are shell commands. On
Linux, you can start a new shell session by opening an xterm
window and typing shell commands there too—ls
to list directories, cat
to view files, and so on. A variety
of shells are available on Unix (e.g., csh, ksh), but they all
read and run command lines. Here are two shell commands typed and
run in an MS-DOS console box on Windows:
C:\temp>dir /B                   ...type a shell command line
about-pp.html                    ...its output shows up here
python1.5.tar.gz                 ...DOS is the shell on Windows
about-pp2e.html
about-ppr2e.html
newdir

C:\temp>type helloshell.py
# a Python program
print 'The Meaning of Life'
None of this is directly related to Python, of
course (despite the fact that Python command-line scripts are
sometimes confusingly called “shell tools”). But because the
os
module’s system
and popen
calls let Python scripts run any
sort of command that the underlying system shell understands, our
scripts can make use of every command-line tool available on the
computer, whether it’s coded in Python or not. For example, here
is some Python code that runs the two DOS shell commands typed at
the shell prompt shown previously:
C:\temp>python
>>> import os
>>> os.system('dir /B')
about-pp.html
python1.5.tar.gz
about-pp2e.html
about-ppr2e.html
newdir
0
>>> os.system('type helloshell.py')
# a Python program
print 'The Meaning of Life'
0
The 0s at the end here are just the return values of the system call
itself (the status reported for the command that was run). The system
call can be used to run any command line that we could type at the
shell’s prompt (here, C:\temp>). The command’s output normally shows
up in the Python session’s or program’s standard output stream.
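For a version of this experiment that does not depend on DOS commands, the sketch below runs an echo command, which is assumed here to be understood by both the Windows command shell and Unix shells. It is written for a current Python 3 interpreter.

```python
import os

# The command's output goes straight to this program's standard output;
# os.system only hands back a status value.
status = os.system('echo The Meaning of Life')

print('status returned by os.system:', status)
```

A status of zero conventionally means the command ran without reporting an error.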
But what if we want to grab a command’s output within a script? The
os.system call simply runs a shell command line, but os.popen also
connects to the standard input or output streams of the command; we
get back a file-like object connected to the command’s output by
default (if we pass a 'w' mode flag to popen, we connect to the
command’s input stream instead). By using this object to read the
output of a command spawned with popen, we can intercept the text
that would normally appear in the console window where a command
line is typed:
>>> open('helloshell.py').read()
"# a Python program\nprint 'The Meaning of Life'\n"
>>> text = os.popen('type helloshell.py').read()
>>> text
"# a Python program\nprint 'The Meaning of Life'\n"
>>> listing = os.popen('dir /B').readlines()
>>> listing
['about-pp.html\n', 'python1.5.tar.gz\n', 'helloshell.py\n',
'about-pp2e.html\n', 'about-ppr2e.html\n', 'newdir\n']
Here, we first fetch a file’s content the usual way (using
Python files), then as the output of a shell type command. Reading
the output of a dir
command
lets us get a listing of files in a directory that we can then
process in a loop (we’ll learn other ways to obtain such a list in
the next chapter). So far, we’ve run basic DOS commands; because
these calls can run any command line that we can type at a shell
prompt, they can also be used to launch other Python
scripts:
>>> os.system('python helloshell.py')          # run a Python program
The Meaning of Life
0
>>> output = os.popen('python helloshell.py').read()
>>> output
'The Meaning of Life\n'
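Here is a comparable popen experiment that should run on either Windows or Unix-like systems, since it relies only on an echo command. It is a sketch written for a current Python 3 interpreter, not the release used in the sessions above.

```python
import os

pipe = os.popen('echo The Meaning of Life')      # read mode is the default
text = pipe.read()                               # everything the command printed
exit_status = pipe.close()                       # None when the command succeeded

# The returned object can also be used in a with statement and read by lines.
with os.popen('echo spam') as pipe2:
    lines = pipe2.readlines()

print(repr(text), exit_status, lines)
```

Unlike os.system, nothing from the echo commands reaches the console here; the text arrives in the script as ordinary strings.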
In all of these examples, the command-line strings sent to
system
and popen
are hardcoded, but there’s no
reason Python programs could not construct such strings at runtime
using normal string operations (+, %, etc.). Given that commands
can be dynamically built and run this way, system
and popen
turn Python scripts into flexible
and portable tools for launching and orchestrating other programs.
For example, a Python test “driver” script can be used to run
programs coded in any language (e.g., C++, Java, Python) and
analyze their output. We’ll explore such a script in Chapter 6.
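A tiny sketch of the idea of building command lines at runtime follows. It assumes a current Python 3 interpreter and uses echo as a stand-in for the programs under test; the function name run_and_capture is hypothetical. Because the strings are handed to the system shell, they should come only from trusted sources.

```python
import os

def run_and_capture(words):
    """Build an echo command line for each word and collect its output."""
    results = {}
    for word in words:
        cmdline = 'echo %s' % word               # command string built at runtime
        with os.popen(cmdline) as pipe:
            results[word] = pipe.read().strip()  # captured output, newline removed
    return results

outputs = run_and_capture(['spam', 'eggs'])
print(outputs)
```

A real test driver would substitute the programs to be run for echo and compare each captured output against an expected result.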
You should keep in mind two limitations of system and popen. First,
although these two functions themselves are fairly portable, their
use is really only as portable as the commands that they run. The
preceding examples that run DOS dir and type shell commands, for
instance, work only on Windows, and would have to be changed in order
to run ls and cat commands on Unix-like platforms.
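One way a script might cope with this limitation is to pick the command line according to the platform. The sketch below assumes a current Python 3 interpreter, treats os.name == 'nt' as meaning Windows, and uses a hypothetical helper named listing_command.

```python
import os
import tempfile

def listing_command(dirpath):
    """Pick a directory-listing command line for the current platform."""
    if os.name == 'nt':
        return 'dir /B "%s"' % dirpath           # Windows command shell
    return 'ls "%s"' % dirpath                   # Unix-like shells

# Try it on a scratch directory holding one known file.
scratch = tempfile.mkdtemp()
sample = os.path.join(scratch, 'spam.txt')
open(sample, 'w').close()

with os.popen(listing_command(scratch)) as pipe:
    names = [line.rstrip('\n') for line in pipe]

print(names)

os.remove(sample)                                # clean up the scratch files
os.rmdir(scratch)
```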
Second, it is important to remember that running Python
files as programs this way is very different and generally much
slower than importing program files and calling functions they
define. When os.system
and
os.popen
are called, they must
start a brand-new, independent program running on your operating
system (they generally run the command in a newly forked process).
When importing a program file as a module, the Python interpreter
simply loads and runs the file’s code in the same process in order
to generate a module object. No other program is spawned along the
way.
There are good reasons to build systems as separate programs too, and we’ll later explore things such as command-line arguments and streams that allow programs to pass information back and forth. But for most purposes, imported modules are a faster and more direct way to compose systems.
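The two routes can be compared side by side with a throwaway module. This is only a sketch, assuming a current Python 3 interpreter whose path is available as sys.executable; the module name hellomod and its function meaning are invented for the demo.

```python
import importlib
import os
import shutil
import sys
import tempfile

scratch = tempfile.mkdtemp()
script = os.path.join(scratch, 'hellomod.py')    # hypothetical module file
with open(script, 'w') as f:
    f.write("def meaning():\n"
            "    return 'The Meaning of Life'\n"
            "if __name__ == '__main__':\n"
            "    print(meaning())\n")

# Route 1: spawn a separate Python program and read what it prints.
with os.popen('"%s" "%s"' % (sys.executable, script)) as pipe:
    spawned = pipe.read().strip()

# Route 2: import the same file and call its function in this process.
sys.path.insert(0, scratch)
hellomod = importlib.import_module('hellomod')
imported = hellomod.meaning()
sys.path.remove(scratch)

print(spawned, '|', imported)
shutil.rmtree(scratch)                           # remove the scratch directory
```

Both routes produce the same text, but only the first one starts a new program on the operating system.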
If you plan to use these calls in earnest, you should also
know that the os.system
call
normally blocks—that is, pauses—its caller until the spawned
command line exits. On Linux and Unix-like platforms, the spawned
command can generally be made to run independently and in parallel
with the caller by adding an &
shell background operator at the
end of the command line:
os.system("python program.py arg arg &")
On Windows, spawning with a DOS start
command will usually launch the
command in parallel too:
os.system("start program.py arg arg")
In fact, this is so useful that an os.startfile
call was added in recent
Python releases. This call opens a file with whatever program is
listed in the Windows registry for the file’s type—as though its
icon has been clicked with the mouse cursor:
os.startfile("webpage.html") # open file in your web browser
os.startfile("document.doc") # open file in Microsoft Word
os.startfile("myscript.py") # run file with Python
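The platform-specific ways of launching a command without waiting for it can be wrapped in one place. The sketch below assumes a current Python 3 interpreter and treats os.name == 'nt' as meaning Windows; the helper names background_command and launch are hypothetical.

```python
import os

def background_command(cmdline):
    """Return a command line that asks the shell not to wait for cmdline."""
    if os.name == 'nt':
        return 'start ' + cmdline                # DOS start command
    return cmdline + ' &'                        # Unix shell background operator

def launch(cmdline):
    """Run cmdline through os.system using the non-waiting form."""
    return os.system(background_command(cmdline))

print(background_command('python program.py arg arg'))
```

Calling launch with a command line such as the one printed here would start the program and return to the caller without waiting for it to finish.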
The os.popen call does not generally block its caller (by definition,
the caller must be able to read or write the file object returned),
but callers may still occasionally become blocked under both Windows
and Linux if the pipe object is closed (e.g., when it is garbage
collected) before the spawned program exits or the pipe is read
exhaustively (e.g., with its read() method). As we will see in the
next chapter, the Unix os.fork/exec and Windows os.spawnv calls can
also be used to run parallel programs without blocking.
Because the os
module’s
system
and popen
calls also fall under the category
of program launchers, stream redirectors, and cross-process
communication devices, they will show up again in later parts of
this chapter and in the following chapters, so we’ll defer further
details for the time being. If you’re looking for more details
right away, see the stream redirection section in this chapter and
the directory listings section in the next.
Since most other os
module
tools are even more difficult to appreciate outside the context of
larger application topics, we’ll postpone a deeper look at them
until later sections. But to let you sample the flavor of this
module, here is a quick preview for reference. Among the os
module’s other weapons are
these:
os.environ
Fetches and sets shell environment variables
os.fork
Spawns a new child process on Unix
os.pipe
Communicates between programs
os.execlp
Starts new programs
os.spawnv
Starts new programs with lower-level control
os.open
Opens a low-level descriptor-based file
os.mkdir
Creates a new directory
os.mkfifo
Creates a new named pipe
os.stat
Fetches low-level file information
os.remove
Deletes a file by its pathname
os.path.walk, os.walk
Applies a function or loop body to all parts of an entire directory tree
And so on. One caution up front: the os module provides a set of file
open, read, and write calls, but all of these deal with low-level
file access and are entirely distinct from Python’s built-in stdio
file objects that we create with the built-in open function. You
should normally use the built-in open function (not the os module)
for all but very special file-processing needs (e.g., opening with
exclusive access file locking).
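The difference between the two kinds of file tools can be sketched as follows. This example assumes a current Python 3 interpreter and uses the O_EXCL flag, which makes creation fail if the file already exists, as one illustration of a special need; the filename is invented for the demo.

```python
import os
import tempfile

scratch = tempfile.mkdtemp()
path = os.path.join(scratch, 'lowlevel.txt')

# Low-level route: an integer descriptor, with bytes written explicitly.
# O_CREAT together with O_EXCL refuses to open a file that already exists.
flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
fd = os.open(path, flags)
os.write(fd, b'spam\n')
os.close(fd)

try:
    os.open(path, flags)                         # the file now exists
    second_open_failed = False
except FileExistsError:
    second_open_failed = True

# Usual route: the built-in open returns a file object with methods.
with open(path) as f:
    text = f.read()

print(second_open_failed, repr(text))

os.remove(path)                                  # clean up the scratch files
os.rmdir(scratch)
```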
Throughout this chapter, we will apply sys
and os
tools such as these to implement common
system-level tasks, but this book doesn’t have space to provide an
exhaustive list of the contents of modules we will meet along the
way. If you have not already done so, you should become acquainted
with the contents of modules such as os
and sys
by consulting the Python library
manual. For now, let’s move on to explore additional system tools in
the context of broader system programming concepts.