Many moons ago (about 10 years), I used machines that had no tools for bundling files into a single package for easy transport. Here is the situation: you have a large set of text files lying around that you need to transfer to another computer. These days, tools like tar are widely available for packaging many files into a single file that can be copied, uploaded, mailed, or otherwise transferred in a single step. As mentioned in an earlier footnote, even Python itself has grown to support zip and tar archives in the standard library (see the zipfile and tarfile modules in the library reference).
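For comparison with the hand-rolled packer that follows, here is a minimal sketch of bundling files with the standard library's zipfile module. It is written for Python 3 rather than the Python 2 used in this chapter's listings, and the function name bundle is hypothetical, chosen only for illustration.

```python
import os
import zipfile

def bundle(filenames, archive_path):
    """Write each named file into one zip archive (illustrative helper)."""
    with zipfile.ZipFile(archive_path, 'w') as archive:
        for name in filenames:
            # store under the base name so the archive has no directory prefix
            archive.write(name, arcname=os.path.basename(name))
```

Calling bundle(['spam.txt', 'eggs.txt'], 'packed.zip') would produce a single archive file, assuming those two files exist in the current directory.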
Before I managed to install such tools on my PC, though, portable Python scripts served just as well. Example 6-6 copies all of the files listed on the command line to the standard output stream, separated by marker lines.
Example 6-6. PP3E\System\App\Clients\textpack.py
#!/usr/local/bin/python
import sys                            # load the system module
marker = ':'*10 + 'textpak=>'         # hopefully unique separator

def pack():
    for name in sys.argv[1:]:         # for all command-line arguments
        input = open(name, 'r')       # open the next input file
        print marker + name           # write a separator line
        print input.read(),           # and write the file's contents

if __name__ == '__main__': pack()     # pack files listed on cmdline
The first line in this file is a Python comment (#...), but it also gives the path to the Python interpreter using the Unix executable-script trick discussed in Chapter 3. If we give textpack.py executable permission with a Unix chmod command, we can pack files by running this program file directly from a shell console and redirecting its standard output stream to the file in which we want the packed archive to show up:
C:\...\PP3E\System\App\Clients\test>type spam.txt
SPAM
spam

C:\...\...\test>python ..\textpack.py spam.txt eggs.txt ham.txt > packed.all

C:\...\...\test>type packed.all
::::::::::textpak=>spam.txt
SPAM
spam
::::::::::textpak=>eggs.txt
EGGS
::::::::::textpak=>ham.txt
ham
Running the program this way creates a single output file called packed.all, which contains all three input files, with a header line giving the original file’s name before each file’s contents. Combining many files into one file in this way makes it easy to transfer in a single step—only one file need be copied to floppy, emailed, and so on. If you have hundreds of files to move, this can be a big win.
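The same packing idea can be sketched as a function that takes its filenames and output stream as arguments instead of reading sys.argv and printing. This is a Python 3 sketch, not the book's listing; the uppercase MARKER name and the output parameter are my own assumptions for illustration.

```python
import sys

MARKER = ':' * 10 + 'textpak=>'     # same separator text as textpack.py

def pack(filenames, output=sys.stdout):
    """Copy each named text file to output, preceded by a marker line."""
    for name in filenames:
        output.write(MARKER + name + '\n')      # header line naming the file
        with open(name, 'r') as infile:
            output.write(infile.read())         # then the file's contents
```

As in the original script, a file that does not end in a newline would leave the next marker glued to its last line, so this format is only safe for ordinary text files.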
After such a file is transferred, though, it must somehow be unpacked on the receiving end to re-create the original files. To do so, we need to scan the combined file line by line, watching for header lines left by the packer to know when a new file’s contents begin. Another simple Python script, shown in Example 6-7, does the trick.
Example 6-7. PP3E\System\App\Clients\textunpack.py
#!/usr/local/bin/python
import sys
from textpack import marker                    # use common separator key
mlen = len(marker)                             # filenames after markers

for line in sys.stdin.readlines():             # for all input lines
    if line[:mlen] != marker:
        print line,                            # write real lines
    else:
        sys.stdout = open(line[mlen:-1], 'w')  # or make new output file
We could code this in a function like we did in textpack, but there is little point in doing so here; as written, the script relies on standard streams, not function parameters. Run this in the directory where you want unpacked files to appear, with the packed archive file piped in on the command line as the script’s standard input stream:
C:\...\...\test\unpack>python ..\..\textunpack.py < ..\packed.all

C:\...\...\test\unpack>ls
eggs.txt  ham.txt  spam.txt

C:\...\...\test\unpack>type spam.txt
SPAM
spam
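For readers who prefer function parameters to reassigning sys.stdout, the unpacking scan might look like the following Python 3 sketch. The MARKER name, the target_dir parameter, and the returned list of names are assumptions added for illustration. Unlike the script above, this version silently skips any lines that appear before the first marker.

```python
import os

MARKER = ':' * 10 + 'textpak=>'     # must match the packer's separator

def unpack(lines, target_dir='.'):
    """Re-create files from packed lines; return the names written."""
    created = []
    output = None
    try:
        for line in lines:
            if line.startswith(MARKER):
                if output is not None:
                    output.close()              # finish the previous file
                name = line[len(MARKER):].rstrip('\n')
                output = open(os.path.join(target_dir, name), 'w')
                created.append(name)
            elif output is not None:
                output.write(line)              # a real content line
    finally:
        if output is not None:
            output.close()                      # close the last file too
    return created
```

Because any iterable of lines works, an open file object or sys.stdin can be passed directly as the first argument.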
So far so good; the textpack and textunpack scripts made it easy to move lots of files around without lots of manual intervention. They are prime examples of what are often called tactical scripts: programs you code quickly for a specific task.
But after playing with these and similar scripts for a while, I began to see commonalities that almost cried out for reuse. For instance, almost every shell tool I wrote had to scan command-line arguments, redirect streams to a variety of sources, and so on. Further, almost every command-line utility wound up with a different command-line option pattern, because each was written from scratch.
The following few classes are one solution to such problems. They define a class hierarchy that is designed for reuse of common shell tool code. Moreover, because of the reuse going on, every program that ties into its hierarchy sports a common look-and-feel in terms of command-line options, environment variable use, and more. As usual with object-oriented systems, once you learn which methods to overload, such a class framework provides a lot of work and consistency for free.
And once you start thinking in such ways, you make the leap to more strategic development modes, writing code with broader applicability and reuse in mind. The module in Example 6-8, for instance, adapts the textpack script’s logic for integration into this hierarchy.
Example 6-8. PP3E\System\App\Clients\packapp.py
#!/usr/local/bin/python
######################################################
# pack text files into one, separated by marker line;
# % packapp.py -v -o target src src...
# % packapp.py *.txt -o packed1
# >>> apptools.appRun('packapp.py', args...)
# >>> apptools.appCall(PackApp, args...)
######################################################

from textpack import marker
from PP3E.System.App.Kinds.redirect import StreamApp

class PackApp(StreamApp):
    def start(self):
        StreamApp.start(self)
        if not self.args:
            self.exit('packapp.py [-o target]? src src...')

    def run(self):
        for name in self.restargs():
            try:
                self.message('packing: ' + name)
                self.pack_file(name)
            except:
                self.exit('error processing: ' + name)

    def pack_file(self, name):
        self.setInput(name)
        self.write(marker + name + '\n')
        while 1:
            line = self.readline()
            if not line: break
            self.write(line)

if __name__ == '__main__': PackApp().main()
Here, PackApp inherits members and methods that handle operating system services, command-line processing, and input/output stream redirection from the StreamApp class, imported from another Python module file (listed in Example 6-10). StreamApp provides a “read/write” interface to redirected streams and a standard “start/run/stop” script execution protocol. PackApp simply redefines the start and run methods for its own purposes and reads and writes itself to access its standard streams. Most low-level system interfaces are hidden by the StreamApp class; in OOP terms, we say they are encapsulated.
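The “start/run/stop” protocol is an instance of a general pattern: the superclass fixes the order of the steps, and subclasses fill in only the step they care about. The following Python 3 sketch shows the pattern in isolation; MiniApp and Shout are hypothetical classes invented for this illustration and are not part of the book's framework.

```python
class MiniApp:
    """Hypothetical stand-in for a framework root class."""
    def __init__(self):
        self.log = []                  # records the order of calls

    def main(self):
        # the fixed execution protocol: subclasses fill in the steps
        self.start()
        self.run()
        return self.stop()

    def start(self):
        self.log.append('start')

    def run(self):
        raise NotImplementedError('run must be redefined')

    def stop(self):
        self.log.append('stop')
        return self.log


class Shout(MiniApp):
    def run(self):                     # only the task-specific step is coded
        self.log.append('run')
```

Calling Shout().main() runs the three steps in order, while calling main on a plain MiniApp fails because run has not been redefined.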
This module can both be run as a program and imported by a client (remember, Python sets a module’s name to __main__ when it’s run directly, so it can tell the difference). When run as a program, the last line creates an instance of the PackApp class and starts it by calling its main method, a method call exported by StreamApp to kick off a program run:
C:\...\...\test>python ..\packapp.py -v -o packedapp.all spam.txt eggs.txt ham.txt
PackApp start.
packing: spam.txt
packing: eggs.txt
packing: ham.txt
PackApp done.

C:\...\...\test>type packedapp.all
::::::::::textpak=>spam.txt
SPAM
spam
::::::::::textpak=>eggs.txt
EGGS
::::::::::textpak=>ham.txt
ham
This has the same effect as the textpack.py script, but command-line options (-v for verbose mode, -o to name an output file) are inherited from the StreamApp superclass.
The unpacker in Example 6-9 looks similar when migrated to the object-oriented framework, because the very notion of running a program has been given a standard structure.
Example 6-9. PP3E\System\App\Clients\unpackapp.py
#!/usr/bin/python
###########################################
# unpack a packapp.py output file;
# % unpackapp.py -i packed1 -v
# apptools.appRun('unpackapp.py', args...)
# apptools.appCall(UnpackApp, args...)
###########################################

from textpack import marker
from PP3E.System.App.Kinds.redirect import StreamApp

class UnpackApp(StreamApp):
    def start(self):
        StreamApp.start(self)
        self.endargs()                      # ignore more -o's, etc.

    def run(self):
        mlen = len(marker)
        while True:
            line = self.readline()
            if not line:
                break
            elif line[:mlen] != marker:
                self.write(line)
            else:
                name = line[mlen:].strip()
                self.message('creating: ' + name)
                self.setOutput(name)

if __name__ == '__main__': UnpackApp().main()
This subclass redefines the start and run methods to do the right thing for this script: prepare for and execute a file unpacking operation. All the details of parsing command-line arguments and redirecting standard streams are handled in superclasses:
C:\...\...\test\unpackapp>python ..\..\unpackapp.py -v -i ..\packedapp.all
UnpackApp start.
creating: spam.txt
creating: eggs.txt
creating: ham.txt
UnpackApp done.

C:\...\...\test\unpackapp>ls
eggs.txt  ham.txt  spam.txt

C:\...\...\test\unpackapp>type spam.txt
SPAM
spam
Running this script does the same job as the original textunpack.py, but we get command-line flags for free (-i specifies the input file). In fact, there are more ways to launch classes in this hierarchy than I have space to show here. A command-line pair, -i -, for instance, makes the script read its input from stdin, as though it were simply piped or redirected in the shell:
C:\...\...\test\unpackapp>type ..\packedapp.all | python ..\..\unpackapp.py -i -
creating: spam.txt
creating: eggs.txt
creating: ham.txt
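The convention of treating a lone dash as “use the standard stream” is easy to isolate. Here is a small Python 3 sketch of the idea; the helper name open_input and its tuple return value are assumptions made for this example, loosely modeled on what the setInput method in Example 6-10 does.

```python
import sys

def open_input(name):
    """Return (stream, label): standard input for '-', else the opened file."""
    if name == '-':
        return sys.stdin, '<stdin>'     # no file to open; reuse the real stream
    return open(name, 'r'), name        # otherwise open the named file
```

A caller should close the returned stream only when it is not sys.stdin, which mirrors the check made in StreamApp's closeApp method.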
This section lists the source code of StreamApp and App, the classes that do all of this extra work on behalf of PackApp and UnpackApp. We don’t have space to go through all of this code in detail, so be sure to study these listings on your own for more information. It’s all straight Python code.
I should also point out that the classes listed in this section are just the ones used by the object-oriented mutations of the textpack and textunpack scripts. They represent just one branch of an overall application framework class tree, which you can study in this book’s examples distribution (browse its directory, PP3E\System\App). Other classes in the tree provide command menus, internal string-based file streams, and so on. You’ll also find additional clients of the hierarchy that do things like launch other shell tools and scan Unix-style email mailbox files.
StreamApp adds a few command-line arguments (-i, -o) and input/output stream redirection to the more general App root class listed later in this section; App, in turn, defines the most general kinds of program behavior, to be inherited in Examples 6-8, 6-9, and 6-10 (that is, in all classes derived from App).
Example 6-10. PP3E\System\App\Kinds\redirect.py
################################################################################
# App subclasses for redirecting standard streams to files
################################################################################

import sys
from PP3E.System.App.Bases.app import App

################################################################################
# an app with input/output stream redirection
################################################################################

class StreamApp(App):
    def __init__(self, ifile='-', ofile='-'):
        App.__init__(self)                              # call superclass init
        self.setInput( ifile or self.name + '.in')      # default i/o filenames
        self.setOutput(ofile or self.name + '.out')     # unless '-i', '-o' args

    def closeApp(self):                                 # not __del__
        try:
            if self.input != sys.stdin:                 # may be redirected
                self.input.close()                      # if still open
        except: pass
        try:
            if self.output != sys.stdout:               # don't close stdout!
                self.output.close()                     # input/output exist?
        except: pass

    def help(self):
        App.help(self)
        print '-i <input-file |"-"> (default: stdin or per app)'
        print '-o <output-file|"-"> (default: stdout or per app)'

    def setInput(self, default=None):
        file = self.getarg('-i') or default or '-'
        if file == '-':
            self.input = sys.stdin
            self.input_name = '<stdin>'
        else:
            self.input = open(file, 'r')                # cmdarg | funcarg | stdin
            self.input_name = file                      # cmdarg '-i -' works too

    def setOutput(self, default=None):
        file = self.getarg('-o') or default or '-'
        if file == '-':
            self.output = sys.stdout
            self.output_name = '<stdout>'
        else:
            self.output = open(file, 'w')               # error caught in main()
            self.output_name = file                     # make backups too?

class RedirectApp(StreamApp):
    def __init__(self, ifile=None, ofile=None):
        StreamApp.__init__(self, ifile, ofile)
        self.streams = sys.stdin, sys.stdout
        sys.stdin  = self.input                         # for raw_input, stdin
        sys.stdout = self.output                        # for print, stdout

    def closeApp(self):                                 # not __del__
        StreamApp.closeApp(self)                        # close files?
        sys.stdin, sys.stdout = self.streams            # reset sys files

################################################################################
# to add as a mix-in (or use multiple-inheritance...)
################################################################################

class RedirectAnyApp:
    def __init__(self, superclass, *args):
        superclass.__init__(self, *args)
        self.super = superclass
        self.streams = sys.stdin, sys.stdout
        sys.stdin  = self.input                         # for raw_input, stdin
        sys.stdout = self.output                        # for print, stdout

    def closeApp(self):
        self.super.closeApp(self)                       # do the right thing
        sys.stdin, sys.stdout = self.streams            # reset sys files
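The save, reassign, and restore steps that RedirectApp performs on sys.stdout can be tried out on their own. The following Python 3 sketch packages them as a context manager; the Redirect class is a hypothetical name of mine, not part of the listing above, and Python 3's standard library already offers contextlib.redirect_stdout for the same purpose.

```python
import io
import sys

class Redirect:
    """Temporarily route print output to another file-like object."""
    def __init__(self, output):
        self.output = output

    def __enter__(self):
        self.saved = sys.stdout        # remember the real stream
        sys.stdout = self.output       # print now writes here
        return self.output

    def __exit__(self, *exc_info):
        sys.stdout = self.saved        # always restore the saved stream
        return False                   # do not swallow exceptions
```

Inside a block such as "with Redirect(io.StringIO()) as buffer:", every print call lands in the buffer, and the original stream is back in place when the block exits, even if an exception was raised.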
The top of the hierarchy knows what it means to be a shell application, but not how to accomplish a particular utility task (those parts are filled in by subclasses). App, listed in Example 6-11, exports commonly used tools in a standard and simplified interface and a customizable start/run/stop method protocol that abstracts script execution. It also turns application objects into file-like objects: when an application reads itself, for instance, it really reads whatever source its standard input stream has been assigned to by other superclasses in the tree (such as StreamApp).
Example 6-11. PP3E\System\App\Bases\app.py
################################################################################
# an application class hierarchy, for handling top-level components;
# App is the root class of the App hierarchy, extended in other files;
################################################################################

import sys, os, traceback
class AppError(Exception): pass                            # errors raised here

class App:                                                 # the root class
    def __init__(self, name=None):
        self.name    = name or self.__class__.__name__     # the lowest class
        self.args    = sys.argv[1:]
        self.env     = os.environ
        self.verbose = self.getopt('-v') or self.getenv('VERBOSE')
        self.input   = sys.stdin
        self.output  = sys.stdout
        self.error   = sys.stderr                          # stdout may be piped

    def closeApp(self):                                    # not __del__: ref's?
        pass                                               # nothing at this level

    def help(self):
        print self.name, 'command-line arguments:'         # extend in subclass
        print '-v (verbose)'

    ##############################
    # script environment services
    ##############################

    def getopt(self, tag):
        try:                                               # test "-x" command arg
            self.args.remove(tag)                          # not real argv: > 1 App?
            return 1
        except:
            return 0

    def getarg(self, tag, default=None):
        try:                                               # get "-x val" command arg
            pos = self.args.index(tag)
            val = self.args[pos+1]
            self.args[pos:pos+2] = []
            return val
        except:
            return default                                 # None: missing, no default

    def getenv(self, name, default=''):
        try:                                               # get "$x" environment var
            return self.env[name]
        except KeyError:
            return default

    def endargs(self):
        if self.args:
            self.message('extra arguments ignored: ' + repr(self.args))
            self.args = []

    def restargs(self):
        res, self.args = self.args, []                     # no more args/options
        return res

    def message(self, text):
        self.error.write(text + '\n')                      # stdout may be redirected

    def exception(self):
        return tuple(sys.exc_info()[:2])                   # the last exception type,data

    def exit(self, message='', status=1):
        if message:
            self.message(message)
        sys.exit(status)

    def shell(self, command, fork=0, inp=''):
        if self.verbose:
            self.message(command)                          # how about ipc?
        if not fork:
            os.system(command)                             # run a shell cmd
        elif fork == 1:
            return os.popen(command, 'r').read()           # get its output
        else:                                              # readlines too?
            pipe = os.popen(command, 'w')
            pipe.write(inp)                                # send it input
            pipe.close()

    #################################################
    # input/output-stream methods for the app itself;
    # redefine in subclasses if not using files, or
    # set self.input/output to file-like objects;
    #################################################

    def read(self, *size):
        return self.input.read(*size)
    def readline(self):
        return self.input.readline()
    def readlines(self):
        return self.input.readlines()
    def write(self, text):
        self.output.write(text)
    def writelines(self, text):
        self.output.writelines(text)

    ###################################################
    # to run the app
    # main() is the start/run/stop execution protocol;
    ###################################################

    def main(self):
        res = None
        try:
            self.start()
            self.run()
            res = self.stop()                              # optional return val
        except SystemExit:                                 # ignore if from exit()
            pass
        except:
            self.message('uncaught: ' + str(self.exception()))
            traceback.print_exc()
        self.closeApp()
        return res

    def start(self):
        if self.verbose: self.message(self.name + ' start.')
    def stop(self):
        if self.verbose: self.message(self.name + ' done.')
    def run(self):
        raise AppError, 'run must be redefined!'
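The getopt and getarg methods work by consuming entries from a private copy of the argument list, so whatever remains afterward is the list of plain filenames. The Python 3 sketch below restates that logic as standalone functions for experimentation. Taking the list as an explicit args parameter and returning True and False rather than 1 and 0 are my own changes, not the book's interface.

```python
def getopt(args, tag):
    """Remove flag tag from args; report whether it was present."""
    if tag in args:
        args.remove(tag)
        return True
    return False

def getarg(args, tag, default=None):
    """Remove 'tag value' from args and return value, else default."""
    if tag in args:
        pos = args.index(tag)
        if pos + 1 < len(args):         # a value must follow the tag
            value = args[pos + 1]
            del args[pos:pos + 2]       # consume both the tag and its value
            return value
    return default
```

Starting from the list ['-v', '-o', 'packed.all', 'spam.txt'], one call to getopt for '-v' and one call to getarg for '-o' leave only ['spam.txt'] behind, which is what a method like restargs would then hand to the packing loop.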
Now that I’ve listed all this code, some readers might naturally want to ask, “So why go to all this trouble?” Given the amount of extra code in the object-oriented version of these scripts, it’s a perfectly valid question. Most of the code listed in Example 6-11 is general-purpose logic, designed to be used by many applications. Still, that doesn’t explain why the packapp and unpackapp object-oriented scripts are larger than the original equivalent textpack and textunpack non-object-oriented scripts.
The answers will become more apparent after the first few times you don’t have to write code to achieve a goal, but there are some concrete benefits worth summarizing here:
StreamApp clients need not remember all the system interfaces in Python, because StreamApp exports its own unified view. For instance, arguments, streams, and shell variables are split across Python modules (e.g., sys.argv, sys.stdout, os.environ); in these classes, they are all collected in the same single place.
From the shell user’s perspective, StreamApp clients all have a common look-and-feel, because they inherit the same interfaces to the outside world from their superclasses (e.g., -i and -v flags).
As an added benefit of encapsulation, all of the common code in the App and StreamApp superclasses must be debugged only once. Moreover, localizing code in superclasses makes it easier to understand and change in the future. Only one copy of the code implements a system operation, and we’re free to change its implementation in the future without breaking code that makes use of it.
Such a framework can provide an extra precoded utility that we would otherwise have to recode in every script we write (command-line argument extraction, for instance). That holds true now and will hold true in the future; services added to the App root class become immediately usable and customizable among all applications derived from this hierarchy.
Because file access isn’t hardcoded in PackApp and UnpackApp, they can easily take on new behavior just by changing the class they inherit from. Given the right superclass, PackApp and UnpackApp could just as easily read and write to strings or sockets as to text files and standard streams.
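The last point is worth a quick demonstration. In the Python 3 sketch below, the task logic talks only to an application object's readline and write methods, so swapping in-memory string streams for files requires no change to that logic. StreamHolder and copy_upper are hypothetical names invented for this illustration; they only imitate the delegation style of the App class.

```python
import io

class StreamHolder:
    """Hypothetical minimal object that delegates reads and writes."""
    def __init__(self, input, output):
        self.input = input             # any object with readline()
        self.output = output           # any object with write()

    def readline(self):
        return self.input.readline()

    def write(self, text):
        self.output.write(text)

def copy_upper(app):
    """Task logic that talks only to the app object, never to files."""
    while True:
        line = app.readline()
        if not line:                   # empty string means end of input
            break
        app.write(line.upper())
```

Passing io.StringIO objects for both streams runs the whole task in memory, and passing open files instead would run the same code against disk.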
Although it’s not obvious until you start writing larger class-based systems, code reuse is perhaps the biggest win for class-based programs. For instance, in Chapter 11, we will reuse the object-oriented-based packer and unpacker scripts by invoking them from a menu GUI like so:
from PP3E.System.App.Clients.packapp import PackApp
...get dialog inputs, glob filename patterns
app = PackApp(ofile=output)              # run with redirected output
app.args = filenames                     # reset cmdline args list
app.main()

from PP3E.System.App.Clients.unpackapp import UnpackApp
...get dialog input
app = UnpackApp(ifile=input)             # run with input from file
app.main()                               # execute app class
Because these classes encapsulate the notion of streams, they can be imported and called, not just run as top-level scripts. Further, their code is reusable in two ways: not only do they export common system interfaces for reuse in subclasses, but they can also be used as software components, as in the previous code listing. See the PP3E\Gui\Shellgui directory for the full source code of these clients.
Python doesn’t impose object-oriented programming, of course, and you can get a lot of work done with simpler functions and scripts. But once you learn how to structure class trees for reuse, going the extra object-oriented mile usually pays off in the long run.