The email package used by the pymail example of the prior section is a collection of powerful tools; in fact, perhaps too powerful to remember completely. At the minimum, some reusable boilerplate code for common use cases can help insulate you from some of its details. To simplify email interfacing for more complex mail clients, and to further demonstrate the use of standard library email tools, I developed the custom utility modules listed in this section: a package called mailtools.
mailtools is a Python module package: a directory of code, with one module per tool class, and an initialization module run when the directory is first imported. This package’s modules are essentially just a wrapper layer above the standard library’s email package, as well as its poplib and smtplib modules. They make some assumptions about the way email is to be used, but these assumptions are reasonable and allow us to forget some of the underlying complexity of the standard library tools employed.
In a nutshell, the mailtools package provides three classes to fetch, send, and parse email messages. These classes can be used as superclasses to mix their methods into an application-specific class, or to create standalone or embedded objects that export their methods. We’ll see these classes deployed both ways in this text.
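The two deployment styles just described can be sketched as follows. This is only an illustration: FetcherTool is a hypothetical stand-in for a mailtools class rather than the package's real code, and the sketch is written to run under Python 3, unlike the Python 2 listings in this section.

```python
# Hypothetical stand-in for a mailtools class; the real classes talk to servers.
class FetcherTool:
    def fetch(self, msgnum):
        return 'message %d' % msgnum

# Style 1: mix the tool's methods into an application-specific class.
class MixinClient(FetcherTool):
    def show(self, msgnum):
        return 'showing ' + self.fetch(msgnum)          # inherited method

# Style 2: embed a standalone tool object and delegate to it.
class EmbeddingClient:
    def __init__(self):
        self.fetcher = FetcherTool()                    # embedded instance

    def show(self, msgnum):
        return 'showing ' + self.fetcher.fetch(msgnum)  # explicit delegation

if __name__ == '__main__':
    print(MixinClient().show(1))
    print(EmbeddingClient().show(1))
```

Mixing in keeps client code short, while embedding keeps the tool's names out of the client's own namespace; which is preferable depends on the client.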
One design note worth mentioning up front: none of the code in this package knows anything about the user interface it will be used in (console, GUI, web, or other), nor does it deal with issues such as threads; it is just a toolkit. As we’ll see, its clients are responsible for deciding how it will be deployed. By focusing on just email processing here, we simplify the code, as well as the programs that will use it.
As a simple example of this package’s tools in action, its selftest.py module serves as a self-test script. When run, it sends a message from you, to you, which includes the selftest.py file as an attachment. It also fetches and displays some mail headers and contents. These interfaces, along with some user-interface magic, will lead us to full-blown email clients and web sites in later chapters.
The next few sections list the mailtools source code. We won’t cover all of this package’s code in depth; study its listings for more details, and see its self-test module for a usage example. Also, flip ahead to the three clients that will use it for examples: the modified pymail2.py following this listing, the PyMailGUI client in Chapter 15, and the PyMailCGI server in Chapter 17. By sharing and reusing this package, all three systems inherit its utility, as well as any future enhancements.
The module in Example 14-21 implements the initialization logic of the mailtools package; as usual, its code is run automatically the first time a script imports through the package’s directory. Notice how this file collects the contents of all the nested modules into the directory’s namespace with from ... import * statements. Because mailtools began life as a single .py file, this provides backward compatibility for existing clients. Since this is the root module, global comments appear here as well.
Example 14-21. PP3E\Internet\Email\mailtools\__init__.py
################################################################################
# interface to mail server transfers, used by pymail, PyMailGUI and PyMailCGI;
# does loads, sends, parsing, composing, and deleting, with attachment parts,
# encoding, etc.; the parser, fetcher, and sender classes here are designed
# to be mixed-in to subclasses which use their methods, or used as embedded or
# standalone objects; also has convenience subclasses for silent mode, etc.;
# loads all if pop server doesn't do top; doesn't handle threads or UI here,
# and allows askPassword to differ per subclass; progress callback funcs get
# status; all calls raise exceptions on error--client must handle in GUI/other;
# this changed from file to package: nested modules imported here for bw compat;
# TBD: in saveparts, should file be opened in text mode for text/ contypes?
# TBD: in walkNamedParts, should we skip oddballs like message/delivery-status?
################################################################################

# collect modules here, when package dir imported directly

from mailFetcher import *
from mailSender import *
from mailParser import *

# export nested modules here, when from mailtools import *
__all__ = 'mailFetcher', 'mailSender', 'mailParser'

# test case moved to selftest.py to allow mailconfig's
# path to be set before importing nested modules above
Example 14-22 contains common superclasses for the other classes in the package. At present, these are used only to enable or disable trace message output (some clients, such as web-based programs, may not want text to be printed to the output stream). Subclasses mix in the silent variant to turn off output.
Example 14-22. PP3E\Internet\Email\mailtools\mailTool.py
###############################################################################
# common superclasses: used to turn trace messages on/off
###############################################################################

class MailTool:                    # superclass for all mail tools
    def trace(self, message):      # redef me to disable or log to file
        print message

class SilentMailTool:              # to mixin instead of subclassing
    def trace(self, message):
        pass
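The silent variant works because of attribute lookup order: when SilentMailTool is listed first among a subclass's bases, its trace is found before the one inherited from MailTool. The sketch below restates the two superclasses in Python 3 syntax and adds a hypothetical Sender class purely for illustration.

```python
class MailTool:
    def trace(self, message):
        print(message)

class SilentMailTool:
    def trace(self, message):
        pass

class Sender(MailTool):                        # hypothetical tool class
    def send(self):
        self.trace('sending')                  # printed by MailTool.trace
        return 'sent'

class SilentSender(SilentMailTool, Sender):    # silent mix-in listed first
    pass                                       # so its trace wins the lookup

if __name__ == '__main__':
    Sender().send()                            # prints a trace message
    SilentSender().send()                      # prints nothing
```

Reversing the base order to (Sender, SilentMailTool) would restore the printing trace, because Sender and its MailTool superclass would then be searched first.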
The class used to compose and send messages is coded in Example 14-23. This module provides a convenient interface that combines standard library tools we’ve already met in this chapter: the email package to compose messages with attachments and encodings, and the smtplib module to send the resulting email text. Attachments are passed in as a list of filenames; MIME types and any required encodings are determined automatically with the mimetypes module. Moreover, date and time strings are generated automatically with an email.Utils call. Study this file’s code and comments for more on its operation.
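The composition steps described above can be condensed into a short sketch. It rests on some assumptions: it uses the Python 3 module names (email.mime.text, email.utils, email.encoders) rather than the Python 2 names in the listing, it takes attachments as in-memory (filename, bytes) pairs instead of reading files, and build_message is a hypothetical helper, not part of mailtools. It only builds the message; nothing is sent.

```python
import mimetypes
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formatdate

def build_message(sender, recipients, subject, body, attachments):
    """attachments: list of (filename, bytes) pairs (hypothetical in-memory form)."""
    msg = MIMEMultipart()
    msg['From'] = sender
    msg['To'] = ', '.join(recipients)            # comma-separated address list
    msg['Subject'] = subject
    msg['Date'] = formatdate()                   # current date and time string
    msg.attach(MIMEText(body))                   # main text/plain part
    for filename, data in attachments:
        contype, encoding = mimetypes.guess_type(filename)
        if contype is None or encoding is not None:
            contype = 'application/octet-stream' # no guess, or compressed
        maintype, subtype = contype.split('/', 1)
        if maintype == 'text':
            part = MIMEText(data.decode('utf-8'), _subtype=subtype)
        else:
            part = MIMEBase(maintype, subtype)   # generic binary part
            part.set_payload(data)
            encoders.encode_base64(part)
        part.add_header('Content-Disposition', 'attachment', filename=filename)
        msg.attach(part)
    return msg

msg = build_message('me@example.com',
                    ['you@example.com', 'them@example.com'],
                    'testing', 'Main text',
                    [('notes.txt', b'hello'), ('blob.unknownext', b'\x00\x01')])
```

Calling msg.as_string() on the result yields the formatted text that an SMTP send call would transmit.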
Example 14-23. PP3E\Internet\Email\mailtools\mailSender.py
###############################################################################
# send messages, add attachments (see __init__ for docs, test)
###############################################################################

import mailconfig                                      # client's mailconfig
import smtplib, os, mimetypes                          # mime: name to type
import email.Utils, email.Encoders                     # date string, base64
from mailTool import MailTool, SilentMailTool

from email.Message       import Message                # general message
from email.MIMEMultipart import MIMEMultipart          # type-specific messages
from email.MIMEAudio     import MIMEAudio
from email.MIMEImage     import MIMEImage
from email.MIMEText      import MIMEText
from email.MIMEBase      import MIMEBase

class MailSender(MailTool):
    """
    send mail: format message, interface with SMTP server
    works on any machine with Python+Inet, doesn't use cmdline mail
    a nonauthenticating client: see MailSenderAuth if login required
    """
    def __init__(self, smtpserver=None):
        self.smtpServerName = smtpserver or mailconfig.smtpservername

    def sendMessage(self, From, To, Subj, extrahdrs, bodytext, attaches,
                          saveMailSeparator=(('='*80)+'PY\n')):
        """
        format,send mail: blocks caller, thread me in a GUI
        bodytext is main text part, attaches is list of filenames
        extrahdrs is list of (name, value) tuples to be added
        raises uncaught exception if send fails for any reason
        saves sent message text in a local file if successful

        assumes that To, Cc, Bcc hdr values are lists of 1 or more already
        stripped addresses (possibly in full name+<addr> format); client
        must split these on delimiters, parse, or use multiline input;
        note that SMTP allows full name+<addr> format in recipients
        """
        if not attaches:
            msg = Message()
            msg.set_payload(bodytext)
        else:
            msg = MIMEMultipart()
            self.addAttachments(msg, bodytext, attaches)

        recip = To
        msg['From']    = From
        msg['To']      = ', '.join(To)               # poss many: addr list
        msg['Subject'] = Subj                        # servers reject ';' sept
        msg['Date']    = email.Utils.formatdate()    # curr datetime, rfc2822 utc
        for name, value in extrahdrs:                # Cc, Bcc, X-Mailer, etc.
            if value:
                if name.lower() not in ['cc', 'bcc']:
                    msg[name] = value
                else:
                    msg[name] = ', '.join(value)     # add commas between
                    recip += value                   # some servers reject ['']
        fullText = msg.as_string()                   # generate formatted msg

        # sendmail call raises except if all Tos failed,
        # or returns failed Tos dict for any that failed

        self.trace('Sending to...'+ str(recip))
        self.trace(fullText[:256])
        server = smtplib.SMTP(self.smtpServerName)           # this may fail too
        self.getPassword()                                   # if srvr requires
        self.authenticateServer(server)                      # login in subclass
        try:
            failed = server.sendmail(From, recip, fullText)  # except or dict
        finally:
            server.quit()                                    # iff connect OK
        if failed:
            class SomeAddrsFailed(Exception): pass
            raise SomeAddrsFailed('Failed addrs:%s\n' % failed)
        self.saveSentMessage(fullText, saveMailSeparator)
        self.trace('Send exit')

    def addAttachments(self, mainmsg, bodytext, attaches):
        # format a multipart message with attachments
        msg = MIMEText(bodytext)                 # add main text/plain part
        mainmsg.attach(msg)
        for filename in attaches:                # absolute or relative paths
            if not os.path.isfile(filename):     # skip dirs, etc.
                continue

            # guess content type from file extension, ignore encoding
            contype, encoding = mimetypes.guess_type(filename)
            if contype is None or encoding is not None:  # no guess, compressed?
                contype = 'application/octet-stream'     # use generic default
            self.trace('Adding ' + contype)

            # build sub-Message of appropriate kind
            maintype, subtype = contype.split('/', 1)
            if maintype == 'text':
                data = open(filename, 'r')
                msg  = MIMEText(data.read(), _subtype=subtype)
                data.close()
            elif maintype == 'image':
                data = open(filename, 'rb')
                msg  = MIMEImage(data.read(), _subtype=subtype)
                data.close()
            elif maintype == 'audio':
                data = open(filename, 'rb')
                msg  = MIMEAudio(data.read(), _subtype=subtype)
                data.close()
            else:
                data = open(filename, 'rb')
                msg  = MIMEBase(maintype, subtype)
                msg.set_payload(data.read())
                data.close()                             # make generic type
                email.Encoders.encode_base64(msg)        # encode using base64

            # set filename and attach to container
            basename = os.path.basename(filename)
            msg.add_header('Content-Disposition',
                           'attachment', filename=basename)
            mainmsg.attach(msg)

        # text outside mime structure, seen by non-MIME mail readers
        mainmsg.preamble = 'A multi-part MIME format message.\n'
        mainmsg.epilogue = ''  # make sure message ends with a newline

    def saveSentMessage(self, fullText, saveMailSeparator):
        # append sent message to local file if worked
        # client: pass separator used for your app, splits
        # caveat: user may change file at same time (unlikely)
        try:
            sentfile = open(mailconfig.sentmailfile, 'a')
            if fullText[-1] != '\n': fullText += '\n'
            sentfile.write(saveMailSeparator)
            sentfile.write(fullText)
            sentfile.close()
        except:
            self.trace('Could not save sent message')    # not a show-stopper

    def authenticateServer(self, server):
        pass  # no login required for this server/class

    def getPassword(self):
        pass  # no login required for this server/class

################################################################################
# specialized subclasses
################################################################################

class MailSenderAuth(MailSender):
    """
    use for servers that require login authorization;
    client: choose MailSender or MailSenderAuth super
    class based on mailconfig.smtpuser setting (None?)
    """
    def __init__(self, smtpserver=None, smtpuser=None):
        MailSender.__init__(self, smtpserver)
        self.smtpUser = smtpuser or mailconfig.smtpuser
        self.smtpPassword = None

    def authenticateServer(self, server):
        server.login(self.smtpUser, self.smtpPassword)

    def getPassword(self):
        """
        get SMTP auth password if not yet known;
        may be called by superclass auto, or client manual:
        not needed until send, but don't run in GUI thread;
        get from client-side file or subclass method
        """
        if not self.smtpPassword:
            try:
                localfile = open(mailconfig.smtppasswdfile)
                self.smtpPassword = localfile.readline()[:-1]
                self.trace('local file password' + repr(self.smtpPassword))
            except:
                self.smtpPassword = self.askSmtpPassword()

    def askSmtpPassword(self):
        assert False, 'Subclass must define method'

class MailSenderAuthConsole(MailSenderAuth):     # must extend the Auth variant
    def askSmtpPassword(self):                   # to inherit login + smtpUser
        import getpass
        prompt = 'Password for %s on %s?' % (self.smtpUser, self.smtpServerName)
        return getpass.getpass(prompt)

class SilentMailSender(SilentMailTool, MailSender):
    pass   # replaces trace
The class defined in Example 14-24 does the work of interfacing with a POP email server—loading, deleting, and synchronizing.
This module deals strictly in email text; parsing email after it has been fetched is delegated to a different module in the package. Moreover, this module doesn’t cache already loaded information; clients must add their own mail-retention tools if desired. Clients must also provide password input methods or pass one in, if they cannot use the console input subclass here (e.g., GUIs and web-based programs).
The loading and deleting tasks use the standard library poplib module in ways we saw earlier in this chapter, but notice that there are interfaces for fetching just message header text with the TOP action in POP. This can save substantial time if clients need to fetch only basic details for an email index.
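As a rough sketch of a headers-only fetch, the function below calls the top method of a poplib.POP3-style object with a body line count of zero. The FakePOP3 class is a hypothetical stand-in so that the example runs without a mail server, and the bytes handling assumes Python 3, where poplib returns lines as bytes.

```python
def fetch_header_text(server, msgnum):
    """
    Return the header text of one message using POP's TOP command.
    server is assumed to behave like a connected poplib.POP3 object.
    """
    resp, hdrlines, octets = server.top(msgnum, 0)   # 0 body lines: hdrs only
    return b'\n'.join(hdrlines).decode('latin-1')

class FakePOP3:
    """Hypothetical offline stand-in for a connected poplib.POP3 object."""
    def top(self, which, howmuch):
        lines = [b'From: a@example.com', b'Subject: hi', b'']
        return b'+OK', lines, sum(len(line) + 2 for line in lines)

headers = fetch_header_text(FakePOP3(), 1)
```

With a real server, the same function would be handed the object returned by poplib.POP3 after the user and pass_ calls have succeeded.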
This module also supports the notion of progress indicators—for methods that perform multiple downloads or deletions, callers may pass in a function that will be called as each mail is processed. This function will receive the current and total step numbers. It’s left up to the caller to render this in a GUI, console, or other user interface.
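The progress protocol amounts to an optional callable invoked with the current and total step numbers. A minimal sketch follows, in which process_all and its uppercase placeholder work are hypothetical and stand in for the real downloads or deletions.

```python
def process_all(items, progress=None):
    """Process each item, reporting (count, total) to an optional callback."""
    results = []
    total = len(items)
    for count, item in enumerate(items, start=1):
        if progress:
            progress(count, total)          # caller decides how to render this
        results.append(item.upper())        # placeholder for a fetch or delete
    return results

calls = []                                  # a console or GUI would draw instead
result = process_all(['a', 'b', 'c'],
                     progress=lambda count, total: calls.append((count, total)))
```

Because the callback is just a function, a console client can print a line per step while a GUI client can update a progress bar, with no change to the tool itself.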
Also notice that Example 14-24 devotes substantial code to detecting synchronization errors between an email list held by a client, and the current state of the inbox at the POP email server. Normally, POP assigns relative message numbers to email in the inbox, and only adds newly arrived emails to the end of the inbox. As a result, relative message numbers from an earlier fetch may usually be used to delete and fetch in the future.
However, although rare, it is not impossible for the server’s inbox to change in ways that invalidate previously fetched message numbers. For instance, emails may be deleted in another client, and the server itself may move mails from the inbox to an undeliverable state on download errors (this may vary per ISP). In both cases, email may be removed from the middle of the inbox, throwing some prior relative message numbers out of sync with the server.
This situation can result in fetching the wrong message in an email client—users receive a different message than the one they thought they had selected. Worse, this can make deletions inaccurate—if a mail client uses a relative message number in a delete request, the wrong mail may be deleted if the inbox has changed since the index was fetched.
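The problem is easy to reproduce with a plain list standing in for the server's inbox. The sketch below is a simulation only, with hypothetical names, and involves no POP traffic.

```python
inbox = ['msg A', 'msg B', 'msg C', 'msg D']   # server side; POP numbers are 1..N

def fetch(inbox, msgnum):
    return inbox[msgnum - 1]                   # POP message numbers start at 1

index = list(inbox)                            # client's previously fetched index
selected = 3                                   # user picks 'msg C' from the index

del inbox[1]                                   # another client deletes 'msg B'

wrong = fetch(inbox, selected)                 # stale number now names 'msg D'
```

A delete request issued with the same stale number would remove 'msg D' rather than the 'msg C' the user selected.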
To assist clients, Example 14-24 includes tools that match message headers on deletions to ensure accuracy, and that perform general inbox synchronization tests on demand. These tools can be used only by clients that retain the fetched email list as state information. We’ll use these in the PyMailGUI client in Chapter 15. There, deletions use the safe interface, and loads run the synchronization test on demand; on errors, the inbox index is automatically reloaded. For now, see the Example 14-24 source code and comments for more details.
Note that the synchronization tests try a variety of matching techniques, but require the complete headers text and, in the worst case, must parse headers and match many header fields. In many cases, the single previously fetched message-id header field would be sufficient for matching against messages in the server’s inbox. However, because this field is optional and can be forged to have any value, it might not always be a reliable way to identify messages. In other words, a same-valued message-id may not suffice to guarantee a match, although it can be used to identify a mismatch; in Example 14-24, the message-id is used to rule out a match if either message has one, and they differ in value. This test is performed before falling back on slower parsing and multiple header matches.
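The cheap tests that come before full header parsing can be sketched on their own. The function below is a simplified illustration modeled on the ordering described here, not the listing's exact code: quick_match is a hypothetical name, and it returns None when the slower parse-and-compare step would still be required.

```python
def message_ids(hdrtext):
    return [line for line in hdrtext.splitlines()
            if line[:11].lower() == 'message-id:']

def quick_match(hdrtext1, hdrtext2):
    """Return True or False if decided cheaply, else None (parse needed)."""
    if hdrtext1 == hdrtext2:                       # simple string compare
        return True
    strip1 = [l for l in hdrtext1.splitlines() if not l.startswith('Status:')]
    strip2 = [l for l in hdrtext2.splitlines() if not l.startswith('Status:')]
    if strip1 == strip2:                           # differ only in Status
        return True
    id1, id2 = message_ids(hdrtext1), message_ids(hdrtext2)
    if (id1 or id2) and id1 != id2:                # ids rule out a match only
        return False
    return None                                    # fall back on header parse

a = 'Message-Id: <1@x>\nSubject: hi\nStatus: U'
b = 'Message-Id: <1@x>\nSubject: hi\nStatus: RO'
c = 'Message-Id: <2@x>\nSubject: hi'
d = 'Subject: hi\nDate: today'
e = 'Subject: hi\nDate: tomorrow'
```

Here a and b match once their Status lines are ignored, a and c are rejected by their differing message-ids, and d versus e cannot be decided without comparing parsed headers.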
Example 14-24. PP3E\Internet\Email\mailtools\mailFetcher.py
###############################################################################
# retrieve, delete, match mail from a POP server (see __init__ for docs, test)
###############################################################################

import poplib, mailconfig       # client's mailconfig: script dir or pythonpath
print 'user:', mailconfig.popusername

from mailParser import MailParser                  # for headers matching
from mailTool   import MailTool, SilentMailTool    # trace control supers

# index/server msgnum out of synch tests
class DeleteSynchError(Exception): pass            # msg out of synch in del
class TopNotSupported(Exception): pass             # can't run synch test
class MessageSynchError(Exception): pass           # index list out of sync

class MailFetcher(MailTool):
    """
    fetch mail: connect, fetch headers+mails, delete mails
    works on any machine with Python+Inet; subclass me to cache
    implemented with the POP protocol; IMAP requires new class
    """
    def __init__(self, popserver=None, popuser=None, poppswd=None, hastop=True):
        self.popServer   = popserver or mailconfig.popservername
        self.popUser     = popuser   or mailconfig.popusername
        self.srvrHasTop  = hastop
        self.popPassword = poppswd                  # ask later if None

    def connect(self):
        self.trace('Connecting...')
        self.getPassword()                          # file, GUI, or console
        server = poplib.POP3(self.popServer)
        server.user(self.popUser)                   # connect,login POP server
        server.pass_(self.popPassword)              # pass is a reserved word
        self.trace(server.getwelcome())             # print returned greeting
        return server

    def downloadMessage(self, msgnum):
        """
        load full raw text of one mail msg, given its
        POP relative msgnum; caller must parse content
        """
        self.trace('load '+str(msgnum))
        server = self.connect()
        try:
            resp, msglines, respsz = server.retr(msgnum)
        finally:
            server.quit()
        return '\n'.join(msglines)                 # concat lines for parsing

    def downloadAllHeaders(self, progress=None, loadfrom=1):
        """
        get sizes, raw header text only, for all or new msgs
        begins loading headers from message number loadfrom
        use loadfrom to load newly arrived mails only
        use downloadMessage to get a full msg text later
        progress is a function called with (count, total);
        returns: [headers text], [mail sizes], loadedfull?
        """
        if not self.srvrHasTop:                    # not all servers support TOP
            # naively load full msg text
            return self.downloadAllMessages(progress, loadfrom)
        else:
            self.trace('loading headers')
            server = self.connect()                # mbox now locked until quit
            try:
                resp, msginfos, respsz = server.list()    # 'num size' lines list
                msgCount = len(msginfos)                  # alt to srvr.stat[0]
                msginfos = msginfos[loadfrom-1:]          # drop already loadeds
                allsizes = [int(x.split()[1]) for x in msginfos]
                allhdrs  = []
                for msgnum in range(loadfrom, msgCount+1):          # poss empty
                    if progress: progress(msgnum, msgCount)         # callback?
                    resp, hdrlines, respsz = server.top(msgnum, 0)  # hdrs only
                    allhdrs.append('\n'.join(hdrlines))
            finally:
                server.quit()                             # make sure unlock mbox
            assert len(allhdrs) == len(allsizes)
            self.trace('load headers exit')
            return allhdrs, allsizes, False

    def downloadAllMessages(self, progress=None, loadfrom=1):
        """
        load full message text for all msgs from loadfrom..N,
        despite any caching that may be being done in the caller;
        much slower than downloadAllHeaders, if just need hdrs;
        """
        self.trace('loading full messages')
        server = self.connect()
        try:
            (msgCount, msgBytes) = server.stat()          # inbox on server
            allmsgs  = []
            allsizes = []
            for i in range(loadfrom, msgCount+1):         # empty if low >= high
                if progress: progress(i, msgCount)
                (resp, message, respsz) = server.retr(i)  # save text on list
                allmsgs.append('\n'.join(message))        # leave mail on server
                allsizes.append(respsz)                   # diff from len(msg)
        finally:
            server.quit()                                 # unlock the mail box
        assert len(allmsgs) == (msgCount - loadfrom) + 1  # msg nums start at 1
        #assert sum(allsizes) == msgBytes                 # not if loadfrom > 1
        return allmsgs, allsizes, True

    def deleteMessages(self, msgnums, progress=None):
        """
        delete multiple msgs off server; assumes email inbox
        unchanged since msgnums were last determined/loaded;
        use if msg headers not available as state information;
        fast, but poss dangerous: see deleteMessagesSafely
        """
        self.trace('deleting mails')
        server = self.connect()
        try:
            for (ix, msgnum) in enumerate(msgnums):   # don't reconnect for each
                if progress: progress(ix+1, len(msgnums))
                server.dele(msgnum)
        finally:                                      # changes msgnums: reload
            server.quit()

    def deleteMessagesSafely(self, msgnums, synchHeaders, progress=None):
        """
        delete multiple msgs off server, but use TOP fetches to
        check for a match on each msg's header part before deleting;
        assumes the email server supports the TOP interface of POP,
        else raises TopNotSupported - client may call deleteMessages;

        use if the mail server might change the inbox since the email
        index was last fetched, thereby changing POP relative message
        numbers;  this can happen if email is deleted in a different
        client;  some ISPs may also move a mail from inbox to the
        undeliverable box in response to a failed download;

        synchHeaders must be a list of already loaded mail hdrs text,
        corresponding to selected msgnums (requires state);  raises
        exception if any out of synch with the email server;  inbox is
        locked until quit, so it should not change between TOP check
        and actual delete: synch check must occur here, not in caller;
        may be enough to call checkSynchError+deleteMessages, but check
        each msg here in case deletes and inserts in middle of inbox;
        """
        if not self.srvrHasTop:
            raise TopNotSupported('Safe delete cancelled')

        self.trace('deleting mails safely')
        errmsg  = 'Message %s out of synch with server.\n'
        errmsg += 'Delete terminated at this message.\n'
        errmsg += 'Mail client may require restart or reload.'

        server = self.connect()                       # locks inbox till quit
        try:                                          # don't reconnect for each
            (msgCount, msgBytes) = server.stat()      # inbox size on server
            for (ix, msgnum) in enumerate(msgnums):
                if progress: progress(ix+1, len(msgnums))
                if msgnum > msgCount:                            # msgs deleted
                    raise DeleteSynchError(errmsg % msgnum)
                resp, hdrlines, respsz = server.top(msgnum, 0)   # hdrs only
                msghdrs = '\n'.join(hdrlines)
                if not self.headersMatch(msghdrs, synchHeaders[msgnum-1]):
                    raise DeleteSynchError(errmsg % msgnum)
                else:
                    server.dele(msgnum)               # safe to delete this msg
        finally:                                      # changes msgnums: reload
            server.quit()                             # unlock inbox on way out

    def checkSynchError(self, synchHeaders):
        """
        check to see if already loaded hdrs text in synchHeaders
        list matches what is on the server, using the TOP command in
        POP to fetch headers text; use if inbox can change due to
        deletes in other client, or automatic action by email server;
        raises except if out of synch, or error while talking to server;

        for speed, only checks last in last: this catches inbox deletes,
        but assumes server won't insert before last (true for incoming
        mails); check inbox size first: smaller if just deletes;  else
        top will differ if deletes and newly arrived messages added at
        end;  result valid only when run: inbox may change after return;
        """
        self.trace('synch check')
        errormsg  = 'Message index out of synch with mail server.\n'
        errormsg += 'Mail client may require restart or reload.'
        server = self.connect()
        try:
            lastmsgnum = len(synchHeaders)                      # 1..N
            (msgCount, msgBytes) = server.stat()                # inbox size
            if lastmsgnum > msgCount:                           # fewer now?
                raise MessageSynchError(errormsg)               # none to cmp
            if self.srvrHasTop:
                resp, hdrlines, respsz = server.top(lastmsgnum, 0)  # hdrs only
                lastmsghdrs = '\n'.join(hdrlines)
                if not self.headersMatch(lastmsghdrs, synchHeaders[-1]):
                    raise MessageSynchError(errormsg)
        finally:
            server.quit()

    def headersMatch(self, hdrtext1, hdrtext2):
        """
        may not be as simple as a string compare: some servers add a
        "Status:" header that changes over time; on one ISP, it begins
        as "Status: U" (unread), and changes to "Status: RO" (read, old)
        after fetched once - throws off synch tests if new when index
        fetched, but have been fetched once before delete or last-message
        check;  "Message-id:" line is unique per message in theory, but
        optional, and can be anything if forged; match more common: try
        first; parsing costly: try last
        """
        # try match by simple string compare
        if hdrtext1 == hdrtext2:
            self.trace('Same headers text')
            return True

        # try match without status lines
        split1 = hdrtext1.splitlines()       # s.split('\n'), but no final ''
        split2 = hdrtext2.splitlines()
        strip1 = [line for line in split1 if not line.startswith('Status:')]
        strip2 = [line for line in split2 if not line.startswith('Status:')]
        if strip1 == strip2:
            self.trace('Same without Status')
            return True

        # try mismatch by message-id headers if either has one
        msgid1 = [line for line in split1 if line[:11].lower() == 'message-id:']
        msgid2 = [line for line in split2 if line[:11].lower() == 'message-id:']
        if (msgid1 or msgid2) and (msgid1 != msgid2):
            self.trace('Different Message-Id')
            return False

        # try full hdr parse and common headers if msgid missing or trash
        tryheaders  = ('From', 'To', 'Subject', 'Date')
        tryheaders += ('Cc', 'Return-Path', 'Received')
        msg1 = MailParser().parseHeaders(hdrtext1)
        msg2 = MailParser().parseHeaders(hdrtext2)
        for hdr in tryheaders:                          # poss multiple Received
            if msg1.get_all(hdr) != msg2.get_all(hdr):  # case insens, dflt None
                self.trace('Diff common headers')
                return False

        # all common hdrs match and don't have a diff message-id
        self.trace('Same common headers')
        return True

    def getPassword(self):
        """
        get POP password if not yet known
        not required until go to server
        from client-side file or subclass method
        """
        if not self.popPassword:
            try:
                localfile = open(mailconfig.poppasswdfile)
                self.popPassword = localfile.readline()[:-1]
                self.trace('local file password' + repr(self.popPassword))
            except:
                self.popPassword = self.askPopPassword()

    def askPopPassword(self):
        assert False, 'Subclass must define method'

################################################################################
# specialized subclasses
################################################################################

class MailFetcherConsole(MailFetcher):
    def askPopPassword(self):
        import getpass
        prompt = 'Password for %s on %s?' % (self.popUser, self.popServer)
        return getpass.getpass(prompt)

class SilentMailFetcher(SilentMailTool, MailFetcher):
    pass   # replaces trace
Example 14-25 implements the last major class in the mailtools package. Given the text of an email message, its tools parse the mail’s content into a message object, with headers and decoded parts. This module is largely just a wrapper around the standard library’s email package, but it adds convenience tools: finding the main text part of a message, filename generation for message parts, saving attached parts to files, and so on. See the code for more information. Also notice the parts walker here: by coding its search logic in one place, we guarantee that all three clients implement the same traversal.
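The idea behind the parts walker can be shown with a small generator. This is a simplified sketch under a few assumptions: it runs on Python 3, the message text in RAW is made up for the example, and named_parts is a hypothetical name whose generated filenames omit the extension guessing that the listing performs.

```python
from email import message_from_string

RAW = (
    'MIME-Version: 1.0\n'
    'Content-Type: multipart/mixed; boundary="XX"\n'
    '\n'
    '--XX\n'
    'Content-Type: text/plain\n'
    '\n'
    'Main text\n'
    '--XX\n'
    'Content-Type: application/octet-stream\n'
    'Content-Disposition: attachment; filename="data.bin"\n'
    '\n'
    'rawdata\n'
    '--XX--\n'
)

def named_parts(message):
    """Yield (filename, content type, part) for each non-container part."""
    for ix, part in enumerate(message.walk()):       # walk yields root first
        if part.get_content_maintype() == 'multipart':
            continue                                 # skip multipart containers
        filename = part.get_filename() or 'part-%03d' % ix
        yield filename, part.get_content_type(), part

parts = [(name, ctype)
         for name, ctype, part in named_parts(message_from_string(RAW))]
```

Because every client iterates through the same generator, a change to the naming or skipping rules takes effect everywhere at once.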
Example 14-25. PP3EInternetEmailmailtoolsmailParser.py
############################################################################### # parsing and attachment extract, analyse, save (see _ _init_ _ for docs, test) ############################################################################### import os, mimetypes # mime: type to name import email.Parser from email.Message import Message from mailTool import MailTool class MailParser(MailTool): """ methods for parsing message text, attachments subtle thing: Message object payloads are either a simple string for non-multipart messages, or a list of Message objects if multipart (possibly nested); we don't need to distinguish between the two cases here, because the Message walk generator always returns self first, and so works fine on non-multipart messages too (a single object is walked); for simple messages, the message body is always considered here to be the sole part of the mail; for multipart messages, the parts list includes the main message text, as well as all attachments; this allows simple messages not of type text to be handled like attachments in a UI (e.g., saved, opened); Message payload may also be None for some oddball part types; """ def walkNamedParts(self, message): """ generator to avoid repeating part naming logic skips multipart headers, makes part filenames message is already parsed email.Message object doesn't skip oddball types: payload may be None """ for (ix, part) in enumerate(message.walk( )): # walk includes message maintype = part.get_content_maintype( ) # ix includes multiparts if maintype == 'multipart': continue # multipart/*: container else: filename, contype = self.partName(part, ix) yield (filename, contype, part) def partName(self, part, ix): """ extract filename and content type from message part; filename: tries Content-Disposition, then Content-Type name param, or generates one based on mimetype guess; """ filename = part.get_filename( ) # filename in msg hdrs? 
        contype = part.get_content_type()                  # lower maintype/subtype
        if not filename:
            filename = part.get_param('name')              # try content-type name
        if not filename:
            if contype == 'text/plain':                    # hardcode plain text ext
                ext = '.txt'                               # else guesses .ksh!
            else:
                ext = mimetypes.guess_extension(contype)
                if not ext: ext = '.bin'                   # use a generic default
            filename = 'part-%03d%s' % (ix, ext)
        return (filename, contype)

    def saveParts(self, savedir, message):
        """
        store all parts of a message as files in a local directory;
        returns [('maintype/subtype', 'filename')] list for use by callers,
        but does not open any parts or attachments here;
        get_payload decodes base64, quoted-printable, uuencoded data;
        mail parser may give us a None payload for oddball types we
        probably should skip over: convert to str here to be safe;
        """
        if not os.path.exists(savedir):
            os.mkdir(savedir)
        partfiles = []
        for (filename, contype, part) in self.walkNamedParts(message):
            fullname = os.path.join(savedir, filename)
            fileobj  = open(fullname, 'wb')                # use binary mode
            content  = part.get_payload(decode=1)          # decode base64,qp,uu
            fileobj.write(str(content))                    # make sure is a str
            fileobj.close()
            partfiles.append((contype, fullname))          # for caller to open
        return partfiles

    def saveOnePart(self, savedir, partname, message):
        """
        ditto, but find and save just one part by name
        """
        if not os.path.exists(savedir):
            os.mkdir(savedir)
        fullname = os.path.join(savedir, partname)
        (contype, content) = self.findOnePart(partname, message)
        open(fullname, 'wb').write(str(content))
        return (contype, fullname)

    def partsList(self, message):
        """
        return a list of filenames for all parts of an
        already parsed message, using same filename logic
        as saveParts, but do not store the part files here
        """
        validParts = self.walkNamedParts(message)
        return [filename for (filename, contype, part) in validParts]

    def findOnePart(self, partname, message):
        """
        find and return part's content, given its name
        intended to be used in conjunction with partsList
        we could also mimetypes.guess_type(partname) here
        we could also avoid this search by saving in dict
        """
        for (filename, contype, part) in self.walkNamedParts(message):
            if filename == partname:
                content = part.get_payload(decode=1)       # base64,qp,uu
                return (contype, content)

    def findMainText(self, message):
        """
        for text-oriented clients, return the first text part;
        for the payload of a simple message, or all parts of
        a multipart message, looks for text/plain, then text/html,
        then text/*, before deducing that there is no text to
        display; this is a heuristic, but covers most simple,
        multipart/alternative, and multipart/mixed messages;
        content-type defaults to text/plain if not in simple msg;

        handles message nesting at top level by walking instead
        of list scans; if non-multipart but type is text/html,
        returns the HTML as the text with an HTML type: caller
        may open in web browser; if nonmultipart and not text,
        no text to display: save/open in UI; caveat: does not
        try to concatenate multiple inline text/plain parts
        """
        # try to find a plain text
        for part in message.walk():                        # walk visits message
            type = part.get_content_type()                 # if nonmultipart
            if type == 'text/plain':
                return type, part.get_payload(decode=1)    # may be base64,qp,uu

        # try to find an HTML part
        for part in message.walk():
            type = part.get_content_type()
            if type == 'text/html':
                return type, part.get_payload(decode=1)    # caller renders

        # try any other text type, including XML
        for part in message.walk():
            if part.get_content_maintype() == 'text':
                return part.get_content_type(), part.get_payload(decode=1)

        # punt: could use first part, but it's not marked as text
        return 'text/plain', '[No text to display]'

    # returned when parses fail
    errorMessage = Message()
    errorMessage.set_payload('[Unable to parse message - format error]')

    def parseHeaders(self, mailtext):
        """
        parse headers only, return root email.Message object
        stops after headers parsed, even if nothing else follows (top)
        email.Message object is a mapping for mail header fields
        payload of message object is None, not raw body text
        """
        try:
            return email.Parser.Parser().parsestr(mailtext, headersonly=True)
        except:
            return self.errorMessage

    def parseMessage(self, fulltext):
        """
        parse entire message, return root email.Message object
        payload of message object is a string if not is_multipart()
        payload of message object is more Messages if multiple parts
        the call here same as calling email.message_from_string()
        """
        try:
            return email.Parser.Parser().parsestr(fulltext)    # may fail!
        except:
            return self.errorMessage     # or let call handle? can check return

    def parseMessageRaw(self, fulltext):
        """
        parse headers only, return root email.Message object
        stops after headers parsed, for efficiency (not yet used here)
        payload of message object is raw text of mail after headers
        """
        try:
            return email.Parser.HeaderParser().parsestr(fulltext)
        except:
            return self.errorMessage
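To make the difference between the full and headers-only parsing methods at the end of this listing more concrete, here is a minimal sketch. It is not part of mailtools: it assumes Python 3, where the parser lives in email.parser rather than the email.Parser module used in the listing, and the addresses and message text are invented for illustration.

```python
from email.parser import Parser

# Invented sample message: one text part inside a multipart container.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: parse demo\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hello from the first part\n"
    "--XYZ--\n"
)

# Full parse: the multipart body is split into nested Message objects.
full = Parser().parsestr(raw)

# Headers-only parse: parsing stops after the header block, and whatever
# follows is kept as a single unparsed string payload.
hdrs = Parser().parsestr(raw, headersonly=True)

print(full.is_multipart(), len(full.get_payload()))
print(hdrs.is_multipart(), hdrs["Subject"])
```

Either way, the returned object supports the same header mapping interface, which is why a client can build its index display from header-only parses and defer the full parse until a message is actually opened.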
The last file in the mailtools
package, Example 14-26, lists the
self-test code for the package. This code is a separate script file,
in order to allow for import search path manipulation—it emulates a
real client, which is assumed to have a mailconfig.py
module in its own source
directory (this module can vary per client).
Example 14-26. PP3E\Internet\Email\mailtools\selftest.py
###############################################################################
# self-test when this file is run as a program
###############################################################################

#
# mailconfig normally comes from the client's source directory or
# sys.path; for testing, get it from Email directory one level up
#
import sys
sys.path.append('..')
import mailconfig
print 'config:', mailconfig.__file__

# get these from __init__
from mailtools import MailFetcherConsole, MailSender, MailSenderAuthConsole

if not mailconfig.smtpuser:
    sender = MailSender()
else:
    sender = MailSenderAuthConsole()

sender.sendMessage(From      = mailconfig.myaddress,
                   To        = [mailconfig.myaddress],
                   Subj      = 'testing 123',
                   extrahdrs = [('X-Mailer', 'mailtools')],
                   bodytext  = 'Here is my source code',
                   attaches  = ['selftest.py'])

fetcher = MailFetcherConsole()
def status(*args): print args

hdrs, sizes, loadedall = fetcher.downloadAllHeaders(status)
for num, hdr in enumerate(hdrs[:5]):
    print hdr
    if raw_input('load mail?') in ['y', 'Y']:
        print fetcher.downloadMessage(num+1), '\n', '-'*70

last5 = len(hdrs)-4
msgs, sizes, loadedall = fetcher.downloadAllMessages(status, loadfrom=last5)
for msg in msgs:
    print msg[:200], '\n', '-'*70
raw_input('Press Enter to exit')
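The import search path trick at the top of the self-test can be tried in isolation. The following is only a sketch under stated assumptions: it uses Python 3, and the module name demo_mailconfig, its settings, and the temporary directory are hypothetical stand-ins for a client's own mailconfig.py and source directory.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for a client's source directory.
client_dir = tempfile.mkdtemp()
with open(os.path.join(client_dir, "demo_mailconfig.py"), "w") as f:
    f.write("popservername = 'pop.example.com'\n")
    f.write("smtpuser = None\n")

# Same idea as sys.path.append('..') in the self-test: extend the
# module search path so a later import finds the client's config.
sys.path.append(client_dir)
importlib.invalidate_caches()
demo_mailconfig = importlib.import_module("demo_mailconfig")

print("config:", demo_mailconfig.__file__)
print("server:", demo_mailconfig.popservername)
```

Because the directory is appended rather than inserted at the front, a module of the same name found earlier on the path would win; that matches the self-test's behavior, where the local directory is searched first.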
Finally, to give a use case for the mailtools
module package of the preceding
sections, Example 14-27
provides an updated version of the pymail
program we met earlier, which uses
mailtools
to access email instead
of older tools. Compare its code to the original pymail
in this chapter to see how mailtools
is employed here. You’ll find
that its mail download and send logic is substantially
simpler.
Example 14-27. pymail2.py
#!/usr/local/bin/python
##########################################################################
# pymail2 - simple console email interface client in Python; this
# version uses the mailtools package, which in turn uses poplib,
# smtplib, and the email package for parsing and composing emails;
# displays first text part of mails, not entire full text;
# fetches just mail headers initially, using the TOP command;
# fetches full text of just email selected to be displayed;
# caches already fetched mails; caveat: no way to refresh index;
# uses standalone mailtools objects - they can also be superclasses;
##########################################################################

mailcache = {}

def fetchmessage(i):
    try:
        fulltext = mailcache[i]
    except KeyError:
        fulltext = fetcher.downloadMessage(i)
        mailcache[i] = fulltext
    return fulltext

def sendmessage():
    from pymail import inputmessage
    From, To, Subj, text = inputmessage()
    sender.sendMessage(From, To, Subj, [], text, attaches=None)

def deletemessages(toDelete, verify=True):
    print 'To be deleted:', toDelete
    if verify and raw_input('Delete?')[:1] not in ['y', 'Y']:
        print 'Delete cancelled.'
    else:
        print 'Deleting messages from server.'
        fetcher.deleteMessages(toDelete)

def showindex(msgList, msgSizes, chunk=5):
    count = 0
    for (msg, size) in zip(msgList, msgSizes):     # email.Message, int
        count += 1
        print '%d: %d bytes' % (count, size)
        for hdr in ('From', 'Date', 'Subject'):
            print ' %s=>%s' % (hdr, msg.get(hdr, '(unknown)'))
        if count % chunk == 0:
            raw_input('[Press Enter key]')         # pause after each chunk

def showmessage(i, msgList):
    if 1 <= i <= len(msgList):
        fulltext = fetchmessage(i)
        message  = parser.parseMessage(fulltext)
        ctype, maintext = parser.findMainText(message)
        print '-'*80
        print maintext                 # main text part, not entire mail
        print '-'*80                   # and not any attachments after
    else:
        print 'Bad message number'

def savemessage(i, mailfile, msgList):
    if 1 <= i <= len(msgList):
        fulltext = fetchmessage(i)
        open(mailfile, 'a').write('\n' + fulltext + '-'*80 + '\n')
    else:
        print 'Bad message number'

def msgnum(command):
    try:
        return int(command.split()[1])
    except:
        return -1                      # assume this is bad

helptext = """
Available commands:
i     - index display
l n?  - list all messages (or just message n)
d n?  - mark all messages for deletion (or just message n)
s n?  - save all messages to a file (or just message n)
m     - compose and send a new mail message
q     - quit pymail
?     - display this help text
"""

def interact(msgList, msgSizes, mailfile):
    showindex(msgList, msgSizes)
    toDelete = []
    while 1:
        try:
            command = raw_input('[Pymail] Action? (i, l, d, s, m, q, ?) ')
        except EOFError:
            command = 'q'
        if not command: command = '*'

        if command == 'q':                     # quit
            break

        elif command[0] == 'i':                # index
            showindex(msgList, msgSizes)

        elif command[0] == 'l':                # list
            if len(command) == 1:
                for i in range(1, len(msgList)+1):
                    showmessage(i, msgList)
            else:
                showmessage(msgnum(command), msgList)

        elif command[0] == 's':                # save
            if len(command) == 1:
                for i in range(1, len(msgList)+1):
                    savemessage(i, mailfile, msgList)
            else:
                savemessage(msgnum(command), mailfile, msgList)

        elif command[0] == 'd':                # mark for deletion later
            if len(command) == 1:
                toDelete = range(1, len(msgList)+1)
            else:
                delnum = msgnum(command)
                if (1 <= delnum <= len(msgList)) and (delnum not in toDelete):
                    toDelete.append(delnum)
                else:
                    print 'Bad message number'

        elif command[0] == 'm':                # send a new mail via SMTP
            try:
                sendmessage()
            except:
                print 'Error - mail not sent'

        elif command[0] == '?':
            print helptext
        else:
            print 'What? -- type "?" for commands help'
    return toDelete

def main():
    global parser, sender, fetcher
    import mailtools, mailconfig
    mailserver = mailconfig.popservername
    mailuser   = mailconfig.popusername
    mailfile   = mailconfig.savemailfile

    parser     = mailtools.MailParser()
    sender     = mailtools.MailSender()
    fetcher    = mailtools.MailFetcherConsole(mailserver, mailuser)

    def progress(i, max):
        print i, 'of', max

    hdrsList, msgSizes, ignore = fetcher.downloadAllHeaders(progress)
    msgList = [parser.parseHeaders(hdrtext) for hdrtext in hdrsList]

    print '[Pymail email client]'
    toDelete   = interact(msgList, msgSizes, mailfile)
    if toDelete: deletemessages(toDelete)

if __name__ == '__main__': main()
This program is used interactively, the same as the original. In fact, the output is nearly identical, so we won’t go into further details. Here’s a quick look at this script in action; run this on your own machine to see it firsthand:
C:\...\PP3E\Internet\Email>pymail2.py
user: pp3e
loading headers
Connecting...
Password for pp3e on pop.earthlink.net?
+OK NGPopper vEL_6_10 at earthlink.net ready <[email protected]...
1 of 5
2 of 5
3 of 5
4 of 5
5 of 5
load headers exit
[Pymail email client]
1: 876 bytes
 From=>[email protected]
 Date=>Wed, 08 Feb 2006 05:23:13 -0000
 Subject=>I'm a Lumberjack, and I'm Okay
2: 800 bytes
 From=>[email protected]
 Date=>Wed, 08 Feb 2006 05:24:06 -0000
 Subject=>testing
3: 818 bytes
 From=>[email protected]
 Date=>Tue Feb 07 22:51:08 2006
 Subject=>A B C D E F G
4: 770 bytes
 From=>[email protected]
 Date=>Tue Feb 07 23:19:51 2006
 Subject=>testing smtpmail
5: 819 bytes
 From=>[email protected]
 Date=>Tue Feb 07 23:34:23 2006
 Subject=>a b c d e f g
[Press Enter key]
[Pymail] Action? (i, l, d, s, m, q, ?)l 5
load 5
Connecting...
+OK NGPopper vEL_6_10 at earthlink.net ready <[email protected]...
--------------------------------------------------------------------------------
Spam; Spam and eggs; Spam, spam, and spam
--------------------------------------------------------------------------------
[Pymail] Action? (i, l, d, s, m, q, ?)s 1
load 1
Connecting...
+OK NGPopper vEL_6_10 at earthlink.net ready <[email protected]...
[Pymail] Action? (i, l, d, s, m, q, ?)m
From?[email protected]
To?[email protected]
Subj?test pymail2 send
Type message text, end with line="."
Run away! Run away!
.
Sending to...['[email protected]']
From: [email protected]
To: [email protected]
Subject: test pymail2 send
Date: Wed, 08 Feb 2006 07:09:40 -0000

Run away! Run away!

Send exit
[Pymail] Action? (i, l, d, s, m, q, ?)q
As you can see, this version’s code eliminates some
complexities, such as the manual formatting of composed mail message
text. It also does a better job of displaying a mail’s text—instead
of blindly listing the full mail text (attachments and all), it uses
mailtools
to fetch the first text
part of the message. The messages we’re using are too simple to show
the difference, but for a mail with attachments, this new version
will be more focused about what it displays.
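To see the difference on a message that does carry an attachment, here is a minimal sketch of the same first-text-part search. It is an approximation written for Python 3, not the mailtools code itself: the function names find_main_text and decode_part are hypothetical, and because get_payload(decode=True) returns bytes in Python 3, the sketch decodes them using the part's declared charset.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def decode_part(part):
    """Undo the transfer encoding, then decode bytes to text."""
    raw = part.get_payload(decode=True)           # bytes in Python 3
    charset = part.get_content_charset() or "ascii"
    return raw.decode(charset, errors="replace")


def find_main_text(message):
    """Return (content_type, text) for the first displayable text part."""
    # Prefer plain text, then HTML, as in the listing's heuristic.
    for wanted in ("text/plain", "text/html"):
        for part in message.walk():
            if part.get_content_type() == wanted:
                return wanted, decode_part(part)
    # Fall back on any other text subtype.
    for part in message.walk():
        if part.get_content_maintype() == "text":
            return part.get_content_type(), decode_part(part)
    return "text/plain", "[No text to display]"


# Invented message: a binary attachment, an HTML part, and a plain part.
msg = MIMEMultipart()
msg["Subject"] = "report"
msg.attach(MIMEApplication(b"\x00\x01\x02", Name="data.bin"))
msg.attach(MIMEText("<p>Hello</p>", "html"))
msg.attach(MIMEText("Hello in plain text", "plain"))

print(find_main_text(msg))   # -> ('text/plain', 'Hello in plain text')
```

The plain text part is chosen even though it is attached last, and the binary attachment is never displayed; a client that simply printed the full mail text would show the base64 data as well.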
Moreover, because the interface to mail is encapsulated in the
mailtools
package’s modules, if
it ever must change, it will need to be changed only in that package,
regardless of how many mail clients use its tools. And because this
code is shared, once we know it works for one client, we can be
reasonably confident it will work in another, with far less new code to debug.
On the other hand, pymail2
doesn’t really leverage much of the power of either mailtools
or the underlying email
package it uses. Things like
attachments and inbox synchronization are not handled at all, for
example. To see the full scope of the email
package, we need to explore a larger
email system, such as PyMailGUI or PyMailCGI. The first of these is
the topic of the next chapter, and the second appears in Chapter 17. First, though, let’s
quickly survey a handful of additional client-side protocol
tools.