So far, we’ve stepped through the path the system follows to send new mail. Let’s now see what happens when we try to view incoming POP mail.
If you flip back to the main page in Figure 17-2, you’ll see a View link; pressing it triggers the script in Example 17-6 to run on the server.
Example 17-6. PP3E\Internet\Web\PyMailCgi\cgi-bin\onRootViewLink.py
#!/usr/bin/python
##############################################################
# on view link click on main/root HTML page
# this could almost be an HTML file because there are likely
# no input params yet, but I wanted to use standard header/
# footer functions and display the site/usernames which must
# be fetched; On submission, doesn't send the user along with
# password here, and only ever sends both as URL params or
# hidden fields after the password has been encrypted by a
# user-uploadable encryption module; put HTML in commonhtml?
##############################################################

# page template

pswdhtml = """
<form method=post action=%sonViewPswdSubmit.py>
<p>
Please enter POP account password below, for user "%s" and site "%s".
<p><input name=pswd type=password>
<input type=submit value="Submit"></form></p>

<hr><p><i>Security note</i>: The password you enter above will be transmitted
over the Internet to the server machine, but is not displayed, is never
transmitted in combination with a username unless it is encrypted, and is
never stored anywhere: not on the server (it is only passed along as hidden
fields in subsequent pages), and not on the client (no cookies are generated).
This is still not guaranteed to be totally safe; use your browser's back button
to back out of PyMailCgi at any time.</p>
"""

# generate the password input page

import commonhtml                                         # usual parms case:
user, pswd, site = commonhtml.getstandardpopfields({})    # from module here,
commonhtml.pageheader(kind='POP password input')          # from html|url later
print pswdhtml % (commonhtml.urlroot, user, site)
commonhtml.pagefooter( )
This script is almost all embedded HTML: the triple-quoted pswdhtml string is printed, with string formatting to insert values, in a single step. But because we need to fetch the username and server name to display on the generated page, this is coded as an executable script, not as a static HTML file. The module commonhtml either loads usernames and server names from script inputs (e.g., appended as query parameters to the script’s URL) or imports them from the mailconfig file; either way, we don’t want to hardcode them into this script or its HTML, so a simple HTML file won’t do.
Since this is a script, we can also use the commonhtml page header and footer routines to render the generated reply page with a common look-and-feel, as shown in Figure 17-7.
On this page, the user is expected to enter the password for the POP email account of the user and server displayed. Notice that the actual password isn’t displayed; the input field’s HTML specifies type=password, which works just like a normal text field, but shows typed input as stars. (See also the pymail program in Chapter 14 for doing this at a console and PyMailGUI in Chapter 15 for doing this in a GUI.)
After you fill out the last page’s password field and press its Submit button, the password is shipped off to the script shown in Example 17-7.
Example 17-7. PP3E\Internet\Web\PyMailCgi\cgi-bin\onViewPswdSubmit.py
#!/usr/bin/python
############################################################
# On submit in POP password input window--make view list;
# in 2.0 we only fetch mail headers here, and fetch 1 full
# message later upon request; we still fetch all headers
# each time the index page is made: caching requires a db;
############################################################

import cgi
import loadmail, commonhtml
from   externs import mailtools
from   secret  import encode       # user-defined encoder module
MaxHdr = 35                        # max length of email hdrs in list

# only pswd comes from page here, rest usually in module
formdata = cgi.FieldStorage( )
mailuser, mailpswd, mailsite = commonhtml.getstandardpopfields(formdata)

try:
    newmails = loadmail.loadmailhdrs(mailsite, mailuser, mailpswd)
    mailnum  = 1
    maillist = []
    for mail in newmails:                                     # list of hdr text
        msginfo = []
        hdrs = mailtools.MailParser( ).parseHeaders(mail)     # email.Message
        for key in ('Subject', 'From', 'Date'):
            msginfo.append(hdrs.get(key, '?')[:MaxHdr])
        msginfo = ' | '.join(msginfo)
        maillist.append((msginfo, commonhtml.urlroot + 'onViewListLink.py',
                                  {'mnum': mailnum,
                                   'user': mailuser,          # data params
                                   'pswd': encode(mailpswd),  # pass in URL
                                   'site': mailsite}))        # not inputs
        mailnum += 1
    commonhtml.listpage(maillist, 'mail selection list')
except:
    commonhtml.errorpage('Error loading mail index')
This script’s main purpose is to generate a selection list page for the user’s email account, using the password typed into the prior page (or passed in a URL). As usual with encapsulation, most of the details are hidden in other files:
loadmail.loadmailhdrs
Reuses the mailtools module package from Chapter 14 to fetch email with the POP protocol; we need a message count and mail headers here to display an index list. In this version, the software fetches only mail header text to save time, not full mail messages (provided your server supports the TOP command of the POP interface; if not, see mailconfig to disable this).
commonhtml.listpage
Generates HTML to display a passed-in list of tuples (text, URL, parameter-dictionary) as a list of hyperlinks in the reply page; parameter values show up as query parameters at the end of URLs in the response.
The maillist list built here is used to create the body of the next page: a clickable email message selection list. Each generated hyperlink in the list page references a constructed URL that contains enough information for the next script to fetch and display a particular email message. As we learned in the last chapter, this is a simple kind of state retention between pages and scripts.
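The way a list of (text, URL, parameter-dictionary) tuples can become a page of stateful links may be sketched as follows. This is only an illustration written for Python 3 (where urlencode lives in urllib.parse); the function name link_rows, the sample header text, and the pop.example.net site are assumptions made for the example, and the real commonhtml.listpage may be coded differently.

```python
import html
from urllib.parse import urlencode

def link_rows(linklist):
    """Turn (text, url, params) tuples into HTML table rows with View links.

    Hypothetical helper for illustration; not the book's commonhtml.listpage.
    """
    rows = []
    for num, (text, url, params) in enumerate(linklist, start=1):
        # state travels in the query string appended after '?'
        href = '%s?%s' % (url, urlencode(params))
        rows.append('<tr><th><a href="%s">View</a> %d<td>%s'
                    % (href, num, html.escape(text)))
    return rows

# sample data in the same shape as the maillist built by the script
maillist = [('Hello | a@b | Mon', 'onViewListLink.py',
             {'mnum': 1, 'user': 'pp3e', 'site': 'pop.example.net'})]
rows = link_rows(maillist)
print(rows[0])
```

Each row carries everything the next script needs, so no server-side memory is required between the two requests.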
If all goes well, the mail selection list page HTML generated by this script is rendered as in Figure 17-8. If your inbox is as large as some of mine, you’ll probably need to scroll down to see the end of this page. This page follows the common look-and-feel for all PyMailCGI pages, thanks to commonhtml.
If the script can’t access your email account (e.g., because you typed the wrong password), its try statement handler instead produces a commonly formatted error page. Figure 17-9 shows one that gives the Python exception and details as part of the reply after a genuine exception is caught; as usual, the exception details are fetched from sys.exc_info, and Python’s traceback module is used to generate a stack trace.
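As a rough sketch of how such an error reply might be assembled, the following Python 3 code pulls the active exception from sys.exc_info and formats a stack trace with the traceback module. The function name error_report and the HTML layout are assumptions for illustration; the actual commonhtml.errorpage appears later in the chapter and may differ.

```python
import html
import sys
import traceback

def error_report(message):
    """Build an HTML fragment describing the exception being handled.

    Hypothetical helper; call it only from inside an except block.
    """
    exc_type, exc_value, exc_tb = sys.exc_info()
    trace = ''.join(traceback.format_exception(exc_type, exc_value, exc_tb))
    return '\n'.join([
        '<h2>%s</h2>' % html.escape(message),
        '<p>Exception type: %s</p>' % html.escape(str(exc_type)),
        '<p>Exception details: %s</p>' % html.escape(str(exc_value)),
        '<pre>%s</pre>' % html.escape(trace),   # escape: text goes into HTML
    ])

try:
    raise ValueError('bad password')            # stand-in for a failed POP login
except Exception:
    page = error_report('Error loading mail index')

print(page)
```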
The central mechanism at work in Example 17-7 is the generation of URLs that embed message numbers and mail account information. Clicking on any of the View links in the selection list triggers another script, which uses information in the link’s URL parameters to fetch and display the selected email. As mentioned in Chapter 16, because the list’s links are programmed to “know” how to load a particular message, they effectively remember what to do next. Figure 17-10 shows part of the HTML generated by this script (use your web browser’s View Source option to see this for yourself).
Did you get all that? You may not be able to read generated HTML like this, but your browser can. For the sake of readers afflicted with human-parsing limitations, here is what one of those link lines looks like, reformatted with line breaks and spaces to make it easier to understand:
<tr><th><a href="onViewListLink.py?
                  pswd=%C5%D9%E3%E1%A5%2C%C0D%A4u%AD%A3d%96%7EB&
                  mnum=8&
                  user=pp3e&
                  site=pop.earthlink.net">View</a> 8
<td>America MP3 file | [email protected] | Wed, 18 Jan 2006 15:45:10 -0500 (ES
PyMailCGI generates minimal, relative URLs (server and pathname values come from the prior page, unless set in commonhtml). Clicking on the word View in the hyperlink rendered from this HTML code triggers the onViewListLink script as usual, passing it all the parameters embedded at the end of the URL: the POP username, the POP message number of the message associated with this link, and the POP password and site information. These values will be available in the object returned by cgi.FieldStorage in the next script run. Note that the mnum POP message number parameter differs in each link because each opens a different message when clicked, and that the text after <td> comes from message headers extracted by the mailtools package, using the email package.
The commonhtml module escapes all of the link parameters with the urllib module, not cgi.escape, because they are part of a URL. This is obvious only in the pswd password parameter: its value has been encrypted, but urllib additionally escapes nonsafe characters in the encrypted string per URL convention (that’s where all the %xx characters come from). It’s OK if the encryptor yields odd (even nonprintable) characters because URL encoding makes them legible for transmission. When the password reaches the next script, cgi.FieldStorage undoes URL escape sequences, leaving the encrypted password string without % escapes.
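The escape-and-unescape round trip just described can be tried in isolation. The sketch below assumes Python 3, where quote_plus lives in urllib.parse and where parse_qs stands in for the unescaping that cgi.FieldStorage performs on the receiving side; the sample password string is made up for the demonstration.

```python
from urllib.parse import quote_plus, parse_qs

# a made-up value containing characters that are special in URLs
encrypted = 'p@ss w&rd=#1'

# sending side: escape the value before adding it to a link
escaped = quote_plus(encrypted)
query = 'pswd=%s&mnum=8' % escaped
print(query)

# receiving side: the query-string parser undoes %xx and '+' escapes
fields = parse_qs(query)
received = fields['pswd'][0]
print(received == encrypted)
```

Because the '&' and '=' inside the value are escaped, they cannot be mistaken for parameter separators when the query string is split apart.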
It’s instructive to see how commonhtml builds up the stateful link parameters. Earlier, we learned how to use the urllib.quote_plus call to escape a string for inclusion in URLs:
>>> import urllib
>>> urllib.quote_plus("There's bugger all down here on Earth")
'There%27s+bugger+all+down+here+on+Earth'
The module commonhtml, though, calls the higher-level urllib.urlencode function, which translates a dictionary of name:value pairs into a complete URL query parameter string, ready to add after a ? marker in a URL. For instance, here is urlencode in action at the interactive prompt:
>>> parmdict = {'user': 'Brian',
...             'pswd': '#!/spam',
...             'text': 'Say no more, squire!'}
>>> urllib.urlencode(parmdict)
'pswd=%23%21/spam&user=Brian&text=Say+no+more,+squire%21'
>>> "%s?%s" % ("http://scriptname.py", urllib.urlencode(parmdict))
'http://scriptname.py?pswd=%23%21/spam&user=Brian&text=Say+no+more,+squire%21'
Internally, urlencode passes each name and value in the dictionary to the built-in str function (to make sure they are strings), and then runs each one through urllib.quote_plus as they are added to the result. The CGI script builds up a list of similar dictionaries and passes it to commonhtml to be formatted into a selection list page.[*]
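To make that description concrete, here is a hand-rolled equivalent compared against the library call. It is a sketch for Python 3 (urllib.parse rather than the urllib module used in the session above), and simple_urlencode is a hypothetical name; the exact escapes Python 3 produces can also differ slightly from the older output shown in the text.

```python
from urllib.parse import quote_plus, urlencode

def simple_urlencode(params):
    """Hypothetical re-creation of urlencode for flat dictionaries:
    str() each name and value, escape with quote_plus, join with '&'."""
    return '&'.join('%s=%s' % (quote_plus(str(name)), quote_plus(str(value)))
                    for name, value in params.items())

parmdict = {'user': 'Brian', 'pswd': '#!/spam', 'mnum': 8}

ours = simple_urlencode(parmdict)
theirs = urlencode(parmdict)
print(ours)
print(ours == theirs)
```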
In broader terms, generating URLs with parameters like this is one way to pass state information to the next script (along with cookies, hidden form input fields, and server databases, discussed in Chapter 16). Without such state information, users would have to reenter the username, password, and site name on every page they visit along the way.
Incidentally, the list generated by this script is not radically different in functionality from what we built in the PyMailGUI program in Chapter 15, though the two differ cosmetically. Figure 17-11 shows this strictly client-side GUI’s view on the same email list displayed in Figure 17-8.
However, PyMailGUI uses the Tkinter GUI library to build up a user interface instead of sending HTML to a browser. It also runs entirely on the client and downloads mail from the POP server directly to the client machine over sockets on demand. Because it retains memory for the duration of the session, PyMailGUI can easily minimize mail server access. After the initial header load, it needs to load only newly arrived email headers on load requests. Moreover, it can update its email index in-memory on deletions instead of reloading anew from the server, and it has enough state to perform safe deletions of messages that check for server inbox matches. PyMailGUI also remembers emails you’ve already viewed—they need not be reloaded again while the program runs.
In contrast, PyMailCGI runs on the web server machine and simply displays mail text on the client’s browser—mail is downloaded from the POP server machine to the web server, where CGI scripts are run. Due to the autonomous nature of CGI scripts, PyMailCGI by itself has no automatic memory that spans pages and may need to reload headers and already viewed messages during a single session. These architecture differences have some important ramifications, which we’ll discuss later in this chapter.
In onViewPswdSubmit’s source code (Example 17-7), notice that password inputs are passed to an encode function as they are added to the parameters dictionary; hence they show up encrypted in hyperlinked URLs. They are also URL encoded for transmission (with % escapes) and are later decoded and decrypted within other scripts as needed to access the POP account. The password encryption step, encode, is at the heart of PyMailCGI’s security policy.
In Python today, the standard library’s socket module supports Secure Sockets Layer (SSL), if the required library is built into your Python. SSL automatically encrypts transmitted data to make it safe to pass over the Net. Unfortunately, for reasons we’ll discuss when we reach the secret.py module later in this chapter (see Example 17-13), this wasn’t a universal solution for PyMailCGI’s password data. (In short, the web server we’re using doesn’t directly support its end of a secure HTTP encrypted dialog.) Because of that, an alternative scheme was devised to minimize the chance that email account information could be stolen off the Net in transit.
Here’s how it works. When this script is invoked by the password input page’s form, it gets only one input parameter: the password typed into the form. The username is imported from a mailconfig module installed on the server; it is not transmitted together with the unencrypted password (that could be intercepted).
To pass the POP username and password to the next page as state information, this script adds them to the end of the mail selection list URLs, but only after the password has been encrypted by secret.encode, a function in a module that lives on the server and may vary in every location that PyMailCGI is installed. In fact, PyMailCGI was written to not have to know about the password encryptor at all; because the encoder is a separate module, you can provide any flavor you like. Unless you also publish your encoder module, the encoded password shipped with the username won’t be of much help to snoopers.
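To show what a pluggable encoder module has to provide, the following Python 3 sketch defines a matching encode and decode pair. It is purely a toy: the XOR scheme and the KEY value are assumptions invented for this example, they are not the book’s secret.py, and they offer obfuscation rather than real cryptographic protection.

```python
# Toy stand-in for a site-specific encoder module (hypothetical, NOT secure).
KEY = 'hypothetical-site-key'      # assumed per-installation secret

def encode(pswd, key=KEY):
    """Scramble each character by XOR-ing it with a repeating key."""
    return ''.join(chr(ord(char) ^ ord(key[i % len(key)]))
                   for i, char in enumerate(pswd))

def decode(pswd, key=KEY):
    """XOR is its own inverse, so decoding reuses encode."""
    return encode(pswd, key)

scrambled = encode('guess')
print(repr(scrambled))             # may contain nonprintable characters
print(decode(scrambled))
```

The only contract the calling scripts rely on is that decode reverses encode; any scheme with that property could be dropped in.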
The upshot is that normally, PyMailCGI never sends or receives both username and password values together in a single transaction, unless the password is encrypted with an encryptor of your choice. This limits its utility somewhat (since only a single account username can be installed on the server), but the alternative of popping up two pages (one for password entry and one for user) seems even less friendly. In general, if you want to read your mail with the system as coded, you have to install its files on your server, edit its mailconfig.py to reflect your account details, and change its secret.py encryptor as desired.
One exception: since any CGI script can be invoked with parameters in an explicit URL instead of form field values, and since commonhtml tries to fetch inputs from the form object before importing them from mailconfig, it is possible for any person to use this script to check their mail without installing and configuring a copy of PyMailCGI. For example, a URL such as the following typed into your browser’s address field or submitted with tools such as urllib (but without the line break used to make it fit here):
http://localhost:8000/cgi-bin/
    onViewPswdSubmit.py?user=pp3e&pswd=guess&site=pop.earthlink.net
will actually load email into a selection list page such as that in Figure 17-8, using whatever user, password, and mail site names are appended to the URL. From the selection list, you may then view, reply, forward, and delete email.
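The lookup order that makes this possible (inputs first, configuration module second) might be sketched like this in Python 3. Everything here is illustrative: MailConfig is a stand-in class with assumed attribute names and values, parse_qs plays the role of the parsed form, and get_standard_pop_fields is not the actual commonhtml.getstandardpopfields code.

```python
from urllib.parse import parse_qs

class MailConfig:
    """Hypothetical stand-in for the server-side mailconfig module."""
    popusername = 'pp3e'
    popservername = 'pop.example.net'

NO_PASSWORD = None    # assumed placeholder when no password was supplied

def get_standard_pop_fields(fields, config=MailConfig):
    """Prefer values sent by the client; fall back to the config module."""
    user = fields.get('user', [config.popusername])[0]
    site = fields.get('site', [config.popservername])[0]
    pswd = fields.get('pswd', [NO_PASSWORD])[0]
    return user, pswd, site

# form submission: only the password arrives, the rest comes from config
from_form = get_standard_pop_fields(parse_qs('pswd=guess'))

# explicit URL: all three values are supplied as query parameters
from_url = get_standard_pop_fields(
    parse_qs('user=someone&pswd=guess&site=pop.other.example'))

print(from_form)
print(from_url)
```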
Notice that at this point in the interaction, the password you send in a URL of this form is not encrypted. Later scripts expect that the password inputs will be sent encrypted, though, which makes it more difficult to use them with explicit URLs (you would need to match the encrypted form produced by the secret module on the server). Passwords are encrypted as they are added to links in the reply page’s selection list, and they remain encrypted in URLs and hidden form fields thereafter.
But you shouldn’t use a URL like this, unless you don’t care about exposing your email password. Sending your unencrypted mail user ID and password strings across the Net in a URL such as this is unsafe and open to snoopers. In fact, it’s like giving away your email: anyone who intercepts this URL, or views it in a server logfile, will have complete access to your email account. It is made even more treacherous by the fact that this URL format appears in a book that will be distributed all around the world.
If you care about security and want to use PyMailCGI on a remote server, install it on your server and configure mailconfig and secret. That should at least guarantee that both your user and password information will never be transmitted unencrypted in a single transaction. This scheme may still not be foolproof, so be careful out there. Without secure HTTP and sockets, the Internet is a “use at your own risk” medium.
Back to our page flow; at this point, we are still viewing the message selection list in Figure 17-8. When we click on one of its generated hyperlinks, the stateful URL invokes the script in Example 17-8 on the server, sending the selected message number and mail account information (user, password, and site) as parameters on the end of the script’s URL.
Example 17-8. PP3E\Internet\Web\PyMailCgi\cgi-bin\onViewListLink.py
#!/usr/bin/python
##############################################################
# On user click of message link in main selection list;
# cgi.FieldStorage undoes any urllib escapes in the link's
# input parameters (%xx and '+' for spaces already undone);
# in 2.0 we only fetch 1 mail here, not entire list again!
# in 2.0 we also find mail's main text part intelligently
# instead of blindly displaying full text (poss attachments),
# and generate links to attachment files saved on the server;
# saved attachment files only work for 1 user and 1 message;
# most 2.0 enhancements inherited from the mailtools package;
##############################################################

import cgi
import commonhtml, secret
from   externs import mailtools
#commonhtml.dumpstatepage(0)

def saveAttachments(message, parser, savedir='partsdownload'):
    """
    save fetched email's parts to files on
    server to be viewed in user's web browser
    """
    import os
    if not os.path.exists(savedir):            # in CGI script's cwd on server
        os.mkdir(savedir)                      # will open per your browser
    for filename in os.listdir(savedir):       # clean up last message: temp!
        dirpath = os.path.join(savedir, filename)
        os.remove(dirpath)
    typesAndNames = parser.saveParts(savedir, message)
    filenames = [fname for (ctype, fname) in typesAndNames]
    for filename in filenames:
        os.chmod(filename, 0666)               # some srvrs may need read/write
    return filenames

form = cgi.FieldStorage( )
user, pswd, site = commonhtml.getstandardpopfields(form)
pswd = secret.decode(pswd)

try:
    msgnum   = form['mnum'].value                               # from URL link
    parser   = mailtools.MailParser( )
    fetcher  = mailtools.SilentMailFetcher(site, user, pswd)
    fulltext = fetcher.downloadMessage(int(msgnum))             # don't eval!
    message  = parser.parseMessage(fulltext)                    # email.Message
    parts    = saveAttachments(message, parser)                 # for URL links
    mtype, content = parser.findMainText(message)               # first txt part
    commonhtml.viewpage(msgnum, message, content, form, parts)  # encoded pswd
except:
    commonhtml.errorpage('Error loading message')
Again, much of the work here happens in the commonhtml module, listed later in this section (see Example 17-14). This script adds logic to decode the input password (using the configurable secret encryption module) and extract the selected mail’s headers and text using the mailtools module package from Chapter 14 again. The full text of the selected message is ultimately fetched and parsed by mailtools, using the standard library’s poplib module and email package. Although we’ll have to refetch this message if viewed again, this version does not grab all mails to get just the one selected.[*]
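The int(msgnum) conversion in Example 17-8 (marked “don’t eval!”) deserves a small illustration. The Python 3 sketch below uses a hypothetical helper, parse_msgnum, to show that int simply rejects anything that is not a number, whereas eval would run whatever expression arrived in the URL.

```python
def parse_msgnum(raw):
    """Convert a message number received from a URL or form field.

    Hypothetical helper: int() raises ValueError for non-numeric text
    instead of executing it, which is why eval() must not be used here.
    """
    num = int(raw)
    if num < 1:
        raise ValueError('message numbers start at 1: %r' % raw)
    return num

print(parse_msgnum('8'))

# a hostile "message number" is rejected, never executed
hostile = "__import__('os').getcwd()"
try:
    parse_msgnum(hostile)
except ValueError as exc:
    print('rejected:', exc)
```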
Also new in this version, the saveAttachments function in this script splits off the parts of a fetched message and stores them in a directory on the web server machine. This was discussed earlier in this chapter; the view page is then augmented with URL links that point at the saved part files. Your web browser will open them according to their filenames. All the work of part extraction and naming is inherited from mailtools. Part files are kept temporarily; they are deleted when the next message is fetched. They are also currently stored in a single directory and so apply to only a single user.
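The clean-then-save cycle described above can be approximated with the standard library alone. This Python 3 sketch is an assumption-laden stand-in for what mailtools does: save_parts, the fallback part-NNN.txt naming, and the sample message are all invented for illustration.

```python
import os
import tempfile
from email.message import EmailMessage

def save_parts(message, savedir):
    """Empty savedir, then write each non-container part of message to it.

    Hypothetical helper; the book's mailtools package does the real work.
    """
    os.makedirs(savedir, exist_ok=True)
    for name in os.listdir(savedir):              # drop the prior message's files
        os.remove(os.path.join(savedir, name))
    saved = []
    for num, part in enumerate(message.walk()):
        if part.get_content_maintype() == 'multipart':
            continue                              # containers hold no data
        # basename() keeps a sender-supplied name inside savedir
        name = os.path.basename(part.get_filename() or 'part-%03d.txt' % num)
        path = os.path.join(savedir, name)
        with open(path, 'wb') as partfile:
            partfile.write(part.get_payload(decode=True) or b'')
        os.chmod(path, 0o666)                     # readable by the web server
        saved.append(path)
    return saved

# build a small message with one attachment to exercise the function
msg = EmailMessage()
msg['Subject'] = 'demo'
msg.set_content('main text')
msg.add_attachment(b'\x00\x01binary', maintype='application',
                   subtype='octet-stream', filename='data.bin')

savedir = os.path.join(tempfile.mkdtemp(), 'partsdownload')
files = save_parts(msg, savedir)
print([os.path.basename(path) for path in files])
```

Because the directory is emptied on every call, the saved files always belong to the most recently viewed message, which is the single-user limitation the text mentions.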
If the message can be loaded and parsed successfully, the result page, shown in Figure 17-12, allows us to view, but not edit, the mail’s text. The function commonhtml.viewpage generates a “read-only” HTML option for all the text widgets in this page.
View pages like this have a pull-down action selection list near the bottom; if you want to do more, use this list to pick an action (Reply, Forward, or Delete) and click on the Next button to proceed to the next screen. If you’re just in a browsing frame of mind, click the “Back to root page” link at the bottom to return to the main page, or use your browser’s Back button to return to the selection list page.
Figure 17-12 shows the mail we sent earlier in this chapter, being viewed (we sent it to ourselves). Notice its “Parts:” links: when clicked, they trigger URLs that open the temporary part files on the server, according to your browser’s rules for the file type. For instance, clicking on the “doc” file will likely open it in Microsoft Word on Windows; selecting the “jpg” link will open it either in a local image viewer or within the browser itself, as captured in Figure 17-13.
What you don’t see on the view page in Figure 17-12 is just as important as what you do see. We need to defer to Example 17-14 for coding details, but something new is going on here. The original message number, as well as the POP user and (still encrypted) password information sent to this script as part of the stateful link’s URL, wind up being copied into the HTML used to create this view page, as the values of hidden input fields in the form. The hidden field generation code in commonhtml looks like this:
print '<form method=post action="%s/onViewPageAction.py">' % urlroot
print '<input type=hidden name=mnum value="%s">' % msgnum
print '<input type=hidden name=user value="%s">' % user    # from page|url
print '<input type=hidden name=site value="%s">' % site    # for deletes
print '<input type=hidden name=pswd value="%s">' % pswd    # pswd encoded
As we’ve learned, much like parameters in generated hyperlink URLs, hidden fields in a page’s HTML allow us to embed state information inside this web page itself. Unless you view that page’s source, you can’t see this state information because hidden fields are never displayed. But when this form’s Submit button is clicked, hidden field values are automatically transmitted to the next script along with the visible fields on the form.
Figure 17-14 shows part of the source code generated for another message’s view page; the hidden input fields used to pass selected mail state information are embedded near the top.
The net effect is that hidden input fields in HTML, just like parameters at the end of generated URLs, act like temporary storage areas and retain state between pages and user interaction steps. Both are the Web’s equivalent to programming language variables. They come in handy anytime your application needs to remember something between pages.
Hidden fields are especially useful if you cannot invoke the next script from a generated URL hyperlink with parameters. For instance, the next action in our script is a form submit button (Next), not a hyperlink, so hidden fields are used to pass state. As before, without these hidden fields, users would need to reenter POP account details somewhere on the view page if they were needed by the next script (in our example, they are required if the next action is Delete).
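A generic version of this hidden-field technique could look like the following Python 3 sketch. The function name hidden_fields and the sample state values are assumptions for illustration, and html.escape is used here in place of the cgi.escape call mentioned in the text.

```python
import html

def hidden_fields(state):
    """Render a dictionary of state as hidden HTML inputs.

    Hypothetical helper: values are HTML-escaped (including quotes) so
    that odd characters cannot break out of the value="..." attribute.
    """
    return '\n'.join(
        '<input type=hidden name=%s value="%s">'
        % (name, html.escape(str(value), quote=True))
        for name, value in state.items())

# made-up state; the password stands for an already-encoded string
state = {'mnum': 8, 'user': 'pp3e', 'pswd': 'a"b<c'}
fields_html = hidden_fields(state)
print(fields_html)
```

When the enclosing form is submitted, the browser sends these names and values along with the visible fields, so the next script receives the state without the user retyping anything.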
Notice that everything you see on the message view page in Figure 17-14 is escaped with cgi.escape. Header fields and the text of the mail itself might contain characters that are special to HTML and must be translated as usual. For instance, because some mailers allow you to send messages in HTML format, it’s possible that an email’s text could contain a </textarea> tag, which would throw the reply page hopelessly out of sync if not escaped.
One subtlety here: HTML escapes are important only when text is sent to the browser initially (by the CGI script). If that text is later sent out again to another script (e.g., by sending a reply mail), the text will be back in its original, nonescaped format when received again on the server. The browser parses out escape codes and does not put them back again when uploading form data, so we don’t need to undo escapes later. For example, here is part of the escaped text area sent to a browser during a Reply transaction (use your browser’s View Source option to see this live):
<tr><th align=right>Text:
<td><textarea name=text cols=80 rows=10 readonly>
more stuff
--Mark Lutz  (http://rmi.net/~lutz)  [PyMailCgi 2.0]
&gt; -----Original Message-----
&gt; From: [email protected]
&gt; To: [email protected]
&gt; Date: Tue May 2 18:28:41 2000
&gt;
&gt; &lt;table&gt;&lt;textarea&gt;
&gt; &lt;/textarea&gt;&lt;/table&gt;
&gt; --Mark Lutz  (http://rmi.net/~lutz)  [PyMailCgi 2.0]
&gt;
&gt;
&gt; &gt; -----Original Message-----
After this reply is delivered, its text looks as it did before escapes (and exactly as it appeared to the user in the message edit web page):
more stuff
--Mark Lutz  (http://rmi.net/~lutz)  [PyMailCgi 2.0]
> -----Original Message-----
> From: [email protected]
> To: [email protected]
> Date: Tue May 2 18:28:41 2000
>
> <table><textarea>
> </textarea></table>
> --Mark Lutz  (http://rmi.net/~lutz)  [PyMailCgi 2.0]
>
>
> > -----Original Message-----
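The same escape round trip can be reproduced in a few lines of Python 3. In this sketch html.escape plays the part of cgi.escape on the way out, and html.unescape imitates what the browser does when it parses the page; the quoted text is a shortened, made-up sample.

```python
import html

# quoted reply text containing tags that would confuse the page
original = '> <table><textarea>\n> </textarea></table>'

# server side: escape before writing the text inside a <textarea>
escaped = html.escape(original)
print(escaped)

# browser side (simulated): entities are parsed back to plain characters,
# so the text later uploaded in form data is unescaped again
uploaded = html.unescape(escaped)
print(uploaded == original)
```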
Did you notice the odd characters in the hidden password field of the generated HTML screenshot (Figure 17-14)? It turns out that the POP password is still encrypted when placed in hidden fields of the HTML. For security, it has to be. Values of a page’s hidden fields can be seen with a browser’s View Source option, and it’s not impossible that the text of this page could be intercepted off the Net.
The password is no longer URL encoded when put in the hidden field, however, even though it was when it appeared at the end of the smart link URL. Depending on your encryption module, the password might now contain nonprintable characters when generated as a hidden field value here; the browser doesn’t care, as long as the field is run through cgi.escape like everything else added to the HTML reply stream. The commonhtml module is careful to route all text and headers through cgi.escape as the view page is constructed.
As a comparison, Figure 17-15 shows what the mail message captured in Figure 17-12 looks like when viewed in PyMailGUI, the client-side Tkinter-based email tool from Chapter 15. In that program, message parts are listed with the Parts button and are extracted, saved, and opened with the Split button; we also get quick-access buttons to parts and attachments just below the message headers. The net effect is similar.
PyMailGUI doesn’t need to care about things such as passing state in URLs or hidden fields (it saves state in Python in-process variables) or escaping HTML and URL strings (there are no browsers, and no network transmission steps once mail is downloaded). It also doesn’t have to rely on temporary server file links to give access to message parts—the message is retained in memory attached to a window object and lives on between interactions. PyMailGUI does require Python to be installed on the client, but we’ll return to that in a few pages.
[*] Technically, again, you should generally escape & separators in generated URL links by running the URL through cgi.escape, if any parameter’s name could be the same as that of an HTML character escape code (e.g., &amp=high). See Chapter 16 for more details; they aren’t escaped here because there are no clashes between URL and HTML.
[*] Notice that the message number arrives as a string and must be converted to an integer in order to be used to fetch the message. But we’re careful not to convert with eval here, since this is a string passed over the Net and could have arrived embedded at the end of an arbitrary URL (remember that earlier warning?).