Let’s get back to writing some code again. It’s time for something a bit more useful than the examples we’ve seen so far (well, more entertaining, at least). This section presents a program that displays the basic syntax required by various programming languages to print the string “Hello World,” the classic language benchmark.
To keep it simple, this example assumes that the string is printed to the standard output stream in the selected language, not to a GUI or web page. It also gives just the output command itself, not the complete programs. The Python version happens to be a complete program, but we won’t hold that against its competitors here.
Structurally, the first cut of this example consists of a main page HTML file, along with a Python-coded CGI script that is invoked by a form in the main HTML page. Because no state or database data is stored between user clicks, this is still a fairly simple example. In fact, the main HTML page implemented by Example 16-17 is mostly just one big pull-down selection list within a form.
Example 16-17. PP3E\Internet\Web\languages.html
<html><title>Languages</title>
<body>
<h1>Hello World selector</h1>
<P>This demo shows how to display a "hello world" message in various
programming languages' syntax. To keep this simple, only the output command
is shown (it takes more code to make a complete program in some of these
languages), and only text-based solutions are given (no GUI or HTML
construction logic is included). This page is a simple HTML file; the one
you see after pressing the button below is generated by a Python CGI script
which runs on the server. Pointers:
<UL>
<LI>To see this page's HTML, use the 'View Source' command in your browser.
<LI>To view the Python CGI script on the server,
    <A HREF="cgi-bin/languages-src.py">click here</A> or
    <A HREF="cgi-bin/getfile.py?filename=cgi-bin/languages.py">here</A>.
<LI>To see an alternative version that generates this page dynamically,
    <A HREF="cgi-bin/languages2.py">click here</A>.
</UL></P>

<hr>
<form method=POST action="cgi-bin/languages.py">
    <P><B>Select a programming language:</B>
    <P><select name=language>
        <option>All
        <option>Python
        <option>Perl
        <option>Tcl
        <option>Scheme
        <option>SmallTalk
        <option>Java
        <option>C
        <option>C++
        <option>Basic
        <option>Fortran
        <option>Pascal
        <option>Other
    </select>
    <P><input type=Submit>
</form>

</body></html>
For the moment, let’s ignore some of the hyperlinks near the middle of this file; they introduce bigger concepts like file transfers and maintainability that we will explore in the next two sections. When visited with a browser, this HTML file is downloaded to the client and is rendered into the new browser page shown in Figure 16-21.
That widget above the Submit button is a pull-down selection list that lets you choose one of the <option> tag values in the HTML file. As usual, selecting one of these language names and pressing the Submit button at the bottom (or pressing your Enter key) sends the selected language name to an instance of the server-side CGI script program named in the form’s action option. Example 16-18 contains the Python script that is run by the web server upon submission.
Example 16-18. PP3E\Internet\Web\cgi-bin\languages.py
#!/usr/bin/python
#############################################################################
# show hello world syntax for input language name; note that it uses r'...'
# raw strings so that '\n' in the table are left intact, and cgi.escape()
# on the string so that things like '<<' don't confuse browsers--they are
# translated to valid HTML code; any language name can arrive at this script,
# since explicit URLs "http://servername/cgi-bin/languages.py?language=Cobol"
# can be typed in a web browser or sent by a script (e.g., urllib.urlopen).
# caveats: the languages list appears in both the CGI and HTML files--could
# import from single file if selection list generated by a CGI script too;
#############################################################################

debugme  = False                                  # True=test from cmd line
inputkey = 'language'                             # input parameter name

hellos = {
    'Python':    r" print 'Hello World' ",
    'Perl':      r' print "Hello World\n"; ',
    'Tcl':       r' puts "Hello World" ',
    'Scheme':    r' (display "Hello World") (newline) ',
    'SmallTalk': r" 'Hello World' print. ",
    'Java':      r' System.out.println("Hello World"); ',
    'C':         r' printf("Hello World\n"); ',
    'C++':       r' cout << "Hello World" << endl; ',
    'Basic':     r' 10 PRINT "Hello World" ',
    'Fortran':   r" print *, 'Hello World' ",
    'Pascal':    r" WriteLn('Hello World'), "
}

class dummy:                                      # mocked-up input obj
    def __init__(self, str): self.value = str

import cgi, sys
if debugme:
    form = {inputkey: dummy(sys.argv[1])}         # name on cmd line
else:
    form = cgi.FieldStorage()                     # parse real inputs

print 'Content-type: text/html\n'                 # adds blank line
print '<TITLE>Languages</TITLE>'
print '<H1>Syntax</H1><HR>'

def showHello(form):                              # HTML for one language
    choice = form[inputkey].value
    print '<H3>%s</H3><P><PRE>' % choice
    try:
        print cgi.escape(hellos[choice])
    except KeyError:
        print "Sorry--I don't know that language"
    print '</PRE></P><BR>'

if not form.has_key(inputkey) or form[inputkey].value == 'All':
    for lang in hellos.keys():
        mock = {inputkey: dummy(lang)}
        showHello(mock)
else:
    showHello(form)
print '<HR>'
And as usual, this script prints HTML code to the standard output stream to produce a response page in the client’s browser. Not much is new to speak of in this script, but it employs a few techniques that merit special focus:
Notice the use of raw strings (string constants preceded by an “r” character) in the language syntax dictionary. Recall that raw strings retain backslash characters (\) in the string literally, instead of interpreting them as string escape-code introductions. Without them, the \n newline character sequences in some of the languages’ code snippets would be interpreted by Python as line feeds, instead of being printed in the HTML reply as \n. The code also uses double quotes for strings that embed an unescaped single-quote character, per Python’s normal string rules.
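The effect of the r prefix is easy to verify on its own. The following is a minimal sketch written in Python 3 syntax (print is a function there, unlike in this chapter's Python 2 listings); the variable names raw_c and cooked_c are made up for illustration.

```python
# Raw versus normal string literals (Python 3 syntax).
raw_c    = r' printf("Hello World\n"); '   # backslash and 'n' stay as two characters
cooked_c =  ' printf("Hello World\n"); '   # \n collapses into one line-feed character

print(len(raw_c) - len(cooked_c))          # the raw version is one character longer
print('\\n' in raw_c)                      # a literal backslash-n survives in the raw string
print('\n' in cooked_c)                    # a real newline is embedded in the normal string
```

Printing raw_c shows the snippet exactly as a C programmer would type it, which is what the reply page needs.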
This script takes care to format the text of each language’s code snippet with the cgi.escape utility function. This standard Python utility automatically translates characters that are special in HTML into HTML escape code sequences, so that they are not treated as HTML operators by browsers. Formally, cgi.escape translates characters to escape code sequences, according to the standard HTML convention: <, >, and & become &lt;, &gt;, and &amp;. If you pass a second true argument, the double-quote character (") is translated to &quot;.
For example, the << left-shift operator in the C++ entry is translated to &lt;&lt;, a pair of HTML escape codes. Because printing each code snippet effectively embeds it in the HTML response stream, we must escape any special HTML characters it contains. HTML parsers (including Python’s standard htmllib module) translate escape codes back to the original characters when a page is rendered.
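As a rough illustration of these translations, here is a sketch in Python 3 syntax. It uses html.escape rather than cgi.escape, because cgi.escape was removed from recent Python 3 releases; note that html.escape translates quote characters by default, so quote=False is passed to mimic the one-argument cgi.escape call used in this chapter's script. The variable names are illustrative only.

```python
import html

snippet = ' cout << "Hello World" << endl; '

# quote=False: only &, <, and > are translated, like cgi.escape(text)
plain = html.escape(snippet, quote=False)
print(plain)

# quote=True (the html.escape default) also translates quote characters
quoted = html.escape(snippet, quote=True)
print(quoted)

# An HTML parser reverses the translation; html.unescape does the same in code
restored = html.unescape(quoted)
print(restored == snippet)
```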
More generally, because CGI is based upon the notion of passing formatted strings across the Net, escaping special characters is a ubiquitous operation. CGI scripts almost always need to escape text generated as part of the reply to be safe. For instance, if we send back arbitrary text input from a user or read from a data source on the server, we usually can’t be sure whether it will contain HTML characters, so we must escape it just in case.
In later examples, we’ll also find that characters inserted into URL address strings generated by our scripts may need to be escaped as well. A literal & in a URL is special, for example, and must be escaped if it appears embedded in text we insert into a URL. However, URL syntax reserves different special characters than HTML code, and so different escaping conventions and tools must be used. As we’ll see later in this chapter, cgi.escape implements escape translations in HTML code, but urllib.quote (and its relatives) escapes characters in URL strings.
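To make the contrast concrete, the sketch below escapes the same text both ways. It is written for Python 3, where the URL tools live in urllib.parse (quote_plus is one of the relatives of quote) and html.escape stands in for cgi.escape; the sample value is hypothetical.

```python
import html
from urllib.parse import quote_plus, unquote_plus

value = 'C++ & friends'            # hypothetical text to embed

as_html = html.escape(value)       # safe to insert into HTML text
as_url  = quote_plus(value)        # safe to insert into a URL query string

print(as_html)                     # C++ &amp; friends
print(as_url)                      # C%2B%2B+%26+friends

# Building a query URL by hand with the URL-escaped value
url = 'http://localhost/cgi-bin/languages.py?language=' + as_url
print(url)
print(unquote_plus(as_url) == value)
```

The two results differ because each syntax reserves its own characters: + and spaces matter in a query string but not in HTML text, while & needs a different replacement in each.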
Here again, form inputs are “mocked up” (simulated), both for debugging and for responding to a request for all languages in the table. If the script’s global debugme variable is set to a true value, for instance, the script creates a dictionary that is plug-and-play compatible with the result of a cgi.FieldStorage call: its “language” key references an instance of the dummy mock-up class. This class in turn creates an object that has the same interface as the contents of a cgi.FieldStorage result; it makes an object with a value attribute set to a passed-in string.
The net effect is that we can test this script by running it from the system command line: the generated dictionary fools the script into thinking it was invoked by a browser over the Net. Similarly, if the requested language name is “All,” the script iterates over all entries in the languages table, making a mocked-up form dictionary for each (as though the user had requested each language in turn).
This lets us reuse the existing showHello logic to display each language’s code in a single page. As always in Python, object interfaces and protocols are what we usually code for, not specific datatypes. The showHello function will happily process any object that responds to the syntax form['language'].value.[*] Notice that we could achieve similar results with a default argument in showHello, albeit at the cost of introducing a special case in its code.
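The mock-up idea can be sketched in isolation as follows. This is Python 3 code under stated assumptions, not the chapter's script: the Dummy class mirrors the script's dummy class, while mock_form and the returning (rather than printing) show_hello are hypothetical helpers, and the table is cut down to two entries.

```python
class Dummy:
    """Mocked-up input object: just a .value attribute, like a parsed form field."""
    def __init__(self, text):
        self.value = text

def mock_form(**fields):
    """Hypothetical helper: build a dictionary that stands in for parsed form input."""
    return {name: Dummy(text) for name, text in fields.items()}

hellos = {
    'Python': r" print 'Hello World' ",
    'Tcl':    r' puts "Hello World" ',
}

def show_hello(form, inputkey='language'):
    """Works on anything that supports form[inputkey].value."""
    choice = form[inputkey].value
    return hellos.get(choice, "Sorry--I don't know that language")

# One request, then the "All" case as a loop over mocked-up forms
print(show_hello(mock_form(language='Tcl')))
for lang in hellos:
    print(lang, '=>', show_hello(mock_form(language=lang)))
```

Because show_hello relies only on indexing and a value attribute, a real parsed form and a plain dictionary of Dummy objects are interchangeable here.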
Now let’s get back to interacting with this program. If we select a particular language, our CGI script generates an HTML reply of the following sort (along with the required content-type header and blank line). Use your browser’s View Source option to see this:
<TITLE>Languages</TITLE>
<H1>Syntax</H1><HR>
<H3>Scheme</H3><P><PRE>
(display "Hello World") (newline)
</PRE></P><BR>
<HR>
Program code is marked with a <PRE> tag to specify preformatted text (the browser won’t reformat it like a normal text paragraph). This reply code shows what we get when we pick Scheme. Figure 16-22 shows the page served up by the script after selecting Python in the pull-down selection list.
Our script also accepts a language name of “All” and interprets it as a request to display the syntax for every language it knows about. For example, here is the HTML that is generated if we set the global variable debugme to True and run from the system command line with a single argument, All. This output is the same as what is printed to the client’s web browser in response to an “All” selection:[*]
C:\...\PP3E\Internet\Web\cgi-bin>python languages.py All
Content-type: text/html

<TITLE>Languages</TITLE>
<H1>Syntax</H1><HR>
<H3>C</H3><P><PRE>
printf("Hello World\n");
</PRE></P><BR>
<H3>Java</H3><P><PRE>
System.out.println("Hello World");
</PRE></P><BR>
<H3>Python</H3><P><PRE>
print 'Hello World'
</PRE></P><BR>
<H3>Pascal</H3><P><PRE>
WriteLn('Hello World'),
</PRE></P><BR>
<H3>C++</H3><P><PRE>
cout &lt;&lt; "Hello World" &lt;&lt; endl;
</PRE></P><BR>
<H3>Perl</H3><P><PRE>
print "Hello World\n";
</PRE></P><BR>
<H3>Fortran</H3><P><PRE>
print *, 'Hello World'
</PRE></P><BR>
<H3>Tcl</H3><P><PRE>
puts "Hello World"
</PRE></P><BR>
<H3>Basic</H3><P><PRE>
10 PRINT "Hello World"
</PRE></P><BR>
<H3>Scheme</H3><P><PRE>
(display "Hello World") (newline)
</PRE></P><BR>
<H3>SmallTalk</H3><P><PRE>
'Hello World' print.
</PRE></P><BR>
<HR>
Each language is represented here with the same code pattern: the showHello function is called for each table entry, along with a mocked-up form object. Notice the way that C++ code is escaped for embedding inside the HTML stream; this is the cgi.escape call’s handiwork. Your web browser translates the &lt; escapes to < characters when the page is rendered. When viewed with a browser, the “All” response page is rendered as shown in Figure 16-23.
So far, we’ve been triggering the CGI script by selecting a language name from the pull-down list in the main HTML page. In this context, we can be fairly sure that the script will receive valid inputs. Notice, though, that there is nothing to prevent a client from passing the requested language name at the end of the CGI script’s URL as an explicit query parameter, instead of using the HTML page form. For instance, a URL of the following kind typed into a browser’s address field or submitted with the module urllib:
http://localhost/cgi-bin/languages.py?language=Python
yields the same “Python” response page shown in Figure 16-22. However, because it’s always possible for a user to bypass the HTML file and use an explicit URL, a user could invoke our script with an unknown language name, one that is not in the HTML file’s pull-down list (and so not in our script’s table). In fact, the script might be triggered with no language input at all if someone explicitly submits its URL with no language parameter (or no parameter value) at the end. Such an erroneous URL could be entered into a browser’s address field, or be sent by another script using the urllib module techniques described earlier in this chapter:
>>> from urllib import urlopen
>>> request = 'http://localhost/cgi-bin/languages.py?language=Python'
>>> reply = urlopen(request).read()
>>> print reply
<TITLE>Languages</TITLE>
<H1>Syntax</H1><HR>
<H3>Python</H3><P><PRE>
print 'Hello World'
</PRE></P><BR>
<HR>
To be robust, the script checks for both cases explicitly, as all CGI scripts generally should. For instance, here is the HTML generated in response to a request for the fictitious language GuiDO (you can also see this by selecting your browser’s View Source option, after typing the URL manually into your browser’s address field):
>>> request = 'http://localhost/cgi-bin/languages.py?language=GuiDO'
>>> reply = urlopen(request).read()
>>> print reply
<TITLE>Languages</TITLE>
<H1>Syntax</H1><HR>
<H3>GuiDO</H3><P><PRE>
Sorry--I don't know that language
</PRE></P><BR>
<HR>
If the script doesn’t receive any language name input, it simply defaults to the “All” case (this can also be triggered if the URL ends with just ?language= and no language name value):
>>> reply = urlopen('http://localhost/cgi-bin/languages.py').read()
>>> print reply
<TITLE>Languages</TITLE>
<H1>Syntax</H1><HR>
<H3>C</H3><P><PRE>
printf("Hello World\n");
</PRE></P><BR>
<H3>Java</H3><P><PRE>
System.out.println("Hello World");
</PRE></P><BR>
<H3>Python</H3><P><PRE>
print 'Hello World'
...more...
If we didn’t detect these cases, chances are that our script would silently die on a Python exception and leave the user with a mostly useless half-complete page or with a default error page (we didn’t assign stderr to stdout here, so no Python error message would be displayed). Figure 16-24 shows the page generated if the script is invoked with an explicit URL like this:
http://localhost/cgi-bin/languages.py?language=COBOL
To test this error case interactively, the pull-down list includes an “Other” name, which produces a similar error page reply. Adding code to the script’s table for the COBOL “Hello World” program is left as an exercise for the reader.
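The three input cases discussed here (a known name, an unknown name, and no name at all) can be sketched as a single function. This is a Python 3 illustration under stated assumptions rather than the chapter's script: urllib.parse.parse_qs stands in for cgi.FieldStorage, html.escape stands in for cgi.escape, reply_body is a hypothetical helper, and the table holds only two entries. Unlike the script above, the sketch also escapes the language name itself, since that text arrives from the client.

```python
import html
from urllib.parse import parse_qs

hellos = {
    'Python': r" print 'Hello World' ",
    'C':      r' printf("Hello World\n"); ',
}

def reply_body(query_string, inputkey='language'):
    """Hypothetical helper: build the <PRE> sections for one request."""
    fields = parse_qs(query_string)             # blank values are dropped by default
    choice = fields.get(inputkey, ['All'])[0]   # missing input defaults to "All"
    names = sorted(hellos) if choice == 'All' else [choice]
    parts = []
    for name in names:
        code = hellos.get(name, "Sorry--I don't know that language")
        parts.append('<H3>%s</H3><P><PRE>\n%s\n</PRE></P><BR>'
                     % (html.escape(name), html.escape(code, quote=False)))
    return '\n'.join(parts)

print(reply_body('language=COBOL'))    # unknown name: apology section
print(reply_body('language='))         # empty value: same as "All"
```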
[*] If you are reading closely, you might notice that this is the second time we’ve used mock-ups in this chapter (see the earlier tutor4.cgi example). If you find this technique generally useful, it would probably make sense to put the dummy class, along with a function for populating a form dictionary on demand, into a module so that it can be reused. In fact, we will do that in the next section. Even for two-line classes like this, typing the same code the third time around will do much to convince you of the power of code reuse.
[*] Interestingly, we also get the “All” reply if debugme is set to False when we run the script from the command line. Instead of throwing an exception, the cgi.FieldStorage call returns an empty dictionary if called outside the CGI environment, so the test for a missing key kicks in. It’s likely safer to not rely on this behavior, however.