Developing CGI Scripts with CGI.pm

The core of any CGI script is the CGI.pm module. While you can write CGI scripts in raw Perl, or with one of the various other CGI libraries out there, CGI.pm is well-supported and robust, runs across platforms, and gives you just about any feature you might want as you work with CGI. It shipped as part of the Perl distribution through Perl 5.20; on newer versions of Perl, install it from CPAN. Working without it will make your CGI-scripting life much harder.

In the preceding section, I showed you a really simple example of using CGI.pm. Let's explore the features of this module in more detail in this section, so you know the sorts of things you can do with it.

Using CGI.pm

To use CGI.pm in your Perl scripts, you import it like you do any module. CGI.pm has several import tags you can use, including the following:

  • :cgi imports features of the CGI protocol itself, including param()

  • :html2 imports features to help generate HTML 2 tags including start_html() and end_html()

  • :form imports features for generating form elements

  • :standard imports all the features from :cgi, :html2, and :form.

  • :html3 imports features from HTML 3.0

  • :netscape imports features from Netscape's version of HTML

  • :html imports all the features from :html2, :html3, and :netscape.

  • :all imports everything

CGI.pm is implemented such that you can use it in an object-oriented way (using a CGI object and calling methods defined for that object), or by calling plain subroutines. If you use any of the preceding import tags, you'll have those subroutine names available to you in your script. If you just use CGI without any import tags, it's assumed you'll be using the object-oriented version of CGI.
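As a minimal sketch of the two styles side by side, the following script builds the same small page twice, once with imported subroutines and once with methods on a CGI object. The page title and heading text are made up for illustration, and the script assumes CGI.pm is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Function-oriented style: the :standard tag imports header(),
# start_html(), h1(), and friends as plain subroutines.
use CGI qw(:standard);

my $page = header()
         . start_html('Hello')
         . h1('Hello from CGI.pm')
         . end_html();

# Object-oriented style: the same calls become methods on a CGI object.
# This style works even with a plain "use CGI;" and no import tags.
my $q = CGI->new;
my $page_oo = $q->header
            . $q->start_html('Hello')
            . $q->h1('Hello from CGI.pm')
            . $q->end_html;

print $page;
```

Which style you pick is mostly a matter of taste; the subroutine style is shorter to type, while the object style keeps CGI.pm's names out of your script's namespace.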

Processing Form Input

The most significant feature that CGI.pm gives you is that it handles the CGI-encoded form input that the Web browser sends to the Web server and the server passes on to the script. This input comes from the browser in a special encoded form, sometimes on the standard input and sometimes as keyword arguments, with nonalphanumeric characters encoded in hex. If you were writing a raw CGI processor you'd have to deal with all that decoding yourself (and it's not pretty). By using CGI.pm you can avoid all that stuff and deal solely with the actual input values, which is what you really care about.
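To give a feel for what CGI.pm is sparing you, here is a rough sketch of decoding a query string by hand. It is deliberately simplified: it handles only the + and %XX encodings and ignores everything else a real CGI processor has to deal with (reading the standard input for POST requests, file uploads, character sets). The subroutine names and the sample query string are invented for this example.

```perl
use strict;
use warnings;

# Decode one URL-encoded piece: '+' means a space, %XX is a hex-encoded byte.
sub decode_component {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Split a query string into a hash of name => [list of values].
sub parse_query {
    my ($query) = @_;
    my %form;
    for my $pair (split /[&;]/, $query) {
        next unless length $pair;               # skip empty pieces like "&&"
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;      # "name" with no "=value"
        push @{ $form{ decode_component($name) } }, decode_component($value);
    }
    return %form;
}

# Hypothetical input, as a browser might send it.
my %form = parse_query('name=Ada+Lovelace&lang=Perl%205&lang=C');
print "$_: @{ $form{$_} }\n" for sort keys %form;
```

Run as is, this prints "lang: Perl 5 C" and then "name: Ada Lovelace". With CGI.pm, all of that collapses into calls to param().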

The input you get from a form is composed of key/value pairs, where the key is the name of the form element (as indicated by the name attribute in HTML), and the value is the actual thing the user typed, selected, or chose from the form when it was submitted. The value you actually get depends on the form element—some values are a string, as with text fields; others might be simply yes or no, as with check boxes. Pop-up menus and scrolling lists created with the HTML <select> tag (or check box groups) might have multiple values.

The CGI.pm module stores these keys and values in a parameter array. To get at them you use the param() subroutine. Without any arguments, param() returns a list of keys in the parameter array (the names of all the form elements). This is mostly useful to see whether the form was filled out in the case where the CGI script generates both the initial HTML page and the result. You could also call param() without any arguments if you were printing out all the keys and values on the form for debugging purposes, like this:

foreach my $key (param()) {
   print "$key has the value ", param($key), "\n";
}

Note that the parameters appear in the array in the order in which the browser originally sent them, which in many cases will be the same order in which they appear on the page. However, that behavior isn't guaranteed, so you're safest referring to each form element explicitly if you're concerned about order.

The param() subroutine with a form element name as an argument returns the value of that form element, or undef if there's no submitted value for that form element. This is probably the way you'll use param() most often in your own CGI scripts. The key you use as the argument to param() must match the name of the form element in the HTML file exactly; to get to the value of a text field defined <INPUT NAME="foozle"> you'd use param('foozle'). Most of the time you'll get back a single scalar value, but form elements that allow multiple selections will return a list of all the values the user selected. It's up to you to handle the different values you get from a form in the CGI script.
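Here's a small sketch of the difference between single-valued and multivalued elements. To keep it runnable without a Web server, it builds the CGI object from a literal query string; the field names other than foozle are invented. One caveat: newer releases of CGI.pm print a warning when param() is called in list context and provide multi_param() for that purpose, so the sketch uses multi_param() when it's available and falls back to param() otherwise.

```perl
use strict;
use warnings;
use CGI;

# Build a CGI object from a literal query string so the sketch runs
# from the command line; a real script would just call CGI->new.
my $q = CGI->new('foozle=hello&topping=cheese&topping=olives');

# Scalar context: one value, or undef if the field was not submitted.
my $foozle  = $q->param('foozle');
my $missing = $q->param('no_such_field');

# List context: every value submitted under that name. Newer CGI.pm
# releases prefer multi_param(); older ones only have param().
my @toppings = $q->can('multi_param')
             ? $q->multi_param('topping')
             : $q->param('topping');

print "foozle: $foozle\n";
print "no_such_field was not submitted\n" unless defined $missing;
print "toppings: @toppings\n";
```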

Generating HTML

The bulk of a CGI script is often taken up by generating the HTML for the response. The scripts we write for handling CGI will probably have more print statements than any other scripts we've written so far.

To generate HTML output, you simply print lines of output to the standard output as you would in any other script. You have a number of options for doing this:

  • Using individual print statements

  • Using “here” documents

  • Using shortcut subroutines from CGI.pm

Output with print

The first way is simply to use print with the bit of HTML to output, as you've been doing all along, like this:

print "<html><head><title>This is a Web page</title></head>\n";
print "<body bgcolor=\"white\">\n";
# and so on

Output with Here Documents

While print statements work fine, they can get somewhat unwieldy, particularly when you have a lot of them and you have to deal with nested quotation marks (as with the “white” value in the preceding example). If you have a block of HTML to print pretty much verbatim, you can use a Perl feature called a here document. Awkward name, but what it means is simply “print everything up to here.” A here document looks something like this:

print <<EOF;
These lines will be printed
as they appear here, without the need
for fancy print statements, "escaped quotes,"
or special newline characters.  Just like this.
EOF

That bit of Perl code results in the following output:

These lines will be printed
as they appear here, without the need
for fancy print statements, "escaped quotes,"
or special newline characters.  Just like this.

In other words, the text outputs pretty much exactly the same way the text appears in the script. The initial print line in the here document determines how far to read and print before stopping, using a special code that can be either a word that has no other meaning in the language, or a quoted string. Here I used EOF, which is a nice, short, common phrase and easy to pick out from the rest of the script.

The quotes you use around that word, if any, determine how the text inside the here document is processed. A single word, like the EOF we used above, allows variable interpolation as if the text inside the here document were a double-quoted string. The same is true if you use a double-quoted word ("EOF"). A single-quoted word ('EOF') suppresses variable interpolation just as it does in a regular single-quoted string.

The end of the here document is that same word or string that started the here document, minus the quotes, on a line by itself with no leading or trailing characters or whitespace. After the ending tag you can start another here document, go back to print, or use any other lines of Perl code that you like.
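To make the quoting rules concrete, here's a short sketch that builds two here documents, one with a bare ending word and one with a single-quoted ending word. The $title variable and its value are made up for the example.

```perl
use strict;
use warnings;

my $title = 'Order Received';   # hypothetical value for illustration

# Bare (or double-quoted) ending word: $title is interpolated, and the
# double quotes around "white" need no backslashes.
my $interpolated = <<EOF;
<html><head><title>$title</title></head>
<body bgcolor="white">
EOF

# Single-quoted ending word: the text is taken literally, so $title
# comes out as the characters "$title".
my $literal = <<'EOF';
<title>$title</title>
EOF

print $interpolated, $literal;
```

The first two lines of output contain Order Received in the title and the unescaped "white"; the last line contains the literal text $title.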

For more information on using here documents, see the perlop man page (older versions of Perl cover them in perldata).

Output with CGI.pm Subroutines

The third way to generate HTML in a CGI script is to use CGI.pm's subroutines for doing so. Most HTML tags have equivalent Perl subroutines in CGI.pm, and the CGI.pm subroutines have the advantage of letting you insert variable references (through double-quoted strings) and generate things like form elements quickly. In addition, CGI.pm has the start_html() and end_html() subroutines to print the top and bottom of an HTML file, respectively. All the HTML generation subroutines return strings; you'll need to use them with a print to actually print them.

Some subroutines generate one-sided tags that take no arguments (p() and hr(), for example, which generate <p> and <hr> tags, respectively). Others create opening and closing tags, in which case they take one or more string arguments for the text in between the opening and closing tags. You can nest subroutines inside other subroutines:

print h1('This is a heading');   # <h1>This is a heading</h1>
print b('Bold');                 # <b>Bold</b>
# <b>some bold and some <i>italic</i></b>
print b('some bold and some', i('italic'));
print ol(
  li('item one'),
  li('item two'),
  li('item three'),
);

If the HTML tag takes attributes inside the tag itself, indicate those using curly brackets {} and the name and value of the attribute separated using => (as with hashes):

a({href=>"index.html", name=>"foo"} , "Home Page");
# <a href="index.html" name="foo">Home Page</a>

Almost every HTML tag you can use in an HTML file is available as a subroutine, although different tags might be available at different times depending on which group of subroutines you import in your use CGI line. CGI.pm also has a particularly robust set of subroutines for generating form elements. I won't go through all of them here; if you're interested, see the documentation for CGI.pm.
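As one possible way of putting these pieces together, the following sketch is a script that generates both the form and the result, using param() with no arguments to decide which one to show. The field name, page title, and build_page() subroutine are all invented for the example, and the exact HTML that start_form() and textfield() produce varies a little between CGI.pm versions.

```perl
use strict;
use warnings;
use CGI qw(:standard);

# Build the whole response as one string; every CGI.pm HTML subroutine
# returns text rather than printing it.
sub build_page {
    my $page = header() . start_html('Greeting');
    if (param()) {
        # The form was submitted: show the result. Escape the value so
        # that anything the user typed is displayed as text, not HTML.
        my $name = param('name');
        $name = '' unless defined $name;
        $page .= h1('Hello, ' . CGI::escapeHTML($name) . '!');
    }
    else {
        # First visit: show the form. The field name 'name' is hypothetical.
        $page .= start_form()
               . p('Your name: ', textfield(-name => 'name'))
               . submit(-value => 'Greet me')
               . end_form();
    }
    return $page . end_html();
}

print build_page();
```

Run from the command line with no arguments, the script prints the form; run with name=Ada as an argument, it prints the greeting instead.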

Debugging the Result

You've already seen, in the Hello example, one way to debug your scripts before you install them: entering your CGI input as name=value pairs, either on the script command line or as standard input. This mechanism can be invaluable in fixing the smaller errors that creep in as you're writing your CGI scripts. By running the script from the command line you can also use the Perl debugger to make sure your script is running right before you install it.

There comes a time, however, when you need to install the CGI script and run it in place to make sure that it's working right. Once it's installed, it can be difficult to debug because errors tend to get reported to the browser with unhelpful messages like Server Error 500, or to the error logs without any kind of identifier or timestamp to help you figure out which errors are yours.

That's where the CGI::Carp module comes in. CGI::Carp comes with CGI.pm, although as with the latter module it should also be part of your standard Perl distribution (look for it in the CGI subdirectory of Perl's lib directory; there's also a regular Carp module that's related, but not the same thing). CGI::Carp generates error messages for CGI scripts, which can be useful for debugging those scripts. In particular, the fatalsToBrowser keyword is handy because it sends any fatal Perl errors in the CGI script back as an HTML response, so they show up in the browser that submitted the form instead of a bare server error. To echo these errors to the browser, include a use line at the top of your script like this:

use CGI::Carp qw(fatalsToBrowser);

Even if you don't use fatalsToBrowser, the CGI::Carp module provides new definitions of the warn() and die() functions (and adds the croak(), carp(), and confess() subroutines) such that errors are printed with sensible identifiers and appear in the error logs for your Web server. (See the documentation for the standard Carp module for information on croak(), carp(), and confess().) CGI::Carp is quite useful for debugging your CGI scripts in place.
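Here's a small sketch of CGI::Carp at work. The quantity field and the checked_quantity() subroutine are invented for the example, and using die() to reject form input is only a convenient way to trigger a fatal error here, not a recommendation for how to validate forms.

```perl
use strict;
use warnings;
use CGI qw(:standard);
use CGI::Carp qw(fatalsToBrowser);

# Die with a readable message if the value isn't a whole number.
sub checked_quantity {
    my ($raw) = @_;
    die "Quantity must be a whole number, got '$raw'"
        unless $raw =~ /\A\d+\z/;
    return $raw;
}

my $raw = param('quantity');       # hypothetical form field
$raw = 1 unless defined $raw;      # default when there's no form input

# With CGI::Carp loaded, this goes to the error log with a timestamp
# and the name of the script in front of it.
warn "processing quantity=$raw";

# If this dies, fatalsToBrowser sends the message to the browser as HTML.
my $quantity = checked_quantity($raw);

print header(), start_html('Order'),
      p("You ordered $quantity item(s)."), end_html();
```

Run from the command line with no arguments, the script prints the order page and a stamped warning; run with quantity=abc, it prints the error message as an HTML page instead.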
