Chapter 17. Efficiency and Optimization

Let’s face it, CGI applications, run under normal conditions, are not exactly speed demons. In this chapter, we will show you a few tricks that you can use to speed up current applications, and also introduce you to two technologies—FastCGI and mod_perl—that allow you to develop significantly accelerated CGI applications. If you develop Perl CGI scripts on Win32, then you may also wish to look at ActiveState’s PerlEx. Although we do not discuss PerlEx in this chapter, it provides many of the same benefits as mod_perl.

First, let’s try to understand why CGI applications are so slow. When a user requests a resource from a web server that turns out to be a CGI application, the server has to create another process to handle the request. And when you’re dealing with applications that use interpreted languages, like Perl, there is an additional delay incurred in firing up the interpreter, then parsing and compiling the application.

So, how can we possibly improve the performance of Perl CGI applications? We could ask Perl to interpret only the most commonly used parts of our application, and delay interpreting other pieces unless necessary. That certainly would speed up applications. Or, we could turn our application into a server (daemon) that runs in the background and executes on demand. We would no longer have to worry about the overhead of firing up the interpreter and evaluating the code. Or, we could embed the Perl interpreter within the web server itself. Again, we avoid the overhead of having to start a new process, and we don’t even suffer the communication delay we would have talking to another daemon.

We’ll look at all the techniques mentioned here, in addition to basic Perl tips for writing more efficient applications. Let’s start with the basics.

Basic Perl Tips, Top Ten

Here is a list of ten techniques you can use to improve the performance of your CGI scripts:

10. Benchmark your code.
9. Benchmark modules, too.
8. Localize variables with my.
7. Avoid slurping data from files.
6. Clear arrays with undef instead of ( ).
5. Use SelfLoader where applicable.
4. Use autouse where applicable.
3. Avoid the shell.
2. Find existing solutions for your problems.
1. Optimize your regular expressions.

Let’s look at each one in more detail.

Benchmark Your Code

Before we can determine how well our program is working, we need to know how to benchmark the critical code. Benchmarking may sound complicated, but all it really involves is timing a piece of code, and there are some standard Perl modules that make this very easy to do. Let’s look at a few ways to benchmark code, and you can choose the one that works best for you.

First, here’s the simplest way to benchmark:

$start = (times)[0];

## your code goes here

$end = (times)[0];

printf "Elapsed time: %.2f seconds!
", $end - $start;

This determines the elapsed user time needed to execute your code in seconds. It is important to consider a few rules when benchmarking:

  • Try to benchmark only the relevant piece(s) of code.

  • Don’t accept the first benchmark value. Benchmark the code several times and take the average.

  • If you are comparing different benchmarks, make sure they are tested under comparable conditions. For example, make sure that the load on the machine doesn’t differ between tests because another user happened to be running a heavy job during one.

Second, we can use the Benchmark module. The Benchmark module provides us with several functions that allow us to compare multiple pieces of code and determine elapsed CPU time as well as elapsed real-world time.

Here’s the easiest way to use the module:

use Benchmark;
$start = new Benchmark;

## your code goes here

$end = new Benchmark;

$elapsed = timediff ($end, $start);
print "Elapsed time: ", timestr ($elapsed), "
";

The result will look similar to the following:

Elapsed time:  4 wallclock secs (0.58 usr +  0.00 sys =  0.58 CPU)

You can also use the module to benchmark several pieces of code. For example:

use Benchmark;
timethese (100, {
                    for => <<'end_for',
                        my   $loop;
                        for ($loop=1; $loop <= 100000; $loop++) { 1 }
end_for
                    foreach => <<'end_foreach'
                        my      $loop;
                        foreach $loop (1..100000) { 1 }
end_foreach
                } );

Here, we are checking the for and foreach loop constructs. As a side note, you might be interested to know that, in cases where the number of loop iterations is large, foreach is much less efficient than for in versions of Perl older than 5.005.

The resulting output of timethese will look something like this:

Benchmark: timing 100 iterations of for, foreach...
       for: 49 wallclock secs (49.07 usr +  0.01 sys = 49.08 CPU)
   foreach: 69 wallclock secs (68.79 usr +  0.00 sys = 68.79 CPU)

One thing to note here is that Benchmark uses the time system call to perform the actual timing, and therefore the granularity is still limited to one second. If you want higher resolution timing, you can experiment with the Time::HiRes module. Here’s an example of how to use the module:

use Time::HiRes;
my $start = [ Time::HiRes::gettimeofday(  ) ];

## Your code goes here

my $elapsed = Time::HiRes::tv_interval( $start );
print "Elapsed time: $elapsed seconds!
";

The gettimeofday function returns the current time in seconds and microseconds; we place these in a list, and store a reference to this list in $start. Later, after our code has run, we call tv_interval, which takes $start and calculates the difference between the original time and the current time. It returns a floating-point number indicating the number of seconds elapsed.

One caveat: the less time your code takes, the less reliable your benchmarks will be. Time::HiRes can be useful for determining how long portions of your program take to run, but do not use it if you want to compare two subroutines that each take less than one second. When comparing code, it is better to use Benchmark and have it test your subroutines over many iterations.
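
For example, to compare two subroutines fairly, hand them both to timethese and let Benchmark run them many times. Here is a minimal sketch; the two subroutines are just placeholders for your own:

use Benchmark;

sub version_one { my $total = 0; $total += $_ for 1 .. 1_000; $total }
sub version_two { my $total = 0; foreach my $n (1 .. 1_000) { $total += $n } $total }

## 10,000 iterations each, so the totals rise well above one second
timethese (10_000, {
    version_one => \&version_one,
    version_two => \&version_two,
});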

Benchmark Modules, Too

CPAN is absolutely wonderful. It contains a great number of highly useful Perl modules. You should take advantage of this resource because the code available on CPAN has been tested and improved by the entire Perl community. However, if you are creating applications where performance is critical, remember to benchmark code included from modules you are using in addition to your own. For example, if you only need a portion of the functionality available in a module, you may benefit by deriving your own version of the module that is tuned for your application. Most modules distributed on CPAN are available according to the same terms as Perl, which allows you to modify code without restriction for your own internal use. However, be sure to verify the licensing terms for a module before you do this, and if you believe your solution would be beneficial to others, notify the module author, and please give back to CPAN.

You should also determine whether using a module makes sense. For example, a popular module is IO::File, which provides an object-oriented interface for file I/O:

use IO::File;
$fh = new IO::File;
if ($fh->open ("index.html")) {
    print <$fh>;
    $fh->close;
}

There are advantages to using an interface like IO::File. Unfortunately, due to module loading and method-call overhead, this code is, on average, ten times slower than:

if (open FILE, "index.html") {
    print <FILE>;
    close FILE;
}

So the bottom line is, pay very careful attention to modules that you use.

Localize Variables with my

You should create lexical variables with the my function. Perl manages memory for you, but it doesn’t look ahead to see whether you are going to use a variable in the future. To create a variable that you need only within a particular block of code, such as a subroutine, declare it with my. The memory for that variable will then be reclaimed at the end of the block.
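
For example, in the following sketch both lexicals exist only for the duration of the call, so Perl can reclaim their memory as soon as the subroutine returns:

sub count_lines {
    my ($file) = @_;     ## lexical argument, private to this subroutine
    my $count  = 0;      ## reclaimed when the subroutine returns

    open FILE, $file or return 0;
    $count++ while <FILE>;
    close FILE;

    return $count;
}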

Note that despite its name, the local function doesn’t localize variables in the standard sense of the term. Here is an example:

sub name {
    local $my_name = shift;
    greeting(  );
}

sub greeting {
    print "Hello $my_name, how are you!
";
}

If you run this simple program, you can see that $my_name isn’t exactly local to the name function. In fact, it is also visible in greeting. This behavior can produce unexpected results if you are not careful. Thus, most Perl developers avoid using local and use my instead for everything except global variables, file handles, and Perl’s built-in global punctuation variables like $_ or $/.
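
Here is the same example rewritten with my; the name is now passed to greeting explicitly, so no global is needed:

sub name {
    my $my_name = shift;
    greeting ($my_name);
}

sub greeting {
    my ($my_name) = @_;
    print "Hello $my_name, how are you!\n";
}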

Avoid Slurping

What is slurping, you ask? Consider the following code:

local $/;
open FILE, "large_index.html" or die "Could not open file!
";
$large_string = <FILE>;
close FILE;

Since we undefine the input record separator, one read on the file handle will slurp (or read in) the entire file. When dealing with large files, this can be highly inefficient. If what you are doing can be done a line at a time, then use a while loop to process only a line at a time:

open FILE, "large_index.html" or die "Could not open file!
";
while (<FILE>) {
    # Split fields by whitespace, output as HTML table row
    print $q->tr( $q->td( [ split ] ) );
}
close FILE;

Of course, there are situations when you cannot process a line at a time. For example, you may be looking for data that crosses line boundaries. In this case, you may fall back to slurping for small files. Try benchmarking your code to see what kind of penalty is imposed by slurping in the entire file.
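
Here is one way to measure that penalty with the Benchmark module, a minimal sketch that assumes the same large_index.html file:

use Benchmark;

timethese (100, {
    slurp => sub {
        local $/;        ## undefine the input record separator
        open FILE, "large_index.html" or die "Could not open file!\n";
        my $large_string = <FILE>;
        close FILE;
    },
    by_line => sub {
        open FILE, "large_index.html" or die "Could not open file!\n";
        while (<FILE>) { }   ## process one line at a time
        close FILE;
    },
});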

undef Versus ( )

If you intend to reuse arrays, especially large ones, it is more efficient to clear them out by assigning an empty list to them instead of undefining them. For example:

...
while (<FILE>) {
    chomp;
    $count++;  
    $some_large_array[$count] .= int ($_);
}
...
 
@some_large_array = (  );     ## Good
undef @some_large_array;    ## Not so good

If you undefine @some_large_array to clear it out, Perl will deallocate the space containing the data. And when you populate the array with new data, Perl will have to reallocate the necessary space again. This can slow things down.
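
You can verify the difference on your own system with the Benchmark module; here is a minimal sketch:

use Benchmark;
use vars qw( @some_large_array );

## Each iteration fills the array, then clears it. After undef, Perl
## must reallocate the space on the next fill.
timethese (500, {
    empty_list => sub { @some_large_array = (1 .. 10_000); @some_large_array = (  ) },
    undef_it   => sub { @some_large_array = (1 .. 10_000); undef @some_large_array  },
});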

SelfLoader

The SelfLoader module allows you to hide functions and subroutines, so the Perl interpreter does not compile them into internal opcodes when it loads your application, but compiles them only when they are needed. This can yield great savings, especially if your program is quite large and contains many subroutines that may not all be run for any given request.

Let’s look at how to convert your program to use self-loading, and then we can look at the internals of how it works. Here’s a simple framework:

use SelfLoader;

## step 1: subroutine stubs

sub one;
sub two;
...

## your main body of code
...

## step 2: necessary/required subroutines

sub one {
    ...
}

__DATA__

## step 3: all other subroutines

sub two {
    ...
}
...
__END__

It’s a three-step process:

  1. Create stubs for all the functions and subroutines in your application.

  2. Determine which functions are used often enough that they should be loaded by default.

  3. Take the rest of your functions and move them between the __DATA__ and __END__ tokens.

Congratulations, Perl will now load these functions only on demand!

Now, how does it actually work? The __DATA__ token has a special significance to Perl; everything after the token is available for reading through the DATA filehandle. When Perl reaches the __DATA__ token, it stops compiling, and all the subroutines defined after the token do not exist, as far as Perl is concerned.

When you call an unavailable function, SelfLoader reads in all the subroutines from the DATA filehandle, and caches them in a hash. This is a one-time process, and is performed the first time you call an unavailable function. It then checks to see if the specified function exists, and if so, will eval it within the caller’s namespace. As a result, that function now exists in the caller’s namespace, and any subsequent calls to that function are handled via symbol table lookups.

The costs of this process are the one-time reading and parsing of the self-loaded subroutines, and an eval the first time each self-loaded function is invoked. Despite this overhead, the performance of large programs with many functions and subroutines can improve dramatically.
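
Here is a complete, minimal sketch you can run to watch this happen (the greet subroutine is just a placeholder):

#!/usr/bin/perl -w

use strict;
use SelfLoader;

sub greet;              ## stub: the body lives below __DATA__

print "Main body compiled; greet is not compiled yet.\n";
greet ("world");        ## first call makes SelfLoader compile it

__DATA__

sub greet {
    my ($who) = @_;
    print "Hello, $who!\n";
}

__END__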

autouse

If you use many external modules in your application, you may consider using the autouse feature to delay loading them until a specific function from a module is used:

use autouse DB_File;

You have to be very careful when using this feature, since a portion of the chain of execution will shift from compile time to runtime. Also, if a module needs to execute a particular sequence of steps early on in the compile phase, using autouse can potentially break your applications.

If the modules you need behave as expected, using autouse for modules can yield a big savings when it comes time to “load” your application.
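
The documented form of the pragma names both the module and the functions whose loading should be deferred; here is a minimal sketch using Carp:

## Carp is not loaded at compile time; the first call to carp or
## croak loads the module on demand.
use autouse 'Carp' => qw(carp croak);

carp "something unexpected happened";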

Avoid the Shell

Avoid accessing the shell from your application, unless you have no other choice. Perl has equivalent functions to many Unix commands. Whenever possible, use the functions to avoid the shell overhead. For example, use the unlink function, instead of executing the external rm command:

system( "/bin/rm", $file );                     ## External command
unlink $file or die "Cannot remove $file: $!";  ## Internal function

It is also much safer to avoid the shell, as we saw in Chapter 8. However, there are some instances when you may get better performance from a standard external program than you can get in Perl. If you need to find all occurrences of a certain term in a very large text file, it may be faster to use grep than to perform the same task in Perl:

system( "/bin/grep", $expr, $file );

Note, however, that the circumstances under which you might need to do this are rare. First, Perl must do a lot of extra work to invoke a system call, so the performance difference gained by an external command is seldom worth the overhead. Second, if you were only interested in the first match and not all the matches, then Perl gains speed because your script can exit the loop as soon as it finds a match:

my $match;
open FILE, $file or die "Could not open $file: $!";
while (<FILE>) {
    chomp;
    if ( /$expr/ ) {
        $match = $_;
        last;
    }
}

grep will always read the entire file. Third, if you find yourself needing to resort to using grep to handle text files, it likely means that the problem isn’t so much with Perl as with the structure of your data. You should probably consider a different data format, such as a DBM file or an RDBMS.

Also avoid using the glob <*> notation to get a list of files in a particular directory. Perl must invoke a subshell to expand this. In addition to this being inefficient, it can also be erroneous; certain shells have an internal glob limit, and will return files only up to that limit. Note that Perl 5.6, when released, will solve these limitations by handling globs internally.

Instead, use Perl’s opendir, readdir, and closedir functions. Here is an example:

@files = </usr/local/apache/htdocs/*.html>;      ## Uses the shell
....
$directory = "/usr/local/apache/htdocs";         ## A better solution
if (opendir (HTDOCS, $directory)) {
    while ($file = readdir (HTDOCS)) {
        push (@files, "$directory/$file") if ($file =~ /\.html$/);
    }
}
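
The same loop can be written more compactly with grep and map; this sketch is equivalent to the loop above:

$directory = "/usr/local/apache/htdocs";
if (opendir (HTDOCS, $directory)) {
    @files = map  { "$directory/$_" }    ## prepend the path
             grep { /\.html$/ }          ## keep only HTML files
             readdir (HTDOCS);
    closedir (HTDOCS);
}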

Find Existing Solutions for Your Problems

Chances are, if you find yourself stuck with a problem, someone else has encountered it elsewhere and has spent a lot of time developing a solution. And thanks to the spirit of Perl, you can likely borrow it. Throughout this book, we have referred to many modules that are available on CPAN. There are countless more. Take the time to browse through CPAN regularly to see what is available there.

You should also check out the Perl newsgroups. news:comp.lang.perl.modules is a good place to go to check in with new module announcements or to get help with particular modules. news:comp.lang.perl and news:comp.lang.perl.misc are more general newsgroups.

Finally, there are many very good books available that discuss algorithms or useful tricks and tips. The Perl Cookbook by Tom Christiansen and Nathan Torkington and Mastering Algorithms with Perl by Jon Orwant, Jarkko Hietaniemi, and John Macdonald are full of gems specifically for Perl. Of course, don’t overlook books whose focus is not Perl. Programming Pearls by Jon Bentley, The C Programming Language by Brian Kernighan and Dennis Ritchie, and Code Complete by Steve McConnell are also all excellent references.

Regular Expressions

Regular expressions are an integral part of Perl, and we use them in many CGI applications. There are many different ways that we can improve the performance of regular expressions.

First, avoid using $&, $`, and $'. If Perl spots one of these variables in your application, or in a module that you imported, it will make a copy of the search string for possible future reference. This is highly inefficient, and can really bog down your application. You can use the Devel::SawAmpersand module, available on CPAN, to check for these variables.

Second, the following type of regular expression is highly inefficient:

while (<FILE>) {
    next if (/^(?:select|update|drop|insert|alter)/);     
    ...  
}

Instead, use the following syntax:

while (<FILE>) {
    next if (/^select/);
    next if (/^update/);
    ...
}

Or, consider building the code at runtime if you do not know what you are searching for until then:

@keywords = qw (select update drop insert);
$code = "while (<FILE>) {\n";

foreach $keyword (@keywords) {
    $code .= "next if (/^$keyword/);\n";
}

$code .= "}\n";
eval $code;

This will build a code snippet that is identical to the one shown above, and evaluate it on the fly. Of course, you will incur an overhead for using eval, but you will have to weigh that against the savings you will gain.

Third, consider using the o modifier in expressions to compile the pattern only once. Take a look at this example:

@matches = (  );
...
while (<FILE>) {
    push @matches, $_ if /$query/i;
}
...

Code like this is typically used to search for a string in a file. Unfortunately, this code will execute very slowly, because Perl has to compile the pattern each time through the loop. However, you can use the o modifier to ask Perl to compile the regex just once:

push @matches, $_ if /$query/io;

If the value of $query changes in your script, this won’t work, since Perl will use the first compiled value. The compiled regex features introduced in Perl 5.005 address this; refer to the perlre manpage for more information.
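
For example, with Perl 5.005 or later you can compile the pattern yourself with qr// before the loop, and rebuild it whenever $query changes; a minimal sketch:

my $compiled = qr/$query/i;    ## compiled once, before the loop

my @matches = (  );
while (<FILE>) {
    push @matches, $_ if /$compiled/;
}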

Finally, there are often multiple ways that you can build a regular expression for any given task, but some ways are more efficient than others. If you want to learn how to write more efficient regular expressions, we highly recommend Jeffrey Friedl’s Mastering Regular Expressions.

These tips are general optimization tips. You’ll get a lot of mileage from some, and not so much from the others, depending on your application. Now, it’s time to look at more complicated ways to optimize our CGI applications.

FastCGI

FastCGI is a web server extension that allows you to convert CGI programs into persistent, long-lived server-like applications. The web server spawns a FastCGI process for each specified CGI application at startup, and these processes respond to requests, until they are explicitly terminated. If you expect a certain application to be used more than others, you can also ask FastCGI to spawn multiple processes to handle concurrent requests.

There are several advantages to this approach. A typical Perl CGI application incurs startup overhead for each request: the server must spawn a new process, and the code must be interpreted. And if the code has a lengthy initialization process, that simply adds to the overhead. A typical FastCGI application does not suffer from any of these problems. There is no extra spawning for each request, and all the initialization is done at startup. Since these applications are long-lived, they also allow you to store data between requests, which is another advantage.

Example 17-1 shows what a typical FastCGI application looks like.

Example 17-1. fast_count.cgi

#!/usr/bin/perl -wT

use strict;
use vars qw( $count );
use FCGI;

local $count = 0;

while ( FCGI::accept >= 0 ) {
    $count++;
    print "Content-type: text/plain\n\n";
    print "You are request number $count. Have a good day!\n";
}

Other than a few extra details, this is not much different from a regular CGI program. Since the script is initialized only once, the value of $count (a global variable) will be zero at startup and will persist across all subsequent requests. When the web server receives a request for this FastCGI application, it hands the request to the script; FCGI::accept returns, and the body of the while loop executes. In this case, you will notice that the value of $count is incremented for each request.

If your CGI script uses CGI.pm, you can use CGI.pm’s FastCGI interface, CGI::Fast, instead. CGI::Fast is included in the standard CGI.pm distribution. Example 17-2 shows how Example 17-1 looks with CGI::Fast.

Example 17-2. fast_count.cgi

#!/usr/bin/perl -wT

use strict;
use vars qw( $count );
use CGI::Fast;

local $count = 0;

while ( my $q = new CGI::Fast ) {
    $count++;
    print $q->header( "text/plain" ),
          "You are request number $count. Have a good day!
";
}

This works the same way. Everything before the creation of a CGI::Fast object is only executed once. Then the script waits until it receives a request, which creates a new CGI::Fast object and runs the body of the while loop.

Now that you’ve seen how FastCGI works, let’s see how to install it. FastCGI works with a wide variety of web servers, but we’ll walk through the setup for Apache.

Installing FastCGI

Early versions of FastCGI required a modified version of Perl to work its magic. Fortunately, this is no longer the case. However, FastCGI does require a change to your web server. The FastCGI distribution includes modules for your web server as well as the Perl module, FCGI (which is also available on CPAN). You can obtain it from http://www.fastcgi.com/, the home of the FastCGI open source project. Note this is separate from http://www.fastcgi.org/, which offers commercial solutions built upon FastCGI. In this case the .org and .com web sites are the reverse of what you might expect.

Here are the instructions for installing FastCGI with Apache. If you’re using Apache version 1.3 or later, you can simply run Apache’s configure in the following manner:

configure --add-module=/usr/local/src/apache-fastcgi/src/mod_fastcgi.c

Then, you need to determine where you will place your FastCGI applications. We let Apache know the location by adding the following directives in httpd.conf (Location goes in access.conf, and Alias in srm.conf if used):

<Location /fcgi>
SetHandler fastcgi-script
</Location>

Alias /fcgi/  /usr/local/apache/fcgi/

For each FastCGI application that you want to start, you need to make an entry like the following:

AppClass /usr/local/apache/fcgi/fast_count.cgi

Now, when you start your Apache server, you should see a fast_count.cgi process in your system’s process table. And you can access the application by simply pointing your browser at:

http://localhost/fcgi/fast_count.cgi

Go ahead and convert one of your applications to FastCGI. You’ll see a major speed improvement. Before you do that, however, there are a few things to note. You should fix all memory leaks within your FastCGI programs, or else they could drastically affect your system resources. So, make sure to begin your scripts this way:

#!/usr/bin/perl -wT

use strict;

to check for warnings and to restrict variable scope.

Also, you should think about collapsing the functionality from your various CGI applications. Since CGI applications incur significant overhead for each request, it is a common practice to split the functionality into several little applications to reduce the overhead. But, with FastCGI, that is no longer a concern.
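
For example, a single FastCGI script can absorb several formerly separate applications by dispatching on a CGI parameter. Here is a minimal sketch; the action names and handler subroutines are hypothetical:

#!/usr/bin/perl -wT

use strict;
use CGI::Fast;

my %dispatch = (
    search => \&do_search,
    browse => \&do_browse,
);

while ( my $q = new CGI::Fast ) {
    my $action  = $q->param( "action" ) || "search";
    my $handler = $dispatch{$action}    || \&do_search;
    $handler->( $q );
}

sub do_search {
    my $q = shift;
    print $q->header( "text/plain" ), "Searching...\n";
}

sub do_browse {
    my $q = shift;
    print $q->header( "text/plain" ), "Browsing...\n";
}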

FastCGI offers other functionality as well, including the ability for the local web server to run FastCGI programs on remote machines. It’s beyond the scope of this chapter to go into detail about that topic, but you can find more information in the FastCGI documentation.

The technology we are about to look at offers speed improvements over conventional CGI applications comparable to FastCGI’s, but does so in an entirely different manner.

mod_perl

mod_perl is an Apache server extension that embeds Perl within Apache, providing a Perl interface to the Apache API. This allows us to develop full-blown Apache modules in Perl to handle particular stages of a client request. It was written by Doug MacEachern, and since it was introduced, its popularity has grown quickly.

The most popular Apache/Perl module is Apache::Registry, which emulates the CGI environment, allowing us to write CGI applications that run under mod_perl. Since Perl is embedded within the server, we avoid the overhead of starting up an external interpreter. In addition, we can load and compile all the external Perl modules we want to use at server startup, and not during the execution of our application. Apache::Registry also caches compiled versions of our CGI applications, thereby providing a further boost. Users have reported performance gains of up to 2000 percent in their CGI applications using a combination of mod_perl and Apache::Registry.

Apache::Registry is a response handler, which means that it is responsible for generating the response that will be sent back to the client. It forms a layer over our CGI applications; it executes our applications and sends the resulting output back to the client. If you don’t want to use Apache::Registry, you can implement your own response handler to take care of the request. However, these handlers are quite different from standard CGI scripts, so we won’t discuss how to create handlers with mod_perl. To learn about handlers along with anything else you might want to know about mod_perl, refer to Writing Apache Modules with Perl and C by Lincoln Stein and Doug MacEachern (O’Reilly & Associates, Inc.).

Installation and Configuration

Before we go any further, let’s install mod_perl. You can obtain it from CPAN at http://www.cpan.org/modules/by-module/Apache/. The Apache namespace is used by modules that are specific to mod_perl. The installation is relatively simple and should proceed smoothly:

$ cd mod_perl-1.22
$ perl Makefile.PL APACHE_PREFIX=/usr/local/apache \
>     APACHE_SRC=../apache-1.3.12/src \
>     DO_HTTPD=1 \
>     USE_APACI=1 \
>     EVERYTHING=1
$ make
$ make test
$ su
# make install

Refer to the installation directions that came with Apache and mod_perl if you want to perform a custom installation. If you’re not interested in developing and implementing the various Apache/Perl handlers, then you do not need the EVERYTHING=1 directive; in that case, you will be able to implement only a PerlHandler.

Once that’s complete, we need to configure Apache. Here’s a simple setup:

PerlRequire      /usr/local/apache/conf/startup.pl
PerlTaintCheck   On
PerlWarn         On

Alias /perl/ /usr/local/apache/perl/

<Location /perl>
SetHandler       perl-script
PerlSendHeader   On
PerlHandler      Apache::Registry
Options          ExecCGI
</Location>

As you can see, this is very similar to the manner in which we configured FastCGI. We use the PerlRequire directive to execute a startup script. Generally, this is where you would pre-load all the modules that you intend to use (see Example 17.3).

However, if you are interested in loading only a small set of modules (a limit of ten), you can use the PerlModule directive instead:

PerlModule  CGI  DB_File  MLDBM  Storable

For Apache::Registry to honor taint mode and warnings, we must add the PerlTaintCheck and PerlWarn directives; otherwise, they won’t be enabled. We do this globally. Then we configure the directory we are setting up to run our scripts.

All requests for resources in the /perl directory go through the perl-script (mod_perl) handler, which then passes the request off to the Apache::Registry module. We also need to enable the ExecCGI option. Otherwise, Apache::Registry will not execute our CGI applications.

Now, here’s a sample startup script in Example 17-3.

Example 17-3. startup.pl

#!/usr/bin/perl -wT

use Apache::Registry;

use CGI;

## any other modules that you may need for your
## other mod_perl applications ...

print "Finished loading modules. Apache is ready to go!\n";

1;

It is really a very simple program, which does nothing but load the modules. We also want Apache::Registry to be pre-loaded, since it’ll be handling all of our requests. One thing to note here is that each of Apache’s child processes will have access to these modules.

If we do not load a module at startup, but use it in our applications, then that module will have to be loaded once for each child process. The same applies for our CGI applications running under Apache::Registry. Each child process compiles and caches the CGI application once, so the first request that is handled by that child will be relatively slow, but all subsequent requests will be much faster.

mod_perl Considerations

In general, Apache::Registry provides a good emulation of a standard CGI environment. However, there are some differences you need to keep in mind:

  • The same precautions that apply to FastCGI apply to mod_perl: always use strict mode, and it helps to enable warnings. You should also always initialize your variables and not assume they are empty when your script starts; the warning flag will tell you when you are using undefined values. Your environment is not cleaned up when your script ends, so variables that do not go out of scope, as well as global variables, remain defined the next time your script is called.

  • Due to the fact that your code is only compiled once and then cached, lexical variables in the body of your scripts that you access within your subroutines create closures. For example, it is possible to do this in a standard CGI script:

    my $q = new CGI;
    
    check_input(  );
    .
    .
    
    sub check_input {
        unless ( $q->param( "email" ) ) {
            error( $q, "You didn't supply an email address." );
        }
        .
        .

    Note that we do not pass our CGI object to check_input. However, the variable is still visible to us from within that subroutine. This works fine as a standard CGI script, but it will create very subtle, confusing errors under mod_perl. The problem is that the first time the script runs on a particular Apache child process, the value of the CGI object becomes trapped in the cached copy of check_input. All future requests handled by that same Apache child process will reuse the original value of the CGI object within check_input. The solution is to pass $q to check_input as a parameter, or else change $q from a lexical to a global variable (see the sketch following this list).

    If you are not familiar with closures (they are not commonly used in Perl), refer to the perlsub manpage or Programming Perl.

  • The constant module creates constants by defining them internally as subroutines. Since Apache::Registry creates a persistent environment, using constants in this manner can produce the following warnings in the error log when these scripts are recompiled:

    Constant subroutine FILENAME redefined at ...

    It will not affect the output of your scripts, so you can just ignore these warnings. Another alternative is to simply make them global variables instead; the closure issue is not a problem for variables whose values never change. This warning should no longer appear for unmodified code in Perl 5.004_05 and higher.

  • Regular expressions that are compiled with the o flag will remain compiled across all requests for that script, not just for one request.

  • File age functions, such as -M, calculate their values relative to the time the application began, but with mod_perl, that is typically the time the server began. You can get this value from $^T. Since -M reports ages in days, adding (time - $^T)/86400 to the age of a file will yield its true age.

  • BEGIN blocks are executed once when your script is compiled, not at the beginning of each request. However, END blocks are executed at the end of each request, so you can use these as you normally would.

  • __END__ and __DATA__ cannot be used within CGI scripts with Apache::Registry. They will cause your scripts to fail.

  • Typically, your scripts should not call exit in mod_perl, or it will cause Apache to exit instead (remember, the Perl interpreter is embedded within the web server). However, Apache::Registry overrides the standard exit command so it is safe for these scripts.
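
To make the closure example shown earlier safe under Apache::Registry, pass the CGI object to check_input explicitly; here is a minimal sketch:

my $q = new CGI;

check_input ($q);

sub check_input {
    my $q = shift;      ## a parameter now, not a closed-over lexical
    unless ( $q->param( "email" ) ) {
        error( $q, "You didn't supply an email address." );
    }
}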

If it’s too much of a hassle to convert your application to run effectively under Apache::Registry, then you should investigate the Apache::PerlRun module. This module uses the Perl interpreter embedded within Apache, but doesn’t cache compiled versions of your code. As a result, it can run sloppy CGI scripts, but without the full performance improvement of Apache::Registry. It will, nonetheless, be faster than a typical CGI application.

Increasing the speed of CGI scripts is only part of what mod_perl can do. It also allows you to write code in Perl that interacts with the Apache response cycle, so you can do things like handle authentication and authorization yourself. A full discussion of mod_perl is certainly beyond the scope of this book. If you want to learn more about mod_perl, then you should definitely start with Stas Bekman’s mod_perl guide, available at http://perl.apache.org/guide/. Then look at Writing Apache Modules with Perl and C, which provides a very thorough, although technical, overview of mod_perl.
