Chapter 9. Sending Email

One of the most common tasks your CGI scripts need to perform is sending email. Email is a popular method for exchanging information between people, whether that information comes from other people or from automated systems. You may need to send email updates or receipts to visitors of your web site. You may need to notify members of your organization about certain events like a purchase, a request for information, or feedback about your web site. Email is also a useful tool to notify you when there are problems with your CGI scripts. When you write subroutines that respond to errors in your CGI scripts, it is a very good idea to include code to notify whoever is responsible for maintaining the site about the error.

There are several ways to send email from an application, including using an external mail client, such as sendmail or mail, or by directly communicating with the remote mail server via Perl. There are also Perl modules that make sending mail especially easy. We’ll explore all these options in this chapter by building a sample application that provides a web front end to an emailer.

Security

Since the subject of security is still fresh in our minds, however, we should take a moment to review security as it relates to email. Sending email is probably one of the largest causes of security errors in CGI scripts.

Mailers and Shells

Most CGI scripts open a pipe to an external mail client such as sendmail or mail, and pass the email address through the shell as a parameter. Passing any user data through a shell is a very bad thing, as we saw in the previous chapter (if you skipped ahead to this chapter, it would be wise to go back and review Chapter 8 before continuing). Unless you like living dangerously, you should never pass an email address to an external application via a shell. Nor is it possible to verify that email addresses contain only certain safe characters. Contrary to what you may expect, a proper email address can contain any valid ASCII character, including control characters and all those troublesome characters that have special meaning in the shell. We’ll review what constitutes a valid email address in the next section.
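As a minimal sketch of the alternative, the list form of Perl’s open starts a program directly, with no shell in between, so each argument arrives exactly as it was given. The example below uses the Perl interpreter itself ($^X) as a stand-in for an external program, and both the run_without_shell name and the address are invented for illustration.

```perl
use strict;
use warnings;

# Run an external program without a shell and return what it printed.
# The list form of open passes each argument straight to the program,
# so shell metacharacters in the arguments are never interpreted.
sub run_without_shell {
    my( $program, @args ) = @_;

    open my $pipe, "-|", $program, @args
        or die "Could not run $program: $!";
    local $/;                      # read all of the output at once
    my $output = <$pipe>;
    close $pipe or die "Error closing pipe to $program: $!";
    return $output;
}

# A hypothetical address that is legal but hostile to a shell
my $address = q{"; rm -rf /"@example.com};

# Stand-in for a real mailer: a Perl one-liner that echoes its argument
my $echoed = run_without_shell( $^X, "-e", 'print $ARGV[0]', $address );
print $echoed eq $address ? "Argument arrived intact\n"
                          : "Argument was altered\n";
```

Had the same address been interpolated into a single command string, the shell would have treated the semicolon as a command separator.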

False Identities

You have likely received email claiming to be from someone other than the true sender. It happens all the time with unsolicited bulk email (spam). Falsifying the return address in an email message is very simple to do, and can even be quite useful. You probably would rather have email messages sent by your web server appear to come from actual individuals or groups within your company than from the user (e.g., nobody) that the web server runs as. We’ll see how to do this in our examples later in this chapter.

So how does this relate to security? Say, for example, you create a web form that allows users to send feedback to members of your organization. You decide to generalize the CGI script responsible for this so you don’t have to update it when internal email addresses change. Instead, you insert the email addresses into hidden fields in the feedback form since they’re easier to update there. However, you do take security precautions. Because you recognize that it’s possible for a cracker to change hidden fields, you are careful not to pass the email addresses through a shell, and you treat them as tainted data. You handled all the details correctly, but you still have a potential security problem—it’s just at a higher level.

If the user can specify the sender, the recipient, and the body of the message, you are allowing them to send any message to anyone anywhere, and the resulting message will originate from your machine. Anyone can falsify the return address in an email message, but it is very difficult to try to mask the message’s routing information. A knowledgeable person can look at the headers in an email message and see where that message truly originated, and all the email messages your web server sends out will clearly originate from the machine hosting it.

Thus this feedback page is a security problem because crackers given this much freedom could send damaging or embarrassing email to whomever they wanted, and all the messages would look like they are from your organization. Although this may not seem as serious as a system breach, it is still something you probably would rather avoid.
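One way to close this hole, sketched here under the assumption that the set of internal recipients is small and known in advance, is to have the form submit only a short key and to keep the real addresses in the script. The keys, addresses, and the lookup_recipient name are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical mapping from form values to internal addresses; the keys
# and addresses are invented for this sketch.
my %RECIPIENTS = (
    sales   => 'sales@example.com',
    support => 'support@example.com',
    web     => 'webmaster@example.com',
);

# Return the address for a recipient key, or undef if the key is unknown.
sub lookup_recipient {
    my $key = shift;
    return undef unless defined $key;
    return $RECIPIENTS{$key};      # undef when the key is not listed
}

my $to = lookup_recipient( "support" );
print defined $to ? "Send to $to\n" : "Unknown recipient\n";
```

Whatever a user puts in the form field, the message can only go to an address that already appears in the table.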

Spam

Spam, of course, refers to unsolicited junk email. It’s those messages that you get from someone you’ve never heard of advertising weight loss plans, get-rich schemes, and less-than-reputable web sites. None of us like spam, so be certain your web site doesn’t contribute to the problem. Avoid creating CGI scripts that are so flexible that they allow the user to specify the recipient and the content of the message. The previous example of the feedback page illustrates this. As we saw in the last chapter, it is not difficult to create a web client with LWP and a little bit of Perl code. Likewise, it would not be difficult for a spammer to use LWP to repeatedly call your CGI script in order to send out numerous, annoying messages.

Of course, most spammers don’t operate this way. The big ones have dedicated equipment, and for those who don’t, it’s much more convenient to hijack an SMTP server, which is designed to send mail, than to pass requests through a CGI script. So even if you do create scripts that are wide open to hijacking, the chances that someone will exploit them are slim ... but what if it does happen? You probably do not want to face the mass of angry recipients who have tracked the routing information back to you. When it comes to security, it’s always better to play it safe.

Email Addresses

Part of handling mail includes handling email addresses. Collecting email addresses from users seems to be part of almost any registration form on the Web.[17] You may wonder how you can know whether an email address entered into a form is valid. The simple answer, of course, is that you can’t. You can validate that the email address is syntactically valid (although this is considerably more difficult than you might expect), but you cannot know whether the email address actually corresponds to a valid account or not.

You may think you should be able to make a query to an SMTP server to check whether an email address is valid or not. In fact, the SMTP protocol supports a command to validate an email address. Unfortunately, this really cannot be used in practice. There are two problems.

The first problem is that the SMTP server responsible for handling the mail for that email address may not always be accessible. There may be intermediate network outages, and even when the network is fine, mail servers are frequently overloaded and may refuse requests. These are not typically a problem for Internet mail because other mail servers trying to deliver to them maintain queues of messages and retry several times, often for days, before giving up. However, if you need immediate verification, the mail server may not be available to give it to you.

The second problem is that even when the final SMTP server is available, it may not provide reliable information. Many SMTP servers simply gateway messages to an internal mail system, which may speak another protocol and be located on another network. Because of this, one of these SMTP gateways may not know which email addresses are valid on the other network; it may simply be set up to forward all Internet mail. Therefore, when this SMTP server is asked to verify an email address, it may state that any email address addressed to its domain is deliverable, whether it is or not.

The best that you can do if you need to validate an email address is send an actual email to that address and ask the user to respond. We will look at ways to write scripts to respond to email later in this chapter. For now, let’s look at how to recognize syntactically valid email addresses.
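One possible way to recognize the user’s response, offered only as a sketch and not as the approach taken later in the chapter, is to include a token in the message that only your server could have generated for that address. The example assumes the core Digest::SHA module; the secret and the subroutine names are placeholders.

```perl
use strict;
use warnings;
use Digest::SHA qw( hmac_sha256_hex );

# Secret known only to the server; a placeholder value for this sketch.
my $SECRET = "replace-with-a-long-random-string";

# Build a token tied to one email address.
sub confirmation_token {
    my $email = shift;
    return hmac_sha256_hex( $email, $SECRET );
}

# Check a token that came back from the user against the address it
# claims to confirm.
sub token_is_valid {
    my( $email, $token ) = @_;
    return defined $token && $token eq confirmation_token( $email );
}

my $token = confirmation_token( 'user@example.com' );
print "Token to include in the message: $token\n";
```

Because the token depends on both the address and the secret, a user cannot confirm an address without having received the message sent to it.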

Validating Syntax

A common question that new CGI developers ask is what the regular expression for matching email addresses looks like. If you ask around, some people will refer you to a book called Mastering Regular Expressions by Jeffrey Friedl (O’Reilly & Associates, Inc.). Others might give you a simple expression that checks for “@” and that checks that the domain name ends in a dot and two or three letters. In fact, neither of these answers is fully accurate.

To understand why, let’s review a little history. The standard document for defining email address names is RFC 822. It was published in 1982. Does that seem like a long time ago to you? It should. The Internet was radically different then. In fact, it wasn’t called the Internet then—it was a collection of many different networks, including ARPAnet, Bitnet, and CSNET, each with their own naming conventions. TCP/IP was being introduced as a new networking protocol and hosts only numbered in the hundreds. It wasn’t until 1983 that serious work began on implementing domain name servers. The hierarchical names that we recognize today like www.oreilly.com did not exist back then.

So that is half of the story. The other half of the story is that Jeffrey Friedl, in his book Mastering Regular Expressions, tackled creating a regular expression to handle the parsing of RFC 822 email addresses. The book is the best reference for understanding regular expressions in Perl or any other context. Many people cite the regular expression he constructs as the only definitive test of whether an Internet email address is valid. But unfortunately these people have misunderstood what it does; it tests for compliance with RFC 822. According to RFC 822, these are all syntactically valid email addresses:

Alfred Neuman <Neuman@BBN-TENEXA>
":sysmail"@  Some-Group. Some-Org
Muhammed.(I am  the greatest) Ali @(the)Vegas.WBA

Do any of them look like the type of email address you’d want to capture in an HTML form? It is true that RFC 822 has not been superseded by another RFC and is still a standard, but it is equally true that the problem we are trying to solve is radically different in time and context from the problem that it solved in 1982.

We want an expression to recognize a syntactically valid email address as required on the Internet today. We are interested only in today’s standard Internet domain-naming convention. That would actually rule out all of the above addresses, since none of them ends in one of our current top-level domains (.com, .net, .edu, .uk, etc.). There are other important distinctions.

The first example is a full email address including a name and what RFC 822 refers to as the address specification in angled brackets. You may have seen this expanded syntax in your email software. We do not need, and probably don’t want, this additional information in an email address captured in a form. In all likelihood, the user’s name is being captured separately in other fields. When we need to validate an email address that a user has entered, we are generally only interested in the address specification itself. So henceforth when we refer to an email address, we are simply referring to this address specification, the user@hostname part.

The second example contains a quoted element (any group of characters separated by a “.” or a “@” we will refer to as an element [18]). Quoted elements are completely acceptable and still work fine on today’s Internet. If you want to accept valid email addresses, you should accept quoted elements. Only elements on the left side of the “@” may be quoted, but any ASCII character is allowed within quotes (some have to be escaped with a backslash). This is why any check in our code for “invalid characters” in an email address would be flawed, and this is why it is very dangerous to pass email addresses through a shell as an argument to a command.

The second email address also includes spaces. Spaces (and tabs) are legal between any element and at the beginning and end of the email address. However, it doesn’t change the meaning to remove them and that is exactly what emailers generally do when you send a message to an email address containing spaces. Note, however, that you cannot simply remove every space in an email address since spaces appearing within quotes do carry meaning and must be left intact. Only those appearing outside of quotes can be removed. We will strip them in our example. We probably don’t have to; it is not unreasonable to expect your users to enter the email address without extra spaces.

The last example contains comments. It is perfectly legal to include comments, which are enclosed within parentheses, anywhere where spaces are allowed. Comments are only intended to pass additional information to humans, and machines can ignore them. Thus, it is rather silly to enter them into an automated web form. We will simplify our code by not accepting comments in the email addresses we are checking.

So here is the code that we will use to validate email addresses. It is considerably shorter than the example given by Mr. Friedl, but it is not nearly so flexible. It does not support comments, it removes spaces before validating, and it limits hosts to modern domain names and IP addresses. Nonetheless, it is quite complicated, and the regular expression to perform the check would be too difficult to type out. Instead, we build it through a number of intermediate variables. The process of doing this is too involved to explain here. If you want to understand how to build complex regular expressions like this, we highly recommend Mastering Regular Expressions.

One note, however: the variable $top_level contains the expression that matches valid top-level domains. Our current top-level domains have two letters (e.g., .us, .uk, .au) or three letters (e.g., .com, .org, .net). The number of top-level domains will certainly increase. Some of the proposed new names, such as .firm, have more than three characters. Thus, the regular expression below will allow anywhere from two to four characters:

my $top_level   = qq{ (?: $atom_char ){2,4} };

If you want to be more restrictive today, you can limit it to three. Likewise, if top-level domains with more than four characters are someday allowed, you would need to increase it.

Finally, here’s the code:

sub validate_email_address {
    my $addr_to_check = shift;

    # Strip spaces and tabs that fall outside of quoted elements
    $addr_to_check =~ s/("(?:[^"\\]|\\.)*"|[^\t "]*)[ \t]*/$1/g;
    
    # Building blocks; single quotes keep the backslashes for the regex engine
    my $esc         = '\\\\';
    my $space       = '\040';
    my $ctrl        = '\000-\037';
    my $dot         = '\.';
    my $nonASCII    = '\x80-\xff';
    my $CRlist      = '\012\015';
    my $letter      = 'a-zA-Z';
    my $digit       = '\d';
    
    my $atom_char   = qq{ [^$space<>\@,;:".\\[\\]$esc$ctrl$nonASCII] };
    my $atom        = qq{ $atom_char+ };
    my $byte        = qq{ (?: 1?$digit?$digit | 
                              2[0-4]$digit    | 
                              25[0-5]         ) };
    
    my $qtext       = qq{ [^$esc$nonASCII$CRlist"] };
    my $quoted_pair = qq{ $esc [^$nonASCII] };
    my $quoted_str  = qq{ " (?: $qtext | $quoted_pair )* " };
    
    my $word        = qq{ (?: $atom | $quoted_str ) };
    my $ip_address  = qq{ \\[ $byte (?: $dot $byte ){3} \\] };
    my $sub_domain  = qq{ (?: [$letter$digit]|
                            [$letter$digit][$letter$digit-]{0,61} [$letter$digit]) };
    my $top_level   = qq{ (?: $atom_char ){2,4} };
    my $domain_name = qq{ (?: $sub_domain $dot )+ $top_level };
    my $domain      = qq{ (?: $domain_name | $ip_address ) };
    my $local_part  = qq{ $word (?: $dot $word )* };
    my $address     = qq{ $local_part \@ $domain };
    
    # Return the cleaned-up address on a match, or an empty (false) string
    return $addr_to_check =~ /^$address$/ox ? $addr_to_check : "";
}

If you supply an email address to validate_email_address, it will strip out any spaces or tabs that are not within quotes. We’re being a little lenient here since spaces within elements (as opposed to spaces around elements) are actually illegal, but we’ll just strip them in this step along with the legal spaces. We then check the address against our regular expression. If it matches, the email address is valid and is returned (without spaces). Otherwise, an empty string is returned, which evaluates to false in Perl. You can use the subroutine like so:

use strict;
use CGI;
use CGIBook::Error;

my $q     = new CGI;
my $email = validate_email_address( $q->param( "email" ) );

unless ( $email ) {
    error( $q, "The email address you entered is invalid. " .
               "Please use your browser's Back button to " .
               "return to the form and try again." );
}
.
.

If you were planning to check multiple email addresses or intended to use this in an environment where your Perl code is precompiled (like mod_perl or FastCGI), then you could optimize this code by building the regular expression once and caching this expression. However, this example is intended more to demonstrate why validating email addresses is a challenge than to be used in production (it does not resolve the issue that an email address can be syntactically valid yet bad).
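The caching idea might look something like the following sketch, which compiles a pattern with qr// on first use and keeps it in a variable that survives between calls. The pattern here is a deliberately simplified stand-in, not the full expression built by validate_email_address, and the simple_address_check name is made up for the example.

```perl
use strict;
use warnings;

{
    my $address_re;    # compiled pattern, built on first use and then reused

    sub simple_address_check {
        my $addr = shift;

        # Simplified stand-in pattern; it only illustrates the caching
        $address_re ||= qr/^ [^\s\@]+ \@ (?: [a-zA-Z\d-]+ \. )+ [a-zA-Z]{2,4} $/x;

        return $addr =~ $address_re ? $addr : "";
    }
}

foreach my $candidate ( 'user@example.com', 'no-at-sign' ) {
    print simple_address_check( $candidate ) ? "ok:  $candidate\n"
                                             : "bad: $candidate\n";
}
```

In a persistent environment the enclosing block runs once, so every later request reuses the pattern that the first call compiled.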

Structure of Internet Email

Email messages are documents containing headers and a body separated by a blank line. Each header contains a field name followed by a colon, some space, and a value. Does this sound familiar? In a basic sense, Internet mail messages are similar in structure to HTTP messages. There are also a number of differences of course: there is no request line or status line; email messages are text documents (binary attachments must be encoded as text before being sent); and most of the field names are different. But if you recall the basic header and body format from our earlier HTTP discussion, that will help you understand how to create email messages.

Some header fields hold email addresses. These can support the full syntax of email addresses that we saw earlier, including the recipient’s name in addition to the email address itself, like so:

Mary Smith <[email protected]>

The shorter [email protected] is also acceptable.

There are only a few header fields you need to include in email messages: who it is to, who it is from, and what it is about. The first of these is actually any of three fields: To, Cc, and Bcc. To and Cc (which stands for carbon copy) contain the email addresses of the recipients of the message. The Bcc field (which stands for blind carbon copy) does likewise but is deleted from the message before it is sent. The From field contains the email address of the person the message is from. If you want replies to be directed elsewhere, you may also specify that email address in the Reply-To field. Finally, the Subject field contains a summary of the message.

So far, this is all pretty basic; all of us have received email before. There is, however, a subtle distinction that is important to note. Internet email is in some ways similar to real paper mail: it has a message, which can contain anything, inside an envelope, and the envelope carries the routing information. On formal letters, you often add the recipient’s address information to the top of the message; however, it’s quite possible to put a message addressed this way in an envelope that is actually addressed and routed to someone else. The same thing is possible with Internet email. The To, Cc, Bcc, and From fields are actually part of the message. They do not determine any of the routing information and do not need to match who the message truly is to or from. You have probably received spam that, according to the To field, appeared to be addressed to someone other than you; likewise, the sender listed in the From field on most spam is not the true sender. However, for our purposes, we typically do want the address information and these fields to line up. We’ll explore this more when we review each of the mailers below.

There are many other important fields that appear in the headers of email messages, but mailers take care of adding these for you, so we won’t include them in our discussion.
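The header-and-body layout described above can be sketched as a small subroutine that joins header fields, a blank line, and the body. The compose_message name and the addresses are invented for illustration; the check for line breaks is an added precaution, since a line break inside a value would start a new header field.

```perl
use strict;
use warnings;

# Assemble a message: header lines, one blank line, then the body.
# $headers is an array reference of [ name, value ] pairs so that the
# order of the fields is preserved.
sub compose_message {
    my( $headers, $body ) = @_;
    my $message = "";

    foreach my $field ( @$headers ) {
        my( $name, $value ) = @$field;

        # A line break inside a value would let a user smuggle in extra
        # header fields, so refuse it outright.
        die "Line break in $name header\n" if $value =~ /[\r\n]/;
        $message .= "$name: $value\n";
    }

    return $message . "\n" . $body;
}

print compose_message(
    [ [ To      => 'support@example.com' ],
      [ Subject => 'Web Site Feedback'   ] ],
    "Feedback from a user.\n"
);
```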

sendmail

Without sendmail, Internet email might not exist. Although other mail transport agents (MTAs) do exist, the vast majority of mail servers on the Internet run sendmail. It was originally written by Eric Allman starting around 1980 and, as we mentioned earlier, the Internet was very different then. sendmail tackled the formidable task of transferring mail between very different networks. Thus, it has never been a simple program, and it has grown. It has become one of the most difficult applications to understand fully; the number of command-line flags and configuration parameters it now accepts is truly mind-boggling. Fortunately, we only need to learn a few things in order to have it send messages for us. If you want to learn more about sendmail, see sendmail by Bryan Costales with Eric Allman (O’Reilly & Associates, Inc.).

sendmail generally comes preinstalled on Unix machines and has recently been ported to Windows NT. On Unix, it is often installed in /usr/lib/sendmail, but /usr/sbin/sendmail and /usr/ucb/lib/sendmail are also possible locations. These examples will use /usr/lib/sendmail as the location of the sendmail program. If your copy is installed elsewhere, simply replace this with the path to your copy of sendmail.

Command-Line Options

You generally want to call sendmail with at least a couple of command-line options. When sending a message, sendmail assumes it is being run interactively by a user, so it sets the sender to be that of the user, and it allows the user to enter a period on its own line to signal the end of the message. You can override these settings and will probably want to. In addition, if you are sending multiple email messages, you may wish to queue them so that sendmail can deliver them asynchronously without pausing to deliver each one.

Table 9-1 lists the important options you should know.

Table 9-1. Common sendmail Options

Option              Description
-t                  Read To, Cc, and Bcc from the message headers.
-f "email address"  Make the message appear to be From the specified email address.
-F "full name"      Make the message appear to be From the specified name.
-i                  Ignore periods on lines by themselves.
-odq                Queue messages to be sent later instead of processing them one at a time.

Example 9-1 is a short CGI script in Perl that uses many of these options.

Example 9-1. feedback_sendmail.cgi

#!/usr/bin/perl -wT

use strict;
use CGI;

# Clean up environment for taint mode before calling sendmail
BEGIN {
    $ENV{PATH} = "/bin:/usr/bin";
    delete @ENV{ qw( IFS CDPATH ENV BASH_ENV ) };
}

my $q       = new CGI;
my $email   = validate_email_address( $q->param( "email" ) );
my $message = $q->param( "message" );

unless ( $email ) {
    print $q->header( "text/html" ),
          $q->start_html( "Invalid Email Address" ),
          $q->h1( "Invalid Email Address" ),
          $q->p( "The email address you entered is invalid. " .
                 "Please use your browser's Back button to " .
                 "return to the form and try again." ),
          $q->end_html;
    exit;
}

send_feedback( $email, $message );
send_receipt( $email );

print $q->redirect( "/feedback/thanks.html" );

sub send_feedback {
    my( $email, $message ) = @_;
    
    open MAIL, "| /usr/lib/sendmail -t -i"
        or die "Could not open sendmail: $!";
    
    print MAIL <<END_OF_MESSAGE;
To: [email protected]
Reply-To: $email
Subject: Web Site Feedback

Feedback from a user:

$message
END_OF_MESSAGE
    close MAIL or die "Error closing sendmail: $!";
}

sub send_receipt {
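    my $email      = shift;

    # The sender details are interpolated into a shell command below, so they
    # must be fixed values and never user input. The address is a placeholder.
    my $from_name  = "The Webmaster";
    my $from_email = 'webmaster@example.com';
    
    open MAIL, "| /usr/lib/sendmail -t -F'$from_name' -f'$from_email'"
        or die "Could not open sendmail: $!";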
    print MAIL <<END_OF_MESSAGE;
To: $email
Subject: Your feedback

Your message has been sent and someone should be responding to you 
shortly. Thanks for taking the time to provide us with your feedback!
END_OF_MESSAGE
    close MAIL or die "Error closing sendmail: $!";
}

We collect two pieces of information from the user: an email address and a message to send to customer service. We validate the email address according to the subroutine earlier in this chapter, but we don’t include the code for that subroutine here. The script then composes two messages and forwards users to a static page to thank them.

The first message goes to customer service. It uses the -t option as well as the -i option. The -i option is a good idea if the message includes any dynamic information. It prevents a single dot from prematurely ending the email message.

The -t option is the most important of these options. It tells sendmail to read the routing information for the recipient from the message itself. Otherwise you have to provide the recipient’s email address on the command line. Generally, you call sendmail like this:

/usr/lib/sendmail [email protected]

sendmail then reads the message including the headers and body from its STDIN and sends the message on to Mary, even if the To, Cc, or Bcc fields say it should go elsewhere! This can get confusing.

You should always use the -t flag. First, it makes your life easier, since it automatically handles the To, Cc, and Bcc fields. Second, it lets you avoid that awful security risk of passing user data through the shell. Many times you will be sending email to an address that was entered into an HTML form, so being able to simply include the email address in the headers of the message instead is another big win.

Once this message has been sent, the script sends a confirmation to the user. It also uses the -t option here, and here we see the security benefit. The email address comes from the user, but we don’t have to worry about passing it through the shell.

In this second email, we also use two other options to override the sender’s routing information. sendmail will not automatically read the sender’s email address from the headers as it does for the recipients with the -t option. The sender must be specified with the -f and -F options. There are two options in order to support the extended address notation including a name and an email address in this form:

The Webmaster <[email protected]>

It is important to override the sender’s routing information because if the message to Mary bounces, it will come back to the original sender, and if the user that the web server runs as has a standard account with a mail box, bounced messages will collect there. If it has no mail account, then they’ll bounce back and forth either until they time out or some system administrator gets annoyed at the increased network traffic and steps in. Ideally, your system should be configured so that any mail addressed to nobody (the user your web server runs as) is automatically forwarded to the webmaster. If this hasn’t been done, or you aren’t sure, then it’s a good idea to set the -f option to a real email address that someone monitors or that is processed automatically. We’ll see how to set up a process to handle mail like this at the end of this chapter.

Note that if you do override the sender’s email address with the -f option, sendmail will add an extra header to the email message unless you are a trusted user. This extra header typically looks like the following:

X-Authentication-Warning: scripted.com: sguelich set sender to [email protected] using -f

By default, the users who have permission to use the -f option without generating this warning are root, daemon, and uucp. Most mail agents do not actually pay attention to this header, so it is rare that recipients will see it. However, you can avoid sending it by adding nobody to the trusted users section in /etc/sendmail.cf.

Mail Queue

The remaining option we haven’t discussed is the -odq option. It is useful if you are sending out many messages at the same time. For example, you may run a web site that connects job hunters with available positions. You have the job hunters record keywords describing the types of positions they are looking for in a database along with their email addresses. Then, when the new positions available today have been entered, you start a CGI script which matches the job hunters’ keywords against the positions. The script generates and sends out customized messages to the job hunters notifying them if there are any matches. In this example, you would want to use the -odq option. It takes sendmail time to find remote servers and deliver messages, so your script runs much, much faster if you simply add them to the queue to be processed separately and don’t wait for sendmail to try to deliver each message.

You do need to make sure that sendmail is configured on your system to process the queue or the messages may just sit around indefinitely. If you aren’t sure, ask your system administrator.

Also note that queuing messages this way is only a good idea if each message you are sending out is unique. If you are sending the same message to multiple people, don’t queue a separate message addressed to each person; use the Bcc field instead.
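As a rough sketch of the Bcc approach, the subroutine below builds a single message whose Bcc field lists every recipient. The compose_announcement name, the addresses, and the message text are all hypothetical, and the addresses are assumed to have been validated already.

```perl
use strict;
use warnings;

# Build a single message for a whole list of recipients. The addresses are
# assumed to have been validated already; all names here are illustrative.
sub compose_announcement {
    my( $from, $subject, $body, @recipients ) = @_;

    return "From: $from\n"
         . "Bcc: " . join( ", ", @recipients ) . "\n"
         . "Subject: $subject\n"
         . "\n"
         . $body;
}

my $message = compose_announcement(
    'jobs@example.com',
    'New positions posted today',
    "New positions were added to the site today.\n",
    'pat@example.com', 'lee@example.com',
);
print $message;

# Handing $message to sendmail once, with -t so that the Bcc field is read
# and then removed, replaces one queued message per recipient:
#
#     open MAIL, "| /usr/lib/sendmail -t -i" or die "...";
#     print MAIL $message;
#     close MAIL or die "...";
```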

mailx and mail

mailx and mail are other popular options for sending email. Some people even argue that they are more secure than sendmail. It is true that because sendmail is such a large, complicated program, and because it runs as root, it has been the source of a number of security holes over the years. However, the notion that it is a less secure option in CGI scripts is a dubious one. One serious problem with mailx and mail is that they allow tilde escapes: any line in the body of the message beginning with ~! is executed as a command. Many versions do attempt to detect whether they are being run by a user on a terminal and disable tilde escapes otherwise, but this is a serious potential risk.

A second problem with mailx and mail is that they offer nothing comparable to sendmail’s -t option. Thus, if you want to use mail, for example, you must use the fork and exec trick we described in the last chapter:

open MAIL, "|-" or exec( "/bin/mail", $email ) or 
    die "Cannot exec mail $!";

Finally, mailx and mail also lack the useful options we discussed with sendmail, such as overriding the sender.

Perl Mailers

There are other programs you can use for sending mail, but they are not as common. Some of these, such as blat, provide simple mailers for Windows systems. Instead of looking at these, we’ll look at a Perl solution that works across all operating systems.

Mail::Mailer is a popular Perl module for sending Internet email. It provides a simple interface for sending messages with sendmail and mail (or mailx). It also allows you to send messages via SMTP without an external application, which makes it possible to send messages on non-Unix systems like Windows and even the MacOS.

You can use Mail::Mailer this way:

use Mail::Mailer;

my $mailer = new Mail::Mailer ( "smtp" );
$mailer->open( {
    To       => $email,
    From     => 'The Webmaster <[email protected]>',
    Subject  => 'Web Site Feedback'
} );

print $mailer <<END_OF_MESSAGE;
Your message has been sent and someone should be responding to you 
shortly. Thanks for taking the time to provide us with your feedback!
END_OF_MESSAGE

close $mailer;

When you create a Mail::Mailer object, you can specify whether you want it to send the message one of three ways:

mail

Mail::Mailer will search your system for mailx, Mail, or mail in that order and use the first one it finds (we didn’t discuss Mail, although on many systems Mail and mail are the same: mail is simply a symlink to Mail).

sendmail

Mail::Mailer will use sendmail to send mail.

smtp

Mail::Mailer will use the Net::SMTP Perl module to send mail.

If you do not specify an argument when you create an object, Mail::Mailer will search through each of these three options in order and use the first one it finds. When Mail::Mailer uses an external mailer, it uses the fork and exec technique to avoid passing arguments through the shell.

Mail::Mailer is primarily useful when you use it to send mail via SMTP on systems without sendmail. Even though it allows you to use sendmail as its mailer, there is no way for you to specify command-line options the way you can if you use sendmail directly. Mail::Mailer only uses the -t option when it calls sendmail.

To send mail directly through SMTP with Mail::Mailer, you need to have the Net::SMTP module, which is part of the libnet bundle available on CPAN. When you install this module, it should ask you for the SMTP server you use on your network. If this was not configured when the module was installed, you have two options. You can edit the installed Net/Config.pm file in your Perl libraries folder and add your SMTP server to the smtp_hosts element of the NetConfig hash at the bottom of the file, or you can specify it when you create a Mail::Mailer object. You can do so like this:

my $mailer = Mail::Mailer->new( "smtp", Server => $server );

In this example, $server would contain the name of your SMTP server. Your network administrator or internet service provider should be able to provide you with the name of this machine.
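The two options can be combined in one small wrapper. The following is only a sketch under stated assumptions: mailer_args and send_notice are hypothetical names, and we assume a failure shows up either as an exception or as a false return value from close. If no server is supplied, the Server option is left out and Net::SMTP falls back on whatever was configured in Net/Config.pm.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for Mail::Mailer->new.
# Pass Server only when the caller supplied one; otherwise rely on the
# SMTP host configured when libnet was installed.
sub mailer_args {
    my ($server) = @_;
    return ( defined $server && length $server )
        ? ( "smtp", Server => $server )
        : ( "smtp" );
}

# Hypothetical helper: send one message and report success or failure.
# Expects to, from, subject, body and, optionally, server.
sub send_notice {
    my (%opt) = @_;

    my $ok = eval {
        require Mail::Mailer;    # loaded on first use
        my $mailer = Mail::Mailer->new( mailer_args( $opt{server} ) );
        $mailer->open( {
            To      => $opt{to},
            From    => $opt{from},
            Subject => $opt{subject},
        } );
        print {$mailer} $opt{body};
        $mailer->close or die "Could not send whole message: $!\n";
        1;
    };

    warn "Mail not sent: $@" unless $ok;
    return $ok ? 1 : 0;
}
```

Wrapping the calls in eval keeps a mail failure from taking down the whole CGI script, which matters when the message is only a courtesy notice.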

procmail

If your CGI scripts send out email, procmail is a very handy tool to learn, but it is only available for Unix. If you are on a Unix system and you do not have it, you can download it from http://www.procmail.org. procmail is a filtering application that allows you to automatically process email based on virtually any criteria. It’s not simple, of course; few powerful tools are. And again, like the other tools presented in this chapter, we won’t be able to discuss it in great detail here. Instead, we’ll look at a couple of configurations that should handle your basic needs. If you want to learn more, you can find links to several useful resources including quick-start guides and FAQs at http://www.iki.fi/era/procmail/. Also, don’t forget to review the manpages; most of these online resources assume you have done this already. You may not normally enjoy reading manpages, but the procmail manpages are very well written and include numerous examples.

In order to run procmail, you need to create two files in your home directory (or the home directory of the user whose mail you want to forward). The first file, .forward, is used by sendmail when it delivers mail to your account. Set that file up to direct sendmail to run procmail; procmail then uses the second file, .procmailrc, to process the message. It is also possible to have procmail set up as the local delivery agent on your system, so that sendmail hands every incoming message to it; in that case, you do not need the .forward file. Check with your system administrator to see if this is the case.

Your .forward file needs to include only the following line:

"|IFS=' '&&exec /usr/local/bin/procmail -f-||exit 75 #YOUR_USERNAME"

All of the quotes are necessary, there is only one space between the single quotes, you must supply the full path to procmail, and you should of course replace YOUR_USERNAME with your own username (or something to make this line different from the line in other users’ .forward files).

Autoreply from nobody

Now all we need to do is create a .procmailrc file. The .procmailrc file contains rules, each of which pairs a set of conditions with a command to execute when those conditions match. In this example, we will create one rule that sends an autoreply to incoming messages and a second that discards everything else. This is handy if mail sent to the user that the web server runs as is not already redirected elsewhere. If your web server runs as a valid user named nobody, you could place this in nobody’s home directory. Here is the .procmailrc file:

## This is your email address
EMAIL_ADDRESS=[email protected]

## Uncomment and edit this line if sendmail isn't at /usr/lib/sendmail
#SENDMAIL=/path/to/sendmail

## If we get a message, verify that it wasn't sent by a mail daemon
## or isn't one of our marked messages. If not, then reply to it using
## the contents of the autoreply.txt file as the body of the message
## and mark the message by adding an X-Loop header. The $ flag on the
## second condition tells procmail to expand $EMAIL_ADDRESS there, and
## the trailing backslash continues the command onto the next line.
:0 h
* !^FROM_DAEMON
* $ !^X-Loop: $EMAIL_ADDRESS
| ( formail -r -A"X-Loop: $EMAIL_ADDRESS"; \
    cat "$HOME/autoreply.txt"            ) | $SENDMAIL -t

## Throw away the messages we're not replying to
:0
/dev/null

We’ll briefly review what this file does. For more detailed information, refer to one of the references listed earlier. First, it sets the $EMAIL_ADDRESS variable to the email address of the account receiving this mail. Next, it should specify the path to sendmail if it is something other than the path that procmail defaults to (typically either /usr/lib/sendmail or /usr/sbin/sendmail ). The remaining lines are rules.

All rules start with :0. The first rule also has an h option indicating that we are only interested in the message headers of the incoming message; its body will not be included in our reply. All the lines that begin with * are conditions. Basically, any message that doesn’t look like it was generated by a daemon process (this includes bounced mail, mailing lists, etc.) and doesn’t include an X-Loop header with our email address in it should be processed by this rule. We’ll see why we check for this header in a moment.

The message is processed by piping the headers through formail, a helper application included with procmail. It constructs a reply to the given headers and adds an X-Loop header containing our email address. We add this header to our replies and check for it in incoming messages in order to avoid endless loops. If our CGI script sends a message that bounces (because of an invalid email address, a full account, etc.) and comes back to us, and we automatically reply to it, our reply will also bounce. This could go on forever. The X-Loop header breaks the cycle: it should be preserved when one of our messages is bounced or replied to, so if an incoming message already carries it, we know that we have replied once and should not reply again. The check for whether the message was generated by a daemon should actually prevent us from replying to a bounce, but the daemon check isn’t foolproof, so the X-Loop check is a good way to be safe.
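If it helps to see the same decision in Perl, here is a rough sketch of the logic behind the two conditions. The name should_autoreply is hypothetical, and the daemon test is a deliberately simplified stand-in of our own; procmail's actual FROM_DAEMON pattern is much longer and catches far more cases.

```perl
use strict;
use warnings;

# Hypothetical helper: given a raw header block and our own address,
# decide whether the message should receive an automatic reply.
sub should_autoreply {
    my ( $headers, $our_address ) = @_;

    # Already marked by us: replying again could start a mail loop.
    return 0
        if $headers =~ /^X-Loop:[ \t]*\Q$our_address\E[ \t]*$/mi;

    # Simplified daemon check (an assumption, not procmail's pattern):
    # bulk-style Precedence values or a mail system sender.
    return 0
        if $headers =~ /^Precedence:[ \t]*(?:junk|bulk|list)\b/mi;
    return 0
        if $headers =~ /^From:?[ \t].*\b(?:mailer-daemon|postmaster)\b/mi;

    return 1;
}
```

Note that the address is quoted with \Q and \E so that the dots in it match literally rather than as regular expression wildcards.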

formail takes care of the headers for us, and then we cat the contents of the autoreply.txt file in our home directory. You should create a message in this file appropriate to your site, saying something to the effect that this email address is not used and providing an alternative email address to the recipient. The final results of both the headers and the body are piped to sendmail, which reads the headers and delivers our new reply.

The remaining rule in the file has no conditions. It catches all messages that are not processed by the preceding rule, in other words, all messages that are sent by daemons or that have already been replied to. These messages are simply discarded by moving them to /dev/null.

Forwarding to Another User

It is also possible to simply forward all messages to another user. There are better alternatives than procmail for doing this. Specifically, sendmail allows aliases to be created that redirect mail addressed to one email address to another. However, if you cannot get your system administrator to create an alias for you, here is a .procmailrc file that forwards all incoming mail to another email address:

## This is the email address to forward to
FORWARD_TO=[email protected]

## Uncomment and edit this line if sendmail isn't at /usr/lib/sendmail
#SENDMAIL=/path/to/sendmail

## Forward all messages
:0
! $FORWARD_TO

As you can see, procmail provides you with a number of options for automatically processing email. In one of our examples earlier, we piped the headers of incoming messages through formail. We could have just as easily piped the headers, the body, or the whole message through a Perl script and thus be able to react to every email that arrives. For example, you might want to flag or delete a database record when mail you send to that user is returned as undeliverable. That’s just one example; you can probably think of others specific to your site.
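To make the undeliverable mail example a little more concrete, here is a minimal sketch of the parsing such a script might do. It assumes the bounce follows the standard delivery status notification format (RFC 3464), in which each failed address appears in a Final-Recipient field; many mail systems send bounces in that format, but not all do. The name failed_recipients and the sample message are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the addresses named in the Final-Recipient
# fields of a bounce message, without duplicates. Bounces that do not
# use this format yield an empty list.
sub failed_recipients {
    my ($message) = @_;
    my ( %seen, @found );

    while ( $message =~ /^Final-Recipient:\s*rfc822\s*;\s*(\S+)/mgi ) {
        my $addr = $1;
        $addr =~ s/^<|>$//g;    # drop surrounding angle brackets
        push @found, $addr unless $seen{$addr}++;
    }
    return @found;
}

# A made-up, heavily abbreviated bounce used only to show the call.
my $sample_bounce = <<'END_OF_BOUNCE';
From: MAILER-DAEMON@example.org
Subject: Returned mail: see transcript for details
Content-Type: multipart/report; report-type=delivery-status

Final-Recipient: rfc822; jdoe@example.com
Action: failed
Status: 5.1.1
END_OF_BOUNCE

print "$_\n" for failed_recipients($sample_bounce);    # prints jdoe@example.com
```

In a script that procmail pipes whole messages to, you would read the message from STDIN instead of using the sample text, and then flag or delete the matching database records rather than printing the addresses.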



[17] This isn’t necessarily a good thing. Many sites have adopted the common practice of requiring an email address for accessing otherwise free services. These sites often allow the user to check a checkbox to be exempted from mass mailings, but if this is optional, then why is entering an email address not optional? If you are asked to create forms like this, please ask yourself and your sponsors why you are collecting private information. If you have a good reason, then explain it on your registration form. If not, then there is no reason to collect more than you need; user privacy should not be an afterthought.

[18] RFC 822 more technically refers to this as an “atom.”
