Chapter 15. Secure CGI/API Programming

Web servers are fine programs, but innovative applications delivered over the World Wide Web require that servers be extended with custom-built programs. Unfortunately, these programs can have flaws that allow attackers to compromise your system.

The Common Gateway Interface (CGI) was the first and remains the most popular means of extending web servers. CGI programs run as subtasks of the web server; arguments are supplied in environment variables and to the program’s standard input; results are returned on the program’s standard output. CGI programs have been written that perform database queries and display the results; that allow people to perform complex financial calculations; and that allow web users to “chat” with others on the Internet. Indeed, practically every innovative use of the World Wide Web, from WWW search engines to web pages that let you track the status of overnight packages, was originally written using the CGI interface.

A new way to extend web servers is by using proprietary Application Programmer Interfaces (APIs). APIs are a faster way to interface custom programs to web servers because they do not require that a new process be started for each web interaction. Instead, the web server process itself runs application code within its own address space that is invoked through a documented interface.

This chapter focuses on programming techniques that you can use to make CGI and API programs more secure.

The Danger of Extensibility

Largely as a result of their power, the CGI and API interfaces can completely compromise the security of your web server and the host on which it is running. That’s because any program can be run through these interfaces. This can include programs that have security problems, programs that give outsiders access to your computer, and even programs that change or erase critical files from your system.

Two techniques may be used to limit the damage that can be performed by CGI and API programs:

  • The programs themselves should be designed and inspected to ensure that they can perform only the desired functions.

  • The programs should be run in a restricted environment. If these programs can be subverted by an attacker to do something unexpected, the damage that they can do will be limited.

On operating systems that allow for multiple users running at multiple authorization levels, web servers are normally run under a restricted account, usually the nobody or httpd user. Programs that are spawned from the web server, either through CGI or API interfaces, are then run as the same restricted user.

Unfortunately, other operating systems do not have this same notion of restricted users. On Windows 3.1, Windows 95, and the Macintosh operating systems, there is no easy way to have the operating system restrict the reach of a CGI program.

Programs That Should Not Be CGIs

Interpreters, shells, scripting engines, and other extensible programs should never appear in a cgi-bin directory, nor should they be located elsewhere on a computer where they might be invoked by a request to the web server process. Programs that are installed in this way allow attackers to run any program they wish on your computer.

For example, on Windows-based systems the Perl executable PERL.EXE should never appear in the cgi-bin directory. Unfortunately, many Windows-based web servers have been configured this way because doing so makes it easier to set up Perl scripts on these servers.

It is easy to probe a computer to see if it has been improperly configured. To make matters worse, web search engines can be used to find vulnerable machines automatically.
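A local check for this misconfiguration can be sketched in a few lines of Perl. This is only an illustration: the helper name find_interpreters and the list of interpreter names are assumptions made here, and the list is certainly not complete.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Names treated as interpreters or shells. This list is an assumption
# made for illustration and is not exhaustive.
my %suspect = map { $_ => 1 }
    qw(perl sh bash csh ksh tcsh zsh python tclsh wish);

# find_interpreters is a hypothetical helper: it returns the names of
# files in $dir that match the list above, ignoring case and a
# trailing ".exe".
sub find_interpreters {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "cannot open $dir: $!";
    my @hits;
    foreach my $name (readdir $dh) {
        (my $base = lc $name) =~ s/\.exe\z//;
        push @hits, $name if $suspect{$base};
    }
    closedir($dh);
    @hits = sort @hits;
    return @hits;
}

# Usage: pass the cgi-bin directory to check as the first argument.
if (@ARGV) {
    print "$_ should not be in $ARGV[0]\n" for find_interpreters($ARGV[0]);
}
```

A name-based check like this catches only the obvious cases; a renamed interpreter would slip past it, so it complements a manual review of the directory rather than replacing one.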

Another serious source of concern is CGI scripts that are distributed with web servers and later found to have security flaws. Because webmasters rarely delete files from a cgi-bin directory, these dangerous CGI scripts may persist for many months or even years, even if new versions of the web server are installed that do not contain the bug. One example is the script named phf, which was distributed with the NCSA web server and many versions of the Apache web server. This script can be used to retrieve files from the computer on which it is running. This is an example of an unintended side effect, explained in the next section.

CGIs with Unintended Side Effects

To understand the potential problems with CGI programming, consider the script in Example 15-1.[85]

Example 15-1. A CGI Script with a Problem

#!/usr/local/bin/perl
#
# bad_finger
#
sub CGI_GET { return ($ENV{'REQUEST_METHOD'} eq "GET");}
sub CGI_POST{ return ($ENV{'REQUEST_METHOD'} eq "POST");}

sub ReadForm {
  local (*in) = @_ if @_;
  local ($i, $key, $val, $input);

  # Read in text
  $input = $ENV{'QUERY_STRING'} if (&CGI_GET);
  read(STDIN,$input,$ENV{'CONTENT_LENGTH'}) if (&CGI_POST);

  @in = split(/[&;]/,$input);

  foreach $i (0 .. $#in) {
    $in[$i] =~ s/\+/ /g;               # plus to space
    ($key, $val) = split(/=/,$in[$i],2);# get key and value

    # Convert %XX from hex numbers to alphanumeric
    $key =~ s/%(..)/pack("c",hex($1))/ge;
    $val =~ s/%(..)/pack("c",hex($1))/ge;

    # Add to array
    $in{$key} .= "\0" if (defined($in{$key})); # \0 is the mult. separator
    $in{$key} .= $val;

  }
  return length($in);
}
###########################################################################
#
# The real action (and the security problems) follow

print "Content-type: text/html\n\n<html>";

if(&ReadForm(*input)){
    print "<pre>\n";
    print `/usr/bin/finger $input{'command'}`;
    print "</pre>\n";
}

print <<XX;
<hr>
<form method="post" action="bad_finger">
Finger command: <input type="text" size="40" name="command">
</form>
XX

The first half of this script defines three Perl functions, CGI_GET, CGI_POST, and ReadForm, which will be used throughout this chapter for CGI form handling. There are no problems with these functions: all they do is take the input from a CGI GET or POST operation and store it in an associative array provided by the programmer.
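To make the decoding step concrete, here is a minimal standalone sketch of the same splitting and unescaping. The helper decode_pairs and the sample form data are hypothetical, written for this illustration. Unlike ReadForm, the helper takes the raw string as an argument instead of reading the CGI environment, and it keeps only the last value for a repeated key.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# decode_pairs is a hypothetical helper that mirrors the decoding steps
# of ReadForm on a string passed in by the caller.
sub decode_pairs {
    my ($raw) = @_;
    my %form;
    foreach my $pair (split /[&;]/, $raw) {
        next unless length $pair;                 # skip empty fields
        $pair =~ tr/+/ /;                         # plus to space
        my ($key, $val) = split /=/, $pair, 2;    # get key and value
        $val = '' unless defined $val;
        # Convert %XX escapes back to the characters they encode
        s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge for ($key, $val);
        $form{$key} = $val;
    }
    return %form;
}

# Sample data as a browser might submit it (made up for this example).
my %form = decode_pairs('command=alice%40example.com&note=hello+world');
print "$_ = [$form{$_}]\n" for sort keys %form;
```

After decoding, the command field holds alice@example.com and the note field holds hello world. The decoded values are exactly what the remote user typed, with nothing removed or checked.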

The second half of this script defines a finger gateway. When invoked by a normal HTTP GET request, it simply generates the HTML for a CGI form:

Content-type: text/html

<html><hr>
<form method="post" action="bad_finger">
Finger command: <input type="text" size="40" name="command">
</form>

which produces the expected display in a web browser, as shown in Figure 15-1.

Figure 15-1. The finger gateway

Type a typical user, like spaf@cs.purdue.edu, into the field, press Return, and you’ll get the expected result (see Figure 15-2).

Figure 15-2. The form displayed by the finger script

But despite the fact that this script works as expected, it has a serious problem: an attacker can use this script to seriously compromise the security of your computer.

You might have some security problems in the CGI scripts on your server that are similar to this one. Security problems in scripts can remain dormant for years before they are exploited. Sometimes, obscure security holes may even be inserted by the programmer who first wrote the scripts—a sort of “back door” that allows the programmer to gain access in the future, should the programmer’s legitimate means of access be lost.

Can you see the problem? We discuss it in the next section.

The problem with the script

The problem with the script mentioned above is the single line that executes the finger command:

print `/usr/bin/finger $input{'command'}`;

This line executes the program /usr/bin/finger with the input provided and displays the result. The problem lies in the way the finger command is invoked: through Perl’s backquote operator. Backquotes hand their contents to the UNIX shell, and the shell may interpret some of that input in an unwanted manner!

Thus, when we sent the value spaf@cs.purdue.edu to this CGI script, Perl executed the statement:

print `/usr/bin/finger spaf@cs.purdue.edu`;

which passed this command to the UNIX shell:

/usr/bin/finger spaf@cs.purdue.edu

and that then produced the expected result.

The UNIX shell is known and admired for its power and flexibility by programmers and malicious hackers alike. One of the shell’s interesting abilities is putting multiple commands on a single line. For example, if we wanted to run the finger command in the background and, while we are waiting,[86] do an ls command on the current directory, we might execute this command:

/usr/bin/finger spaf@cs.purdue.edu & /bin/ls -l

And indeed, if we type in the name spaf@cs.purdue.edu & /bin/ls -l as our finger request (see Figure 15-3), the bad_finger script will happily execute it, producing the output shown in Figure 15-4.

Figure 15-3. Executing the bad_finger script

Figure 15-4. Output from the bad_finger script

What’s the harm in allowing a user to list the files? By looking at the files, an attacker might learn about other confidential information stored on the web server. Also, the /bin/ls command is simply one of many commands that the attacker might run. The attacker could as easily run commands to delete files or to open up connections to other computers on your network, or even to crash your machine.
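The effect can be reproduced harmlessly on a UNIX-like system by substituting echo for finger. In this minimal sketch, the helper run_with_shell and the sample input are hypothetical; the only point is that the shell, not Perl, decides where one command ends and the next begins.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# run_with_shell is a hypothetical helper that repeats the mistake made
# in bad_finger, using the harmless echo command instead of finger.
sub run_with_shell {
    my ($arg) = @_;
    # Backquotes hand the whole string to /bin/sh, so a ";" in $arg ends
    # the echo command and starts a second command chosen by the user.
    return `echo $arg`;
}

# Stand-in for a form field supplied by a remote user.
print run_with_shell('alice; echo INJECTED');
```

Run as written, this prints alice and INJECTED on separate lines: the second echo was executed as a command of its own, just as /bin/ls was in the bad_finger example.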

Although most operating systems are not fundamentally insecure, few operational computers are administered in such a way that they can withstand an inside attack from a determined attacker. Thus, you want to ensure that attackers never get inside your system. To prevent an attacker from gaining that foothold, you must be sure that your CGI scripts cannot be turned against you.

Fixing the problem

Fixing the problem with the bad_finger script is remarkably easy. All you need to do is not trust the user’s input. Instead of merely sending $input{'command'} to a shell, you should filter the input, extracting the legal characters for the command that you wish to execute.

In the case of finger, there is a very small set of characters that are valid in email addresses or hostnames. The script below selects those characters with a regular expression pattern match:

if(&ReadForm(*input)){
    $input{'command'} =~ m/([\w.+@\-]*)/i;  # Match alphanumerics, _, ., +, @ and -
    print "<pre>\n";
    print `/usr/bin/finger $1`;
    print "</pre>\n";
}

This script works as before, except that it won’t pass on characters such as “&”, “;”, or “`” to the subshell.

Notice that this example matches legal characters, rather than filtering out disallowed ones. This is an important distinction! Many publications recommend filtering out special characters, and then they don’t tell you all of the characters that you need to remove. Indeed, it’s sometimes difficult to know, because the list of characters to remove depends on how you employ the user input as well as which shells and programs are invoked. For example, in some cases you might wish to allow the characters “.” and “/”. In other cases you might not, because you might not want to let the user specify the pathname ../../../../../etc/passwd. That’s why best practice recommends selecting which characters to let through, rather than guessing which characters should be filtered out.[87]
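A stricter variant of the same idea rejects the whole value unless every character is on the permitted list, instead of extracting the leading run of legal characters. The sketch below is one way to write that in Perl. The helper name clean_finger_target, the 255-character limit, and the rule against a leading “-” are assumptions made for this illustration, not requirements of finger.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# clean_finger_target is a hypothetical helper. It returns the value
# unchanged when every character is permitted, and undef otherwise.
sub clean_finger_target {
    my ($raw) = @_;
    return unless defined $raw;
    # \A and \z anchor the match to the entire string. The first
    # character may not be "-", so the value cannot be read as an option.
    # The length limit of 255 is an arbitrary choice for this sketch.
    if ($raw =~ /\A([A-Za-z0-9_.\@][A-Za-z0-9_.\@-]{0,254})\z/) {
        return $1;
    }
    return;
}

# Sample values (made up for this example).
for my $candidate ('alice@example.com', 'alice & /bin/ls -l',
                   '../../etc/passwd', '-l') {
    my $ok = clean_finger_target($candidate);
    printf "%-20s %s\n", $candidate, defined $ok ? 'accepted' : 'rejected';
}
```

Only the first sample is accepted. Rejecting bad input outright, rather than silently trimming it, also gives the script a chance to report the problem to the user or to a log.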

The script can be made more secure (and faster) by calling Perl’s system function with a list of arguments, which runs the finger command directly. This entirely avoids calling the shell:

$| = 1;                                     # Unbuffer STDOUT so finger's output follows the text printed so far
if(&ReadForm(*input)){
    $input{'command'} =~ m/([\w.+@\-]*)/i;  # Match alphanumerics, _, ., +, @ and -
    print "<pre>\n";
    system '/usr/bin/finger', $1;
    print "</pre>\n";
}
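When a script needs to capture the program’s output rather than let it flow straight to the browser, the list form of a piped open gives the same shell-free behavior. The following is a minimal sketch, again using echo as a harmless stand-in for finger; the helper name run_without_shell is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# run_without_shell is a hypothetical helper. It runs a program with the
# given arguments and returns everything the program wrote to stdout.
sub run_without_shell {
    my (@command) = @_;
    # With a program name plus at least one argument, the list form of a
    # piped open executes the program directly. No shell parses the
    # arguments, so metacharacters stay ordinary characters.
    open(my $fh, '-|', @command) or die "cannot run $command[0]: $!";
    local $/;                     # slurp mode: read all output at once
    my $output = <$fh>;
    close($fh);
    return $output;
}

# The same hostile-looking input is now just text handed to echo.
print run_without_shell('echo', 'alice; echo INJECTED');
```

Here the semicolon is passed to echo as part of its single argument, so the string is printed back unchanged and no second command runs. Avoiding the shell does not make validation unnecessary, because the program being run may still interpret its arguments in surprising ways.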

The next section gives many “rules of thumb” that will help you to avoid these kinds of problems in your CGI and API programs.



[85] The CGI_GET, CGI_POST, and ReadForm Perl functions are based on Steven E. Brenner’s cgi-lib.pl. See http://www.bio.cam.ac.uk/web/form.html for more information. The serious Perl programmer may wish instead to use the CGI.pm Perl module, which is available from the CPAN archives.

[86] It may take a long time to pull down Spaf’s finger file because of the interesting quotes he has in it.

[87] Another reason that you should select which characters are matched, rather than choose which characters to filter out, is that different programs called by your script may treat 8-bit and multibyte characters in different ways. You may not filter out the 8-bit or multibyte versions of a special character, but when they reach the underlying system they may be interpreted as single byte, 7-bit characters—much to your dismay.
