2 Error Handling

In Chapter 1, “Security Is a Server Issue and Other Myths,” we discussed the need to integrate security measures into every application. In this chapter, we tackle one of the most basic ways you can secure your application: handling erroneous data.

The Guestbook Application

This chapter also gets us into the sample application we’ll be working on throughout the book. It’s a simple guestbook application, but as you’ll see, there is plenty of room for security holes even in the smallest program. If you haven’t written your application, be sure to read through Chapter 16, “Plan A: Designing a Secure Application from the Beginning.”

Program Summary

The guestbook application will allow visitors to enter comments on the Web site. The comments will be stored in a database, and the ten most recent comments will be displayed on the Web site. Comments will also be e-mailed to a customer service address. The feature list includes the following:

• Allow anonymous comments (Phase I).

• Allow users to enter a name along with the comment, regardless of whether or not they are logged in to an account (Phase I).

• Allow users to create accounts. Once they have created an account, they can view and modify their past comments (Phase II).

• Allow users to upload a small image with their comment (Phase II).

• Allow administrative users to view and delete user accounts and moderate comments (Phase III).

Primary Code Listing

The following is a first shot at the guestbook code. It implements the first requirement—to allow anonymous comments.

<?php
// Be sure we have access to database.php
// for the storeComment() function
require_once('database.php');


// Create user interface
$html = beginHtml();
$html .= '<form name="enter_comment" action="guestbook.php" method="POST">';
$html .= 'Please enter your comment here: ';
$html .= '<textarea rows="20" cols="100" name="comment">&nbsp;</textarea>';
$html .= '<input type="submit" value="Send your comment">';
$html .= '</form>';
$html .= endHtml();
print $html;

// Store comment in the database
// storeComment() is one of the application's
// custom library functions in database.php
storeComment($_POST['comment']);

// HTML functions
function beginHtml() {
      return '<html><head><title>Guestbook</title></head><body> ';
}

function endHtml() {
      return '</body></html>';
}


// Database functions
// coming soon
?>


Users Do the Darnedest Things …

So far things are pretty straightforward—we’ve implemented the Phase I feature list, so we’re done, right? In an ideal world, the answer would be yes. Back here on Earth, we’ve only just begun.

I Wonder What Will Happen If I Do This?

Yes, you will get users who say this in a gleeful tone as they do something completely bizarre to your application. Why would someone bother to put something clearly wrong into your guestbook? Here are a few reasons:

• Honest mistakes such as typing errors

• Boredom

• The challenge of outsmarting you

• Simple curiosity

• Actual malicious intent

Notice that malicious intent is at the bottom of the list. In fact, the first reason users might send bad data to your application isn’t malicious at all. The vast majority of users are perfectly reasonable people who aren’t perfect typists. (Who is?) Even when you start dealing with actual hackers, most of them are little more than bored high school and college kids with too much time on their hands. That doesn’t mean you can dismiss them, though. You may never have to deal with sophisticated cyber-terrorists, but the fact is, regardless of intent, the damage done to your application or your system is the same.

The good news is that the vast majority of hackers aren’t all that sophisticated. This means that you can take a few simple steps and eliminate the majority of the threats. The bad news is that it can take very little effort for a hacker to take down an insecure application. Take our guestbook, for example. We have two input fields: comment and username. If a hacker were to type the following into either of those fields, havoc would ensue:

This is a great guestbook); drop table USERS;

The application would take that input and insert it into a SQL statement that ends up looking like this:

INSERT INTO comments VALUES(This is a great guestbook); drop table USERS;);

Yes, that code does exactly what you think it does—it inserts a comment into the database, then drops the USERS table. Let’s go through the code piece by piece to see how this bit of SQL injection works. We’ll start with the input:

This is a great guestbook); drop table USERS;

This is actually two statements, separated by a semicolon. In SQL, as in PHP, separate instructions aren’t defined by their placement on separate lines on the screen, but by the semicolon character. You already knew that you have to put a semicolon at the end of every PHP statement, but you may not have really thought about why the semicolon is there. It marks the end of a statement. Consider the following two code snippets. First, a few lines from our guestbook program, with standard formatting:

require_once('database.php');

$html = beginHtml();
$html .= '<form name="enter_comment" action="guestbook.php" method="POST">';

Here is the same code, all on one line:

require_once('database.php');$html = beginHtml();$html .= '<form
   name="enter_comment" action="guestbook.php" method="POST">';

Both examples do exactly the same thing. The first is easier for humans to read, but the computer really doesn’t care how you format your code. Hackers use this fact to make their attacks. If it were necessary to place separate instructions on their own lines, injection attacks like this one would be much harder, because single-line HTML form fields don’t retain line breaks. If code breaks were tied to line breaks, code entered through such fields would run together meaninglessly. Unfortunately, all a hacker needs to do in order to send multiple lines of code to your application is to separate them by semicolons. The hacker piggybacks malicious code on legitimate input, as we’ve done here.

Whatever the user enters into the comment area of our input form is inserted into a SQL statement inside our application:

$user_comment = $_POST['comment'];
$sql = "INSERT INTO comments VALUES($user_comment);";

In fact, most programmers would condense this even further:

$sql = "INSERT INTO comments VALUES({$_POST['comment']});";

Simple, right? One line of code, and we’ve created the SQL statement that will store the comment in the database.

Unfortunately, this is one time when simplicity isn’t necessarily a good thing. Let’s see what that one simple line of code looks like when a malicious user attempts a SQL injection attack using our input form:

$sql = "INSERT INTO comments VALUES(This is a great guestbook);
   drop table USERS;);";

What the database server will see is three distinct commands:

INSERT INTO comments VALUES(This is a great guestbook);
   drop table USERS;
);

Yes, the server will probably complain about the syntax of the third line of injected code. The dangling ); is there because you assume that your user’s comment won’t close the SQL statement for you, as we’ve done in our example injection attack. But the database server will happily execute the first two commands before it hits the third and throws a syntax error. By the time the server hits the erroneous command, it’s too late. The Users table has already been dropped and your application—at least the parts that require users to log in—has become a virtual paperweight.
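
To make the splitting concrete, here is a minimal sketch that rebuilds the statement the way the naive code does and then breaks it apart at each semicolon. The helper names buildNaiveInsert() and splitStatements() are ours, invented for illustration, and splitting on semicolons is only a rough approximation of how a database server separates statements.

```php
<?php
// Hypothetical helper: builds the INSERT statement by pasting the raw
// comment straight into the SQL, exactly as the naive code does.
function buildNaiveInsert(string $comment): string
{
    return "INSERT INTO comments VALUES($comment);";
}

// Hypothetical helper: a rough approximation of how the server separates
// statements, splitting at every semicolon and dropping empty pieces.
function splitStatements(string $sql): array
{
    $pieces = array_map('trim', explode(';', $sql));
    return array_values(array_filter($pieces, 'strlen'));
}

$sql = buildNaiveInsert('This is a great guestbook); drop table USERS;');
foreach (splitStatements($sql) as $number => $statement) {
    echo ($number + 1) . ": $statement;\n";
}
```

Run against the injected comment, the loop prints three numbered pieces: the INSERT, the DROP, and the dangling closing parenthesis.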

Now, if a true cyber-terrorist decides to take aim at your application, you’re going to have to call in your own big guns to counteract their activities, but for the majority of cases, you can follow the steps in this book and harden your application to most hackers.

What are the odds someone would take down our guestbook? After all, it’s really not that sophisticated an application! Consider the following scenario:

Mrs. Smith, a longtime customer, has just had a horrible experience with a customer service representative. She decides to visit the Web site to look up contact information so she can write a letter to the company. While she’s on the Web site, Mrs. Smith sees the guestbook application. Rather than waiting a week for her letter to arrive, she decides to post it to the guestbook in the hope that someone will see it and respond to it more quickly than to a traditional letter.

She types up her complaint, then rereads it to be sure she covered all the details. After rereading it, she realizes that she may have been a bit harsh and worries that her letter could cost the customer service representative his or her job. She certainly doesn’t want that, so she deletes the letter and hits the Enter key. Her screen refreshes and displays a cryptic database error. Mrs. Smith panics, thinking she’s just broken the Web site, and calls customer service to see what went wrong.

In this case, all Mrs. Smith has done is to send empty input to the application. Depending on your database server configuration, it could handle empty input seamlessly, or it could throw an error. If the comment field of the database is set to NOT NULL, the database will reject Mrs. Smith’s empty comment.

If your Web server isn’t equipped to handle database errors gracefully, your users will see a raw database error. It won’t make a bit of sense to the benign users, and it will give malicious ones far too much information about your server, making their next attack even easier to execute.
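
One way to keep raw database errors away from the browser is to catch them where the comment is stored. The sketch below is only an illustration: storeOrApologize() is a name we made up, the $store argument stands in for storeComment() (whose body hasn't been written yet), and we are assuming the storage code signals failure by throwing an exception.

```php
<?php
// Hypothetical wrapper: $store stands in for storeComment(). We assume it
// throws an exception when the database rejects the comment.
function storeOrApologize(callable $store, string $comment): string
{
    try {
        $store($comment);
        return 'Thank you for your comment.';
    } catch (Exception $e) {
        // Keep the details for the developers, not for the browser.
        error_log('Guestbook storage failed: ' . $e->getMessage());
        return 'Sorry, we could not save your comment. Please try again.';
    }
}
```

The user sees the same short message no matter what went wrong, while the full error text goes to the server's error log.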

Odds are the customer service representative who answers Mrs. Smith’s call will have no idea what the database error means, so he or she will not be able to give Mrs. Smith a satisfactory answer—compounding her frustration with the company. The customer service rep will escalate the problem to your IT department, where it will slowly filter through two or three other people before it reaches someone who knows how to fix the problem.

How much will this error cost the company, in terms of customer frustration and lost productivity, as the error travels up the problem-solving ladder? Who knows what the actual dollar value of that error might be, but the good news is, this type of error is amazingly easy to prevent. We’ll explain how in the rest of this chapter.

Expecting the Unexpected

The first step in prevention is predicting the problem. Ask yourself, “What is the most bizarre thing a user could do here?” These are your boundary conditions—the outermost boundaries of irrational input. Spend a few minutes brainstorming as you ask yourself, “What could someone possibly do here?” Here’s our list of boundary conditions for the guestbook. Don’t take this as a complete list of every boundary condition that exists. You can never be sure you’ve tested every possible scenario, so be as complete as you can and move on. When you think of something new, add another test.

• Blank input (the boundary we explored in the previous section)

• Control characters

• Non-alphanumeric data (symbols, etc.)

• Excessively long inputs (greater than 256 characters)

• Guestbook spam

• Binary data

• Alternate encoded data (ASCII, Unicode, UTF-8, hexadecimal, octal, etc.)

• SQL injection

• Code injection

• Cross-site scripting

For now, we’re going to concentrate on the first several items on the list; SQL injection, code injection, cross-site scripting, and those types of conditions get their own chapters later in the book.

We’ve already discussed why blank input is a problem, but what about the rest? First of all, it’s highly unlikely that control characters, binary data, or alternate encoded data would be part of a legitimate guestbook comment, so by their very nature those types of inputs are suspect in our application. Second, these types of inputs often carry malicious code to be used in cross-site scripting and injection attacks. The underlying philosophy we use to determine boundary conditions is to reject any input that seems suspicious. This is a fairly strict security philosophy, but it is a lot more reliable (and a lot less hassle) than trying to give input data the benefit of the doubt, or worse, trying to strip out the parts that may be harmful to the system. You’re much better off simply ignoring input that isn’t what you expect, giving users an error message and the chance to try again.
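
As a rough sketch, the first few boundary conditions on the list could be tested like this. The function name findBoundaryProblem() and its return labels are our own invention, and the 256-character limit is simply the figure from the list above.

```php
<?php
// Hypothetical helper: returns a short label for the first boundary
// condition the input trips, or null if none of these checks fire.
function findBoundaryProblem(string $input): ?string
{
    if (trim($input) === '') {
        return 'blank input';
    }
    // Control characters other than tab, line feed, and carriage return,
    // which a multi-line comment field can legitimately contain.
    if (preg_match('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', $input)) {
        return 'control characters';
    }
    // strlen() counts bytes, which is close enough for a first check.
    if (strlen($input) > 256) {
        return 'excessively long input';
    }
    return null;
}
```

A null result only means that these three checks passed. It says nothing about the other items on the list.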

Once you have a pretty solid list of what a user could do to your application, you’re ready to build your preventive measures.

Building an Error-Handling Mechanism

One of the most important things you can do to secure your application is to build a system to handle errors. Why not just handle errors inline, as the situation arises? Two reasons:

• You will miss something. We promise.

• Consistency. If you build an error-handling system, you have to decide only once how to handle errors. Without a system in place, you have to remember how you decided to handle errors each time you encounter an input.

Building an error-handling mechanism or system isn’t really as massive a task as it sounds. In fact, this is one of those beautiful situations where a relatively small amount of effort results in big gains.

Test for Unexpected Input

Now that you’ve thought about some of the bizarre data users could send to your application, you can write code to test for it. Our philosophy is to test all user input and reject anything that doesn’t appear to be legitimate (rather than trying to test for every possible type of malicious or erroneous input). In order to do this, we have to define what we’re expecting user input to look like. In the case of a guestbook comment, we can’t be too specific, but we can define a couple of basic traits:

• The data should be alphanumeric with a few specific punctuation symbols.

• It should be relatively short. A legitimate user won’t type a novel into a guestbook comment field.
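
These two traits can be expressed as a single whitelist test. The following is a minimal sketch, not a finished validator: the function name looksLikeComment(), the particular punctuation we allow, and the 256-character cap are all assumptions you should adjust for your own guestbook.

```php
<?php
// Hypothetical whitelist check. The allowed punctuation and the
// 256-character cap are assumptions, not requirements.
function looksLikeComment(string $input): bool
{
    // Letters, digits, spaces, line breaks, and a little punctuation,
    // between 1 and 256 characters in total.
    $pattern = '/\A[A-Za-z0-9 .,!?\'"\r\n-]{1,256}\z/';
    return preg_match($pattern, $input) === 1;
}
```

Because the pattern describes what good input looks like, the injection string from earlier in the chapter fails simply because it contains a parenthesis and a semicolon.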

At this point, we have to decide if we will allow users to enter HTML code in their comments. On one hand, it is perfectly legitimate for users to put their e-mail address or a link or two into their comments. On the other hand, if we allow HTML, we open up the application to a variety of scripting attacks. We’ll cover both possibilities and leave the final choice up to you.

Stripping HTML from User Input

Deciding not to allow HTML is certainly the safer choice when it comes to user input, but many users will assume that basic HTML is acceptable input and will use it anyway. Unfortunately, so will hackers—and they’ll be trying to do more than use bold for emphasis.

If you’ve decided not to allow HTML, you’ll need to eliminate it from the data sent to your application by the user. Since so many legitimate users will use HTML in their comments, regardless of your restrictions, this is one case where you don’t necessarily want to reject the entire message due to the presence of HTML. Instead, we’ll strip out the HTML tags and then evaluate the data. If it contains suspicious elements in addition to HTML, we can be sure that it isn’t legitimate.

The strip_tags() function in PHP removes HTML tags, leaving only the raw data, as shown in the following example. The user enters the following data:

This is the <em>best</em> guestbook!

Our application stores this string and strips the tags:

$tainted_string = "This is the <em>best</em> guestbook!";
$safe_string = strip_tags($tainted_string);

$safe_string now holds the raw data:

$safe_string = "This is the best guestbook!";

The user data isn’t changed, except for the missing <em></em> tags.
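
Keep in mind that removing the tags does not remove the text between them, which is why the data still has to be evaluated afterward. Here is a small sketch of that two-step approach. The helper hasSuspiciousLeftovers() and its list of characters are hypothetical and chosen only for illustration.

```php
<?php
// strip_tags() removes the tags themselves but keeps the text between them.
$tainted_string = 'Nice site! <script>alert(1)</script>';
$stripped_string = strip_tags($tainted_string);

// Hypothetical follow-up check: after stripping, look for characters that
// a plain comment should not need. The character list is an assumption.
function hasSuspiciousLeftovers(string $text): bool
{
    return preg_match('/[<>;(){}]/', $text) === 1;
}

print $stripped_string . "\n";
print hasSuspiciousLeftovers($stripped_string) ? "reject\n" : "accept\n";
```

Here the script tags are gone, but the leftover alert(1) text still contains parentheses, so the comment is rejected.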

Accepting HTML from Users Safely

You may decide to go ahead and allow HTML from your users. If you expect your users to post their e-mail addresses, links to their Web sites, or other HTML-specific content, you’ll need to provide a way for them to do that without compromising your application or the server it runs on. PHP provides two built-in functions to handle this problem:

htmlentities()

htmlspecialchars()

htmlspecialchars() is the simpler of the two options. It replaces the five characters that have special meaning in HTML with their equivalent character entities:

& (ampersand) becomes &amp;

" (double quote) becomes &quot;

' (single quote) becomes &#039; (only when the ENT_QUOTES flag is set, which is the default as of PHP 8.1)

< (less than, or open tag) becomes &lt;

> (greater than, or close tag) becomes &gt;

To use htmlspecialchars(), simply pass in the string you want to sanitize, as shown here:

$tainted_string = "This is the <em>best</em> guestbook!";
$safe_string = htmlspecialchars($tainted_string);

At this point, $safe_string holds the following:

$safe_string = "This is the &lt;em&gt;best&lt;/em&gt; guestbook!";

If you need to convert every character that has a named HTML entity (accented letters, currency symbols, and so on), instead of just these five, use htmlentities() instead.
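
The following sketch runs both functions over the same string so you can compare the results. The sample string is our own. We pass the flags and character encoding explicitly because the defaults have changed between PHP versions.

```php
<?php
// "\u{e9}" is an accented e, written as an escape so the file encoding
// does not matter.
$tainted_string = "Caf\u{e9} <em>best</em> & \"quoted\"";

// htmlspecialchars() touches only &, ", ', < and >.
$special = htmlspecialchars($tainted_string, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts characters with named entities.
$entities = htmlentities($tainted_string, ENT_QUOTES, 'UTF-8');

print $special . "\n";
print $entities . "\n";
```

The two results differ only in the accented letter: htmlspecialchars() leaves it alone, while htmlentities() turns it into &eacute;.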

Make Life Difficult for Spammers

We’re not sure that anyone has much patience for spammers. Let’s face it: “Spam is bad” is one of the very few truths that just about everyone online can agree with. In fact, distaste for spam and the people who send it out is so universal that ISPs and Web hosts (most of them anyway) hand out swift justice when they catch a spammer, usually canceling his or her account if they even suspect the user is sending out spam. Forget the trial and jury, folks—this is the Internet.

Since it takes time and effort to get a new ISP account set up, most spammers don’t risk getting their own accounts canceled. Instead, they send their spam through insecure Web applications—let’s hope not through yours! That way, if anyone’s account is canceled, it’s yours instead of the spammer’s. At the point where people are sending out spam for a living, they’ve pretty much lost any sense of personal responsibility and don’t really care if they inconvenience anyone else. (If they cared about inconveniencing you or 100,000 of your closest friends, they wouldn’t be sending out spam in the first place.)

So how do you make sure that yours isn’t one of the applications spammers can use? First of all, don’t use the underlying mail transport system in your application unless you absolutely need to. Does your application really need a built-in form to allow your users to send their friends a link to your site? That’s your call. If you decide that e-mail is essential to your application, one of the simplest things you can do to discourage spammers is to prevent users from sending e-mails to more than one person at a time. Spammers work in bulk—they have to send out between 10,000 and 100,000 e-mails to make a single sale. That means that they don’t have time to type in single e-mail addresses. They need to put tens or hundreds of thousands of e-mail addresses at once into a form input. Most mail transport systems accept multiple e-mail addresses separated by a comma or semicolon, so adding a simple regular expression (don’t panic—there’s a tutorial on regular expressions in Chapter 5, “Input Validation”) to check for commas and semicolons in an e-mail address is a good first defense against spammers.

This code snippet takes the “to” field from the $_POST superglobal and checks it for the presence of commas or semicolons. If the data is clean, it is stored in the $to variable. What happens if the data contains either of those characters? We’ll discuss that in the next section.

$tainted_to = $_POST['to'];
// preg_match() returns 0 when the value contains no comma or semicolon
if (!preg_match('/[;,]/', $tainted_to)) {
      $to = $tainted_to;
}

Keep in mind, this won’t prevent a dedicated spammer from using your application; it will just make it more difficult. Luckily, spammers take the easiest route possible, so even this small step will keep your application relatively secure from them.
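
If you want to go one step further, you can also confirm that what remains looks like a single address. This is a sketch under our own assumptions: isSingleRecipient() is a hypothetical helper, and it relies on PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL to judge whether the address is well formed.

```php
<?php
// Hypothetical helper: accepts the value only if it is one well-formed
// address with no list separators in it.
function isSingleRecipient(string $tainted_to): bool
{
    // Same comma and semicolon test as the snippet above.
    if (preg_match('/[;,]/', $tainted_to)) {
        return false;
    }
    // filter_var() returns false when the value is not a valid address.
    return filter_var($tainted_to, FILTER_VALIDATE_EMAIL) !== false;
}
```

A call such as isSingleRecipient($_POST['to']) could then replace the bare regular expression test.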

Decide What to Do with Erroneous Data

The first step in building an error-handling mechanism is deciding what to do when the system encounters an error, such as the boundary conditions on the list in the previous section.

You will probably want to display an error message to your users, with a hint as to what you were expecting them to do, then give them a chance to try again. You may also want to write the error to a log file and, depending on its severity, notify someone on your IT security team, a system admin, or the lead developer on the project.

Depending on the type of error, you could also try to fix it yourself, but this is usually a last-ditch effort and yields mixed results. Sometimes you can guess which part of the data is bad—for instance, a whitespace character in a zip code field is probably a mistake. But what if that zip code has an extra digit? Which digit should you strip off, the first digit or the last one? You can create programmatic rules to strip bad data such as control characters, binary data, or alternate encodings and leave the rest of the input intact, but this method requires that you anticipate what a hacker might do. We prefer to simply reject inputs with any sign of bad data, because we know a lot more about what good data looks like than we do about what a hacker might attempt to send us. Unless you absolutely cannot ask the user to go back and try again, you should avoid attempting to fix the erroneous data yourself.
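
To illustrate the reject-rather-than-repair approach with the zip code example, here is a minimal sketch. The function name isValidZip() is hypothetical, and we are assuming a plain five-digit U.S. zip code.

```php
<?php
// Hypothetical validator for a five-digit U.S. zip code. It never tries to
// repair the value: anything that is not exactly five digits is rejected.
function isValidZip(string $zip): bool
{
    return preg_match('/\A\d{5}\z/', $zip) === 1;
}
```

A value with an extra digit or a stray space simply fails the test, and the user is asked to enter it again. The code never has to guess which character was the mistake.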

For the guestbook application, we will do the following when we encounter erroneous input, as shown in Figure 2.1:

1. Redirect the browser to the input page.

2. Display a formatted error message to the user.

Figure 2.1. A simple treatment for erroneous input.


Later on, we’ll add some advanced features to the system to handle more serious threats such as cross-site scripting and SQL injection. For now, redirecting the browser and displaying an error message are sufficient.

We need to be careful in writing our error messages, too. We want to be as helpful as possible to users who made legitimate mistakes, but we don’t want to give away too much information about the security measures we’ve put in place. In this case, we will simply tell the user, “I’m sorry, I didn’t understand your comment. Please try again.” It’s nonconfrontational, so it shouldn’t annoy most users, but it also doesn’t really say much about why the original data was rejected.

Make the System Mind-Numbingly Easy to Use

Finally, we need to make the error-handling system so easy to use that we won’t be tempted to skip it. The most important thing we can do to achieve this is to encapsulate everything under one little bitty function call. For this application, we’ll achieve this by using the following function:

function error($message) {
      // Take in a plain text error message, format it, and return the formatted
      // error message
      return "<font color='red'>$message</font>";
}

This is a very simple error-handling mechanism, and odds are we’ll need to extend it as the application grows, but for our purposes right now it is sufficient. Here’s how we’ll modify the code to use the error handler:

<?php
// If a comment was submitted, store it in the database, or call the error
// handler if the comment field is blank. This runs before any HTML is
// printed, because a redirect cannot be sent once output has started.
$error_message = "I'm sorry, I didn't understand your comment. ";
$error_message .= "Please try again.";
if ($_SERVER['REQUEST_METHOD'] == 'POST') {
      if (isset($_POST['comment']) && $_POST['comment'] != '') {
            storeComment($_POST['comment']);
      } else {
            error($error_message);
      }
}

// Create user interface
$html = beginHtml();
$html .= '<form name="enter_comment" action="guestbook.php" method="POST">';

// If the err variable is set, we've just come back from the error handler.
// http_redirect() passes it in the query string, so it arrives in $_GET.
// Add the formatted error message to the string of HTML. Printing a request
// value unescaped is itself risky; cross-site scripting is covered later.
if (!empty($_GET['err'])) {
      $html .= $_GET['err'];
      $html .= '<br>';
}
$html .= 'Please enter your comment here: ';
$html .= '<textarea rows="20" cols="100" name="comment">&nbsp;</textarea>';
$html .= '<input type="submit" value="Send your comment">';
$html .= '</form>';
$html .= endHtml();
print $html;

// HTML functions
function beginHtml() {
      return '<html><head><title>Guestbook</title></head><body> ';
}

function endHtml() {
      return '</body></html>';
}

// Database functions
// coming soon

// Error handling functions
function error($message) {
      // Take in the error message, format it, and redirect back to the
      // guestbook page. http_redirect() is provided by the pecl_http extension.
      $formatted_error = "<font color='red'>$message</font>";
      http_redirect('guestbook.php', array('err' => $formatted_error));
}
?>

As you can see, we haven’t really added all that much code. But we have effectively handled one of our boundary conditions, by testing for the condition, then calling our error handler in the event of a problem. The rest of the boundary conditions can be handled in the same way. The error handler function is very simple. All it does is format the error message in a standardized way (so that all error messages across our application look the same), then it refreshes the guestbook application with the formatted error message.
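
If http_redirect() isn't available on your server, the same idea can be sketched with PHP's built-in header() and http_build_query() functions. The helper names buildErrorUrl() and formatError() below are our own, and this variation passes the plain-text message in the URL and formats it only when it is displayed. Treat it as one possible arrangement rather than the definitive one.

```php
<?php
// Hypothetical helper: builds the redirect target with the plain-text
// message URL-encoded into the query string.
function buildErrorUrl(string $page, string $message): string
{
    return $page . '?' . http_build_query(array('err' => $message));
}

// Hypothetical helper: formats the message for display, escaping it first
// so that nothing arriving in the URL is treated as markup.
function formatError(string $message): string
{
    $escaped = htmlspecialchars($message, ENT_QUOTES, 'UTF-8');
    return "<font color='red'>$escaped</font>";
}

// In the error handler, before any output has been sent:
//     header('Location: ' . buildErrorUrl('guestbook.php', $error_message));
//     exit;
```

On the receiving page, the value of the err parameter would be passed through formatError() before being added to the HTML.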

Wrapping It Up

In this chapter, we looked at some of the reasons your application could be hacked, thought about the outer boundaries of what users could enter into our sample guestbook application, and added some code to handle errors. This is only the start, but even if all you do is implement an error-handling system, your application will be quite a bit more secure than it was before.
