Chapter 11. Debugging Shell Scripts

According to legend, the first computer bug was a real insect, a moth that caused problems for the inner workings of an early computer. Since that time, problems in computer software have been termed bugs. Debugging is the glorified act of removing errors from your scripts.

Let's face it, scripts aren't always perfect. Even so, almost everything surrounding bugs remains controversial. Whether a particular behavior in a program or script is a bug or a feature can inspire great debate. Many companies consider the term bug itself to be pejorative, so they mandate more innocent-sounding terms. For example, Microsoft uses issue instead of bug. Apparently, using the term bug or defect could imply that their software isn't perfect.

Calling a behavior a bug can hurt people's feelings. For your own scripting efforts, however, consider debugging to simply be the act of making your scripts work the way you'd like and leave it at that.

Scripts can create a lot of havoc on your system. For example, your script may remove files necessary for the system to function properly. Or, worse yet, your scripts might accidentally copy a file on top of a crucial system file. Changing file permissions may open a security hole on your system. For example, a malicious script was discovered as one of the first attacks on Mac OS X systems. Therefore, numerous issues demand your attention.

In most cases, though, you simply need to do the following:

  1. Determine what has gone wrong.

  2. Correct the problem.

Sounds simple, doesn't it? Unfortunately, this is not always the case. However, several techniques can help, including the following:

  • If the shell outputs an error message, you need to decipher the message to determine the real error that caused problems for the shell.

  • Whether or not you see an error message, you can use several general techniques to track down bugs.

  • The shell can help, too. You can run your scripts in a special debugging mode to get a better idea of what is going on and where the problem occurs.

  • You can often avoid bugs in the first place by thoroughly testing your scripts prior to using them in a production environment. Furthermore, following good scripting practices will help avoid bugs.

This chapter covers general debugging techniques, as well as specific ways to track down and correct problems in scripts, whether in your scripts or scripts created by someone else. Most of these techniques are, by their nature, general-purpose techniques. Despite over 50 years of software development, from the earliest computers to modern PCs, the industry still faces problems with bugs. No magic techniques have appeared, despite any claims to the contrary. Although you can follow many practices to help find bugs and avoid them in the first place, you should expect bugs.

One of the first steps you need to take is to decipher any error messages.

Deciphering Error Messages

When the shell outputs an error message, it does so with a reason. This reason isn't always readily apparent, but when the shell outputs an error, there is likely some problem with your script. The error message may not always indicate the true nature of the problem or the real location within your script, but the message indicates that the shell has detected an error.

What you need to do, of course, is the following:

  1. Decipher the error message to figure out what the shell is complaining about.

  2. Track down the error to the real location of the problem in your script.

  3. Fix the error.

All of these are easier said than done, and much of the difficulty results from how the shell processes your script.

The shell processes your scripts sequentially, starting with the first command and working its way down to the end, or an exit statement, which may terminate the script prior to the end of the file. The shell doesn't know in advance what the values of all variables will be, so it cannot determine in advance whether the script will function or not.

For example, consider the following script from Chapter 2:

DIRECTORY=/usr/local
LS=ls
CMD="$LS $DIRECTORY"
$CMD

This script builds a command in a variable and then executes the value of the variable as a command with arguments. Any similar constructs in your scripts make it hard for the shell to determine in advance if the script is correct, at least as far as syntax is concerned.

In this particular example, all variables are set from within the script. This means that prior analysis could work. However, if the script used the read command to read in a value from the user, or if the script used an environment variable, then there is no way the shell can know in advance of running the script whether all variables have values that will make the script operate correctly.

The following sections work through some examples that show different types of errors you are likely to encounter and provide tips for deciphering the messages and tracking down the problems.

Finding Missing Syntax

One of the most common problems when writing shell scripts is following the often-cryptic syntax requirements. If you miss just one little thing, then the shell will fail to run your script.

The next example shows a case where the shell is a bit more forthcoming about the detected problem.

If you choose a text editor that performs syntax highlighting, you can often use the highlight colors to help track down problems. That's because syntax errors will often cause the highlight colors to go awry or appear to be missing something. For example, in this case with the missing ending double quote, the editor will likely show the text message as continuing to the next line. The colors should then look wrong for this type of construct, alerting you to a potential problem. Editors such as jEdit (www.jedit.org) and others described in Chapter 2 perform syntax highlighting.
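As a stand-in for that kind of error, the following sketch writes a small hypothetical script with a missing closing double quote to a temporary file and asks the shell to check it with the -n option described later in this chapter. The file name, the prompt text, and the syntax_ok variable are made up for illustration, and the exact wording of the error message depends on your shell.

```shell
# Write a hypothetical script whose first echo lacks its closing double quote.
script=$(mktemp "${TMPDIR:-/tmp}/quote.XXXXXX")
cat > "$script" <<'EOF'
echo "Please enter the amount of purchase:
read amount
echo "You entered $amount"
EOF

# Check the syntax only (-n); no command in the script is run.
if sh -n "$script"
then
    syntax_ok=yes
else
    syntax_ok=no
fi
echo "Syntax OK? $syntax_ok"
rm -f "$script"
```

The shell typically reports the problem at or near the end of the file rather than on the first line, because the quote left without a partner is the last one in the file.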

Finding Syntax Errors

Another common source of scripting errors lies in simple typos—errors of some kind in the script. For example, forgetting something as simple as a space can create a lot of problems.
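For instance, leaving out the space after the opening bracket of a test turns the bracket and the value into a single word. The sketch below uses a hypothetical script to show that the syntax check passes while the script still fails when run. The variable names are illustrative only.

```shell
# Hypothetical script: the space after [ is missing on the if line.
script=$(mktemp "${TMPDIR:-/tmp}/space.XXXXXX")
cat > "$script" <<'EOF'
rate=2
if [$rate -lt 3 ]
then
    echo "Sales tax rate is too small."
fi
EOF

# The script is syntactically valid, so the -n check finds nothing.
syntax_check=failed
if sh -n "$script"
then
    syntax_check=passed
fi

# Running it makes the shell look for a command named [2.
output=$(sh "$script" 2>&1)
echo "Syntax check: $syntax_check"
echo "Output: $output"
rm -f "$script"
```

The message complains about a missing command, not about a missing space, so you have to work out the real cause yourself.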

The next example shows one of the errors most difficult to track down: a syntax error in a command, which the shell will not detect.
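One illustration of this kind of error, using sed as a stand-in command: the shell syntax below is fine, but the sed expression is missing its final slash. The sample text and the variable names are assumptions made for this sketch.

```shell
# Hypothetical one-line script with an unterminated sed s command.
script=$(mktemp "${TMPDIR:-/tmp}/sed.XXXXXX")
cat > "$script" <<'EOF'
echo "hello world" | sed 's/world/shell'
EOF

# The shell sees a valid pipeline, so the syntax check passes.
syntax_check=failed
if sh -n "$script"
then
    syntax_check=passed
fi

# Only sed notices the problem, and only at run time.
run_status=0
sh "$script" > /dev/null 2>&1 || run_status=$?
echo "Syntax check: $syntax_check, exit status when run: $run_status"
rm -f "$script"
```

The error comes from sed, not from the shell, which is why no amount of shell syntax checking will find it.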

All of these examples are fairly simple. You can find the errors without a lot of work. That won't always be true when you are working with a larger shell script, especially a script that was written a while ago and is no longer fresh in anyone's memory. In addition, if someone else wrote the script, you have an even larger burden to decipher someone else's scripting style. To help solve script problems, you can try the following general-purpose techniques.

Tracking Down Problems with Debugging Techniques

Because computers have been plagued by bugs since the very beginning, techniques for foiling bugs have been around almost as long.

Learn these techniques and you'll become an excellent programmer. These techniques apply to shell scripts and programming in any computer language.

If you can, try to track the errors to the actual commands in the script that appear to be causing the errors. This is easier said than done, of course; otherwise, this book wouldn't need a chapter on the subject of debugging.

In general, you want to isolate the problem area of the script and then, of course, fix the problem. The larger the script, the more you need to isolate the problem area. Fixing the problem isn't always easy either, especially if you can only isolate the area where the bug occurs and not the actual cause of the bug.

Use the following points as a set of guidelines to help track down and solve scripting problems.

Look Backward

Start with the line number that the shell outputs for the error and work backward, toward the beginning of the script file. As shown previously, the line number reported by the shell often fails to locate the problem. That's because the shell cannot always track errors back to their sources. Therefore, you need to start with the reported error line and work backward, trying to find an error.

Typically, the shell detects an error, such as a missing double quote or ending statement for an if, for, while, or other construct. The shell's error message tells you where the shell detected the problem, often at the end of the file. You need to traverse backward toward the beginning of the file, looking for the missing item.

In many cases, the shell will help out by telling you what kind of item is missing, which can narrow the search considerably.

Look for Obvious Mistakes

Look for syntax errors, typos, and other obvious mistakes. These types of errors are usually the easiest to find and fix.

For example, a typo in a variable name will likely not get flagged as an error by the shell but may well be problematic. In most cases, this will mean a variable is accessed (read) but never set to a value. The following script illustrates this problem:

echo -n "Please enter the amount of purchase: "
read amount
echo

echo -n "Please enter the total sales tax: "
read rate
echo

if [ $tax_rate -lt 3 ]
then
    echo "Sales tax rate is too small."
fi

The variable read in is rate, but the variable accessed is tax_rate. Both are valid variable names, but the tax_rate variable is never set.
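To see what the shell actually does with the misspelled name, you can trace a cut-down version of the script with the -x option covered later in this chapter. This sketch hard-codes rate instead of reading it, and it assumes tax_rate is not set in your environment.

```shell
# Cut-down version of the script above, written to a temporary file.
script=$(mktemp "${TMPDIR:-/tmp}/typo.XXXXXX")
cat > "$script" <<'EOF'
rate=2
if [ $tax_rate -lt 3 ]
then
    echo "Sales tax rate is too small."
fi
EOF

# -x prints each command after substitution; the trace goes to stderr.
trace=$(sh -x "$script" 2>&1)
echo "$trace"
rm -f "$script"
```

In the trace, the test appears with nothing between the bracket and -lt, which shows that $tax_rate expanded to an empty string.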

Look for Weird Things

No, this is not another front in the culture wars between conservative pundits and the rest of us. Instead, the goal is to focus your energies on any part of a script that looks strange. It doesn't matter what appears strange or whether the strangeness can be justified. Look for any part of the script that appears weird for any reason.

What you are doing is trying to find likely places for an error. The assumption here is that anything that looks weird is a good candidate for the error.

Of course, weird is another one of those technical terms sporting an imprecise definition. All that can be said is that you'll know it when you see it. Moreover, as your experience with scripts grows, you'll be better able to separate the normal from the strange.

To help determine what is weird and what is not, use the following guidelines:

  • Any use of command substitution, especially when several items are piped to a command, as shown in the previous examples with the bc command.

  • Any here document. These are just weird. Very useful, but weird.

  • Any statement calling a command you do not recognize.

  • Any if statement with a complex test, such as an AND or OR operation combining two or more test conditions.

  • Any use of awk if it looks like the output is not correct.

  • Any use of sed if the script is modifying a number of files and the files are not modified correctly.

  • Any redirection of stderr.

  • Any statement that looks too clever for its own good.

These guidelines may seem a bit strict, but they've proved useful in practice. Again, you're trying to identify areas of the script you should examine more closely, areas with a higher potential for holding the error or errors.

Look for Hidden Assumptions

Every script makes assumptions about the system it runs on, and an assumption that does not hold is a likely source of bugs. For example, not all Unix systems include a compiler for programs written in the C programming language. Sun's Solaris is rather well known for the lack of a general-purpose C compiler with its standard operating system. Scripts that assume all systems contain a C compiler are making an unjustified assumption. Such an assumption may be buried within the script, making it even harder to find.

Windows systems rarely include a C compiler either, but if you have loaded the Cygwin environment for scripting on Windows, you can use the C compiler that is part of Cygwin.

Common assumptions include the following:

  • That a certain command exists on the given system. Or, that the command has a certain name. For example, the C compiler command name has traditionally been cc, but you can find names such as lpicc, or gcc for C compilers. With gcc, however, there is usually a shell script called cc that calls gcc.

  • That a command takes a certain type of option. For example, the ps command options to list all processes may be aux or ef, depending on the system.

  • That a command will actually run. This may sound hilarious, but such a problem may occur on purpose or by silly oversight. For example, Fedora Core 3 Linux ships with a shell script named /usr/bin/java. This script outputs an error message that is a placeholder for the real java command. Unfortunately, however, if you install the latest version of the java command from Sun Microsystems, you'll find the java command under /usr/java. The Sun java package does not overwrite the /usr/bin/java script. Therefore, even if the java command is installed, you may get the wrong version, depending on your command path.

  • That files are located in a certain place on the hard disk. This is especially true of device files. For example, SUSE Linux normally mounts the CD-ROM drive at /cdrom. Older versions of Red Hat Linux mounted the CD-ROM drive at /mnt/cdrom by default. Newer versions of Fedora Core Linux mount the CD-ROM drive at /media/cdrom. And these are all very similar versions of Linux.
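One way to surface the first of these assumptions is to check for a command before relying on it. The sketch below uses the standard command -v built-in. The function name first_available is hypothetical, and cc and gcc are simply the compiler names mentioned above.

```shell
# Print the first command name from the arguments that exists on this system.
first_available() {
    for candidate in "$@"
    do
        if command -v "$candidate" > /dev/null 2>&1
        then
            echo "$candidate"
            return 0
        fi
    done
    echo "Error: none of these commands were found: $*" >&2
    return 1
}

# Look for a C compiler under either of its common names.
if compiler=$(first_available cc gcc)
then
    echo "Using C compiler: $compiler"
else
    echo "Cannot compile on this system."
fi
```

A check like this turns a buried assumption into an explicit test with a clear message when the assumption fails.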

Divide and Conquer

This technique worked for the ancient Romans; why not make it work for you? The divide-and-conquer technique is a last resort when you cannot narrow down the location of a problem. You'll use this most often when debugging long shell scripts written by others.

You start by choosing a location about halfway into the script file. You do not have to be exact. Stop the script at the halfway point with an exit statement. Run the script. Does the error occur? If so, you know the problem lies within the first half of the script. If not, then you know the problem lies within the last half of the script.

Next, you repeat the process in whichever half of the script appears to have the error. In this case, you divide the relevant half of the script in half again (into one-fourth of the entire script). Keep going until you find the line with the error.

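You should not have to divide the script more than 10 times. Each division halves the area left to search, so 10 divisions are enough to narrow a script of about 1,000 lines (2 to the 10th power is 1,024) down to a single line. Any book on computer algorithms covers this under the name binary search.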
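A minimal sketch of the first division, using a hypothetical four-step script in which the last command fails. The step messages and the missing directory are invented for illustration.

```shell
# Hypothetical script with a temporary exit at the halfway point.
script=$(mktemp "${TMPDIR:-/tmp}/divide.XXXXXX")
cat > "$script" <<'EOF'
echo "step 1: first half"
echo "step 2: first half"

echo "DEBUG: stopping at the halfway point"
exit 0    # temporary; remove after debugging

echo "step 3: second half"
ls /hypothetical/missing/directory    # the failing command in this sketch
EOF

output=$(sh "$script" 2>&1)
echo "$output"
rm -f "$script"
```

Because no error appears before the exit statement, the problem must lie in the second half, so the next exit statement goes halfway through that half.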

This technique is simple, but it doesn't always work. Some scripts simply aren't appropriate for stopping at some arbitrary location. Nor is this technique appropriate if running part of the script will leave your system in an uncertain state—for example, starting a backup without completing the backup.

In this case, you can try a less aggressive approach. Instead of stopping the script at a dividing point, put in a read statement, as shown in the following example:

echo "Press enter to continue."
read ignored

This example requires the user to press the Enter key. The variable read in, ignored, is, you guessed it, ignored. The point of this snippet of a script is to pause the script. You can extend this technique by accessing the values of variables in the message passed to the echo command. This way, you can track the value of key variables through the script.
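The extension described above might look like the following sketch. The helper name pause_and_show and the prices are hypothetical. The redirection from /dev/null is there only so the sketch runs without waiting for a keyboard.

```shell
# Hypothetical helper: show a variable's name and value, then wait for Enter.
pause_and_show() {
    echo "DEBUG: $1 = $2"
    echo "Press enter to continue."
    read ignored || true    # carry on even when there is no keyboard input
}

total=0
for price in 10 20 30
do
    total=$(expr $total + $price)
    # Remove "< /dev/null" to make the script really wait for the Enter key.
    pause_and_show total "$total" < /dev/null
done
```

Each pass through the loop prints the running total, so you can see the exact iteration where a value first goes wrong.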

Break the Script into Pieces

This technique is analogous to the divide-and-conquer method. See if you can break the script into small pieces. You want the pieces to be small enough that you can test each one independently of the others. By making each piece work, you can then make the entire script work, at least in theory.

In many cases, if you can break each piece of the script into a function, then you can test each function separately. (See Chapter 10 for more information about writing functions.) Otherwise, you need to extract a section of script commands into a separate script file.

In either case, you want to verify that the scripted commands work as expected. To test this, you need to provide the expected input and verify that the script section or function produces the required output. This can be tedious, so you may want to do this only for areas you've identified as likely error areas (using the other techniques in this section, of course).
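As a rough sketch of testing one piece on its own, the following pulls a calculation into a function and checks it against a known answer. The function name sales_tax, the use of whole cents, and the sample values are all assumptions made for this illustration.

```shell
# Hypothetical piece extracted from a larger script: integer sales tax.
sales_tax() {
    # $1 = amount in cents, $2 = tax rate in whole percent
    expr "$1" \* "$2" / 100
}

# Test the piece with known input and the output you expect.
expected=70
actual=$(sales_tax 1000 7)
if [ "$actual" -eq "$expected" ]
then
    echo "sales_tax: ok"
else
    echo "sales_tax: expected $expected, got $actual"
fi
```

Once a piece passes checks like this, you can stop suspecting it and move on to the next one.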

Once you verify that all of the pieces work, or you fix them to work, you can start assembling the pieces. Again, take a step-by-step approach. You want to assemble just two pieces first and then add another, and another, and so on. The idea is to ensure that you always have a working script as you assemble the pieces back together. When finished, you should have a working version of the original script. Your script may now look a lot different from the original, but it should work.

Trace the Execution

This technique is a lot like a Dilbert cartoon with a tagline of "You be the computer now." You need to pretend to be the computer and step through the script. Start at the beginning, examining each statement. Essentially, you pretend you are the shell executing the script. For each statement, determine what the statement does and how that affects the script. You need to work through the script step by step.

Look over all the commands, especially each if statement, for loop, case statement, and so on. What you want to do is see how the script will really function. Often, this will show you where the error or errors are located. For example, you'll see a statement that calls the wrong command, or an if statement with a reversed condition, or something similar. This process is tedious, but usually, with time, you can track down problems.

While you step through the code, statement by statement, keep in mind the other techniques, especially the ones about looking for hidden assumptions and detecting weird things. If any statement or group of statements stands out, perform more investigation. For example, look up the online documentation for the commands. You can see if the script is calling the commands properly.

Another good technique is to replace commands in the script with echo. That way, you see the command that would be executed but avoid the problematic commands.
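A short sketch of the echo technique, assuming a hypothetical cleanup function that would normally delete backup files. The directory and file names are made up.

```shell
backup_dir=/tmp/hypothetical_backups    # made-up directory for this sketch

cleanup_backups() {
    for filename in old1.bak old2.bak
    do
        # While debugging, echo shows the rm command instead of running it.
        echo rm "$backup_dir/$filename"
    done
}

cleanup_backups
```

When the printed commands look right, remove the echo to run them for real.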

Get Another Set of Eyes

Following this advice literally may help you defeat biometric security, but actually you want more than the eyes. Ask another person to look at the script.

Start by describing the problem and then explain how you narrowed down the search to the area in which you think the error occurs. Then ask the person to look at the script. The goal of this technique is obvious: Often, another person can see what you have overlooked. This is especially true if you have been working at the problem for a while.

Don't feel embarrassed doing this, as top-level software developers use this technique all the time.

All of these techniques are manual, however. You must perform all of the work yourself. The shell does offer some help for debugging your scripts, primitive though it is, through the use of special command-line options.

Running Scripts in Debugging Mode

What's missing from all these attempts to track down bugs is a good debugger. A debugger is a tool that runs a program or script and enables you to examine the internals of the program as it runs. In most debuggers, you can run a script and stop it at a certain point, called a breakpoint. You can also examine the value of variables at any given point and watch for when a variable changes values.

Most other programming languages support several debuggers. Shells don't, however. With shell scripting, you're stuck with the next best thing: the ability to ask the shell to output more information.

The following sections describe the three main command-line options that help with debugging: -n, -v, and -x.

Disabling the Shell

The -n option, short for noexec (as in no execution), tells the shell not to run the commands. Instead, the shell just checks for syntax errors. This is the same syntax check the shell normally performs; the option does not add any extra checks. Because the shell does not execute your commands, the -n option gives you a safe way to test your scripts for syntax errors.

The following example shows how to use the -n option.
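As a sketch of the -n option at work, the following checks a hypothetical script that is missing its closing fi. The script also contains a touch command so you can confirm that nothing was run. The file names and the result and ran variables are invented for this illustration.

```shell
dir=$(mktemp -d "${TMPDIR:-/tmp}/noexec.XXXXXX")
cat > "$dir/check.sh" <<'EOF'
touch "$1"          # would leave a marker file if the script really ran
if [ -e "$1" ]
then
    echo "marker created"
# the closing fi is missing
EOF

# With -n, the shell reads the script and reports syntax errors only.
if sh -n "$dir/check.sh" "$dir/marker"
then
    result="no syntax errors"
else
    result="syntax error reported"
fi

# The marker file tells you whether any command was executed.
ran=no
[ -e "$dir/marker" ] && ran=yes
echo "$result; commands run: $ran"
rm -rf "$dir"
```

The shell complains about the unfinished if statement, and the marker file is never created.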

Displaying the Script Commands

The -v option tells the shell to run in verbose mode. In practice, this means that the shell echoes each line of the script as it reads the line, before executing the commands on it. This is very useful in that it can often help you find errors.
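A small sketch of verbose mode, using a hypothetical two-line script. The greeting text and variable names are made up. The echoed script lines go to stderr, so the sketch merges stderr into stdout to capture both.

```shell
script=$(mktemp "${TMPDIR:-/tmp}/verbose.XXXXXX")
cat > "$script" <<'EOF'
greeting="hello"
echo "$greeting, world"
EOF

# -v echoes the script text as the shell reads it; the script still runs.
verbose=$(sh -v "$script" 2>&1)
echo "$verbose"
rm -f "$script"
```

The captured text contains both the unexpanded script line, with $greeting still in it, and the output the script produced.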

Combining the -n and -v Options

You can combine the shell command-line options. Of these, the -n and -v options make a good combination because you can check the syntax of a script while seeing the script's commands as the shell reads them.

The following example shows this combination.
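A sketch of the combination, run against a hypothetical script with a for loop. The names in the loop are invented.

```shell
script=$(mktemp "${TMPDIR:-/tmp}/combined.XXXXXX")
cat > "$script" <<'EOF'
for name in alpha beta
do
    echo "processing $name"
done
EOF

# -n suppresses execution; -v still echoes each line as it is read.
listing=$(sh -nv "$script" 2>&1)
echo "$listing"
rm -f "$script"
```

You see the lines of the script, but never the text "processing alpha", because the loop is checked and not run.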

Tracing Script Execution

The -x option, short for xtrace or execution trace, tells the shell to echo each command after performing the substitution steps. Thus, you'll see the value of variables and commands. Often, this alone will help diagnose a problem.

In most cases, the -x option provides the most useful information about a script, but it can lead to a lot of output. The following examples show this option in action.
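As one illustration, the following sketch traces the script from Chapter 2 shown earlier in this chapter. It assumes that /usr/local exists on your system. The listing from ls is discarded so that only the trace remains.

```shell
script=$(mktemp "${TMPDIR:-/tmp}/trace.XXXXXX")
cat > "$script" <<'EOF'
DIRECTORY=/usr/local
LS=ls
CMD="$LS $DIRECTORY"
$CMD
EOF

# 2>&1 captures the trace (stderr); > /dev/null then discards the ls listing.
trace=$(sh -x "$script" 2>&1 > /dev/null)
echo "$trace"
rm -f "$script"
```

The last line of the trace shows the command the shell actually ran, ls /usr/local, rather than the $CMD that appears in the script.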

The preceding example shows a relatively straightforward script. The following examples show slightly more complicated scripts.

In the next example, you can see how the -x option tells the shell to output information about each iteration in a for loop. This is very useful if the loop itself contains a problem. The -x option enables you to better see how the script looks from the shell's point of view.
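A sketch of tracing a loop, using a hypothetical script that adds up three prices with expr. The values and variable names are made up, and the exact layout of the trace varies between shells.

```shell
script=$(mktemp "${TMPDIR:-/tmp}/loop.XXXXXX")
cat > "$script" <<'EOF'
total=0
for price in 10 20 30
do
    total=$(expr $total + $price)
done
echo "total: $total"
EOF

# The trace shows the expr command from every pass through the loop.
trace=$(sh -x "$script" 2>&1)
echo "$trace"
rm -f "$script"
```

Each iteration appears with its real values filled in, such as expr 10 + 20 on the second pass, so a wrong value stands out at the point where it first appears.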

Avoiding Errors with Good Scripting

After all this work, you can see that tracking down errors can be difficult and time-consuming. Most script writers want to avoid this. While there is no magical way to never experience errors, you can follow a few best practices that will help you avoid problems.

The basic idea is to write scripts so that errors are unlikely to occur; and if they do occur, the errors are easier to find. The following sections provide some tips to help you reduce the chance of errors.

Tidy Up Your Scripts

Because many script errors are caused by typos, you can format your script to make the syntax clearer. The following guidelines will not only make it easier to understand your scripts, but also help you see if they contain syntax errors:

  • Don't jam all the commands together. You can place blank lines between sections of your script.

  • Indent all blocks inside if statements, for loops, and so on. This makes the script clearer, as shown in the following example:

    if [ $rate -lt 3 ]
    then
        echo "Sales tax rate is too small."
    fi

    Note how the echo statement is indented.

  • Use descriptive variable names. For example, use rate or, better yet, tax_rate instead of r or, worse, r2.

  • Store file and directory names in variables. Set the variables once and then access the values of the variables in the rest of your script, as shown in the following example:

    CONFIG_DIR="$HOME/config"
    if [ -e "$CONFIG_DIR" ]
    then
        # Do something....
        :    # no-op placeholder; a block holding only a comment is a syntax error
    fi

This way, if the value of the directory ever changes, you have only one place in your script to change. Furthermore, your script is now less susceptible to typos. If you repeatedly type a long directory name, you may make a mistake. If you type the name just once, you are less likely to make a mistake.

Comment Your Scripts

The shell supports comments for a reason. Every script you write should have at least one line of comments explaining what the script is supposed to do.

In addition, you should comment all the command-line options and arguments, if the script supports any. For each option or argument, explain the valid forms and how the script will use the data.

These comments don't have to be long. Overly verbose comments aren't much help. However, don't use this as an excuse to avoid commenting altogether. The comments serve to help you, and others, figure out what the script does. Right now, your scripts are probably fresh in your memory, but six months from now you'll be glad you commented them.

Any part of your script that appears odd, could create an error, or contains some tricky commands merits extra comments. Your goal in these places should be to explain the rationale for the complex section of commands.

Create Informative Error Messages

If the cryptic error messages from the shell impede your ability to debug scripts, then you shouldn't contribute to the problem. Instead, fight the Man and be a part of the solution. Create useful, helpful error messages in your scripts.

One of the most interesting, and perhaps confusing, error messages from a commercial application was "Pre-Newtonian degeneracy discovered." The error was a math error, not a commentary about the moral values of old England.

Error messages should clearly state the problem discovered, in terms the user is likely to understand, along with any corrective actions the user can take.

You may find that the error messages are longer than the rest of your script. That's okay. Error messages are really a part of your script's user interface. A user-friendly interface often requires a lot of commands.
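A brief sketch of what such a message might look like. The function name check_config_dir and the directory it checks are hypothetical. The message states the problem, names the item involved, and suggests a fix.

```shell
# Hypothetical check with an error message that tells the user what to do.
check_config_dir() {
    if [ ! -d "$1" ]
    then
        echo "Error: configuration directory '$1' does not exist." >&2
        echo "Create it with: mkdir -p '$1'" >&2
        return 1
    fi
    return 0
}

if check_config_dir "$HOME/hypothetical_config"
then
    echo "Configuration directory found."
else
    echo "Stopping because the configuration is incomplete."
fi
```

Sending the messages to stderr keeps them separate from the normal output of the script.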

Simplify Yourself Out of the Box

Clever commands in your scripts show how clever you are, right? Not always, especially if one of your scripts doesn't work right. When faced with a script that's too clever for itself, you can focus on simplifying the script. Start with the most complicated areas, which are also likely to be the areas that aren't working. Then try to make simpler commands, if statements, for loops, case statements, and so on.

Often, you can extract script commands from one section into a function or two to further clarify the situation. The idea is to end up with a script that is easier to maintain over the long run. Experience has shown that simpler scripts are far easier to maintain than overly complicated ones.

Test, Test, and Test Again

Test your scripts. Test your scripts. Yep, test your scripts. If you don't test your scripts, then they will become examples for others of problematic debugging.

In many cases, especially for larger scripts, you may need to follow the techniques described in the section on breaking scripts into pieces. This concept works well for the scripts you write as well. If you can build your scripts from small, tested pieces, then the resulting whole is more likely to work (and more likely to be testable).

The only way you can determine whether your scripts work is to try them out.

Summary

Scripts can experience problems. Usually, it isn't the script suffering from a bad hair day. Instead, there is usually some sort of problem in the script or a faulty assumption that caused the problem. When a problem occurs, the shell should output some sort of error message. When this happens, you need to remember the following:

  • One of the first things you have to do is decipher the error messages from the shell, if there are any.

  • Error messages may not always refer to the right location. Sometimes you have to look around in the script to find the error.

  • The script may contain more than one error.

  • The shell -v command-line option runs a script in verbose mode.

  • The shell -n command-line option runs a script in no-execute mode. The shell will not run any of the commands. Instead, it will just check the syntax of your script.

  • The shell -x command-line option runs the shell in an execution trace mode. The shell will print out each command, after performing substitutions such as variable and command substitution, prior to executing the command.

  • Always test your scripts prior to using them in a production environment.

This chapter ends the part of the book that covers the beginning steps of shell scripting. The next chapter begins by showing you how to use scripts—in this case, how to use scripts to graph system performance data, along with any other data you desire. With the next chapter, you'll use the techniques introduced so far in real-world situations.

Exercises

  1. What is wrong with the following script? What is the script supposed to do? At least, what does it look like it is supposed to do? Write a corrected version of the script.

    # Assumes $1, first command-line argument,
    # names directory to list.
    
    directory=$1
    
    if [ -e $directory ]
    then
        directroy="/usr/local"
    fi
    
    cd $directroy
    for filename in *
    do
        echo -n $filename
    
        if [ -d $filename ]
        then
            echo "/"
        elif [ ! -x $filename ]
        then
            echo "*"
        else
            echo
        fi
    done

  2. What is wrong with this script? What is the script supposed to do? At least, what does it look like it is supposed to do? Write a corrected script.

    #!/bin/sh
    
    # Using bc for math,
    # calculates sales tax.
    
    echo -n Please enter the amount of purchase: "
    read amount
    echo
    
    echo -n "Please enter the total sales tax rate: "
    read rate
    echo
    
    result=$( echo "
    scale=2; tax=$amount*$rate/100.00;total=$amount+tax;print total" | bc )
    
    if [ $( expr "$result > 200" ) ]
    then
        echo You could qualify for a special free shipping rate.
        echo -n Do you want to? "(yes or no) "
        read shipping_response
        if [ $shipping_response -eq "yes" ]
        then
            echo "Free shipping selected.
        fi
    fi
    
    echo "The total with sales tax = $ $result."
    echo "Thank you for shopping with the Bourne Shell."