Chapter 5. Flow Control

If you are a programmer, you may have read the last chapter—with its claim at the outset that bash has an advanced set of programming capabilities—and wondered where many of the features from conventional languages were. Perhaps the most glaringly obvious “hole” in our coverage thus far concerns flow control constructs like if, for, while, and so on.

Flow control gives a programmer the power to specify that only certain portions of a program run, or that certain portions run repeatedly, according to conditions such as the values of variables, whether or not commands execute properly, and others. We call this the ability to control the flow of a program’s execution.

Almost every shell script or function that’s been shown thus far has had no flow control—they have just been lists of commands to be run! Yet bash, like the C and Bourne shells, has all of the flow control abilities you would expect and more; we will examine them in this chapter. We’ll use them to enhance the solutions to some of the programming tasks we saw in the last chapter and to solve tasks that we will introduce here.

Although we have attempted to explain flow control so that nonprogrammers can understand it, we also sympathize with programmers who dread having to slog through yet another tabula rasa explanation. For this reason, some of our discussions relate bash’s flow-control mechanisms to those that programmers should know already. Therefore you will be in a better position to understand this chapter if you already have a basic knowledge of flow control concepts.

bash supports the following flow control constructs:

if/else

Execute a list of statements if a certain condition is/is not true

for

Execute a list of statements a fixed number of times

while

Execute a list of statements repeatedly while a certain condition holds true

until

Execute a list of statements repeatedly until a certain condition holds true

case

Execute one of several lists of statements depending on the value of a variable

In addition, bash provides a new type of flow-control construct:

select

Allow the user to select one of a list of possibilities from a menu

We will now cover each of these in detail.

if/else

The simplest type of flow control construct is the conditional, embodied in bash’s if statement. You use a conditional when you want to choose whether or not to do something, or to choose among a small number of things to do, according to the truth or falsehood of conditions. Conditions test values of shell variables, characteristics of files, whether or not commands run successfully, and other factors. The shell has a large set of built-in tests that are relevant to the task of shell programming.

The if construct has the following syntax:

if condition
then
    statements
[elif condition
then
    statements...]
[else
    statements]
fi

The simplest form (without the elif and else parts, or clauses) executes the statements only if the condition is true. If you add an else clause, you get the ability to execute one set of statements if a condition is true or another set of statements if the condition is false. You can use as many elif (a contraction of “else if”) clauses as you wish; they introduce more conditions, and thus more choices for which set of statements to execute. If you use one or more elifs, you can think of the else clause as the “if all else fails” part.
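As a minimal sketch of all three clauses working together, consider the following. The function name classify_size and its thresholds are our own inventions for illustration, and the [ ... ] tests with -lt are explained later in this chapter; for now, read each one as “is the argument less than this number?”

```shell
# classify_size is a hypothetical helper used only for illustration.
classify_size ( )
{
    if [ "$1" -lt 10 ]; then
        echo small
    elif [ "$1" -lt 100 ]; then
        echo medium
    else
        echo large            # the "if all else fails" branch
    fi
}

classify_size 5       # prints small
classify_size 50      # prints medium
classify_size 500     # prints large
```

Only one branch runs on each call: the shell tries the conditions from top to bottom and stops at the first one that is true.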

Exit Status

Perhaps the only aspect of this syntax that differs from that of conventional languages like C and Pascal is that the “condition” is really a list of statements rather than the more usual Boolean (true or false) expression. How is the truth or falsehood of the condition determined? It has to do with a general UNIX concept that we haven’t covered yet: the exit status of commands.

Every UNIX command, whether it comes from source code in C, some other language, or a shell script/function, returns an integer code to its calling process—the shell in this case—when it finishes. This is called the exit status. 0 is usually the OK exit status, while anything else (1 to 255) usually denotes an error. [1]

if checks the exit status of the last statement in the list following the if keyword. The list is usually just a single statement. If the status is 0, the condition evaluates to true; if it is anything else, the condition is considered false. The same is true for each condition attached to an elif statement (if any).
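To see that only the last statement in the list matters, here is a small sketch. The function name check_list is hypothetical; true and false are standard commands that do nothing except return a zero and a non-zero exit status, respectively.

```shell
# check_list is a hypothetical name chosen for this sketch.
check_list ( )
{
    # The list after "if" has two statements; only the status of the
    # last one (true) decides which branch runs.
    if false; true
    then
        echo "condition is true"
    else
        echo "condition is false"
    fi
}

check_list      # prints: condition is true
```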

This enables us to write code of the form:

if command ran successfully
then
   normal processing
else
   error processing
fi

More specifically, we can now improve on the pushd function that we saw in the last chapter:

pushd ( )
{
    dirname=$1
    DIR_STACK="$dirname ${DIR_STACK:-$PWD' '}"
    cd ${dirname:?"missing directory name."}
    echo $DIR_STACK
}

This function requires a valid directory as its argument. Let’s look at how it handles error conditions: if no argument is given, the third line of code prints an error message and exits. This is fine.

However, the function reacts deceptively when an argument is given that isn’t a valid directory. In case you didn’t figure it out when reading the last chapter, here is what happens: the cd fails, leaving you in the same directory you were in. This is also appropriate. But the second line of code has pushed the bad directory onto the stack anyway, and the last line prints a message that leads you to believe that the push was successful. Even placing the cd before the stack assignment won’t help because it doesn’t exit the function if there is an error.

We need to prevent the bad directory from being pushed and to print an error message. Here is how we can do this:

pushd ( )
{
  dirname=$1
  if cd ${dirname:?"missing directory name."}    # if cd was successful
  then
      DIR_STACK="$dirname ${DIR_STACK:-$PWD' '}" # push the directory        
      echo $DIR_STACK
  else
      echo still in $PWD.                        # else do nothing
  fi
}

The call to cd is now inside an if construct. If cd is successful, it will return 0; the next two lines of code are run, finishing the pushd operation. But if the cd fails, it returns with exit status 1, and pushd will print a message saying that you haven’t gone anywhere.

Notice that in providing the check for a bad directory, we have slightly altered the way pushd functions. The stack will now always start out with two copies of the first directory pushed onto it. That is because $PWD is expanded after the new directory has been changed to. We’ll fix this in the next section.

You can usually rely on built-in commands and standard UNIX utilities to return appropriate exit statuses, but what about your own shell scripts and functions? For example, what if you wrote a cd function that overrides the built-in command?

Let’s say you have the following code in your .bash_profile.

cd ( )
{
    builtin cd "$@"
    echo "$OLDPWD --> $PWD"
}

The function cd simply changes directories and prints a message saying where you were and where you are now. Because functions have higher priority than most built-in commands in the shell’s order of command look-up, we need to make sure that the built-in cd is called, otherwise the shell will enter an endless loop of calling the function, known as infinite recursion.

The builtin command allows us to do this. builtin tells the shell to use the built-in command and ignore any function of that name. Using builtin is easy; you just give it the name of the built-in you want to execute and any parameters you want to pass. If you pass in the name of something which isn’t a built-in command, builtin will display an appropriate message. For example: builtin: alice: not a shell builtin.

We want this function to return the same exit status that the built-in cd returns. The problem is that the exit status is reset by every command, so it “disappears” if you don’t save it immediately. In this function, the built-in cd’s exit status disappears when the echo statement runs (and sets its own exit status).

Therefore, we need to save the status that cd sets and use it as the entire function’s exit status. Two shell features we haven’t seen yet provide the way. First is the special shell variable ?, whose value ($?) is the exit status of the last command that ran. For example:

cd baddir
echo $?

causes the shell to print 1, while the following command causes it to print 0:

cd gooddir
echo $?

So, to save the exit status we need to assign the value of ? to a variable with the line es=$? right after the cd is done.

Return

The second feature we need is the statement return N, which causes the surrounding function to exit with exit status N. N is actually optional; it defaults to the exit status of the last command. Functions that finish without a return statement (i.e., every one we have seen so far) return whatever the last statement returns. return can only be used inside functions, and shell scripts that have been executed with source. In contrast, the statement exit N exits the entire script, no matter how deeply you are nested in functions.

Getting back to our example: if the call to the built-in cd were last in our cd function, it would behave properly. Unfortunately, we really need the echo statement where it is, after the cd. Therefore we need to save cd’s exit status and return it as the function’s exit status. Here is how to do it:

cd ( )
{
    builtin cd "$@"
    es=$?
    echo "$OLDPWD --> $PWD"
    return $es
}

The second line saves the exit status of cd in the variable es; the fourth returns it as the function’s exit status. We’ll see a substantial cd “wrapper” in Chapter 7.
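The same save-and-return pattern works for any command, not just cd. The following is a sketch under our own naming: save_status is a hypothetical function that runs whatever command it is given, prints a message, and still hands the command's original exit status back to its caller.

```shell
# save_status is a hypothetical wrapper used only for illustration.
save_status ( )
{
    "$@"                 # run the command passed as arguments
    es=$?                # save its exit status immediately
    echo "ran: $*"       # this echo replaces $? with its own status
    return $es           # return the saved status, not echo's
}

if save_status false
then
    echo "reported success"
else
    echo "reported failure"
fi
```

Because the status is saved before echo runs, the if sees the failure of false and prints “reported failure”; without the es=$? and return lines it would see the success of echo instead.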

Exit statuses aren’t very useful for anything other than their intended purpose. In particular, you may be tempted to use them as “return values” of functions, as you would with functions in C or Pascal. That won’t work; you should use variables or command substitution instead to simulate this effect.
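Here is a minimal sketch of the command substitution approach. The function to_upper is hypothetical; it “returns” a string by printing it on standard output, and the caller captures that output with $(...). We assume the standard tr utility is available.

```shell
# to_upper is a hypothetical function: its result is whatever it prints.
to_upper ( )
{
    echo "$1" | tr '[:lower:]' '[:upper:]'
}

# Capture the printed result with command substitution.
result=$(to_upper alice)
echo "result is $result"      # prints: result is ALICE
```

The exit status stays free to do its real job of reporting success or failure, while the data travels through standard output.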

Combinations of Exit Statuses

One of the more obscure parts of bash syntax allows you to combine exit statuses logically, so that you can test more than one thing at a time.

The syntax statement1 && statement2 means, “execute statement1, and if its exit status is 0, execute statement2.” The syntax statement1 || statement2 is the converse: it means, “execute statement1, and if its exit status is not 0, execute statement2.” At first, these look like “if/then” and “if not/then” constructs, respectively. But they are really intended for use within conditions of if constructs—as C programmers will readily understand.

It’s much more useful to think of these constructs as “and” and “or,” respectively. Consider this:

if statement1 && statement2
then
    ...
fi

In this case, statement1 is executed. If it returns a 0 status, then presumably it ran without error. Then statement2 runs. The then clause is executed if statement2 returns a 0 status. Conversely, if statement1 fails (returns a non-zero exit status), then statement2 doesn’t even run; the last statement that actually ran was statement1, which failed—so the then clause doesn’t run, either. Taken all together, it’s fair to conclude that the then clause runs if statement1 and statement2 both succeeded.

Similarly, consider this:

if statement1 || statement2
then
    ...
fi

If statement1 succeeds, then statement2 does not run. This makes statement1 the last statement, which means that the then clause runs. On the other hand, if statement1 fails, then statement2 runs, and whether the then clause runs or not depends on the success of statement2. The upshot is that the then clause runs if statement1 or statement2 succeeds.

bash also allows you to reverse the return status of a statement with the use of !, the logical “not”. Preceding a statement with ! will cause it to return 0 if it fails and 1 if it succeeds. We’ll see an example of this at the end of this chapter.
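In the meantime, a short sketch shows the idea. The function report_failure is a hypothetical helper of ours: it runs the command given as its arguments and prints a message only when that command fails.

```shell
# report_failure is a hypothetical helper used only for illustration.
report_failure ( )
{
    # "! command" succeeds exactly when the command itself fails.
    if ! "$@"
    then
        echo "command failed: $*"
    fi
}

report_failure true       # prints nothing
report_failure false      # prints: command failed: false
```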

As a simple example of testing exit statuses, assume that we need to write a script that checks a file for the presence of two words and just prints a message saying whether either word is in the file or not. We can use grep for this: it returns exit status 0 if it found the given string in its input, non-zero if not:

filename=$1
word1=$2
word2=$3
     
if grep $word1 $filename || grep $word2 $filename
then
    echo "$word1 or $word2 is in $filename."
fi

The then clause of this code runs if either grep statement succeeds. Now assume that we want the script to say whether the input file contains both words. Here’s how to do it:

filename=$1
word1=$2
word2=$3
     
if grep $word1 $filename && grep $word2 $filename
then
    echo "$word1 and $word2 are both in $filename."
fi

We’ll see more examples of these logical operators later in this chapter.
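One side effect of the scripts above is that grep also prints every matching line it finds. If only the exit status is of interest, grep's -q option suppresses that output. The following sketch wraps the test in a function; the name has_both_words is hypothetical, and the variables are quoted so that filenames containing spaces are passed through intact.

```shell
# has_both_words is a hypothetical function name.
# Usage: has_both_words filename word1 word2
has_both_words ( )
{
    if grep -q "$2" "$1" && grep -q "$3" "$1"
    then
        echo "$2 and $3 are both in $1."
    else
        echo "$1 does not contain both $2 and $3."
        return 1
    fi
}
```

A caller can then write if has_both_words notes.txt alice hatter; then ... fi and rely on the function's exit status, just as with any other command.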

Condition Tests

Exit statuses are the only things an if construct can test. But that doesn’t mean you can check only whether commands ran properly. The shell provides two ways of testing a variety of conditions. The first is with the [...] construct, which is available in many different versions of the Bourne shell.[2] The second is by using the newer [[...]] construct.[3] The second version is identical to the first except that word splitting and pathname expansion are not performed on the words within the brackets. For the examples in this chapter we will use the first form of the construct.

You can use the construct to check many different attributes of a file (whether it exists, what type of file it is, what its permissions and ownership are, etc.), compare two files to see which is newer, and do comparisons on strings.

[ condition ] is actually a statement just like any other, except that the only thing it does is return an exit status that tells whether condition is true. (The spaces after the opening bracket “[” and before the closing bracket “]” are required.) Thus it fits within the if construct’s syntax.

String comparisons

The square brackets ([]) surround expressions that include various types of operators. We will start with the string comparison operators, listed in Table 5-1. (Notice that there are no operators for “greater than or equal” or “less than or equal” comparisons.) In the table, str1 and str2 refer to expressions with a string value.

Table 5-1. String comparison operators

Operator          True if...
str1 = str2[4]    str1 matches str2
str1 != str2      str1 does not match str2
str1 < str2       str1 is less than str2
str1 > str2       str1 is greater than str2
-n str1           str1 is not null (has length greater than 0)
-z str1           str1 is null (has length 0)

[4] Note that there is only one equal sign (=). This is a common source of error.
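As a sketch of these operators in use, the hypothetical function compare_strings below reports how its two arguments relate. One detail is worth pointing out: inside [ ... ], the < and > operators must be escaped with a backslash, because otherwise the shell treats them as I/O redirection.

```shell
# compare_strings is a hypothetical helper used only for illustration.
compare_strings ( )
{
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "need two non-null strings"
    elif [ "$1" = "$2" ]; then
        echo "$1 matches $2"
    elif [ "$1" \< "$2" ]; then      # \< so the shell does not see a redirection
        echo "$1 is less than $2"
    else
        echo "$1 is greater than $2"
    fi
}

compare_strings alice dodo       # prints: alice is less than dodo
compare_strings dodo alice       # prints: dodo is greater than alice
compare_strings alice alice      # prints: alice matches alice
```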

We can use one of these operators to improve our popd function, which reacts badly if you try to pop and the stack is empty. Recall that the code for popd is:

popd ( )
{
    DIR_STACK=${DIR_STACK#* }
    cd ${DIR_STACK%% *}
    echo "$PWD"
}

If the stack is empty, then $DIR_STACK is the null string, as is the expression ${DIR_STACK%% *}. This means that you will change to your home directory; instead, we want popd to print an error message and do nothing.

To accomplish this, we need to test for an empty stack, i.e., whether $DIR_STACK is null or not. Here is one way to do it:

popd ( )
{
    if [ -n "$DIR_STACK" ]; then
        DIR_STACK=${DIR_STACK#* }
        cd ${DIR_STACK%% *}
        echo "$PWD"
    else
        echo "stack empty, still in $PWD."
    fi
}

In the condition, we have placed the $DIR_STACK in double quotes, so that when it is expanded it is treated as a single word. If you don’t do this, the shell will expand $DIR_STACK to individual words and the test will complain that it was given too many arguments.

There is another reason for placing $DIR_STACK in double quotes, which will become important later on: sometimes the variable being tested will expand to nothing, and in this example the test will become [ -n ], which returns true. Surrounding the variable in double quotes ensures that even if it expands to nothing, there will be an empty string as an argument (i.e., [ -n "" ]).

Also notice that instead of putting then on a separate line, we put it on the same line as the if after a semicolon, which is the shell’s standard statement separator character.

We could have used operators other than -n. For example, we could have used -z and switched the code in the then and else clauses.
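The quoting pitfall described above is easy to reproduce. In this sketch the variable name empty is our own; the first test omits the quotes and gives the wrong answer, while the second gives the right one.

```shell
empty=""

# Unquoted: $empty vanishes, so the test becomes [ -n ], which is true.
if [ -n $empty ]; then
    echo "unquoted: test claims the string is not null"
fi

# Quoted: the test becomes [ -n "" ], which is false, as intended.
if [ -n "$empty" ]; then
    echo "quoted: string is not null"
else
    echo "quoted: string is null"
fi
```

The first message is printed even though the variable is empty, which is exactly the behavior that the double quotes guard against.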

While we’re cleaning up code we wrote in the last chapter, let’s fix up the error handling in the highest script (Task 4-1). The code for that script was:

filename=${1:?"filename missing."}
howmany=${2:-10}
sort -nr $filename | head -$howmany

Recall that if you omit the first argument (the filename), the shell prints the message highest: 1: filename missing. We can make this better by substituting a more standard “usage” message. While we are at it, we can also make the command more in line with conventional UNIX commands by requiring a dash before the optional argument.

if [ -z "$1" ]; then
    echo 'usage: highest filename [-N]'
else
  filename=$1
  howmany=${2:--10}
  sort -nr $filename | head $howmany
fi

Notice that we have moved the dash in front of $howmany inside the parameter expansion ${2:--10}.

It is considered better programming style to enclose all of the code in the if-then-else, but such code can get confusing if you are writing a long script in which you need to check for errors and bail out at several points along the way. Therefore, a more usual style for shell programming follows.

if [ -z "$1" ]; then
    echo 'usage: highest filename [-N]'
    exit 1
fi
     
filename=$1
howmany=${2:--10}
sort -nr $filename | head $howmany

The exit statement informs any calling program whether it ran successfully or not.

As an example of the = operator, we can add to the graphics utility that we touched on in Task 4-2. Recall that we were given a filename ending in .pcx (the original graphics file), and we needed to construct a filename that was the same but ended in .jpg (the output file). It would be nice to be able to convert several other types of formats to JPEG files so that we could use them on a web page. Some common types we might want to convert besides PCX include XPM (X PixMap), TGA (Targa), TIFF (Tagged Image File Format), and GIF.

We won’t attempt to perform the actual manipulations needed to convert one graphics format to another ourselves. Instead we’ll use some tools that are freely available on the Internet, graphics conversion utilities from the NetPBM archive. [5]

Don’t worry about the details of how these utilities work; all we want to do is create a shell frontend that processes the filenames and calls the correct conversion utilities. At this point it is sufficient to know that each conversion utility takes a filename as an argument and sends the results of the conversion to standard output. To reduce the number of conversion programs necessary to convert between the 30 or so different graphics formats it supports, NetPBM has its own set of internal formats. These are called Portable Anymap files (also called PNMs) with extensions .ppm (Portable Pix Map) for color images, .pgm (Portable Gray Map) for grayscale images, and .pbm (Portable Bit Map) for black and white images. Each graphics format has a utility to convert to and from this “central” PNM format.

The frontend script we are developing should first choose the correct conversion utility based on the filename extension, and then convert the resulting PNM file into a JPEG:

filename=$1
extension=${filename##*.}
pnmfile=${filename%.*}.pnm
outfile=${filename%.*}.jpg

if [ -z "$filename" ]; then
    echo "procfile: No file specified"
    exit 1
fi

if [ $extension = jpg ]; then
    exit 0
elif [ $extension = tga ]; then
    tgatoppm $filename > $pnmfile
elif [ $extension = xpm ]; then
    xpmtoppm $filename > $pnmfile
elif [ $extension = pcx ]; then
    pcxtoppm $filename > $pnmfile
elif [ $extension = tif ]; then
    tifftopnm $filename > $pnmfile
elif [ $extension = gif ]; then
    giftopnm $filename > $pnmfile
else
    echo "procfile: $filename is an unknown graphics file."
    exit 1
fi

pnmtojpeg $pnmfile > $outfile

rm $pnmfile

Recall from the previous chapter that the expression ${filename%.*} deletes the extension from filename; ${filename##*.} deletes the basename and keeps the extension.

Once the correct conversion is chosen, the script runs the utility and writes the output to a temporary file. The second to last line takes the temporary file and converts it to a JPEG. The temporary file is then removed. Notice that if the original file was a JPEG we just exit without having to do any processing.

This script has a few problems. We’ll look at improving it later in this chapter.
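The two pattern-matching expressions the script depends on can be tried on their own. The filenames in this sketch are made up; the second case shows what happens when a name has no extension at all, which is one situation the script above does not anticipate.

```shell
filename=alice.pcx
echo "${filename%.*}"       # prints alice (shortest match of .* removed from the end)
echo "${filename##*.}"      # prints pcx   (longest match of *. removed from the front)

# With no dot in the name, neither pattern matches and the value is unchanged.
filename=README
echo "${filename%.*}"       # prints README
echo "${filename##*.}"      # prints README
```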

File attribute checking

The other kind of operator that can be used in conditional expressions checks a file for certain properties. There are 24 such operators. We will cover those of most general interest here; the rest refer to arcana like sticky bits, sockets, and file descriptors, and thus are of interest only to systems hackers. Refer to Appendix B for the complete list. Table 5-2 lists those that we will examine.

Table 5-2. File attribute operators

Operator           True if...
-a file            file exists
-d file            file exists and is a directory
-e file            file exists; same as -a
-f file            file exists and is a regular file (i.e., not a directory or other special type of file)
-r file            You have read permission on file
-s file            file exists and is not empty
-w file            You have write permission on file
-x file            You have execute permission on file, or directory search permission if it is a directory
-N file            file was modified since it was last read
-O file            You own file
-G file            file’s group ID matches yours (or one of yours, if you are in multiple groups)
file1 -nt file2    file1 is newer than file2 [6]
file1 -ot file2    file1 is older than file2

[6] Specifically, the -nt and -ot operators compare modification times of two files.
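A few of these operators can be exercised safely in a scratch directory. This is only a sketch: the function name describe is hypothetical, and we assume the mktemp utility (with its -d option) is available to create the temporary directory.

```shell
# describe is a hypothetical function used only for illustration.
describe ( )
{
    if [ -d "$1" ]; then
        echo "$1: directory"
    elif [ -f "$1" ] && [ -s "$1" ]; then
        echo "$1: non-empty regular file"
    elif [ -f "$1" ]; then
        echo "$1: empty regular file"
    else
        echo "$1: missing or a special file"
    fi
}

# Build a scratch directory to test against (assumes mktemp -d exists).
scratch=$(mktemp -d)
touch "$scratch/empty"
echo "some text" > "$scratch/full"

describe "$scratch"
describe "$scratch/empty"
describe "$scratch/full"
describe "$scratch/nosuchfile"

rm -r "$scratch"
```

The four calls report a directory, an empty regular file, a non-empty regular file, and a missing file, in that order.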

Before we get to an example, you should know that conditional expressions inside [ and ] can also be combined using the logical operators && and ||, just as we saw with plain shell commands in the earlier section “Combinations of Exit Statuses.” For example:

if [ condition ] && [ condition ]; then

It’s also possible to combine shell commands with conditional expressions using logical operators, like this:

if command && [ condition ]; then
    ...

You can also negate the truth value of a conditional expression by preceding it with an exclamation point (!), so that ! expr evaluates to true only if expr is false. Furthermore, you can make complex logical expressions of conditional operators by grouping them with parentheses (which must be “escaped” with backslashes to prevent the shell from treating them specially), and by using two logical operators we haven’t seen yet: -a (AND) and -o (OR).

The -a and -o operators are similar to the && and || operators used with exit statuses. However, unlike those operators, -a and -o are only available inside a test conditional expression.
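The two styles can be put side by side. In this sketch both function names are hypothetical; each returns exit status 0 only if its argument is a directory that you have permission to search.

```shell
# Both function names are hypothetical, chosen for this comparison.

is_usable_dir_a ( )
{
    [ -d "$1" -a -x "$1" ]          # one test command; -a joins the two checks
}

is_usable_dir_b ( )
{
    [ -d "$1" ] && [ -x "$1" ]      # two test commands; the shell's && joins them
}

if is_usable_dir_a / && is_usable_dir_b /
then
    echo "both versions agree that / is a searchable directory"
fi
```

Since neither function has a return statement, each one returns the status of its last (and only) statement, which is the result of the test.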

Here is how we would use two of the file operators, a logical operator, and a string operator to fix the problem of duplicate stack entries in our pushd function. Instead of having cd determine whether the argument given is a valid directory—i.e., by returning with a bad exit status if it’s not—we can do the checking ourselves. Here is the code:

pushd ( )
{
    dirname=$1
    if [ -n "$dirname" ] && [ \( -d "$dirname" \) -a \
            \( -x "$dirname" \) ]; then
        DIR_STACK="$dirname ${DIR_STACK:-$PWD' '}"
        cd $dirname
        echo "$DIR_STACK"
    else
        echo "still in $PWD."
    fi
}

The conditional expression evaluates to true only if the argument $1 is not null (-n), a directory (-d) and the user has permission to change to it (-x).[7] Notice that this conditional handles the case where the argument is missing ($dirname is null) first; if it is, the rest of the condition is not executed. This is important because, if we had just put:

if [ \( -n "$dirname" \) -a \( -d "$dirname" \) -a \
         \( -x "$dirname" \) ]; then

the second condition, if null, would cause test to complain and the function would exit prematurely.

Here is a more comprehensive example of the use of file operators.

Although the code for this task looks at first sight quite complicated, it is a straightforward application of many of the file operators:

if [ ! -e "$1" ]; then
    echo "file $1 does not exist."
    exit 1
fi
if [ -d "$1" ]; then
    echo -n "$1 is a directory that you may "
    if [ ! -x "$1" ]; then
        echo -n "not "
    fi
    echo "search."
elif [ -f "$1" ]; then
    echo "$1 is a regular file."
else
    echo "$1 is a special type of file."
fi
if [ -O "$1" ]; then
    echo 'you own the file.'
else
    echo 'you do not own the file.'
fi
if [ -r "$1" ]; then
    echo 'you have read permission on the file.'
fi
if [ -w "$1" ]; then
    echo 'you have write permission on the file.'
fi
if [ -x "$1" -a ! -d "$1" ]; then
    echo 'you have execute permission on the file.'
fi

We’ll call this script fileinfo. Here’s how it works:

  • The first conditional tests if the file given as argument does not exist (the exclamation point is the “not” operator; the spaces around it are required). If the file does not exist, the script prints an error message and exits with error status.

  • The second conditional tests if the file is a directory. If so, the first echo prints part of a message; remember that the -n option tells echo not to print a LINEFEED at the end. The inner conditional checks if you do not have search permission on the directory. If you don’t have search permission, the word “not” is added to the partial message. Then, the message is completed with “search.” and a LINEFEED.

  • The elif clause checks if the file is a regular file; if so, it prints a message.

  • The else clause accounts for the various special file types on recent UNIX systems, such as sockets, devices, FIFO files, etc. We assume that the casual user isn’t interested in details of these.

  • The next conditional tests to see if the file is owned by you (i.e., if its owner ID is the same as your login ID). If so, it prints a message saying that you own it.

  • The next two conditionals test for your read and write permission on the file.

  • The last conditional checks if you can execute the file. It checks to see if you have execute permission and that the file is not a directory. (If the file were a directory, execute permission would really mean directory search permission.) In this test we haven’t used any parentheses to group the tests and have relied on operator precedence. Simply put, operator precedence is the order in which the shell processes the operators. This is exactly the same concept as arithmetic precedence in mathematics, where multiplication and division are done before addition and subtraction. In our case, [ -x "$1" -a ! -d "$1" ] is equivalent to [ \( -x "$1" \) -a \( ! -d "$1" \) ]. The file tests are done first, followed by any negations (!), and then by the AND and OR tests.

As an example of fileinfo’s output, assume that you do an ls -l of your current directory and it contains these lines:

-rwxr-xr-x   1 cam      users        2987 Jan 10 20:43 adventure
-rw-r--r--   1 cam      users          30 Jan 10 21:45 alice
-r--r--r--   1 root     root        58379 Jan 11 21:30 core
drwxr-xr-x   2 cam      users        1024 Jan 10 21:41 dodo

alice and core are regular files, dodo is a directory, and adventure is a shell script. Typing fileinfo adventure produces this output:

adventure is a regular file.
you own the file.
you have read permission on the file.
you have write permission on the file.
you have execute permission on the file.

Typing fileinfo alice results in this:

alice is a regular file.
you own the file.
you have read permission on the file.
you have write permission on the file.

Finally, typing fileinfo dodo results in this:

dodo is a directory that you may search.
you own the file.
you have read permission on the file.
you have write permission on the file.

Typing fileinfo core produces this:

core is a regular file.
you do not own the file.
you have read permission on the file.

Integer Conditionals

The shell also provides a set of arithmetic tests. These are different from character string comparisons like < and >, which compare lexicographic values of strings,[8] not numeric values. For example, “6” is greater than “57” lexicographically, just as “p” is greater than “ox,” but of course the opposite is true when they’re compared as integers.

The integer comparison operators are summarized in Table 5-3.

Table 5-3. Arithmetic test operators

Test    Comparison
-lt     Less than
-le     Less than or equal
-eq     Equal
-ge     Greater than or equal
-gt     Greater than
-ne     Not equal

You’ll find these to be of the most use in the context of the integer variables we’ll see in the next chapter. They’re necessary if you want to combine integer tests with other types of tests within the same conditional expression.

However, the shell has a separate syntax for conditional expressions that involve integers only. It’s considerably more efficient, so you should use it in preference to the arithmetic test operators listed above. Again, we’ll cover the shell’s integer conditionals in the next chapter.
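Until then, a short sketch makes the difference between string and integer comparison concrete, using the same pair of values mentioned above. Note that > is escaped inside [ ... ] so that the shell does not treat it as output redirection.

```shell
# Lexicographic comparison: "6" sorts after "57" because '6' follows '5'.
if [ 6 \> 57 ]; then
    echo "as strings, 6 is greater than 57"
fi

# Integer comparison: 6 is numerically smaller than 57.
if [ 6 -lt 57 ]; then
    echo "as integers, 6 is less than 57"
fi
```

Both messages are printed: the two tests look at the same values but answer different questions.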

for

The most obvious enhancement to make to the previous script is the ability to report on multiple files instead of just one. Tests like -e and -d take only single arguments, so we need a way of calling the code once for each file given on the command line.

The way to do this—indeed, the way to do many things with bash—is with a looping construct. The simplest and most widely applicable of the shell’s looping constructs is the for loop. We’ll use for to enhance fileinfo soon.

The for loop allows you to repeat a section of code a fixed number of times. During each time through the code (known as an iteration), a special variable called a loop variable is set to a different value; this way each iteration can do something slightly different.

The for loop is somewhat, but not entirely, similar to its counterparts in conventional languages like C and Pascal. The chief difference is that the shell’s standard for loop doesn’t let you specify a number of times to iterate or a range of values over which to iterate; instead, it only lets you give a fixed list of values. In other words, you can’t do anything like this Pascal-type code, which executes statements 10 times:

for x := 1 to 10 do
begin
    statements...
end

However, the for loop is ideal for working with arguments on the command line and with sets of files (e.g., all files in a given directory). We’ll look at an example of each of these. But first, we’ll show the syntax for the for construct:

for name [in list]
do
    statements that can use $name...
done

The list is a list of names. (If in list is omitted, the list defaults to "$@", i.e., the quoted list of command-line arguments, but we’ll always supply the in list for the sake of clarity.) In our solutions to the following task, we’ll show two simple ways to specify lists.
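For completeness, here is a sketch of the short form with the in list omitted. The function name print_args is hypothetical; inside a function, the loop iterates over the function's own arguments, and an argument containing a space stays in one piece.

```shell
# print_args is a hypothetical function; with no "in list", the loop
# runs once for each of the function's arguments, as if "$@" were given.
print_args ( )
{
    for arg
    do
        echo "argument: $arg"
    done
}

print_args one "two words" three
```

This prints three lines, one per argument, with “two words” kept together on the second line.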

The easiest way to loop over the directories in PATH is by changing the IFS variable we saw in Chapter 4:

IFS=:
     
for dir in $PATH
do
    ls -ld $dir
done

This sets the IFS to be a colon, which is the separator used in PATH. The for loop loops through, setting dir to each of the colon delimited fields in PATH. ls is used to print out the directory name and associated information. The -l parameter specifies the “long” format and the -d tells ls to show only the directory itself and not its contents.
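Changing IFS affects every later word split in the script, so a cautious variation is to put the old value back when the loop is done. The following is a sketch only: list_path_dirs and old_ifs are hypothetical names, the function takes the colon-separated list as its argument rather than reading PATH directly, and it assumes IFS was set to begin with.

```shell
# list_path_dirs is a hypothetical function name.
list_path_dirs ( )
{
    old_ifs=$IFS            # remember the current separators (assumes IFS is set)
    IFS=:
    for dir in $1           # unquoted $1 is split on colons because of the new IFS
    do
        echo "$dir"
    done
    IFS=$old_ifs            # restore the original separators
}

list_path_dirs "/bin:/usr/bin:/usr/local/bin"
```

The call prints the three directories on separate lines; calling list_path_dirs "$PATH" would do the same for your own search path.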

In using this you might see an error generated by ls saying, for example, ls: /usr/TeX/bin: No such file or directory. It indicates that a directory in PATH doesn’t exist. We can modify the listpath script to check the PATH variable for nonexistent directories by adding some of the tests we saw earlier:

IFS=:
     
for dir in $PATH; do
    if [ -z "$dir" ]; then dir=.; fi
     
    if ! [ -e "$dir" ]; then
        echo "$dir doesn't exist"
    elif ! [ -d "$dir" ]; then
        echo "$dir isn't a directory"
    else
        ls -ld $dir
    fi
done

This time, as the script loops, we first check to see if the length of $dir is zero (caused by having a value of :: in the PATH). If it is, we set it to the current directory. We then check whether the directory exists; if it doesn’t, we print an appropriate message. Otherwise, we check whether the file is a directory; if it isn’t, we say so. If both tests pass, we list the directory with ls as before.

The foregoing illustrated a simple use of for, but it’s much more common to use for to iterate through a list of command-line arguments. To show this, we can enhance the fileinfo script above to accept multiple arguments. First, we write a bit of “wrapper” code that does the iteration:

for filename in "$@" ; do
    finfo "$filename"
    echo
done

Next, we make the original script into a function called finfo:[9]

finfo ( )
{
    if [ ! -e "$1" ]; then
        echo "file $1 does not exist."
        return 1
    fi
    ...
}

The complete script consists of the for loop code and the above function, in either order; good programming style dictates that the function definition should go first.

The fileinfo script works as follows: in the for statement, "$@" is a list of all positional parameters. For each argument, the body of the loop is run with filename set to that argument. In other words, the function finfo is called once for each value of $filename as its first argument ($1). The call to echo after the call to finfo merely prints a blank line between sets of information about each file.
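The quotes around $@ matter when an argument contains spaces. The following sketch (show_quoted and show_unquoted are hypothetical names, and the default IFS is assumed) contrasts the quoted list with an unquoted $*:

```shell
# Hypothetical helpers: each prints every loop value inside square brackets.
show_quoted() {
    for filename in "$@"; do    # one iteration per original argument
        echo "[$filename]"
    done
}

show_unquoted() {
    for filename in $*; do      # arguments are split again at whitespace
        echo "[$filename]"
    done
}

show_quoted "my file.txt" notes
show_unquoted "my file.txt" notes
```

The first call prints [my file.txt] and [notes]; the second prints [my], [file.txt], and [notes], which is why the wrapper code uses "$@".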

Given a directory with the same files as the earlier example, typing fileinfo * would produce the following output:

adventure is a regular file.
you own the file.
you have read permission on the file.
you have write permission on the file.
you have execute permission on the file.
     
alice is a regular file.
you own the file.
you have read permission on the file.
you have write permission on the file.
     
core is a regular file.
you do not own the file.
you have read permission on the file.
     
dodo is a directory that you may search.
you own the file.
you have read permission on the file.
you have write permission on the file.

Here is a programming task that exploits the other major use of for.

We’ll probably want output that looks something like this:

.
        adventure
                aaiw
                        dodo
                        duchess
                        hatter
                        march_hare
                        queen
                        tarts
                biog
                ttlg
                        red_queen
                        tweedledee
                        tweedledum
        lewis.carroll

Each column represents a directory level. Entries below and to the right of an entry are files and directories under that directory. Files are just listed with no entries to their right. This example shows that the directory adventure and the file lewis.carroll are in the current directory; the directories aaiw and ttlg, and the file biog are under adventure, etc. To make life simple, we’ll use TABs to line the columns up and ignore any “bleed over” of filenames from one column into an adjacent one.

We need to be able to traverse the directory hierarchy. To do this easily we’ll use a programming technique known as recursion. Recursion is simply referencing something from itself; in our case, calling a piece of code from itself. For example, consider this script, tracedir, in your home directory:

file=$1
echo $file
     
if [ -d "$file" ]; then
    cd $file
    ~/tracedir $(ls)
    cd ..
fi

First we copy and print the first argument. Then we test to see if it is a directory. If it is, we cd to it and call the script again with an argument of the files in that directory. This script is recursive; when the first argument is a directory, a new shell is invoked and a new script is run on the new directory. The old script waits until the new script returns, then the old script executes a cd back up one level and exits. This happens in each invocation of the tracedir script. The recursion will stop only when the first argument isn’t a directory.

Running this on the directory structure listed above with the argument adventure will produce:

adventure
aaiw
dodo

dodo is a file and the script exits.

This script has a few problems, but it is the basis for the solution to this task. One major problem with the script is that it is very inefficient. Each time the script is called, a new shell is created. We can improve on this by making the script into a function, because (as you probably remember from Chapter 4) functions are part of the shell they are started from. We also need a way to set up the TAB spacing. The easiest way is to have an initializing script or function and call the recursive routine from that. Let’s look at this routine.

recls ( )
{
    singletab="\t"
     
    for tryfile in "$@"; do
        echo $tryfile
        if [ -d "$tryfile" ]; then
            thisfile=$tryfile
            recdir $(command ls $tryfile)
        fi
    done
     
    unset tryfile thisfile file singletab tab
}

First, we set up a variable to hold the TAB character for the echo command (Chapter 7 explains all of the options and formatting commands you can use with echo). Then we loop through each argument supplied to the function and print it out. If it is a directory, we call our recursive routine, supplying the list of files with ls. We have introduced a new command at this point: command. command is a shell built-in that disables function and alias look-up. In this case, it is used to make sure that the ls command is one from your command search path, PATH, and not a function (for further information on command see Chapter 7). After it’s all over, we clean up by unsetting the variables we have used.

Now we can expand on our earlier shell script.

recdir ( )
{
    tab=$tab$singletab
     
    for file in "$@"; do
        echo -e $tab$file
        thisfile=$thisfile/$file
     
        if [ -d "$thisfile" ]; then
            recdir $(command ls $thisfile)
        fi
     
        thisfile=${thisfile%/*}
    done
     
    tab=${tab%"$singletab"}
}

Each time it is called, recdir loops through the files it is given as arguments. For each one it prints the filename and then, if the file is a directory, calls itself with arguments set to the contents of the directory. There are two details that have to be taken care of: the number of TABs to use, and the pathname of the “current” directory in the recursion.

Each time we go down a level in the directory hierarchy we want to add a TAB character, so we append a TAB to the variable tab every time we enter recdir. Likewise, when we exit recdir we are moving up a directory level, so we remove the TAB when we leave the function. Initially, tab is not set, so the first time recdir is called, tab will be set to one TAB. If we recurse into a lower directory, recdir will be called again and another TAB will be appended. Remember that tab is a global variable, so it will grow and shrink in TABs for every entry and exit of recdir. The -e option to echo tells it to recognize escaped formatting characters, in our case the TAB character, \t.

In this version of the recursive routine we haven’t used cd to move between directories. That means that an ls of a directory will have to be supplied with a relative path to files further down in the hierarchy. To do this, we need to keep track of the directory we are currently examining. The initialization routine sets the variable thisfile to the directory name each time a directory is found while looping. This variable is then used in the recursive routine to keep the relative pathname of the current file being examined. On each iteration of the loop, thisfile has the current filename appended to it, and at the end of the loop the filename is removed.
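The append-then-strip bookkeeping on thisfile can be tried in isolation. This is only a sketch with made-up values (track_demo is a hypothetical name, and the directory names are taken from the sample listing), not part of recdir itself:

```shell
# Hypothetical demo of how thisfile grows and shrinks inside the loop.
track_demo() {
    thisfile=adventure
    for file in aaiw biog ttlg; do
        thisfile=$thisfile/$file      # descend: append /name
        echo "examining $thisfile"
        thisfile=${thisfile%/*}       # back up: strip the shortest trailing /name
    done
    echo "back at $thisfile"
}

track_demo
```

Each pass prints a path one level below adventure, and after the loop thisfile is adventure again because every append is matched by a strip.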

You might like to think of ways to modify the behavior and improve the output of this code. Here are some programming challenges:

  1. In the current version, there is no way to determine if biog is a file or a directory. An empty directory looks no different to a file in the listing. Change the output so it appends a / to each directory name when it displays it.

  2. Modify the code so that it only recurses down a maximum of eight subdirectories (which is about the maximum before the lines overflow the right-hand side of the screen). Hint: think about how TABs have been implemented.

  3. Change the output so it includes dashed lines and adds a blank line after each directory, thus:

    .
    |
    |-------adventure
    |       |
    |       |-------aaiw
    |       |       |
    |       |       |-------dodo
    |       |       |-------duchess
    |       |       |-------hatter
    |       |       |-------march_hare
    |       |       |-------queen
    |       |       |-------tarts
    |       |
    |       |-------biog
    ...

     Hint: you need at least two other variables that contain the characters "|" and "-".

At the start of this section we pointed out that the for loop in its standard form wasn’t capable of iterating over a specified range of values as can be done in most programming languages. bash 2.0 introduced a new style of for loop that caters for this task: the arithmetic for loop. We’ll come back to it in the next chapter when we look at arithmetic operations.
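As a brief preview, and assuming a version of bash that supports the arithmetic for loop, the Pascal-style counting loop shown earlier might be sketched like this (count_to is a hypothetical name; the details of the syntax are left to the next chapter):

```shell
# Hypothetical helper: runs the loop body once for each integer from 1 to $1.
count_to() {
    for (( x = 1; x <= $1; x++ )); do
        echo "iteration $x"
    done
}

count_to 3
```

The call prints iteration 1, iteration 2, and iteration 3, each on its own line.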

case

The next flow-control construct we will cover is case. While the case statement in Pascal and the similar switch statement in Java and C can be used to test simple values like integers and characters, bash’s case construct lets you test strings against patterns that can contain wildcard characters. Like its conventional-language counterparts, case lets you express a series of if-then-else type statements in a concise way.

The syntax of case is as follows:

case expression in
    pattern1 )
        statements ;;
    pattern2 )
        statements ;;
    ...
esac

Any of the patterns can actually be several patterns separated by pipe characters (|). If expression matches one of the patterns, its corresponding statements are executed. If there are several patterns separated by pipe characters, the expression can match any of them in order for the associated statements to be run. The patterns are checked in order until a match is found; if none is found, nothing happens.
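A minimal sketch of these matching rules follows; the function name classify and its messages are invented for illustration:

```shell
# Hypothetical helper: reports which pattern its argument matches.
classify() {
    case $1 in
        *.jpg | *.jpeg ) echo "JPEG image" ;;          # either pattern may match
        *.gif )          echo "GIF image" ;;
        [0-9]* )         echo "starts with a digit" ;;
    esac                                               # no match: nothing happens
}

classify photo.jpeg
classify 7up.gif
classify 42
classify notes.txt
```

The first call prints JPEG image. The second prints GIF image, even though 7up.gif also begins with a digit, because patterns are tried in order and the first match wins. The third prints starts with a digit, and the last prints nothing at all.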

This construct should become clearer with an example. Let’s revisit our solution to Task 4-2 and the additions to it presented earlier in this chapter (our graphics utility). Remember that we wrote some code that processed input files according to their suffixes (.pcx for PCX format, .gif for GIF format, etc.).

We can improve upon this solution in two ways. Firstly, we can use a for loop to allow multiple files to be processed one at a time; secondly, we can use the case construct to streamline the code:

for filename in "$@"; do
    pnmfile=${filename%.*}.ppm

    case $filename in
        *.jpg ) continue ;;   # already a JPEG: skip to the next file

        *.tga ) tgatoppm $filename > $pnmfile ;;

        *.xpm ) xpmtoppm $filename > $pnmfile ;;

        *.pcx ) pcxtoppm $filename > $pnmfile ;;

        *.tif ) tifftopnm $filename > $pnmfile ;;

        *.gif ) giftopnm $filename > $pnmfile ;;

            * ) echo "procfile: $filename is an unknown graphics file."
                exit 1 ;;
    esac

    outfile=${pnmfile%.ppm}.new.jpg

    pnmtojpeg $pnmfile > $outfile

    rm $pnmfile

done

The case construct in this code does the same thing as the if statements that we saw in the earlier version. It is, however, clearer and easier to follow.

The first six patterns in the case statement match the various file extensions that we wish to process. The last pattern matches anything that hasn’t already been matched by the previous statements. It is essentially a catchall and is analogous to the default case in C.

There is another slight difference to the previous version; we have moved the pattern matching and replacement inside the added for loop that processes all of the command-line arguments. Each time we pass through the loop, we want to create a temporary and final file with a name based on the name in the current command-line argument.

We’ll return to this example in Chapter 6, when we further develop the script and discuss how to handle dash options on the command line. In the meantime, here is a task that requires that we use case.

We can implement this by using a case statement to check the number of arguments and the built-in cd command to do the actual change of directory.

Here is the code:[10]

cd( )
{
    case "$#" in
        0 | 1)  builtin cd $1 ;;
        2    ) newdir=${PWD//$1/$2}
                case "$newdir" in
                    $PWD)   echo "bash: cd: bad substitution" >&2 ;
                        return 1 ;;
                    *   )   builtin cd "$newdir" ;;
                esac ;;
        *    )  echo "bash: cd: wrong arg count" 1>&2 ; return 1 ;;
    esac
}

The case statement in this task tests the number of arguments to our cd command against three alternatives.

For zero or one arguments, we want our cd to work just like the built-in one. The first alternative in the case statement does this. It includes something we haven’t used so far; the pipe symbol between the 0 and 1 means that either pattern is an acceptable match. If the number of arguments is either of these, the built-in cd is executed.

The next alternative is for two arguments, which is where we’ll add the new functionality to cd. The first thing that has to be done is finding and replacing the old string with the new one. We use the pattern matching and replacement that we saw in the last chapter, the result being assigned to newdir. If the substitution didn’t take place, the pathname will be unchanged. We’ll use this fact in the next few lines.

Another case statement chooses between performing the cd or reporting an error because the new directory is unchanged. The * alternative is a catchall for anything other than the current pathname (caught by the first alternative).

You might notice one small problem with this code: if your old and new strings are the same you’ll get bash: cd: bad substitution. It should just leave you in the same directory with no error message, but because the directory path doesn’t change, it uses the first alternative in the inner case statement. The problem lies in knowing whether the pattern substitution actually found the old string. You might like to think about ways to fix this problem (hint: you could use grep to check whether the pathname has the old string in it).

The last alternative in the outer case statement prints an error message if there are more than two arguments.
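For the same-string problem noted above, one possible building block is a test for whether the pathname contains the old string at all. The hint suggests grep; the sketch below swaps in a case pattern instead, so that no external command is needed. The name path_contains and the sample values are assumptions made for illustration:

```shell
# Hypothetical helper: succeeds if string $1 contains string $2.
path_contains() {
    case "$1" in
        *"$2"* ) return 0 ;;    # quoting $2 makes it match literally
        *      ) return 1 ;;
    esac
}

dir=/home/alice/src
old=alice
if path_contains "$dir" "$old"; then
    echo "$old occurs in $dir"
else
    echo "$old does not occur in $dir"
fi
```

This prints alice occurs in /home/alice/src. Inside cd, a test of this kind on $PWD and $1 could decide whether an unchanged pathname really means a bad substitution.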

select

All of the flow-control constructs we have seen so far are also available in the Bourne shell, and the C shell has equivalents with different syntax. Our next construct, select, is available only in the Korn shell and bash;[11] moreover, it has no analogy in conventional programming languages.

select allows you to generate simple menus easily. It has concise syntax, but it does quite a lot of work. The syntax is:

select name [in list]
do
    statements that can use $name...
done

This is the same syntax as for except for the keyword select. And like for, you can omit the in list and it will default to "$@", i.e., the list of quoted command-line arguments. Here is what select does:

  1. Generates a menu of each item in list, formatted with numbers for each choice

  2. Prompts the user for a number

  3. Stores the selected choice in the variable name and the selected number in the built-in variable REPLY

  4. Executes the statements in the body

  5. Repeats the process forever (but see below for how to exit)
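The steps above can be sketched with a small function. The name pick_fruit, the menu items, and the messages are invented for illustration:

```shell
# Hypothetical demo of select: shows what ends up in the loop variable and in REPLY.
pick_fruit() {
    PS3='fruit? '
    select choice in apple banana cherry; do
        if [ -n "$choice" ]; then
            echo "you typed $REPLY and chose $choice"
            break                       # leave the otherwise endless loop
        else
            echo "invalid selection: $REPLY"
        fi
    done
}
```

Running pick_fruit at a prompt displays the numbered menu and waits for input. Typing 2 prints you typed 2 and chose banana and returns; typing a number that is not on the menu prints the invalid selection message and prompts again.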

Here is a task that adds another command to our pushd and popd utilities.

The display and selection of directories is best handled by using select. We can start off with something along the lines of:[12]

selectd ( )
{
    PS3='directory? '
    select selection in $DIR_STACK; do
        if [ $selection ]; then
            #statements that manipulate the stack...
            break
        else
            echo 'invalid selection.'
        fi
    done
}

If you type DIR_STACK="/usr /home /bin" and execute this function, you’ll see:

1) /usr
2) /home
3) /bin
directory?

The built-in shell variable PS3 contains the prompt string that select uses; its default value is the not particularly useful "#?". So the first line of the above code sets it to a more relevant value.

The select statement constructs the menu from the list of choices. If the user enters a valid number (from 1 to the number of directories), then the variable selection is set to the corresponding value; otherwise it is null. (If the user just presses RETURN, the shell prints the menu again.)

The code in the loop body checks if selection is non-null. If so, it executes the statements we will add in a short while; then the break statement exits the select loop. If selection is null, the code prints an error message and repeats the menu and prompt.

The break statement is the usual way of exiting a select loop. Actually (like its analog in Java and C), it can be used to exit any surrounding control structure we’ve seen so far (except case, where the double semicolons act like break) as well as the while and until we will see soon. We haven’t introduced break until now because it is considered bad coding style to use it to exit a loop. However, it can make code easier to read if used judiciously. break is necessary for exiting select when the user makes a valid choice. [13]

Now we’ll add the missing pieces to the code:

selectd ( )
{
    PS3='directory? '
    dirstack=" $DIR_STACK "
     
    select selection in $dirstack; do
        if [ $selection ]; then
            DIR_STACK="$selection${dirstack%% $selection *}"
            DIR_STACK="$DIR_STACK ${dirstack##* $selection }"
            DIR_STACK=${DIR_STACK% }
            cd $selection
            break
        else
            echo 'invalid selection.'
        fi
    done
}

The first two lines initialize variables. dirstack is a copy of DIR_STACK with a space added at the beginning and end so that each directory in the list is of the form space directory space. This form simplifies the code when we come to manipulating the directory stack.

The select and if statements are the same as in our initial function. The new code inside the if uses bash’s pattern-matching capability to manipulate the directory stack.

The first statement sets DIR_STACK to selection, followed by dirstack with everything from selection to the end of the list removed. The second statement appends to DIR_STACK everything in the list that follows selection. The next line removes the trailing space, which comes from the space we added to the end of dirstack. To complete the operation, a cd is performed to the new directory, followed by a break to exit the select code.

As an example of the list manipulation performed in this function, consider a DIR_STACK set to /home /bin /usr2. In this case, dirstack would become " /home /bin /usr2 " (the same list with a space added at each end). Typing selectd would result in:

$ selectd
1) /home
2) /bin
3) /usr2
directory?

After selecting /bin from the list, the first statement inside the if section sets DIR_STACK to /bin followed by dirstack with everything from /bin onwards removed, i.e., /home.

The second statement then takes DIR_STACK and appends everything in dirstack following /bin (i.e., /usr2) to it. The value of DIR_STACK becomes /bin /home /usr2. The trailing space is removed in the next line.
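The same three pattern operations can be tried without a menu or a cd. In this sketch, promote is a hypothetical function that takes the selected directory and the stack as arguments and prints the reordered stack:

```shell
# Hypothetical helper: moves directory $1 to the front of the space-separated stack $2.
promote() {
    local selection="$1"
    local dirstack=" $2 "                               # pad with a space at each end
    local result
    result="$selection${dirstack%% $selection *}"       # selection + entries before it
    result="$result ${dirstack##* $selection }"         # + entries after it
    echo "${result% }"                                  # drop the trailing space
}

promote /bin "/home /bin /usr2"
```

The call prints /bin /home /usr2, matching the walk-through above. The function also handles a selection at either end of the list, where one of the two pattern expansions simply yields an empty string.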

while and until

The remaining two flow control constructs bash provides are while and until. These are similar; they both allow a section of code to be run repetitively while a certain condition holds true (or, for until, until it becomes true). They also resemble analogous constructs in Pascal (while/do and repeat/until) and C (while and do/while).

while and until are actually most useful when combined with features we will see in the next chapter, such as integer arithmetic, input/output of variables, and command-line processing. Yet we can show a useful example even with what we have covered so far.

The syntax for while is:

while condition
do
    statements...
done

For until, just substitute until for while in the above example. As with if, the condition is really a list of statements that are run; the exit status of the last one is used as the value of the condition. You can use a conditional with test here, just as you can with if.

Note that the only difference between while and until is the way the condition is handled. In while, the loop executes as long as the condition is true; in until, it runs as long as the condition is false. The until condition is checked at the top of the loop, not at the bottom as it is in analogous constructs in C and Pascal.

The result is that you can convert any until into a while by simply negating the condition. The only place where until might be more meaningful is something like this:

until command; do
    statements...
done

The meaning of this is essentially, “Do statements until command runs correctly.” This is not a likely contingency.
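Two of the points made above, that until tests its condition before the first iteration and that it is equivalent to a while with the condition negated, can be checked with a small sketch (top_tested is a hypothetical name):

```shell
# Hypothetical demo: neither loop body runs, because the test comes first.
top_tested() {
    until true; do              # condition already true, so the body is skipped
        echo "until body ran"
    done

    while ! true; do            # the same loop written as a negated while
        echo "while body ran"
    done

    echo "neither body ran"
}

top_tested
```

The only output is neither body ran. A loop that tested at the bottom, like repeat/until in Pascal, would have executed its body once before checking.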

Here is an earlier task that can be rewritten using a while.

We can use the while construct and pattern matching to traverse the PATH list:

path=$PATH:
     
while [ $path ]; do
    ls -ld ${path%%:*}
    path=${path#*:}
done

The first line copies PATH to a temporary copy, path, and appends a colon to it. Normally colons are used only between directories in PATH; adding one to the end makes the code simple.

Inside the while loop we display the directory with ls as we did in Task 5-2. path is then updated by removing the first directory pathname and colon (which is why we needed to append the colon in the first line of the script). The while will keep looping until $path expands to nothing (the empty string “”), which occurs once the last directory in path has been listed.
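To watch the list shrink without depending on the real PATH, the same traversal can be run on a sample string. This is a sketch under our own assumptions: walk_path is a hypothetical name, echo stands in for ls, and the expansions are quoted so that a directory name containing a space survives intact:

```shell
# Hypothetical helper: prints each colon-separated entry of $1 on its own line.
walk_path() {
    local path="$1:"                # append a colon, as in the script above
    while [ -n "$path" ]; do
        echo "${path%%:*}"          # first entry: everything before the first colon
        path=${path#*:}             # drop that entry and its colon
    done
}

walk_path "/bin:/usr/bin:/opt/my tools/bin"
```

The call prints /bin, /usr/bin, and /opt/my tools/bin on separate lines, and the loop ends when path has been reduced to the empty string.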

Here is another task that is a good candidate for until.

Here is the code:

until cp $1 $2; do
    echo 'Attempt to copy failed. waiting...'
    sleep 5
done

This is a fairly simple use of until. First, we use the cp command to perform the copy for us. If it can’t perform the copy for any reason, it will return with a non-zero exit code. We set our until loop so that if the result of the copy is not 0 then the script prints a message and waits five seconds.

As we said earlier, an until loop can be converted to a while by the use of the ! operator:

while ! cp $1 $2; do
    echo 'Attempt to copy failed. waiting...'
    sleep 5
done

In our opinion, you’ll seldom need to use until; therefore, we’ll use while throughout the rest of this book. We’ll see further use of the while construct in Chapter 7.



[1] Because this is a convention and not a “law,” there are exceptions. For example, diff (find differences between two files) returns 0 for “no differences,” 1 for “differences found,” or 2 for an error such as an invalid filename argument.

[2] The built-in command test is synonymous with [...]. For example, to test the equivalence of two strings you can either put [ string1 = string2 ] or test string1 = string2.

[3] [[...]] is not available in versions of bash prior to 2.05.

[5] NetPBM is a free, portable graphics conversion utility package. Further details can be found on the NetPBM homepage http://netpbm.sourceforge.net/

[7] Remember that the same permission flag that determines execute permission on a regular file determines search permission on a directory. This is why the -x operator checks both things depending on file type.

[8] “Lexicographic order” is really just “dictionary order.”

[9] A function can have the same name as a script; however, this isn’t good programming practice.

[10] To make the function a little clearer, we’ve used some advanced I/O redirection. I/O redirection is covered in Chapter 7.

[11] select is not available in bash versions prior to 1.14.

[12] Versions of bash prior to 1.14.3 have a serious bug with select. These versions will crash if the select list is empty. In this case, surround selects with a test for a null list.

[13] A user can also type CTRL-D (for end-of-input) to get out of a select loop. This gives the user a uniform way of exiting, but it doesn’t help the shell programmer much.
