Chapter 9. Learn Shell Scripting

This chapter covers many aspects of shell scripting. In keeping with the spirit of the book, it’s not an A-to-Z tutorial on the subject. Rather, each project tackles a particular technology pertinent to writing shell scripts. The 13 projects cover the following topics:

  • Bash functions in a script, parameter expansion, here-documents, and script debugging

  • Regular expressions, both modern (extended) and obsolete (basic)

  • Shell quoting

  • Forming conditions for use in conditional expressions

  • Subshells and command blocks

  • Traps and handlers, and how to implement them in a shell script

If you are not familiar with writing shell scripts, read Projects 9 and 10 for an introduction. See Project 4 for a discussion of shell and environment variables and how they differ in scope. Project 52 covers Bash functions.

The chapter focuses on scripting with Bash, the default shell for accounts created in Mac OS X. If your chosen interactive shell is not Bash, don’t worry; you can still write and use the scripts you’ll find here. Just make Bash execute them by making the first line of each script read #!/bin/bash. See Project 5 for a comparison of shells.

The projects in this chapter are fairly advanced. It’s not a tutorial on writing shell scripts; rather, it presents useful and practical solutions to some of the most common scripting tasks. It’s of most use to those who have grasped the basics of scripting and want to start writing real-world scripts.

Use Functions in Scripts

“How do I avoid repeating the same piece of code in a shell script?”

This project demonstrates the use of Bash functions in shell scripts. It shows you how to use functions as a way of gathering commonly used code into blocks and demonstrates some handy tricks you can employ in your own code.

Learn More

If you’re not familiar with Bash functions, refer to Project 52.

Functions’ Power, Multiplied

In Project 52, we covered the technique of combining command sequences into functions that can be invoked from the command line. Within Bash scripts, functions work much the same way that functions do in other languages, such as JavaScript and C.

When functions are incorporated into a script, they usually are grouped at the top of the file, ahead of the main body of the code. When the script is invoked, Bash reads and parses the functions, which makes them available for use within the actual script. (Functions are not executed when they are parsed—only when they are called by the script.)

Tip

Access an argument passed to a shell script from within a function by passing the argument to the function. To access the script’s $2 from within function usage, for example, call usage as follows:

usage "$2" other
params...

Within usage, the value of the main script’s $2 can be accessed through the function’s $1.

Like shell scripts, functions accept arguments, and both use the same syntax to refer to arguments. The first argument passed to a script or function is available in the variable $1; the second, in $2, and the nth, in variable $n. Bash also provides two special variables: $* expands to a list of all arguments, and $# expands to the total number of arguments passed.

Because of their shared syntax, arguments passed to a script are not accessible directly by the functions within it, but are available again when a function terminates and the main body of the script executes.

One point to be aware of: The variable $0 represents the script name in both the script and its functions. Use the special variable $FUNCNAME to access the name of the current function.
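A minimal sketch (the script and function names are our own) shows the two side by side:

#!/bin/bash
show_names ()
{
  echo "Script:   $0"          # the script name, even inside a function
  echo "Function: $FUNCNAME"   # the name of the currently executing function
}
show_names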

Write a Function

Most nontrivial shell scripts take arguments, and a well-written script will perform some validation on the arguments it receives. Validation methods can vary widely, depending on the nature of the arguments involved (testing for numbers versus text, for example), but most Unix commands and scripts respond the same way when incorrect arguments are passed: by writing a usage line to the terminal. In a script that does a lot of validation, handling this kind of repetitive task is an ideal candidate for a function.

Suppose that we are writing a script that does a lot of validation. We might write a simple function to be called from the many points of validation in our hypothetical script. The function would display usage information in the terminal window. Our example function, appropriately called usage, has been taken from a real-world script that creates a new Unix group.

usage ()
{
  echo "Create a new group"
  echo " Usage: ${0##*/} groupname gid"
  if [ "$*" != "" ]; then echo " Error: $*"; fi
  exit
}

Tip

A usage line traditionally displays the name of the script. Use the special variable $0 instead of writing the script name literally. In this way, the name that’s displayed always reflects that of the script, even if the script is renamed after it’s written. The special variable expansion ${0##*/} truncates the leading pathname from the script name. If the script is called by a command line such as /usr/local/bin/my-script, the variable expansion becomes my-script.

Project 76 covers Bash parameter expansion.

In the new-group script, usage displays an informational message and a usage line, and (optionally) an error message preceded by the text Error:. Because this function is called in response to fatal errors, it also shuts down, or exits, the script. A function that simply completes and returns to the main body of the script should not finish on exit: An exit statement terminates the entire script.

Let’s use our function to report an error when the number of arguments passed to a script is not two. Our script calls the usage function if the wrong number of arguments is passed.

if [ $# -ne 2 ]; then
  usage
fi

To pass an error message to the function, call it like this.

if [ $# -ne 2 ]; then
  usage "Two arguments expected but $# received"
fi

Learn More

Projects 9 and 10 show you how to write simple Bash shell scripts.

Unix commands usually write error messages to standard error instead of standard output. We can change our usage function to honor this convention by using a redirection trick. Normally, the echo command writes to standard output, but if we merge standard output into standard error by using the notation 1>&2, or the equivalent >&2, all output will be sent to standard error instead. As an example:

echo " Usage: ${0##*/} groupname gid" 1>&2

Learn More

Project 6 covers the concepts of redirection, standard output, and standard error.

Underline a String

Here’s a handy function to underline a line of text. It accepts a line of text as a single argument, displays the text on a line, and places a line of dashes equal in length to the text on the line below it.

# Function Underline(string-to-underline)
# Display and underline a string.
# $1: the string to underline
Underline ()
{
  local -i len # to hold the length of the string
  # write out the string and a '-' for each character
  len=${#1}; echo "$1"
  while ((len!=0)); do echo -n "-"; len=len-1; done; echo
  return 0
}

Tip

An often-used convention names functions starting with a capital letter, helping distinguish functions from variables and commands.

Our function, named Underline, assigns the number of characters in parameter 1 to the variable len by using the special notation ${#1}. It then displays the text held in parameter 1 and loops to display the appropriate number of dashes below the text. We employ a few more tricks besides ${#1}. Passing option -n to echo stops it from displaying each dash on a new line. Also, we declare len to be a local integer variable in the line

local -i len # to hold the length of the string

A local variable exists only while its defining function executes and prevents us from accidentally overwriting a variable of the same name from the main script. The option -i makes len an integer variable, allowing us to employ Bash integer expressions such as the condition in

Learn More

Project 87 gives tips on declaring variables and Bash integer arithmetic.

while ((len!=0));

which loops for as long as the value of the variable len is not equal to 0; and the arithmetic expression

len=len-1

which subtracts 1 from the value of len.

Learn More

Project 81 covers Bash conditions.

We’d call Underline from the main body of the script in the following manner.

Underline "The Title"

yielding

The Title
---------

Tip

To find out more about the local command—which, when used within a function, is equivalent to the declare command—type

$ help local
$ help declare

Use Bash Parameter Expansion

“How do I perform string manipulation in Bash?”

This project covers the topic of parameter expansion. Parameter expansion is most often used to expand variables and arguments by means of the familiar $ notation: $length or $1. Parameter expansion, however, is more than simply the expansion of a variable or an argument into its value; it also involves manipulation of the value, such as pattern replacement and default initialization.

Basic Expansion

By now, the basics of parameter expansion are probably familiar. We give a variable a value.

$ title="101 Projects"

Later, we expand the variable to expose its value.

$ echo $title
101 Projects

Bash uses the terms parameter and parameter expansion not only for variables, but also for arguments passed to a script or function. Where Bash refers specifically to arguments such as $1, it uses the terms positional parameter and positional parameter expansion.

Positional parameter expansion works as follows: The first argument passed to a Bash script or function is available in the variable $1; the second, in $2; and the nth, in the variable $n. The special expansion $* expands to a list of all arguments passed, and $# expands to the number of arguments passed.

The special expansion $@ is useful when enclosed in double quotes. To illustrate this, suppose that we pass two arguments to a script, both of which contain spaces.

$ ./tst "param one" "param two"

Whereas both $* and $@ expand to four items— "param", "one", "param", and "two" —the quoted versions behave differently.

  • "$*" expands to one item: "param one param two".

  • "$@" expands more usefully to two items: "param one" and "param two".

Tip

The special parameter $$ expands to the process ID of the shell. It provides an easy way to generate a uniquely named temporary file in a script. For example:

echo "Test" > $0$$.tmp

Complex Expansion

More complex parameter expansion lets us assign a default value to a parameter or change its value by cutting and replacing portions of its contents. Complex expansion uses the notation

${parameter-name<expansion-type>}

Tip

Use the following technique to embed parameter expansion in text that might otherwise be confused with the name of the parameter. To expand an abbreviated day name, where day="Tues", to Tuesday, we type

${day}day

Set Default Values

Suppose that we have a script that takes one optional argument. We want to assign the value of argument (parameter) 1 to the variable level, but only if parameter 1 is given a value. If no argument is given when the script is called, we want the value of level set to equal the text string normal. We can take the conventional, long-handed approach and use an if statement.

if [ "$1" = "" ]; then
  level="normal"
else
  level="$1"
fi

Better, we can use the functionally equivalent complex expansion.

level=${1:-"normal"}

An alternative form, :=, not only expands to the default value when the parameter is unset or null but also assigns that value to the parameter. This form can’t be applied to positional parameters (arguments), only to ordinary variables.

new_level=${level:="normal"}

A third method causes a script to exit if a compulsory argument is not supplied. The following expansion displays an error message and aborts the script if no value is passed to $1; otherwise, it assigns the passed value to the variable level.

$ cat tst
level=${1:?Please supply a value}

If we run the script and fail to supply an argument, it displays the error message and aborts.

$ ./tst
./tst: line 1: 1: Please supply a value
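Here’s a short sketch that contrasts the three forms (the variable names are our own); run it with and without an argument.

#!/bin/bash
unset level
echo ${level:-"normal"}            # prints "normal"; level itself remains unset
echo ${level:="normal"}            # prints "normal" and assigns it to level
echo $level                        # prints "normal": the := form made the assignment
req=${1:?"Please supply a value"}  # aborts with an error if no argument was given
echo "Argument received: $req"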

Slice Strings

Bash can expand a string variable (a variable that contains text) to a fragment of the text. This process is called slicing. We might expand the variable $hi by taking a slice starting from the 8th character (that’s character 7, because the first character is 0) and returning the next 6 characters.

$ echo $hi
Hello, please slice me
$ echo ${hi:7:6}
please

Here’s a practical application of slicing—a script that checks each of its arguments to see whether it’s an option flag (an argument that starts with a - character). This technique is commonly used in connection with scripts for which option flags have been defined. Our example is part of a script that has two legal option flags: -p, which prompts the user for a password; and -v, which sets verbose mode. Any other option will cause the script to exit and report an error to the user.

We want to extract and compare the first character of each argument by slicing a substring one character long, beginning with the first character (character position 0). The following script uses the expansion ${1:0:1} to slice $1, where :0 specifies the start position and :1 specifies the number of characters to extract.

while [ "$1" != "" ]; do
  if [ "${1:0:1}" = "-" ]; then
    case "$1" in
      "-p") stty -echo; read -p "Password:" password
            stty echo; echo;;
      "-v") verbose=yes;;
      *) echo "invalid option $1"; exit
    esac
  else
    echo "Here we process non-option arguments..."
  fi
  shift
done

Tip

Use the command stty -echo to stop the user’s input from being echoed to the screen as she types. This is useful when a password or other such sensitive information needs to be input. The command stty echo puts things back to normal.

The script then tests to see whether the character is a dash (-). If so, it issues instructions depending on whether the dash is followed by p, v, or any other character (denoted by *); if not, it writes a message to the screen: Here we process non-option arguments...

The script demonstrates a few other useful techniques, too. It loops, processing each argument in turn. At the end of the loop, it uses the shift command to shift all positional parameters down one place: $2 becomes $1, $3 becomes $2, and so on, while the old $1 (which we just processed) is discarded.
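A stripped-down sketch of the same loop shows shift at work; invoked with three arguments, it prints three lines, counting down $# as it goes.

#!/bin/bash
while [ "$1" != "" ]; do
  echo "processing $1 ($# arguments remain)"
  shift   # discard $1; every remaining argument moves down one place
done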

Tip

Use the special expansion ${#var} to return the length of a parameter, in characters.

Top and Tail Strings

Bash provides a way to remove the head or tail of a string. We’ll illustrate a few useful techniques on a Unix pathname written to the variable fullpath.

$ fullpath="/usr/local/bin/backup.user.sh"

In our first example, we remove the head of the string by specifying a parameter expansion in the form

${parameter##word}

The character combination ## instructs Bash to remove all characters, starting from the left (the start) of the specified parameter, that match word. We’ll specify word as */, where * matches zero or more occurrences of any character and / represents itself. The star symbol is interpreted exactly as it would be for shell globbing. Our pattern, therefore, matches any string of characters from the start of the string, ending with /.

Learn More

Refer to Project 11 for a full explanation of globbing.

$ echo ${fullpath##*/}
backup.user.sh

You’ll notice that the pattern matched the longest string it could, up to the last /. Try the same command, but type a single # to match the shortest string—up to the first /.

To extract the file extension, we type

$ echo ${fullpath##*.}
sh

To remove characters starting from the right (the end) of the string instead of the left, specify % instead of #. The same convention applies: % matches the shortest string and %% the longest. To remove the extension part (.sh) from fullpath, we require the shortest match, starting from the right (%), for the word .*.

$ echo ${fullpath%.*}
/usr/local/bin/backup.user
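For the directory portion, remove the shortest match for /* from the right. Together with the earlier ${fullpath##*/}, this gives rough equivalents of the dirname and basename commands.

$ echo ${fullpath%/*}
/usr/local/bin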

Here’s an example script that splits a pathname into its component directories and the filename. We match the shortest string from the left and the longest from the right. Contrast this with the previous two examples. If you can figure out how it works, you’ve got topping and tailing down to a tee.

$ cat tst
#!/bin/bash
pathname=${1}"/"
while [ ! -z ${pathname#*/} ]; do
  pathname=${pathname#*/}
  echo ${pathname%%/*}
done

Let’s try it on a pathname.

$ ./tst /usr/local/bin/command
usr
local
bin
command

Search and Replace

Bash gives us a means to search a parameter for a pattern, replacing each occurrence of that pattern with a new string. The syntax is

${parameter/match-pattern/replace-pattern}

Here are some examples in which we use the echo command to demonstrate search and replace.

Search for the first occurrence of Hello, and replace it with Goodbye.

$ message="Hello, Hello World"
$ echo ${message/Hello/Goodbye}
Goodbye, Hello World

Only the first occurrence of Hello is replaced: To replace all occurrences, specify a double slash instead of a single slash.

$ echo ${message//Hello/Goodbye}
Goodbye, Goodbye World

To match a pattern that must be at the very start of the string, introduce the search-and-replace expression with the character sequence /#.

$ echo ${message/#Hello/Goodbye}
Goodbye, Hello World

Similarly, to specify that the pattern must be at the end of the string, introduce the search-and-replace expression with the character sequence /%.

$ echo ${message/%World/Earth}
Hello, Hello Earth
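A practical use for this expansion is tidying filenames. The following sketch (the files themselves are hypothetical) renames every file in the current directory whose name contains a space, replacing each space with an underscore.

#!/bin/bash
for f in *" "*; do
  [ -e "$f" ] || continue    # skip the unexpanded pattern when nothing matches
  mv -i "$f" "${f// /_}"     # replace every space with an underscore
done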

Learn Regular Expressions

“How do I search for text that matches a specific pattern?”

This project shows you how to write regular expressions. A regular expression is formed to match a particular text pattern. Project 78 covers advanced use of regular expressions.

Note

Regular expressions are not the same as globbing (covered in Projects 11 and 12). Globbing is implemented by the shell and by commands such as find, and matches a pattern against a list of filenames—usually, the files in the current directory. Regular expressions are more powerful and are used by text-processing commands to match against lines of text—usually, to search for and replace text.

The Match Game

Regular expressions are widely used in Unix, and most text-processing tools support them. The most common uses include:

  • Searching a text file for lines containing particular text

  • Filtering the output from other commands for relevant lines

  • Performing search and replace in text editors such as nano and TextWrangler, and in text-editing tools such as sed and awk

  • Performing text manipulation in a programming language such as Perl or PHP

The simplest regular expressions are plain text sequences (such as index.html) that match other instances of themselves. More often, regular expressions contain a mix of wildcards, repetitions, and alternatives.

Unix supports three types of regular expressions, which unfortunately don’t share a compatible syntax. The three forms are modern (also termed extended); obsolete (also termed basic); and Perl regular expressions (introduced by the Perl programming language). This project focuses on extended regular expressions, but a section at the end highlights how extended expressions differ from basic expressions. Perl regular expressions, the most powerful of all, are not generally supported by the Unix tools covered in this book.

Basic regular expressions are supported by the grep and sed commands. Extended regular expressions are supported by the awk command and by the extended variants of grep and sed—namely, egrep (or grep -E) and sed -E.

Learn More

Refer to Project 23 for examples of using the grep command.

Regular expressions are employed in many of the projects in this book. Read this project to brush up on the theory, and you’ll be ready to apply it in a more practical way to other projects.

Learn More

Refer to Projects 59 to 62 for more information on the sed and awk commands.

Basic Rules

Depending on context, regular-expression matching is performed on a string (a sequence of characters) or a line of text. Matched text cannot span lines but must be wholly contained within one line. Matching is normally done in a case-sensitive manner, but most tools let you specify that matching should be case insensitive.

Tip

Remember that the escaping character is a special character itself. To use it literally, escape it by typing \\.

Regular expressions are greedy: Given a choice of several possible matches, they always choose the longest one. Consider the text

backup.user.sh

A regular-expression match against “anything followed by dot” will return backup.user. rather than the shorter match backup.

Regular-Expression Syntax

A regular expression consists of a sequence of atoms and repeaters.

An atom is any of the following:

  • A character (most characters match themselves)

  • . (matches any single character)

  • ^ (matches the start of a line or string)

  • $ (matches the end of a line or string)

  • [...] (called a bracketed expression; represents exactly one instance from a group of possible characters and is explained more fully later in this project)

A repeater is any of the following:

  • * (matches zero or more occurrences of the preceding atom)

  • + (matches one or more occurrences of the preceding atom)

  • ? (matches zero or one occurrence of the preceding atom)

The syntax is explained by examples in the rest of the project. Project 78 covers advanced regular expressions, extending the syntax shown here.

Tip

When you enter a regular expression on the command line, remember that characters such as star have a special meaning to the shell and must be escaped from it. It’s good practice always to surround regular expressions with single quotes.

To match a character such as star (*), which normally has a special meaning, you must escape its special meaning by preceding it with a backslash (\). The special characters that must be escaped in extended regular expressions are

. ^ $ * ? + \ [ { ( ) |

Simple Regular Expressions

Let’s form a very simple regular expression that we might use to match an incomplete crossword entry: a p blank l blank. In regular-expression language, a single-character blank is represented by a dot, so here’s our regular expression.

'ap.l.'

When applied to a list of words, one per line, this expression will match lines that contain apple, apply, and aptly. It will also match lines that contain words such as appliance, pineapple, and inapplicable.

When applied to lines (or long strings) of text, the regular expression 'ap.l.' will match lines such as an apple a day and clap loudly because those lines contain matches. It’s not necessary to match the entire line or string.

Tip

A simple method of dry-running a regular expression uses the command egrep (or grep for basic regular expressions). Type

$ egrep 'the-regular-¬
    expression'

but give no filename. You can now experiment by typing lines of text, which egrep will read from standard input. Lines that match the regular expression will be echoed back when you press Return; those that don’t, won’t. Press Control-d when you’re finished.

Anchors

The special symbol caret (^) matches the start of a line or string; it matches a position rather than a character. Repeating our example from the previous section, we find that the regular expression

'^ap.l.'

matches lines that start with ap.l. and won’t match pineapple, inapplicable, or clap loudly.

Tip

To match empty lines or strings, use the regular expression '^$'.

Similarly, the special symbol dollar ($) matches the end of a line or string, so the regular expression

'ap.l.$'

matches words that end with ap.l. and won’t match appliance or inapplicable.

It’s important to realize that anchoring applies to the whole line (or string), not to individual words. If we pass the line red apple, it will not match ^apple because caret anchors to the start of the line. It will match the line apple mac. Similarly, apple$ will match red apple but not apple mac.

Tip

Pass the -w option to grep to tell it to match only whole words. With -w, the pattern apple would match the string “an apple a day” but not the string “a pineapple a day”.

Finally, we match an entire line or string by applying both anchors. To match only apple, apply, and aptly, use the regular expression

'^ap.l.$'
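You can check this with the dry-run technique from the earlier tip: type lines at egrep and see which are echoed back. Here apple and aptly are echoed; pineapple is not.

$ egrep '^ap.l.$'
apple
apple
pineapple
aptly
aptly
<Control-d>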

Repeaters

To search for fixed patterns of text separated by arbitrary text, we must specify any number of any character. We do this by combining the atom dot (.) to mean any character and the repeater star (*) to mean zero or more repetitions thereof. Here are some examples that use a text file, paren.

$ cat paren
Here is (some text) in parentheses.
Here we have () empty parentheses.
Here we have (a) letter in parentheses.
Here we have no parentheses.

Let’s search for lines that contain anything, including nothing, enclosed in parentheses. To do so, we create a regular expression that means (, followed by anything or nothing, followed by ). We must escape the parentheses (and braces, too) because they are special characters (a topic discussed at greater length in Project 78).

Tip

You may employ any number of repeaters in a regular expression.

$ egrep '\(.*\)' paren
Here is (some text) in parentheses.
Here we have () empty parentheses.
Here we have (a) letter in parentheses.

To exclude the empty parentheses, we specify one or more repetitions of any character by using the special character plus (+) instead of star.

Learn More

Project 78 shows you how to apply finer control to repeaters and how to repeat constructs that are more complex than a single character.

$ egrep '\(.+\)' paren
Here is (some text) in parentheses.
Here we have (a) letter in parentheses.

To specify zero or one repetitions, we use the special character query (?).

$ egrep '\(.?\)' paren
Here we have () empty parentheses.
Here we have (a) letter in parentheses.

Repeaters can be applied to specific characters as well as to special characters like dot. Here are two regular expressions, the first matching two or more consecutive dashes (-); the second matching star, then one or two dots, and then star.

$ egrep -- '--+' test.txt
$ egrep '\*\.\.?\*' test.txt

The first example uses a trick to prevent the egrep command from thinking the regular expression is an option because it begins with a dash. A double-dash option preceding the regular expression signifies that no more options follow. The second example uses the special character \ (backslash) to escape the star and dot characters.

Repeaters are summarized in “Regular-Expression Syntax” earlier in this project.

Bracket Expressions

To match any digit 0 to 9, or perhaps any letter, we list the alternative characters and have the text match exactly one of those characters. Regular expressions provide bracket expressions for just such a purpose, whereby we list the alternative characters in square brackets. For example, the regular expression

'b[aeiou]g'

matches bag, beg, big, bog, and bug. It does not match byg or boog.

Learn More

Project 78 shows you how to choose alternatives that are more complex than a single character.

The following regular expression will match any line that starts with a, b, or c (uppercase or lowercase) immediately followed by a two-digit number.

'^[aAbBcC][0123456789][0123456789]'

To match all characters except a particular set, enclose the characters to be excluded in brackets, preceded by a caret (^) symbol. To match any character except a digit, specify the regular expression

'[^0123456789]'

Tip

All special characters lose their meaning inside bracketed expressions, where they should not (and in fact cannot) be escaped.

Character Ranges

A character range is a bracketed expression with a start point and an end point separated by a dash. Here are some simple examples to illustrate this.

  • All digits is '[0-9]' and equivalent to '[0123456789]'.

  • All letters is '[a-zA-Z]'.

  • All letters plus [ ] ^ and - is '[][a-zA-Z^-]'. To clarify, we specify the character set ][a-zA-Z^- enclosed in square brackets.

In the last example, we employed a few tricks to include the special characters [, -, and ^ in the list. To include a ] character, make it first in the bracketed list (or the second when you’re negating the list with a caret symbol). A caret must not be the first in the list, and a dash character should be the last in the list.

Character Classes

Regular expressions provide special character classes to prevent the need to list many characters in bracketed expressions. To match all letters and digits, for example, we specify the class alnum (alphanumeric). A class name should be surrounded by [: :] and enclosed in brackets.

Tip

The sequence [[:alpha:]][[:digit:]] differs from [[:alpha:][:digit:]]. The former specifies a letter followed by a digit; the latter specifies either a letter or a digit.

Let’s pose a matching problem and solve it by using character classes. We want to match lines starting with one or more digits, followed by one or more letters, followed by a colon, followed by anything. The line may optionally start with a white space. Here’s an example.

        42HHGG: Life, the universe, and everything.

We might describe our matching criteria by using a regular expression such as

'^[[:space:]]*[[:digit:]]+[[:alpha:]]+:'

The regular expression uses the character classes space (any white space, including tab), digit (0-9), and alpha (a-z, A-Z). The rest of the expression is formed with the now-familiar repeaters and anchors.
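Assuming the example line is saved in a file called guide.txt (a name of our own choosing), we could test the expression like this.

$ egrep '^[[:space:]]*[[:digit:]]+[[:alpha:]]+:' guide.txt
        42HHGG: Life, the universe, and everything.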

The following character classes are defined.

alnum alpha blank cntrl digit graph
lower print punct space upper xdigit

Tip

To discover exactly which characters are included in a particular class, read the Section 3 man page for the corresponding library function. The library function is named like the class but starts with is. To read about character class [:space:], for example, look at the man page for isspace by typing

$ man 3 isspace

Be Clever with Regular Expressions

“How do I search for text that matches a specific pattern?”

This project shows you how to write advanced regular expressions. A regular expression is formed to match a particular text pattern. Project 77 introduces regular expressions.

If you’re not familiar with regular expressions, read Project 77, on which this project builds. This project introduces advanced techniques such as:

  • Repeaters with bounds to state more precisely how many times a preceding atom must repeat

  • Subexpressions to turn regular expressions into atoms, thereby making them subject to repeaters

  • Branches to form choices more complex than the simple character alternatives offered by bracket expressions

Repeaters with Bounds

Project 77 introduced regular expressions and showed you how to use an atom followed by a simple repeater to say match multiple occurrences of the specified atom. But the alternatives offered by the simple repeaters *, +, and ? are not always adequate. We can specify to match one or more letters by using the expression

'[[:alpha:]]+'

but not exactly nine letters or between five and nine letters, inclusive.

To specify a precise number of matches, use a bounded repeater, which has the syntax {n,m}. You can use a bounded repeater wherever you’d otherwise use a simple repeater. We’ll demonstrate the use of bounded repeaters by matching words of a particular length and words that fall within a particular length range. First, let’s use egrep and a regular expression to match words of exactly nine letters. The input file contains a list of words, one per line.

Tip

Attempting to specify a repeater such as {,9} to mean 9 or fewer is not legal syntax. Instead, use either {1,9} or {0,9} as appropriate.

To match all nine-letter words, we employ a bounded repeater in a regular expression such as

$ egrep '^[[:alpha:]]{9}$' /usr/share/dict/web2
...
pinealism
pinealoma
pineapple
pinedrops
pinewoods
pinheaded
...

(The file /usr/share/dict/web2 contains a handy word list.)

The syntax element {9} is a bounded repeater that matches exactly nine occurrences of the preceding atom: a letter. Note that we’ve used a caret symbol and a dollar symbol to ensure that the expression matches a complete line; otherwise, the expression would also match a portion of all words more than nine characters in length.

To extract all words five to nine characters in length, we supply two comma-separated bounds.

'^[[:alpha:]]{5,9}$'

Whereas the first example matched words like pineapple, this example matches from apple through dappled to pineapple.

To search for nine or more occurrences, supply only the lower bound. The next example matches space-separated numbers of nine or more digits.

' [[:digit:]]{9,} '

Subexpressions

By enclosing a regular expression in parentheses, we turn it into an atom (see Project 77). Such an expression is termed a subexpression. A subexpression is seen as a single entity and, therefore, can be made the subject of a repeater.

Here’s an example in which we check for valid IP addresses, which look like 10.0.2.120 or 217.155.168.147. We first construct a regular expression that matches one to three digits, followed by a dot.

'[[:digit:]]{1,3}\.'

Then we turn the regular expression into a subexpression, which allows us to repeat the whole expression three times with a repeater.

'([[:digit:]]{1,3}\.){3}'

Tip

Any regular expression enclosed in parentheses becomes a subexpression. A subexpression is an atom and can be treated just like a simple character, which may be incorporated into a new regular expression. The new expression may be enclosed in parentheses and reduced in its turn to an atom. There is no effective limit to this process—at least not until your head starts to hurt!

Finally, we add the original expression to the end, but without the trailing dot. For good measure, we also assume an IP address to be surrounded by nondigit characters. This prevents matching an invalid address such as 1111111.2.3.4444444. Here’s the final regular expression.

'[^[:digit:]]([[:digit:]]{1,3}\.){3}¬
    [[:digit:]]{1,3}[^[:digit:]]'

Note

We must extend the definition of an atom given in Project 77 to include a subexpression.

If you try this expression, you’ll notice that it fails on IP addresses that fall at the start or end of a line. We need to delimit an IP address by start of line OR not a digit and not a digit OR end of line. We can achieve this by using branches, introduced in the next section.

Branches

Branches define sets of alternative matches. A regular expression may specify one or more branches separated by vertical-bar (|) symbols and will match anything that matches one of the branches. Each branch is itself a regular expression.

Here’s a regular expression with seven branches that matches any one of the days of the week.

'monday|tuesday|wednesday|thursday|friday|saturday|sunday'

This alone is limited, and an attempt to match a full date will not work. The following regular expression, for example, doesn’t do what we probably intended.

'saturday|sunday jan|feb [[:digit:]]{1,2}'

It actually specifies a line that matches any of the three alternatives.

'saturday' OR 'sunday jan' OR 'feb [[:digit:]]{1,2}'.

Tip

Don’t get confused by the two meanings of the caret symbol. Outside a bracket expression, it’s a start-of-line anchor, and as the first character inside a bracket expression, it negates the sense of the match.

To get around this problem, we employ subexpressions. Combining multiple branches as subexpressions within larger regular expressions enables complex and highly useful matches. We might use the following to pull out weekend events for January and February from an activities list.

'(saturday|sunday) (jan|feb) ([[:digit:]]{1,2})'

We might match days of the week by using the shorter regular expression

'(mon|tues|wednes|thurs|fri|satur|sun)day'

We’ll conclude our look at branches by completing the IP address-matching example started in the preceding section. Recall that we wanted to delimit an IP address by start of line OR not a digit and not a digit OR end of line. We specify the former by using a two-branch subexpression such as

'(^|[^[:digit:]])'

Here’s the full regular expression, split across three lines for clarity. It should be entered in Terminal on a single line and, obviously, as part of a command.

'(^|[^[:digit:]])
 ([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}
 ([^[:digit:]]|$)'
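To try it out, the dry-run technique from Project 77 works again (or substitute a log file of your own). Lines containing a plausible IP address are echoed back; the line with the invalid address is not.

$ egrep '(^|[^[:digit:]])([[:digit:]]{1,3}\.){3}¬
    [[:digit:]]{1,3}([^[:digit:]]|$)'
connection from 10.0.2.120 refused
connection from 10.0.2.120 refused
serial 1111111.2.3.4444444 ignored
<Control-d>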

Capture Patterns

Suppose that we need to match a particular pattern and that the pattern must occur twice. That’s easy to do; we use the repeater {2}. However, if our requirement is for the text that matched the first time to be repeated verbatim the second time, that’s not so easy. (Imagine a search that’d match Monty Monty and Sugar Sugar but not Monty Python or Sugar Babes.)

Tip

When subexpressions are nested, capture and playback gets a bit confusing and is best avoided.

To pull off such a trick, we use capture and playback. Whenever a subexpression is matched, the matched string is captured in a buffer. The first string to be captured is held in buffer 1; the second, in buffer 2; and so on. This happens automatically. To replay a buffer, simply specify \1 or \2, and so on.

Here’s an example in which we capture the entire expression and replay it.

'(b[aeiou]g)\1'

This expression will match bigbig and bagbag, but not bigbag. Remember that a pattern is captured only when it’s a subexpression—that is, it’s enclosed in parentheses.

Search and Replace

Capture patterns play an important role in search and replace. Editing tools such as sed support the capture-and-playback technique, allowing a pattern captured from the search string to be played back into the replacement string.

Here’s an example in which we process a file that contains information about books. The entry for a book occupies one line in the file (shown split into three shorter lines in this book) and has the following format.

Level: Beginning/Intermediate/Advanced, "101 Projects",
 CBS Category: Macintosh/Unix, Covers: Mac OS X 10.4 Tiger,
 Price: $34.99, Author: Mayo.

Our mission, should we choose to accept it, is not so impossible. We must extract the quoted title and price, and report them in the following format.

Cost 34.99 Title 101 Projects

Learn More

Projects 59 and 61 cover the sed text editor.

To realize this, we match an entire line, capturing the title and price, and replace the line with Cost <price> Title <title>.

Let’s build the regular expression piece by piece. Start with .* to match everything up to the title. Match the title with “.*”, and capture it with (“.*”). Then match intervening information with .*, and match and capture the price with (\$[0-9]{1,3}\.[0-9]{2}). Note that we escape $ and . (as \$ and \.) because they are special characters. Finally, match the remainder of the line with .*.

The sed command’s syntax for search and replace is

s/search-pattern/replace-pattern/

Our replace pattern is Cost \2 Title \1.

Putting this together, we get the following command.

$ sed -E 's/.*(".*").*(\$[0-9]{1,3}\.[0-9]{2}).*/Cost \2 ¬
    Title \1/'

Option -E to sed tells it to switch on extended regular expressions. Let’s try this command, adding a little extra sophistication to display only matching lines with option -n (don’t display input lines) and flag p (display matching lines) placed at the end of the substitute function.

$ sed -En 's/.*(".*").*(\$[0-9]{1,3}\.[0-9]{2}).*¬
   /Cost \2 Title \1/p'
Level: Beginning/Intermediate/Advanced, "101 Projects",
   CBS Category: Macintosh/Unix, Covers: Mac OS X 10.4
Tiger, Price: $34.99, Author: Mayo.
Cost $34.99 Title "101 Projects"
TEST"TITLE"TEST$111.22TEST
Cost $111.22 Title "TITLE"
<Control-d>
$

Use Here-Documents in Scripts

“How do I use an interactive command in a shell script?”

This project explores the use of here-documents in Bash shell scripts. Here-documents provide an easy way to display multi-line messages. They also offer a means of using interactive commands (that normally take input from Terminal) in a shell script by specifying that input will instead be found embedded in the script.

Learn More

Project 6 covers the techniques of redirection and pipelining.

Learn More

Project 21 gives more information on the cat command.

“Talk” in a Script

A here-document is a clever Bash feature one can employ in shell scripts. It furnishes a technique for redirecting standard input not from a file or pipe, but from the text of the shell script itself. This is best explained by an example.

To display a sizeable message from a shell script, we could of course use the echo or cat commands to display text stored in a file. Instead, we’ll use cat but supply the text inline as part of the shell script.

Redirect from a Here-Document

The cat command, in the absence of a filename, reads its input from standard input. In the next example, we use a here-document to redirect standard input to be from the text of the shell script.

The following example is taken from a shell script that creates a new Unix group, but for brevity of output, we show only the section that’s of interest to us.

$ cat new-group
#!/bin/bash
cat <<EOS
The script creates a new Unix group within NetInfo
  Usage ${0##*/} groupname gid
  Neither the group name nor the group id must exist
EOS
$ ./new-group
The script creates a new Unix group within NetInfo
  Usage new-group groupname gid
  Neither the group name nor the group id must exist

Note

Remember to make the script executable (see Project 9).

Learn More

Bash parameter expansion is explained in Project 76.

The start of the region to be read as standard input is marked by <<word. The end of the region is marked by a line containing only word (in which even leading and trailing blanks are not permitted). In this example, the cat command reads the text between <<EOS and EOS and displays it on the terminal line.

Using a here-document has several advantages over just displaying the contents of a file. First, the shell script does not need to rely on or know the location of a second file. Second, you’ll notice that the parameter ${0##*/} is expanded. All lines of a here-document are subjected to parameter expansion, command substitution, and arithmetic expansion.
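Command substitution works in the same way; a small sketch (the wording is our own):

cat <<EOS
Report from ${0##*/} on host $(hostname)
Generated $(date)
EOS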

We could achieve a similar effect by using the echo command, but here-documents have other advantages and uses, which are demonstrated next.

Nontrivial shell scripts usually employ indentation to highlight their structure and organization. Your here-documents can follow the natural flow of script indentation, without having that indentation reflected in the text they pass via redirection: Just set the indents within the here-documents using Tab characters, instead of spaces. To enable this useful feature, type <<- instead of << at the beginning of the here-document. Here’s an example.

$ cat new-group
#!/bin/bash
      cat <<-EOS
      The script creates a new Unix group within NetInfo
        Usage ${0##*/} groupname gid
        Neither the group name nor the group id must exist
      EOS
$ ./new-group
The script creates a new Unix group within NetInfo
  Usage new-group groupname gid
  Neither the group name nor the group id must exist

Although tabs are stripped, spaces are not. This allows space-driven indentation within the here-document text, as in the example above. If you are in the habit of using spaces to indent your shell scripts, revert to using tabs within a here-document.

Tip

You may turn off parameter expansion, command substitution, and arithmetic expansion in a here-document by quoting the delimiting word at the start of the here-document. Don’t quote the terminating word. For example:

cat <<'EOS'
  This - ${0##*/} -
will not be expanded.
EOS

Control an Interactive Command

If you want to control an interactive command from a shell script, such as ftp to perform a file transfer, use a here-document to supply the command’s input from the text of the script. An interactive command expects to receive its input from standard input (usually, in the form of a human at a keyboard).

Let’s write a shell script that connects to an FTP server and issues three commands— user, ls, and exit—to ftp.

$ cat ftp-eg
ftp -n carcharoth.mayo-family.com <<-EOT
        user saruman mypassword
        ls
        exit
        EOT

Here’s what happens—automatically, with no user intervention—when we run the script.

$ ./ftp-eg
Connected to carcharoth.mayo-family.com.
220 carcharoth.mayo-family.com FTP server ready.
331 Password required for saruman.
...
150 Opening ASCII mode data connection for '/bin/ls'.
total 1
drwxr-xr-x 4 saruman saruman 136 Jun 10 00:21 Public
drwxr-xr-x 27 saruman saruman 918 Jun 28 13:17 Sites
226 Transfer complete.
221-
    Data traffic for this session was 0 bytes in 0 files.
    Total traffic for this session was 3573 bytes in 1...
221 Thank you for using FTP on carcharoth.mayo-family.com.

Tip

Bash provides a here-string in which the expansion of a variable can be used as standard input. Try the following commands, and compare the results you get from the second and third lines.

$ text="This is a test ¬
     of a here-string"
$ cat $text
$ cat <<<$text

The third line is equivalent to

$ echo $text | cat

Here’s a trick in which we use a here-document to form the standard input to a function, read_data, within a script, function-eg. The function requires three pieces of data.

$ cat function-eg
#!/bin/bash
read_data ()
{
  read make
  read model
  read color
}
read_data <<-HEREDOC
        BMW
        3 series
        Blue
HEREDOC
echo "Make: $make, model: $model, color: $color"
$ ./function-eg
Make: BMW, model: 3 series, color: Blue

Understand Shell Quoting

“How do I selectively turn off the shell’s interpretation and expansion of special characters?”

This project explores the art of quoting in the Bash shell. It shows how we force Bash to interpret characters literally in situations where they normally would be considered special characters.

Recognize Special Characters

The Bash shell expands a command line before the command line is executed. During the expansion phase, all special characters—such as wildcards, redirection symbols, and the dollar symbol used in variable expansion—are interpreted and replaced by their expansion text. To invoke a command and pass it text that includes any of those characters used in their literal senses (as in the strings M*A*S*H and $64,000 Question), the special characters must be quoted or escaped to prevent interpretation.

Before we can employ quoting, we need to know which characters must be quoted. Table 9.1 is a handy reference listing all the special characters, and character combinations, that the shell is likely to interpret.

Table 9.1. Shell Special Characters

Symbol              Expansion or Interpretation
#                   Introduce a comment
;                   Separate commands
{...} (...)         Introduce a command block and subshell
&& ||               Logical AND and OR operators (placed between commands)
~                   Home directory
/                   Directory or filename separator
$var                Variable expansion
`...` $(...)        Execute a command and substitute the output
$((...)) ((...))    Evaluate an integer expression and condition
' " \               Strong quote, weak quote, escape next character
* [...] ?           Globbing
&                   Background execution
< > | !             Redirection and pipelining
!                   History expansion

Quote and Escape

Suppose that you want to echo the text

I want $lots

If no quoting is used, $lots will be taken as an instruction to expand the variable lots (which is currently unset). The result would be as follows.

$ echo I want $lots
I want

To prevent the dollar special character from being interpreted, escape it in one of three ways. First, precede it with a backslash.

$ echo I want \$lots
I want $lots

Second, enclose the entire string in single quotes, which are also called strong quotes because no special characters within them are interpreted.

$ echo 'I want $lots'
I want $lots

Third, use double quotes, also called weak quotes because most, but not all, special characters they enclose are escaped. The exceptions are

  • The dollar symbol in all three forms: $var, $(...), and $((...))

  • The ! symbol in history expansion

In this example, we cannot use double quotes.

A First Escape Trick

Suppose that you want to echo a line such as

I want $1000000 (a lot of $)

We’ll assume that the number of dollars is not fixed but is held in the variable lots. To illustrate the different forms of quoting, we’ll examine what happens when each form is employed, starting with none.

$ echo I want $$lots (a lot of $)
-bash: syntax error near unexpected token `('

Employing single quotes prevents all forms of expansion.

$ echo 'I want $$lots (a lot of $)'
I want $$lots (a lot of $)

To achieve the intended result, we must employ double quotes, which prevent the parentheses from being interpreted but allow expansion of variable $lots.

$ echo "I want $$lots (a lot of $)"
I want 2324lots (a lot of $)

Closer, but this didn’t quite work. It’s still necessary to escape the first dollar symbol. If we don’t, it attaches itself to the second dollar symbol and causes the shell to expand the special variable $$.

$ echo "I want $$lots (a lot of $)"
I want $1000000 (a lot of $)

A More Daring Escape

Let’s look at a trickier example. How might we quote this?

$ echo $5 - That's ok

We cannot use double quotes, because we don’t want to interpret $5. Single quotes won’t work either, because the text itself contains a single quote acting as an apostrophe. A first attempt might have us escaping the apostrophe.

$ echo '$5 - That\'s ok'
> Control-c

Tip

Don’t be tempted to skip quoting because a command appears to work correctly. If your command attempts to pass the text note.* to grep unquoted, for example, and no matching filenames exist in the current directory, the shell will not expand it. Your unquoted command will work—until the day you create a file with a name such as note.1.

This fails because inside single quotes, no special characters are interpreted, including backslash. Hence, Bash sees the apostrophe as the closing quote and the last single quote as an unterminated open quote.

The simplest method involves converting the expression to two strings enclosed in single quotes, with the (unenclosed) apostrophe between them. Then we escape the apostrophe by using either a backslash or double quotes, as shown in the next two examples.

$ echo '$5 - That'\''s ok'
$5 - That's ok
$ echo '$5 - That'"'"'s ok'
$5 - That's ok

We could also use the following technique where two quoted parts are run consecutively.

$ echo '$5'" - That's ok"
$5 - That's ok

Consecutive Quotes

Suppose that we use the awk command to filter field number 4 (written as $4 in awk scripting) from the output of a ps command. We type the following, employing single quotes to escape $4 from the shell because we want it to be interpreted by awk.

$ ps xc | awk '{print $4}'

Suppose now that we want to do the same thing, but using the field number stored in a shell variable called field.

$ field=4
$ ps xc | awk '{print $$field}'

This won’t work, of course. So how do we both allow Bash to expand $field to 4 and escape the first dollar so we pass, literally, $4 to awk? In this simple example, there are several ways, but you can apply a general solution to almost all quoting problems of this nature. It may seem trivial now, but remember it for the future; I’ve seen many people completely stumped trying to solve quoting dilemmas that are amenable to this particular solution.

We simply start and stop quoted regions as necessary. The first quoted region is '{print $'; the second is '}'. $field is not quoted and, therefore, is expanded by the shell.

$ ps xc | awk '{print $'$field'}'

Although not necessary in this example, the general rule would have quoted each region to prevent problems with spaces in expanded parameters.

$ ps xc | awk '{print $'"$field"'}'

Ensure that you don’t include spaces between the quoted regions.

Multi-Level Quoting

Let’s write a command that uses grep to search a file for the sequence a*. Here’s our test file.

$ cat file
This line contains a*
This line does not

Learn More

Project 23 shows how to use the grep command.

Because star is a special character in regular expressions, we must escape it, passing \* to grep. Star and backslash are also special characters to the shell and must be escaped from it too, as \\ and \*.

Learn More

Projects 39 and 40 explore the ps command in detail.

Therefore, we form the following command.

$ grep a\\\* file
This line contains a*

Quoting within Command Evaluation

Here’s a tip that might save much head-scratching. Suppose that we have a command substitution such as

$(ps xc | grep "$target")

The variable $target may expand to include spaces, so we must double-quote it for the grep command to work correctly. If we then use the command substitution as a parameter to another command, we must enclose the whole substitution in double quotes. A naive attempt has us type the following.

$ grep "$(ps xc | grep "$target")" processes.txt

This shouldn’t work, because as we have seen in previous examples, the expression forms two quoted regions: "$(ps xc | grep " and ")". Surprisingly, it does work, because Bash processes a command substitution ($(...)) as an independent syntactical element. It processes $(ps xc | grep "$target") and then considers the outer expression grep "..." processes.txt.
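A simpler sketch illustrates the same point; the inner double quotes belong to the $(...) substitution and do not close the outer pair. (The output, of course, reflects the current date.)

$ echo "Today is $(date "+%A %d %B")"
Today is Friday 17 June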

Write Complex Bash Conditions

“How does Bash interpret conditional expressions?”

This project looks at the many forms of conditional expression supported by Bash. It explains the differences of the forms and compares them with one another. It also presents some handy tricks and gives tips on how to avoid syntax errors and malformed conditions.

Learn More

Project 10 introduces basic shell scripting techniques and discusses the conditional statements supported by Bash.

Understand Bash Conditions

Bash supports conditional expressions that are used in conditional statements such as if, while, and until. Here’s an example in which we test whether 5 is less than 7 (we use -lt to mean less than). The condition is enclosed in [...] and evaluates to true or false. Ideally, Bash will find truth in such a condition.

$ if [ 5 -lt 7 ]; then echo "yes"; else echo "no"; fi
yes

Now let’s examine this simple expression in more detail to discover how Bash interprets it, and explore the alternative forms of conditional expression offered by Bash.

There is more to Bash conditional expressions than is at first apparent. Let’s look at how Bash interprets a conditional expression. This is key to understanding the different forms and being able to make the most of them.

When interpreting a conditional statement such as if, Bash does not expect to see a Boolean value (TRUE or FALSE), as other languages such as C and PHP do. Rather, Bash expects to see an executable command. The syntax is effectively

if command; then...

Within such a command line, Bash executes the command that follows if and replaces it with whatever value the command returns. A return value of 0 is interpreted as TRUE; any other return value is interpreted as FALSE.

The [ Command

In our example statement, you might well ask about the whereabouts of the command Bash requires following if. The answer is a little surprising: Bracket ([) is actually a built-in Bash command. When interpreting a conditional statement such as

if [ 5 -lt 7 ]; then...

Tip

Just as for any other command, white space must separate [ and each of its parameters. You’ll get a syntax error, or a conditional expression that evaluates incorrectly, if you omit the white space. The final parameter, ], is required for syntactic completeness (or perhaps aesthetic value).

Bash first executes the bracket command, passing it the four parameters that form the remainder of the statement: 5, -lt, 7, and ]. (The statement is terminated by a semicolon.) The bracket command (not Bash command-line interpretation) evaluates the conditional expression and returns 0 if the statement is true (as it is in this case) and 1 if it is false.

Learn More

Refer to Project 16 for more information on the type command.

In our example, then, after it has executed the bracket command, Bash effectively sees the statement

if 0; then...

and interprets it as if TRUE; then....

We check the credentials of bracket with the type command.

$ type [
[ is a shell builtin

Equivalent to [ is the test command. The two are identical except that test does not expect to see a closing bracket.

$ if test 5 -lt 4; then echo "yes"; else echo "no"; fi
no
$ type test
test is a shell builtin

To discover all the conditional operators supported by bracket and test, consult Bash’s built-in help command by typing

$ help test

Learn More

Learn More

Project 6 covers redirection and pipelining.

Several examples are given in the next section.

Here’s a neat trick. A conditional statement may be given any command, not just [ or test. We could test whether two files differ by directly testing the return value from the diff command.

$ if diff eg1.txt eg2.txt &> /dev/null
> then echo "Same"; else echo "Different"; fi
Same

Most commands return 0 (TRUE) for success or yes and 1 (FALSE) for failure or no. In the diff example, we took the precaution of throwing away all errors and other output by using the redirection &>/dev/null to prevent the shell script from writing unwanted text to the Terminal screen when it executes.

Tip

Tip

The return value of the last command to be executed is held in the special shell variable $?.

$ diff eg1.txt eg2.txt
$ echo $?
0

Example Conditionals

The bracket command has a number of primaries you can use to test file attributes, such as whether a file exists.

$ if [ -e no-file ]; then echo "Exists"; ¬
    else echo "No such file"; fi
No such file

or whether you own a particular file.

$ if [ -O eg1.txt ]; then echo "It's mine"; fi
It's mine

Bracket can compare strings for less than, greater than, equality, inequality, and emptiness. The next two examples demonstrate tests for equality and emptiness. The -z primary returns TRUE if the length of the string that follows is 0 (the string is empty).

$ ans=""
$ if [ "$ans" = "yes" ]
> then echo "You agree"; else echo "You disagree"; fi
You disagree
$ if [ -z "$ans" ]; then echo "You didn't reply"; fi
You didn't reply

Integer evaluation is performed as demonstrated in previous examples, using -eq for equality, -ne for inequality, and so on. Type help test for more information.
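
For instance, here is a quick equality check at the prompt (the numbers are arbitrary).

$ if [ 5 -eq 5 ]; then echo "equal"; fi
equal
$ if [ 5 -ne 7 ]; then echo "different"; fi
different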

Complex Conditions

You may specify more complex conditions by using AND, represented by -a; OR, represented by -o; and NOT, represented by !. We can test whether both the variables ans and default are empty by using the following complex condition.

$ if [ -z "$ans" -a -z "$default" ]
> then echo "I don't know what you want"; fi

Don’t omit the spaces between operators and operands. In the next example, we have omitted the spaces around the = sign.

$ allow=""; user=""
$ if [ "$allow"="yes" -o "$user"="root" ]; ¬
    then echo "OK"; fi
OK

Omitting the spaces makes the conditional expression appear to be

[ "non-null-string" -o "non-null-string" ]

This is how it should look and evaluate.

$ if [ "$allow" = "yes" -o "$user" = "root" ]; ¬
    then echo "OK"; fi
$

We form expressions that are more complex by employing parentheses to ensure that evaluation occurs in the correct order. Our first attempt does not work.

$ ans="yes"; allow="no"; user="root"
$ if [ "$ans" = "yes" -a ¬
    ( "$allow" = "yes" -o "$user" = "root") ]
-bash: syntax error near unexpected token `('

The syntax error is reported because the parentheses are parameters to the bracket command and must be escaped from the shell, as demonstrated in our next attempt.

$ if [ "$ans" = "yes" -a ¬
    ("$allow" = "yes" -o "$user" = "root" ) ]
> then echo "OK"; fi
OK

Tip

Tip

It’s fine to escape individual items within conditional expressions by enclosing them in quotes, but enclosing an entire expression in quotes will always cause it to be interpreted as a string with value TRUE—a situation that can produce decidedly undesirable results.

$ if [ "1 -gt 9" ]
> then echo "Odd"; fi
Odd

Bash Boolean Operators

Compare the next two commands.

$ if [ "$allow" = "yes" -o "$user" = "root" ]; ¬
    then echo "OK"; fi
$ if [ "$allow" = "yes" ] || [ "$user" = "root" ]; ¬
    then echo "OK"; fi

The difference between the two statements is that in the first example, the built-in bracket command evaluates the whole expression. In the second example, we have two separated bracket commands, and it’s Bash that performs the OR operation, using its own || operator. The two commands are functionally equivalent; which you choose is a matter of personal preference. Bash uses a more friendly and C language–like syntax. It provides OR (||), AND (&&), and NOT (!) operators.

We can employ Bash operators outside a conditional statement. For example:

$ command1 && command2

In such a command, command2 is executed if, and only if, command1 returns TRUE. This technique works because Bash does not evaluate the second part of an AND statement if the first part is FALSE; the result can only ever be FALSE. This behavior is known as short-circuiting. Similarly, we could specify

$ command1 || command2

In this example, command2 is executed if, and only if, command1 returns FALSE.

As a practical example, think of what happens if we type the following command line in a directory where no subdirectory named fred exists.

$ cd fred; ls
-bash: cd: fred: No such file or directory
Desktop     Library    Music        Public
...

Tip

Tip

To source a shell script if, and only if, it exists and is readable, use the following conditional syntax (shown here applied to an initialization script /sw/bin/init.sh).

[ -r /sw/bin/init.sh ] && source /sw/bin/init.sh

Command cd returns an error, but ls executes anyway, listing the current directory.

To avoid executing the ls command when the cd command fails, we use the following trick, which relies on the fact that cd returns TRUE when it succeeds and FALSE when it fails.

$ cd fred && ls
-bash: cd: fred: No such file or directory
$

Use the [[ Keyword

Bash provides a relatively new way of specifying a conditional expression, called an extended conditional expression. It uses the syntax [[...]] instead of [...] and is compatible with the older form. It is, in fact, a keyword like if and while, not a command like [ and cd, and suffers fewer limitations. It also uses the more friendly syntax && and || for AND and OR. We may type a conditional expression such as

$ if [[ "$allow" = "yes" || "$user" = "root" ]]; ¬
    then echo "OK"; fi

Extended conditional expressions also spare you the trouble of escaping any parentheses they contain.

$ if [[ ("$allow" = "yes") || ("$user" = "root") ]]; ¬
   then echo "OK"; fi

Note

Note

The [[...]] construct was introduced in Bash 2.02.

Beware, however, that bare numbers within extended conditionals are treated as strings—text sequences without numerical value. You might be tempted to use this expression in the belief that Bash is employing integer arithmetic when evaluating the expression

$ if [[ (3 < 5) ]]; then echo "OK"; fi
OK

Tip

Tip

To find out more about the [[...]] construct, type

$ help [[

or check the Bash man page by typing

$ man bash

and then type /[[ exp within the man page.

But it’s not, as we can see by this example.

$ if [[ (3 < 15) ]]; then echo "OK"; fi
$

Use Bash Integer Conditions

When writing conditional expressions that involve integer values, use the Bash ((...)) construct. Like [[...]], it uses C language–like syntax. Although [[...]] is for general conditions, ((...)) operates only on integer values and variables.

Here’s an example.

$ v1=3; v2=2
$ if (($v1 < $v2)); then echo "yes"; else echo "no"; fi
no

You may omit the $ normally required for variable expansion.

$ if ((v1 < v2)); then echo "yes"; else echo "no"; fi
no

Tip

Tip

To find out more about the ((...)) construct, check the Bash man page by typing

$ man bash

and then

/^ARITHMETIC EVALUATION within the man page.

The Bash ((...)) construct provides a more friendly syntax by employing && and ||, unescaped parentheses, and < in place of -lt. We could write

$ if (((a < b) && (b < c))); then...

or

$ while ((length!=0))
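
As a complete, runnable illustration of the second form, here's a trivial countdown (the variable n is ours, chosen for the example).

$ n=3
$ while ((n != 0)); do echo $n; ((n--)); done
3
2
1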

Debug Your Scripts

“My script doesn’t work, and I can’t figure out where it’s going wrong. How do I debug it?”

This project looks at some useful attributes provided by the Bash shell to help in debugging a script. Projects 9 and 10 introduce the basics of shell scripting. Project 45 covers Bash shell attributes.

Learn More

Learn More

Project 45 covers Bash shell attributes.

Tip

Tip

If you write example scripts to try the debugging techniques, be sure to make the script executable (see Project 9).

Use Bash Attributes to Troubleshoot

Suppose that you’ve just written a 100-line script. You run it, and it goes horribly wrong. It’s time to start debugging. Bash provides several shell attributes to aid debugging. These attributes are normally switched off and are activated by the built-in set command. Multiple switches can be placed in a script so that attributes can be turned on and off selectively within specified sections.

To switch on an attribute— nounset, for example—type

$ set -o nounset

To switch off an attribute, type

$ set +o nounset

We’ll write a simple shell script, complete with a couple of errors, to demonstrate some debugging techniques. Here’s our script.

$ cat debug-me
#!/bin/bash
# debugging

set -o noexec
echo "Calculate the total cost"
price=12; quantity=10
total=((price*quantity))
echo "The total is $totl"

You’ll notice on line 4 the statement set -o noexec, which sets the noexec attribute. This attribute instructs Bash to parse the script and check it for syntax errors, but without actually executing it. Setting noexec is probably the first step in testing a new script. We are able to eliminate syntax errors quickly, without ever executing the script (and potentially doing some harm if the script goes wrong).

Now let’s test-run the script and check it for syntax errors.

$ ./debug-me
./debug-me: line 7: syntax error near unexpected token `('
./debug-me: line 7: `total=((price*quantity))'

Note

Note

Don’t set an attribute on the command line and expect it to carry through to a shell script. A script is executed by a new instance of the shell that does not inherit attributes set in the parent shell.

One syntax error is reported; we'll correct it by changing line 7 to read

total=$((price*quantity))

(This is the correct syntax for integer arithmetic evaluation.)

We’ll run the script again, having removed the line that sets noexec.

$ ./debug-me
Calculate the total cost
The total is

Tip

Tip

Set the noexec attribute as an easy way to comment out the tail end of a script as you debug it. Place the command in the script and move it farther down as you progressively verify more statements.

Another bug has surfaced. Our variable total, which is supposed to hold the calculated total, seems not to do so. A rich area for bug catching is that of misspelled variable names. One way we can catch such errors is to request that Bash disallow the reading of unset variables. Near the top of the file, add the line

set -o nounset

(This reads no unset, not noun set.) Run the script again.

$ ./debug-me
Calculate the total cost
./debug-me: line 8: totl: unbound variable

Tip

Tip

Set the nounset attribute in your scripts as a matter of course. Occasionally, you’ll want to do what it disallows; in these occasions, simply remove it or, better still, switch it off and back on again around the statements you want to exempt. To switch it off, specify +o instead of -o to the set command.
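
Here's a minimal sketch of that off/on pattern; the variable optional_arg is made up for illustration.

set -o nounset
# ... statements that should fail fast on unset variables ...
set +o nounset
echo "Optional argument: $optional_arg"  # may legitimately be unset
set -o nounset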

That bug was easily spotted. After a quick correction, the script works.

$ ./debug-me
Calculate the total cost
The total is 120

Trace Script Execution

Our second example script contains some very simple branching. It’s supposed to check whether an argument has been passed to the script and then print whichever of two messages is appropriate: An argument is required if none was passed or Ok if an argument was passed. Let’s view and then run the script without giving it an argument.

$ cat debug-me2
#!/bin/bash
if [ "$1" = "" ]; then
 echo "Usage: An argument is required"
fi
echo "Ok"
$ ./debug-me2
Usage: An argument is required
Ok

Whoops! It printed both messages. Quantum mechanics aside, it can’t have and not have an argument at the same time. Let’s trace through the script by setting the verbose attribute. Every statement that’s read will be echoed to the Terminal screen. Near the top of the file, add the line

set -o verbose

Now run the script.

$ ./debug-me2
if [ "$1" = "" ]; then
 echo "Usage: An argument is required"
fi
Usage: An argument is required
echo "Ok"
Ok

Following this through, we see each statement echoed as it's read. Interspersed with this debugging output is the actual script output: Usage: An argument is required and Ok.

The problem (which is obvious in such a short script) is that we've omitted the exit statement that should appear just before the end of the if statement. We'll add the missing exit statement and (if it's not too presumptuous) switch off verbose and try again.

$ ./debug-me2
Usage: An argument is required

Now the script works.

Tip

Tip

Set both the xtrace and the verbose options to make it easier to follow a long script through execution. You'll see every statement echoed as it is read, plus each executed statement echoed after expansion and marked with a plus sign.

Display Executed Statements

The shell attribute xtrace provides an alternative tracing facility. Like verbose, it causes statements to be displayed, but unlike verbose, it displays only those statements that are executed. Remember that verbose causes statements to be displayed as they are read, whether they are executed or not. Additionally, xtrace echoes statements after the shell has expanded them, so you see the statements as they will be executed; that can be very useful when debugging. Let's try it out. Near the top of the file, add the line

set -o xtrace

Tip

Tip

Set the xtrace option on the command line to aid the debugging of interactive commands. This technique can be especially useful when debugging shell or alias expansion, because each line is echoed after all the expansion has taken place, and you see exactly what the shell executes.
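
For instance, you might switch it on briefly at the prompt; the filenames echoed here are illustrative and will depend on your directory.

$ set -o xtrace
$ ls *.txt
+ ls letter.txt notes.txt
letter.txt    notes.txt
$ set +o xtrace
+ set +o xtrace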

We’ll run the script twice, first without and then with an argument. Trace statements are shown preceded by a plus symbol.

$ ./debug-me2
+ '[' '' = '' ']'
+ echo 'Usage: An argument is required'
Usage: An argument is required
+ exit
$ ./debug-me2 hello
+ '[' hello = '' ']'
+ echo Ok
Ok

In the trace output, you’ll see the if statement after expansion and the two alternative echo statements.

Exit on Error

To terminate a script when it executes a command that fails, set the exit-on-error attribute, errexit,

set -o errexit

Whenever a command (mkdir or cp, for example) is executed and fails, the script terminates. This attribute relies on a command’s return code. As discussed in Project 81, all commands return a number when they exit; a return code of 0 means success; nonzero return codes indicate errors; and different commands return different numbers depending on the type of error. Check a command’s man page to find out what codes it’s likely to return. This technique can be used to put the brakes on a script during debugging, ensuring that it doesn’t continue after a failed command, executing potentially harmful statements.
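
Here's a minimal sketch; the directory path is deliberately bogus so that mkdir fails and the script stops.

#!/bin/bash
set -o errexit
echo "Starting"
mkdir /no/such/place/fred   # fails, so the script terminates here
echo "Never reached"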

Batch-Process Files

“How do I adapt my scripts to operate on multiple files?”

This project shows you how to write a script that parses a list of filenames and processes each file in the list. It also shows you how to develop wrapper scripts that feed each filename in a list, one at a time, to scripts that accept only single filenames. Projects 9 and 10 cover the basics of shell scripting.

Discover Loops and Wrappers

Suppose that we write a script called action that performs some specified action on a text file. The script takes two option flags: -v for verbose and -a for action followed by an action name. We are not concerned with what the script actually does; it’s presented merely as a vehicle to illustrate how to write a script that processes a list of filenames passed on its command line.

We might call the script to process all the text files in the current directory by typing

$ action -v -a squeeze *.txt

We rely on the shell to expand *.txt into a list of all the .txt files in the current directory. To succeed, our action script must be written to accept any number of files and to act on each file in turn. The script must parse the options, save them, and then loop to process each file listed on the command line.

A second approach sees us writing a general-purpose wrapper script. The wrapper script accepts many filenames and calls a simpler action script a number of times, each time passing it the next filename from the list. Such a wrapper script can also be used on scripts and commands over which we have no control and that do not accept a list of files. Project 58 introduced this technique when it considered how we might batch-edit files. The solutions presented in that project are similar but take advantage of Bash functions. In this project, we write Bash shell scripts.

Process Multiple Files

Let’s jump straight in with a sample script called action, which will take on the functionality described in the previous section.

$ cat action
#!/bin/bash


# This is our main function to process each file
Process () {
  echo "Processing $1, verbose: ${verbose:-n}, ¬
    action: ${action:-none}"
}


# This while loop extracts and remembers each option setting
while getopts "va:" opt; do
  case $opt in
    v) verbose="y";;
    a) action=$OPTARG;;
    *) echo "Usage: ${0##*/} [-v] [-a action] ¬
    filename..."; exit 1;;
  esac
done
shift $((OPTIND-1))


# This for loop processes each filename in turn
for filename in "$@"; do
  Process "$filename"
done
exit 0

Learn More

Learn More

Project 52 explains Bash functions.

The script is written to demonstrate batch-processing techniques. The actual processing is performed in the function Process, appearing at the top of the script. In the example script, this function does nothing more than echo the name of the file it’s supposed to process and its understanding of the options.

Although the script does not perform a real-world task, it serves as a template from which you can build your own scripts.

We assume that the script takes two optional parameters. The first is -v for verbose output; the second is -a for action, followed by an action type. The default values for these options are not verbose and an action of none.

Learn More

Learn More

Project 76 covers parameter expansion. The function Process employs parameter expansion with default values when it echoes the options.
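
If the ${parameter:-default} form is new to you, here is its effect in isolation.

$ unset action
$ echo "action: ${action:-none}"
action: none
$ action=list
$ echo "action: ${action:-none}"
action: list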

Process the Options

A script of any significance will accept options, and it’s not possible to process the list of filenames without knowing where the options end and the filenames start. To ensure that the example script is a useful template, we’ll first show you how to process and save the list of options.

Tip

Tip

Consult the Bash man page or type

$ help getopts

to learn more about getopts.

The while loop processes the options. The code shown here may be used by any script that must parse a list of options. It takes advantage of getopts, a Bash built-in command that processes a script's positional parameters (the arguments passed on its command line), looking for options and their associated arguments. In our example, the string va: in

getopts "va:" opt

tells getopts that we allow the options -v and -a, but no others. The colon following a tells getopts to expect an argument to follow. getopts writes the next option it reads to the variable opt (or whatever variable is named in the command) and any associated argument to the variable OPTARG. We employ a case statement to process each option, setting the variables verbose and action as appropriate. getopts drives the while loop by returning TRUE when an option is found and FALSE when the list of options is exhausted. When the options are exhausted, we expect the list of filenames to follow. The shift statement immediately following the while loop shifts all parameters down such that the first filename is moved to the positional parameter $1 and all the options we've just processed drop off the end. The value of OPTIND is set appropriately by getopts so that this works.

Process the Files

The for loop extracts each filename from the remaining positional parameters, expanding "$@" to be the list of quoted filenames.

Note that the for loop uses "$@", which expands to "$1" "$2" . . ., ensuring that our script is able to cope with filenames that include spaces. Note that if we had used "$*", we’d have generated one long filename: "$1 $2...".
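
You can see the difference at the prompt by setting some positional parameters by hand (set -- does exactly that).

$ set -- "three one.txt" notes.txt
$ for f in "$@"; do echo "[$f]"; done
[three one.txt]
[notes.txt]
$ for f in "$*"; do echo "[$f]"; done
[three one.txt notes.txt]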

The variable filename is assigned the value of the next filename in the list each time around the loop. To process the file, we call the function Process. Remember, the point of this exercise is to write a script that processes a list of filenames; the actual processing performed on a file is incidental.

Here are some examples of what we might see when we run the script.

$ ./action -x
./action: illegal option -- x
Usage: action [-v] [-a action] filename...
$ ./action -v -a
./action: option requires an argument -- a
Usage: action [-v] [-a action] filename...
$ ./action -v -a list
$ ./action -v -a list *.txt
Processing letter.txt, verbose: y, action: list
Processing notes.txt, verbose: y, action: list
Processing three one.txt, verbose: y, action: list

Write a Wrapper Script

For scripts and commands that don’t accept a list of filenames, and perhaps to avoid adding such functionality to your own scripts, write a wrapper script. The script, which we’ll call each, accepts a wildcard pattern, such as *.txt, and a command to execute. It expands the wildcard into a list of filenames and applies the target command to each filename in turn. The target command, therefore, does not have to be written to process a list of filenames.

Here’s our script, in which we assume that the first argument is a wildcard pattern; the remaining arguments form the command to execute and any options it requires.

$ cat each
#!/bin/bash
filetype=$1; shift
for file in $filetype; do
  $* "$file"
done

The first parameter (the wildcard pattern) is saved in the shell variable filetype for use later. The shift operator discards the first parameter, shuffling the remainder down. The for loop processes each file in the expanded wildcard pattern held in filetype (the shell automatically expands this for use, just as it does on the command line) by setting the variable $file to be the next filename in the list each time around the loop. The line that follows expands the remainder of the parameters into the target command and any arguments ($*) and the filename under consideration by the for loop ($file).

$* "$file"

We’ll try out our each script by using it with another script to rename all the text files in a directory, replacing their .txt extensions with .txt.bak. Recall that each simply feeds one file at a time to the target command (or script). The script that does the name-changing is named rename, and it contains just one command. It takes a filename as its only argument and changes the filename by tacking .bak onto the original filename.

mv "$1" "$1.bak"

To create a script that contains this command, simply echo it and redirect output to file rename (after making sure that no file of that name already exists in the working directory); then set execute permissions on the file.

$ echo 'mv "$1" "$1.bak"' > rename
$ chmod +x rename

Before we put each and rename to work, let’s check the files in the current directory. Using wildcard pattern *.txt* with ls ensures that our list will include both normal text files (with extension .txt) and any that have been processed by rename (with extension .txt.bak).

$ ls *.txt*
letter.txt      notes.txt    three one.txt

Type the following to have each call and execute rename.

$ each "*.txt"./rename

Note

Note

You can’t rename all. tx t files to .bak by using a command such as

$ mv *.txt *.bak

because of the way the shell expands wildcard patterns on the command line.

Now run ls again to check the results.

$ ls *.txt*
letter.txt.bak     notes.txt.bak      three one.txt.bak

For our next trick, we’ll remove the extension we just added. This example pairs our each wrapper with a script called unrename, which uses “topping and tailing” strings during parameter expansion—a technique discussed at length in Project 76. In short, the parameter expansion ${1%.*} expands $1 and removes the final dot, and everything that comes after it, from any filename.
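
Before wiring it into a script, you can try the expansion at the prompt; the filename here is just an illustration.

$ f="letter.txt.bak"
$ echo "${f%.*}"
letter.txt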

$ echo 'mv "$1" "${1%.*}"' > unrename
$ chmod +x unrename
$ each "*.bak" ./unrename
$ ls *.txt*
letter.txt      notes.txt       three one.txt

Finally, using each and a new script that applies techniques from rename and unrename, we’ll change the extension of our .txt files to .bak.

$ echo 'mv "$1" "${1%.*}.bak"' > re-rename
$ chmod +x re-rename
$ each "*.txt" ./re-rename
$ ls *.txt*
ls: *.txt*: No such file or directory
$ ls *.bak
letter.bak      notes.bak      three one.bak

All the above are simply examples of what can be done. The each script can be customized to your own preferences and used from the command line or by another script.

Learn More

Learn More

Project 18 shows what you can do with find and xargs.

Recursive Batch Processing

Here’s a simple recursive version of each, which we call reach. It searches a whole directory hierarchy for matching filenames.

$ cat reach
#!/bin/bash
filetype=$1; shift
find . -name "$filetype" -print0 | xargs -0 -n1 $*

To change the extension of all .txt files to .bak, we employ the same renaming technique as re-rename, but use reach to apply the script to all .txt files in the current directory hierarchy.

$ echo 'mv "$1" "${1%.*}.bak"' > rename
$ reach "*.txt"./rename

A Bash and Tcsh Reference

“What’s the correct syntax for . . . ?”

This project looks at the syntax of common shell commands such as variable assignment, redirection, and shell scripting statements. It shows the syntax for Bash and Tcsh—the two shells that are used most often in Mac OS X Unix.

Learn More

Learn More

Project 5 compares the various shell flavors.

Learn More

Learn More

Project 4 covers shell variables and environment variables, and how they differ.

Set Variables

Table 9.2 shows you how to set shell variables and environment variables.

Table 9.2. Setting Variables

Set                                 Bash                          Tcsh
Shell variable                      variable=value                set variable = value
Environment variable                ENVVAR=value; export ENVVAR   setenv ENVVAR value
Environment variable (Bash only)    declare -x ENVVAR=value

Redirection and Pipelining

Table 9.3 shows the syntax employed by both shells to express redirection and pipelining.

Table 9.3. Syntax for Redirection and Pipelining

Redirect or Pipe            Bash                  Tcsh
stdout                      cmd > file            cmd > file
stderr                      cmd 2> file           (cmd > /dev/tty) >& file
stdout appending            cmd >> file           cmd >> file
stderr appending            cmd 2>> file          (cmd > /dev/tty) >>& file
stdout with clobber         cmd >| file           cmd >! file
stderr with clobber         cmd 2>| file          (cmd > /dev/tty) >&! file
Both to same file           cmd &> file           cmd >& file
Both to different files     cmd > out 2> err      (cmd > out) >& err
Merge stdout into stderr    cmd 1>&2              n/a
Merge stderr onto stdout    cmd 2>&1              n/a
stdin                       cmd < file            cmd < file
Pipe stdout                 cmd1 | cmd2           cmd1 | cmd2
Pipe both                   cmd1 2>&1 | cmd2      cmd1 |& cmd2

Learn More

Learn More

Project 6 covers the concepts of redirection and pipelining.

Tee Time

To see the output of a command onscreen and redirect it to a file, use the tee command.

$ ls Sites | tee list.txt
images
index.html
$ cat list.txt
images
index.html

To redirect to multiple files, just type the names of the files as arguments. Apply option -a to append to the output files rather than overwrite them.
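
For example, to append the listing to two files at the same time (list.txt and archive.txt are illustrative names), type

$ ls Sites | tee -a list.txt archive.txt
images
index.html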

Startup Files

The following script files are executed by the Bash shell when it starts up. For login shells (or shells started with the command bash --login), they are

  • /etc/profile

  • ~/.bash_profile

Note

Note

Noninteractive Bash shells do not execute any startup files.

Learn More

Learn More

Project 47 covers the shell startup sequence.

For non-login shells, they are

  • /etc/bashrc (though the Bash manual claims otherwise)

  • ~/.bashrc

The following script files are executed by the Tcsh shell when it starts up. For login shells (or shells started with the command tcsh -l), they are

  • /etc/csh.cshrc

  • /etc/csh.login

  • ~/.tcshrc

  • ~/.login

For non-login shells, they are

  • /etc/csh.cshrc

  • ~/.tcshrc

Control Constructs

Syntax for each Bash and Tcsh control construct is illustrated in the following examples. All the scripts actually work, so you can play around with them.

The if Construct

#!/bin/bash
if [ "$1" = "positive" ]; then
  echo "Yes"
elif [ "$1" = "negative" ]; then
  echo "No"
else
  echo "Not sure"
fi


#!/bin/tcsh
if ("$1" == "positive") then
  echo "Yes"
else if ("$1" == "negative") then
  echo "No"
else
  echo "Not sure"
endif

The case/switch Construct

#!/bin/bash
case "$1" in
  "positive")
    echo "Yes"
  ;;
  "negative")
    echo "No"
  ;;
  *)
    echo "Not sure"
  ;;
esac

#!/bin/tcsh
switch ("$1")
  case "positive":
    echo "Yes"
  breaksw
  case "negative":
    echo "No"
  breaksw
  default:
    echo "Not sure"
  breaksw
endsw

Learn More

Learn More

Project 10 gives examples of control constructs in a shell script.

The for Loop

#!/bin/bash
for word in hello goodbye au-revoir; do
  echo $word
done

#!/bin/tcsh
foreach word (hello goodbye au-revoir)
    echo $word
end

The while Loop

#!/bin/bash
n=0
while [ ! $n = 10 ]; do
  echo $n
  n=$(expr $n + 1)
done


#!/bin/tcsh
set n = 0
while ($n != 10)
  echo $n
  set n = `expr $n + 1`
end

Take Advantage of Subshells

“How do I force a group of commands to execute in their own environment?”

This project discusses the use of subshells: what they are and how you might take advantage of their special features. It also introduces group commands, which are similar to subshells.

Tip

Tip

Use the environment variable SHLVL to discover how deeply nested the current (sub-)shell instance is. Level 1 is the login shell, level 2 is a subshell, level 3 is a subshell launched by the subshell, and so on.
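
For example, starting from a login shell (the numbers you see may differ if Terminal itself nests shells):

$ echo $SHLVL
1
$ bash
$ echo $SHLVL
2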

Subshells

A subshell is a new instance of a shell launched to run a single command, a command list (one or more commands separated by a semicolon), or a shell script.

Learn More

Learn More

Project 4 includes a section explaining shell and environment variables and their respective scopes.

To execute a command list in a subshell, enclose it in parentheses on the command line.

$ (cd /; ls)

This technique produces similar results to executing the command list in the normal manner except for one important difference: Because the command list runs in a new shell instance and not the current shell, it executes in a new environment. Recall that a new shell instance inherits environment variables from the current interactive shell, but not other settings, such as shell variables, attributes, and options. Further, no part of the subshell’s environment is passed back to the parent shell. In our simple example, then, the built-in cd command executed in a subshell can’t change the interactive shell’s current working directory.
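
A quick demonstration at the prompt (your home directory will differ, of course):

$ pwd
/Users/adrian
$ (cd /; pwd)
/
$ pwd
/Users/adrian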

Tip

Tip

A script is run by the executable named in the first line of the shell script—usually, #!/bin/bash. The first line of a script may name any executable, not necessarily a shell. Here’s a (pointless) illustration.

$ cat myecho
#!/bin/echo
$ ./myecho Hello there!
./myecho Hello there!

Create Local Blocks of Code

The previous section explained that subshells execute in their own environments. We can take advantage of this when writing shell scripts. Enclosing a section of script code within parentheses, so that it executes in its own subshell, lets us set local shell variables and attributes that apply only to the enclosed code block. Such settings are not visible outside the code block and do not affect the remainder of the script when the code block has completed executing.

Learn More

Learn More

Project 82 talks about the noexec attribute.

Learn More

Learn More

Project 86 shows you how to use a subshell to limit the scope of a signal handler in a Bash script.

Here’s a neat trick that uses a subshell to localize shell attributes. Suppose that we need to comment out a section of code and choose to use the noexec attribute to do so. Here’s our first attempt, which doesn’t work; line 3 is never echoed.

$ cat block-eg
#!/bin/bash
echo line 1
set -o noexec # switch off execution to comment out
echo line 2
set +o noexec # switch execution back on
echo line 3
$ ./block-eg
line 1

The reason for the script’s failure lies in the fact that the set +o noexec statement is never executed; we just switched off execution, and this includes execution of the built-in set command.

We get around this problem by placing the code to comment out in a subshell. Shell attributes set in a subshell—set -o noexec, in this example—are not passed back to the parent shell, so we don’t need to turn execution back on. Clever!

$ cat block-eg
#!/bin/bash
echo line 1
(set -o noexec
echo line 2)
echo line 3
$ ./block-eg
line 1
line 3

Selectively Redirect Input and Output

We can group commands and apply selective input and output redirection. We might discard the standard error from several commands by writing them as a subshell. This technique averts the necessity to redirect the standard error individually from every command in the group.

$ cat redir-eg
#!/bin/bash
dir=$1; file=$2
( cd $dir
ls $file ) 2> junk
# more-commands...

Here’s another example that uses a subshell to redirect the standard input of a group of commands. The main script reads its input (name and age) from the terminal; the parenthesized section takes its input (code and membership) from the file autodata.

$ cat eg
#!/bin/bash
read -p "Name: " name
( read code
  read membership
  echo "Code: $code, membership: $membership"
) < autodata.txt
read -p "Age: " age
echo "Name: $name, age: $age"

The subshell reads from the file autodata.txt.

$ cat autodata.txt
ABC
123

When we run the script, we provide a name and (false) age.

$ ./eg
Name: Adrian
Code: ABC, membership: 123
Age: 21
Name: Adrian, age: 21

As the script stands, the values read from the file autodata are lost when the subshell completes; the local shell variables code and membership are not passed back to the parent shell. Although this limitation stems from the rather simple example constructed to illustrate subshells, it provides a platform to illustrate some useful tricks.

The next code extract shows you how to pass values back from a subshell to the main shell. New and changed lines are shown in bold.

$ cat eg
#!/bin/bash
declare -a autodata
read -p "Name: " name
autodata=($(
  ( read code
    read membership
    echo "$code $membership"
  ) < autodata.txt
))
read -p "Age: " age
echo "Name: $name, age: $age"
echo "Code: ${autodata[0]}, membership: ${autodata[1]}"

When we run the script, we see that the code and membership values are passed back to, and displayed from, the main script.

$ ./eg
Name: Adrian
Age: 21
Name: Adrian, age: 21
Code: ABC, membership: 123

How does this work? The whole subshell runs as a subcommand (enclosed in $(...)). As with all subcommands, Bash ultimately reads this expression as the value of its output—in this case, the value of subshell variables code and membership, which are echoed by the subshell before it completes. Furthermore, we capture that value (before the subshell disappears) by assigning the output of the subcommand to an array variable, autodata, using the expression

autodata=(value)

Learn More

Learn More

Project 87 covers Bash array variables.

Learn More

Learn More

Project 55 shows how to launch commands to run in the background.

where value is the subshell run as a subcommand.

This example employs a few techniques, and you might have to experiment a little to follow how it works.

Learn More

Learn More

Project 6 covers redirection.

Redirect stderr with Tcsh

The Tcsh shell is not able to redirect standard error independent of standard output. The Bash shell uses the following syntax to redirect only standard error.

cmd 2> file

In the Tcsh shell, we must apply the following trick.

(cmd >/dev/tty) >& file

Note

Note

The syntax to express a group command is quite fussy. The opening brace must be followed by a space, and a semicolon must terminate the last command.

The command is run in a subshell, and standard output is redirected back to the terminal. This has no effect except that the output from the subshell now contains only standard error. Then we specify Tcsh shell syntax to redirect both standard output (there’s none, as it has already been redirected) and standard error to the file file.

Tip

Tip

A group command executes more efficiently than a subshell and is preferable whenever possible.

Group Commands

A group command is like a subshell. To form a group command, enclose a command list, or a section of a shell script, in braces.

{ command; command; ...;}

The difference between a subshell and a group command is that the current shell, not a new instance of the shell, executes a group command. This means that it does not execute in its own local environment, so some of the tricks employed using subshells do not work.
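
One consequence of running in the current shell is that a cd inside a group command does change your working directory, whereas a cd inside a subshell does not (the directories shown are illustrative).

$ pwd
/Users/adrian
$ (cd /Users)
$ pwd
/Users/adrian
$ { cd /Users; }
$ pwd
/Users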

Note

Note

You can think of a group command as being an inline and anonymous (unnamed) function.

Let’s revisit the standard-input example that we used earlier when illustrating subshells. In our new version, we employ a group command instead of a subshell. We need no longer use clever trickery to preserve the value of the variables code and membership, because they are no longer local to the enclosed block of code.

$ cat eg-group
#!/bin/bash
read -p "Name: " name
{ read code
  read membership
} < autodata.txt
read -p "Age: " age
echo "Name: $name, age: $age"
echo "Code: $code, membership: $membership"
$ ./eg-group
Name: Adrian
Age: 21
Name: Adrian, age: 21
Code: ABC, membership: 123

Project 81 shows how Bash operators are used outside a conditional expression to make execution of a second command dependent on the outcome of a first. For example:

$ command1 && command2

The command2 is executed if, and only if, command1 returns TRUE. This technique works because Bash does not evaluate the second part of an AND statement if the first part is FALSE. (There’s no need to; if the first part returns FALSE, the result of the entire AND expression can only ever be FALSE.) This behavior is known as short-circuiting.

If either or both of the commands is itself a command sequence, you must make the sequence a group command for this technique to work.

$ { cmd1; cmd2; ...; } && { cmd3; cmd4; ...;}

Here’s a useful trick that asks for root authentication.

sudo -p "Admin password " echo 2> /dev/null || ¬
    { echo "Incorrect"; exit; }

The code is useful when placed at the start of a script that, later on, issues commands that require root permission obtained via sudo. The sequence will prompt for an administrator's password as soon as the script is invoked. Authentication resulting from a correct password lasts 5 minutes, which is plenty of time for most homemade scripts to run. If authentication fails, the second (group) command displays an error message and the script exits immediately, rather than needlessly running the code that precedes a later sudo command.

Trap and Handle Unix Signals

“How do I catch a signal sent to my Bash shell script?”

This project considers signals such as INT, HUP, and TERM, but from the receiving end. It shows how you might equip your Bash shell scripts with custom signal handlers (also called traps) to catch and handle signals sent to it. Project 40 considers signals from the other direction: how to issue them.

Learn More

Learn More

Refer to Project 40, which lists the various signals and shows how you might send them to a running process by using the kill command.

Understand Signals

Signals are a feature built into Unix. A signal is like an interrupt: Sending a signal to a process causes the process to stop what it’s doing and to respond. Signals are used to tell processes to take a specific action, such as restarting, terminating, or temporarily halting.

Tip

Tip

Signals are frequently used by faceless background programs (daemons) to receive instructions from the user: There’s no other simple means to communicate with them. You may use the same technique with your own background scripts. Project 55 covers background jobs.

Try this example.

$ sleep 1000
wake up
ok, you asked for it
<Control-c>

A running process that’s not responding to keyboard input (wake up typed in the example above) somehow manages to respond when you press Control-c. How does this happen? When any process is launched, it’s accompanied by some special code that manages signals, called a handler. A handler is executed whenever a signal is sent to its process. Some processes supply their own handlers; other processes rely on default handlers Unix automatically attaches as the process is launched.

You may send a signal to a process by using the kill command. A limited number of signals may also be sent by pressing control sequences such as Control-c, which instructs Terminal to send the appropriate signal.

There are many signals, and individual processes can elect to respond to some signals and ignore others. Each process may respond to a signal in its own particular way.

Tip

Tip

List the current settings for Terminal, including keystroke-to-interrupt mappings, by typing

$ stty -e

Catch Signals

Let’s write a short Bash shell script that demonstrates how to catch and handle a specific signal. If you don’t supply your own handlers, a script is launched with a default set of handlers. We’ll override the default handler for a signal called SIGINT (or just INT), which can be sent from the kill command or from Terminal by pressing Control-c.

Tip

Tip

Discover the signals that a command or daemon respects by checking its man page. Search for the section titled “SIGNALS.”

Here’s an example script that loops indefinitely, going dotty.

$ cat signal-eg
#!/bin/bash
trap 'echo "Got INT"' INT
while true; do
  echo -n "."; sleep 1
done

Learn More

Learn More

Refer to Projects 9 and 10 for basic shell scripting.

Normally, we’d be able to terminate the script by pressing Control-c in Terminal. The script catches the INT signal, however, by including the statement

trap 'echo "Got INT"' INT

When the signal arrives, the script simply echoes the text "Got INT" and carries on regardless.

Let’s run the script to see what happens when we press Control-c and when we send an INT signal from kill.

$ ./signal-eg
....^CGot INT          # <--here we typed Control-c (=INT)
......Got INT          # <--here we sent INT using command kill
......Terminated       # <--here we sent TERM using command kill
$

We sent an INT signal by using the line

$ kill -INT $(ps xww | awk '/\.\/signal-eg/{print $1}')

Learn More

Learn More

Project 40 shows you how to identify running processes and send signals to them.

To stop the script, we must send a stronger signal, such as TERM, by typing

$ kill -TERM $(ps xww | awk '/\.\/signal-eg/{print $1}')

To honor the interrupt signal and exit the script, we would change our trap statement to read

trap 'echo "Got INT, bye bye."; exit' INT
$ ./signal-eg
....^C Got INT, bye bye.

Tip

Tip

trap is a Bash built-in command. To learn more about it, and to display a list of signals, type

$ help trap
$ trap -l

Add Multiple Handlers

To add more than one handler, simply add more trap statements. To handle both the INT and TERM signals, for example, we would write

trap 'echo "Got INT, bye bye."; exit' INT
trap 'echo "Got TERM, bye bye."; exit' TERM

If more than one signal requires the same action, you may list them all in a single trap statement.

trap 'echo "Got INT or TERM, bye bye."; exit' INT TERM

Note

Note

The signal KILL cannot be caught, because you can’t override its default handler. This ensures that every process can be terminated by sending a KILL signal.

Write a Complex Handler

If your signal handler is more complex than just a couple of statements, consider having a trap statement call a function. Here’s an example that traps the HUP signal and performs some significant processing upon its receipt.

$ cat signal-eg
handlehup ()
{
  echo "Reloading configuration"
  # more statements here
  echo "Restart complete"
}
trap 'handlehup' HUP
while true; do
  echo -n "."; sleep 1
done

When a HUP signal is received, the function handlehup is called.

$ ./signal-eg
...Reloading configuration # <-- we issued HUP using kill
Restart complete
.......

Tip

Tip

The HUP signal is often interpreted by daemons as a request to reload their configuration settings and restart.

We issue the HUP signal by typing

$ kill -HUP $(ps xww | awk '/\.\/signal-eg/{print $1}')

Learn More

Learn More

Project 52 covers Bash functions.

Trap over a Block of Code

Suppose that you want to trap signals over a critical region of code but not over the whole script. This might be necessary in a script that writes information to a file in several steps, where an interrupt midway through the writing process would result in a half-written file.

We trap signals over the critical period, between opening and closing the file, by executing the critical region of code in a subshell and defining a handler that’s local to the subshell. When the subshell completes, the handler ceases to be defined.

$ cat signal-eg
#!/bin/bash
# critical code - stop interrupts here
(
  trap 'echo "Caught by subshell"' INT
  echo "Critical code"
  a=100000; while ((a!=0)); do ((a--)); done
)
# normal code - allow interrupts from now on
echo "Normal code"
a=100000; while ((a!=0)); do ((a--)); done

To illustrate this, both regions of code have a delay loop to make it possible to interrupt the script before it exits normally. After executing the script and pressing Control-c, we see that during execution of code in the critical region, the INT signal is caught and ignored. Thereafter, the INT signal terminates the script in the usual way.

$ ./signal-eg
Critical code
^CCaught by subshell
^CCaught by subshell
Normal code
^C
$

Tip

Tip

Switch off a handler by giving the trap statement a dash character in place of the handler code.

trap - SIG

This on/off technique can be used to limit a trap to a block of code in preference to using a subshell.
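
A minimal sketch of that approach, recasting the earlier example without a subshell:

#!/bin/bash
# critical code - catch interrupts here
trap 'echo "Caught, carrying on"' INT
echo "Critical code"
a=100000; while ((a!=0)); do ((a--)); done
trap - INT    # restore the default INT handler
# normal code - allow interrupts from now on
echo "Normal code"
a=100000; while ((a!=0)); do ((a--)); done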

Learn More

Learn More

Project 85 covers subshells.

Scripting Tips

“How do I write a function that returns an array of values?”

This project presents several tips that you might find useful when writing Bash shell scripts. It shows you how to declare variables and arrays, perform integer arithmetic, test if a value is numeric, return values from functions, and implement variable variables.

Tip

Tip

Display the names of all integer variables by typing

$ declare -i

To learn more about the declare command, type

$ help declare

Declare Your Variables

This tip has nothing to do with the red channel at Customs. Declaring a variable is a way of telling Bash more about how you are going to use the variable. To declare a variable, use the Bash built-in command declare, followed by a type and variable name. Variable types include integer and array, both of which are described at greater length in this project.

In the next example, we declare the variable count to be an integer variable and perform some simple integer arithmetic, setting and incrementing count. First, by way of comparison, we try the sequence with an undeclared variable, which is taken by Bash to be a general-purpose string variable.

$ s=1
$ s=s+1
$ echo $s
s+1
$ declare -i count=1
$ count=count+1
$ echo $count
2

You may also declare variables read-only and export them as environment variables.
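
For example (the variable names are arbitrary):

$ declare -r answer=42
$ answer=43
-bash: answer: readonly variable
$ declare -x MYVAR=hello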

Bash lets you declare a variable to be an array. An array variable holds many values, each accessed by its ordinal number (index). The following examples illustrate this.

Declare an array, and initialize it by specifying the option -a and listing the values you wish to assign within parentheses, separated by spaces. Enclose in double quotes any values that contain spaces.

$ declare -a products
$ products=(iBook iMac PowerBook PowerMac ¬
    "iPod shuffle" AirPort)

To retrieve a particular value, expand the array variable name, employing the syntax

${array-variable-name[index]}

To display the first value, which has an index of 0, and the fifth value, which has an index of 4, type

$ echo ${products[0]}
iBook
$ echo ${products[4]}
iPod shuffle

Display all values by giving an index of star.

$ echo ${products[*]}
iBook iMac PowerBook PowerMac iPod shuffle AirPort

To display the number of values in the array, type

$ echo ${#products[*]}
6

To display the length, in characters, of the second value in the array, type

$ echo ${#products[1]}
4

To list all values one at a time (one per line), use a for loop.

$ for p in "${products[@]}"; do echo $p; done
iBook
...
iPod shuffle
AirPort
$

Enclosing the expansion in double quotes, and using an index @ (instead of *), ensures that each value is expanded to preserve spaces; without this, iPod shuffle expands into two values.

Learn More

Learn More

The difference between @ and * used as an array index affects expansion of the array in the same way that it affects expansion of positional parameters, explained in “Basic Expansion” in Project 76.

Use Integer Arithmetic Expressions

Bash provides a special syntax for integer arithmetic expressions and comparisons, in which variables are automatically expanded and assumed to have the type integer. Expressions are enclosed within $((...)), and conditions, within ((...)). Within the double parentheses, you may employ expressions very much like those of the C programming language.

Tip

Tip

Learn about the Bash arithmetic expression allowed within ((...)) and $((...)) by typing /^ARITHMETIC EVALUATION within the Bash man page.

Here are a couple of examples.

$ i=7; j=35
$ echo $((i+j))
42
$ if (((i*j) == 245)); then echo "yes"; fi
yes

As a trivial example, we might offer a thousand greetings in the following manner.

$ a=1000; while ((a!=0)); do echo -n "*hello*"; ¬
    ((a--)); done

Test for a Numeric Value

Here’s a handy tip to determine whether a value is numeric. The line beginning with read generates a prompt (Give a number:) and assigns whatever you type to variable num. The line beginning with if tests num to see whether it’s numeric.

$ read -p "Give a number: " num
Give a number: i23
$ if [ "${num//[0-9]/}" ]; then echo "Not numeric"; fi
Not numeric
$ read -p "Give a number: " num
Give a number: 123
$ if [ "${num//[0-9]/}" ]; then echo "Not numeric"; fi

Project 76 explains the parameter expansion techniques we used in this trick.
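
To see why the test works, look at what the expansion leaves behind: it deletes every digit, so anything left over means the input wasn’t purely numeric.

$ num=i23
$ echo "[${num//[0-9]/}]"
[i]
$ num=123
$ echo "[${num//[0-9]/}]"
[]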

Return Arbitrary Values

A Bash shell script or function may return an exit condition of 0 to 255, which is available in the shell special variable $?. To return an arbitrary value, we use the following trick, shown here applied to a Bash function.

The function return-eg returns a string value, simply set to Janet, to illustrate the technique. The function echoes its return value, and the calling script captures that value by calling the function and enclosing it in $(...). This syntax tells Bash to execute the function and replace it with its own output; thus, we assign the return value to the variable name.

$ cat return-eg
return-eg () {
  # processing here
  result="Janet"
  echo $result
}
name=$(return-eg)
echo $name
$ ./return-eg
Janet

Learn More

Learn More

Project 52 covers Bash functions.

If we combine the arbitrary-value trick with Bash array variables, we can write and call a function that returns many values.

$ cat return-eg
return-eg () {
  # processing here
  result="Janet Sophie"
  echo $result
}


declare -a guests
guests=($(return-eg))
for ((i=0; i<${#guests[*]}; i++)); do
  echo "guest $((i+1)) ${guests[i]}"
done
$ ./return-eg
guest 1 Janet
guest 2 Sophie

Variable Variables

Languages such as PHP implement variable variables. If you know what they are and would like to simulate their functionality in Bash, this trick is for you. Here’s an example in which we set and then echo the value of the variable detailsJanet.

$ detailsJanet="Name: Janet Forbes, Country: England"
$ echo $detailsJanet
Name: Janet Forbes, Country: England

Now we try the same exercise, except that the Janet part of the variable is itself held in a variable and, naturally, could be anything.

$ read -p "Give name: " name
Give name: Janet
$ eval "echo $details$name"
Name: Janet Forbes, Country: England

The built-in eval command tells Bash to expand the quoted command sequence and then to execute the expanded text as though it were the original command. The net effect is to expand the line twice before it’s executed. After eval is executed, the command sequence in the example above becomes

echo $detailsJanet

Then this command is executed in the normal manner. If you were to give a different name, such as Sophie, in response to the Give name: prompt, the final statement would evaluate to

echo $detailsSophie