This chapter covers many aspects of shell scripting. In keeping with the spirit of the book, it’s not an A-to-Z tutorial on the subject. Rather, each project tackles a particular technology pertinent to writing shell scripts. The 13 projects cover the following topics:
Bash functions in a script, parameter expansion, here-documents, and script debugging
Regular expressions, both modern (extended) and obsolete (basic)
Shell quoting
Forming conditions for use in conditional expressions
Subshells and command blocks
Traps and signal handlers, and how to implement them in a shell script
If you are not familiar with writing shell scripts, read Projects 9 and 10 for an introduction. See Project 4 for a discussion of shell and environment variables and how they differ in scope. Project 52 covers Bash functions.
The chapter focuses on scripting with Bash, the default shell for accounts created in Mac OS X. If your chosen interactive shell is not Bash, don’t worry; you can still write and use the scripts you’ll find here. Just make Bash execute them by making the first line of each script read #!/bin/bash. See Project 5 for a comparison of shells.
The projects in this chapter are fairly advanced. It’s not a tutorial on writing shell scripts; rather, it presents useful and practical solutions to some of the most common scripting tasks. It’s of most use to those who have grasped the basics of scripting and want to start writing real-world scripts.
“How do I avoid repeating the same piece of code in a shell script?”
This project demonstrates the use of Bash functions in shell scripts. It shows you how to use functions as a way of gathering commonly used code into blocks and demonstrates some handy tricks you can employ in your own code.
If you’re not familiar with Bash functions, refer to Project 52.
In Project 52, we covered the technique of combining command sequences into functions that can be invoked from the command line. Within Bash scripts, functions work much the same way that functions do in other languages, such as JavaScript and C.
When functions are incorporated into a script, they usually are grouped at the top of the file, ahead of the main body of the code. When the script is invoked, Bash reads and parses the functions, which makes them available for use within the actual script. (Functions are not executed when they are parsed—only when they are called by the script.)
Access an argument passed to a shell script from within a function by passing the argument to the function. To access the script’s $2 from within function usage, for example, call usage as follows:

usage "$2" other params...

Within usage, the value of the main script’s $2 can be accessed through the function’s $1.
Like shell scripts, functions accept arguments, and both use the same syntax to refer to them. The first argument passed to a script or function is available in the variable $1; the second, in $2; and the nth, in the variable $n. Bash also provides two special variables: $* expands to a list of all arguments, and $# expands to the total number of arguments passed.
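To see these variables in action, here’s a minimal sketch (the function name show_args is hypothetical):

```shell
# show_args prints the count, the first argument, and the full
# argument list, using $#, $1, and $*
show_args () {
  echo "count=$#"
  echo "first=$1"
  echo "all=$*"
}

show_args alpha beta gamma
```

The same expansions work identically in a script’s main body, where they refer to the script’s own arguments.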
Because of this shared syntax, arguments passed to a script are not directly accessible inside its functions: within a function, the positional parameters refer to the function’s own arguments. The script’s arguments become available again when the function returns and the main body of the script resumes.
One point to be aware of: The variable $0 holds the script name in both the script and its functions. Use the special variable $FUNCNAME to access the name of the current function.
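A quick sketch of the difference (the function name report is hypothetical):

```shell
# Inside a function, $0 still names the script, while $FUNCNAME
# names the function itself
report () {
  echo "running function $FUNCNAME in script $0"
}

report
```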
Most nontrivial shell scripts take arguments, and a well-written script will perform some validation on the arguments it receives. Validation methods can vary widely, depending on the nature of the arguments involved (testing for numbers versus text, for example), but most Unix commands and scripts respond the same way when incorrect arguments are passed: by writing a usage line to the terminal. In a script that does a lot of validation, handling this kind of repetitive task is an ideal candidate for a function.
Suppose that we are writing a script that does a lot of validation. We might write a simple function, called from the many points of validation in our hypothetical script, that displays usage information in the terminal window. Our example function, appropriately named usage, has been taken from a real-world script that creates a new Unix group.
usage () {
  echo "Create a new group"
  echo " Usage: ${0##*/} groupname gid"
  if [ "$*" != "" ]; then echo " Error: $*"; fi
  exit
}
A usage line traditionally displays the name of the script. Use the special variable $0 instead of writing the script name literally. That way, the displayed name always reflects that of the script, even if the script is renamed after it’s written. The special parameter expansion ${0##*/} strips the leading pathname from the script name. If the script is invoked by a command line such as /usr/local/bin/my-script, the expansion yields my-script.
Project 76 covers Bash parameter expansion.
In the new-group script, usage displays an informational message and a usage line, and (optionally) an error message preceded by the text Error:. Because this function is called in response to fatal errors, it also shuts down, or exits, the script. A function that simply completes and returns to the main body of the script should not finish with exit: an exit statement terminates the entire script.
Let’s use our function to report an error when the number of arguments passed to a script is not two. Our script calls the usage function if the wrong number of arguments is passed.
if [ $# -ne 2 ]; then
  usage
fi
To pass an error message to the function, call it like this.
if [ $# -ne 2 ]; then
  usage "Two arguments expected but $# received"
fi
Projects 9 and 10 show you how to write simple Bash shell scripts.
Unix commands usually write error messages to standard error instead of standard output. We can change our usage function to honor this convention by using a redirection trick. Normally, the echo command writes to standard output, but if we merge standard output into standard error by using the notation 1>&2, or the equivalent >&2, all output will be sent to standard error instead. As an example:
echo " Usage: ${0##*/} groupname gid" 1>&2
Project 6 covers the concepts of redirection, standard output, and standard error.
Here’s a handy function to underline a line of text. It accepts a line of text as a single argument, displays the text on a line, and places a line of dashes equal in length to the text on the line below it.
# Function Underline(string-to-underline)
# Display and underline a string.
# $1: the string to underline
Underline () {
  local -i len  # to hold the length of the string
  # write out the string and a '-' for each character
  len=${#1}; echo "$1"
  while ((len!=0)); do echo -n "-"; len=len-1; done; echo
  return 0
}
An often-used convention names functions starting with a capital letter, helping distinguish functions from variables and commands.
Our function, named Underline, assigns the number of characters in parameter 1 to the variable len by using the special notation ${#1}. It then displays the text held in parameter 1 and loops to display the appropriate number of dashes below the text. We employ a few more tricks besides ${#1}. Passing option -n to echo stops it from appending a newline after each dash. Also, we declare len to be a local integer variable in the line

local -i len # to hold the length of the string
A local variable exists only while its defining function executes, which prevents us from accidentally overwriting a variable of the same name in the main script. The option -i makes len an integer variable, allowing us to employ Bash integer expressions such as the condition

while ((len!=0));

which loops for as long as the value of the variable len is not equal to 0, and the arithmetic expression

len=len-1

which subtracts 1 from the value of len.

Project 87 gives tips on declaring variables and Bash integer arithmetic.
Project 81 covers Bash conditions.
We’d call Underline from the main body of the script in the following manner:

Underline "The Title"

yielding

The Title
---------
“How do I perform string manipulation in Bash?”
This project covers the topic of parameter expansion. Parameter expansion is most often used to expand variables and arguments by means of the familiar $ notation: $length or $1. Parameter expansion, however, is more than simply the expansion of a variable or an argument into its value; it also involves manipulation of the value, such as pattern replacement and default initialization.
By now, the basics of parameter expansion are probably familiar. We give a variable a value.
$ title="101 Projects"
Later, we expand the variable to expose its value.
$ echo $title
101 Projects
Bash uses the terms parameter and parameter expansion not only for variables, but also for arguments passed to a script or function. Where Bash refers specifically to arguments such as $1, it uses the terms positional parameter and positional parameter expansion.

Positional parameter expansion works as follows: The first argument passed to a Bash script or function is available in the variable $1; the second, in $2; and the nth, in the variable $n. The special expansion $* expands to a list of all arguments passed, and $# expands to the number of arguments passed.
The special expansion $@ is useful when enclosed in double quotes. To illustrate this, suppose that we pass two arguments to a script, both of which contain spaces.

$ ./tst "param one" "param two"

Whereas both $* and $@ unquoted expand to four items—"param", "one", "param", and "two"—the quoted versions behave differently.

"$*" expands to one item: "param one param two".

"$@" expands more usefully to two items: "param one" and "param two".
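The difference is easy to demonstrate by counting words. This sketch (function names invented for illustration) passes the quoted expansions to a helper that reports $#:

```shell
# count reports how many arguments it receives
count () { echo $#; }

demo () {
  count "$*"   # all arguments joined into one word
  count "$@"   # one word per original argument
}

demo "param one" "param two"   # prints 1, then 2
```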
More complex parameter expansion lets us assign a default value to a parameter or change its value by cutting and replacing portions of its contents. Complex expansion uses the notation
${parameter-name<expansion-type>}
Use the following technique to embed parameter expansion in text that might otherwise be confused with the name of the parameter. To expand an abbreviated day name, where day="Tues", to Tuesday, we type

${day}day
Suppose that we have a script that takes one optional argument. We want to assign the value of argument (parameter) 1 to the variable level, but only if parameter 1 is given a value. If no argument is given when the script is called, we want the value of level set to the text string normal. We can take the conventional, longhand approach and use an if statement.
if [ "$1" = "" ]; then
  level="normal"
else
  level="$1"
fi
Better, we can use the functionally equivalent complex expansion.
level=${1:-"normal"}
An alternate approach, using := instead of :-, also assigns the default value to the parameter itself when the parameter is unset or null. This technique doesn’t apply to positional parameters (arguments), only to variables.
new_level=${level:="normal"}
A third method causes a script to exit if a compulsory argument is not supplied. The following expansion displays an error message and aborts the script if no value is passed in $1; otherwise, it assigns the passed value to the variable level.
$ cat tst
level=${1:?Please supply a value}
If we run the script and fail to supply an argument, it displays the error message and aborts.
$ ./tst
./tst: line 1: 1: Please supply a value
Bash can expand a string variable (a variable that contains text) to a fragment of the text. This process is called slicing. We might expand the variable hi by taking a slice starting from the 8th character (that’s character 7, because the first character is numbered 0) and returning the next 6 characters.

$ echo $hi
Hello, please slice me
$ echo ${hi:7:6}
please
Here’s a practical application of slicing—a script that checks each of its arguments to see whether it’s an option flag (an argument that starts with a - character). This technique is commonly used in scripts for which option flags have been defined. Our example is part of a script that has two legal option flags: -p, which prompts the user for a password; and -v, which sets verbose mode. Any other option will cause the script to exit and report an error to the user.
We want to extract and compare the first character of each argument by slicing a substring one character long, beginning with the first character (character position 0). The following script uses the expansion ${1:0:1} to slice $1, where :0 specifies the start position and :1 specifies the number of characters to extract.
while [ "$1" != "" ]; do
  if [ "${1:0:1}" = "-" ]; then
    case "$1" in
      "-p") stty -echo; read -p "Password:" password
            stty echo; echo;;
      "-v") verbose=yes;;
      *) echo "invalid option $1"; exit
    esac
  else
    echo "Here we process non-option arguments..."
  fi
  shift
done
Use the command stty -echo to stop the user’s input from being echoed to the screen as she types. This is useful when a password or other such sensitive information needs to be input. The command stty echo puts things back to normal.
The script then tests to see whether the character is a dash (-). If so, it issues instructions depending on whether the dash is followed by p, v, or any other character (denoted by *); if not, it writes a message to the screen: Here we process non-option arguments...
The script demonstrates a few other useful techniques, too. It loops, processing each argument in turn. At the end of the loop, it uses the shift command to shift all positional parameters down one place, so $n becomes $(n-1), $2 becomes $1, and the old $1 (which we just processed) drops off the end.
Bash provides a way to remove the head or tail of a string. We’ll illustrate a few useful techniques on a Unix pathname written to the variable fullpath.
$ fullpath="/usr/local/bin/backup.user.sh"
In our first example, we remove the head of the string by specifying a parameter expansion in the form
${parameter##word}
The character combination ## instructs Bash to remove the longest stretch of characters, starting from the left (the start) of the specified parameter, that matches word. We’ll specify word as */, where * is matched by zero or more occurrences of any character and / represents itself. The star symbol is interpreted exactly as it would be for shell globbing. Our pattern, therefore, matches any string of characters from the start of the string, ending with /.
Refer to Project 11 for a full explanation of globbing.
$ echo ${fullpath##*/}
backup.user.sh
You’ll notice that the pattern matched the longest string it could, up to the last /. Try the same command, but type a single # to match the shortest string—up to the first /.
To extract the file extension, we type
$ echo ${fullpath##*.}
sh
To remove characters starting from the right (the end) of the string instead of the left, specify % instead of #. The same convention of % versus %% applies. To remove the extension part (.sh) from fullpath, we require the shortest match, starting from the right (%), for the word .*.
$ echo ${fullpath%.*}
/usr/local/bin/backup.user
Here’s an example script that splits a pathname into its component directories and the filename. We match the shortest string from the left and the longest from the right. Contrast this with the previous two examples. If you can figure out how it works, you’ve got topping and tailing down to a T.
$ cat tst
#!/bin/bash
pathname=${1}"/"
while [ ! -z ${pathname#*/} ]; do
pathname=${pathname#*/}
echo ${pathname%%/*}
done
$ ./tst /usr/local/bin/command
usr
local
bin
command
Bash gives us a means to search a parameter for a pattern, replacing each occurrence of that pattern with a new string. The syntax is
${parameter/match-pattern/replace-pattern}
Here are some examples in which we use the echo command to demonstrate search and replace.

Search for the first occurrence of Hello, and replace it with Goodbye.
$ message="Hello, Hello World"
$ echo ${message/Hello/Goodbye}
Goodbye, Hello World
Only the first occurrence of Hello is replaced: To replace all occurrences, specify a double slash instead of a single slash.
$ echo ${message//Hello/Goodbye}
Goodbye, Goodbye World
To match a pattern that must be at the very start of the string, introduce the search-and-replace expression with the character sequence /#.
$ echo ${message/#Hello/Goodbye}
Goodbye, Hello World
Similarly, to specify that the pattern must be at the end of the string, introduce the search-and-replace expression with the character sequence /%.
$ echo ${message/%World/Earth}
Hello, Hello Earth
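Tail removal and anchored search-and-replace combine naturally. As a sketch, assuming a filename held in a variable f (the names here are invented), we might swap a .sh extension for .bak in either of two ways:

```shell
f="backup.user.sh"
base=${f%.*}           # strip the shortest .* match from the right: backup.user
echo "${base}.bak"     # append the new extension
echo "${f/%.sh/.bak}"  # same result, using /% to anchor the match at the end
```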
“How do I search for text that matches a specific pattern?”
This project shows you how to write regular expressions. A regular expression is formed to match a particular text pattern. Project 78 covers advanced use of regular expressions.
Regular expressions are not the same as globbing (covered in Projects 11 and 12). Globbing is implemented by the shell and by commands such as find, and matches a pattern against a list of filenames—usually, the files in the current directory. Regular expressions are more powerful and are used by text-processing commands to match against lines of text—usually, to search for and replace text.
Regular expressions are widely used in Unix, and most text-processing tools support them. The most common uses include:
Searching a text file for lines containing particular text
Filtering the output from other commands for relevant lines
Performing search and replace in text editors such as nano and TextWrangler, and in text-editing tools such as sed and awk
Performing text manipulation in a programming language such as Perl or PHP
The simplest regular expressions are plain text sequences (such as index.html) that match other instances of themselves. More often, regular expressions contain a mix of wildcards, repetitions, and alternatives.
Unix supports three types of regular expressions, which unfortunately don’t share a compatible syntax. The three forms are modern (also termed extended); obsolete (also termed basic); and Perl regular expressions (introduced by the Perl programming language). This project focuses on extended regular expressions, but a section at the end highlights how extended expressions differ from basic expressions. Perl regular expressions, the most powerful of all, are not generally supported by the Unix tools covered in this book.
Basic regular expressions are supported by the grep and sed commands. Extended regular expressions are supported by the awk command and by the extended variants of grep and sed—namely, egrep (or grep -E) and sed -E.
Refer to Project 23 for examples of using the grep command.
Regular expressions are employed in many of the projects in this book. Read this project to brush up on the theory, and you’ll be ready to apply it in a more practical way to other projects.
Refer to Projects 59 to 62 for more information on the sed and awk commands.
Depending on context, regular-expression matching is performed on a string (a sequence of characters) or a line of text. Matched text cannot span lines but must be wholly contained within one line. Matching is normally done in a case-sensitive manner, but most tools let you specify that matching should be case insensitive.
Remember that the escaping character (backslash) is a special character itself. To use it literally, escape it by typing \\.
Regular expressions are greedy: Given a choice of several possible matches, they always choose the longest one. Consider the text

backup.user.sh

A regular-expression match against “anything followed by dot” will return backup.user.; the shorter match backup. will not be returned.
A regular expression consists of a sequence of atoms and repeaters.
An atom is any of the following:
A character (most characters match themselves)
. (matches any single character)
^ (matches the start of a line or string)
$ (matches the end of a line or string)
[...] (called a bracketed expression; represents exactly one instance from a group of possible characters and is explained more fully later in this project)
A repeater is any of the following:
* (matches zero or more occurrences of the preceding atom)
+ (matches one or more occurrences of the preceding atom)
? (matches zero or one occurrence of the preceding atom)
The syntax is explained by examples in the rest of the project. Project 78 covers advanced regular expressions, extending the syntax shown here.
When you enter a regular expression on the command line, remember that characters such as star have a special meaning to the shell and must be escaped from it. It’s good practice always to surround regular expressions with single quotes.
To match a character such as star (*), which normally has a special meaning, you must escape its special meaning by preceding it with a backslash (\). The special characters that must be escaped in extended regular expressions are

. ^ $ * ? + [ { ( ) | \
Let’s form a very simple regular expression that we might use to match an incomplete crossword entry: a p blank l blank. In regular-expression language, a single-character blank is represented by a dot, so here’s our regular expression.
'ap.l.'
When applied to a list of words, one per line, this expression will match lines that contain apple, apply, and aptly. It will also match lines that contain words such as appliance, pineapple, and inapplicable.
When applied to lines (or long strings) of text, the regular expression 'ap.l.' will match lines such as an apple a day and clap loudly because those lines contain matches. It’s not necessary to match the entire line or string.
A simple method of dry-running a regular expression uses the command egrep (or grep for basic regular expressions). Type

$ egrep 'the-regular-expression'

but give no filename. You can now experiment by typing lines of text, which egrep will read from standard input. Lines that match the regular expression will be echoed back when you press Return; those that don’t, won’t. Press Control-d when you’re finished.
The special symbol caret (^) matches the start of a line or string; it matches a position rather than a character. Repeating our example from the previous section, we find that the regular expression

'^ap.l.'

matches lines that start with ap.l. and won’t match pineapple, inapplicable, or clap loudly.
Similarly, the special symbol dollar ($) matches the end of a line or string, so the regular expression

'ap.l.$'

matches words that end with ap.l. and won’t match appliance or inapplicable.
It’s important to realize that anchoring applies to the whole line (or string), not to individual words. If we pass the line red apple, it will not match ^apple
because caret anchors to the start of the line. It will match the line apple mac. Similarly, apple$
will match red apple but not apple mac.
Pass the -w option to grep to tell it to match only whole words. “apple” would match the string “an apple a day” but not the string “a pineapple a day”.
Finally, we match an entire line or string by applying both anchors. To match only apple, apply, and aptly, use the regular expression
'^ap.l.$'
To search for fixed patterns of text separated by arbitrary text, we must specify any number of any character. We do this by combining the atom dot (.), meaning any character, and the repeater star (*), meaning zero or more repetitions thereof. Here are some examples that use a text file, paren.
$ cat paren
Here is (some text) in parentheses.
Here we have () empty parentheses.
Here we have (a) letter in parentheses.
Here we have no parentheses.
Let’s search for lines that contain anything, including nothing, enclosed in parentheses. To do so, we create a regular expression that means (, followed by anything or nothing, followed by). We must escape the parentheses (and braces, too) because they are special characters (a topic discussed at greater length in Project 78).
$ egrep '\(.*\)' paren
Here is (some text) in parentheses.
Here we have () empty parentheses.
Here we have (a) letter in parentheses.
To exclude the empty parentheses, we specify one or more repetitions of any character by using the special character plus (+) instead of star.
Project 78 shows you how to apply finer control to repeaters and how to repeat constructs that are more complex than a single character.
$ egrep '\(.+\)' paren
Here is (some text) in parentheses.
Here we have (a) letter in parentheses.
To specify zero or one repetitions, we use the special character query (?).
$ egrep '\(.?\)' paren
Here we have () empty parentheses.
Here we have (a) letter in parentheses.
Repeaters can be applied to specific characters as well as to special characters like dot. Here are two regular expressions, the first matching two or more consecutive dashes (-); the second matching star, then one or two dots, and then star.
$ egrep -- '--+' test.txt
$ egrep '\*\.\.?\*' test.txt
The first example uses a trick to prevent the egrep command from treating the regular expression as an option because it begins with a dash. A double-dash option preceding the regular expression signifies that no more options follow. The second example uses the special character \ to escape the star and dot characters.
Repeaters are summarized in “Regular-Expression Syntax” earlier in this project.
To match any digit 0 to 9, or perhaps any letter, we list the alternative characters and have the text match exactly one of those characters. Regular expressions provide bracket expressions for just such a purpose, whereby we list the alternative characters in square brackets. For example, the regular expression
'b[aeiou]g'
matches bag, beg, big, bog, and bug. It does not match byg or boog.
Project 78 shows you how to choose alternatives that are more complex than a single character.
The following regular expression will match any line that starts with a, b, or c (uppercase or lowercase) immediately followed by a two-digit number.
'^[aAbBcC][0123456789][0123456789]'
To match all characters except a particular set, enclose the characters to be excluded in brackets, preceded by a caret (^) symbol. To match any character except a digit, specify the regular expression

'[^0123456789]'
All special characters lose their meaning inside bracketed expressions, where they should not (and in fact cannot) be escaped.
A character range is a bracketed expression with a start point and an end point separated by a dash. Here are some simple examples to illustrate this.
All digits is '[0-9]', equivalent to '[0123456789]'.

All letters is '[a-zA-Z]'.

All letters plus [ ] ^ and - is '[][a-zA-Z^-]'. To clarify, we specify the character set ][a-zA-Z^- enclosed in square brackets.
In the last example, we employed a few tricks to include the special characters [, ], ^, and - in the list. To include a ] character, make it first in the bracketed list (or second when you’re negating the list with a caret symbol). A caret must not be first in the list, and a dash character should be last in the list.
Regular expressions provide special character classes to avoid the need to list many characters in bracketed expressions. To match all letters and digits, for example, we specify the class alnum (alphanumeric). A class name is surrounded by [: :] and enclosed in brackets.
The sequence [[:alpha:]][[:digit:]] differs from [[:alpha:][:digit:]]. The former specifies a letter followed by a digit; the latter specifies either a letter or a digit.
Let’s pose a matching problem and solve it by using character classes. We want to match lines starting with one or more digits, followed by one or more letters, followed by a colon, followed by anything. The line may optionally start with a white space. Here’s an example.
42HHGG: Life, the universe, and everything.
We might describe our matching criteria by using a regular expression such as
'^[[:space:]]*[[:digit:]]+[[:alpha:]]+:'
The regular expression uses the character classes space (any white space, including tab), digit (0-9), and alpha (a-z, A-Z). The rest of the expression is formed with the now-familiar repeaters and anchors.
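To try the expression out, we can feed egrep a couple of test lines (the sample lines here are invented for illustration); only the first line matches:

```shell
# Dry-run the character-class expression against two sample lines
printf '%s\n' ' 42HHGG: Life, the universe, and everything.' \
              'no leading code here' |
  egrep '^[[:space:]]*[[:digit:]]+[[:alpha:]]+:'
```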
The following character classes are defined.
alnum alpha blank cntrl digit graph lower print punct space upper xdigit
To discover exactly which characters are included in a particular class, read the Section 3 man page for the corresponding library function. The library function is named like the class but starts with is. To read about character class [:space:], for example, look at the man page for isspace by typing

$ man 3 isspace
“How do I search for text that matches a specific pattern?”
This project shows you how to write advanced regular expressions. A regular expression is formed to match a particular text pattern. Project 77 introduces regular expressions.
If you’re not familiar with regular expressions, read Project 77, on which this project builds. This project introduces advanced techniques such as:
Repeaters with bounds to state more precisely how many times a preceding atom must repeat
Subexpressions to turn regular expressions into atoms, thereby making them subject to repeaters
Branches to form choices more complex than the simple character alternatives offered by bracket expressions
Project 77 introduced regular expressions and showed you how to use an atom followed by a simple repeater to match multiple occurrences of the specified atom. But the alternatives offered by the simple repeaters *, +, and ? are not always adequate. We can specify a match of one or more letters by using the expression

'[[:alpha:]]+'

but not exactly nine letters or between five and nine letters, inclusive.
To specify a precise number of matches, use a bounded repeater, which has the syntax {n,m}. You can use a bounded repeater wherever you’d otherwise use a simple repeater. We’ll demonstrate the use of bounded repeaters by matching words of a particular length and words that fall within a particular length range. First, let’s use egrep and a regular expression to match words of exactly nine letters. The input file contains a list of words, one per line.
Attempting to specify a repeater such as {,9} to mean 9 or fewer is not legal syntax. Instead, use either {1,9} or {0,9} as appropriate.
To match all nine-letter words, we employ a bounded repeater in a regular expression such as
$ egrep '^[[:alpha:]]{9}$' /usr/share/dict/web2
...
pinealism
pinealoma
pineapple
pinedrops
pinewoods
pinheaded
...
(The file /usr/share/dict/web2 contains a handy word list.)

The syntax element {9} is a bounded repeater that matches exactly nine occurrences of the preceding atom: a letter. Note that we’ve used a caret symbol and a dollar symbol to ensure that the expression matches a complete line; otherwise, the expression would also match a portion of every word more than nine characters long.
To extract all words five to nine characters in length, we supply two comma-separated bounds.
'^[[:alpha:]]{5,9}$'
Whereas the first example matched words like pineapple, this example matches from apple through dappled to pineapple.
To search for nine or more occurrences, supply only the lower bound. The next example matches space-separated numbers of nine or more digits.
' [[:digit:]]{9,} '
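Applied to some sample text (the lines here are invented), only the line with the long number survives:

```shell
# {9,} demands nine or more digits between the surrounding spaces
printf '%s\n' 'ref 123456789 ok' 'ref 1234 short' |
  egrep ' [[:digit:]]{9,} '
```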
By enclosing a regular expression in parentheses, we turn it into an atom (see Project 77). Such an expression is termed a subexpression. A subexpression is seen as a single entity and, therefore, can be made the subject of a repeater.
Here’s an example in which we check for valid IP addresses, which look like 10.0.2.120 or 217.155.168.147. We first construct a regular expression that matches one to three digits, followed by a dot.
'[[:digit:]]{1,3}\.'
Then we turn the regular expression into a subexpression, which allows us to repeat the whole expression three times with a repeater.
'([[:digit:]]{1,3}\.){3}'
Any regular expression enclosed in parentheses becomes a subexpression. A subexpression is an atom and can be treated just like a simple character, which may be incorporated into a new regular expression. The new expression may be enclosed in parentheses and reduced in its turn to an atom. There is no effective limit to this process—at least not until your head starts to hurt!
Finally, we add the original expression to the end, but without the trailing dot. For good measure, we also assume an IP address to be surrounded by nondigit characters. This prevents matching an invalid address such as 1111111.2.3.4444444. Here’s the final regular expression.
'[^[:digit:]]([[:digit:]]{1,3}\.){3}¬ [[:digit:]]{1,3}[^[:digit:]]'
We must extend the definition of an atom given in Project 77 to include a subexpression.
If you try this expression, you’ll notice that it fails on IP addresses that fall at the start or end of a line. We need to delimit an IP address by start of line OR not a digit and not a digit OR end of line. We can achieve this by using branches, introduced in the next section.
Branches define sets of alternative matches. A regular expression may specify one or more branches separated by vertical-bar (|) symbols and will match anything that matches one of the branches. Each branch is itself a regular expression.
Here’s a regular expression with seven branches that matches any one of the days of the week.
'monday|tuesday|wednesday|thursday|friday|saturday|sunday'
This alone is limited, and an attempt to match a full date will not work. The following regular expression, for example, doesn’t do what we probably intended.
'saturday|sunday jan|feb [[:digit:]]{1,2}'
It actually specifies a line that matches any of the three alternatives.
'saturday' OR 'sunday jan' OR 'feb [[:digit:]]{1,2}'.
Don’t get confused by the two meanings of the caret symbol. Outside a bracket expression, it’s a start-of-line anchor, and as the first character inside a bracket expression, it negates the sense of the match.
To get around this problem, we employ subexpressions. Combining multiple branches as subexpressions within larger regular expressions enables complex and highly useful matches. We might use the following to pull out weekend events for January and February from an activities list.
'(saturday|sunday) (jan|feb) ([[:digit:]]{1,2})'
We might match days of the week by using the shorter regular expression
'(mon|tues|wednes|thurs|fri|satur|sun)day'
We’ll conclude our look at branches by completing the IP address-matching example started in the preceding section. Recall that we wanted to delimit an IP address by start of line OR not a digit and not a digit OR end of line. We specify the former by using a two-branch subexpression such as
'(^|[^[:digit:]])'
Here’s the full regular expression, split across three lines for clarity. It should be entered in Terminal on a single line and, obviously, as part of a command.
'(^|[^[:digit:]])¬
([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}¬
([^[:digit:]]|$)'
Suppose that we need to match a particular pattern and that that pattern must occur twice. That’s easy to do; we use the repeater {2}. However, if our requirement is for the text that matched the first time to be repeated verbatim the second time, that’s not so easy. (Imagine a search that’d match Monty Monty and Sugar Sugar but not Monty Python or Sugar Babes.)
To pull off such a trick, we use capture and playback. Whenever a subexpression is matched, the matched string is captured in a buffer. The first string to be captured is held in buffer 1; the second, in buffer 2; and so on. This happens automatically. To replay a buffer, simply specify \1 or \2, and so on.
Here’s an example in which we capture the entire expression and replay it.
'(b[aeiou]g)\1'
This expression will match bigbig and bagbag, but not bigbag. Remember that a pattern is captured only when it’s a subexpression—that is, it’s enclosed in parentheses.
Capture patterns play an important role in search and replace. Editing tools such as sed support the capture-and-playback technique, allowing a pattern captured from the search string to be played back into the replacement string.
Here’s an example in which we process a file that contains information about books. The entry for a book occupies one line in the file (shown split into three shorter lines in this book) and has the following format.
Level: Beginning/Intermediate/Advanced, "101 Projects", CBS Category: Macintosh/Unix, Covers: Mac OS X 10.4 Tiger, Price: $34.99, Author: Mayo.
Our mission, should we choose to accept it, is not so impossible. We must extract the quoted title and price, and report them in the following format.
Cost 34.99 Title 101 Projects
Projects 59 and 61 cover the sed text editor.
To realize this, we match an entire line, capturing the title and price, and replace the line with Cost <price> Title <title>.
Let’s build the regular expression piece by piece. Start with .* to match everything up to the title. Match the title with ".*", and capture it with (".*"). Then match intervening information with .*, and match and capture the price with (\$[0-9]{1,3}\.[0-9]{2}). Note that we escape $ and . because they are special characters. Finally, match the remainder of the line with .*.
The sed command’s syntax for search and replace is
s/search-pattern/replace-pattern/
Our replace pattern is Cost \2 Title \1.
Putting this together, we get the following command.
$ sed -E 's/.*(".*").*(\$[0-9]{1,3}\.[0-9]{2}).*/Cost \2 ¬ Title \1/'
Option -E to sed tells it to switch on extended regular expressions. Let’s try this command, adding a little extra sophistication to display only matching lines with option -n (don’t display input lines) and flag p (display matching lines) placed at the end of the substitute function.
$ sed -En 's/.*(".*").*(\$[0-9]{1,3}\.[0-9]{2}).*¬
/Cost \2 Title \1/p'
Level: Beginning/Intermediate/Advanced, "101 Projects", CBS Category: Macintosh/Unix, Covers: Mac OS X 10.4 Tiger, Price: $34.99, Author: Mayo.
Cost $34.99 Title "101 Projects"
TEST"TITLE"TEST$111.22TEST
Cost $111.22 Title "TITLE"
<Control-d>
$
“How do I use an interactive command in a shell script?”
This project explores the use of here-documents in Bash shell scripts. Here-documents provide an easy way to display multi-line messages. They also offer a means of using interactive commands (that normally take input from Terminal) in a shell script by specifying that input will instead be found embedded in the script.
Project 6 covers the techniques of redirection and pipelining.
Project 21 gives more information on the cat command.
A here-document is a clever Bash feature one can employ in shell scripts. It furnishes a technique for redirecting standard input not from a file or pipe, but from the text of the shell script itself. This is best explained by an example.
To display a sizeable message from a shell script, we could of course use the echo or cat commands to display text stored in a file. Instead, we’ll use cat but supply the text inline as part of the shell script.
The cat command, in the absence of a filename, reads its input from standard input. In the next example, we use a here-document to redirect standard input to be from the text of the shell script.
The following example is taken from a shell script that creates a new Unix group, but for brevity of output, we show only the section that’s of interest to us.
$ cat new-group
#!/bin/bash
cat <<EOS
The script creates a new Unix group within NetInfo
Usage ${0##*/} groupname gid
Neither the group name nor the group id must exist
EOS
$ ./new-group
The script creates a new Unix group within NetInfo
Usage new-group groupname gid
Neither the group name nor the group id must exist
Remember to make the script executable (see Project 9).
Bash parameter expansion is explained in Project 76.
The start of the region to be read as standard input is marked by <<word. The end of the region is marked by a line containing only word (in which even leading and trailing blanks are not permitted). In this example, the cat command reads the text between <<EOS and EOS and displays it on the terminal line.
Using a here-document has several advantages over just displaying the contents of a file. First, the shell script does not need to rely on or know the location of a second file. Second, you’ll notice that the parameter ${0##*/} is expanded. All lines of a here-document are subjected to parameter expansion, command substitution, and arithmetic expansion.
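Here’s a quick sketch, separate from the new-group script, showing all three expansions at work in one here-document (the variable name is made up):

```shell
#!/bin/bash
# Each line of a here-document undergoes parameter expansion,
# command substitution, and arithmetic expansion before cat reads it.
user="saruman"                  # illustrative variable
cat <<EOS
Parameter expansion:  $user
Command substitution: $(uname)
Arithmetic expansion: $((6 * 7))
EOS
```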
We could achieve a similar effect by using the echo command, but here-documents have other advantages and uses, which are demonstrated next.
Nontrivial shell scripts usually employ indentation to highlight their structure and organization. Your here-documents can follow the natural flow of script indentation, without having that indentation reflected in the text they pass via redirection: Just set the indents within the here-documents using Tab characters, instead of spaces. To enable this useful feature, type <<- instead of << at the beginning of the here-document. Here’s an example.
$ cat new-group
#!/bin/bash
cat <<-EOS
	The script creates a new Unix group within NetInfo
	Usage ${0##*/} groupname gid
	Neither the group name nor the group id must exist
	EOS
$ ./new-group
The script creates a new Unix group within NetInfo
Usage new-group groupname gid
Neither the group name nor the group id must exist
Although tabs are stripped, spaces are not. This allows space-driven indentation within the here-document text, as in the example above. If you are in the habit of using spaces to indent your shell scripts, revert to using tabs within a here-document.
If you want to control an interactive command from a shell script, such as ftp to perform a file transfer, use a here-document to supply the command’s input from the text of the script. An interactive command expects to receive its input from standard input (usually, in the form of a human at a keyboard).
Let’s write a shell script that connects to an FTP server and issues three commands (user, ls, and exit) to ftp.
$ cat ftp-eg
ftp -n carcharoth.mayo-family.com <<-EOT
user saruman mypassword
ls
exit
EOT
Here’s what happens—automatically, with no user intervention—when we run the script.
$ ./ftp-eg
Connected to carcharoth.mayo-family.com.
220 carcharoth.mayo-family.com FTP server ready.
331 Password required for saruman.
...
150 Opening ASCII mode data connection for '/bin/ls'.
total 1
drwxr-xr-x 4 saruman saruman 136 Jun 10 00:21 Public
drwxr-xr-x 27 saruman saruman 918 Jun 28 13:17 Sites
226 Transfer complete.
221-
Data traffic for this session was 0 bytes in 0 files.
Total traffic for this session was 3573 bytes in 1...
221 Thank you for using FTP on carcharoth.mayo-family.com.
Bash provides a here-string in which the expansion of a variable can be used as standard input. Try the following commands, and compare the results you get from the second and third lines.
$ text="This is a test ¬
of a here-string"
$ cat $text
$ cat <<<$text
The third line is equivalent to
$ echo $text | cat
Here’s a trick in which we use a here-document to form the standard input to a function, read_data, within a script, function-eg. The function requires three pieces of data.
$ cat function-eg
#!/bin/bash
read_data ()
{
    read make
    read model
    read color
}
read_data <<-HEREDOC
	BMW
	3 series
	Blue
	HEREDOC
echo "Make: $make, model: $model, color: $color"
$ ./function-eg
Make: BMW, model: 3 series, color: Blue
“How do I selectively turn off the shell’s interpretation and expansion of special characters?”
This project explores the art of quoting in the Bash shell. It shows how we force Bash to interpret characters literally in situations where they normally would be considered special characters.
The Bash shell expands a command line before the command line is executed. During the expansion phase, all special characters—such as wildcards, redirection symbols, and the dollar symbol used in variable expansion—are interpreted and replaced by their expansion text. To invoke a command and pass it text that includes any of those characters used in their literal senses (as in the strings M*A*S*H and $64,000 Question), the special characters must be quoted or escaped to prevent interpretation.
Before we can employ quoting, we need to know which characters must be quoted. Table 9.1 is a handy reference listing all the special characters, and character combinations, that the shell is likely to interpret.
Table 9.1. Shell Special Characters
Character | Expansion or Interpretation
---|---
# | Introduce a comment
; | Separate commands
{...} (...) | Introduce a command block and subshell
&& \|\| | Logical AND and OR operators (placed between commands)
~ | Home directory
/ | Directory or filename separator
$ | Variable expansion
`...` $(...) | Execute a command and substitute the output
((...)) [[...]] | Evaluate an integer expression and condition
' " \ | Strong quote, weak quote, escape next character
* ? [...] | Globbing
& | Background execution
< > \| | Redirection and pipelining
! | History expansion
Suppose that you want to echo the text
I want $lots
If no quoting is used, $lots will be taken as an instruction to expand the variable lots (which is currently unset). The result would be as follows.
$ echo I want $lots
I want
To prevent the dollar special character from being interpreted, escape it in one of three ways. First, precede it with a backslash.
$ echo I want \$lots
I want $lots
Second, enclose the entire string in single quotes, which are also called strong quotes because no special characters within them are interpreted.
$ echo 'I want $lots'
I want $lots
Third, use double quotes, also called weak quotes because most, but not all, special characters they enclose are escaped. The exceptions are
The dollar symbol in all three forms: $var, $(...), and $((...))
The ! symbol in history expansion
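A short sketch of these rules; note which characters survive literally and which are expanded:

```shell
# Inside double quotes, globbing and tilde expansion are suppressed,
# but all three dollar forms still expand.
lots=1000000
echo "Var: $lots, cmd: $(echo hi), math: $((2 + 3)), literal: * ~"
```

The star and tilde print literally; $lots, $(echo hi), and $((2 + 3)) are all expanded.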
In this example, we cannot use double quotes.
Suppose that you want to echo a line such as
I want $1000000 (a lot of $)
We’ll assume that the number of dollars is not fixed but is held in the variable lots. To illustrate the different forms of quoting, we’ll examine what happens when each form is employed, starting with none.
$ echo I want $$lots (a lot of $)
-bash: syntax error near unexpected token `('
Employing single quotes prevents all forms of expansion.
$ echo 'I want $$lots (a lot of $)'
I want $$lots (a lot of $)
To achieve the intended result, we must employ double quotes, which prevent the parentheses from being interpreted but allow expansion of variable $lots.
$ echo "I want $$lots (a lot of $)"
I want 2324lots (a lot of $)
Closer, but this didn’t quite work. It’s still necessary to escape the first dollar symbol. If we don’t, it attaches itself to the second dollar symbol and causes the shell to expand the special variable $$.
$ echo "I want \$$lots (a lot of $)"
I want $1000000 (a lot of $)
Let’s look at a trickier example. How might we quote this?
$ echo $5 - That's ok
We cannot use double quotes, because we don’t want to interpret $5. Single quotes won’t work either, because the text itself contains a single quote acting as an apostrophe. A first attempt might have us escaping the apostrophe.
$ echo '$5 - That\'s ok'
> Control-c
Don’t be tempted to skip quoting because a command appears to work correctly. If your command attempts to pass the text note.* to grep unquoted, for example, and no matching filenames exist in the current directory, the shell will not expand it. Your unquoted command will work—until the day you create a file with a name such as note.1.
This fails because inside single quotes, no special characters are interpreted, including backslash. Hence, Bash sees the apostrophe as the closing quote and the last single quote as an unterminated open quote.
The simplest method involves converting the expression to two strings enclosed in single quotes, with the (unenclosed) apostrophe between them. Then we escape the apostrophe by using either a backslash or double quotes, as shown in the next two examples.
$ echo '$5 - That'\''s ok'
$5 - That's ok
$ echo '$5 - That'"'"'s ok'
$5 - That's ok
We could also use the following technique where two quoted parts are run consecutively.
$ echo '$5'" - That's ok"
$5 - That's ok
Suppose that we use the awk command to filter field number 4 (written as $4 in awk scripting) from the output of a ps command. We type the following, employing single quotes to escape $4 from the shell because we want it to be interpreted by awk.
$ ps xc | awk '{print $4}'
Suppose now that we want to do the same thing, but using the field number stored in a shell variable called field.
$ field=4
$ ps xc | awk '{print $$field}'
This won’t work, of course. So how do we both allow Bash to expand $field to 4 and escape the first dollar so we pass, literally, $4 to awk?
In this simple example, there are several ways, but you can apply a general solution to almost all quoting problems of this nature. It may seem trivial now, but remember it for the future; I’ve seen many people completely stumped trying to solve quoting dilemmas that are amenable to this particular solution.
We simply start and stop quoted regions as necessary. The first quoted region is '{print $'; the second is '}'. $field is not quoted and, therefore, is expanded by the shell.
$ ps xc | awk '{print $'$field'}'
Although not necessary in this example, the general rule would have quoted each region to prevent problems with spaces in expanded parameters.
$ ps xc | awk '{print $'"$field"'}'
Ensure that you don’t include spaces between the quoted regions.
Let’s write a command that uses grep to search a file for the sequence a*. Here’s our test file.
$ cat file
This line contains a*
This line does not
Project 23 shows how to use the grep command.
Because star is a special character in regular expressions, we must escape it, passing \* to grep. Star and backslash are also special characters to the shell and must be escaped from it too, as \\ and \*.
Projects 39 and 40 explore the ps command in detail.
Therefore, we form the following command.
$ grep a\\\* file
This line contains a*
Here’s a tip that might save much head-scratching. Suppose that we have a command substitution such as
$(ps xc | grep "$target")
The variable $target may expand to include spaces, so we must double-quote it for the grep command to work correctly. If we then use the command substitution as a parameter to another command, we must enclose the whole substitution in double quotes. A naive attempt has us type the following.
$ grep "$(ps xc | grep "$target")" processes.txt
This shouldn’t work, because as we have seen in previous examples, the expression forms two quoted regions: "$(ps xc | grep " and ")". Surprisingly, it does work, because Bash processes a command substitution ($(...)) as an independent syntactical element. It processes $(ps xc | grep "$target") and then considers the outer expression grep "..." processes.txt.
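A reduced sketch, with made-up strings, shows the same behavior without involving ps:

```shell
# The inner double quotes belong to the $(...) substitution, which Bash
# parses as an independent unit; they do not terminate the outer
# double-quoted string.
target="two words"                     # illustrative variable
echo "outer: $(echo "inner: $target")"
```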
“How does Bash interpret conditional expressions?”
This project looks at the many forms of conditional expression supported by Bash. It explains the differences of the forms and compares them with one another. It also presents some handy tricks and gives tips on how to avoid syntax errors and malformed conditions.
Project 10 introduces basic shell scripting techniques and discusses the conditional statements supported by Bash.
Bash supports conditional expressions that are used in conditional statements such as if, while, and until. Here’s an example in which we test whether 5 is less than 7 (we use -lt to mean less than). The condition is enclosed in [...] and evaluates to true or false. Ideally, Bash will find truth in such a condition.
$ if [ 5 -lt 7 ]; then echo "yes"; else echo "no"; fi
yes
Now let’s examine this simple expression in more detail to discover how Bash interprets it, and explore the alternative forms of conditional expression offered by Bash.
There is more to Bash conditional expressions than is at first apparent. Let’s look at how Bash interprets a conditional expression. This is key to understanding the different forms and being able to make the most of them.
When interpreting a conditional statement such as if, Bash does not expect to see a Boolean value (TRUE or FALSE), as other languages do, such as C and PHP. Rather, Bash expects to see an executable command. The syntax is effectively
if command; then...
Within such a command line, Bash executes the command that follows if and replaces it with whatever value the command returns. A return value of 0 is interpreted as TRUE; any other return value is interpreted as FALSE.
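The special parameter $? holds the return value of the last command, so we can watch this mechanism directly; a quick sketch:

```shell
# true returns 0 (TRUE); false returns 1 (FALSE).
true;  echo $?      # 0
false; echo $?      # 1

# if simply runs the command and tests its return value.
if true; then echo "truth found"; fi
```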
In our example statement, you might well ask about the whereabouts of the command Bash requires following if. The answer is a little surprising: Bracket ([) is actually a built-in Bash command. When interpreting a conditional statement such as
if [ 5 -lt 7 ]; then...
Just as for any other command, white space must separate [ and each of its parameters. You’ll get a syntax error, or a conditional expression that evaluates incorrectly, if you omit the white space. The final parameter, ], is required for syntactic completeness (or perhaps aesthetic value).
Bash first executes the bracket command, passing it the four parameters that form the remainder of the statement: 5, -lt, 7, and ]. (The statement is terminated by a semicolon.) The bracket command (not Bash command-line interpretation) evaluates the conditional expression and returns 0 if the statement is true (as it is in this case) and 1 if it is false.
Refer to Project 16 for more information on the type command.
In our example, then, after it has executed the bracket command, Bash effectively sees the statement
if 0; then...
and interprets it as if TRUE; then....
We check the credentials of bracket with the type command.
$ type [
[ is a shell builtin
Equivalent to [ is the test command. The two are identical except that test does not expect to see a closing bracket.
$ if test 5 -lt 4; then echo "yes"; else echo "no"; fi
no
$ type test
test is a shell builtin
To discover all the conditional operators supported by bracket and test, consult Bash’s built-in help command by typing
$ help test
Project 6 covers redirection and pipelining.
Several examples are given in the next section.
Here’s a neat trick. A conditional statement may be given any command, not just [ or test. We could test whether two files differ by directly testing the return value from the diff command.
$ if diff eg1.txt eg2.txt &> /dev/null
> then echo "Same"; else echo "Different"; fi
Same
Most commands return 0 (TRUE) for success or yes and 1 (FALSE) for failure or no. In the diff example, we took the precaution of throwing away all errors and other output by using the redirection &>/dev/null to prevent the shell script from writing unwanted text to the Terminal screen when it executes.
The bracket command has a number of primaries you can use to test file attributes, such as whether a file exists.
$ if [ -e no-file ]; then echo "Exists"; ¬
else echo "No such file"; fi
No such file
or whether you own a particular file.
$ if [ -O eg1.txt ]; then echo "It's mine"; fi
It's mine
Bracket can compare strings for less than, greater than, equality, inequality, and emptiness. The next two examples demonstrate tests for equality and emptiness. The -z primary returns TRUE if the length of the string that follows is 0 (the string is empty).
$ ans=""
$ if [ "$ans" = "yes" ]
> then echo "You agree"; else echo "You disagree"; fi
You disagree
$ if [ -z "$ans" ]; then echo "You didn't reply"; fi
You didn't reply
Integer evaluation is performed as demonstrated in previous examples, using -eq for equality, -ne for inequality, and so on. Type help test for more information.
You may specify more complex conditions by using AND, represented by -a; OR, represented by -o; and NOT, represented by !. We can test whether both the variables ans and default are empty by using the following complex condition.
$ if [ -z "$ans" -a -z "$default" ]
> then echo "I don't know what you want"; fi
Don’t omit the spaces between operators and operands. In the next example, we have omitted the spaces around the = sign.
$ allow=""; user=""
$ if [ "$allow"="yes" -o "$user"="root" ]; ¬
then echo "OK"; fi
OK
Omitting the spaces makes the conditional expression appear to be
[ "non-null-string" -o "non-null-string" ]
This is how it should look and evaluate.
$ if [ "$allow" = "yes" -o "$user" = "root" ]; ¬
then echo "OK"; fi
$
We form expressions that are more complex by employing parentheses to ensure that evaluation occurs in the correct order. Our first attempt does not work.
$ ans="yes"; allow="no"; user="root"
$ if [ "$ans" = "yes" -a ¬
( "$allow" = "yes" -o "$user" = "root" ) ]
-bash: syntax error near unexpected token `('
The syntax error is reported because the parentheses are parameters to the bracket command and must be escaped from the shell, as demonstrated in our next attempt.
$ if [ "$ans" = "yes" -a ¬
\( "$allow" = "yes" -o "$user" = "root" \) ]
> then echo "OK"; fi
OK
It’s fine to escape individual items within conditional expressions by enclosing them in quotes, but enclosing an entire expression in quotes will always cause it to be interpreted as a string with value TRUE—a situation that can produce decidedly undesirable results.
$ if [ "1 -gt 9" ]
> then echo "Odd"; fi
Odd
Compare the next two commands.
$ if [ "$allow" = "yes" -o "$user" = "root" ]; ¬
then echo "OK"; fi
$ if [ "$allow" = "yes" ] || [ "$user" = "root" ]; ¬
then echo "OK"; fi
The difference between the two statements is that in the first example, the built-in bracket command evaluates the whole expression. In the second example, we have two separated bracket commands, and it’s Bash that performs the OR operation, using its own || operator. The two commands are functionally equivalent; which you choose is a matter of personal preference. Bash uses a more friendly and C language–like syntax. It provides OR (||), AND (&&), and NOT (!) operators.
We can employ Bash operators outside a conditional statement. For example:
$ command1 && command2
In such a command, command2 is executed if, and only if, command1 returns TRUE. This technique works because Bash does not evaluate the second part of an AND statement if the first part is FALSE; the result can only ever be FALSE. This behavior is known as short-circuiting. Similarly, we could specify
$ command1 || command2
In this example, command2 is executed if, and only if, command1 returns FALSE.
As a practical example, think of what happens if we type the following command line in a directory where no subdirectory named fred exists.
$ cd fred; ls
-bash: cd: fred: No such file or directory
Desktop Library Music Public
...
To source a shell script if, and only if, it exists and is readable, use the following conditional syntax (shown here applied to an initialization script /sw/bin/init.sh).
[ -r /sw/bin/init.sh ] && source /sw/bin/init.sh
Command cd returns an error, but ls executes anyway, listing the current directory.
To avoid executing the ls command when the cd command fails, we use the following trick, which relies on the fact that cd returns TRUE when it succeeds and FALSE when it fails.
$ cd fred && ls
-bash: cd: fred: No such file or directory
$
Bash provides a relatively new way of specifying a conditional expression, called an extended conditional expression. It uses the syntax [[...]] instead of [...] and is compatible with the older form. It is, in fact, a keyword like if and while, not a command like [ and cd, and suffers fewer limitations. It also uses the more friendly syntax && and || for AND and OR. We may type a conditional expression such as
$ if [[ "$allow" = "yes" || "$user" = "root" ]]; ¬
then echo "OK"; fi
Extended conditional expressions also spare you the trouble of escaping any parentheses they contain.
$ if [[ ("$allow" = "yes") || ("$user" = "root") ]]; ¬
then echo "OK"; fi
Beware, however, that bare numbers within extended conditionals are treated as strings—text sequences without numerical value. You might be tempted to use this expression in the belief that Bash is employing integer arithmetic when evaluating the expression
$ if [[ (3 < 5) ]]; then echo "OK"; fi
OK
To find out more about the [[...]] construct, type
$ help [[
or check the Bash man page by typing
$ man bash
and then type /[[ exp within the man page.
But it’s not, as this example shows.
$ if [[ (3 < 15) ]]; then echo "OK"; fi
$
When writing conditional expressions that involve integer values, use the Bash ((...)) construct. Like [[...]], it uses C language–like syntax. Although [[...]] is for general conditions, ((...)) operates only on integer values and variables.
Here’s an example.
$ v1=3; v2=2
$ if (($v1 < $v2)); then echo "yes"; else echo "no"; fi
no
You may omit the $ normally required for variable expansion.
$ if ((v1 < v2)); then echo "yes"; else echo "no"; fi
no
To find out more about the ((...)) construct, check the Bash man page by typing
$ man bash
and then /^ARITHMETIC EVALUATION within the man page.
The Bash ((...)) construct provides a more friendly syntax by employing && and ||, unescaped parentheses, and < in place of -lt. We could write
$ if (((a < b) && (b < c))); then...
or
$ while ((length!=0))
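Putting the pieces together, here’s a small sketch of a countdown loop driven entirely by ((...)):

```shell
# As a condition, ((...)) is TRUE while the expression is nonzero.
count=3
while ((count > 0)); do
    echo "count is $count"
    ((count = count - 1))   # no $ needed inside ((...))
done
```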
“My script doesn’t work, and I can’t figure out where it’s going wrong. How do I debug it?”
This project looks at some useful attributes provided by the Bash shell to help in debugging a script. Projects 9 and 10 introduce the basics of shell scripting. Project 45 covers Bash shell attributes.
If you write example scripts to try the debugging techniques, be sure to make the script executable (see Project 9).
Suppose that you’ve just written a 100-line script. You run it, and it goes horribly wrong. It’s time to start debugging. Bash provides several shell attributes to aid debugging. These attributes are normally switched off and are activated by the built-in set command. Multiple switches can be placed in a script so that attributes can be turned on and off selectively within specified sections.
To switch on an attribute (nounset, for example), type
$ set -o nounset
To switch off an attribute, type
$ set +o nounset
We’ll write a simple shell script, complete with a couple of errors, to demonstrate some debugging techniques. Here’s our script.
$ cat debug-me
#!/bin/bash
# debugging
set -o noexec
echo "Calculate the total cost"
price=12; quantity=10
total=((price*quantity))
echo "The total is $totl"
You’ll notice on line 4 the statement set -o noexec, which sets the noexec attribute. This attribute instructs Bash to parse the script and check it for syntax errors, but without actually executing it. Setting noexec is probably the first step in testing a new script. We are able to eliminate syntax errors quickly, without ever executing the script (and potentially doing some harm if the script goes wrong).
Now let’s test-run the script and check it for syntax errors.
$ ./debug-me
./debug-me: line 7: syntax error near unexpected token `('
./debug-me: line 7: `total=((price*quantity))'
Don’t set an attribute on the command line and expect it to carry through to a shell script. A script is executed by a new instance of the shell that does not inherit attributes set in the parent shell.
One syntax error is reported; we’ll correct it by changing line 7 to
total=$((price*quantity))
(This is the correct syntax for integer arithmetic evaluation.)
We’ll run the script again, having removed the line that sets noexec.
$ ./debug-me
Calculate the total cost
The total is
Set the noexec attribute as an easy way to comment out the tail end of a script as you progressively debug it. Place the command in the script and relocate it downward as you verify more statements.
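Here’s a sketch of the technique using a throwaway script; only the section above the set -o noexec line executes:

```shell
# Build and run a demonstration script: statements above 'set -o noexec'
# execute; statements below it are parsed but not executed.
demo=$(mktemp)
cat > "$demo" <<'EOF'
echo "this section has been debugged and runs"
set -o noexec
echo "this section is only checked for syntax errors"
EOF
bash "$demo"    # prints only the first message
rm -f "$demo"
```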
Another bug has surfaced. Our variable total, which is supposed to hold the calculated total, seems not to do so. A rich area for bug catching is that of misspelled variable names. One way we can catch such errors is to request that Bash disallow the reading of unset variables. Near the top of the file, add the line
set -o nounset
(This reads no unset, not noun set.) Run the script again.
$ ./debug-me
Calculate the total cost
./debug-me: line 8: totl: unbound variable
Set the nounset
attribute in your scripts as a matter of course. Occasionally, you’ll want to do what it disallows; in these occasions, simply remove it or, better still, switch it off and back on again around the statements you want to exempt. To switch it off, specify +o
instead of -o
to the set
command.
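As a quick sketch of that toggle (the variable name maybe_unset is purely illustrative):

```shell
#!/bin/bash
set -o nounset                   # strict mode: reading an unset variable aborts the script

set +o nounset                   # exempt the next statement
echo "value: ${maybe_unset}"     # expands to empty instead of aborting
set -o nounset                   # strict mode back on
```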
That bug was easily spotted. After a quick correction, the script works.
$ ./debug-me
Calculate the total cost
The total is 120
Our second example script contains some very simple branching. It’s supposed to check whether an argument has been passed to the script and then print whichever of two messages is appropriate: An argument is required
if none was passed or Ok
if an argument was passed. Let’s view and then run the script without giving it an argument.
$ cat debug-me2
#!/bin/bash
if [ "$1" = "" ]; then
    echo "Usage: An argument is required"
fi
echo "Ok"
$ ./debug-me2
Usage: An argument is required
Ok
Whoops! It printed both messages. Quantum mechanics aside, it can’t have and not have an argument at the same time. Let’s trace through the script by setting the verbose
attribute. Every statement that’s read will be echoed to the Terminal screen. Near the top of the file, add the line
set -o verbose
Now run the script.
$ ./debug-me2
if [ "$1" = "" ]; then
echo "Usage: An argument is required"
fi
Usage: An argument is required
echo "Ok"
Ok
Following this through, we see each statement echoed as it’s read. Interspersed with this debugging output is the actual script output Usage: An argument is required
and Ok
.
The problem (which is obvious in such a short script) is that we’ve omitted the exit
statement from just before the end of the if
statement. We’ll add the missing exit
statement and (if it’s not too presumptuous) switch off verbose
and try again.
$ ./debug-me2
Usage: An argument is required
Now the script works.
The shell attribute xtrace
provides an alternative tracing facility. Like verbose
, it causes statements to be displayed, but unlike verbose
, it displays only those statements that are executed. Remember verbose
causes statements to be displayed as they are read, whether they are executed or not. Additionally, xtrace
echoes statements after the shell has expanded them, so you see the statements as they will be executed; that can be very useful when debugging. Let’s try it out. Near the top of the file, add the line
set -o xtrace
Set the xtrace
option on the command line to aid the debugging of interactive commands. This technique can be especially useful when debugging shell or alias expansion, because each line is echoed after all the expansion has taken place, and you see exactly what the shell executes.
We’ll run the script twice, first without and then with an argument. Trace statements are shown preceded by a plus symbol.
$ ./debug-me2
+ '[' '' = '' ']'
+ echo 'Usage: An argument is required'
Usage: An argument is required
+ exit
$ ./debug-me2 hello
+ '[' hello = '' ']'
+ echo Ok
Ok
In the trace output, you’ll see the if
statement after expansion and the two alternative echo
statements.
To terminate a script when it executes a command that fails, set the exit-on-error attribute, errexit
,
set -o errexit
Whenever a command (mkdir
or cp
, for example) is executed and fails, the script terminates. This attribute relies on a command’s return code. As discussed in Project 81, all commands return a number when they exit; a return code of 0 means success; nonzero return codes indicate errors; and different commands return different numbers depending on the type of error. Check a command’s man page to find out what codes it’s likely to return. This technique can be used to put the brakes on a script during debugging, ensuring that it doesn’t continue after a failed command, executing potentially harmful statements.
“How do I adapt my scripts to operate on multiple files?”
This project shows you how to write a script that parses a list of filenames and processes each file in the list. It also shows you how to develop wrapper scripts that feed each filename in a list, one at a time, to scripts that accept only single filenames. Projects 9 and 10 cover the basics of shell scripting.
Suppose that we write a script called action
that performs some specified action on a text file. The script takes two option flags: -v
for verbose and -a
for action followed by an action name. We are not concerned with what the script actually does; it’s presented merely as a vehicle to illustrate how to write a script that processes a list of filenames passed on its command line.
We might call the script to process all the text files in the current directory by typing
$ action -v -a squeeze *.txt
We rely on the shell to expand *.txt
into a list of all the .txt
files in the current directory. To succeed, our action
script must be written to accept any number of files and to act on each file in turn. The script must parse the options, save them, and then loop to process each file listed on the command line.
A second approach sees us writing a general-purpose wrapper script. The wrapper script accepts many filenames and calls a simpler action
script a number of times, each time passing it the next filename from the list. Such a wrapper script can also be used on scripts and commands over which we have no control and that do not accept a list of files. Project 58 introduced this technique when it considered how we might batch-edit files. The solutions presented in that project are similar but take advantage of Bash functions. In this project, we write Bash shell scripts.
Let’s jump straight in with a sample script called action
, which will take on the functionality described in the previous section.
$ cat action
#!/bin/bash
# This is our main function to process each file
Process () {
echo "Processing $1, verbose: ${verbose:-n}, ¬
action: ${action:-none}"
}
# This while loop extracts and remembers each option setting
while getopts "va:" opt; do
case $opt in
v) verbose="y";;
a) action=$OPTARG;;
*) echo "Usage: ${0##*/} [-v] [-a action] ¬
filename..."; exit 1;;
esac
done
shift $((OPTIND-1))
# This for loop processes each filename in turn
for filename in "$@"; do
Process "$filename"
done
exit 0
Project 52 explains Bash functions.
The script is written to demonstrate batch-processing techniques. The actual processing is performed in the function Process
, appearing at the top of the script. In the example script, this function does nothing more than echo the name of the file it’s supposed to process and its understanding of the options.
Although the script does not perform a real-world task, it serves as a template from which you can build your own scripts.
We assume that the script takes two optional parameters. The first is -v
for verbose output; the second is -a
for action, followed by an action type. The default values for these options are not verbose and an action of none.
Project 76 covers parameter expansion. The function Process
employs parameter expansion with default values when it echoes the options.
A script of any significance will accept options, and it’s not possible to process the list of filenames without knowing where the options end and the filenames start. To ensure that the example script is a useful template, we’ll first show you how to process and save the list of options.
The while
loop processes the options. The code shown here may be used by any script that must parse a list of options. It takes advantage of the Bash built-in function getopts
written to process a script’s positional parameters (the arguments passed on its command line), looking for options and their associated arguments. In our example, the string va: in
getopts "va:" opt
tells getopts
that we allow the options -v
and -a
, but no others. The colon following a tells getopts
to expect an argument to follow. getopts
writes the next option it reads to the variable opt
(or whatever is named in the command) and any associated argument to the variable OPTARG
. We employ a case
statement to process each argument, setting the variables verbose
and action
as appropriate. getopts
drives the while
loop by returning TRUE when an option is found and FALSE when the list of options is exhausted. When the options are exhausted, we expect the list of filenames to follow. The shift
statement immediately following the while
loop shifts all parameters down such that the first filename is moved to the positional parameter $1
and all the options we’ve just processed drop off the end. The value of OPTIND
is set appropriately by getopts
so that this works.
The for
loop extracts each filename from the remaining positional parameters, expanding "$@"
to be the list of quoted filenames.
Note that the for loop
uses "$@"
, which expands to "$1" "$2" . . .
, ensuring that our script is able to cope with filenames that include spaces. Note that if we had used "$*"
, we’d have generated one long filename: "$1 $2..."
.
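A tiny throwaway script makes the difference visible; this sketch echoes each loop iteration in brackets:

```shell
#!/bin/bash
# Call as: ./quote-demo "three one.txt" notes.txt
echo 'Using "$@":'
for f in "$@"; do echo "  [$f]"; done    # one iteration per argument, spaces preserved
echo 'Using "$*":'
for f in "$*"; do echo "  [$f]"; done    # one iteration: all arguments joined
```

Run it with a filename containing a space, and the "$*" version collapses both arguments into a single string.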
The variable filename
is assigned the value of the next filename in the list each time around the loop. To process the file, we call the function Process
. Remember, the point of this exercise is to write a script that processes a list of filenames; the actual processing performed on a file is incidental.
Here are some examples of what we might see when we run the script.
$ ./action -x
./action: illegal option -- x
Usage: action [-v] [-a action] filename...
$ ./action -v -a
./action: option requires an argument -- a
Usage: action [-v] [-a action] filename...
$ ./action -v -a list
$ ./action -v -a list *.txt
Processing letter.txt, verbose: y, action: list
Processing notes.txt, verbose: y, action: list
Processing three one.txt, verbose: y, action: list
For scripts and commands that don’t accept a list of filenames, and perhaps to avoid adding such functionality to your own scripts, write a wrapper script. The script, which we’ll call each
, accepts a wildcard pattern, such as *.txt
, and a command to execute. It expands the wildcard into a list of filenames and applies the target command to each filename in turn. The target command, therefore, does not have to be written to process a list of filenames.
Here’s our script, in which we assume that the first argument is a wildcard pattern; the remaining arguments form the command to execute and any options it requires.
$ cat each
#!/bin/bash
filetype=$1; shift
for file in $filetype; do
$* "$file"
done
The first parameter (the wildcard pattern) is saved in the shell variable filetype
for use later. The shift
operator discards the first parameter, shuffling the remainder down. The for
loop processes each file in the expanded wildcard pattern held in filetype
(the shell automatically expands this for use, just as it does on the command line) by setting the variable $file
to be the next filename in the list each time around the loop. The line that follows expands the remainder of the parameters into the target command and any arguments ($*
) and the filename under consideration by the for loop ($file
).
$* "$file"
We’ll try out our each
script by using it with another script to rename all the text files in a directory, replacing their .txt
extensions with .txt.bak
. Recall that each simply feeds one file at a time to the target command (or script). The script that does the name-changing is named rename
, and it contains just one command. It takes a filename as its only argument and changes the filename by tacking .bak
onto the original filename.
mv "$1" "$1.bak"
To create a script that contains this command, simply echo it and redirect output to file rename
(after making sure that no file of that name already exists in the working directory); then set execute permissions on the file.
$ echo 'mv "$1" "$1.bak"' > rename
$ chmod +x rename
Before we put each
and rename
to work, let’s check the files in the current directory. Using wildcard pattern *.txt*
with ls
ensures that our list will include both normal text files (with extension .txt
) and any that have been processed by rename
(with extension .txt.bak
).
$ ls *.txt*
letter.txt notes.txt three one.txt
Type the following to have each
call and execute rename
.
$ each "*.txt" ./rename
You can’t rename all .txt files to .bak by using a command such as
$ mv *.txt *.bak
because of the way the shell expands wildcard patterns on the command line.
Now run ls
again to check the results.
$ ls *.txt*
letter.txt.bak notes.txt.bak three one.txt.bak
For our next trick, we’ll remove the extension we just added. This example pairs our each
wrapper with a script called unrename
, which uses “topping and tailing” strings during parameter expansion—a technique discussed at length in Project 76. In short, the parameter expansion ${1%.*}
expands $1
and removes the final dot, and everything that comes after it, from any filename.
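A quick sketch of this family of expansions, using an illustrative filename:

```shell
#!/bin/bash
name="notes.txt.bak"
echo "${name%.*}"     # remove the shortest trailing .* pattern: notes.txt
echo "${name%%.*}"    # remove the longest trailing .* pattern: notes
echo "${name##*.}"    # remove the longest leading *. pattern: bak
```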
$ echo 'mv "$1" "${1%.*}"' > unrename
$ chmod +x unrename
$ each "*.bak" ./unrename
$ ls *.txt*
letter.txt notes.txt three one.txt
Finally, using each
and a new script that applies techniques from rename
and unrename
, we’ll change the extension of our .txt
files to .bak
.
$ echo 'mv "$1" "${1%.*}.bak"' > re-rename
$ chmod +x re-rename
$ each "*.txt" ./re-rename
$ ls *.txt*
ls: *.txt*: No such file or directory
$ ls *.bak
letter.bak notes.bak three one.bak
All the above are simply examples of what can be done. The each script can be customized to your own preferences and used from the command line or by another script.
Project 18 shows what you can do with find
and xargs
.
Here’s a simple recursive version of each
, which we call reach
. It searches a whole directory hierarchy for matching filenames.
$ cat reach
#!/bin/bash
filetype=$1; shift
find . -name "$filetype" -print0 | xargs -0 -n1 $*
To change the extension of all .txt files to .bak, we create a rename script with the same extension-swapping command used by re-rename, but use reach to apply the script to all .txt files in the current directory hierarchy.
$ echo 'mv "$1" "${1%.*}.bak"' > rename
$ reach "*.txt" ./rename
“What’s the correct syntax for . . . ?”
This project looks at the syntax of common shell commands such as variable assignment, redirection, and shell scripting statements. It shows the syntax for Bash and Tcsh—the two shells that are used most often in Mac OS X Unix.
Project 5 compares the various shell flavors.
Project 4 covers shell variables and environment variables, and how they differ.
Table 9.2 shows you how to set shell variables and environment variables.
Table 9.3 shows the syntax employed by both shells to express redirection and pipelining.
Table 9.3. Syntax for Redirection and Pipelining
| | Bash | Tcsh |
|---|---|---|
| stdout | `cmd > file` | `cmd > file` |
| stderr | `cmd 2> file` | (not possible directly) |
| stdout appending | `cmd >> file` | `cmd >> file` |
| stderr appending | `cmd 2>> file` | (not possible directly) |
| stdout with clobber | `cmd >\| file` | `cmd >! file` |
| stderr with clobber | `cmd 2>\| file` | (not possible directly) |
| Both to same file | `cmd > file 2>&1` | `cmd >& file` |
| Both to different files | `cmd > out 2> err` | `(cmd > out) >& err` |
| Merge stdout into stderr | `cmd 1>&2` | (not possible directly) |
| Merge stderr onto stdout | `cmd 2>&1` | (not possible directly) |
| stdin | `cmd < file` | `cmd < file` |
| Pipe stdout | `cmd1 \| cmd2` | `cmd1 \| cmd2` |
| Pipe both | `cmd1 2>&1 \| cmd2` | `cmd1 \|& cmd2` |
Project 6 covers the concepts of redirection and pipelining.
To see the output of a command onscreen and redirect it to a file, use the tee
command.
$ ls Sites | tee list.txt
images index.html
$ cat list.txt
images index.html
To redirect to multiple files, just type the names of the files as arguments. Apply option -a
to append to the output files rather than overwrite them.
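For example, this sketch appends the same text to two (illustrative) log files while it also flows to standard output:

```shell
# -a appends to every named file instead of overwriting
printf 'alpha\nbeta\n' | tee -a log1.txt log2.txt
```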
The following script files are executed by the Bash shell when it starts up. For login shells (or shells started with the command bash --login
), they are
/etc/profile
~/.bash_profile
Project 47 covers the shell startup sequence.
For non-login shells, they are
/etc/bashrc
(though the Bash manual claims otherwise)
~/.bashrc
The following script files are executed by the Tcsh shell when it starts up. For login shells (or shells started with the command tcsh -l
), they are
/etc/csh.cshrc
/etc/csh.login
~/.tcshrc
~/.login
For non-login shells, they are
/etc/csh.cshrc
~/.tcshrc
Syntax for each Bash and Tcsh control construct is illustrated in the following examples. All the scripts actually work, so you can play around with them.
#!/bin/bash
if [ "$1" = "positive" ]; then
    echo "Yes"
elif [ "$1" = "negative" ]; then
    echo "No"
else
    echo "Not sure"
fi

#!/bin/tcsh
if ("$1" == "positive") then
    echo "Yes"
else if ("$1" == "negative") then
    echo "No"
else
    echo "Not sure"
endif
#!/bin/bash
case "$1" in
    "positive") echo "Yes" ;;
    "negative") echo "No" ;;
    *) echo "Not sure" ;;
esac

#!/bin/tcsh
switch ("$1")
    case "positive":
        echo "Yes"
        breaksw
    case "negative":
        echo "No"
        breaksw
    default:
        echo "Not sure"
        breaksw
endsw
Project 10 gives examples of control constructs in a shell script.
#!/bin/bash
for word in hello goodbye au-revoir; do
    echo $word
done

#!/bin/tcsh
foreach word (hello goodbye au-revoir)
    echo $word
end
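The examples above cover if, case/switch, and for/foreach; following the same pattern, a Bash while loop looks like this (a sketch, not one of the book’s listings):

```shell
#!/bin/bash
count=1
while [ $count -le 3 ]; do
    echo $count
    count=$((count+1))
done
```

The Tcsh equivalent wraps the body in while (expression) ... end and increments with @ count = $count + 1.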
“How do I force a group of commands to execute in their own environment?”
This project discusses the use of subshells: what they are and how you might take advantage of their special features. It also introduces group commands, which are similar to subshells.
Use the environment variable SHLVL
to discover how deeply nested the current (sub-)shell instance is. Level 1 is the login shell, level 2 is a subshell, level 3 is a subshell launched by the subshell, and so on.
A subshell is a new instance of a shell launched to run a single command, a command list (one or more commands separated by a semicolon), or a shell script.
Project 4 includes a section explaining shell and environment variables and their respective scopes.
To execute a command list in a subshell, enclose it in parentheses on the command line.
$ (cd /; ls)
This technique produces similar results to executing the command list in the normal manner except for one important difference: Because the command list runs in a new shell instance and not the current shell, it executes in a new environment. Recall that a new shell instance inherits environment variables from the current interactive shell, but not other settings, such as shell variables, attributes, and options. Further, no part of the subshell’s environment is passed back to the parent shell. In our simple example, then, the built-in cd
command executed in a subshell can’t change the interactive shell’s current working directory.
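The point is easy to demonstrate; in this sketch, the parenthesized cd affects only the subshell:

```shell
cd /tmp
(cd /; pwd)    # the subshell changes directory and reports /
pwd            # the parent shell is still in /tmp
```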
A script is run by the executable named in the first line of the shell script—usually, #!/bin/bash
. The first line of a script may name any executable, not necessarily a shell. Here’s a (pointless) illustration.
$ cat myecho
#!/bin/echo
$ ./myecho Hello there!
./myecho Hello there!
The previous section explained that subshells execute in their own environments. We can take advantage of this when writing shell scripts. Enclosing a section of script code within parentheses, so that it executes in its own subshell, lets us set local shell variables and attributes that apply only to the enclosed code block. Such settings are not visible outside the code block and do not affect the remainder of the script when the code block has completed executing.
Project 82 talks about the noexec
attribute.
Project 86 shows you how to use a subshell to limit the scope of a signal handler in a Bash script.
Here’s a neat trick that uses a subshell to localize shell attributes. Suppose that we need to comment out a section of code and choose to use the noexec
attribute to do so. Here’s our first attempt, which doesn’t work; line 3
is never echoed.
$ cat block-eg
#!/bin/bash
echo line 1
set -o noexec    # switch off execution to comment out
echo line 2
set +o noexec    # switch execution back on
echo line 3
$ ./block-eg
line 1
The reason for the script’s failure lies in the fact that the set +o noexec
statement is never executed; we just switched off execution, and this includes execution of the built-in set
command.
We get around this problem by placing the code to comment out in a subshell. Shell attributes set in a subshell—set -o noexec
, in this example—are not passed back to the parent shell, so we don’t need to turn execution back on. Clever!
$ cat block-eg
#!/bin/bash
echo line 1
(set -o noexec
echo line 2)
echo line 3
$ ./block-eg
line 1
line 3
We can group commands and apply selective input and output redirection. We might discard the standard error from several commands by writing them as a subshell. This technique averts the necessity to redirect the standard error individually from every command in the group.
$ cat redir-eg
#!/bin/bash
dir=$1; file=$2
( cd $dir
ls $file ) 2> junk
# more-commands...
Here’s another example that uses a subshell to redirect the standard input of a group of commands. The main script reads its input (name
and age
) from the terminal; the parenthesized section takes its input (code
and membership
) from the file autodata
.
$ cat eg
#!/bin/bash
read -p "Name: " name
( read code
read membership
echo "Code: $code, membership: $membership"
) < autodata.txt
read -p "Age: " age
echo "Name: $name, age: $age"
The subshell reads from the file autodata.txt
.
$ cat autodata.txt
ABC
123
When we run the script, we provide a name and a (false) age.
$ ./eg
Name: Adrian
Code: ABC, membership: 123
Age: 21
Name: Adrian, age: 21
As the script stands, the values read from the file autodata
are lost when the subshell completes; the local shell variables code
and membership
are not passed back to the parent shell. Although this limitation stems from the rather simple example constructed to illustrate subshells, it provides a platform to illustrate some useful tricks.
The next code extract shows how to pass values back from a subshell to the main shell.
$ cat eg
#!/bin/bash
declare -a autodata
read -p "Name: " name
autodata=($( ( read code
read membership
echo "$code $membership" ) < autodata.txt ))
read -p "Age: " age
echo "Name: $name, age: $age"
echo "Code: ${autodata[0]}, membership: ${autodata[1]}"
When we run the script, we see that the code
and membership
values are passed back to, and displayed from, the main script.
$ ./eg
Name: Adrian
Age: 21
Name: Adrian, age: 21
Code: ABC, membership: 123
How does this work? The whole subshell runs as a subcommand (enclosed in $(...))
. As with all subcommands, Bash ultimately reads this expression as the value of its output—in this case, the value of subshell variables code
and membership
, which are echoed by the subshell before it completes. Furthermore, we capture that value (before the subshell disappears) by assigning the output of the subcommand to an array variable, autodata
, using the expression
autodata=(value)
Project 87 covers Bash array variables.
Project 55 shows how to launch commands to run in the background.
where value
is the subshell run as a subcommand.
This example employs a few techniques, and you might have to experiment a little to follow how it works.
Project 6 covers redirection.
The Tcsh shell is not able to redirect standard error independent of standard output. The Bash shell uses the following syntax to redirect only standard error.
cmd 2> file
In the Tcsh shell, we must apply the following trick.
(cmd >/dev/tty) >& file
The syntax to express a group command is quite fussy. The opening brace must be followed by a space, and a semicolon must terminate the last command.
The command is run in a subshell, and standard output is redirected back to the terminal. This has no effect except that the output from the subshell now contains only standard error. Then we specify Tcsh shell syntax to redirect both standard output (there’s none, as it has already been redirected) and standard error to the file file
.
A group command is like a subshell. To form a group command, enclose a command list, or a section of a shell script, in braces.
{ command; command; ...;}
The difference between a subshell and a group command is that the current shell, not a new instance of the shell, executes a group command. This means that it does not execute in its own local environment, so some of the tricks employed using subshells do not work.
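This sketch contrasts the two: a variable set in a subshell vanishes when the subshell exits, whereas one set in a group command survives:

```shell
#!/bin/bash
( sub="set in subshell" )          # runs in a new shell instance
{ grp="set in group command"; }    # runs in the current shell
echo "sub=${sub:-unset} grp=${grp:-unset}"
```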
Let’s revisit the standard-input example that we used earlier when illustrating subshells. In our new version, we employ a group command instead of a subshell. We need no longer use clever trickery to preserve the value of the variables code
and membership
, because they are no longer local to the enclosed block of code.
$ cat eg-group
#!/bin/bash
read -p "Name: " name
{ read code
read membership
} < autodata.txt
read -p "Age: " age
echo "Name: $name, age: $age"
echo "Code: $code, membership: $membership"
$ ./eg-group
Name: Adrian
Age: 21
Name: Adrian, age: 21
Code: ABC, membership: 123
Project 81 shows how Bash operators are used outside a conditional expression to make execution of a second command dependent on the outcome of a first. For example:
$ command1 && command2
The command2
is executed if, and only if, command1
returns TRUE. This technique works because Bash does not evaluate the second part of an AND statement if the first part is FALSE. (There’s no need to; if the first part returns FALSE, the result of the entire AND expression can only ever be FALSE.) This behavior is known as short-circuiting.
If either command is or both commands are a command sequence, you must make the sequence a group command for this technique to work.
$ { cmd1; cmd2; ...; } && { cmd3; cmd4; ...;}
Here’s a useful trick that asks for root authentication.
sudo -p "Admin password " echo 2> /dev/null || ¬
{ echo "Incorrect"; exit; }
The code is useful when placed at the start of a script that, later on, issues commands that require root permission obtained via sudo
. The sequence will prompt for an administrator’s password as soon as the script is invoked. Authentication resulting from a correct password lasts 5 minutes, plenty of time for most homemade scripts to run. If authentication fails, the second (group) command displays an error message and the script exits, avoiding needless execution of the code that precedes an internal sudo
command.
“How do I catch a signal sent to my Bash shell script?”
This project considers signals such as INT, HUP
, and TERM
, but from the receiving end. It shows how you might equip your Bash shell scripts with custom signal handlers (also called traps) to catch and handle signals sent to it. Project 40 considers signals from the other direction: how to issue them.
Refer to Project 40, which lists the various signals and shows how you might send them to a running process by using the kill
command.
Signals are a feature built into Unix. A signal is like an interrupt: Sending a signal to a process causes the process to stop what it’s doing and to respond. Signals are used to tell processes to take a specific action, such as restarting, terminating, or temporarily halting.
Signals are frequently used by faceless background programs (daemons) to receive instructions from the user: There’s no other simple means to communicate with them. You may use the same technique with your own background scripts. Project 55 covers background jobs.
Try this example.
$ sleep 1000
wake up
ok, you asked for it
<Control-c>
A running process that’s not responding to keyboard input (wake up
typed in the example above) somehow manages to respond when you press Control-c
. How does this happen? When any process is launched, it’s accompanied by some special code that manages signals, called a handler. A handler is executed whenever a signal is sent to its process. Some processes supply their own handlers; other processes rely on default handlers Unix automatically attaches as the process is launched.
You may send a signal to a process by using the kill
command. A limited number of signals may also be sent by pressing control sequences such as Control-c
, which instructs Terminal to send the appropriate signal.
There are many signals, and individual processes can elect to respond to some signals and ignore others. Each process may respond to a signal in its own particular way.
Let’s write a short Bash shell script that demonstrates how to catch and handle a specific signal. If you don’t supply your own handlers, a script is launched with a default set of handlers. We’ll override the default handler for a signal called SIGINT
(or just INT
), which can be sent from the kill
command or from Terminal by pressing Control-c
.
Discover the signals that a command or daemon respects by checking its man page. Search for the section titled “SIGNALS.”
Here’s an example script that loops indefinitely, going dotty.
$ cat signal-eg
#!/bin/bash
trap 'echo "Got INT"' INT
while true; do
echo -n "."; sleep 1
done
Refer to Projects 9 and 10 for basic shell scripting.
Normally, we’d be able to terminate the script by pressing Control-c
in Terminal. The script catches the INT
signal, however, by including the statement
trap 'echo "Got INT"' INT
This statement simply echoes the text "Got INT"
and carries on regardless.
Let’s run the script to see what happens when we press Control-c
and when we send an INT
signal from kill
.
$ ./signal-eg
....^CGot INT       # <-- here we typed Control-c (=INT)
......Got INT       # <-- here we sent INT using command kill
......Terminated    # <-- here we sent TERM using command kill
$
We sent an INT
signal by using the line
$ kill -INT $(ps xww | awk '/signal-eg/{print $1}')
Project 40 shows you how to identify running processes and send signals to them.
To stop the script, we must send a stronger signal, such as TERM
, by typing
$ kill -TERM $(ps xww | awk '/signal-eg/{print $1}')
To honor the interrupt signal and exit the script, we would change our trap
statement to read
trap 'echo "Got INT, bye bye."; exit' INT

$ ./signal-eg
....^CGot INT, bye bye.
trap
is a Bash built-in command. To learn more about it, and to display a list of signals, type
$ help trap
$ trap -l
To add more than one handler, simply add more trap
statements. To handle both the INT
and TERM
signals, for example, we would write
trap 'echo "Got INT, bye bye."; exit' INT
trap 'echo "Got TERM, bye bye."; exit' TERM
If more than one signal requires the same action, you may list them all in a single trap
statement.
trap 'echo "Got INT or TERM, bye bye."; exit' INT TERM
If your signal handler is more complex than just a couple of statements, consider having a trap
statement call a function. Here’s an example that traps the HUP
signal and performs some significant processing upon its receipt.
$ cat signal-eg
#!/bin/bash
handlehup ()
{
echo "Reloading configuration"
# more statements here
echo "Restart complete"
}
trap 'handlehup' HUP
while true; do
echo -n "."; sleep 1
done
When a HUP
signal is received, the function handlehup
is called.
$ ./signal-eg
...Reloading configuration # <-- we issued HUP using kill
Restart complete
.......
The HUP
signal is often interpreted by daemons as a request to reload their configuration settings and restart.
We issue the HUP
signal by typing
$ kill -HUP $(ps xww | awk '/signal-eg/{print $1}')
Project 52 covers Bash functions.
Suppose that you want to trap signals over a critical region of code but not over the whole script. This might be necessary in a script that writes information to a file in several steps, where an interrupt midway through the writing process would result in a half-written file.
We trap signals over the critical period, between opening and closing the file, by executing the critical region of code in a subshell and defining a handler that’s local to the subshell. When the subshell completes, the handler ceases to be defined.
$ cat signal-eg
#!/bin/bash
# critical code - stop interrupts here
(
trap 'echo "Caught by subshell"' INT
echo "Critical code"
a=100000; while ((a!=0)); do ((a--)); done
)
# normal code - allow interrupts from now on
echo "Normal code"
a=100000; while ((a!=0)); do ((a--)); done
To illustrate this, both regions of code have a delay loop to make it possible to interrupt the script before it exits normally. After executing the script and pressing Control-c
, we see that during execution of code in the critical region, the INT
signal is caught and ignored. Thereafter, the INT
signal terminates the script in the usual way.
$ ./signal-eg
Critical code
^CCaught by subshell
^CCaught by subshell
Normal code
^C
$
Switch off a handler by giving the trap statement a dash character in place of the handler code.
trap - SIG
This on/off technique can be used to limit a trap to a block of code in preference to using a subshell.
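A sketch of the on/off form (here kill -INT $$ stands in for a Control-c arriving mid-script):

```shell
#!/bin/bash
trap 'echo "Caught in critical region"' INT
kill -INT $$            # the handler catches the signal; the script continues
trap - INT              # switch the handler off; default INT behavior restored
echo "Normal code continues"
```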
Project 85 covers subshells.
“How do I write a function that returns an array of values?”
This project presents several tips that you might find useful when writing Bash shell scripts. It shows you how to declare variables and arrays, perform integer arithmetic, test if a value is numeric, return values from functions, and implement variable variables.
Display the names of all integer variables by typing
$ declare -i
To learn more about the declare command, type
$ help declare
This tip has nothing to do with the red channel at Customs. Declaring a variable is a way of telling Bash more about how you are going to use the variable. To declare a variable, use the Bash built-in command declare, followed by a type and variable name. Variable types include integer and array, both of which are described at greater length in this project.
In the next example, we declare the variable count to be an integer variable and perform some simple integer arithmetic, setting and incrementing count. First, by way of comparison, we try the sequence with an undeclared variable, which is taken by Bash to be a general-purpose string variable.
$ s=1
$ s=s+1
$ echo $s
s+1
$ declare -i count=1
$ count=count+1
$ echo $count
2
You may also declare variables read-only and export them as environment variables.
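For instance, a small sketch (the variable names are arbitrary):

```shell
#!/bin/bash
# Sketch: other declare options.
declare -r limit=100       # -r: read-only; later assignments fail
declare -x EDITOR=vi       # -x: exported as an environment variable
echo $limit
bash -c 'echo $EDITOR'     # a child shell sees the exported variable
```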
Bash lets you declare a variable to be an array. An array variable holds many values, each accessed by its ordinal number (index). The following examples illustrate this.
Declare an array, and initialize it by specifying the option -a
and listing the values you wish to assign within parentheses, separated by spaces. Enclose in double quotes any values that contain spaces.
$ declare -a products
$ products=(iBook iMac PowerBook PowerMac ¬
"iPod shuffle" AirPort)
To retrieve a particular value, expand the array variable name, employing the syntax
${array-variable-name[index]}
To display the first value, which has an index of 0, and the fifth value, which has an index of 4, type
$ echo ${products[0]}
iBook
$ echo ${products[4]}
iPod shuffle
Display all values by giving star (*) as the index.
$ echo ${products[*]}
iBook iMac PowerBook PowerMac iPod shuffle AirPort
To display the number of values in the array, type
$ echo ${#products[*]}
6
To display the length, in characters, of the second value in the array, type
$ echo ${#products[1]}
4
To list all values one at a time, use a for loop.
$ for p in "${products[@]}"; do echo $p; done
iBook
...
iPod shuffle
AirPort
$
Enclosing the expansion in double quotes, and using an index @ (instead of *), ensures that each value is expanded to preserve spaces; without this, iPod shuffle expands into two values.
The difference between @ and * used as an array index affects expansion of the array in the same way that it affects expansion of positional parameters, explained in “Basic Expansion” in Project 76.
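A quick sketch of the difference: within double quotes, @ keeps each array value as a separate word, while * joins all values into one word.

```shell
#!/bin/bash
# Sketch: "@" preserves each value as one word; "*" joins them.
declare -a products=("iPod shuffle" AirPort)
for p in "${products[@]}"; do echo "[$p]"; done   # two lines of output
for p in "${products[*]}"; do echo "[$p]"; done   # one joined line
```

The first loop prints [iPod shuffle] and [AirPort]; the second prints the single word [iPod shuffle AirPort].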
Bash provides a special syntax for integer arithmetic expressions and comparisons, in which variables are automatically expanded and assumed to have the type integer. Expressions are enclosed within $((...)), and conditions, within ((...)). Within the double parentheses, you may employ expressions very much like those of the C programming language.
Learn about the Bash arithmetic expressions allowed within ((...)) and $((...)) by typing /^ARITHMETIC EVALUATION within the Bash man page.
Here are a couple of examples.
$ i=7; j=35
$ echo $((i+j))
42
$ if (((i*j) == 245)); then echo "yes"; fi
yes
As a trivial example, we might offer a thousand greetings in the following manner.
$ a=1000; while ((a!=0)); do echo -n "*hello*"; ¬
((a--)); done
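A few more of the C-like operators available inside the double parentheses, sketched below (the variable names are our own):

```shell
#!/bin/bash
# Sketch: C-like operators inside ((...)).
i=7; j=35
((sum = i + j))                 # assignment; no $ needed inside ((...))
((i < j)) && echo "i is smaller"
echo $((sum % 10))              # modulus, as in C
```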
Here’s a handy tip to determine whether a value is numeric. The line beginning with read generates a prompt (Give a number:) and assigns whatever you type to the variable num. The line beginning with if tests num to see whether it’s numeric.
$ read -p "Give a number: " num
Give a number: i23
$ if [ "${num//[0-9]/}" ]; then echo "Not numeric"; fi
Not numeric
$ read -p "Give a number: " num
Give a number: 123
$ if [ "${num//[0-9]/}" ]; then echo "Not numeric"; fi
Project 76 explains the parameter expansion techniques we used in this trick.
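The same test can be wrapped in a function for reuse in a script (the name is_numeric is our own; the test treats an empty string as non-numeric):

```shell
#!/bin/bash
# Sketch: wrap the numeric test in a reusable function.
is_numeric () {
    # numeric if non-empty and nothing remains after removing digits
    [ -n "$1" ] && [ -z "${1//[0-9]/}" ]
}
is_numeric 123 && echo "123 is numeric"
is_numeric i23 || echo "i23 is not numeric"
```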
A Bash shell script or function may return an exit status of 0 to 255, which is available in the shell special variable $?. To return an arbitrary value, we use the following trick, shown here applied to a Bash function.
The function return-eg returns a string value, simply set to Janet, to illustrate the technique. The function echoes its return value, and the calling script captures that value by calling the function and enclosing it in $(...). This syntax tells Bash to execute the function and replace it with its own output; thus, we assign the return value to the variable name.
$ cat return-eg
return-eg () {
    # processing here
    result="Janet"
    echo $result
}
name=$(return-eg)
echo $name
$ ./return-eg
Janet
Project 52 covers Bash functions.
If we combine the arbitrary-value trick with Bash array variables, we can write and call a function that returns many values.
$ cat return-eg
return-eg () {
    # processing here
    result="Janet Sophie"
    echo $result
}
declare -a guests
guests=($(return-eg))
for ((i=0; i<${#guests[*]}; i++)); do
    echo "guest $((i+1)) ${guests[i]}"
done
$ ./return-eg
guest 1 Janet
guest 2 Sophie
Languages such as PHP implement variable variables. If you know what they are and would like to simulate their functionality in Bash, this trick is for you. Here’s an example in which we echo the value of the variable detailsJanet.
$ echo $detailsJanet
Name: Janet Forbes, Country: England
Now we try the same exercise, except that the Janet part of the variable name is itself held in a variable and, naturally, could be anything.
$ read -p "Give name: " name
Give name: Janet
$ eval "echo \$details$name"
Name: Janet Forbes, Country: England
The built-in eval command tells Bash to expand the quoted command sequence and then to execute the expanded text as though it were the original command. The net effect is to expand the line twice before it’s executed. After eval is executed, the command sequence in the example above becomes
echo $detailsJanet
Then this command is executed in the normal manner. If you were to give a different name, such as Sophie, in response to the Give name: prompt, the final statement would evaluate to
echo $detailsSophie
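Bash also offers indirect expansion, ${!var}, which achieves the same effect without eval. A small sketch, reusing the variables from the example above:

```shell
#!/bin/bash
# Sketch: indirect expansion ${!var} avoids eval altogether.
detailsJanet="Name: Janet Forbes, Country: England"
name="Janet"
var="details$name"     # var holds the *name* of the variable we want
echo "${!var}"         # expands to the value of $detailsJanet
```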