IN THIS CHAPTER
In this chapter, you'll learn some more about parameters. Technically, parameters include the arguments passed to a program (the positional parameters), the special shell variables such as $# and $?, and ordinary variables, also known as keyword parameters.
Positional parameters cannot be assigned values directly; however, they can be reassigned values with the set command. Keyword parameters are assigned values simply by writing
variable=value
The format is a bit more general than that shown; actually, you can assign several keyword parameters at once using the format
variable=value variable=value ...
as the following example illustrates:
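A minimal sketch of such an assignment; the variable names and values here are hypothetical and chosen only for illustration:

```shell
# Three keyword parameters assigned by a single command line
first=alpha second=beta third=gamma
echo "$first $second $third"
```

The echo displays alpha beta gamma, showing that all three variables were assigned.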
In the simplest form, to have the value of a parameter substituted, you simply precede the parameter with a dollar sign, as in $i or $9.
If there's a potential conflict caused by the characters that follow the parameter name, you can enclose the name inside curly braces, as in
mv $file ${file}x
This command would add an x to the end of the filename specified by $file and could not be written as
mv $file $filex
because the shell would substitute the value of filex for the second argument.
As mentioned in Chapter 7, “Passing Arguments,” to access positional parameters 10 and above, you must enclose the number inside the curly braces, as in ${11}.
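The difference the braces make can be sketched with two small functions. The function names are hypothetical; inside a function, the positional parameters refer to the function's own arguments:

```shell
eleventh() {
    echo "${11}"      # the eleventh argument
}

first_plus_1() {
    echo "$11"        # the value of $1 followed by a literal 1
}

eleventh     a b c d e f g h i j k
first_plus_1 a b c d e f g h i j k
```

In a POSIX shell the first call displays k and the second displays a1, because without the braces the shell reads $11 as $1 followed by the character 1.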
The construct ${parameter:-value} says to substitute the value of parameter if it is not null, and to substitute value otherwise. For example, in the command line
echo Using editor ${EDITOR:-/bin/vi}
the shell substitutes the value of EDITOR if it's not null, and the value /bin/vi otherwise. It has the same effect as writing
if [ -n "$EDITOR" ]
then
    echo Using editor $EDITOR
else
    echo Using editor /bin/vi
fi
The command line
${EDITOR:-/bin/ed} /tmp/edfile
starts up the program stored in the variable EDITOR (presumably a text editor), or /bin/ed if EDITOR is null.
Here's a simple test of this construct from the terminal:
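A minimal sketch of such a test, assuming EDITOR starts out set to /bin/ed. The values and the helper variable names with_value and with_null are illustrative only:

```shell
EDITOR=/bin/ed
with_value=${EDITOR:-/bin/vi}     # EDITOR is not null, so its value is used
EDITOR=
with_null=${EDITOR:-/bin/vi}      # EDITOR is null, so the default is used
echo "$with_value"
echo "$with_null"
echo "EDITOR is now '$EDITOR'"    # still null: the :- form never assigns
```

The first echo displays /bin/ed and the second /bin/vi; the last shows that EDITOR itself is left null.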
The construct ${parameter:=value} is similar to the last one, except that if parameter is null, value is not only used but also assigned to parameter (note the = in the construct). You can't assign values to positional parameters this way (that is, parameter can't be a number).
A typical use of this construct would be in testing to see whether an exported variable has been set and, if not, setting it to a default value, as in
${PHONEBOOK:=$HOME/phonebook}
This says that if PHONEBOOK is set to some value, leave it alone; otherwise, set it to $HOME/phonebook.
Note that the preceding example could not stand alone as a command because after the substitution was performed the shell would attempt to execute the result:
$ PHONEBOOK=
$ ${PHONEBOOK:=$HOME/phonebook}
sh: /users/steve/phonebook: cannot execute
$
To use this construct as a standalone command, the null command is often employed. If you write
: ${PHONEBOOK:=$HOME/phonebook}
the shell still does the substitution (it evaluates the rest of the command line), yet executes nothing (the null command).
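The effect of the null command together with := can be sketched as follows. The alternative path /tmp/other_book and the helper variables default_book and kept_book are hypothetical, and the expansion is quoted here only so that a path containing spaces is passed as a single argument:

```shell
PHONEBOOK=
: "${PHONEBOOK:=$HOME/phonebook}"   # null, so the default is assigned
default_book=$PHONEBOOK

PHONEBOOK=/tmp/other_book
: "${PHONEBOOK:=$HOME/phonebook}"   # already set, so it is left alone
kept_book=$PHONEBOOK

echo "$default_book"
echo "$kept_book"
```

The first echo displays the phonebook path under your home directory; the second displays /tmp/other_book.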
With the construct ${parameter:?value}, if parameter is not null, the shell substitutes its value; otherwise, the shell writes value to standard error and then exits (don't worry: if it's done from your login shell, you won't be logged off). If value is omitted, the shell writes the message
prog: parameter: parameter null or not set
Here's an example from the terminal:
$ PHONEBOOK=
$ : ${PHONEBOOK:?"No PHONEBOOK file!"}
No PHONEBOOK file!
$ : ${PHONEBOOK:?}          Don't give a value
sh: PHONEBOOK: parameter null or not set
$
With this construct, you can easily check to see whether a set of variables needed by a program are all set and not null, as in
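One way such a check might look is sketched below. The variable names TOOLS and EXPTOOLS are hypothetical, and the check is run inside parentheses only so that a failure ends the parenthesized copy of the shell rather than the shell running the example:

```shell
TOOLS=/usr/local/tools
EXPTOOLS=                 # deliberately null, so the check fails

if ( : "${TOOLS:?}" "${EXPTOOLS:?}" ) 2>/dev/null
then
    check=passed
else
    check=failed
fi
echo "check $check"
```

Because EXPTOOLS is null, this displays check failed. In a real program the line would more likely be written without the parentheses, so that the program itself exits with the error message.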
The construct ${parameter:+value} substitutes value if parameter is not null; otherwise, it substitutes nothing.
$ traceopt=T
$ echo options: ${traceopt:+"trace mode"}
options: trace mode
$ traceopt=
$ echo options: ${traceopt:+"trace mode"}
options:
$
The value part for any of the constructs in this section can be a command substitution; it's executed by the shell only if its value is to be used. In
WORKDIR=${DBDIR:-$(pwd)}
WORKDIR is assigned the value of DBDIR if it's not null; otherwise, the pwd command is executed and the result assigned to WORKDIR. pwd is executed only if DBDIR is null.
The POSIX standard shell provides four parameter substitution constructs that perform pattern matching. Note that some older shells do not support this feature.
In each case, the construct takes two arguments: a variable name (or parameter number) and a pattern. The shell searches through the contents of the specified variable to match the supplied pattern. If the pattern is matched, the shell substitutes the value of the variable on the command line, with the matching portion of the pattern deleted. If the pattern is not matched, the entire contents of the variable are substituted on the command line. In any case, the contents of the variable remain unchanged.
The term pattern is used here because the shell allows you to use the same pattern matching characters that it accepts in filename substitution and case values: * to match zero or more characters, ? to match any single character, [...] to match any single character from the specified set, and [!...] to match any single character not in the specified set.
When you write the construct
${variable%pattern}
the shell looks inside variable to see whether it ends with the specified pattern. If it does, the contents of variable are substituted on the command line with the shortest matching pattern removed from the right.
If you use the construct
${variable%%pattern}
the shell once again looks inside variable to see whether it ends with pattern. This time, however, it removes the longest matching pattern from the right. This is relevant only if the * is used in pattern. Otherwise, the % and %% behave the same way.
The # is used in a similar way to force the pattern matching to occur on the left rather than the right. So, the construct
${variable#pattern}
tells the shell to substitute the value of variable on the command line, with pattern removed from the left.
Finally, the shell construct
${variable##pattern}
works like the # form, only the longest occurrence of pattern is removed from the left.
Remember that in all four cases, no permanent changes are made to the variable itself; you are affecting only what gets substituted on the command line. Also, remember that the pattern matches are anchored. In the case of the % and %% constructs, the variable must end with the specified pattern; in the case of the # and ## constructs, the variable must begin with it.
Here are some simple examples to show how these constructs work:
$ var=testcase
$ echo $var
testcase
$ echo ${var%e}          Remove e from right
testcas
$ echo $var          Variable is unchanged
testcase
$ echo ${var%s*e}          Remove smallest match from right
testca
$ echo ${var%%s*e}          Remove longest match
te
$ echo ${var#?e}          Remove smallest match from left
stcase
$ echo ${var#*s}          Remove smallest match from left
tcase
$ echo ${var##*s}          Remove longest match from left
e
$ echo ${var#test}          Remove test from left
case
$ echo ${var#teas}          No match
testcase
$
There are many practical uses for these constructs, even though these examples don't seem to show it. For example, the following tests to see whether the filename stored inside the variable file ends in the two characters .o:
if [ "${file%.o}" != "$file" ]
then
    # file ends in .o
    ...
fi
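The same idea can be wrapped in a small function, and the % form is also handy for swapping one suffix for another. In this sketch the function name ends_in_o and the filename main.c are hypothetical:

```shell
# Succeeds if the argument ends in .o
ends_in_o() {
    [ "${1%.o}" != "$1" ]
}

# Derive an object-file name from a C source name
src=main.c
obj=${src%.c}.o
echo "$obj"

if ends_in_o "$obj"
then
    echo "$obj ends in .o"
fi
```

Here obj is assigned main.o, so both echo commands are executed.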
As another example, here's a shell program that works just like the Unix system's basename command:
$ cat mybasename
echo ${1##*/}
$
The program displays its argument with all the characters up to the last / removed:
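A sketch of how it behaves, written here as a function with the same one-line body so that it can be tried directly at the terminal; the sample arguments are arbitrary:

```shell
mybasename() {
    echo "${1##*/}"
}

mybasename /usr/spool/uucppublic    # displays uucppublic
mybasename memos                    # no / in the argument: displays memos
```

One difference from the real command is worth noting: given an argument that ends in a slash, such as /usr/bin/, this version displays an empty line, whereas basename displays bin.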
The construct ${#variable} gives you the ability to count the number of characters stored inside a variable. For example,
$ text='The shell'
$ echo ${#text}
9
$
Note that some older shells do not support this feature.
Each of the parameter substitution constructs described in this section is summarized in Table A.3 in Appendix A, “Shell Summary.”
Whenever you execute a shell program, the shell automatically stores the name of the program inside the special variable $0. This can be used to advantage when you have two or more programs that are linked under different names and you want to know which one was executed. It's also useful for displaying error messages because it removes the dependency of the filename from the program. If the name of the program is referenced by $0, subsequently renaming the program will not require the program to be edited:
$ cat lu
#
# Look someone up in the phone book
#
if [ "$#" -ne 1 ]
then
    echo "Incorrect number of arguments"
    echo "Usage: $0 name"
    exit 1
fi
name=$1
grep "$name" $PHONEBOOK
if [ $? -ne 0 ]
then
    echo "I couldn't find $name in the phone book"
fi
$ PHONEBOOK=$HOME/phonebook
$ export PHONEBOOK
$ lu Teri
Teri Zak 201-555-6000
$ lu Teri Zak
Incorrect number of arguments
Usage: lu name
$ mv lu lookup          Rename it
$ lookup Teri Zak          See what happens now
Incorrect number of arguments
Usage: lookup name
$
The shell's set command is a dual-purpose command: it's used both to set various shell options and to reassign the positional parameters $1, $2, and so forth.
The -x option turns on trace mode in the shell. It does to the current shell what the command
sh -x ctype a
did for the execution of the ctype program in Chapter 8, “Decisions, Decisions.” From the point that the
set -x
command is executed, all subsequently executed commands will be printed to standard error by the shell, after filename, variable, and command substitution and I/O redirection have been performed. The traced commands are preceded by plus signs.
$ x=*
$ set -x          Set command trace option
$ echo $x
+ echo add greetings lu rem rolo
add greetings lu rem rolo
$ cmd=wc
+ cmd=wc
$ ls | $cmd -l
+ ls
+ wc -l
5
$
You can turn off trace mode at any time simply by executing set with the +x option:
$ set +x
+ set +x
$ ls | wc -l
5          Back to normal
$
You should note that the trace option is not passed down to subshells. But you can trace a subshell's execution either by running the shell with the -x option followed by the name of the program to be executed, as in
sh -x rolo
or by inserting a set -x command inside the file itself. In fact, you can insert any number of set -x and set +x commands inside your program to turn trace mode on and off as desired.
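As a sketch of tracing just part of a program, the hypothetical function below turns trace mode on around a single assignment and off again before producing its output:

```shell
greet() {
    set -x                 # trace on
    msg="hello $1"
    set +x                 # trace off
    echo "$msg"
}

greet world
```

The traced commands are written to standard error, so redirecting standard error (for example, greet world 2>/dev/null) leaves only the line hello world on standard output. The exact appearance of the trace lines varies somewhat from shell to shell.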
If you don't give any arguments to set, you'll get an alphabetized list of all the variables that exist in your environment, be they local or exported:
$ set          Show me all variables
CDPATH=:/users/steve:/usr/spool
EDITOR=/bin/vi
HOME=/users/steve
IFS=
LOGNAME=steve
MAIL=/usr/spool/mail/steve
MAILCHECK=600
PATH=/bin:/usr/bin:/users/steve/bin:.:
PHONEBOOK=/users/steve/phonebook
PS1=$
PS2=>
PWD=/users/steve/misc
SHELL=/usr/bin/sh
TERM=xterm
TMOUT=0
TZ=EST5EDT
cmd=wc
x=*
$
There is no way to directly assign a value to a positional parameter; for example,
1=100
does not work. These parameters are initially set on execution of the shell program. The only way they may be changed is with the shift or the set commands. If words are given as arguments to set on the command line, those words will be assigned to the positional parameters $1, $2, and so forth. The previous values stored in the positional parameters will be lost forever. So
set a b c
assigns a to $1, b to $2, and c to $3. $# also gets set to 3.
$ set one two three four
$ echo $1:$2:$3:$4
one:two:three:four
$ echo $#          This should be 4
4
$ echo $*          What does this reference now?
one two three four
$ for arg; do echo $arg; done
one
two
three
four
$
So after execution of the set, everything seems to work consistently: $#, $*, and the for loop without a list.
set is often used in this fashion to “parse” data read from a file or the terminal. Here's a program called words that counts the number of words typed on a line (using the shell's definition of a “word”):
$ cat words
#
# Count words on a line
#
read line
set $line
echo $#
$ words          Run it
Here's a line for you to count.
7
$
The program stores the line read in the shell variable line and then executes the command
set $line
This causes each word stored in line to be assigned to the positional parameters. The variable $# is also set to the number of words assigned, which is the number of words on the line.
Try typing in a line to words that begins with a - and see what happens:
$ words
-1 + 5 = 4
words: -1: bad option(s)
$
After the line was read and assigned to line, the command
set $line
was executed. After the shell did its substitution, the command line looked like this:
set -1 + 5 = 4
When set executed, it saw the - and thought that an option was being selected, thus explaining the error message.
Another problem with words occurs if you give it a line consisting entirely of whitespace characters, or if the line is null:
$ words          Just Enter is pressed
CDPATH=.:/users/steve:/usr/spool
EDITOR=/bin/vi
HOME=/users/steve
IFS=
LOGNAME=steve
MAIL=/usr/spool/mail/steve
MAILCHECK=600
PATH=/bin:/usr/bin:/users/steve/bin:.:
PHONEBOOK=/users/steve/phonebook
PS1=$
PS2=>
PWD=/users/steve/misc
SHELL=/usr/bin/sh
TERM=xterm
TMOUT=0
TZ=EST5EDT
cmd=wc
x=*
0
$
To protect against both of these problems occurring, you can use the -- option to set. This tells set not to interpret any subsequent arguments on the command line as options. It also prevents set from displaying all your variables if no other arguments follow, as was the case when you typed a null line.
So the set command in words should be changed to read
set -- $line
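The effect of the -- can be sketched with a hypothetical function, count_words, that takes the line as its argument instead of reading it from the terminal:

```shell
count_words() {
    set -- $1      # unquoted on purpose, so the shell splits it into words
    echo $#
}

count_words "-1 + 5 = 4"    # a leading - is no longer taken as an option
count_words ""              # a null line gives 0, not a list of variables
```

These calls display 5 and 0. Because the argument is deliberately left unquoted, any filename substitution characters in it (such as *) would also be expanded by the shell, so the count reflects the shell's view of the line.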
With the addition of a while loop and some integer arithmetic, the words program can be easily modified to count the total number of words on standard input, giving you your own version of wc -w:
$ cat words
#
# Count all of the words on standard input
#
count=0
while read line
do
    set -- $line
    count=$(( count + $# ))
done
echo $count
$
After each line is read, the set command is executed to take advantage of the fact that $# will be assigned the number of words on the line. The -- option is supplied to set just in case any of the lines read begins with a - or consists entirely of whitespace characters.
The value of $# is then added into the variable count, and the next line is read. When the loop is exited, the value of count is displayed. This represents the total number of words read.
(Our version is a lot slower than wc because the latter is written in C.)
Here's a quick way to count the number of files in your directory:[1]
$ set *
$ echo $#
8
$
This is much faster than
ls | wc -l
because the first method uses only shell built-in commands. In general, your shell programs run much faster if you try to get as much done as you can using the shell's built-in commands.
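A slightly more careful sketch of the same technique follows. The function name count_files is hypothetical; its body runs inside parentheses so that neither the cd nor the set disturbs the calling shell, and it allows for the case where the directory is empty and the * is left unexpanded:

```shell
count_files() (
    cd "$1" || exit 1
    set -- *
    # With no matching files, the pattern stays as a literal *
    if [ "$#" -eq 1 ] && [ ! -e "$1" ]
    then
        echo 0
    else
        echo "$#"
    fi
)

count_files .
```

As with ls, files whose names begin with a period are not counted.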
set accepts several other options, each of them enabled by preceding the option with a -, and disabled by preceding it with a +. The -x option that we have described here is perhaps the most commonly used. Others are summarized in Table A.9 in Appendix A.
There is a special shell variable called IFS, which stands for Internal Field Separator. The shell uses the value of this variable when parsing input from the read command, output from command substitution (the back-quoting mechanism), and when performing variable substitution. If it's typed on the command line, the shell treats it like a normal whitespace character (that is, as a word delimiter).
See what it's set to now:
$ echo "$IFS"
$
Well, that wasn't very illuminating! To determine the actual characters stored in there, pipe the output from echo into the od (octal dump) command with the -b (byte display) option:
$ echo "$IFS" | od -b
0000000 040 011 012 012
0000004
$
The first column of numbers shown is the relative offset from the start of the input. The following numbers are the octal equivalents of the characters read by od. The first such number is 040, which is the ASCII value of the space character. It's followed by 011, the tab character, and then by 012, the newline character. The next character is another newline; this was written by the echo. These characters for IFS come as no surprise; they're the “whitespace” characters we've talked about throughout the book.
You can change your IFS to any character or characters you want. This is useful when you want to parse a line of data whose fields aren't delimited by the normal whitespace characters. For example, we noted that the shell normally strips any leading whitespace characters from the beginning of any line that you read with the read command. You can change your IFS to just a newline character before the read is executed, which has the effect of preserving the leading whitespace (because the shell won't consider it a field delimiter):
$ read line          Try it the "old" way
     Here's a line
$ echo "$line"
Here's a line
$ IFS="
> "          Set it to just a newline
$ read line          Try it again
     Here's a line
$ echo "$line"
     Here's a line          Leading spaces preserved
$
To change the IFS to just a newline, an open quote was typed, followed immediately by the pressing of the Enter key, followed by the closed quote on the next line. No additional characters can be typed inside those quotes because they'll be stored inside IFS and then used by the shell.
Now let's change the IFS to something more visible, like a colon:
$ IFS=:
$ read x y z
123:345:678
$ echo $x
123
$ echo $z
678
$ list="one:two:three"
$ for x in $list; do echo $x; done
one
two
three
$ var=a:b:c
$ echo "$var"
a:b:c
$
Because the IFS was changed to a colon, when the line was read, the shell divided the line into three words: 123, 345, and 678, which were stored into the three variables x, y, and z, respectively. In the next to last example, the shell used the IFS when substituting the value of list in the for loop. The last example shows that the shell doesn't use the IFS when performing variable assignment.
Changing the IFS is often done in conjunction with execution of the set command:
$ line="Micro Logic Corp.:Box 174:Hackensack, NJ 07602"
$ IFS=:
$ set $line
$ echo $#          How many parameters were set?
3
$ for field; do echo $field; done
Micro Logic Corp.
Box 174
Hackensack, NJ 07602
$
This technique is a powerful one; it uses all built-in shell commands, which also makes it very fast. (An alternative approach might have been to echo the value of $line into the tr command, where all colons could have been translated into newlines, an approach that would have been much slower.) This technique is used in a final version of the rolo program that's presented in Chapter 14, “Rolo Revisited.”
The following program, called number2, is a final version of the line numbering program presented in Chapter 10, “Reading and Printing Data.” This program faithfully prints the input lines to standard output, preceded by a line number. Notice the use of printf to right-align the line numbers.
$ cat number2
#
# Number lines from files given as argument or from
# standard input if none supplied (final version)
#
# Modify the IFS to preserve leading whitespace on input
IFS='
' # Just a newline appears between the quotes
lineno=1
cat $* |
while read -r line
do
    printf "%5d:%s\n" $lineno "$line"
    lineno=$(( lineno + 1 ))
done
Here's a sample execution of number2:
$ number2 words
    1:#
    2:# Count all of the words on standard input
    3:#
    4:
    5:count=0
    6:while read line
    7:do
    8:    set -- $line
    9:    count=$(( count + $# ))
   10:done
   11:
   12:echo $count
$
Because the IFS has an influence on the way things are interpreted by the shell, if you're going to change it in your program, it's usually wise to save the old value first in another variable (such as OIFS) and then restore it after you've finished the operations that depend on the changed IFS.
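A minimal sketch of that save-and-restore pattern, using a hypothetical function split_fields and assuming that IFS is set to some value when the function is called:

```shell
split_fields() {
    OIFS=$IFS          # save the current separators
    IFS=:
    set -- $1          # split the argument at each colon
    IFS=$OIFS          # restore before doing anything else
    echo "$# fields, second is '$2'"
}

split_fields "Micro Logic Corp.:Box 174:Hackensack, NJ 07602"
```

This displays 3 fields, second is 'Box 174'. Restoring IFS immediately after the set keeps the changed value from affecting any later commands in the program.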
The readonly command is used to specify variables whose values cannot be subsequently changed. For example,
readonly PATH HOME
makes the PATH and HOME variables read-only. Subsequently attempting to assign a value to these variables causes the shell to issue an error message:
$ PATH=/bin:/usr/bin:.:
$ readonly PATH
$ PATH=$PATH:/users/steve/bin
sh: PATH: is read-only
$
Here you see that after the variable PATH was made read-only, the shell printed an error message when an attempt was made to assign a value to it.
To get a list of your read-only variables, type readonly -p without any arguments:[2]
$ readonly -p
readonly PATH=/bin:/usr/bin:.:
$
unset removes both exported and local shell variables.
You should be aware of the fact that the read-only variable attribute is not passed down to subshells. Also, after a variable has been made read-only in a shell, there is no way to “undo” it.
Sometimes you may want to remove the definition of a variable from your environment. To do so, you type unset followed by the names of the variables:
$ x=100
$ echo $x
100
$ unset x          Remove x from the environment
$ echo $x

$
You can't unset a read-only variable. Furthermore, the variables IFS, MAILCHECK, PATH, PS1, and PS2 cannot be unset. Also, some older shells do not support the unset command.
1: Given the following variable assignments:
$ EDITOR=/bin/vi
$ DB=
$ EDITFLAG=yes
$ PHONEBOOK=
$
What will be the results of the following commands?
echo ${EDITOR}
echo ${DB:=/users/pat/db}
echo ${EDITOR:-/bin/ed}
echo ${PHONEBOOK:?}
echo ${DB:-/users/pat/db}
ed=${EDITFLAG:+${EDITOR:-/bin/ed}}
2: | Rewrite the steve:*:203:100::/users/steve:/usr/bin/ksh Here the fifth field is null ( |
3: | Using the fact that the shell construct |
4: | Write a function called rightmatch value pattern
where value is a sequence of one or more characters, and pattern is a shell pattern that is to be removed from the right side of value. The shortest matching pattern should be removed from value and the result written to standard output. Here is some sample output: $ rightmatch test.c .c test $ rightmatch /usr/spool/uucppublic '/*' /usr/spool $ rightmatch /usr/spool/uucppublic o /usr/spool/uucppublic $ The last example shows that the |
5: | Write a function called leftmatch pattern value
Here are some example uses: $ leftmatch /usr/spool/ /usr/spool/uucppublic uucppublic $ leftmatch s. s.main.c main.c $ |
6: | Write a function called $ substring /usr/ /usr/spool/uucppublic /uucppublic spool $ substring s. s.main.c .c main $ substring s. s.main.c .o Only left match main.c $ substring x. s.main.c .o No matches s.main.c $ |
7: | Modify the |
[1] This technique may not work on very large directories because you may exceed the limit on the length of the command line (the precise length varies between Unix systems). Working with such directories may cause problems when using filename substitution in other commands as well, such as echo * or for file in *.
[2] By default, Bash produces output of the form declare -r variable. To get POSIX-compliant output, you must run Bash with the --posix command-line option or run the set command with the -o posix option.