Chapter 6. Can I Quote You on That?

This chapter teaches you about a unique feature of the shell programming language: the way it interprets quote characters. Basically, the shell recognizes four different types of quote characters:

  • The single quote character '

  • The double quote character "

  • The backslash character \

  • The back quote character `

The first two and the last characters in the preceding list must occur in pairs, whereas the backslash character is unary in nature. Each of these quotes has a distinct meaning to the shell. We'll cover them in separate sections of this chapter.

The Single Quote

There are several reasons that you might need to use quotes in the shell. One of these is to keep characters otherwise separated by whitespace characters together. Let's look at an example. Here's a file called phonebook that contains names and phone numbers:

$ cat phonebook
Alice Chebba    973-555-2015
Barbara Swingle 201-555-9257
Liz Stachiw     212-555-2298
Susan Goldberg  201-555-7776
Susan Topple    212-555-4932
Tony Iannino    973-555-1295
$

To look up someone in our phonebook file—which has been kept small here for the sake of example—you use grep:

$ grep Alice phonebook
Alice Chebba    973-555-2015
$

Look what happens when you look up Susan:

$ grep Susan phonebook
Susan Goldberg  201-555-7776
Susan Topple    212-555-4932
$

There are two lines that contain Susan, thus explaining the two lines of output. One way to overcome this problem would be to further qualify the name. For example, you could specify the last name as well:

$ grep Susan Goldberg phonebook
grep: can't open Goldberg
Susan Goldberg  201-555-7776
Susan Topple    212-555-4932
$

Recalling that the shell uses one or more whitespace characters to separate the arguments on the line, the preceding command line results in grep being passed three arguments: Susan, Goldberg, and phonebook (see Figure 6.1).

Figure 6.1. grep Susan Goldberg phonebook.

When grep is executed, it takes the first argument as the pattern and the remaining arguments as the names of the files to search for the pattern. In this case, grep thinks it's supposed to look for Susan in the files Goldberg and phonebook. So it tries to open the file Goldberg, can't find it, and issues the error message:

grep: can't open Goldberg

Then it goes to the next file, phonebook, opens it, searches for the pattern Susan, and prints the two matching lines. The problem boils down to trying to pass whitespace characters as arguments to programs. This can be done by enclosing the entire argument inside a pair of single quotes, as in

grep 'Susan Goldberg' phonebook

When the shell sees the first single quote, it ignores any otherwise special characters that follow until it sees the closing quote.

$ grep 'Susan Goldberg' phonebook
Susan Goldberg  201-555-7776
$

In this case, the shell encountered the first ', and ignored any special characters until it found the closing '. So the space between Susan and Goldberg, which would have normally delimited the two arguments, was ignored by the shell. The shell therefore divided the command line into two arguments, the first Susan Goldberg (which includes the space character) and the second phonebook. It then executed grep, passing it these two arguments (see Figure 6.2).

Figure 6.2. grep 'Susan Goldberg' phonebook.

grep then took the first argument, Susan Goldberg, and looked for it in the file specified by the second argument, phonebook. Note that the shell removes the quotes from the command line and does not pass them to the program.
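One way to see the difference for yourself is to count the arguments the shell produces in each case. The following is only a sketch; count_args is a hypothetical helper function that is not part of the chapter and simply prints the number of arguments it receives:

```shell
# count_args is a hypothetical helper: it prints how many arguments it was given.
count_args() { echo $#; }

# Unquoted: the shell splits the line at every space.
unquoted_count=$(count_args Susan Goldberg phonebook)

# Quoted: the space between the names is protected by the single quotes.
quoted_count=$(count_args 'Susan Goldberg' phonebook)

echo "unquoted: $unquoted_count, quoted: $quoted_count"
```

The first call sees three arguments and the second sees two, matching Figures 6.1 and 6.2.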

No matter how many space characters are enclosed between quotes, they are preserved by the shell.

$ echo  one            two      three    four
one two three four
$ echo 'one            two      three    four'
one            two      three    four
$

In the first case, the shell removes the extra whitespace characters from the line and passes echo the four arguments one, two, three, and four (see Figure 6.3).

Figure 6.3. echo one two three four.

In the second case, the space characters are preserved, and the shell treats the entire string of characters enclosed between the quotes as a single argument when executing echo (see Figure 6.4).

Figure 6.4. echo 'one two three four'.

As we mentioned, all special characters are ignored by the shell if they appear inside single quotes. That explains the output from the following:

$ file=/users/steve/bin/prog1
$ echo $file
/users/steve/bin/prog1
$ echo '$file'          $ not interpreted
$file
$ echo *
addresses intro lotsaspaces names nu numbers phonebook stat
$ echo '*'
*
$ echo '< > | ; ( ) { } >> " ` &'
< > | ; ( ) { } >> " ` &
$

Even the Enter key will be ignored by the shell if it's enclosed in quotes:

$ echo 'How are you today,
> John'
How are you today,
John
$

After typing the first line, the shell sees that the quote isn't matched, so it waits for you to type in the closing quote. As an indication that the shell is waiting for you to finish typing in a command, it changes your prompt character from $ to >. This is known as your secondary prompt character and is displayed by the shell whenever it's waiting for you to finish typing a command.

Quotes are also needed when assigning values containing whitespace or special characters to shell variables:

$ message='I must say, this sure is fun'
$ echo $message
I must say, this sure is fun
$ text='* means all files in the directory'
$ echo $text
addresses intro lotsaspaces names nu numbers phonebook stat means all files in the directory
$

The quotes are needed in the assignments made to the variables message and text because of the embedded spaces. In the preceding example, you are reminded that the shell still does filename substitution after variable name substitution, meaning that the * is replaced by the names of all the files in the current directory before the echo is executed. There is a way to overcome this annoyance, and it's through the use of double quotes.

The Double Quote

Double quotes work similarly to single quotes, except that they're not as restrictive. Whereas the single quotes tell the shell to ignore all enclosed characters, double quotes say to ignore most. In particular, the following three characters are not ignored inside double quotes:

  • Dollar signs

  • Back quotes

  • Backslashes

The fact that dollar signs are not ignored means that variable name substitution is done by the shell inside double quotes.

$ x=*
$ echo $x
addresses intro lotsaspaces names nu numbers phonebook stat
$ echo '$x'
$x
$ echo "$x"
*
$

Here you see the major differences between no quotes, single quotes, and double quotes. In the first case, the shell sees the asterisk and substitutes all the filenames from the current directory. In the second case, the shell leaves the characters enclosed within the single quotes alone, which results in the display of $x. In the final case, the double quotes indicate to the shell that variable name substitution is still to be performed inside the quotes. So the shell substitutes * for $x. Because filename substitution is not done inside double quotes, * is then passed to echo as the value to be displayed.

So if you want to have the value of a variable substituted, but don't want the shell to treat the substituted characters specially, you must enclose the variable inside double quotes.
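The three cases can be tried side by side in a scratch directory whose contents are known. This is a sketch only: the directory is created with mktemp -d (widely available, but assumed here), and the filenames alpha and beta are made up for the illustration.

```shell
# Build a scratch directory holding two known files (names are illustrative).
workdir=$(mktemp -d)
touch "$workdir/alpha" "$workdir/beta"

x=*

# No quotes: $x becomes *, which is then replaced by the filenames.
unquoted=$(cd "$workdir" && echo $x)

# Single quotes: nothing is substituted at all.
single=$(cd "$workdir" && echo '$x')

# Double quotes: $x becomes *, but no filename substitution follows.
double=$(cd "$workdir" && echo "$x")

rm -r "$workdir"

echo "unquoted: $unquoted"
echo "single:   $single"
echo "double:   $double"
```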

Here's another example illustrating the difference between double quotes and no quotes:

$ address="39 East 12th Street
> New York, N. Y. 10003"
$ echo $address
39 East 12th Street New York, N. Y. 10003
$ echo "$address"
39 East 12th Street
New York, N. Y. 10003
$

It makes no difference whether the value assigned to address is enclosed in single quotes or double quotes. The shell displays the secondary command prompt in either case to tell you it's waiting for the corresponding closed quote.

After assigning the two-line address to address, the value of the variable is displayed by echo. Notice that the address is displayed on a single line. The reason is the same as what caused

echo one           two      three   four

to be displayed as

one two three four

Recalling that the shell removes spaces, tabs, and newlines (that is, whitespace characters) from the command line and then cuts it up into arguments, in the case of

echo $address

the shell simply removes the embedded newline character, treating it as it would a space or tab: as an argument delimiter. Then it passes the nine arguments to echo to be displayed. echo never gets a chance to see that newline; the shell gets to it first (see Figure 6.5).

Figure 6.5. echo $address.

When the command

echo "$address"

is used instead, the shell substitutes the value of address as before, except that the double quotes tell it to leave any embedded whitespace characters alone. So in this case, the shell passes a single argument to echo, an argument that contains an embedded newline. echo simply displays its single argument at the terminal; Figure 6.6 illustrates this. The newline character is depicted by the characters \n.

Figure 6.6. echo "$address".
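The argument counts described for Figures 6.5 and 6.6 can be checked with a small sketch. As before, count_args is a hypothetical helper introduced only for this illustration, and the default field separators (space, tab, newline) are assumed:

```shell
# count_args is a hypothetical helper: it prints how many arguments it was given.
count_args() { echo $#; }

address="39 East 12th Street
New York, N. Y. 10003"

# Unquoted: spaces and the newline all act as argument delimiters.
unquoted_args=$(count_args $address)

# Double-quoted: the whole value, newline included, is one argument.
quoted_args=$(count_args "$address")

echo "unquoted: $unquoted_args, quoted: $quoted_args"
```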

Double quotes can be used to hide single quotes from the shell, and vice versa:

$ x="'Hello,' he said"
$ echo $x
'Hello,' he said
$ article='"Keeping the Logins from Lagging," Bell Labs Record'
$ echo $article
"Keeping the Logins from Lagging," Bell Labs Record
$

The Backslash

Basically, the backslash is equivalent to placing single quotes around a single character, with a few minor exceptions. The backslash quotes the single character that immediately follows it. The general format is

\c

where c is the character you want to quote. Any special meaning normally attached to that character is removed. Here is an example:

$ echo >
syntax error: 'newline or ;' unexpected
$ echo \>
>
$

In the first case, the shell sees the > and thinks that you want to redirect echo's output to a file. So it expects a filename to follow. Because it doesn't, the shell issues the error message. In the next case, the backslash removes the special meaning of the >, so it is passed along to echo to be displayed.

$ x=*
$ echo \$x
$x
$

In this case, the shell ignores the $ that follows the backslash, and as a result, variable substitution is not performed.

Because a backslash removes the special meaning of the character that follows, can you guess what happens if that character is another backslash? Right, it removes the special meaning of the backslash:

$ echo \\
\
$

Naturally, you could have also written

$ echo '\'
\
$

Using the Backslash for Continuing Lines

As mentioned at the start of this section, \c is basically equivalent to 'c'. One exception to this rule is when the backslash is used as the very last character on the line:

$ lines=one'
> 'two          Single quotes tell shell to ignore newline
$ echo "$lines"
one
two
$ lines=one\          Try it with a \ instead
> two
$ echo "$lines"
onetwo
$

The shell treats a backslash at the end of the line as a line continuation. It removes the newline character that follows and also does not treat the newline as an argument delimiter (it's as if it wasn't even typed). This construct is most often used for typing long commands over multiple lines.
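As a small sketch of that last use, here is one command typed over two lines. The command itself is made up for the illustration; what matters is the backslash as the final character of the first line:

```shell
# The backslash-newline pair is removed, so this is a single echo command.
message=$(echo one two \
               three four)

echo "$message"
```

The spaces that indent the second line are ordinary argument delimiters, so echo still receives four arguments.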

The Backslash Inside Double Quotes

We noted earlier that the backslash is one of the three characters interpreted by the shell inside double quotes. This means that you can use the backslash inside these quotes to remove the meaning of characters that otherwise would be interpreted inside double quotes (that is, other backslashes, dollar signs, back quotes, newlines, and other double quotes). If the backslash precedes any other character inside double quotes, the backslash is ignored by the shell and passed on to the program:

$ echo "\$x"
$x
$ echo "\ is the backslash character"
\ is the backslash character
$ x=5
$ echo "The value of x is \"$x\""
The value of x is "5"
$

In the first example, the backslash precedes the dollar sign, interpreted by the shell inside double quotes. So the shell ignores the dollar sign, removes the backslash, and executes echo. In the second example, the backslash precedes a space, not interpreted by the shell inside double quotes. So the shell ignores the backslash and passes it on to the echo command. The last example shows the backslash used to enclose double quotes inside a double-quoted string.
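The same three cases can be captured in variables and inspected. This sketch uses printf '%s\n' rather than echo, on the assumption that your echo may be one of the versions that interprets backslashes itself; the variable names are made up for the illustration.

```shell
x=5

# Backslash before $: the shell removes the backslash and leaves $x alone.
escaped_dollar=$(printf '%s\n' "\$x")

# Backslash before a space: not special here, so the backslash is kept.
kept_backslash=$(printf '%s\n' "\ is kept")

# Backslash before ": lets a double quote appear inside double quotes.
escaped_quotes=$(printf '%s\n' "x is \"$x\"")

printf '%s\n' "$escaped_dollar" "$kept_backslash" "$escaped_quotes"
```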

As an exercise in the use of quotes, let's say that you want to display the following line at the terminal:

<<< echo $x >>> displays the value of x, which is $x

The intention here is to substitute the value of x in the second instance of $x, but not in the first. Let's first assign a value to x:

$ x=1
$

Now try displaying the line without using any quotes:

$ echo <<< echo $x >>> displays the value of x, which is $x
syntax error: '<' unexpected
$

The < signals input redirection to the shell; this is the reason for the error message.

If you put the entire message inside single quotes, the value of x won't be substituted at the end. If you enclose the entire string in double quotes, both occurrences of $x will be substituted. Here are two different ways to do the quoting properly (realize that there are usually several different ways to quote a string of characters to get the results you want):

$ echo "<<< echo \$x >>> displays the value of x, which is $x"
<<< echo $x >>> displays the value of x, which is 1
$ echo '<<< echo $x >>> displays the value of x, which is' $x
<<< echo $x >>> displays the value of x, which is 1
$

In the first case, everything is enclosed in double quotes, and the backslash is used to prevent the shell from performing variable substitution in the first instance of $x. In the second case, everything up to the last $x is enclosed in single quotes. If the variable x might contain filename substitution characters or whitespace, a safer way of writing the echo would be

echo '<<< echo $x >>> displays the value of x, which is' "$x"

Command Substitution

Command substitution refers to the shell's capability to insert the standard output of a command at any point in a command line. There are two ways in the shell to perform command substitution: by enclosing a shell command with back quotes and with the $(...) construct.

The Back Quote

The back quote is unlike any of the previously encountered types of quotes. Its purpose is not to protect characters from the shell but to tell the shell to execute the enclosed command and to insert the standard output from the command at that point on the command line. The general format for using back quotes is

`command`

where command is the name of the command to be executed and whose output is to be inserted at that point.[1]

Here is an example:

$ echo The date and time is: `date`
The date and time is: Wed Aug 28 14:28:43 EDT 2002
$

When the shell does its initial scan of the command line, it notices the back quote and expects the name of a command to follow. In this case, the shell finds that the date command is to be executed. So it executes date and replaces the `date` on the command line with the output from the date. After that, it divides the command line into arguments in the normal manner and then initiates execution of the echo command.

$ echo Your current working directory is `pwd`
Your current working directory is /users/steve/shell/ch6
$

Here the shell executes pwd, inserts its output on the command line, and then executes the echo. Note that in the following section, back quotes can be used in all the places where the $(...) construct is used.

The $(...) Construct

The POSIX standard shell supports the newer $(...) construct for command substitution. The general format is

$(command)

where, as in the back quoting method, command is the name of the command whose standard output is to be substituted on the command line. For example:

$ echo The date and time is: $(date)
The date and time is: Wed Aug 28 14:28:43 EDT 2002
$

This construct is better than back quotes for a couple of reasons. First, complex commands that use combinations of forward and back quotes can be difficult to read, particularly if the typeface you're using doesn't have visually different single quotes and back quotes; second, $(...) constructs can be easily nested, allowing command substitution within command substitution. Although nesting can also be performed with back quotes, it's a little trickier. You'll see an example of nested command substitution later in this section.

You are not restricted to executing a single command between the parentheses: Several commands can be executed if separated by semicolons. Also, pipelines can be used. Here's a modified version of the nu program that displays the number of logged-in users:

$ cat nu
echo There are $(who | wc -l) users logged in
$ nu                    Execute it
There are 13 users logged in
$

Because single quotes protect everything, the following output should be clear:

$ echo '$(who | wc -l) tells how many users are logged in'
$(who | wc -l) tells how many users are logged in
$

But command substitution is interpreted inside double quotes:

$ echo "You have $(ls | wc -l) files in your directory"
You have       7 files in your directory
$

(What causes those leading spaces before the 7?) Remember that the shell is responsible for executing the command enclosed between the parentheses. The only thing the echo command sees is the output that has been inserted by the shell.

Suppose that you're writing a shell program and want to assign the current date and time to a variable called now, perhaps to display it later at the top of a report, or log it into a file. The problem here is that you somehow want to take the output from date and assign it to the variable. Command substitution can be used for this:

$ now=$(date)          Execute date and store the output in now
$ echo $now          See what got assigned
Wed Aug 28 14:47:26 EDT 2002
$

When you write

now=$(date)

the shell realizes that the entire output from date is to be assigned to now. Therefore, you don't need to enclose $(date) inside double quotes.
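A short sketch can confirm this. The commands and variable names are invented for the illustration; the point is that neither the embedded spaces nor the asterisk is disturbed by the unquoted assignments:

```shell
# Embedded spaces in the output survive an unquoted assignment.
padded=$(printf 'one    two\n')

# An asterisk in the output is not replaced by filenames either.
star=$(printf '*\n')

# Double quotes are still needed when the values are used later.
printf '%s\n' "$padded" "$star"
```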

Even commands that produce more than a single line of output can be stored inside a variable:

$ filelist=$(ls)
$ echo $filelist
addresses intro lotsaspaces names nu numbers phonebook stat
$

What happened here? You end up with a horizontal listing of the files even though the newlines from ls were stored inside the filelist variable (take our word for it). The newlines got eaten up when the value of filelist was substituted by the shell in processing the echo command line. Double quotes around the variable will preserve the newlines:

$ echo "$filelist"
addresses
intro
lotsaspaces
names
nu
numbers
phonebook
stat
$
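One related detail is worth a quick sketch: newlines in the middle of the output are stored, but command substitution drops any newlines at the very end. The printf below stands in for a command such as ls, and tr -d ' ' is there only because some versions of wc pad their output with spaces.

```shell
# Stand-in for a command whose output ends with several newlines.
listing=$(printf 'addresses\nintro\n\n\n')

# Only the newline between the two names was kept in the variable.
line_count=$(printf '%s\n' "$listing" | wc -l | tr -d ' ')

echo "lines stored: $line_count"
```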

To store the contents of a file into a variable, you can use cat:

$ namelist=$(cat names)
$ echo "$namelist"
Charlie
Emanuel
Fred
Lucy
Ralph
Tony
Tony
$

If you want to mail the contents of the file memo to all the people listed in the names file (who we'll assume here are users on your system), you can do the following:

$ mail $(cat names) < memo
$

Here the shell executes the cat and inserts the output on the command line so it looks like this:

mail Charlie Emanuel Fred Lucy Ralph Tony Tony < memo

Then it executes mail, redirecting its standard input from the file memo and passing it the names of seven users who are to receive the mail.

Notice that Tony receives the same mail twice because he's listed twice in the names file. You can remove any duplicate entries from the file by using sort with the -u option (remove duplicate lines) rather than cat to ensure that each person only receives mail once:

$ mail $(sort -u names) < memo
$

It's worth noting that the shell does filename substitution after it substitutes the output from commands. Enclosing the commands inside double quotes prevents the shell from doing the filename substitution on this output if desired.
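Here is a sketch of that behavior, using a command whose output happens to be an asterisk. The scratch directory comes from mktemp -d (assumed to be available), and the filenames alpha and beta are made up for the illustration.

```shell
# Build a scratch directory holding two known files (names are illustrative).
workdir=$(mktemp -d)
touch "$workdir/alpha" "$workdir/beta"

# Unquoted: the * produced by the inner echo is replaced by filenames.
unquoted=$(cd "$workdir" && echo $(echo '*'))

# Double-quoted: the * is passed along untouched.
quoted=$(cd "$workdir" && echo "$(echo '*')")

rm -r "$workdir"

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```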

Command substitution is often used to change the value stored in a shell variable. For example, if the shell variable name contains someone's name, and you want to convert every character in that variable to uppercase, you could use echo to get the variable to tr's input, perform the translation, and then assign the result back to the variable:

$ name="Ralph Kramden"
$ name=$(echo $name | tr '[a-z]' '[A-Z]')   Translate to uppercase
$ echo $name
RALPH KRAMDEN
$

The technique of using echo in a pipeline to write data to the standard input of the following command is a simple yet powerful technique; it's used often in shell programs.
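As a variation on the uppercase example, the following sketch double-quotes the variable so that its value reaches tr exactly as stored, and it uses the character class names [:lower:] and [:upper:] in place of the explicit ranges. Treat it as one possible way of writing the command, not the only one.

```shell
name="Ralph Kramden"

# The double quotes keep the value intact on its way into the pipeline.
name=$(echo "$name" | tr '[:lower:]' '[:upper:]')

echo "$name"
```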

The next example shows how cut is used to extract the first character from the value stored in a variable called filename:

$ filename=/users/steve/memos
$ firstchar=$(echo $filename | cut -c1)
$ echo $firstchar
/
$

sed is also often used to “edit” the value stored in a variable. Here it is used to extract the last character from the variable file:

$ file=exec.o
$ lastchar=$(echo $file | sed 's/.*\(.\)$/\1/')
$ echo $lastchar
o
$

The sed command says to replace all the characters on the line with the last one. The result of the sed is stored in the variable lastchar. The single quotes around the sed command are important because they prevent the shell from messing around with the backslashes (would double quotes also have worked?).

Finally, command substitutions can be nested. Suppose that you want to change every occurrence of the first character in a variable to something else. In a previous example, firstchar=$(echo $filename | cut -c1) gets the first character from filename, but how do we use this character to change every occurrence in filename? A two-step process is one way:

$ filename=/users/steve/memos
$ firstchar=$(echo $filename | cut -c1)
$ filename=$(echo $filename | tr "$firstchar" "^")    translate / to ^
$ echo $filename
^users^steve^memos
$

Or a single, nested command substitution can perform the same operation:

$ filename=/users/steve/memos
$ filename=$(echo $filename | tr "$(echo $filename | cut -c1)" "^")
$ echo $filename
^users^steve^memos
$

If you have trouble understanding this example, compare it to the previous one: Note how the firstchar variable in the earlier example is replaced by the nested command substitution; otherwise, the two examples are the same.
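For comparison, here is a sketch of the same nesting written both ways. With back quotes, the inner pair must be escaped with backslashes, which is why that form is considered trickier. The variable names new_style and old_style are made up for the illustration.

```shell
filename=/users/steve/memos

# $(...) nests without any extra quoting.
new_style=$(echo $filename | tr "$(echo $filename | cut -c1)" "^")

# With back quotes, each inner back quote needs a backslash in front of it.
old_style=`echo $filename | tr \`echo $filename | cut -c1\` "^"`

echo "$new_style"
echo "$old_style"
```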

The expr Command

Although the POSIX standard shell supports built-in integer arithmetic operations, older shells don't. It's likely that you may see command substitution with a Unix program called expr, which evaluates an expression given to it on the command line:

$ expr 1 + 2
3
$

Each operator and operand given to expr must be a separate argument, thus explaining the output from the following:

$ expr 1+2
1+2
$

The usual arithmetic operators are recognized by expr: + for addition, - for subtraction, / for division, * for multiplication, and % for modulus (remainder).

$ expr 10 + 20 / 2
20
$

Multiplication, division, and modulus have higher precedence than addition and subtraction. Thus, in the preceding example the division was performed before the addition.

$ expr 17 * 6
expr: syntax error
$

What happened here? The answer: The shell saw the * and substituted the names of all the files in your directory! It has to be quoted to keep it from the shell:

$ expr "17 * 6"
17 * 6
$

That's not the way to do it. Remember that expr must see each operator and operand as a separate argument; the preceding example sends the whole expression in as a single argument.

$ expr 17 \* 6
102
$

Naturally, one or more of the arguments to expr can be the value stored inside a shell variable because the shell takes care of the substitution first anyway:

$ i=1
$ expr $i + 1
2
$

This is the older method for performing arithmetic on shell variables. To change the value of a variable, do the same thing as before, but use command substitution to assign the output from expr back to the variable:

$ i=1
$ i=$(expr $i + 1)       Add 1 to i
$ echo $i
2
$

In legacy shell programs, you're more likely to see expr used with back quotes:

$ i=`expr $i + 1`       Add 1 to i
$ echo $i
3
$
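To put the older and newer methods side by side, the sketch below computes the same value with expr and with the $((...)) form of the shell's built-in integer arithmetic mentioned at the start of this section. The variable names are made up for the illustration.

```shell
i=1

# Older method: run expr and capture its output.
with_expr=$(expr $i + 1)

# Built-in arithmetic: no separate program is run.
with_builtin=$((i + 1))

# With expr, the * must still be quoted to keep it from the shell.
product=$(expr 17 \* 6)

echo "$with_expr $with_builtin $product"
```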

Note that like the shell's built-in integer arithmetic, expr only evaluates integer arithmetic expressions. You can use awk or bc if you need to do floating point calculations. Also note that expr has other operators. One of the most frequently used ones is the : operator, which is used to match characters in the first operand against a regular expression given as the second operand. By default, it returns the number of characters matched.

The expr command

expr "$file" : ".*"

returns the number of characters stored in the variable file, because the regular expression .* matches all the characters in the string. For more details on expr, consult your Unix User's Manual.
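A brief sketch of the : operator follows. The match always starts at the beginning of the first operand, so a pattern that matches only the leading characters yields a smaller count. The variable names length and prefix are made up for the illustration.

```shell
file=exec.o

# .* matches every character, so the count is the length of the value.
length=$(expr "$file" : '.*')

# A pattern matching only the leading characters gives a smaller count.
prefix=$(expr "$file" : 'exec')

echo "length: $length, prefix: $prefix"
```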

Table A.5 in Appendix A summarizes the way quotes are handled by the shell.

Exercises

1:

Given the following assignments:

$ x=*
$ y=?
$ z='one
> two
> three'
$ now=$(date)
$ symbol='>'
$

and these files in your current directory:

$ echo *
names test1 u vv zebra
$

What will the output be from the following commands?

echo *** error ***        echo 'Is 5 * 4 > 18 ?'

echo $x                   echo What is your name?

echo $y                   echo Would you like to play a game?

echo "$y"                 echo \*\*\*

echo $z | wc -l           echo \$$symbol

echo "$z" | wc -l         echo $\$symbol

echo '$z' | wc -l         echo "\"

echo _$now_               echo "\\"

echo hello $symbol out    echo \\

echo "\""                 echo I don't understand

2:

Write the commands to remove all the space characters stored in the shell variable text. Be sure to assign the result back to text. First use tr to do it and then do the same thing with sed.

3:

Write the commands to count the number of characters stored in the shell variable text. Then write the commands to count all the alphabetic characters. (Hint: Use sed and wc.) What happens to special character sequences such as \n if they're stored inside text?

4:

Write the commands to assign the unique lines in the file names to the shell variable namelist.



[1] Note that using the back quote for command substitution is no longer the preferred method; however, we cover it here because of the large number of older, canned shell programs that still use this construct. Also, you should know about back quotes in case you ever need to write shell programs that are portable to older Unix systems with shells that don't support the newer $(...) construct.
