Appendix C. Command-Line Processing

Throughout the book we’ve seen a variety of ways in which the shell processes input lines, especially using read. We can think of this process as a subset of the things the shell does when processing command lines. This appendix provides a more detailed description of the steps involved in processing the command line and how you can get bash to make a second pass with eval. The material in this appendix also appears in Learning the bash Shell, 3rd Edition, by Cameron Newham (O’Reilly).

Command-Line Processing Steps

We’ve touched upon command-line processing throughout this book; we’ve mentioned how bash deals with single quotes (''), double quotes (""), and backslashes (\); how it separates characters on a line into words, even allowing you to specify the delimiter it uses via the environment variable $IFS; how it assigns the words to shell variables (e.g., $1, $2, etc.); and how it can redirect input and output to/from files or other processes (pipelines). In order to be a real expert at shell scripting (or to debug some gnarly problems), you’ll need to understand the various steps involved in command-line processing, especially the order in which they occur.

Each line that the shell reads from STDIN or from a script is called a pipeline because it contains one or more commands separated by zero or more pipe characters (|). Figure C-1 shows the steps in command-line processing.

Figure C-1. Steps in command-line processing

For each pipeline it reads, the shell breaks it up into commands, sets up the I/O for the pipeline, then does the following for each command:

  1. Splits the command into tokens that are separated by the fixed set of metacharacters space, tab, newline, ;, (, ), <, >, |, and &. Types of tokens include words, keywords, I/O redirectors, and semicolons.

  2. Checks the first token of each command to see if it is a keyword with no quotes or backslashes. If it’s an opening keyword such as if or another control-structure opener, function, {, or (, then the command is actually a compound command. The shell sets things up internally for the compound command, reads the next command, and starts the process again. If the keyword isn’t a compound command opener (e.g., it is a control-structure “middle” like then, else, or do; an “end” like fi or done; or a logical operator), the shell signals a syntax error.

  3. Checks the first word of each command against the list of aliases. If a match is found, it substitutes the alias’s definition and goes back to step 1; otherwise, it goes on to step 4. This scheme allows recursive aliases and allows for keywords to be defined (e.g., alias aslongas=while or alias procedure=function).

  4. Performs brace expansion. For example, a{b,c} becomes ab ac.

  5. Substitutes the user’s home directory ($HOME) for a tilde if it is at the beginning of a word. Likewise substitutes the named user’s home directory for ~user.

  6. Performs parameter (variable) substitution for any expression that starts with a dollar sign ($).

  7. Does command substitution for any expression of the form $(string).

  8. Evaluates arithmetic expressions of the form $((string)).

  9. Takes the parts of the line that resulted from parameter, command, and arithmetic substitution and splits them into words again. This time it uses the characters in $IFS as delimiters instead of the set of metacharacters in step 1.

  10. Performs pathname expansion, a.k.a. wildcard expansion, for any occurrences of *, ?, and [] pairs.

  11. Uses the first word as a command by looking up its source in the following order: as a function command, then as a builtin, then as a file in any of the directories in $PATH.

  12. Runs the command after setting up I/O redirection and other such things.
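The order of these steps has visible consequences. The following is a minimal sketch, assuming bash; the variable names (n, record, fields) are made up for illustration. It shows that brace expansion (step 4) happens before parameter substitution (step 6), and that the result of a substitution is split using $IFS (step 9).

```shell
# Step 4 runs before step 6: when braces are examined, $n is still the
# literal text "$n", so {1..$n} is not a valid range and is left alone.
n=3
literal=$(echo {1..$n})     # prints {1..3}
expanded=$(echo {1..3})     # prints 1 2 3

# Step 9: the result of an unquoted substitution is split on the
# characters in $IFS, not on the metacharacters from step 1.
record="alice:bob:carol"
old_ifs=$IFS
IFS=:
fields=( $record )          # three words: alice, bob, carol
IFS=$old_ifs

echo "$literal / $expanded / ${#fields[@]} fields"
```

If you really need a range driven by a variable, a loop with arithmetic or the seq command is the usual workaround, precisely because the braces have already been processed by the time $n has a value.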

That’s a lot of steps—and it’s not even the whole story! But before we go on, an example should make this process clearer. Assume that the following command has been run:

alias ll="ls -l"

Further assume that a file exists called .hist537 in user alice’s home directory, which is /home/alice, and that there is a double-dollar-sign variable $$ whose value is 2537 (remember $$ is the process ID, a number unique among all currently running processes).

Now let’s see how the shell processes the following command:

ll $(type -path cc) ~alice/.*$(($$%1000))

Here is what happens to this line:

  1. ll $(type -path cc) ~alice/.*$(($$%1000)) splits the input into words.

  2. ll is not a keyword, so step 2 does nothing.

  3. ls -l $(type -path cc) ~alice/.*$(($$%1000)) substitutes ls -l for its alias ll. The shell then repeats steps 1 through 3; step 1 splits the ls -l into two words.

  4. ls -l $(type -path cc) ~alice/.*$(($$%1000)) does nothing.

  5. ls -l $(type -path cc) /home/alice/.*$(($$%1000)) expands ~alice into /home/alice.

  6. ls -l $(type -path cc) /home/alice/.*$((2537%1000)) substitutes 2537 for $$.

  7. ls -l /usr/bin/cc /home/alice/.*$((2537%1000)) does command substitution on type -path cc.

  8. ls -l /usr/bin/cc /home/alice/.*537 evaluates the arithmetic expression 2537%1000.

  9. ls -l /usr/bin/cc /home/alice/.*537 does nothing.

  10. ls -l /usr/bin/cc /home/alice/.hist537 substitutes the filename for the wildcard expression .*537.

  11. The command ls is found in /usr/bin.

  12. /usr/bin/ls is run with the option -l and the two arguments.
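You can reproduce the interesting part of this walkthrough without an alice account. The following is a rough sketch under stated assumptions: a scratch directory created with mktemp stands in for /home/alice, and a fixed variable named pid stands in for $$ so that the result is repeatable. Both names are hypothetical.

```shell
# Scratch directory standing in for /home/alice (an assumption).
home=$(mktemp -d)
touch "$home/.hist537" "$home/.profile"

pid=2537                     # fixed stand-in for $$

# Step 8 reduces the arithmetic expression to 537.
suffix=$((pid % 1000))

# Step 10 then sees the pattern .*537 and expands it to the one
# matching filename; .profile does not end in 537, so it is skipped.
match=$(cd "$home" && echo .*$((pid % 1000)))

echo "suffix=$suffix match=$match"
rm -r "$home"
```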

Although this list of steps is fairly straightforward, it is not the whole story. There are still five ways to modify this process: quoting; using command, builtin, or enable; and using the advanced command eval.
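As a brief illustration of the middle group, here is a minimal sketch of how command and builtin change the lookup order in step 11. It assumes bash, and the function deliberately shadows the pwd builtin only for the sake of the demonstration.

```shell
# Functions are looked up first, so this definition hides the builtin.
pwd() { echo "function pwd"; }

via_function=$(pwd)          # step 11 finds the function
via_builtin=$(builtin pwd)   # looks only at builtins
via_command=$(command pwd)   # skips functions; tries builtins, then $PATH

unset -f pwd                 # restore the normal lookup

echo "$via_function"
echo "$via_builtin"
echo "$via_command"
```

Both builtin pwd and command pwd print the current directory, because each one steps past the function of the same name.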

Quoting

You can think of quoting as a way of getting the shell to skip some of the 12 steps described earlier. In particular:

  • Single quotes ('') bypass everything from step 1 through step 10, including aliasing. All characters inside a pair of single quotes are untouched. You can’t have single quotes inside single quotes, even if you precede them with backslashes.

  • Double quotes ("") bypass steps 1 through 4, plus steps 9 and 10. That is, they ignore pipe characters, aliases, tilde substitution, wildcard expansion, and splitting into words via delimiters (e.g., blanks) inside the double quotes. Single quotes inside double quotes have no effect. But double quotes do allow parameter substitution, command substitution, and arithmetic expression evaluation. You can include a double quote inside a double-quoted string by preceding it with a backslash (\). You must also backslash-escape $, ` (the archaic command substitution delimiter), and \ itself.

Table C-1 has simple examples to show how these work; they assume the statement person=hatter was run and user alice’s home directory is /home/alice.

Table C-1. Examples of using single and double quotes

Expression     Value
$person        hatter
"$person"      hatter
\$person       $person
'$person'      $person
"'$person'"    'hatter'
~alice         /home/alice
"~alice"       ~alice
'~alice'       ~alice

If you are wondering whether to use single or double quotes in a particular shell programming situation, it is safest to use single quotes unless you specifically need parameter, command, or arithmetic substitution.
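The rows of Table C-1 are easy to check for yourself. This is a sketch rather than a transcript: because an alice account may not exist on your system, it uses a plain ~ with $HOME set to /home/alice inside a subshell (an assumption made only for the demonstration) in place of ~alice.

```shell
person=hatter

plain=$(echo $person)          # hatter
double=$(echo "$person")       # hatter
escaped=$(echo \$person)       # $person
single=$(echo '$person')       # $person
nested=$(echo "'$person'")     # 'hatter'

# Tilde rows: ~ expands to $HOME only when it is unquoted.
tilde_plain=$(HOME=/home/alice; echo ~)    # /home/alice
tilde_double=$(echo "~")                   # ~
tilde_single=$(echo '~')                   # ~

printf '%s\n' "$plain" "$double" "$escaped" "$single" "$nested" \
    "$tilde_plain" "$tilde_double" "$tilde_single"
```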

eval

We have seen that quoting lets you skip steps in command-line processing. Then there’s the eval command, which lets you go through the process again. Performing command-line processing twice may seem strange, but it’s actually very powerful: it lets you write scripts that create command strings on the fly and then pass them to the shell for execution. This means that you can give scripts “intelligence” to modify their own behavior as they are running.

The eval statement tells the shell to take eval’s arguments and run them through the command-line processing steps all over again. To help you understand the implications of eval, we’ll start with a trivial example and work our way up to a situation in which we’re constructing and running commands on the fly.

eval ls passes the string “ls” to the shell to execute; the shell prints a list of files in the current directory. This is very simple—there is nothing about the string “ls” that needs to be sent through the command-processing steps twice. But consider this:

listpage="ls | more"
$listpage

Instead of producing a paginated file listing, the shell will treat | and more as arguments to ls, and ls will complain that no files of those names exist. Why? Because the pipe character only appears in step 6, when the shell substitutes the variable’s value, which is after it has already looked for pipe characters. The result of that substitution isn’t split into words until step 9, and by then | is just another word. So ls tries to find files called | and more in the current directory!

Now consider eval $listpage instead of just $listpage. When the shell gets to the last step, it will run the command eval with arguments ls, |, and more. This causes the shell to go back to step 1 with a line that consists of these arguments. This time it recognizes | as a pipe character while splitting the line into tokens, and divides the line into two commands, ls and more. Each command is processed in the normal (and in both cases trivial) way. The result is a paginated list of the files in your current directory.
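Because more is interactive, the difference is easier to see with a pipeline whose output can be captured. The following sketch uses a made-up variable named pipeline holding echo piped into tr; it is a stand-in for the listpage example, not a replacement for it.

```shell
pipeline="echo hello | tr a-z A-Z"

# Without eval: the | arrives too late to be a pipe, so echo simply
# receives |, tr, a-z, and A-Z as ordinary arguments.
without_eval=$($pipeline)

# With eval: the string goes through command-line processing again,
# and this time | is seen as a pipe.
with_eval=$(eval "$pipeline")

echo "without eval: $without_eval"
echo "with eval:    $with_eval"
```

The first line printed ends with the literal text hello | tr a-z A-Z, while the second prints HELLO. Quoting the argument to eval, as done here, keeps the string intact until eval itself reprocesses it.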

Now you may start to see how powerful eval can be. It is an advanced feature that requires considerable programming cleverness to be used most effectively. It even has a bit of the flavor of artificial intelligence, in that it enables you to write programs that can “write” and execute other programs. You probably won’t use eval for everyday shell programming, but it’s worth taking the time to understand what it can do.
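One small case of building a command on the fly is looking up a variable whose name is itself stored in another variable. This is a minimal sketch with hypothetical names (varname, value, indirect), assuming bash. Keep in mind that eval runs whatever string it is handed, so it should only ever see text your script controls.

```shell
person=hatter
varname=person

# First pass: inside double quotes, \$ stays a literal $ and $varname
# becomes person, so eval receives the string  value=$person
# Second pass: eval processes that string and assigns hatter.
eval "value=\$$varname"

# bash also offers indirect expansion, which avoids eval for this case.
indirect=${!varname}

echo "$value $indirect"
```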
