Chapter 2. Standard Output

No software is worth anything if there is no output of some sort, but I/O has long been one of the nastier areas of computing. If you’re ancient, you remember the days when most of the work involved in running a program was setting up the program’s input and output. Some of the problems have gone away; for example, you no longer need to get operators to mount tapes on a tape drive (at least, not on any laptop or desktop system that we’ve seen!). But many of the difficulties are still with us.

One problem is that there are many different types of output. Writing something on the screen is different from writing something in a file—at least, it sure seems different. Writing something in a file also seems different from writing it on a tape, or in flash memory, or on some other kind of device. And what if you want the output from one program to go directly into another program? Should software developers be tasked with writing code to handle all sorts of output devices, even ones that haven’t been invented yet? That’s certainly inconvenient. Should users have to know how to connect the programs they want to run to different kinds of devices? That’s not a very good idea, either.

One of the most important ideas behind the Unix operating system was that everything looked like a file (an ordered sequence of bytes). The operating system was responsible for this magic. It didn’t matter whether you were writing to a file on the disk, the terminal, a tape drive, a memory stick, or something else; your program only needed to know how to write to a file, and the operating system would take it from there. That approach greatly simplified the problem.

The next question was simply, “Which file?” How does a program know whether to write to the file that represents a terminal window, a file on the disk, or some other kind of file? Simple: that’s something that can be left to the shell.

When you run a program, you still have to connect it to output files and input files (which we’ll explore in the next chapter). That task doesn’t go away, but the shell makes it trivially easy. A command as simple as:

dosomething < inputfile > outputfile

reads its input from inputfile and sends its output to outputfile. If you omit the > outputfile, the output goes to your terminal window. If you omit the < inputfile, the program takes its input from the keyboard. The program literally doesn’t know where its output is going, or where its input is coming from. You can send the output anywhere you want (including to another program) by using bash’s redirection facilities.

But that’s just the start. In this chapter, we’ll look at ways to generate output, and the shell’s methods for sending that output to different places.

2.1 Writing Output to the Terminal/Window

Problem

You want some simple output from your shell commands.

Solution

Use the echo builtin command. All the parameters on the command line are printed to the screen. For example:

echo Please wait.

produces:

Please wait.

as we see in this simple session where we typed the command at the bash prompt (the $ character):

$ echo Please wait.
Please wait.
$

Discussion

The echo command is one of the simplest of all bash commands. It prints the arguments of the command line to the screen. But there are a few points to keep in mind. First, the shell is parsing the arguments on the echo command line (like it does for every other command line). This means that it does all its substitutions, wildcard matching, and other things before handing the arguments off to echo. Second, since they are parsed as arguments, the spacing between arguments is ignored. For example:

$ echo this    was     very    widely    spaced
this was very widely spaced
$

Normally the fact that the shell is very forgiving about whitespace between arguments is a helpful feature. Here, with echo, it’s a bit disconcerting (see Recipe 2.2 for tips on preserving whitespace in output and Recipe 13.15 for tips on trimming it from your data).
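
To see that parsing in action, try echo with a wildcard. In a directory containing just the two files a.txt and b.txt (a setup invented for this illustration), the shell expands the pattern before echo ever runs; echo just prints the resulting words:

$ echo *.txt
a.txt b.txt
$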

2.2 Writing Output but Preserving Spacing

Problem

You want the output to preserve your spacing.

Solution

Enclose the string in quotes. The previous example, but with quotes added, will preserve our spacing:

$ echo "this was    very    widely    spaced"
this    was    very    widely    spaced
$

or:

$ echo 'this  was  very   widely   spaced'
this  was  very   widely   spaced
$

Discussion

Since the words are enclosed in quotes, they form a single argument to the echo command. That argument is a string, and the shell doesn’t need to interfere with the contents of the string. In fact, by using single quotes ('') you explicitly tell the shell not to interfere with the string at all. If you use double quotes (""), some shell substitutions do take place (variable, arithmetic, and tilde expansions and command substitutions), but since we have none in this example, the shell has nothing to change. When in doubt, use the single quotes.
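
Here is a quick way to see the difference (the output shown assumes your home directory is /home/you):

$ echo '$HOME'
$HOME
$ echo "$HOME"
/home/you
$

The single quotes suppress the variable expansion; the double quotes allow it to happen before echo is invoked.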

2.3 Writing Output with More Formatting Control

Problem

You want more control over the formatting and placement of output.

Solution

Use the printf builtin command.

For example:

$ printf '%s = %d\n' Lines $LINES
Lines = 24
$

or:

$ printf '%-10.10s = %4.2f\n' 'Gigahertz' 1.92735
Gigahertz  = 1.93
$

Discussion

The printf builtin command behaves like the C language library call, where the first argument is the format control string and the successive arguments are formatted according to the format specifications (%).

The numbers between the % and the format type (s or f in our example) provide additional formatting details. For the floating-point type (f), the first number (4 in the 4.2 specifier) is the width of the entire field. The second number (2) is how many digits should be printed to the right of the decimal point. Note that it rounds the answer.
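
Here is that rounding in action, along with zero padding, which you get from a leading 0 on the width (standard printf behavior):

$ printf '%4.1f\n' 1.97
 2.0
$ printf '%06.2f\n' 1.97
001.97
$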

For a string, the first number is the minimum field width, and the second (after the dot) is the maximum number of characters to be printed. The string will be truncated (if longer than the maximum) or blank padded (if shorter than the minimum) as needed. When the max and min specifiers are the same, the string is guaranteed to be exactly that length. The negative sign on the specifier means to left-align the string (within its field width). Without the minus sign, the string would right-justify, thus:

$ printf '%10.10s = %4.2f\n' 'Gigahertz' 1.92735
 Gigahertz = 1.93
$

The string argument can either be quoted or unquoted. Use quotes if you need to preserve embedded spacing (there were no spaces needed in our one-word strings), or if you need to escape the special meaning of any special characters in the string (again, our example had none). It’s a good idea to be in the habit of quoting any string that you pass to printf, so that you don’t forget the quotes when you need them.
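
One more printf convenience worth knowing: if you supply more arguments than the format string has specifiers, printf reuses the format until the arguments run out, which makes a handy one-per-line printer:

$ printf '%s\n' first second third
first
second
third
$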

2.4 Writing Output Without the Newline

Problem

You want to produce some output without the default newline that echo provides.

Solution

Using printf it’s easy—just leave off the ending \n in your format string:

$ printf "%s %s" next prompt
next prompt$

With echo, use the -n option:

$ echo -n prompt
prompt$

Discussion

Since there was no newline at the end of the printf format string (the first argument), the prompt character ($) appears right where the printf left off. This feature is much more useful in shell scripts where you may want to do partial output across several statements before completing the line, or where you want to display a prompt to the user before reading input.
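
For instance, a script might display a prompt and then read the reply on the same line. A minimal sketch (the variable name is ours, chosen for illustration):

printf 'Enter your name: '
read name
printf 'Hello, %s\n' "$name"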

With the echo command (see Recipe 15.6), there are two ways to eliminate the newline. First, the -n option suppresses the trailing newline. The echo command also has several escape sequences with special meanings similar to those in C language strings (e.g., \n for newline). To use these escape sequences, you must invoke echo with the -e option. One of echo’s escape sequences is \c, which doesn’t print a character, but rather inhibits printing the ending newline. Thus, here’s a third solution:

$ echo -e 'hi\c'
hi$

Because of the powerful and flexible formatting that printf provides, and because it is a builtin with very little overhead to invoke (unlike in other shells or older versions of bash, where printf was a standalone executable), we will use printf for many of our examples throughout the book.

2.5 Saving Output from a Command

Problem

You want to keep the output from a command by putting it in a file.

Solution

Use the > symbol to tell the shell to redirect the output into a file. For example:

$ echo fill it up
fill it up
$ echo fill it up > file.txt
$

Just to be sure, let’s look at what is inside file.txt to see if it captured our output:

$ cat file.txt
fill it up
$

Discussion

The first line of the first part of the example shows an echo command with three arguments that are printed out. The second line uses the > to capture that output into a file named file.txt, which is why no output appears after that echo command.

The second part of the example uses cat to display the contents of the file. We can see that the file contains what echo would have otherwise sent as output.

The cat command gets its name from the longer word concatenation. The cat command concatenates the output from the files listed on its command line, so if you enter cat file1 filetwo anotherfile morefiles the contents of those files will be sent, one after another, to the terminal window. If a large file has been split in half, you can also use cat to glue it back together (i.e., concatenate the two halves) by capturing the output into a third file:

cat first.half second.half > whole.file

So our simple command, cat file.txt, is really just the trivial case of concatenating only one file, with the result sent to the screen. That is to say, while cat is capable of more, its primary use in this example is to dump the contents of a file to the screen.

2.6 Saving Output to Other Files

Problem

You want to save the output with a redirect to elsewhere in the filesystem, not in the current directory.

Solution

Use more of a pathname when you redirect the output:

echo some more data > /tmp/echo.out

or:

echo some more data > ../../over.here

Discussion

The filename that appears after the redirection character (the >) is actually a pathname. If it begins with no other qualifiers, the file will be placed in the current directory.

If that filename begins with a slash (/) then it is an absolute pathname, and output will be placed where it specifies in the filesystem hierarchy (i.e., tree), beginning at the root (provided all the intermediary directories exist and have permissions that allow you to traverse them). We used /tmp since it is a well-known, universally available scratch directory on virtually all Unix systems. The shell, in this example, will create the file named echo.out in the /tmp directory.

Our second example, placing the output into ../../over.here, uses a relative pathname, and the .. is the specially named directory inside every directory that refers to the parent directory. So, each reference to .. moves up a level in the filesystem tree (toward the root, not what we usually mean by “up” in a tree). The point here is that we can redirect our output, if we want, into a file that is far away from where we are running the command.

See Also

  • Learning the bash Shell, 3rd Edition, by Cameron Newham (O’Reilly), pages 7–10, for an introduction to files, directories, and the dot notation (i.e., . and ..)

2.7 Saving Output from the ls Command

Problem

You tried to save output from the ls command with a redirect, but when you look at the resulting file, the format is not what you expected.

Solution

Use the -C option on ls when you redirect the output.

Here’s the ls command showing the contents of a directory:

$ ls
a.out cong.txt def.conf  file.txt  more.txt  zebra.list
$

But when we save the output with the > to redirect it to a file, and then show the file contents, we get one file per line, like this:

$ ls > /tmp/save.out
$ cat /tmp/save.out
a.out
cong.txt
def.conf
file.txt
more.txt
zebra.list
$

This time we’ll use the -C option:

$ ls -C > /tmp/save.out
$ cat /tmp/save.out
a.out cong.txt def.conf file.txt more.txt zebra.list
$

Alternatively, if we use the -1 option on ls when we don’t redirect, we get output like this:

$ ls -1
a.out
cong.txt
def.conf
file.txt
more.txt
zebra.list
$

The original attempt at redirection matches this output.

Discussion

Just when you thought that you understood redirection and you tried it on a simple ls command, it didn’t quite work right. What’s going on here?

The shell’s redirection is meant to be transparent to all programs, so programs don’t need special code to make their output redirectable. The shell takes care of it when you use the > to send the output elsewhere. But it turns out that code can be added to a program to figure out when its output is a terminal (see man isatty). Then, the program can behave differently in those two cases—and that’s what ls is doing.

The authors of ls figured that if your output is going to the screen, then you probably want columnar output (the -C option), as screen real estate is limited. But they assumed if you’re redirecting it to a file, then you’ll want one file per line (the -1 option) since there are more interesting things you can do (i.e., other processing) that is easier if each filename is on a line by itself.
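
You don’t even need a file to see this behavior; piping the output through cat also makes ls see a non-terminal, so it switches to one filename per line:

$ ls | cat
a.out
cong.txt
def.conf
file.txt
more.txt
zebra.list
$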

2.8 Sending Output and Error Messages to Different Files

Problem

You are expecting output from a program, but you don’t want it to get littered with error messages. You’d like to save your error messages, but it’s harder to find them mixed among the expected output.

Solution

Redirect output and error messages to different files:

myprogram 1> messages.out 2> message.err

or more commonly:

myprogram > messages.out 2> message.err

Discussion

This example shows two different output files created by the shell. The first, messages.out, will get all the output from the hypothetical myprogram redirected into it. Any error messages from myprogram will be redirected into message.err.

In the constructs 1> and 2> the number is the file descriptor. 1 is standard output (STDOUT) and 2 is standard error (STDERR). Numbering starts at 0, for standard input (STDIN). When no number is specified, STDOUT is assumed. For more information on file descriptors and the difference between STDOUT and STDERR, see Recipe 2.19.
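
You don’t need a special program to try this out; a brace group that writes one line to each stream (a toy stand-in for myprogram) shows the two redirections doing their separate jobs:

$ { echo 'normal output'; echo 'an error' >&2; } > messages.out 2> message.err
$ cat messages.out
normal output
$ cat message.err
an error
$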

2.9 Sending Output and Error Messages to the Same File

Problem

Using redirection, you can redirect output or error messages to separate files, but how do you capture all the output and error messages to a single file?

Solution

Use the shell syntax to redirect standard error messages to the same place as standard output.

Preferred:

both &> outfile

or:

both >& outfile

or older and slightly more verbose (but also more portable):

both > outfile 2>&1

where both is just our (imaginary) program that is going to generate output to both STDERR and STDOUT.

Discussion

&> and >& are shortcuts that simply send both STDOUT and STDERR to the same place—exactly what we want to do.

In the third example, the 1 appears to be used as the target of the redirection, but the >& says to interpret the 1 as a file descriptor instead of a filename. In fact, the 2>&1 is a single entity—no spaces allowed—indicating that standard error (2) will be redirected (>) to a file descriptor (&) that follows (1). The 2>& all has to appear together without spaces; otherwise the 2 would look just like another argument, and the & actually means something completely different when it appears by itself. (It has to do with running the command in the background.)

It may help to think of all redirection operators as taking a leading number (e.g., 2>), but that the default number for > is 1, the standard output file descriptor.

You could also do the redirection in the other order, though it is slightly less readable, and redirect standard output to the same place to which you have already redirected standard error (in fact you must do it this way if you are using a pipe; see Recipe 2.15):

both 2> outfile 1>&2

The 1 indicates standard output and the 2 standard error. We could have written just >&2 for that last redirection, since 1 is the default for >, but we find it more readable to write the number explicitly when redirecting file descriptors.

Note

Note the order of the contents of the output file. Sometimes the error messages may appear sooner in the file than they do on the screen. That has to do with the unbuffered nature of standard error, and the effect becomes more pronounced when writing to a file instead of the screen.

2.10 Appending Rather than Clobbering Output

Problem

Each time you redirect your output, it creates that output file anew. What if you want to redirect output a second (or third, or…) time, and don’t want to clobber the previous output?

Solution

The double greater-than sign (>>) is a bash redirector that means append the output:

$ ls > /tmp/ls.out
$ cd ../elsewhere
$ ls >> /tmp/ls.out
$ cd ../anotherdir
$ ls >> /tmp/ls.out
$

Discussion

The first line includes a redirect that truncates the file if it exists and starts with a clean (empty) file, filling it with the output from the ls command.

The second and third invocations of ls use the double greater-than sign (>>) to indicate appending to, rather than replacing the contents of, the output file.

If you want to have error messages (i.e., STDERR) included in the redirection, redirect STDERR to the same place after the append redirection, like this:

ls >> /tmp/ls.out 2>&1

As of bash version 4 you can combine both of those redirections in one:

ls &>> /tmp/ls.out

which will redirect both STDERR and STDOUT and append them to the specified file. Just remember that the ampersand must come first and no spacing is allowed between the three characters.

2.11 Using Just the Beginning or End of a File

Problem

You need to display or use just the beginning or end of a file.

Solution

Use the head or tail command. By default, head will output the first 10 lines and tail will output the last 10 lines of the given file. If more than one file is given, the appropriate lines from each of them are output. Use the -number switch (e.g., -5) to change the number of lines. tail also has the -f and -F switches, which follow the end of the file as it is written to, and it has an interesting + switch that we cover in Recipe 2.12.
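
For example (the filename and its contents are hypothetical):

head -3 logfile.txt    # show the first 3 lines
tail -5 logfile.txt    # show the last 5 lines
tail -f logfile.txt    # keep printing new lines as they are appended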

Discussion

head and tail, along with cat, grep, sort, cut, and uniq, are some of the most commonly used Unix text processing tools out there. If you aren’t already familiar with them, you’ll soon wonder how you ever got along without them.

2.12 Skipping a Header in a File

Problem

You have a file with one or more header lines and you need to process just the data, and skip the header.

Solution

Use the tail command with a special argument. For example, to skip the first line of a file:

$ tail -n +2 lines
Line 2
Line 3
Line 4
Line 5
$

Discussion

An argument to tail, of the format -n number (or just -number), will specify a line offset relative to the end of the file. So, tail -n 10 file shows the last 10 lines of file, which also happens to be the default if you don’t specify anything. Specifying a number starting with a plus sign (+) indicates an offset relative to the top of the file. Thus, tail -n +1 file gives you the entire file, tail -n +2 skips the first line, and so on.
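
The two commands combine nicely to pull out a range of lines. A sketch: to see lines 3 through 7 of a file, start at line 3 with tail, then let head take the first 5 lines of what remains:

tail -n +3 file | head -n 5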

2.13 Throwing Output Away

Problem

Sometimes you don’t want to save the output into a file; in fact, sometimes you don’t even want to see it at all.

Solution

Redirect the output to /dev/null as shown in these examples:

find / -name myfile -print 2> /dev/null

or:

noisy > /dev/null 2>&1

Discussion

You could redirect the unwanted output into a file, then remove the file when you’re done. But there is an easier way. Unix and Linux systems have a special device that isn’t real hardware at all, just a bit bucket where we can dump unwanted data. It’s called /dev/null and is perfect for these situations. Any data written there is simply thrown away, so it takes up no disk space. Redirection makes it easy.

In the first example, only the output going to standard error is thrown away. In the second example, both standard output and standard error are discarded.

In rare cases, you may find yourself in a situation where /dev is on a read-only filesystem (for example, certain information security appliances), in which case you are stuck with the first suggestion of writing to a file and then removing it.

2.14 Saving or Grouping Output from Several Commands

Problem

You want to capture the output with a redirect, but you’re typing several commands on one line:

pwd; ls; cd ../elsewhere; pwd; ls > /tmp/all.out

The final redirect applies only to the last command, the last ls on that line. All the other output appears on the screen (i.e., does not get redirected).

Solution

Use braces ({ }) to group these commands together; then redirection applies to the output from all commands in the group. For example:

{ pwd; ls; cd ../elsewhere; pwd; ls; } > /tmp/all.out

Warning

There are two very subtle catches here. The braces are actually reserved words, so they must be surrounded by whitespace. Also, the trailing semicolon is required before the closing brace.

Alternatively, you could use parentheses, (), to tell bash to run the commands in a subshell, then redirect the output of the entire subshell’s execution. For example:

(pwd; ls; cd ../elsewhere; pwd; ls) > /tmp/all.out

Discussion

While these two solutions look very similar, there are two important differences. The first difference is syntactic, the second semantic. Syntactically, the braces need to have whitespace around them, and the last command inside the list must terminate with a semicolon. That’s not required when you use parentheses. The bigger difference, though, is semantic—what these constructs mean. The braces are just a way to group several commands together, more like a shorthand for our redirecting, so that we don’t have to redirect each command separately. Commands enclosed in parentheses, however, run in another instance of the shell, a child of the current shell called a subshell.

The subshell is almost identical to the current shell’s environment—i.e., variables, including $PATH, are all the same, but traps are handled differently (for more on traps, see Recipe 10.6). Now here is the big difference in using the subshell approach: because a subshell is used to execute the cd commands, when the subshell exits, your main shell remains where it started. That is, its current directory hasn’t moved, and its variables haven’t changed.

With the braces used for grouping, you end up in the new directory (../elsewhere in our example). Any other changes that you make (variable assignments, for example) will be made to your current shell instance. While both approaches result in the same output, they leave you in very different places.
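
Here is the difference made visible (the directory names are just for illustration):

$ pwd
/home/you
$ (cd /tmp; pwd)
/tmp
$ pwd
/home/you
$ { cd /tmp; pwd; }
/tmp
$ pwd
/tmp
$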

One interesting thing you can do with braces is form more concise branching blocks (Recipe 6.2). You can shorten this:

if [ $result = 1 ]; then
    echo "Result is 1; excellent."
    exit 0
else
    echo "Uh-oh, ummm, RUN AWAY! "
    exit 120
fi

into this:

[ $result = 1 ] \
  && { echo "Result is 1; excellent." ; exit 0; } \
  || { echo "Uh-oh, ummm, RUN AWAY! " ; exit 120; }

How you write it depends on your style and what you think is readable, but we recommend the first form because it is clearer to a wider audience.

2.15 Connecting Two Programs by Using Output as Input

Problem

You want to take the output from one program and use it as the input of another program.

Solution

You could redirect the output from the first program into a temporary file, then use that file as input to the second program. For example:

$ cat one.file another.file > /tmp/cat.out
$ sort < /tmp/cat.out
...
$ rm /tmp/cat.out
$

Or you could do all of that in one step, sending the output directly to the next program, by using the pipe symbol (|) to connect them. For example:

cat one.file another.file | sort

You can also link a sequence of several commands together by using multiple pipes:

cat my* | tr 'a-z' 'A-Z' | sort | uniq | awk -f transform.awk | wc

Discussion

Using the pipe symbol means we don’t have to invent a temporary filename, remember it, and remember to delete it.

Programs like sort can take input from standard input (redirected via the < symbol), but they can also take input as a filename. So, you can do this:

sort /tmp/cat.out

rather than redirecting the input into sort:

sort < /tmp/cat.out

That behavior (of using a filename if supplied, and if not, of using standard input) is a typical Unix/Linux characteristic, and a useful model to follow so that commands can be connected one to another via the pipe mechanism. Such programs are called filters, and if you write your programs and shell scripts that way, they will be more useful to you and to those with whom you share your work.
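
A filter of your own can get that behavior almost for free by leaning on cat. Here is a minimal sketch (the script name and its transformation are invented for the example):

#!/usr/bin/env bash
# shout: read the named files, or standard input if none given,
# and pass everything along in uppercase
cat "$@" | tr 'a-z' 'A-Z'

Because cat reads standard input when given no filenames, passing it "$@" (the script’s arguments, which may be empty) makes shout usable both ways: shout file1 file2 works, and so does cat file1 | shout.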

Feel free to be amazed at the powerful simplicity of the pipe mechanism. You can even think of the pipe as a rudimentary parallel processing mechanism. You have two commands (programs) running in parallel, sharing data—the output of one as the input to the next. They don’t have to run sequentially (where the first runs to completion before the second one starts); the second one can get started as soon as data is available from the first.

Be aware, however, that commands run this way (i.e., connected by pipes) are run in separate processes. While such a subtlety can often be ignored, there are a few times when the implications of this are important. We’ll discuss that in Recipe 19.8.

Also consider a command such as svn -v log | less. If less exits before Subversion has finished sending data, you’ll get an error like svn: Write error: Broken pipe. While it isn’t pretty, it also isn’t harmful. It happens all the time when you pipe a voluminous amount of data into a program like less—you often want to quit once you’ve found what you’re looking for, even if there is more data coming down the pipe.

2.16 Saving a Copy of Output Even While Using It as Input

Problem

You want to debug a long sequence of piped I/O, such as:

cat my* | tr 'a-z' 'A-Z' | uniq | awk -f transform.awk | wc

How can you see what is happening between uniq and awk without disrupting the pipe?

Solution

The solution to these problems is to use what plumbers call a T-joint in the pipes. For bash, that means using the tee command to split the output into two identical streams, one that is written to a file and the other that is written to standard output, so as to continue the sending of data along the pipes.

For this example where we’d like to debug a long string of pipes, we insert the tee command between uniq and awk:

... uniq | tee /tmp/x.x | awk -f transform.awk ...

Discussion

The tee command writes the output to the filename(s) specified as its parameter and also writes that same output to standard out. In our example, it sends a copy to /tmp/x.x and also sends the same data to awk, the command to which the output of tee is connected via the pipe symbol.

Don’t worry about what each different piece of the command line is doing in these examples; we just want to illustrate how tee can be used in any sequence of commands.

Let’s back up just a bit and start with a simpler command line. Suppose you’d just like to save the output from a long-running command for later reference, while at the same time seeing it on the screen. After all, a command like:

find / -name '*.c' -print | less

could find a lot of C source files, so the output will likely scroll off the window. Using more or less will let you look at the output in manageable pieces, but once completed they don’t let you go back and look at that output without rerunning the command. Sure, you could run the command and save it to a file:

find / -name '*.c' -print > /tmp/all.my.sources

but then you have to wait for it to complete before you can see the contents of the file. (OK, we know about tail -f, but that’s just getting off-topic here.) The tee command can be used instead of the simple redirection of standard output:

find / -name '*.c' -print | tee /tmp/all.my.sources

In this example, since the output of tee isn’t redirected anywhere, it will print to the screen. But the copy that is diverted into a file will also be there for later use (e.g., cat /tmp/all.my.sources).

Notice, too, that in these examples we did not redirect standard error at all. This means that any errors, like you might expect from find, will be printed to the screen but won’t show up in the tee file. We could add a 2>&1 to the find command:

find / -name '*.c' -print 2>&1 | tee /tmp/all.my.sources

to include the error output in the tee file. It won’t be neatly separated, but it will be captured.

2.17 Connecting Two Programs by Using Output as Arguments

Problem

What if one of the programs to which you would like to connect with a pipe doesn’t work that way? For example, you can remove files with the rm command, specifying the files to be removed as parameters to the command:

rm my.java your.c their.*

But rm doesn’t read from standard input, so you can’t do something like:

find . -name '*.c' | rm

Since rm only takes its filenames as arguments or parameters on the command line, how can we get the output of a previously run command (e.g., echo or ls) onto the command line?

Solution

Use the command substitution feature of bash:

rm $(find . -name '*.class')

You can also use the xargs command; see the discussion in Recipe 15.13.

Discussion

The $() encloses a command that is run in a subshell. The output from that command is substituted in place of the $() phrase. Newlines cause the output to become several parameters on the command line, which is often useful but may sometimes be surprising.

The earlier shell syntax was to use backquotes (``) instead of $() for enclosing the inner command. The $() syntax is preferred over the older `` syntax because it is easier to nest and arguably easier to read. However, you may see `` more often than $(), especially in older scripts or from those who grew up with the original Bourne or C shells.
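
Nesting is where the newer syntax really shows its worth. Compare these two ways of printing just the name of the current directory (the output assumes the directory is called somedir):

$ echo $(basename $(pwd))
somedir
$ echo `basename \`pwd\``
somedir
$

The backquote version needs backslashes to mark the inner pair, and that gets confusing quickly as the nesting deepens.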

In our example, the output from find—typically a list of names—will become the arguments to the rm command.

Warning

Be very careful when doing something like this because rm is very unforgiving. If your find command finds more than you expect, rm will remove it with no recourse. This is not Windows; you cannot recover deleted files from the recycle bin. You can mitigate the danger with rm -i, which will prompt you to verify each deletion. That’s OK on a small number of files, but interminable on a large set.

One way to use such a mechanism in bash with greater safety is to run that inner command first by itself. When you can see that you are getting the results that you want, only then do you use it in the command with $().

For example:

$ find . -name '*.class'
First.class
Other.class
$ rm $(find . -name '*.class')
$

We’ll see in an upcoming recipe how this can be made even more foolproof by using !! instead of retyping the find command (see Recipe 18.2).

2.18 Using Multiple Redirects on One Line

Problem

You want to redirect output to several different places.

Solution

Use redirection with file numbers to open all the files that you want to use. For example:

divert 3> file.three 4> file.four 5> file.five 6> else.where

where divert might be a shell script with various commands whose output you want to send to different places. For example, you might write divert to contain lines like this: echo option $OPTSTR >&5. That is, your divert shell script could direct its output to various different descriptors, which the invoking program can send to different destinations.

Similarly, if divert was a C program executable, you could actually write to descriptors 3, 4, 5, and 6 without any need for open() calls.
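
Such a divert script might look like this sketch (the descriptor numbers match the invocation above; the messages are invented for illustration):

#!/usr/bin/env bash
# divert: write different kinds of output to descriptors
# that the caller has already opened for us
echo 'normal results'              # STDOUT, as usual
echo 'data for file.three' >&3
echo 'data for file.four'  >&4
echo "option $OPTSTR"      >&5
echo 'everything else'     >&6

The script itself never opens a file; it simply trusts that whoever invoked it has already connected descriptors 3 through 6 somewhere sensible.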

Discussion

In Recipe 2.8 we explained that each file descriptor is indicated by a number, starting at zero: standard input is 0, standard output is 1, and standard error is 2. If no number is given, 1 is assumed. That means that you could redirect standard output with the slightly more verbose 1> (rather than a simple >) followed by a filename, but there’s no need; the shorthand > is fine. It also means that you can have the shell open up any number of arbitrary file descriptors and have them set to write various files so that the program that the shell then invokes from the command line can use these opened file descriptors without further ado.

While we don’t recommend this technique because it’s fragile and more complicated than it needs to be, it is intriguing.

2.19 Saving Output When Redirect Doesn’t Seem to Work

Problem

You tried using > but some (or all) of the output still appears on the screen.

For example, the compiler was producing these error messages:

$ gcc bad.c
bad.c: In function `main':
bad.c:3: error: `bad' undeclared (first use in this function)
bad.c:3: error: (Each undeclared identifier is reported only once
bad.c:3: error: for each function it appears in.)
bad.c:3: error: parse error before "c"
$

You wanted to capture those messages, so you tried redirecting the output:

$ gcc bad.c > save.it
bad.c: In function `main':
bad.c:3: error: `bad' undeclared (first use in this function)
bad.c:3: error: (Each undeclared identifier is reported only once
bad.c:3: error: for each function it appears in.)
bad.c:3: error: parse error before "c"
$

However, it doesn’t seem to have redirected anything. In fact, when you examine the file into which you were directing the output, that file is empty (zero bytes long):

$ ls -l save.it
-rw-r--r-- 1 albing users 0 2005-11-13 15:30 save.it
$ cat save.it
$

Solution

Redirect the error output, as follows:

gcc bad.c 2> save.it

The contents of save.it are now the error messages that you saw before.

Discussion

So what’s going on here? Every process in Unix and Linux typically starts out with three open file descriptors: one for input called standard input (STDIN), one for output called standard output (STDOUT), and one for error messages called standard error (STDERR). It is really up to the programmer who writes any particular program to stick to these conventions and write error messages to standard error and to the normally expected output to standard out, so there is no guarantee that every error message that you ever get will go to standard error. But most of the long-established utilities are well behaved this way. That is why these compiler messages are not being diverted with a simple > redirect; it only redirects standard output, not standard error.

As mentioned in the previous recipe, each file descriptor is indicated by a number, starting at zero. Standard input is 0, output is 1, and error is 2. That means that you could redirect standard output with the slightly more verbose: 1> (rather than a simple >) followed by a filename, but there’s no need. The shorthand > is fine. To redirect standard error, use 2>.

One important difference between standard output and standard error is that standard output is buffered but standard error is unbuffered; that is, every character is written individually, and they aren’t collected together and written as a bunch. This means that you see the error messages right away and that there is less chance of them being dropped when a fault occurs, but the cost is one of efficiency. It’s not that standard output is unreliable, but in error situations (e.g., when a program dies unexpectedly), the buffered output may not have made it to the screen before the program stops executing. That’s why standard error is unbuffered: to be sure the message gets written. By contrast, with standard output, only when the buffer is full (or when the file is closed) does the output actually get written. It’s more efficient for the more frequently used output, but efficiency isn’t as important when an error is being reported.

What if you want to see the output as you are saving it? The tee command we discussed in Recipe 2.16 seems just the thing:

gcc bad.c 2>&1 | tee save.it

This will take standard error and redirect it to standard out, piping them both into tee. The tee command will write its input to both the file (save.it) and tee’s standard out, which will go to your screen since it isn’t otherwise redirected.

This is a special case of redirecting because normally the order of the redirections is important. Compare these two commands:

somecmd >my.file 2>&1
somecmd 2>&1 >my.file

In the first case, standard output is redirected to a file (my.file), and then standard error is redirected to the same place as standard out. All output will appear in my.file.

But that is not the case with the second command. In the second command, standard error is redirected to standard output (which at that point is connected to the screen), after which standard output is redirected to my.file. Thus, only standard output messages will be put in the file, and errors will still show on the screen.

However, this ordering had to be subverted for pipes—you can’t put the second redirect after the pipe symbol, because after the pipe comes the next command. So, bash makes an exception when you write:

somecmd 2>&1 | othercmd

and recognizes that standard output is being piped. It therefore assumes that you want to include standard error in the piping when you write 2>&1 even though its normal ordering wouldn’t work that way.

The other result of this, and of pipe syntax in general, is that it gives us no way to pipe just standard error and not standard output into another command—unless we first swap the file descriptors (see the next recipe).

Note

As of the 4.x versions of bash, there is a shortcut syntax for redirecting both standard output and standard error into a pipe. To redirect both output streams from somecmd into some othercmd, as shown previously, we can now use |& to write:

somecmd |& othercmd

2.20 Swapping STDERR and STDOUT

Problem

You need to swap STDERR and STDOUT so you can send STDOUT to a logfile, but then send STDERR to the screen and to a file using the tee command. But pipes only work with STDOUT.

Solution

Swap STDERR and STDOUT before the pipe redirection using a third file descriptor:

./myscript 3>&1 1>stdout.logfile 2>&3- | tee -a stderr.logfile

Discussion

Whenever you redirect file descriptors, you are duplicating the open descriptor to another descriptor. This gives you a way to swap descriptors, much like how any program swaps two values—by means of a third, temporary holder. Copy A into C, copy B into A, copy C into B, and then you have swapped the values of A and B. For file descriptors, it looks like this:

./myscript 3>&1 1>&2 2>&3

Read the syntax 3>&1 as “give file descriptor 3 the same value as output file descriptor 1.” What happens here is that it duplicates file descriptor 1 (i.e., STDOUT) into file descriptor 3, our temporary holding place. Then it duplicates file descriptor 2 (i.e., STDERR) into STDOUT, and finally duplicates file descriptor 3 into STDERR. The net effect is that the STDERR and STDOUT file descriptors have swapped places.

So far, so good. Now we just change this slightly. Once we’ve copied STDOUT (into file descriptor 3), we are free to redirect STDOUT into the logfile we want to have capture the output of our script or other program. Then we can copy the file descriptor from its temporary holding place (file descriptor 3) into STDERR. Adding the pipe will now work because the pipe connects to the (original) STDOUT. That gets us to the solution shown earlier:

./myscript 3>&1 1>stdout.logfile 2>&3- | tee -a stderr.logfile

Note the trailing - on the 2>&3- term. We do that so that we close file descriptor 3 when we are done with it. That way our program doesn’t have an extra open file descriptor. We are tidying up after ourselves.

We’re also using the -a option to tee to append instead of replace.
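
To convince yourself that the swap works, you can stand in for myscript with a quick shell function that writes one line to each stream (the function and filenames are just for demonstration):

$ myscript() { echo 'this is stdout'; echo 'this is stderr' >&2; }
$ myscript 3>&1 1>stdout.logfile 2>&3- | tee -a stderr.logfile
this is stderr
$ cat stdout.logfile
this is stdout
$ cat stderr.logfile
this is stderr
$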

2.21 Keeping Files Safe from Accidental Overwriting

Problem

You don’t want to delete the contents of a file by mistake. It can be too easy to mistype a filename and find that you’ve redirected output into a file that you meant to save.

Solution

Tell the shell to be more careful, as follows:

set -o noclobber

If you decide you don’t want to be so careful after all, then turn the option off:

set +o noclobber

Discussion

The noclobber option tells bash not to overwrite any existing files when you redirect output. If the file to which you redirect output doesn’t (yet) exist, everything works as normal, with bash creating the file as it opens it for output. If the file already exists, however, you will get an error message.

Here it is in action. We begin by turning the option off, just so that your shell is in a known state, regardless of how your particular system may be configured:

$ set +o noclobber
$ echo something > my.file                        (1)
$ echo some more > my.file                        (2)
$ set -o noclobber                                (3)
$ echo something > my.file
bash: my.file: cannot overwrite existing file
$ echo some more >> my.file                       (4)
$

(1) The first time we redirect output to my.file the shell will create it for us.

(2) The second time we redirect, bash overwrites the file (it truncates the file to 0 bytes and starts writing from there).

(3) Then we set the noclobber option, and we get an error message when we try to write to that file.

(4) As we show in the last part of this example, we can append to the file (using >>) just fine.

Warning

Beware! The noclobber option only refers to the shell’s clobbering of a file when redirecting output. It will not stop other file manipulating actions of other programs from clobbering files (see Recipe 14.13):

$ echo useless data > some.file
$ echo important data > other.file
$ set -o noclobber
$ cp some.file other.file
$

Notice that no error occurs; the file is copied over the top of an existing file. That copy is done via the cp command. The shell doesn’t get involved.

If you’re a good and careful typist this may not seem like an important option, but we will look at other recipes where filenames are generated with regular expressions or passed as variables. Those filenames could be used as the filename for output redirection. In such cases, having noclobber set may be an important safety feature for preventing unwanted side effects (whether goofs or malicious actions).

2.22 Clobbering a File on Purpose

Problem

You like to have noclobber set, but every once in a while you do want to clobber a file when you redirect output. Can you override bash’s good intentions, just once?

Solution

Use >| to redirect your output. Even if noclobber is set, bash ignores its setting and overwrites the file.

Consider this example:

$ echo something > my.file
$ set -o noclobber
$ echo some more >| my.file                       1
$ cat my.file
some more
$ echo once again > my.file                       2
bash: my.file: cannot overwrite existing file
$
1

Notice that no error message occurs on the second echo.

2

But on the third echo, when we are no longer using the vertical bar but just the plain > character by itself, the shell warns us and does not clobber the existing file.

Discussion

Using noclobber does not take the place of file permissions. If you don’t have write permission in the directory, you won’t be able to create the file, whether or not you use the >| construct. Similarly, you must have write permission on the file itself to overwrite that existing file, whether or not you use the >|.

So why the vertical bar? According to Chet, “POSIX specifies the >| syntax, which it picked up from ksh88. I’m not sure why Korn chose it. csh does use >!.” To help you remember the csh form, you can think of the exclamation point as being there for emphasis. Its use in English (with the imperative mood) fits that sense of “do it anyway!” when telling bash to overwrite the file if need be. The vi and ex editors use the ! with that same meaning in their write (:w! filename) command. Without a !, the editor will complain if you try to overwrite an existing file. With it, you are telling the editor to “do it!”
