We’ve seen how to redirect input from a file and output to a file. You can also connect two programs together so that the output from one program becomes the input of the next program. Two or more programs connected in this way form a pipe.
To make a pipe, put a vertical bar (|) on the command line between two commands. When a pipe is set up between two commands, the standard output of the command to the left of the pipe symbol becomes the standard input of the command to the right of the pipe symbol. Any two commands can form a pipe as long as the first program writes to standard output and the second program reads from standard input.
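As a minimal sketch of that rule, the pipe below joins two commands: printf stands in for any program that writes to standard output, and wc -l reads standard input and counts the lines it receives. The three sample lines and the variable name line_count are made up for illustration.

```shell
# printf writes three lines to its standard output.
# The pipe makes those lines the standard input of wc -l,
# which counts them and prints the total.
line_count=$(printf 'one\ntwo\nthree\n' | wc -l)

# Some versions of wc pad the number with leading spaces.
echo "lines counted: $line_count"
```

Neither command knows about the other; the shell sets up the connection between them.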
When a program takes its input from another program, performs some operation on that input, and writes the result to the standard output (which may be piped to yet another program), it is referred to as a filter. A common use of filters is to modify output. Just as a common filter culls unwanted items, Unix filters can restructure output.
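To make the idea of a filter concrete, here is a small sketch. The function name to_upper is hypothetical; it wraps the standard tr program, which reads standard input, changes characters, and writes the result to standard output.

```shell
# A hypothetical filter: read standard input, convert lowercase
# letters to uppercase, and write the result to standard output.
to_upper() {
    tr '[:lower:]' '[:upper:]'
}

# printf supplies the input; to_upper modifies it on the way through.
shouted=$(printf 'pear\napple\n' | to_upper)
printf '%s\n' "$shouted"
```

Because to_upper writes to standard output, its result could be piped on to yet another program.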
Most Unix programs can be used to form pipes. Some programs that are commonly used as filters are described in the next sections. Note that these programs aren’t used only as filters or parts of pipes. They’re also useful on their own.
The grep program searches a file or files for lines that have a certain pattern. The syntax is:
grep "pattern" file(s)
The name “grep” is derived from the ed (a Unix line editor) command g/re/p, which means “globally search for a regular expression and print all matching lines containing it.” A regular expression is plain text (a word, for example), special characters used for pattern matching, or a combination of the two. When you learn more about regular expressions, you can use them to specify complex patterns of text.
The simplest use of grep is to look for a pattern consisting of a single word. It can be used in a pipe so that only those lines of the input containing a given string are sent to the standard output. But let’s start with an example reading from files: searching all files in the working directory for a word (say, Unix). We’ll use the wildcard * to quickly give grep all filenames in the directory.
% grep "Unix" *
ch01:Unix is a flexible and powerful operating system
ch01:When the Unix designers started work, little did
ch05:What can we do with Unix?
%
When grep searches multiple files, it shows the filename where it finds each matching line of text. Alternatively, if you don’t give grep a filename to read, it reads its standard input; that’s the way all filter programs work:
% ls -l | grep "Jan"
drwx------ 4 taylor staff 264 Jan 29 22:33 Movies/
drwx------ 2 taylor staff 264 Jan 13 10:02 Music/
drwxr-xr-x 12 root staff 364 Jan 9 20:24 NetInfo/
drwx------ 95 taylor staff 3186 Jan 29 22:44 Pictures/
drwxr-xr-x 3 taylor staff 264 Jan 24 21:24 Public/
%
First, the example runs ls -l to list your directory. The standard output of ls -l is piped to grep, which outputs only lines that contain the string Jan (that is, files or directories that were last modified in January and any other lines that have the pattern “Jan” within). Because the standard output of grep isn’t redirected, those lines go to the Terminal screen.
grep options let you modify the search. Table 6-1 lists some of the options.
Table 6-1. Some grep options
Option | Description |
---|---|
-v | Print all lines that do not match pattern. |
-n | Print the matched line and its line number. |
-l | Print only the names of files with matching lines (lowercase letter “L”). |
-c | Print only the count of matching lines. |
-i | Match either upper- or lowercase. |
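The following sketch tries each of these options on two small files. The filenames notes1 and notes2 and their contents are made up for illustration, and the files are created in a scratch directory so nothing of yours is touched.

```shell
# Create a scratch directory with two small, made-up files.
dir=$(mktemp -d)
printf 'Unix is flexible\nunix in lowercase\nnothing here\n' > "$dir/notes1"
printf 'no match in this file\n' > "$dir/notes2"

grep -v "Unix" "$dir/notes1"                 # lines that do NOT contain "Unix"
grep -n "Unix" "$dir/notes1"                 # matching lines with line numbers
grep -l "Unix" "$dir/notes1" "$dir/notes2"   # only names of files with a match
grep -c "Unix" "$dir/notes1"                 # only the count of matching lines
grep -i "Unix" "$dir/notes1"                 # matches "Unix" and "unix"
```

When you are done experimenting, you can delete the scratch directory with rm -r "$dir".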
Next, let’s use a regular expression that tells grep to find lines with root, followed by zero or more other characters (abbreviated in a regular expression as .*), then followed by Jan:[7]
% ls -l | grep "root.*Jan"
drwxr-xr-x 12 root staff 364 Jan 9 20:24 NetInfo/
%
For more about regular expressions, see the references in Section 10.1.
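If you’d like to experiment with the pattern without depending on what is in your own directory, here is a sketch that feeds grep a fixed listing. The first two lines are copied from the examples above; the third (Shared/) is made up to show a line that contains root but not Jan.

```shell
# Sample lines standing in for the output of ls -l.
listing='drwxr-xr-x 12 root staff 364 Jan 9 20:24 NetInfo/
drwx------ 4 taylor staff 264 Jan 29 22:33 Movies/
drwxr-xr-x 3 root staff 102 Feb 2 09:15 Shared/'

# "root.*Jan" matches only lines where "root" appears and "Jan"
# follows somewhere later on the same line.
matches=$(printf '%s\n' "$listing" | grep "root.*Jan")
printf '%s\n' "$matches"
```

Only the NetInfo/ line is printed: the Movies/ line has Jan but no root, and the Shared/ line has root but no Jan.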
The sort program arranges lines of text alphabetically or numerically. The following example sorts the lines in the food file (from Section 5.1.1) alphabetically. sort doesn’t modify the file itself; it reads the file and writes the sorted text to the standard output.
% sort food
Afghani Cuisine
Bangkok Wok
Big Apple Deli
Isle of Java
Mandalay
Sushi and Sashimi
Sweet Tooth
Tio Pepe's Peppers
By default, sort arranges lines of text alphabetically. Many options control the sorting, and Table 6-2 lists some of them.
Table 6-2. Some sort options
Option | Description |
---|---|
-n | Sort numerically (example: 10 sorts after 2); ignore blanks and tabs. |
-r | Reverse the sorting order. |
-f | Sort upper- and lowercase together. |
+x | Ignore first x fields when sorting. |
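Here is a short sketch of how the numeric, reverse, and case-folding options change the result. The sample data and the variable names are made up for illustration.

```shell
# Numbers that sort differently as text and as numbers.
as_text=$(printf '10\n2\n33\n' | sort)         # compared character by character
as_numbers=$(printf '10\n2\n33\n' | sort -n)   # compared by numeric value
reversed=$(printf '10\n2\n33\n' | sort -nr)    # numeric order, largest first

# Mixed-case words sorted without regard to case.
folded=$(printf 'cherry\nBanana\napple\n' | sort -f)

printf 'text order:\n%s\n' "$as_text"
printf 'numeric order:\n%s\n' "$as_numbers"
printf 'reversed:\n%s\n' "$reversed"
printf 'case folded:\n%s\n' "$folded"
```

In text order, 10 comes before 2 because the character 1 sorts before the character 2; with -n, 2 comes first.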
More than two commands may be linked up into a pipe. Taking a previous pipe example using grep, we can further sort the files modified in January by order of size. The following pipe uses the commands ls, grep, and sort:
% ls -l | grep "Jan" | sort +4n
drwx------ 2 taylor staff 264 Jan 13 10:02 Music/
drwx------ 4 taylor staff 264 Jan 29 22:33 Movies/
drwxr-xr-x 3 taylor staff 264 Jan 24 21:24 Public/
drwxr-xr-x 12 root staff 364 Jan 9 20:24 NetInfo/
drwx------ 95 taylor staff 3186 Jan 29 22:44 Pictures/
%
This pipe sorts all files in your directory modified in January by order of size, and prints them to the Terminal screen. The sort option +4n skips four fields (fields are separated by blanks), then sorts the lines in numeric order. So, the output of ls, filtered by grep, is sorted by the file size (this is the fifth column, starting with 264). Both grep and sort are used here as filters to modify the output of the ls -l command. You could print the listing by piping the sort output to your printer command (either lp, lpr, or atprint).
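The +4n form is an older syntax, and some current versions of sort treat it as a filename rather than an option. As a sketch, assuming a POSIX-style sort, the same sort key can be written with the -k option, which counts fields starting from 1, so skipping four fields becomes -k 5n. The listing lines are copied from the example above, deliberately out of order.

```shell
# Three lines from the ls -l example, not yet sorted.
listing='drwx------ 95 taylor staff 3186 Jan 29 22:44 Pictures/
drwxr-xr-x 12 root staff 364 Jan 9 20:24 NetInfo/
drwx------ 2 taylor staff 264 Jan 13 10:02 Music/'

# -k 5n: the sort key starts at field 5 (the size), compared numerically.
sorted=$(printf '%s\n' "$listing" | sort -k 5n)
printf '%s\n' "$sorted"
```

The lines come out smallest first: Music/ (264), NetInfo/ (364), then Pictures/ (3186).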
The less program, which you saw in Section 2.1.12, can also be used as a filter. A long output normally zips by you on the screen, but if you run text through less, the display stops after each screenful of text.
Let’s assume that you have a long directory listing. (If you want to try this example and need a directory with lots of files, use cd first to change to a system directory such as /bin or /usr/bin.) To make it easier to read the sorted listing, pipe the output through less:
% ls -l | grep "Jan" | sort +4n | less
drwx------ 2 taylor staff 264 Jan 13 10:02 Music/
drwx------ 4 taylor staff 264 Jan 29 22:33 Movies/
drwxr-xr-x 3 taylor staff 264 Jan 24 21:24 Public/
drwxr-xr-x 12 root staff 364 Jan 9 20:24 NetInfo/
.
.
.
drwx------ 95 taylor staff 3186 Jan 29 22:44 Pictures/
:
less reads a screenful of text from the pipe (consisting of lines sorted by order of file size), then prints a colon (:) prompt. At the prompt, you can type a less command to move through the sorted text. less reads more text from the pipe, shows it to you, and saves a copy of what it has read, so you can go backward to reread previous text if you want. (The simpler pager program more can’t back up while reading from a pipe.) When you’re done seeing the sorted text, the q command quits less.
In the following exercises you redirect output, create a simple pipe, and use filters to modify output.
Task | Command |
---|---|
Redirect output to a file. | |
Change all the letters to uppercase. | |
Sort the output of a program. | |
Append sorted output to a file. | |
Display output to the screen. | |
Display long output to the screen. | |
Format and print a file with |
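One possible way to work through the first five tasks is sketched below. The choice of ls / as the program, the scratch file held in the variable out, and the tr command are assumptions made for illustration, not the only right answers. The paging and printing tasks are left out because they need a terminal and a printer.

```shell
# Hypothetical scratch file; any writable filename would do.
out=$(mktemp)

ls / > "$out"                         # redirect output to a file
tr '[:lower:]' '[:upper:]' < "$out"   # change all the letters to uppercase
ls / | sort -r                        # sort the output of a program
ls / | sort -r >> "$out"              # append sorted output to a file
cat "$out"                            # display output to the screen
```

After the append, the file holds the original listing followed by the reverse-sorted copy.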
[7] Note that the regular expression for “zero or more characters,” .*, is different from the corresponding filename wildcard *. See Section 3.2. We can’t cover regular expressions in enough depth here to explain the difference, though more-detailed books do. As a rule of thumb, remember that the first argument to grep is a regular expression; other arguments, if any, are filenames that can use wildcards.