Redirecting Input and Output with File Descriptors

When you execute the history command, the program’s output is displayed on the screen. The history command writes its output to standard output, and standard output is mapped to your terminal screen by default.

You’ve already used the > and >> symbols to redirect a program’s output to a file. For example, you can save the contents of your shell history to a file:

 $ history > commands.txt

You also know that the >> symbol appends text to an existing file. For example, you can append today’s date to the file you just created:

 $ date >> commands.txt

Here’s how this concept works. The > and >> symbols redirect a program’s output by way of file descriptors. A file descriptor is an integer handle that a process uses to refer to a file the kernel has opened on its behalf. On Unix systems, each time a process opens a file, the kernel returns a file descriptor to the process as an integer and keeps track of these descriptors in a table. Three file descriptors are reserved: 0 is mapped to standard input, 1 is mapped to standard output, and 2 is mapped to standard error.

When sending a program’s output to a file using standard output, you can omit the integer, and most people do. That’s why you’ll see history > commands.txt instead of history 1> commands.txt.
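As a quick check of that equivalence, the following sketch writes the same text once with the implicit form and once with the explicit descriptor number, then compares the two files. The temporary directory and the file names implicit.txt and explicit.txt are placeholders chosen for this illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# Implicit descriptor: > means "redirect standard output (descriptor 1)".
echo "hello" > "$workdir/implicit.txt"

# Explicit descriptor: 1> is the same redirection spelled out.
echo "hello" 1> "$workdir/explicit.txt"

# cmp -s exits with status 0 when the two files are identical.
if cmp -s "$workdir/implicit.txt" "$workdir/explicit.txt"; then
  same="yes"
else
  same="no"
fi
echo "identical: $same"
```

Both files should end up with the same content, which is why the shorter form is the one you’ll see most often.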

Sometimes you don’t even care about the output of a program. You know it’ll work and you don’t need to see the response. Unix-based operating systems have a special device designed specifically for this purpose: /dev/null. Instead of redirecting the output to a file, redirect it to /dev/null:

 $ ls > /dev/null

You don’t see any output on the screen because you redirected it, but it also doesn’t end up in a file. It’s completely discarded.
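If you want to confirm that nothing is left over, one possible check is to capture whatever still reaches standard output after the redirect. The variable name visible is just a label for this sketch.

```shell
# Capture anything that still arrives on standard output.
# The listing itself is redirected to /dev/null inside the substitution,
# so there should be nothing left to capture.
visible=$(ls / > /dev/null)

# ${#visible} expands to the number of characters stored in the variable.
echo "captured ${#visible} characters"
```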

This isn’t very practical by itself, but the concept will make more sense when you work with errors shortly. Right now, let’s look at how to process input from files.

Unix-based programs accept input from a few places; one of those places is the standard input stream.

When you use the cat command to display a file, you specify the file to read as an argument to the program:

 $ cat commands.txt

However, cat also accepts data from standard input. That means you can redirect a file into standard input instead of naming the file as an argument. Standard input is file descriptor 0, so the full form of the input redirection operator is 0<, but you can omit the 0 and use < instead, which is what you’ll typically see.

 $ cat < commands.txt

You’ll see the output displayed on the screen, just as if you’d passed the filename as an argument. This particular example isn’t entirely practical, since cat lets you specify the filename directly with one fewer character. But there’s a real difference: when you use <, the shell opens the file and connects its content to cat’s standard input, and if the file doesn’t exist, cat never executes. Try this out:

 $ cat windows.txt
 cat: windows.txt: No such file or directory
 $ cat < windows.txt
 -bash: windows.txt: No such file or directory

In the first attempt, cat itself tries to open the file, so the error message comes from cat. In the second attempt, the shell tries to open the file before it starts cat, so the error message comes from bash and cat never runs.
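One way to see that the command really is skipped is to use a program with a visible side effect. In this sketch, touch stands in for cat because it would create a file if it ran; marker.txt and the temporary directory are placeholder names, and windows.txt is deliberately never created.

```shell
workdir=$(mktemp -d)

# The shell must open windows.txt for the redirect before starting touch.
# The file does not exist, so the shell prints an error and skips touch.
if touch "$workdir/marker.txt" < "$workdir/windows.txt"; then
  echo "redirect succeeded"
else
  echo "redirect failed"
fi

# If touch had run, marker.txt would exist now.
if [ -e "$workdir/marker.txt" ]; then
  result="touch ran"
else
  result="touch never ran"
fi
echo "$result"
```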

It’s common for many command-line utilities to receive data from files via standard input. For example, the command-line tool for the MySQL database lets you run a series of SQL commands using standard input. If you’re setting up a new database for an app, you’ll often see instructions like this:

 $ mysql -u root -p < tables.sql

The tables.sql file contains a bunch of SQL statements to create tables. The program reads them in and executes them. If you’ve ever encountered that kind of command, you now know how it works.

You can send input to programs in a few other ways. Throughout the book, you’ve used cat as a quick method to build files:

 $ cat << 'EOF' > names.txt
 > Homer
 > Marge
 > Bart
 > Lisa
 > Maggie
 > EOF

The << symbol defines a “here-document” or heredoc, a block of text that programs treat as if it were a separate file. The EOF characters in this example specify the end of the heredoc. EOF is a convention, short for “end of file.” You can use any sequence you want.

The data gets passed to the cat program as if it were an input file. The > symbol then redirects cat’s standard output to the specified file, saving the text there.

Using cat with a heredoc is a useful hack for creating a file, and now you have a deeper understanding of how it works.
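Since the delimiter is only a convention, here is a small sketch of the same technique with a different one. NAMES is an arbitrary delimiter picked for this example, and the temporary directory is a placeholder so the sketch doesn’t overwrite any of your files.

```shell
workdir=$(mktemp -d)

# Everything between the two NAMES markers becomes cat's standard input.
# The > redirect then saves cat's standard output in names.txt.
cat << 'NAMES' > "$workdir/names.txt"
Homer
Marge
Bart
NAMES

# Count the lines that were written by feeding the file to wc on stdin.
line_count=$(wc -l < "$workdir/names.txt")
echo "lines written: $line_count"
```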

Another option is a “herestring,” which passes a single string to a program as its standard input. Let’s use the bc program as an example.

bc is a command-line calculator. When you launch it, you can do math interactively:

 $ bc
 bc 1.07.1
 Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006, 2008,
 2012-2017 Free Software Foundation, Inc.
 This is free software with ABSOLUTELY NO WARRANTY.
 For details type `warranty'.
 2 + 2
 4
 quit

If you want to use bc in a noninteractive way, you can use a herestring, which you specify with <<<:

 $ bc <<< "2 + 2"
 4

Herestrings and heredocs can save you a few steps when you’re feeding input into programs.

Now let’s look at another way to get input into programs. In addition to sending a program’s output to a file, you can send it to another program as its input.

Creating Pipelines of Data

One of the main foundations of a Unix-based system is that programs work together like a pipeline; the output of one program can be the input of another. In other words, you send the standard output of one program to the standard input of another.

The less command makes it easy to navigate through a large file by reading it one page at a time, but less also accepts input from standard input. If you have a very long directory listing, you can send the output from the ls -alh command to the less command to paginate the results. Give this a try:

 $ ls -alh /usr/bin | less

So how does this work? The | symbol, or the pipe, connects the programs. The results of the ls command, which get sent to standard output, are “piped” to the less command, which sees the stream as standard input.
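To see that a pipe and a < redirect both arrive on standard input, you could feed the same data to a program both ways and compare the results. This sketch uses wc -l, which counts lines, as the receiving program; the file name and temporary directory are placeholders.

```shell
workdir=$(mktemp -d)
printf 'Homer\nMarge\nBart\n' > "$workdir/names.txt"

# Standard input connected to a file by the shell.
from_file=$(wc -l < "$workdir/names.txt")

# Standard input connected to the standard output of another program.
from_pipe=$(cat "$workdir/names.txt" | wc -l)

if [ "$from_file" -eq "$from_pipe" ]; then
  echo "wc saw the same number of lines both ways"
fi
```

In both cases wc only reads its standard input. It has no idea whether the data came from a file or from another program.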

Remember the head and tail commands you used in Reading the Beginning and End of a File? You can pipe output to these as well.

To see only the first three entries in a long directory listing, try this:

 $ ls -alh /usr/bin | head -n 3
 total 241M
 drwxr-xr-x 2 root root 44K Mar 8 11:56 .
 drwxr-xr-x 10 root root 4.0K Feb 9 18:12 ..

The output of the ls command gets sent to the head command, which displays only the first three lines of output, as specified by the -n argument.

Using this concept, you can use head and tail together to view a single line from output if you know where it’s located. Display only the third line of the directory listing by using the tail command with -n +3 to start at the third line of the output, and then pipe the results to the head command to read only the first line:

 $ ls -alh /usr/bin/ | tail -n +3 | head -n 1
 drwxr-xr-x 10 root root 4.0K Feb 9 18:12 ..
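Directory listings differ from machine to machine, so here is the same tail and head combination applied to predictable input. The five numbered lines are made-up sample data standing in for a long listing.

```shell
# tail -n +3 starts output at line 3; head -n 1 keeps only the first
# line of what remains, which is line 3 of the original input.
third=$(printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' | tail -n +3 | head -n 1)
echo "$third"
```

Changing the number after the + sign selects a different line from the input.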

You can keep piping output from one program to the next, creating long pipelines of text that you process. You’ll see a few examples of this throughout the rest of the book, but here’s one example you can try now, which prints the most-used commands on your system by looking at your history, grouping identical commands, and sorting them:

 $ history | awk '{c[$2]++}END{for(i in c){print c[i] " " i}}' | sort -rn | head

This pipeline sends the output of history to the awk command, which pulls the command name out of each entry and counts how many times each one is used. The final result lists each command with its frequency, like this:

 37 ls
 12 chmod
 11 cat
 9 stat
 9 mkdir
 9 echo
 8 cd
 7 touch
 7 rm
 7 cp

You’ll learn about awk at the end of the chapter.

After awk produces the counts, the pipeline sends them to the sort command, which sorts them numerically in descending order. That output then goes to the head command, which displays the first ten results.
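Your own history will produce different numbers, so here is a sketch of the same pipeline run against a small, made-up history. The sample_history function is a hypothetical stand-in for the history command; it prints an entry number followed by a command, which is the layout the awk program expects.

```shell
# Hypothetical stand-in for `history`: entry number, then the command.
sample_history() {
  printf '%s\n' \
    '  1  ls -alh' \
    '  2  cd /tmp' \
    '  3  ls' \
    '  4  cat notes.txt' \
    '  5  ls /usr/bin' \
    '  6  cat names.txt'
}

# awk counts how often each command name (field 2) appears,
# sort -rn orders the counts from highest to lowest,
# and head -n 2 keeps only the top two.
top=$(sample_history | awk '{c[$2]++}END{for(i in c){print c[i] " " i}}' | sort -rn | head -n 2)
echo "$top"
```

Each stage only reads standard input and writes standard output, so you can swap the sample function for the real history command without changing the rest of the pipeline.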

This demonstrates a key philosophy of Unix-based systems: use small, focused tools that do a single thing well, and chain them together. Let the data flow through a pipeline of tools.

If you write code, you probably apply this same approach to the functions or objects in your system. You know that when you make a function do too many things, it becomes more difficult to maintain; and when you have systems that are too tightly coupled, they tend to resist change. That’s the idea behind how these tools interact.

Let’s look at another practical use for piping data to another program.
