About This Book

This book covers bash, the GNU Bourne Again Shell, which is a member of the Bourne family of shells that includes the original Bourne shell sh, the Korn shell ksh, and the Public Domain Korn Shell pdksh. While these and other shells such as dash and zsh are not specifically covered, odds are that most of the scripts will work pretty well with them.

You should be able to read this book cover to cover, and also just pick it up and read anything that catches your eye. But perhaps most importantly, we hope that when you have a question about how to do something or you need a hint, you will be able to easily find the right answer—or something close enough—and save time and effort.

A great part of the Unix philosophy is to build simple tools that do one thing well, then combine them as needed. These combinations of commands, called pipelines, are often captured in a shell script because they can be long or difficult to remember and type. Where appropriate, we’ll cover the use of many of these tools in the context of the shell script as the glue that holds the pieces together to achieve the goal.
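
As a minimal sketch of that idea, the following script chains several small tools into one pipeline that finds the most frequent word in a line of text. The sample text and the variable names are our own assumptions for illustration, not part of any recipe.

```shell
#!/usr/bin/env bash
# Hypothetical sample text; any text source would do.
text='the quick brown fox jumps over the lazy dog the end'

# tr puts one word per line, sort groups duplicates, uniq -c counts them,
# sort -rn ranks by count, head keeps the top line, awk drops the count.
top_word=$(printf '%s\n' "$text" |
  tr ' ' '\n' | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')

echo "Most frequent word: $top_word"
```

Each tool does one small job, and the script only has to remember the order in which they are combined.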

This book was written using OpenOffice.org Writer running on whatever Linux or Windows machine happened to be handy, and kept in Subversion (see Appendix D). The nature of the Open Document Format facilitated many critical aspects of writing this book, including cross-references and extracting code (see Processing Files with No Line Breaks).

GNU Software

bash, and many of the tools we discuss in this book, are part of the GNU Project (http://www.gnu.org/). GNU (pronounced guh-noo, like canoe) is a recursive acronym for “GNU’s Not Unix” and the project dates back to 1984. Its goal is to develop a free (as in freedom) Unix-like operating system.

Without getting into too much detail, what is commonly referred to as Linux is, in fact, a kernel at the core of a larger system. The GNU tools are wrapped around it, along with a vast array of other software that varies depending on your distribution. However, the Linux kernel itself is not GNU software.

The GNU project argues that Linux should in fact be called “GNU/Linux” and they have a good point, so some distributions, notably Debian, do this. Therefore GNU’s goal has arguably been achieved, though the result is not exclusively GNU.

The GNU project has contributed a vast amount of superior software, notably including bash, and there are GNU versions of practically every tool we discuss in this book. And while the GNU tools are richer in terms of features and (usually) friendliness, they are also sometimes a little different. We discuss this in Developing Portable Shell Scripts, though the commercial Unix vendors in the 1980s and 1990s are also largely to blame for these differences.

Enough (several books this size worth) has already been said about all of these aspects of GNU, Unix, and Linux, but we felt that this brief note was appropriate. See http://www.gnu.org for much more on the topic.

A Note About Code Examples

When we show an executable piece of shell scripting in this book, we typically show it in an offset area like this:

$ ls
a.out  cong.txt  def.conf  file.txt  more.txt  zebra.list

The first character is often a dollar sign ($) to indicate that this command has been typed at the bash shell prompt. (Remember that you can change the prompt, as in Customizing Your Prompt, so your prompt may look very different.) The prompt is printed by the shell; you type the remainder of the line. Similarly, the last line in such an example is often a prompt (the $ again), to show that the command has ended execution and control has returned to the shell.

The pound or hash sign (#) is a little trickier. In many Unix or Linux files, including bash shell scripts, a leading # denotes a comment, and we have used it that way in some of our code examples. But as the trailing symbol in a bash command prompt (instead of $), # means you are logged in as root. We only have one example that is running anything as root, so that shouldn’t be confusing, but it’s important to understand.

When you see an example without the prompt string, we are showing the contents of a shell script. For several large examples we will number the lines of the script, though the numbers are not part of the script.

We may also occasionally show an example as a session log or a series of commands. In some cases, we may cat one or more files so you can see the script and/or data files we’ll be using in the example or in the results of our operation.

$ cat data_file
static header line1
static header line2
1 foo
2 bar
3 baz

Many of the longer scripts and functions are available to download as well. See the end of this Preface for details. We have chosen to use #!/usr/bin/env bash for these examples, where applicable, as that is more portable than the #!/bin/bash you will see on Linux or a Mac. See Finding bash Portably for #! for more details.
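
As a small sketch of what such a script looks like, the example below starts with the portable #! line and then reports which bash it found. The variable name bash_path is our own choice for illustration, and the path printed will differ from system to system.

```shell
#!/usr/bin/env bash
# The line above asks env to locate bash on the PATH instead of
# assuming it is installed at /bin/bash.

# Report which bash binary would be found on this system.
bash_path=$(command -v bash)
echo "bash found at: $bash_path"
```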

Also, you may notice something like the following in some code examples:

# cookbook filename: snippet_name

That means that the code you are reading is available for download on our site (http://www.bashcookbook.com). The download (.tgz or .zip) is documented, but you’ll find the code in something like ./chXX/snippet_name, where chXX is the chapter and snippet_name is the name of the file.

Useless Use of cat

Certain Unix users take a positively giddy delight in pointing out inefficiencies in other people’s code. Most of the time this is constructive criticism gently given and gratefully received.

Probably the most common case is the so-called “useless use of cat award” bestowed when someone does something like cat file | grep foo instead of simply grep foo file. In this case, cat is unnecessary and incurs some system overhead, since it starts an extra process and adds a pipe. Another common case would be cat file | tr '[A-Z]' '[a-z]' instead of tr '[A-Z]' '[a-z]' < file. Sometimes using cat can even cause your script to fail (see Forgetting That Pipelines Make Subshells).
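
To make both points concrete, here is a hedged sketch that compares the two grep forms and then shows one way a piped cat can cause a failure. The sample file and all variable names are assumptions made up for this demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical sample file created just for this demonstration.
sample=$(mktemp)
printf 'foo one\nbar two\nfoo three\n' > "$sample"

# Both forms find the same lines; the second starts one less process.
with_cat=$(cat "$sample" | grep foo)
without_cat=$(grep foo "$sample")

# Pitfall: with bash's default settings, the loop on the right side of
# a pipe runs in a subshell, so its changes to count are lost.
count=0
cat "$sample" | while read -r line; do count=$((count + 1)); done
piped_count=$count

# Redirecting the file into the loop keeps the loop in the current shell.
count=0
while read -r line; do count=$((count + 1)); done < "$sample"
redirected_count=$count

echo "piped: $piped_count, redirected: $redirected_count"
rm -f "$sample"
```

Under bash's default settings the piped count stays at 0 while the redirected count reaches 3. Some other shells, such as ksh and zsh, run the last part of a pipeline in the current shell and so behave differently.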

But… (you knew that was coming, didn’t you?) sometimes unnecessarily using cat actually does serve a purpose. It might be a placeholder to demonstrate the fragment of a pipeline, with other commands later replacing it (perhaps even cat -n). Or it might be that placing the file near the left side of the code draws the eye to it more clearly than hiding it behind a < on the far right side of the page.

While we applaud efficiency and agree it is a goal to strive for, it isn’t as critical as it once was. We are not advocating carelessness and code-bloat; we’re just saying that processors aren’t getting any slower any time soon. So if you like cat, use it.

A Note About Perl

We made a conscious decision to avoid using Perl in our solutions as much as possible, though there are still a few cases where it makes sense. Perl is already covered elsewhere in far greater depth and breadth than we could ever manage here. And Perl is generally much larger, with significantly more overhead, than our solutions. There is also a fine line between shell scripting and Perl scripting, and this is a book about shell scripting.

Shell scripting is basically glue for sticking Unix programs together, whereas Perl incorporates much of the functionality of the external Unix programs into the language itself. This makes it more efficient and in some ways more portable, at the expense of being different, and making it harder to efficiently run any external programs you still need.

The choice of which tool to use often has more to do with familiarity than with any other reason. The bottom line is always getting the work done; the choice of tools is secondary. We’ll show you many ways to do things using bash and related tools. When you need to get your work done, you get to choose what tools you use.

More Resources

  • Perl Cookbook, Nathan Torkington and Tom Christiansen (O’Reilly)

  • Programming Perl, Larry Wall et al. (O’Reilly)

  • Perl Best Practices, Damian Conway (O’Reilly)

  • Mastering Regular Expressions, Jeffrey E. F. Friedl (O’Reilly)

  • Learning the bash Shell, Cameron Newham (O’Reilly)

  • Classic Shell Scripting, Nelson H.F. Beebe and Arnold Robbins (O’Reilly)

