Chapter 7. Processing Text with awk

Awk is a programming language that you can use to make your shell scripts more powerful, as well as to write standalone scripts entirely in awk. Awk is typically used to perform text-processing operations on data, either through a shell pipe or through operations on files. It's a convenient and clear language that allows for easy report creation, analysis of data and log files, and the performance of otherwise mundane text-processing tasks. Awk has a relatively easy-to-learn syntax. It has also been a standard utility on Unix systems for years, so it is almost certain to be available. If you are a C programmer or have some Perl knowledge, you will find that much of what awk has to offer is familiar to you. This is not a coincidence: one of the original authors of awk, Brian Kernighan, also coauthored The C Programming Language, the classic book on C. Many programmers would say that Perl owes a lot of its text processing to awk. If programming in C scares you, you will find awk less daunting, and you will find it easy to accomplish some powerful tasks.

Although there are many complicated awk programs, awk typically isn't used for very long programs but for shorter one-off tasks, such as trimming down the amount of data in a web server's access log to only those entries that you want to count or manipulate, swapping the first two columns in a file, or manipulating comma-separated (CSV) files. This chapter introduces you to the basics of awk, providing an introduction to the following subjects:

The different versions of awk and how to install gawk (GNU awk)
The basics of how awk works
The many ways of invoking awk
Different ways to print and format your data
Using variables and functions
Using control blocks to loop over data

What Is awk (Gawk/Mawk/Nawk/Oawk)?

Awk was first designed by Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan at AT&T Bell Laboratories. (If you take the first letter of each of their last names, you see the origin of awk.) They designed awk in 1977, but awk has changed over the years through many different implementations. Because companies competed, rather than cooperated, in their writing of their implementations of the early Unix operating system, different versions of awk were developed for SYSV Unix compared to those for BSD Unix. Eventually, a POSIX standard was developed, and then a GNU Free Software version was created. Because of all these differing implementations of awk, different systems often have different versions installed.

The many different awks have slightly different names; together, they sound like a gaggle of birds squawking. The most influential and widely available version of awk today is GNU awk, known as gawk for short. Some systems have the original implementation of awk installed, and it is simply referred to as awk. Some systems may have more than one version of awk installed: the new version of awk, called nawk, and the old version available as oawk (for either old awk or original awk). Some create a symlink from the awk command to gawk, or mawk.

However it is done on your system, it may be confusing and difficult to discern which awk you have. If you don't know which version or implementation you have, it's difficult to know what functionality your awk supports. Writing awk scripts is frustrating if you implement something that is supported in GNU awk, but you have only the old awk installed.

Gawk, the GNU awk

Gawk is commonly considered to be the most popular version of awk available today. Gawk comes from the GNU Project, and in true GNU fashion, it has many enhancements that other versions lack.

The enhancements that gawk has over the traditional awks are too numerous to cover here; however, a few of the most notable follow:

Gawk tends to provide you with more informative error messages. Most awk implementations try to tell you on what line a syntax error occurs, but gawk goes one better by telling you where in that line it occurs.
Gawk has no built-in limits that people sometimes run into when using the other awks to do large batch processing.
Gawk also has a number of predefined variables, functions, and commands that make your awk programming much simpler.
Gawk has a number of useful flags that can be passed on invocation, including the very pragmatic options that give you the version of gawk you have installed and provide you with a command summary (--version and --help, respectively).
Gawk allows you to break long lines with a backslash (\) so that they continue easily on the next line.
Gawk's regular expression capability is greatly enhanced over the other awks.
Although gawk implements the POSIX awk standard, its GNU extensions go beyond that standard. If you require strict compatibility, you can turn the extensions off by invoking gawk with --traditional (behave like traditional Unix awk) or --posix (follow the POSIX standard strictly). For a full discussion of the GNU extensions to the awk language, see the gawk documentation, specifically Appendix A.5 in the latest manual.

If these features are not enough, the gawk project is very active, with a number of people contributing, whereas mawk has not had a release in several years. Gawk has been ported to a dizzying array of architectures, from Atari, Amiga, and BeOS to Vax/VMS. Gawk is the standard awk that is installed on GNU/Linux and BSD machines.

The additional features, the respect that the GNU Project has earned for making quality free (as in freedom) software, the wide deployment on GNU/Linux systems, and the active development in the project are all probable reasons why gawk has become the favorite over time.

What Version Do I Have Installed?

There is no single test to find out what version or implementation of awk you have installed. You can do a few things to deduce it, or you can install it yourself so you know exactly what is installed. Check your documentation, man pages, and info files to see if you can find a mention of which implementation is referenced, looking out for any mention of oawk, nawk, gawk, or mawk. Also, poke around on your system to find where the awk binary is, and see if there are others installed. It is highly unlikely that you have no version installed, but the hard part is figuring out which version you do have.

Gawk takes the standard GNU version flags to determine what version you are running. If you run awk with these flags as shown in the following Try It Out, and it succeeds, you know that you have GNU awk available.

Try It Out: Checking Which Version of awk You Are Running

Run awk with the following flags to see if you can determine what implementation you have installed:

$ awk --version
GNU Awk 3.1.4
Copyright (C) 1989, 1991-2003 Free Software Foundation.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.

How It Works

The GNU utilities implement a standard library that standardizes some flags, such as --version and --help, making it easy to determine if the version you have installed is the GNU awk.

This output showed that GNU awk, version 3.1.4, is installed on the system. If you do not have GNU awk, you get an error message. For example, if you have mawk installed, you might get this error:

$ awk --version
awk: not an option: --version

In this case, you see that it is not GNU awk, or you would have been presented with the version number and the copyright notice. Try the following to see if it is mawk:

$ awk -W version
mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan

compiled limits:
max NF             32767
sprintf buffer      1020

Because this flag succeeded, and it shows you what implementation and version (mawk, 1.3.3) is installed, you know that your system has mawk installed.
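The two checks above can be combined into a small shell function. The following is a minimal sketch rather than a definitive test: the function name which_awk is made up for illustration, and it recognizes only gawk and mawk, reporting anything else as unknown.

```shell
# which_awk: hypothetical helper that guesses which implementation the
# plain "awk" command runs, using the version flags shown above.
which_awk() {
    # GNU awk answers --version with a banner that contains "GNU Awk".
    if awk --version 2>/dev/null </dev/null | grep 'GNU Awk' >/dev/null; then
        echo gawk
    # mawk reports its version through the -W version flag.
    elif awk -W version 2>&1 </dev/null | grep 'mawk' >/dev/null; then
        echo mawk
    else
        echo unknown
    fi
}

which_awk
```

Standard input is redirected from /dev/null so that an awk that does not understand either flag cannot sit waiting for keyboard input.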

Installing gawk

By far the most popular awk is the GNU Project's implementation, gawk. If you find that your system does not have gawk installed, and you wish to install it, follow these steps. If you have a system that gawk has not been ported to, you may need to install a different awk. The known alternatives and where they can be found are listed in the awk FAQ at www.faqs.org/faqs/computer-lang/awk/faq/.

Note

Be careful when putting gawk on your system! Some systems depend on the version of awk that they have installed in /usr/bin, and if you overwrite that with gawk, you may find your system unable to work properly, because some system scripts may have been written for the older implementation. For example, fink for Mac OS X requires the old awk in /usr/bin/awk. If you replace that awk with gawk, fink no longer works properly. The instructions in this section show you how to install gawk without overwriting the existing awk on the system, but you should pay careful attention to this fact!

By far the easiest way to install gawk is to install a prepackaged version, if your operating system provides it. Installation this way is much simpler and easier to maintain. For example, to install gawk on the Debian GNU/Linux OS, type this command:

apt-get install gawk

Mac OS X has gawk available through fink. Fink is a command-line program that you can use to fetch and easily install some useful software that has been ported to OS X. If you don't have fink installed on your system, you can get it at http://fink.sourceforge.net/download/index.php.

If your system does not have packages, or if you want to install gawk on your own, follow these steps:

Obtain the gawk software. The home page for GNU gawk is www.gnu.org/software/gawk/. You can find the latest version of the software at http://ftp.gnu.org/gnu/gawk/. Get the latest .tar.gz from there, and then uncompress and untar it as you would any normal tar:
```
$ tar -zxf gawk-3.1.4.tar.gz
$ cd gawk-3.1.4
```
Review the README file that is included in the source. Additionally, you need to read the OS-specific README file in the directory README_d for any notes on installing gawk on your specific system.
To configure awk, type the following command:
```
$ sh ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking for gcc... gcc
```
This continues to run through the GNU autoconf configuration, analyzing your system for various utilities, variables, and parameters that need to be set or exist on your system before you can compile awk. This can take some time before it finishes. If this succeeds, you can continue with compiling awk itself. If it doesn't, you need to resolve the configuration problem(s) that are presented before proceeding. Autoconf indicates if there is a significant problem with your configuration and requires you to resolve it and rerun ./configure before it can continue. It is not uncommon for autoconf to look for a utility and not find it and then proceed. This does not mean it has failed; it exits with an error if it fails.

To compile awk, issue a make command:

$ make
      make 'CFLAGS=-g -O2' 'LDFLAGS=-export-dynamic' all-recursive
make[1]: Entering directory `/home/micah/working/gawk-3.1.4'
Making all in intl
make[2]: Entering directory `/home/micah/working/gawk-3.1.4/intl'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/home/micah/working/gawk-3.1.4/intl'
Making all in .

This command continues to compile awk. It may take a few minutes to compile, depending on your system.

If everything goes as expected, you can install the newly compiled awk simply by issuing the make install command as root:
```
$ su
Password:
# make install
```
Awk is placed in the default locations in your file system. By default, make install installs all the files in /usr/local/bin, /usr/local/lib, and so on. You can specify an installation prefix other than /usr/local using --prefix when running configure; for instance, sh ./configure --prefix=$HOME will make awk so that it installs in your home directory. However, please heed the warning about replacing your system's installed awk, if it has one!

How awk Works

Awk shares some basic functionality with sed (see Chapter 6). At its most basic level, awk simply looks at the lines that are sent to it, searching them for a pattern that you have specified. If it finds a line that matches the pattern, awk does something to that line. That "something" is the action that you specify with your commands. Awk then continues processing the remaining lines until it reaches the end. Sed acts in the same way: It searches for lines and then performs editing commands on the lines that match. The input comes from standard input and is sent to standard output. In this way, awk is stream-oriented, just like sed.

In fact, a number of things about awk are similar to sed. Both are invoked with similar command-line syntax, and both use regular expressions for matching patterns.

Although the similarities exist, there are syntactic differences. When you run awk, you specify the pattern, followed by an action contained in curly braces. A very basic awk program looks like this:

awk '/somedata/ { print $0 }' filename

The rest of this brief section provides just an overview of the basic steps awk follows in processing the command. The following sections fill in the details of each step.

In this example, the expression that awk looks for is somedata. This is enclosed in slashes, and the action to be performed, indicated within the curly braces, is print $0. Awk works by stepping through three stages. The first is what happens before any data is processed; the second is what happens during the data processing loop; and the third is what happens after the data is finished processing. Before any lines are read in to awk and then processed, awk does some preinitialization, which is configurable in your script, by specifying a BEGIN clause. At the end of processing, you can perform any final actions by using an END clause.
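The three stages can be seen in one short program. This is a minimal sketch: the four input lines and the count variable are invented for illustration, and the shell variable result is used only to capture the output.

```shell
# Sample input invented for illustration; two of the four lines contain "somedata".
result=$(printf 'first line\nsomedata one\nthird line\nsomedata two\n' |
    awk 'BEGIN      { print "starting" }             # runs once, before any input
         /somedata/ { count++; print $0 }            # runs for each matching line
         END        { print count, "lines matched" } # runs once, after all input')
printf '%s\n' "$result"
```

This prints starting, then the two matching lines, and finally 2 lines matched.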

Invoking awk

You can invoke awk in one of several ways, depending on what you are doing with it. When you become more familiar with awk, you will want to do quick things on the command line, and as things become more complex, you will turn them into awk programs.

The simplest method is to invoke awk on the command line. This is useful if what you are doing is relatively simple, and you just need to do it quickly. Awk can also be invoked this way within small shell scripts. You run an awk program on the command line by typing awk and then the program, followed by the files you want to run the program on:

awk 'program' filename1 filename2

The filename1 and filename2 are not required. You can specify only one input file, or two or more, and awk can even be run without any input files. Or you can pipe the data to awk instead of specifying any input files:

cat filename1 | awk 'program'

The program is enclosed in single quotes to keep the shell from interpreting special characters and to make the program a single argument to awk. The contents of program are the pattern to match, followed by the curly braces, which enclose the action to take on the pattern. The following Try It Out gives you some practice running basic awk programs.

Try It Out: A Simple Command-Line awk Program

Try this simple awk program, which runs without any input files at all:

$ awk '{ print "Hi Mom!" }'

If you run this as written, you will see that nothing happens. Because no file name was specified, awk is waiting for input from the terminal. If you hit the Enter key, you see the string "Hi Mom!" printed on the terminal. You need to issue an end-of-file to get out of this input; do this by pressing Ctrl-D.

Awk can also be invoked using bash's multiline capability. Bash is smart enough to know when you have not terminated a quote or a brace and prompts you for more input until you close the open quote or brace.

To do the preceding example in this way, type the following in a bash shell. After the first curly brace, press the Enter key to move to the next line:

$ awk '{
> print "Hi Mom!"
> }'

Once you have typed these three lines, hit the Enter key, and you will see again the string "Hi Mom!" printed on the terminal. Again you will need to issue an end-of-file by pressing Ctrl-D to get out of this input.

How It Works

The result of this command is the same output from the preceding command, just invoked in a different manner. The Korn, Bourne, and Z shell all look for closing single quotes and curly braces, and prompt you when you haven't closed them properly.

Note, however, that the C shell does not work this way.

Your awk commands will soon become longer and longer, and it will be cumbersome to type them on the command line. At some point you will find putting all your commands into a file to be a more useful way of invoking awk. You do this by putting all the awk commands into a file and then invoking awk with the -f flag followed by the file that contains your commands. The following Try It Out demonstrates this way of invoking awk.

Try It Out: Running awk Scripts with awk -f

In this Try It Out, you take the simple program from the previous Try It Out, place it into a file, and use the -f flag to invoke this program. First, take the text that follows and place it into a file called hello.awk:

{ print "Hi Mom!" }

Notice that you do not need to place single quotes around this text, as you did when you were invoking awk on the command line. When the program is contained in a file, this is not necessary.

Adding .awk at the end of the file is not necessary but is a useful convention to remember what files contain what types of programs.

Next, invoke awk using the -f flag to point it to the file that contains the program to execute:

$ awk -f hello.awk

This behaves exactly as the previous two examples did. Each time you press Enter, awk prints the greeting (press Ctrl-D to exit):

Hi Mom!

How It Works

Awk takes the -f flag, reads the awk program that is specified immediately afterward, and executes it. This outputs exactly the same way as the example showing you how to invoke awk on the command line.
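To try the -f flag without typing at the keyboard, you can pipe a few lines into awk. The following is a sketch under some assumptions: the temporary directory is used only so the example cleans up after itself, and the two empty input lines stand in for pressing Enter twice.

```shell
# Write the one-line program to a file in a throwaway directory.
tmpdir=$(mktemp -d)
printf '%s\n' '{ print "Hi Mom!" }' > "$tmpdir/hello.awk"

# Feed awk two empty input lines so the main block runs twice.
result=$(printf '\n\n' | awk -f "$tmpdir/hello.awk")
printf '%s\n' "$result"

rm -r "$tmpdir"
```

Because the main block runs once per input line, the greeting appears twice.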

You can also write self-contained awk scripts by adding a magic #! line at the top of the file, as in the following Try It Out.

Try It Out: Creating Self-Contained awk Programs

Type the following script, including the { print "Hi Mom!" } from the preceding example, into a file called mom.awk:

#!/usr/bin/awk -f
# This is my first awk script!
# I will add more as I learn some more awk

{ print "Hi Mom!" } # This prints my greeting

If the awk on your system is located somewhere other than /usr/bin/awk, you need to change that path accordingly.

Now make the mom.awk script executable by typing the following on the command line:

$ chmod +x mom.awk

Now you can run this new shell script on the command line as follows:

$ ./mom.awk
Hi Mom!

This command tells the shell to execute the file mom.awk located in the current working directory. As before, awk waits for input, printing the greeting each time you press Enter. If you are not in the directory where the script is located, this will not work, and you need to type the full path of the script instead.

How It Works

Because the execute bit was set on the program, the shell knows that it should be able to run this file, rather than it simply containing data. It sees the magic file marker at the beginning, denoted by the shebang (#!), and uses the command that immediately follows to execute the script.

You may have noticed the comments that were snuck into the awk script. It is very common in shell scripting to add comments by placing the # symbol and then your comment text. In awk, the comments are treated as beginning with this character and ending at the end of the line. Each new line that you want to have a comment on requires another # symbol.

The print Command

Earlier, in the section How awk Works, you saw the basic method of using awk to search for a string and then print it. In this section, you learn exactly how that print command works and some more advanced useful incarnations of it.

First, you need some sample data to work with. Say you have a file called countries.txt, and each line in the file contains the following information:

Country   Internet domain   Area in sq. km   Population   Land lines   Cell phones

The beginning of the file has the following contents:

Afghanistan    .af    647500         28513677       33100          12000
Albania        .al    28748          3544808        255000         1100000
Algeria        .dz    2381740        32129324       2199600        1447310
Andorra        .ad    468            69865          35000          23500
Angola         .ao    1246700        10978552       96300          130000

The following command searches the file countries.txt for the string Al and then uses the print command to print the results:

$ awk '/Al/ { print $0 }' countries.txt
Albania         .al     28748   3544808         255000          1100000
Algeria         .dz     2381740 32129324        2199600         1447310

As you can see from this example, slashes surround the regular expression to be searched for, in this case Al, and it matches two lines. Awk then applies the command specified within the curly braces to each matching line; in this case print $0 is executed, printing the lines.

This isn't very interesting, because you can do this with grep or sed. This is where awk starts to become interesting, because you can very easily say that you want to print only the matching countries' landline and cell phone usage:

$ awk '/Al/ { print $5,$6 }' countries.txt
255000 1100000
2199600 1447310

In this example, the same search pattern was supplied, and for each line that matches, awk performs the specified action, in this case printing the fifth and sixth fields. Awk automatically splits each line into fields and numbers them in order. By default, awk treats any run of non-blank characters separated by spaces or tabs as a field. The special variable $0 is not one of these fields; it represents the entire line, which is why the action print $0 printed the entire line for each match. $1 represents the first field (in our example, Country), $2 represents the second field (Internet domain), and so on.
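A quick way to see how a line is split is to print a few fields from a single record. This sketch uses one line of the sample data; NF is a standard awk built-in variable (not otherwise introduced in this section) that holds the number of fields on the current line, and the labels are made up.

```shell
# One record from the sample data, split on whitespace into six fields.
line='Albania        .al    28748          3544808        255000         1100000'

result=$(printf '%s\n' "$line" | awk '{
    print "fields:", NF     # NF holds the number of fields on this line
    print "first:", $1      # the country
    print "last:", $NF      # $NF is the last field, which is $6 here
}')
printf '%s\n' "$result"
```

The output is fields: 6, first: Albania, and last: 1100000, each on its own line.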

By default, awk's behavior is to print the entire line, so each of the following lines results in the same output:

awk '/Al/' countries.txt
awk '/Al/ { print $0 }' countries.txt
awk '/Al/ { print }' countries.txt

Although explicitly writing print $0 is not necessary, it does make for good programming practice because you are making it very clear that this is your action instead of using a shortcut.

It is perfectly legal to omit a search pattern from your awk statement. When there is no search pattern provided, awk by default matches all the lines of your input file and performs the action on each line.

For example, the following command prints the number of cell phones in each country:

$ awk '{ print $6 }' countries.txt
12000
1100000
1447310
23500
130000

Awk prints the sixth field of every line because no search pattern was specified.

You can also insert text anywhere in the command action, as demonstrated in the following Try It Out.

Try It Out: Inserting Text

Type the following awk command to see the number of cell phones in use in each of the countries included:

$ awk '{ print "Number of cell phones in use in",$1":",$6 }' countries.txt
Number of cell phones in use in Afghanistan: 12000
Number of cell phones in use in Albania: 1100000
Number of cell phones in use in Algeria: 1447310
Number of cell phones in use in Andorra: 23500
Number of cell phones in use in Angola: 130000

How It Works

The print command is used only one time, not multiple times for each field. A comma separates the print elements in most cases, except after the $1 in this example. The comma inserts a space between the element printed before and after itself. (Try putting a comma after the $1 in the preceding command to see how it changes the output.)

For each line in the file, awk prints the string Number of cell phones in use in, and then it prints a space (because of the comma). Then it prints field number 1 (the country), followed by the simple string : (colon), and then another comma puts in a space, and finally field number 6 (cell phones) is inserted.

Print is a simple command; all by itself it just prints the input line. With one argument it prints the argument; with multiple arguments it prints all the arguments, separated by spaces when the arguments are separated by commas, or joined together without spaces when they are not.
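The difference between comma-separated and adjacent arguments is easiest to see side by side. This is a small sketch with a made-up two-field input line:

```shell
result=$(printf 'Albania 1100000\n' | awk '{
    print $1, $2        # comma: a space is printed between the items
    print $1 $2         # no comma: the items are joined together
    print $1 ":", $2    # mixed: the colon is joined to $1, then a space follows
}')
printf '%s\n' "$result"
```

The three print commands produce Albania 1100000, Albania1100000, and Albania: 1100000.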

If you want to print a newline as part of your print command, just include the standard newline escape sequence, \n, as part of the string, as in the following Try It Out.

Try It Out: Printing Newlines

Write a quick letter to Mom that puts the strings on separate lines, so it looks nicer:

$ awk 'BEGIN { print "Hi Mom,\n\nCamp is fun.\n\nLove,\nSon" }'
Hi Mom,

Camp is fun.

Love,
Son

How It Works

In this example, you put the entire awk program into a BEGIN block by placing the word BEGIN right before the first opening curly brace. Whatever is contained in a BEGIN block is executed one time immediately, before the first line of input is read (similarly, the END construct is executed once after all input lines have been read). Previously, you were putting your awk programs into the main execution block, and the BEGIN was empty and was not specified. Commands that are contained in the main block are executed on lines of input that come from a file or from the terminal. If you recall, the previous Try It Out sections required that you hit the Enter key for the output to be printed to the screen; this was the input that awk needed to execute its main block. In this example, all your commands are contained in the BEGIN block, which requires no input, so it prints right away without your needing to provide input with the Enter key.

Awk uses the common two-character designation \n to mean newline. Whenever awk encounters this construct in a print command, it actually prints a newline rather than the literal characters \n.

Using Field Separators

The default field separator in awk is whitespace. Any run of spaces or tabs separates one field from the next, so awk treats each word in a line as a different field. However, if the data that you are working with includes spaces within the text itself, you may encounter difficulties.

For example, if you add more countries to your countries.txt file to include some that have spaces in them (such as Dominican Republic), you end up with problems. The following command prints the area of each country in the file:

$ awk '{ print $3 }' countries.txt
647500
28748
2381740
.do
468
1246700

Why is the .do included in the output? Because one of the lines of this file contains this text:

Dominican Republic  .do  8833634     48730  901800  2120400

The country Dominican Republic counts as two fields because it has a space within its name. You need to be very careful that your fields are uniform, or you will end up with ambiguous data like this. There are a number of ways to get around this problem; one of the easiest is to specify a unique field separator and format your data accordingly. In this case, you could format your countries.txt file so that any country that has spaces in its name has underscores instead, so Dominican Republic becomes Dominican_Republic.

Unfortunately, it isn't always practical or possible to change your input data file. In this case, you can invoke awk with the -F flag to specify an alternative field separator character instead of the space. A very common field separator is the comma, so to instruct awk to use a comma as the character that separates fields, invoke awk using -F, to indicate the comma should be used instead. Most databases are able to export their data into CSV (Comma Separated Values) files. If your data is formatted using commas to separate each field, you can specify that field separator to awk on the command line, as the following Try It Out section demonstrates.

Try It Out: Alternative Field Separators

Reformat your countries.txt file so the fields are separated by commas instead of spaces, and save it in a file called countries.csv, so it looks like the following:

Afghanistan,.af,647500,28513677,33100,12000
Albania,.al,28748,3544808,255000,1100000
Algeria,.dz,2381740,32129324,2199600,1447310
Andorra,.ad,468,69865,35000,23500
Angola,.ao,1246700,10978552,96300,130000
Dominican Republic,.do,8833634,48730,901800,2120400

Then run the following on the command line:

$ awk -F, '{ print $3 }' countries.csv
647500
28748
2381740
468
1246700
8833634

As with spaces, make sure that the new field separator that you specify isn't being used in the data as well. For example, if the numbers in your data are specified using commas, as in 647,500, the numbers before and after the commas will be interpreted as two separate fields.

How It Works

Awk processes each line using the comma as the character that separates fields instead of a space. It reads each line of your countries.csv file, looks for the third field, and then prints it. This allows entries such as Dominican Republic, which has a space in its name, to be processed as you expect and not as two separate fields.
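Both the benefit and the pitfall of a comma separator can be checked with single lines of input. In this sketch the second record, which writes the area as 647,500, is invented to illustrate the warning above; NF is the standard awk variable that holds the number of fields on the current line.

```shell
# -F, makes the comma the separator, so the space in the name is harmless.
name=$(printf 'Dominican Republic,.do,8833634\n' | awk -F, '{ print $1 }')

# A comma inside a value is still treated as a separator:
# "647,500" becomes two fields, so this record has four fields, not three.
count=$(printf 'Afghanistan,.af,647,500\n' | awk -F, '{ print NF }')

printf '%s\n%s\n' "$name" "$count"
```

The first command prints Dominican Republic as a single field, and the second prints 4.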

Using the printf Command

The printf (formatted print) command is a more flexible version of print. If you are familiar with C, you will find the printf command very familiar; it was borrowed from that language. With printf you can specify the width of each item printed, change the output base used for numbers, determine how many digits to print after the decimal point, and more. Printf differs from print in two main ways: its first argument is a format string that controls how the remaining arguments are output, and it does not add a newline at the end. The printf command works in this format:

printf(<format string>, <arguments>)

The parentheses are optional. Otherwise, a basic printf command is almost identical to the print command that you have been using so far:

printf("Hi Mom!\n")

The string is the same with the exception of the added \n character, which adds a newline to the end of the string. This doesn't seem very useful, because now you have to add a newline when you didn't before. However, printf has more flexibility because you can specify format codes to control the results of the expressions, as shown in the following Try It Out examples.

Try It Out: The printf Command

Try the following printf command:

$ awk '{ printf "Number of cell phones in use in %s: %d\n", $1, $6 }' countries.txt
Number of cell phones in use in Afghanistan: 12000
Number of cell phones in use in Albania: 1100000
Number of cell phones in use in Algeria: 1447310
Number of cell phones in use in Andorra: 23500
Number of cell phones in use in Angola: 130000

How It Works

As you see, this prints the output in exactly the same format as the command that used print instead of printf.

The printf command prints the string that is enclosed in quotes until it encounters a format specifier, which is the percent symbol followed by a format control letter. (See the table following the next Try It Out for a list of format control characters and their meanings.) The first instance, %s, tells awk to substitute a string, which is the first argument (in this case, $1), so it puts the $1 string in place of the %s. It then prints the colon and a space, and then encounters the second format specifier, %d, which tells awk to substitute a decimal integer, the second argument. The second argument is $6, so it pulls that information in and replaces the %d with what $6 holds. After that, it prints a newline.

Try It Out: printf Format Codes

Try the following command to print the number of cell phones in each country in decimal, hex, and octal format:

$ awk '{ printf "Decimal: %d, Hex: %x, Octal: %o\n", $6, $6, $6 }' countries.txt
Decimal: 12000, Hex: 2ee0, Octal: 27340
Decimal: 1100000, Hex: 10c8e0, Octal: 4144340
Decimal: 1447310, Hex: 16158e, Octal: 5412616
Decimal: 23500, Hex: 5bcc, Octal: 55714
Decimal: 130000, Hex: 1fbd0, Octal: 375720

How It Works

In this example, the same number is being referenced three times by the $6, formatted in decimal format, hexadecimal format, and octal format, depending on the format control character specified.

The following table lists the format control characters and what kind of values they print:

Format Control Character	Kind of Value Printed
%c	Prints a number as an ASCII character. awk 'BEGIN { printf "%c", 65 }' outputs the letter A.
%d %i	Either character prints a decimal integer.
%e %E	Prints a number in exponential format. awk 'BEGIN { printf " %3.2e ", 2134 }' prints 2.13e+03.
%f	Prints a number in floating-point notation. awk 'BEGIN { printf " %3.2f ", 2134 }' prints 2134.00.
%g %G	Prints a number in scientific notation or in floating-point notation, depending on which uses the fewest characters.
%o	Prints an unsigned octal integer.
%s	Prints a string.
%u	Prints an unsigned decimal integer.
%x %X	Prints an unsigned hexadecimal integer; using %X prints capital letters.
%%	Outputs a % character.
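
To see several of these format control characters side by side, here is a small sketch that formats literal numbers from a BEGIN block. The values 65, 2134, 8, and 255 and the shell variable out are chosen only for illustration:

```shell
# One printf call, one format control character per argument;
# %% prints a literal percent sign and consumes no argument.
out=$(awk 'BEGIN { printf "%c|%d|%3.2e|%3.2f|%o|%X|%%\n", 65, 2134, 2134, 2134, 8, 255 }')
printf '%s\n' "$out"
# A|2134|2.13e+03|2134.00|10|FF|%
```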

Using printf Format Modifiers

These printf format characters are useful for representing your strings and numbers in the way that you expect. You can also add a modifier to your printf format characters to specify how much of the value to print or to format the value with a specified number of spaces.

You can provide an integer before the format character to specify a width that the output would use, as in the following example:

$ awk '{ printf "|%16s|\n", $1 }' countries.txt
|     Afghanistan|
|         Albania|
|         Algeria|

Here, the width 16 was passed to the format modifier %s to make the string the same length in each line of the output. You can left-justify this text by placing a minus sign in front of the number, as follows:

$ awk '{ printf "|%-16s|\n", $1 }' countries.txt
|Afghanistan     |
|Albania         |
|Algeria         |

Use a period followed by a number (the precision) to specify the maximum number of characters to print in a string or the number of digits to print to the right of the decimal point for a floating-point number:

$ awk '{ printf "|%-.4s|\n", $1 }' countries.txt
|Afgh|
|Alba|
|Alge|
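
A width and a precision can also be combined in one specifier. The following sketch assumes nothing beyond standard printf behavior; the value 3.14159 and the shell variable out are illustrative only:

```shell
# %8.2f  : 2 digits after the decimal point, right-justified in 8 columns
# %-8.3s : at most 3 characters of the string, left-justified in 8 columns
out=$(awk 'BEGIN { printf "|%8.2f|%-8.3s|\n", 3.14159, "Albania" }')
printf '%s\n' "$out"
# |    3.14|Alb     |
```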

Try It Out: Using printf Format Modifiers

In this example, you use printf to create headings for each of the columns in the countries.txt file and print the data underneath each column. Put the following into a file called format.awk:

BEGIN { printf "%-15s %20s\n\n", "Country", "Cell phones" }
      { printf "%-15s %20d\n", $1, $6 }

Then call this script, using the countries.txt file as input:

$ awk -f format.awk countries.txt
Country                  Cell phones

Afghanistan                    12000
Albania                      1100000
Algeria                      1447310
Andorra                        23500

How It Works

The first block is contained within a BEGIN statement, so it is executed initially and only one time. This allows the header to be printed, and then the main execution block is run over the input data.

Printf format specifiers are used to left-justify the first string in a column 15 characters wide; the second value is right-justified in a column 20 characters wide. Because you used the same widths for the headers, they line up above the data.

Using the sprintf Command

The sprintf function operates exactly like printf, with the same syntax. The only difference is that it returns its output as a string, which you can assign to a variable (variables are discussed in the next section), rather than printing it to standard output. The following example shows how this works:

$ awk '{ variable = sprintf("|%-.4s|", $1); print variable}' countries.txt
|Afgh|
|Alba|
|Alge|

This assigns the output from the sprintf function to the variable variable and then prints that variable, which results in the same output as if you had used printf.
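
Because the formatted text ends up in a variable, you can keep working with it before printing. As a rough sketch, the following builds a label from the first two fields of one countries.txt line and then measures it; the variable names label and out are hypothetical:

```shell
# sprintf returns the formatted string instead of printing it,
# so the result can be passed to another function such as length().
out=$(printf '%s\n' 'Afghanistan .af 647500 28513677 33100 12000' |
    awk '{ label = sprintf("%s (%s)", $1, $2); print label, length(label) }')
printf '%s\n' "$out"
# Afghanistan (.af) 17
```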

Using Variables in awk

In Chapter 2, variables were introduced as a mechanism to store values that can be manipulated or read later, and in many ways they operate the same in awk, with some differences in syntax and particular built-in variables. The last section introduced the sprintf command, which assigns its output to a variable. The example in that section was a user-defined variable. Awk also has some predefined, or built-in, variables that can be referenced. The following sections provide more detail on using these two types of variables with awk.

User-Defined Variables

User-defined variables have a few rules associated with them. They must not start with a digit and are case sensitive. Besides these rules, your variables can consist of alphanumeric characters and underscores. A user-defined variable must not conflict with awk's reserved built-in variables or commands. For example, you may not create a user-defined variable called print, because this is an awk command. Unlike some programming languages, variables in awk do not need to be initialized or declared. The first time you use a variable, it is set to an empty string ("") and assigned 0 as its numerical value. However, relying on default values is a bad programming practice and should be avoided. If your awk script is long, define the variables you will be using in the BEGIN block, with the values that you want set as defaults.

Variables are assigned values simply by writing the variable, followed by an equal sign and then the value. Because awk is a weakly typed language, you can assign numbers or strings to variables:

myvariable = 3.141592654
myvariable = "some string"

When you perform a numeric operation on a variable, awk gives you a numerical result; if a string operation is performed, a string will be the result.
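
The following sketch illustrates this with one variable that holds a string and one that holds a number. The names x, y, novalue, and out are arbitrary, and novalue is deliberately never assigned so that its default values show up:

```shell
out=$(awk 'BEGIN {
    x = "3"; y = 4
    print x + y             # numeric operation: the string "3" is used as 3
    print x y               # string operation: concatenation gives "34"
    print novalue + 0       # an unassigned variable is 0 in numeric context
    print "[" novalue "]"   # and the empty string in string context
}')
printf '%s\n' "$out"
# 7
# 34
# 0
# []
```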

In the earlier section on printf, the Try It Out example used format string modifiers to specify columnar widths so that the column header lined up with the data. This format string modifier could be set in a variable instead of having to type it each time, as in the following code:

BEGIN { colfmt="%-15s %20s\n"; printf colfmt, "Country", "Cell phones\n" }
      { printf colfmt, $1, $6 }

In this example, a user-defined variable called colfmt is set, containing the format string specifiers that you want to use in the rest of the script. Once it is defined, you can reference it simply by using the variable; in this case it is referenced twice in the two printf statements.

Built-in Variables

Built-in variables are very useful if you know what they are used for. The following subsections introduce you to some of the most commonly used built-in variables.

Remember, you should not create a user-defined variable that conflicts with any of awk's built-in variables.

The FS Variable

FS is awk's built-in variable that contains the character used to denote separate fields. In the section Using Field Separators you modified this variable on the command line by passing the -F argument to awk with a new field separator value (in that case, you replaced the default field separator value with a comma to parse CSV files). It is often more convenient to put the field separator into your script using awk's built-in FS variable rather than setting it on the command line. This is most useful when the awk script is in a file, rather than on the command line, where specifying flags to awk is not difficult.

To change the field separator within a script, assign a new value to FS at the beginning of the script. This must be done before any input lines are read, or it will not be effective on every line, so you should set the field separator value in an action controlled by the BEGIN rule, as in the following Try It Out.

Try It Out: Using the Field Separator Variable

Type the following into a file called countries.awk:

# Awk script to print the number of cell phones in use in each country

BEGIN { FS = "," } # Our data is separated by commas

{ print "Number of cell phones in use in",$1":",$6 }

Note that there are double quotes around the comma and that there are no single quotes around the curly braces, as there would be on the command line.

This script can be invoked using the -f flag. Use it against countries.csv, the version of the countries file in which each field is separated by a comma:

$ awk -f countries.awk countries.csv

Note the difference between the -F flag and the -f flag. You use the -f flag here to execute the specified countries.awk script. This script sets the field separator (FS) variable to use a comma as the field separator, rather than setting the field separator using the -F flag on the command line, as in a previous example.

How It Works

In the BEGIN block of the awk script, the FS built-in variable is set to the comma character. It remains set to this throughout the script (as long as it doesn't get redefined later). Awk then uses this to determine what separates fields, just like it did when the -F flag was used on the command line.
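
To confirm that the two approaches are equivalent, you can run both against the same input. This sketch assumes that a line of countries.csv looks like the comma-separated form of the first countries.txt entry; the shell variables line, with_flag, and with_fs are illustrative names:

```shell
# Assumed CSV form of the Afghanistan entry (not taken from the real file).
line='Afghanistan,.af,647500,28513677,33100,12000'

# Field separator set with the -F flag on the command line...
with_flag=$(printf '%s\n' "$line" | awk -F, '{ print $1, $6 }')
# ...and set with the FS variable in a BEGIN block.
with_fs=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "," } { print $1, $6 }')

printf '%s\n%s\n' "$with_flag" "$with_fs"
# Afghanistan 12000
# Afghanistan 12000
```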

FS Regular Expressions

The FS variable can contain more than a single character, and when it does, it is interpreted as a regular expression. If you use a regular expression for a field separator, you then have the ability to specify several characters to be used as delimiters, instead of just one, as in the following Try It Out.

Try It Out: Field Separator Regular Expressions

The following assignment of the FS variable identifies one or more commas followed by one or more spaces as the field separator:

$ echo "a,,,   b,,,,, c,,   d, e,, f,   g" | awk 'BEGIN {FS="[,]+[ ]+"} {print $2}'
b

How It Works

The FS variable is set to a regular expression that matches one or more commas followed by one or more spaces. Notice that no matter how many commas or spaces there are between fields, the regular expression splits the fields as expected.
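
A regular expression also lets you accept several different delimiter characters at once. As a small sketch with made-up input, the bracket expression below treats either a semicolon or a colon as a separator (out is an illustrative shell variable):

```shell
# [;:] matches a single semicolon or a single colon.
out=$(printf 'alpha;beta:gamma\n' | awk 'BEGIN { FS = "[;:]" } { print NF, $3 }')
printf '%s\n' "$out"
# 3 gamma
```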

The NR Variable

The built-in variable NR is automatically incremented by awk on each new line it processes. It always contains the number of the current record. This is a useful variable because you can use it to count how many lines are in your data, as in the following Try It Out.

Try It Out: Using the NR Variable

Try using the NR variable to print the number of lines in your countries.txt file:

$ awk 'END { print "Number of countries:", NR }' countries.txt
Number of countries: 5

How It Works

Each line is read into awk and processed, but because there is no BEGIN or main code block, nothing happens until all of the lines have been read in and processed. After all of the lines have been processed (the processing is nothing, but awk still reads each line in individually and does nothing to them), the END block is executed. The NR variable has been automatically incremented internally for each line read in, so at the END the variable has the total of all the lines that have been read in from the file, giving you a count of the lines in a file.

Of course, you could use the simpler Unix command wc -l to get the same count.

Try It Out: Putting It All Together

In this Try It Out, you add line numbers to the output that you created in the earlier columnar display example by adding the NR variable to the output. Edit the format.awk script so it looks like the following:

BEGIN { colfmt="%-15s %20s\n"; printf colfmt, "Country", "Cell phones\n" }
      { printf "%d. " colfmt, NR, $1, $6 }

And then run it:

$ awk -f format.awk countries.txt
Country                      Cell phones

1. Afghanistan                    12000
2. Albania                      1100000
3. Algeria                      1447310
4. Andorra                        23500
5. Angola                        130000

How It Works

You set the colfmt variable to have the printf format specifier in the BEGIN block, as you did in the User-Defined Variables section, and then print the headers. The second line, which contains the main code block, has a printf command whose format string begins with "%d. ". This specifies that a decimal integer will be put in this position, followed by a period and then a space. The colfmt variable is concatenated onto this string to complete the format, and then the arguments of the printf command are specified. The first is the NR variable; because this is a number and is the first argument, it gets put into the %d position. The first line read in will have NR set to the number 1, so the first line prints 1. followed by the formatted field 1 and field 6. When the next line is read in, NR gets set to 2, and so on.

The following table contains the basic built-in awk variables and what they contain. You will find these very useful as you make awk scripts and you need to make decisions about how your script runs depending on what is happening internally.

Built-in Variable	Contents
ARGC, ARGV	Contains a count and an array of the command-line arguments.
CONVFMT	Controls conversions of numbers to strings; default value is set to %.6g.
ENVIRON	Contains an associative array of the current environment. Array indices are set to environment variable names.
FILENAME	The name of the file that awk is currently reading. Set to `-` if reading from STDIN; is empty in a BEGIN block.
FNR	Current record number in the current file, incremented for each line read. Set to 0 each time a new file is read.
FS	Input field separator; default value is " ", a string containing a single space. Set on command line with flag -F.
NF	Number of fields in the current input line. NF is set every time a new line is read.
NR	Number of records processed since the beginning of execution. It is incremented with each new record read.
OFS	Output field separator; default value is a single space. The contents of this variable are output between fields printed by the print statement.
ORS	Output record separator; the contents of this variable are output at the end of every print statement. Default value is "\n", a newline.
PROCINFO	An array containing information about the running program. Elements such as "gid", "uid", "pid", and "version" are available.
RS	Input record separator; default value is a string containing a newline, so an input record is a single line of text.
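
Several of these variables are easiest to understand when you watch them change. The following sketch uses two made-up input lines with different numbers of fields; the shell variable out is illustrative:

```shell
# OFS is placed between the comma-separated items of print;
# NR counts records, NF counts fields, and $NF is the last field.
out=$(printf 'a b c\nd e\n' | awk 'BEGIN { OFS = "-" } { print NR, NF, $1, $NF }')
printf '%s\n' "$out"
# 1-3-a-c
# 2-2-d-e
```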

Control Statements

Control statements are statements that control the flow of execution of your awk program. Awk control statements are modeled after similar statements in C, and the looping and iteration concepts are the same as were introduced in Chapter 3. This means you have your standard if, while, for, do, and similar statements.

All control statements contain a control statement keyword, such as if, and then what actions to perform on the different results of the control statement.

if Statements

One of the most important awk decision making statements is the if statement. It follows a standard if (condition) then-action [else else-action] format, as in the following Try It Out.

Try It Out: Using if Statements

Type the following command to perform an if statement on the countries.txt file:

$ awk '{ if ($3 < 1000) print }' countries.txt
Andorra .ad 468 69865 35000 23500

How It Works

Awk reads in each line of the file, looks at field number 3, and then does a check to see if that field's contents are less than the number 1,000, performing a comparative operation on the field. (Comparative operations are tests that you can perform to see if something is equal to, greater than, less than, true/false, and so on.) With this file, only the Andorra line has a third field containing a number that is less than 1,000.

Notice that the if conditional does not use a then, it just assumes that whatever statement follows the condition (in this case, print) is what should be done if the condition is evaluated to be true.

If statements often have else statements as well. The else statements define what to do with the data that does not match the condition, as in this Try It Out.

Try It Out: Using else

Type the following into a file called ifelse.awk to see how an else statement enhances an if:

{ if ($3 < 1000)
        printf "%s has only %d people!\n", $1, $3
  else
        printf "%s has a population larger than 1000\n", $1 }

Notice how the script has been formatted with whitespace to make it easier to read and understand how the flow of the condition works. This is not required but is good programming practice!

Then run the script:

$ awk -f ifelse.awk countries.txt
Afghanistan has a population larger than 1000
Albania has a population larger than 1000
Algeria has a population larger than 1000
Andorra has only 468 people!
Angola has a population larger than 1000

How It Works

The condition if ($3 < 1000) is tested. If it is true for a country, the first printf command is executed; otherwise, the second printf command is executed.

An else statement can include additional if statements to make the logic fulfill all conditions that you require. For example:

{ if ( $1 == "cat" )
        print "meow";
    else if ( $1 == "dog" )
        print "woof";
    else if ( $1 == "bird" )
        print "caw";
    else
        print "I do not know what kind of noise " $1 " makes!" }

Each condition is tested in the order it appears, on each line in succession. Awk reads in a line, tests the first field to see if it is cat and if so, prints meow. Otherwise, awk goes on to test whether the first field is instead dog and, if so, prints woof. This process continues until awk reaches a condition that tests to be true or runs out of conditions. If $1 isn't cat, dog, or bird, then awk admits it doesn't know what kind of noise the animal that is in $1 makes.
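
To watch the chain at work, you can feed it a few one-word lines. This is a shortened sketch of the same logic, with hypothetical input (cat, dog, cow) and an illustrative shell variable out:

```shell
out=$(printf 'cat\ndog\ncow\n' | awk '{
    if ($1 == "cat")
        print "meow"
    else if ($1 == "dog")
        print "woof"
    else
        print "I do not know what kind of noise " $1 " makes!"
}')
printf '%s\n' "$out"
# meow
# woof
# I do not know what kind of noise cow makes!
```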

Comparison Operators

These examples use the less than operation, but there are many other operators available for making conditional statements powerful. Another example is the equal comparison operator, which checks to see if something is equal to another thing. For example, the following command looks in the first field on each line for the string Andorra and, if it finds it, prints it:

$ awk '{ if ($1 == "Andorra") print }' countries.txt

As in C, a relational expression in awk evaluates to a number: 1 if the comparison is true and 0 if it is false.

The following table lists the comparison operators available in awk.

Comparison Operator	Description
<	Less than
<=	Less than or equal to
>	Greater than
>=	Greater than or equal to
!=	Not equal
==	Equal

It is also possible to combine as many comparison operators in one statement as you require by using AND (&&) as well as OR (||) operators. This allows you to test for more than one thing before your control statement is evaluated to be true.

For example:

$ awk '{ if ((($1 == "Andorra") && ($3 <= 500)) || ($1 == "Angola")) print }' countries.txt
Andorra .ad 468 69865 35000 23500
Angola .ao 1246700 10978552 96300 130000

This prints any line whose first field contains the string Andorra and whose third field contains a number that is less than or equal to 500, or any line whose first field contains the string Angola.

In this example, each condition being tested is surrounded by parentheses. Awk does not strictly require the inner parentheses, because the comparison operators bind more tightly than &&, which in turn binds more tightly than ||, but they make the grouping explicit. Because the first and second condition are together (the first field has to match Andorra and the third field must be less than or equal to 500), the two are enclosed together in additional parentheses. The opening and closing parentheses that surround the entire conditional are required by the if statement.
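
Here is another combined test, sketched against the five countries.txt lines shown earlier in the chapter (written with single spaces here). It uses >= and != together with &&; the shell variables countries and out are illustrative names:

```shell
countries='Afghanistan .af 647500 28513677 33100 12000
Albania .al 28748 3544808 255000 1100000
Algeria .dz 2381740 32129324 2199600 1447310
Andorra .ad 468 69865 35000 23500
Angola .ao 1246700 10978552 96300 130000'

# Print the name of every country whose third field is at least 1000,
# except Angola.
out=$(printf '%s\n' "$countries" |
    awk '{ if (($3 >= 1000) && ($1 != "Angola")) print $1 }')
printf '%s\n' "$out"
# Afghanistan
# Albania
# Algeria
```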

Arithmetic Functions

The comparison operators are useful for making comparisons, but you often will want to make changes to variables. Awk is able to perform all the standard arithmetic functions on numbers (addition, subtraction, multiplication, and division), as well as modulo (remainder) division, and does so in floating point. The following Try It Out demonstrates some arithmetic functions.

Try It Out: Using awk as a Calculator

Type the following commands on the command line:

$ awk 'BEGIN {myvar=10; print myvar+myvar}'
20
$ awk 'BEGIN {myvar=10; myvar=myvar+1; print myvar}'
11

How It Works

In these examples, you start off setting the variable myvar to have the value of 10. The first example does a simple addition operation to print the result of adding the value of myvar to myvar, resulting in adding 10 + 10. The second example adds 1 to myvar, puts that value into myvar, and then prints it. In the first example, myvar was not changed from its value of 10, so after the print statement, it still contains the value 10, but in the second example, the result is added to the variable, so myvar changed to the new value.

In the second example, you used an arithmetic operation to increase the value of the variable by one. There is actually a shorter way of doing this in awk, and it has some additional functionality. You can use the operator ++ to add 1 to a variable, and the operator -- to subtract 1. The position of these operators makes the increase or decrease happen at different points.

Try It Out: Increment and Decrement Operators

To get a good understanding of how this works, try typing the following examples on the command line:

$ awk 'BEGIN {myvar=10; print ++myvar; print myvar}'
11
11
$ awk 'BEGIN {myvar=10; print myvar++; print myvar}'
10
11

How It Works

In these two examples, the variable myvar is initialized with the value of 10. In the first example, a print is done on ++myvar that instructs awk to increment the value of myvar by 1 and then print it; this results in the printing of the first 11. Then you print myvar a second time to illustrate that the variable has actually been set to the new incremented value of 11. This is the same process shown in the previous section using myvar=myvar+1.

The second command is an example of a postincrement operation. The value of myvar is first printed and then it is incremented. The new value of myvar is then printed to illustrate that the variable was actually incremented.
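
The decrement operator follows the same rules. In this short sketch the variable names n, a, and b (and the shell variable out) are arbitrary:

```shell
out=$(awk 'BEGIN {
    n = 3
    a = n--     # postdecrement: a receives 3, then n becomes 2
    b = --n     # predecrement: n becomes 1, then b receives 1
    print a, b, n
}')
printf '%s\n' "$out"
# 3 1 1
```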

A third set of increment operator shortcuts are the += and the -= operators. These allow you to add to or subtract from the variable.

Try It Out: Using the Add-to Operator

For example, you can use the += operator to add up all the cell phone users in your countries.txt file:

$ awk 'BEGIN {celltotal = 0}
>    {celltotal += $6}
>    END { print celltotal }' countries.txt
2712810

How It Works

This example uses the += operator to add to the celltotal variable the contents of field number 6. The first line of the file is read in, and celltotal gets the value of the sixth field added to it (the variable starts initialized as 0 in the BEGIN block). It then reads the next line of the file, taking the sixth field and adding it to the contents of the celltotal variable. This continues until the end, where the value of that variable is printed.
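
A running total combines naturally with the NR variable. As a sketch, the following prints the total and the average number of cell phones per country for the five countries.txt lines shown earlier (written with single spaces here); countries and out are illustrative shell variable names:

```shell
countries='Afghanistan .af 647500 28513677 33100 12000
Albania .al 28748 3544808 255000 1100000
Algeria .dz 2381740 32129324 2199600 1447310
Andorra .ad 468 69865 35000 23500
Angola .ao 1246700 10978552 96300 130000'

# In the END block, NR holds the number of lines read, so
# celltotal / NR is the average of field 6.
out=$(printf '%s\n' "$countries" |
    awk 'BEGIN { celltotal = 0 }
         { celltotal += $6 }
         END { printf "%d %.1f\n", celltotal, celltotal / NR }')
printf '%s\n' "$out"
# 2712810 542562.0
```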

Output Redirection

Be careful when using comparison operators, because some of them double as output redirection operators in other contexts. The > character, for instance, can be used in an awk statement to send the output of a print or printf command into the file specified. For example, if you do the following:

$ awk 'BEGIN { print 4+5 > "result" }'

you create a file called result in your current working directory and then print the result of the sum of 4 + 5 into the file. If the file result already exists, it will be overwritten, unless you use the append operator (>>), as follows:

$ awk 'BEGIN { print 5+5 >> "result" }'

This appends the summation of 5 + 5 to the end of the result file. If that file doesn't exist, it will be created.

Output from commands can also be piped into other system commands in the same way that this can be done on the shell command line.
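
As a minimal sketch of piping, the following sends each printed line to the sort command from inside awk, using made-up input words. The command name is given as a quoted string after the | character, and it is assumed that sort is available on your system; out is an illustrative shell variable:

```shell
# Every print writes to the same running sort process; sort emits
# its result when awk finishes and closes the pipe.
out=$(printf 'pear\napple\nfig\n' | awk '{ print $1 | "sort" }')
printf '%s\n' "$out"
# apple
# fig
# pear
```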

While Loops

While statements in awk implement basic looping logic, using the same concepts introduced in Chapter 3. Loops continually execute statements until a condition is met. A while loop executes the statements that you specify while the condition specified evaluates to true.

While statements have a condition and an action. The condition is the same as the conditions used in if statements. The action is performed as long as the condition tests to be true. The condition is tested; if it is true, the action happens, and then awk loops back and tests the condition again. At some point, unless you have an infinite loop, the condition evaluates to be false, and then the action is not performed and the next statement in your awk program is executed.

Try It Out: Using while Loops

Try this basic while loop to print the numbers 1 through 10:

$ awk 'BEGIN { while (++myvar <= 10 ) print myvar }'

How It Works

The variable myvar starts off with the value of 0. Awk then uses the variable increment operators to increment myvar by 1 to have the value of 1. The while condition is tested, "Is myvar less than or equal to 10?" The answer is that myvar is 1, and 1 is less than 10, so print the value of myvar. The loop repeats, the myvar variable is incremented by 1, the condition is tested, it passes, and the value of the variable is printed (2).
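
While loops are often used to walk through the fields of a line. The following sketch, with a made-up input line, uses the built-in NF variable as the upper limit; the names i and out are arbitrary:

```shell
# i starts at 1 and the loop stops once i is greater than NF,
# the number of fields on the current line.
out=$(printf 'a b c\n' | awk '{ i = 1; while (i <= NF) { print i ": " $i; i++ } }')
printf '%s\n' "$out"
# 1: a
# 2: b
# 3: c
```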

For Loops

For loops are more flexible and provide a syntax that is easier to use, although they may seem more complex. They achieve the same results as a while loop but are often a better way of expressing it. Check out this example.

Try It Out: Using for Loops

This for loop prints every number between 1 and 10:

$ awk 'BEGIN { for ( myvar = 1; myvar <= 10; myvar++ ) print myvar }'

How It Works

For loops have three pieces. The first piece of the for loop does an initial action; in this case, it sets the variable myvar to 1. The second piece sets a condition; in this case, as long as myvar is less than or equal to 10, continue looping. The last part of this for loop is an increment; in this example, you increment the variable by 1. So in English, you could read this as, "For every number between 1 and 10, print the number."
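
The same three pieces work well for visiting every field on a line. This sketch, with made-up numeric input, adds up the fields of each line; sum, i, and out are arbitrary names:

```shell
# For each input line: reset sum, add every field from 1 to NF, print it.
out=$(printf '1 2 3\n10 20\n' |
    awk '{ sum = 0; for (i = 1; i <= NF; i++) sum += $i; print sum }')
printf '%s\n' "$out"
# 6
# 30
```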

Functions

Awk has some built-in functions that make life as an awk programmer easier. These functions are always available; you don't need to define them or bring in any extra libraries to make them work. A function is called with arguments and returns the results. Functions are useful for things such as performing numeric conversions, finding the length of strings, changing the case of a string, running system commands, printing the current time, and the like.

Different functions have different requirements for how many arguments must be passed in order for them to work. Many have optional arguments that do not need to be included or have defaults that can be set if you desire. If you provide too many arguments to a function, gawk gives you a fatal error, while some awk implementations just ignore the extra arguments.

Functions are called in a standard way: the function name, an opening parenthesis, and then before the final parenthesis the arguments to the function. For example, sin($3) is calling the sin function and passing the argument $3. This function returns the mathematical sine value of whatever argument is sent to it.

Function arguments that are expressions, such as x+y, are evaluated before the function is passed those arguments. The result of x+y is what is passed to the function, rather than "x+y" itself.

Try It Out: Function Examples

Try these functions to see how they work:

$ awk 'BEGIN {print length("dog")}'
3
$ awk 'BEGIN {x=6; y=10; print sqrt(x+y)}'
4

How It Works

The first function in the example is the length function. It takes a string and tells you how many characters are in it. You pass the argument dog, and it returns the length of that string, 3.

The second sets two variables and then calls the square root function, using the sum of those two variables as the function argument. Because the expression is evaluated before the function is passed the argument, x+y is evaluated to be 16 and then sqrt(16) is called.

Awk has a number of predefined, built-in functions, and gawk has even more extensive ones available. There are too many functions to list here, but you should look through the manual pages to see what is available, especially before you struggle to do something that may already be implemented in a built-in function.

The following table provides a list of some of the more common functions and what they do.

Function	Description of Function
atan2(y, x)	Returns the arctangent of y/x in radians
cos(x)	Returns the cosine of x, where x is in radians
exp(x)	Returns the exponential e^x
index (in, find)	Searches string in for the first occurrence of find and returns its character position, or 0 if find is not present
int (x)	Returns x truncated to an integer (the fractional part is dropped)
length ([string])	Returns number of characters in string; without an argument, returns the length of the current record
log (x)	Returns the natural logarithm of x
rand()	Returns a random number between 0 (zero) and 1
sin(x)	Returns the sine of x, where x is in radians
sqrt(x)	Returns the square root of x
strftime (format)	Returns the time in the format specified, similar to the C function strftime().
tolower (string), toupper (string)	Changes the case of string
system (command)	Executes command and returns the exit code of that command
systime()	Returns the current seconds since the system epoch
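
A few of the string functions from the table can be tried directly from a BEGIN block. This is only a sketch with literal arguments; the shell variable out is an illustrative name:

```shell
# index() returns the position where "gh" starts in the string;
# toupper() and tolower() return a converted copy of their argument.
out=$(awk 'BEGIN { print index("Afghanistan", "gh"), toupper("dog"), tolower("DOG"), length("Afghanistan") }')
printf '%s\n' "$out"
# 3 DOG dog 11
```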

Resources

The following are some good resources on the awk language:

You can find the sources to awk at ftp://ftp.gnu.org/pub/gnu/awk.
The Awk FAQ has many useful answers to some of the most commonly asked questions. It is available at www.faqs.org/faqs/computer-lang/awk/faq/.
The GNU Gawk manual is a very clear and easy-to-understand guide through the language: www.gnu.org/software/gawk/manual/gawk.html.
The newsgroup for awk is comp.lang.awk.

Summary

Awk can be complex and overwhelming, but the key to any scripting language is to learn some of the basics and start writing some simple scripts. As you practice, you will become more proficient and faster with writing your scripts. Now that you have a basic understanding of awk, you can dive further into the complexities of the language and use what you know to accomplish whatever it is you need to do in your shell scripts.

In this chapter:

You learned what awk is and how it works, all the different versions that are available, and how to tell what version you have installed on your system. You also learned how to compile and install gawk, the most frequently used awk implementation.
You learned how awk programs flow, from BEGIN to END, and the many different ways that awk can be invoked: from the command line or by creating independent awk scripts.
You learned the basic awk print command and the more advanced printf and sprintf.
You learned about different fields, the field separator variable, and different ways to change this to what you need according to your data.
You learned about string formatting and format modifier characters, and now you can make nice-looking reports easily.
You learned how to create your own variables and about the different built-in variables that are available to query throughout your programs.
Control statements were introduced, and you learned how to write if statements and while and for loops.
Arithmetic operators and comparison operators were introduced, as well as different ways to increment and decrement variables.
You were briefly introduced to some of awk's standard built-in functions.

Exercises

Pipe your /etc/passwd file to awk, and print out the home directory of each user.
Change the following awk line so that it prints exactly the same but doesn't make use of commas:
```
awk '{ print "Number of cell phones in use in",$1":",$6 }' countries.txt
```
Print nicely formatted column headings for each of the fields in the countries.txt file, using a variable to store your format specifier.
Using the data from the countries.txt file, print the total ratio of cell phones to all the landlines in the world.
Provide a total of all the fields in the countries.txt at the bottom of the output.
