An awk program consists of the awk command, the program instructions enclosed in quotes (or in a file), and the name of the input file. If an input file is not specified, input comes from standard input (stdin), which is the keyboard by default.
Awk instructions consist of patterns, actions, or a combination of both. A pattern is an expression of some type. If you do not see the keyword if, but you think the word if when evaluating the expression, it is a pattern. Actions consist of one or more statements separated by semicolons or newlines and enclosed in curly braces. Patterns are not enclosed in curly braces; they consist of regular expressions enclosed in forward slashes or of expressions built from one or more of the many operators provided by awk.
Awk commands can be typed at the command line or in awk script files. The input lines can come from files, pipes, or standard input.
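The two ways of supplying a program, and the fallback to standard input, can be sketched as follows. This is only an illustration: it uses Bourne-style shell syntax rather than the C shell, and the script name first_field.awk and the sample input are made up for the example.

```shell
# Hypothetical scratch directory and script name, used only for this sketch.
tmpdir=$(mktemp -d)
cat > "$tmpdir/first_field.awk" <<'EOF'
# An action with no pattern runs for every input line.
{ print $1 }
EOF

# Program given on the command line; no input file is named,
# so awk reads standard input (here, the output of printf).
inline_out=$(printf 'alpha beta\ngamma delta\n' | awk '{ print $1 }')

# The same program read from a script file with -f.
file_out=$(printf 'alpha beta\ngamma delta\n' | awk -f "$tmpdir/first_field.awk")

printf '%s\n' "$inline_out"
printf '%s\n' "$file_out"
rm -r "$tmpdir"
```

Both invocations should print the same two words, since the only difference is where awk finds its program text.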
In the following examples, the percent sign (%) is the C shell prompt.
Format
% awk 'pattern' filename
% awk '{action}' filename
% awk 'pattern {action}' filename
Here is a sample file called employees:
% cat employees
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500

% awk '/Mary/' employees
Mary Adams 5346 11/4/63 28765
Explanation
Awk prints all lines that contain the pattern Mary.
% cat employees
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500

% awk '{print $1}' employees
Tom
Mary
Sally
Billy
Explanation
Awk prints the first field of each line in the file employees. The first field starts at the left margin of the line and is delimited by white space.
% cat employees
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500

% awk '/Sally/{print $1, $2}' employees
Sally Chang
Explanation
Awk prints the first and second fields of file employees, only if the line contains the pattern Sally. Remember, the field separator is white space.
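Patterns do not have to be regular expressions. As a rough sketch of a pattern built from a comparison operator, and of an action containing more than one statement, the following recreates the employees file in a scratch directory and tests the fifth field. The threshold 500000 and the choice of fields are arbitrary, and the syntax is Bourne-style shell rather than the C shell.

```shell
# Recreate the sample employees file in a scratch directory (location is arbitrary).
tmpdir=$(mktemp -d)
cat > "$tmpdir/employees" <<'EOF'
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
EOF

# Pattern built from a comparison operator; with no action,
# each matching line is printed in full.
big=$(awk '$5 > 500000' "$tmpdir/employees")

# Same pattern, with an action holding two statements separated by a semicolon.
names=$(awk '$5 > 500000 { print $2; print $5 }' "$tmpdir/employees")

printf '%s\n' "$big" "$names"
rm -r "$tmpdir"
```

Only the lines for Tom Jones and Sally Chang have a fifth field above the threshold, so only those two are selected by either command.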
The output from one or more Linux commands can be piped to awk for processing. Shell programs commonly use awk for manipulating command output.
Format
% command | awk 'pattern'
% command | awk '{action}'
% command | awk 'pattern {action}'

% df | awk '$4 > 75000'
/oracle (/dev/dsk/c0t0d057 ): 390780 blocks 105756 files
/opt (/dev/dsk/c0t0d058 ): 1943994 blocks 49187 files

% rusers | awk '/root$/{print $1}'
owl
crow
bluebird
Explanation
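The df command reports the free disk space on file systems. The output of the df command is piped to awk. If the fourth field is greater than 75,000 blocks, the line is printed. The rusers command reports the users logged on to remote machines. Its output is piped to awk, and if a line ends in root, the first field of that line is printed.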
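Field positions in df output differ between systems, so a test on the fourth field is only meaningful once the layout is known. The sketch below assumes the POSIX `df -P` layout (file system, total blocks, used, available, capacity, mount point). To keep the result reproducible it pipes invented sample lines from a hypothetical sample_df function instead of running df itself. The syntax is Bourne-style shell.

```shell
# Stand-in for `df -P`: a hypothetical function that prints invented sample
# lines in the POSIX layout, so the result does not depend on the local machine.
sample_df() {
    printf '%s\n' \
        'Filesystem 1024-blocks Used Available Capacity Mounted on' \
        '/dev/sda1 1000000 900000 100000 90% /' \
        '/dev/sda2 500000 450000 50000 90% /home' \
        '/dev/sdb1 2000000 500000 1500000 25% /data'
}

# NR > 1 skips the header line; $4 is the Available column in this layout.
# For each file system with more than 75000 blocks available,
# print the mount point ($6) and the available blocks ($4).
roomy=$(sample_df | awk 'NR > 1 && $4 > 75000 { print $6, $4 }')
printf '%s\n' "$roomy"
```

On a real system, `df -P` would take the place of sample_df. The header test matters because the word Available in the header is not a number, so awk would compare it to 75000 as a string and print the header as well.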
Awk has a number of command line options. Gawk has two formats for command line options: the GNU long format, starting with a double dash (--) and a word, and the traditional short POSIX format, consisting of a dash and one letter. Gawk-specific options are used with the -W option or its corresponding long option. Any arguments provided to long options are either joined by an = sign (with no intervening spaces) or may be provided in the next command line argument. The --help option to gawk lists all the gawk options. See Table 5.1.
% awk --help
Usage: awk [POSIX or GNU style options] -f progfile [--] file
       awk [POSIX or GNU style options] [--] 'program' file
POSIX options:              GNU long options:
    -f progfile                 --file=progfile
    -F fs                       --field-separator=fs
    -v var=val                  --assign=var=val
    -m[fr] val
    -W compat                   --compat
    -W copyleft                 --copyleft
    -W copyright                --copyright
    -W help                     --help
    -W lint                     --lint
    -W lint-old                 --lint-old
    -W posix                    --posix
    -W re-interval              --re-interval
    -W source=program-text      --source=program-text
    -W traditional              --traditional
    -W usage                    --usage
    -W version                  --version
Report bugs to [email protected], with a Cc: to [email protected]
Options | Meaning |
---|---|
-F fs, --field-separator fs | Specifies the input field separator, where fs is either a string or a regular expression; e.g., FS=":" or FS="[ :]" |
-v var=value, --assign var=value | Assigns a value to a user-defined variable, var, before the awk script starts execution. The variable is available to the BEGIN block. |
-f scriptfile, --file scriptfile | Reads awk commands from the scriptfile. |
-mf nnn, -mr nnn | Sets memory limits to the value of nnn. With -mf, limits the maximum number of fields to nnn; with -mr, sets the maximum record size. These options come from an earlier UNIX awk and are ignored by gawk. |
-W traditional, -W compat, --traditional, --compat | Runs in compatibility mode so that gawk behaves exactly as UNIX versions of awk. All gawk extensions are ignored. Both modes do the same thing; --traditional is preferred. |
-W copyleft, -W copyright, --copyleft, --copyright | Prints an abbreviated version of the copyright information. |
-W help, -W usage, --help, --usage | Prints the available awk options and a short summary of what they do. |
-W lint, --lint | Prints warnings about the use of constructs that may not be portable to traditional versions of UNIX awk. |
-W lint-old, --lint-old | Provides warnings about constructs that are not portable to the original version of UNIX awk. |
-W posix, --posix | Turns on strict POSIX compatibility mode. In this mode gawk does not recognize: \x escape sequences, newlines as a field separator character if FS is assigned a single space, the keyword func as a synonym for function, the operators ** and **= in place of ^ and ^=, and fflush. |
-W re-interval, --re-interval | Allows the use of interval expressions in regular expressions (see "The POSIX Character Class" on page 147); i.e., repetition counts in curly braces such as /a{2,3}/ |
-W source program-text, --source program-text | Uses program-text as awk's source code, allowing awk commands at the command line to be intermixed with -f files; e.g., awk -W source '{print $1}' -f cmdfile inputfile |
-W version, --version | Prints version and bug reporting information. |
-- | Signals the end of option processing. |
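As a minimal sketch of the portable short options in the table, the following uses -F, -v, and -- on a few invented colon-separated records (name, numeric ID, shell). The records, the variable name min, and the threshold 1000 are assumptions made for illustration. The syntax is Bourne-style shell, and only options common to POSIX awk implementations are used.

```shell
# Invented colon-separated records (name:id:shell), loosely modelled on a password file.
records='root:0:/bin/sh
alice:1001:/bin/csh
bob:1002:/bin/sh'

# -F sets the input field separator to a colon;
# -v assigns min before the program starts running.
users=$(printf '%s\n' "$records" | awk -F: -v min=1000 '$2 >= min { print $1 }')

# -- ends option processing, so what follows is the program text and its operands.
shells=$(printf '%s\n' "$records" | awk -F: -- '{ print $3 }')

printf '%s\n' "$users" "$shells"
```

The first command selects the names whose second field is at least 1000; the second prints the third field of every record.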