This appendix provides instructions for installing gawk
on the various platforms that are supported
by the developers. The primary developer supports GNU/Linux (and Unix), whereas
the other ports are contributed. See Reporting Problems and Bugs for the email
addresses of the people who maintain the respective ports.
This section describes how to get the gawk
distribution, how to extract it, and then
what is in the various files and subdirectories.
There are two ways to get GNU software:
Copy it from someone else who already has it.
Retrieve gawk from the Internet host ftp.gnu.org, in the directory /gnu/gawk. Both anonymous ftp and http access are supported. If you have the wget program, you can use a command like the following:
wget http://ftp.gnu.org/gnu/gawk/gawk-4.1.2.tar.gz
The GNU software archive is mirrored around the world. The up-to-date list of mirror sites is available from the main FSF website. Try to use one of the mirrors; they will be less busy, and you can usually find one closer to your site.
gawk is distributed as several tar files compressed with different compression programs: gzip, bzip2, and xz. For simplicity, the rest of these instructions assume you are using the one compressed with the GNU Gzip program (gzip).
Once you have the distribution (e.g., gawk-4.1.2.tar.gz), use gzip to expand the file and then use tar to extract it. You can use the following pipeline to produce the gawk distribution:
gzip -d -c gawk-4.1.2.tar.gz | tar -xvpf -
On a system with GNU tar, you can let tar do the decompression for you:
tar -xvpzf gawk-4.1.2.tar.gz
Extracting the archive creates a directory named gawk-4.1.2
in the current directory.
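Because the distribution comes in three compressed forms, a small wrapper can pick the matching decompressor before handing the stream to tar. The following is a minimal sketch, not part of the gawk distribution: the function names decompressor_for and extract_dist are hypothetical, and the .tar.bz2 and .tar.xz suffixes are assumed from common convention.

```shell
# Hypothetical helper: print the decompression command for a tarball,
# chosen by filename suffix (suffixes assumed from common convention).
decompressor_for() {
    case "$1" in
        *.tar.gz)  echo "gzip -d -c" ;;
        *.tar.bz2) echo "bzip2 -d -c" ;;
        *.tar.xz)  echo "xz -d -c" ;;
        *)         echo "unknown archive type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: decompress to standard output and let tar read
# the archive from the pipe, as in the pipeline shown above.
extract_dist() {
    cmd=$(decompressor_for "$1") || return 1
    $cmd "$1" | tar -xvpf -
}
```

With these definitions, ‘extract_dist gawk-4.1.2.tar.gz’ would run the same gzip and tar pipeline as the manual command.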
The distribution filename is of the form gawk-V.R.P.tar.gz. The V represents the major version of gawk, the R represents the current release of version V, and the P represents a patch level, meaning that minor bugs
have been fixed in the release. The current patch level is 2, but when
retrieving distributions, you should get the version with the highest
version, release, and patch level. (Note, however, that patch levels
greater than or equal to 70 denote “beta” or nonproduction software; you
might not want to retrieve such a version unless you don’t mind
experimenting.) If you are not on a Unix or GNU/Linux system, you need
to make other arrangements for getting and extracting the gawk
distribution. You should consult a local
expert.
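The V.R.P naming scheme described above can be checked mechanically before you download anything. The following is a minimal sketch that assumes a three-part version number; the function name describe_dist and the second filename (with patch level 70) are hypothetical and used only for illustration.

```shell
# Hypothetical helper: split a name like gawk-4.1.2.tar.gz into V, R, and P
# and report whether the patch level marks beta software (P >= 70).
# Assumes the version always has exactly three numeric parts.
describe_dist() {
    base=${1##*/}            # drop any leading directory
    ver=${base#gawk-}        # drop the "gawk-" prefix
    ver=${ver%.tar.*}        # drop ".tar.gz" or a similar suffix
    v=${ver%%.*}             # major version
    rest=${ver#*.}
    r=${rest%%.*}            # release
    p=${rest#*.}             # patch level
    if [ "$p" -ge 70 ]; then kind=beta; else kind=production; fi
    echo "version=$v release=$r patch=$p kind=$kind"
}

describe_dist gawk-4.1.2.tar.gz
describe_dist gawk-4.1.70.tar.gz    # made-up name, for illustration only
```

The first call reports a production release and the second a beta, following the rule that patch levels of 70 or more denote nonproduction software.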
The gawk
distribution has a
number of C source files, documentation files, subdirectories, and files
related to the configuration process (see Compiling and Installing gawk on Unix-Like Systems), as well as several subdirectories
related to different non-Unix operating systems:
‘.c’, ‘.y’, and ‘.h’ files
These files contain the actual gawk source code.
ABOUT-NLS
A file containing information about GNU gettext
and translations.
AUTHORS
A file with some information about the authorship of
gawk
. It exists only to satisfy
the pedants at the Free Software Foundation.
README
README_d/README.*
Descriptive files: README
for gawk
under Unix and the rest for the
various hardware and software combinations.
INSTALL
A file providing an overview of the configuration and installation process.
ChangeLog
A detailed list of source code changes as bugs are fixed or improvements made.
ChangeLog.0
An older list of source code changes.
NEWS
A list of changes to gawk
since the last release or patch.
NEWS.0
An older list of changes to gawk
.
COPYING
The GNU General Public License.
POSIX.STD
A description of behaviors in the POSIX standard for
awk
that are left undefined, or
where gawk
may not comply
fully, as well as a list of things that the POSIX standard should
describe but does not.
doc/awkforai.txt
Pointers to the original draft of a short article describing
why gawk
is a good language for
artificial intelligence (AI) programming.
doc/bc_notes
A brief description of gawk
’s “byte code” internals.
doc/README.card
doc/ad.block
doc/awkcard.in
doc/cardfonts
doc/colors
doc/macros
doc/no.colors
doc/setter.outline
The troff
source for a
five-color awk
reference card.
A modern version of troff
such
as GNU troff
(groff
) is needed to produce the color
version. See the file README.card
for instructions if you
have an older troff
.
doc/gawk.1
The troff
source for a
manual page describing gawk
.
This is distributed for the convenience of Unix users.
doc/gawktexi.in
doc/sidebar.awk
The Texinfo source file for this book. It should be
processed by doc/sidebar.awk
before processing with texi2dvi
or texi2pdf
to produce a
printed document, and with makeinfo
to produce an Info or HTML
file. The Makefile
takes care
of this processing and produces printable output via texi2dvi
or texi2pdf
.
doc/gawk.texi
The file produced after processing gawktexi.in
with sidebar.awk
.
doc/gawk.info
The generated Info file for this book.
doc/gawkinet.texi
The Texinfo source file for TCP/IP
Internetworking with gawk. It should be processed with
TeX (via texi2dvi
or texi2pdf
) to produce a printed document
and with makeinfo
to produce an
Info or HTML file.
doc/gawkinet.info
The generated Info file for TCP/IP Internetworking with gawk.
doc/igawk.1
The troff
source for a
manual page describing the igawk
program presented in An Easy Way to Use Library Functions.
doc/Makefile.in
The input file used during the configuration process to
generate the actual Makefile
for creating the documentation.
Makefile.am
*/Makefile.am
Files used by the GNU Automake software for generating the
Makefile.in
files used by
Autoconf and configure
.
Makefile.in
aclocal.m4
bisonfix.awk
config.guess
configh.in
configure.ac
configure
custom.h
depcomp
install-sh
missing_d/*
mkinstalldirs
m4/*
These files and subdirectories are used when configuring and
compiling gawk
for various Unix
systems. Most of them are explained in Compiling and Installing gawk on Unix-Like Systems. The rest are there to support the
main infrastructure.
po/*
The po directory contains message translations.
awklib/extract.awk
awklib/Makefile.am
awklib/Makefile.in
awklib/eg/*
The awklib
directory
contains a copy of extract.awk
(see Extracting Programs from Texinfo Source Files), which can be used to extract the
sample programs from the Texinfo source file for this book. It
also contains a Makefile.in
file, which configure
uses to generate a Makefile
. Makefile.am
is used by GNU Automake to
create Makefile.in
. The
library functions from Chapter 10 and
the igawk
program from An Easy Way to Use Library Functions are included as ready-to-use files in
the gawk
distribution. They are
installed as part of the installation process. The rest of the
programs in this book are available in appropriate subdirectories
of awklib/eg
.
extension/*
The source code, manual pages, and infrastructure files for
the sample extensions included with gawk
. See Chapter 16 for more information.
posix/*
Files needed for building gawk
on POSIX-compliant systems.
pc/*
Files needed for building gawk
under MS-Windows (see Installation on PC Operating Systems for details).
vms/*
Files needed for building gawk
under Vax/VMS and OpenVMS (see
Compiling and Installing gawk on Vax/VMS and OpenVMS for details).
test/*
A test suite for gawk
.
You can use ‘make check
’ from
the top-level gawk
directory to
run your version of gawk
against the test suite. If gawk
successfully passes ‘make
check
’, then you can be confident of a successful
port.
Usually, you can compile and install gawk
by typing only two commands. However, if
you use an unusual system, you may need to configure gawk
for your system yourself.
The normal installation steps should work on all modern commercial Unix-derived systems, GNU/Linux, BSD-based systems, and the Cygwin environment for MS-Windows.
After you have extracted the gawk
distribution, cd
to gawk-4.1.2
. As with most GNU software, you
configure gawk
for your system by
running the configure
program. This
program is a Bourne shell script that is generated automatically using
GNU Autoconf. (The Autoconf software is described fully in
Autoconf—Generating Automatic Configuration
Scripts, which can be found online at the Free
Software Foundation’s website.)
To configure gawk
, simply run
configure
:
sh ./configure
This produces a Makefile
and
config.h
tailored to your system.
The config.h
file describes various
facts about your system. You might want to edit the Makefile
to change the CFLAGS
variable, which controls the
command-line options that are passed to the C compiler (such as
optimization levels or compiling for debugging).
Alternatively, you can add your own values for most make
variables on the command line, such as
CC
and CFLAGS
, when running configure
:
CC=cc CFLAGS=-g sh ./configure
See the file INSTALL
in the
gawk
distribution for all the
details.
After you have run configure
and possibly edited the Makefile
,
type:
make
Shortly thereafter, you should have an executable version of
gawk
. That’s all there is to it! To
verify that gawk
is working properly,
run ‘make check
’. All of the tests
should succeed. If these steps do not work, or if any of the tests fail,
check the files in the README_d
directory to see if you’ve found a known problem. If the failure is not
described there, send in a bug report (see Reporting Problems and Bugs).
Of course, once you’ve built gawk
, it is likely that you will wish to
install it. To do so, you need to run the command ‘make install
’, as a user with the appropriate
permissions. How to do this varies by system, but on many systems you
can use the sudo
command to do so.
The command then becomes ‘sudo make
install
’. It is likely that you will be asked for your
password, and you will have to have been set up previously as a user who
is allowed to run the sudo
command.
There are several additional options you may use on the configure
command line when compiling gawk
from scratch, including:
--disable-extensions
Disable configuring and building the sample extensions in
the extension
directory. This
is useful for cross-compiling. The default action is to
dynamically check if the extensions can be configured and
compiled.
--disable-lint
Disable all lint checking within gawk
. The --lint
and
--lint-old
options (see Command-Line Options) are accepted, but silently do nothing.
Similarly, setting the LINT
variable (see Built-in Variables That Control awk) has no effect
on the running awk
program.
When used with the GNU Compiler Collection’s (GCC’s)
automatic dead code elimination, this option cuts almost 23K bytes
off the size of the gawk
executable on GNU/Linux x86_64 systems. Results on other systems
and with other compilers are likely to vary. Using this option may
bring you some slight performance improvement.
Using this option will cause some of the tests in the test suite to fail. This option may be removed at a later date.
--disable-nls
Disable all message-translation facilities. This is usually not desirable, but it may bring you some slight performance improvement.
--with-whiny-user-strftime
Force use of the included version of the C strftime()
function for deficient
systems.
Use the command ‘./configure
--help
’ to see the full list of options supplied by configure
.
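The configure, build, test, and install steps described above can be collected into one small script. The following is a sketch only: the run and build_gawk functions and the DRY_RUN variable are hypothetical names, and the use of sudo assumes your system is set up as described earlier.

```shell
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the documented steps. Any arguments are
# passed through to configure (for example, --disable-nls). Each step
# runs only if the previous one succeeded.
build_gawk() {
    run sh ./configure "$@" &&
    run make &&
    run make check &&
    run sudo make install
}
```

From the gawk-4.1.2 directory, ‘DRY_RUN=1 build_gawk --disable-nls’ prints the four commands without running them, which is a cheap way to review the sequence before committing to it.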
This section is of interest only if you know something about using the C language and Unix-like operating systems.
The source code for gawk
generally attempts to adhere to formal standards wherever possible. This
means that gawk
uses library routines
that are specified by the ISO C standard and by the POSIX operating
system interface standard. The gawk
source code requires using an ISO C compiler (the 1990 standard).
Many Unix systems do not support all of either the ISO or the
POSIX standards. The missing_d
subdirectory in the gawk
distribution
contains replacement versions of those functions that are most likely to
be missing.
The config.h
file that
configure
creates contains
definitions that describe features of the particular operating system
where you are attempting to compile gawk
. The three things described by this file
are: what header files are available, so that they can be correctly
included, what (supposedly) standard functions are actually available in
your C libraries, and various miscellaneous facts about your operating
system. For example, there may not be an st_blksize
element in the stat
structure. In this case, ‘HAVE_STRUCT_STAT_ST_BLKSIZE
’ is
undefined.
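To see what configure decided about a particular feature, you can search the generated config.h. The following is a minimal sketch; the function name have_feature is hypothetical, and it assumes the usual Autoconf layout in which an available feature appears as ‘#define MACRO 1’ and a missing one as a commented-out ‘#undef’ line.

```shell
# Hypothetical helper: succeed if macro $2 is defined in header file $1.
# Assumes Autoconf-style output: "#define MACRO 1" when the feature is
# present, "/* #undef MACRO */" when it is not.
have_feature() {
    grep -Eq "^#define $2([[:space:]]|\$)" "$1"
}

# Example use after running configure (not run here):
#   if have_feature config.h HAVE_STRUCT_STAT_ST_BLKSIZE; then
#       echo "struct stat has st_blksize"
#   fi
```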
It is possible for your C compiler to lie to configure
. It may do so by not exiting with an
error when a library function is not available. To get around this, edit
the custom.h
file. Use an ‘#ifdef
’ that is appropriate for your system,
and either #define
any constants that
configure
should have defined but
didn’t, or #undef
any constants that
configure
defined and should not
have. The custom.h
file is
automatically included by the config.h
file.
It is also possible that the configure
program generated by Autoconf will
not work on your system in some other fashion. If you do have a problem,
the configure.ac
file is the input
for Autoconf. You may be able to change this file and generate a new
version of configure
that works on
your system (see Reporting Problems and Bugs for information on how to
report problems in configuring gawk
).
The same mechanism may be used to send in updates to configure.ac
and/or custom.h
.
This section describes how to install gawk
on various non-Unix systems.
This section covers installation and usage of gawk
on Intel architecture machines running
MS-DOS and any version of MS-Windows. In this section, the term “Windows32” refers to any of
Microsoft Windows 95/98/ME/NT/2000/XP/Vista/7/8.
The limitations of MS-DOS (and MS-DOS shells under the other
operating systems) have meant that various “DOS extenders” are often
used with programs such as gawk
. The
varying capabilities of Microsoft Windows 3.1 and Windows32 can add to
the confusion. For an overview of the considerations, refer to README_d/README.pc
in the
distribution.
gawk
can be compiled for
MS-DOS and Windows32 using the GNU development tools from DJ Delorie
(DJGPP: MS-DOS only) or MinGW (Windows32). The file README_d/README.pc
in the gawk
distribution contains additional notes,
and pc/Makefile
contains
important information on compilation options.
To build gawk
for MS-DOS and
Windows32, copy the files in the pc
directory (except
for ChangeLog
) to the directory
with the rest of the gawk
sources,
then invoke make
with the
appropriate target name as an argument to build gawk
. The Makefile
copied from the pc
directory contains a configuration
section with comments and may need to be edited in order to work with
your make
utility.
The Makefile
supports a
number of targets for building various MS-DOS and Windows32 versions.
A list of targets is printed if the make
command is given without a target. As
an example, to build gawk
using the
DJGPP tools, enter ‘make djgpp
’.
(The DJGPP tools needed for the build may be found at ftp://ftp.delorie.com/pub/djgpp/current/v2gnu/.) To
build a native MS-Windows binary of gawk
using the MinGW tools, type ‘make mingw32
’.
Using make
to run the
standard tests and to install gawk
requires additional Unix-like tools, including sh
, sed
,
and cp
. In order to run the tests,
the test/*.ok
files may need to
be converted so that they have the usual MS-DOS-style end-of-line
markers. Alternatively, run make check
CMP="diff -a"
to use GNU diff
in text mode instead of cmp
to compare the resulting files.
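If you choose to convert the test/*.ok files rather than change CMP, one possible approach is to add the carriage returns with awk on a Unix-like system before copying the files. This is a sketch under stated assumptions: the function name to_dos is hypothetical, the input is assumed to use plain LF line endings, and every line (including the last) is written with a terminating CR LF.

```shell
# Hypothetical helper: rewrite a file in place with MS-DOS-style (CR LF)
# line endings. Assumes the input currently uses plain LF endings.
to_dos() {
    tmp="$1.tmp.$$"
    awk '{ printf "%s\r\n", $0 }' "$1" > "$tmp" && mv "$tmp" "$1"
}

# Example use from the top-level gawk directory (not run here):
#   for f in test/*.ok; do to_dos "$f"; done
```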
Under MS-DOS and MS-Windows, the Cygwin and MinGW environments
support both the ‘|&
’
operator and TCP/IP networking (see Using gawk for Network Programming).
The MS-DOS and MS-Windows versions of gawk
search for program files as described
in The AWKPATH Environment Variable. However, semicolons (rather
than colons) separate elements in the AWKPATH
variable.
If AWKPATH
is not set or is empty, then the default
search path is ‘.;c:/lib/awk;c:/gnu/lib/awk
’.
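One plausible reason for the semicolon separator is that a colon already follows the drive letter in names such as c:/lib/awk. The following sketch splits the default search path shown above into its elements; the function name list_awkpath is hypothetical.

```shell
# Hypothetical helper: print one search-path element per line.
# $1 is the path list, $2 the separator character
# (";" on MS-DOS and MS-Windows, ":" on Unix).
list_awkpath() {
    old_ifs=$IFS
    IFS=$2
    set -f                      # do not expand wildcards in path elements
    for dir in $1; do
        echo "$dir"
    done
    set +f
    IFS=$old_ifs
}

list_awkpath '.;c:/lib/awk;c:/gnu/lib/awk' ';'
```

Splitting on ‘;’ yields three elements, with the drive-letter colons left intact.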
An sh
-like shell (as opposed
to command.com
under MS-DOS or
cmd.exe
under MS-Windows) may be
useful for awk
programming. The
DJGPP collection of tools includes an MS-DOS port of Bash.
Under MS-Windows and MS-DOS, gawk (and many other text programs) silently translates end-of-line ‘\r\n’ to ‘\n’ on input and ‘\n’ to ‘\r\n’ on output. A special BINMODE
variable (c.e.) allows control over
these translations and is interpreted as
follows:
If BINMODE
is "r"
or one, then binary mode is set on
read (i.e., no translations on reads).
If BINMODE
is "w"
or two, then binary mode is set on
write (i.e., no translations on writes).
If BINMODE
is "rw"
or "wr"
or three, binary mode is set for
both read and write.
BINMODE=non-null-string is the same as ‘BINMODE=3’
(i.e., no translations on reads or writes). However, gawk
issues a warning message if the
string is not one of "rw"
or
"wr"
.
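The rules above amount to a small lookup table. The following sketch restates them as a shell function; the name binmode_effect is hypothetical, and treating an empty value or 0 as ordinary text mode is an assumption, since the list above does not mention those cases. Numeric values other than 1, 2, and 3 are not covered.

```shell
# Hypothetical helper: describe how a BINMODE value is interpreted,
# following the rules listed above.
binmode_effect() {
    case "$1" in
        r|1)     echo "binary read" ;;
        w|2)     echo "binary write" ;;
        rw|wr|3) echo "binary read and write" ;;
        ""|0)    echo "text mode" ;;   # assumption: the default behavior
        *)       echo "binary read and write, with a warning" ;;
    esac
}

binmode_effect rw
binmode_effect binary
```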
The modes for standard input and standard output are set one
time only (after the command line is read, but before processing any
of the awk
program). Setting
BINMODE
for standard input or
standard output is accomplished by using an appropriate ‘-v BINMODE=N’ option on
the command line. BINMODE
is set at
the time a file or pipe is opened and cannot be changed
midstream.
The name BINMODE
was
chosen to match mawk
(see
Other Freely Available awk Implementations). mawk
and gawk
handle BINMODE
similarly; however, mawk
adds a ‘-W BINMODE=N’ option and an environment variable that can set BINMODE, RS, and ORS. The files binmode[1-3].awk (under gnu/lib/awk in some of the prepared binary distributions) have been chosen to match mawk’s ‘-W BINMODE=N’ option. These can be changed or discarded; in particular, the setting of RS giving the fewest “surprises” is open to debate. mawk uses ‘RS = "\r\n"’ if binary mode is set on read,
which is appropriate for files with the MS-DOS-style
end-of-line.
To illustrate, the following examples set binary mode on writes
for standard output and other files, and set ORS
as the “usual” MS-DOS-style
end-of-line:
gawk -v BINMODE=2 -v ORS="\r\n" …
or:
gawk -v BINMODE=w -f binmode2.awk …
These give the same result as the ‘-W BINMODE=2’ option in mawk. The following changes the record separator to "\r\n" and sets binary mode on reads, but does not affect the mode on standard input:
gawk -v RS="\r\n" -e "BEGIN { BINMODE = 1 }" …
or:
gawk -f binmode1.awk …
With proper quoting, in the first example the setting of
RS
can be moved into the BEGIN
rule.
gawk
can be built and used
“out of the box” under MS-Windows if you are using the Cygwin
environment. This environment provides an excellent simulation
of GNU/Linux, using Bash, GCC, GNU Make, and other GNU programs.
Compilation and installation for Cygwin is the same as for a Unix
system:
tar -xvpzf gawk-4.1.2.tar.gz
cd gawk-4.1.2
./configure
make && make check
When compared to GNU/Linux on the same system, the ‘configure
’ step on Cygwin takes considerably
longer. However, it does finish, and then the ‘make
’ proceeds as usual.
In the MSYS environment under MS-Windows, gawk
automatically uses binary mode for reading and writing files. Thus,
there is no need to use the BINMODE
variable.
This can cause problems with other Unix-like components that have been ported to MS-Windows that expect gawk to do automatic translation of "\r\n", because it won’t.
This subsection describes how to compile and install gawk
under VMS. The older designation “VMS” is
used throughout to refer to OpenVMS.
To compile gawk
under
VMS, there is a DCL
command
procedure that issues all the necessary CC
and LINK
commands. There is also a Makefile
for use with the MMS
and MMK
utilities. From the source directory,
use either:
$ @[.vms]vmsbuild.com
or:
$ MMS/DESCRIPTION=[.vms]descrip.mms gawk
or:
$ MMK/DESCRIPTION=[.vms]descrip.mms gawk
MMK
is an open source, free,
near-clone of MMS
and can better handle ODS-5 volumes with upper- and
lowercase filenames. MMK
is
available from https://github.com/endlesssoftware/mmk.
With ODS-5 volumes and extended parsing enabled, the case of the target parameter may need to be exact.
gawk
has been tested under
VAX/VMS 7.3 and Alpha/VMS 7.3-1 using Compaq C V6.4, and under
Alpha/VMS 7.3, Alpha/VMS 7.3-2, and IA64/VMS 8.3. The most recent
builds used HP C V7.3 on Alpha VMS 8.3 and on both Alpha and IA64 VMS 8.4.
See The VMS GNV project for information on building
gawk
as a PCSI kit that is
compatible with the GNV product.
The extensions that have been ported to VMS can be built using one of the following commands:
$ MMS/DESCRIPTION=[.vms]descrip.mms extensions
or:
$ MMK/DESCRIPTION=[.vms]descrip.mms extensions
gawk
uses AWKLIBPATH
as either an environment variable
or a logical name to find the dynamic extensions.
Dynamic extensions need to be compiled with the same compiler
options for floating-point, pointer size, and symbol name handling as
were used to compile gawk
itself.
Alpha and Itanium should use IEEE floating point. The pointer size is
32 bits, and the symbol name handling should be exact case with CRC
shortening for symbols longer than 31 characters.
For Alpha and Itanium:
/name=(as_is,short) /float=ieee/ieee_mode=denorm_results
For VAX:
/name=(as_is,short)
Compile-time macros need to be defined before the first VMS-supplied header file is included, as follows:
#if (__CRTL_VER >= 70200000) && !defined (__VAX)
#define _LARGEFILE 1
#endif
#ifndef __VAX
#ifdef __CRTL_VER
#if __CRTL_VER >= 80200000
#define _USE_STD_STAT 1
#endif
#endif
#endif
If you are writing your own extensions to run on VMS, you must supply these definitions yourself. The config.h file created when building gawk on VMS already contains them; if you use that file or a similar one instead of writing the definitions by hand, remember to include it before any VMS-supplied header files.
To use gawk
, all you need is
a “foreign” command, which is a DCL
symbol
whose value begins with a dollar sign. For example:
$ GAWK :== $disk1:[gnubin]gawk
Substitute the actual location of gawk.exe
for ‘$disk1:[gnubin]
’. The symbol should be
placed in the login.com
of any
user who wants to run gawk
, so that
it is defined every time the user logs on. Alternatively, the symbol
may be placed in the system-wide sylogin.com
procedure, which allows all
users to run gawk
.
If your gawk
was installed by
a PCSI kit into the GNV$GNU:
directory tree, the program will be known as GNV$GNU:[bin]gnv$gawk.exe
and the help file
will be GNV$GNU:[vms_help]gawk.hlp
.
The PCSI kit also installs a GNV$GNU:[vms_bin]gawk_verb.cld
file that
can be used to add gawk
and
awk
as DCL commands.
For just the current process you can use:
$ set command gnv$gnu:[vms_bin]gawk_verb.cld
Or the system manager can use GNV$GNU:[vms_bin]gawk_verb.cld to add the gawk and awk commands to the system-wide ‘DCLTABLES’.
The DCL syntax is documented in the gawk.hlp
file.
Optionally, the gawk.hlp
entry can be loaded into a VMS help library:
$ LIBRARY/HELP sys$help:helplib [.vms]gawk.hlp
(You may want to substitute a site-specific help library rather
than the standard VMS library ‘HELPLIB
’.) After loading the help text, the
command:
$ HELP GAWK
provides information about both the gawk
implementation and the awk
programming language.
The logical name ‘AWK_LIBRARY
’ can designate a default
location for awk
program files. For
the -f
option, if the specified filename has no
device or directory path information in it, gawk
looks in the current directory first,
then in the directory specified by the translation of ‘AWK_LIBRARY
’ if the file is not found. If,
after searching in both directories, the file still is not found,
gawk
appends the suffix ‘.awk
’ to the filename and retries the file
search. If ‘AWK_LIBRARY
’ has no
definition, a default value of ‘SYS$LIBRARY:
’ is used for it.
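The lookup order just described can be modeled in a few lines. The following sketch uses Unix-style paths in place of VMS directory specifications, so it illustrates the search order only and is not how gawk itself is implemented; the function name find_awk_program is hypothetical.

```shell
# Hypothetical model of the lookup order described above.
# $1 = program name given to -f
# $2 = directory standing in for the translation of AWK_LIBRARY
find_awk_program() {
    # First pass uses the name as given; second pass appends ".awk".
    for name in "$1" "$1.awk"; do
        # Current directory first, then the library directory.
        for dir in . "$2"; do
            if [ -f "$dir/$name" ]; then
                echo "$dir/$name"
                return 0
            fi
        done
    done
    return 1
}
```

Note that the ‘.awk’ suffix is tried only after both directories have been searched for the plain name, so a file named prog in the library directory wins over prog.awk in the current directory.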
Command-line parsing and quoting conventions are significantly
different on VMS, so examples in this book or from other sources often need
minor changes. They are minor though, and all
awk
programs should run
correctly.
Here are a couple of trivial tests:
$ gawk -- "BEGIN {print ""Hello, World!""}"
$ gawk -"W" version
! could also be -"W version" or "-W version"
Note that uppercase and mixed-case text must be quoted.
The VMS port of gawk
includes
a DCL
-style interface in addition
to the original shell-style interface (see the help entry for
details). One side effect of dual command-line parsing is that if
there is only a single parameter (as in the quoted string program),
the command becomes ambiguous. To work around this, the normally
optional --
flag is required to force Unix-style
parsing rather than DCL
parsing. If
any other dash-type options (or multiple parameters such as datafiles
to process) are present, there is no ambiguity and --
can be omitted.
The exit
value is a
Unix-style value and is encoded into a VMS exit status value when the
program exits.
The VMS severity bits will be set based on the exit
value. A failure is indicated by 1, and
VMS sets the ERROR
status. A fatal
error is indicated by 2, and VMS sets the FATAL
status. All other values will have the
SUCCESS
status. The exit value is
encoded to comply with VMS coding standards and will have the C_FACILITY_NO
of 0x350000
with the constant 0xA000
added to the number shifted over by 3
bits to make room for the severity codes.
To extract the actual gawk
exit code from the VMS status, use:
unix_status = (vms_status .and. %x7f8) / 8
A C program that uses exec()
to call gawk
will get the original
Unix-style exit value.
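The arithmetic behind the encoding can be tried out with ordinary shell integer math. The following is a sketch of the scheme as described above, not code from gawk: the function names are hypothetical, and the numeric severity values (1 for success, 2 for error, 4 for fatal) are assumed from the usual VMS condition-value layout.

```shell
# Hypothetical model of the encoding described above.
# Severity values (1 = success, 2 = error, 4 = fatal) are an assumption
# based on the usual VMS condition-value layout.
encode_vms_status() {
    case "$1" in
        1) sev=2 ;;   # failure       -> ERROR
        2) sev=4 ;;   # fatal error   -> FATAL
        *) sev=1 ;;   # anything else -> SUCCESS
    esac
    echo $(( 0x350000 + 0xA000 + ($1 << 3) + sev ))
}

# Recover the Unix-style value: keep bits 3 through 10, then divide by 8.
decode_vms_status() {
    echo $(( ($1 & 0x7f8) / 8 ))
}

status=$(encode_vms_status 2)
printf 'VMS status 0x%X decodes to %d\n' "$status" "$(decode_vms_status "$status")"
```

The mask 0x7f8 covers exactly the eight bits that hold the shifted exit value, which is why the facility number, the added constant, and the severity bits all drop out when decoding.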
Older versions of gawk
for
VMS treated a Unix exit code 0 as 1, a failure as 2, a fatal error as
4, and passed all the other numbers through. This violated the VMS
exit status coding requirements.
VAX/VMS floating point uses unbiased rounding. See Rounding Numbers.
VMS reports time values in GMT unless one of the SYS$TIMEZONE_RULE
or TZ
logical names is set. Older versions of
VMS, such as VAX/VMS 7.3, do not set these logical names.
The default search path, when looking for awk
program files specified by the
-f
option, is "SYS$DISK:[],AWK_LIBRARY:"
. The logical name
AWKPATH
can be used to override this default. The
format of AWKPATH
is a comma-separated list of
directory specifications. When defining it, the value should be quoted
so that it retains a single translation and not a multitranslation
RMS
searchlist.
The VMS GNV package provides a build environment similar to
POSIX with ports of a collection of open source tools. The gawk
found in the GNV base kit is an older
port. Currently, the GNV project is being reorganized to supply
individual PCSI packages for each component. See https://sourceforge.net/p/gnv/wiki/InstallingGNVPackages/.
The normal build procedure for gawk
produces a program that is suitable for
use with GNV.
The file vms/gawk_build_steps.txt
in the
distribution documents the procedure for building a VMS PCSI kit that
is compatible with GNV.
Some versions of VMS have an old version of gawk
. To access it, define a symbol, as follows:
$ gawk :== $sys$common:[syshlp.examples.tcpip.snmp]gawk.exe
This is apparently version 2.15.6, which is extremely old. We recommend compiling and using the current version.
There is nothing more dangerous than a bored archaeologist.
—Douglas Adams, The Hitchhiker’s Guide to the Galaxy
If you have problems with gawk
or
think that you have found a bug, report it to the developers; we cannot promise to do
anything, but we might well want to fix it.
Before reporting a bug, make sure you have really found a genuine bug. Carefully reread the documentation and see if it says you can do what you’re trying to do. If it’s not clear whether you should be able to do something or not, report that too; it’s a bug in the documentation!
Before reporting a bug or trying to fix it yourself, try to isolate
it to the smallest possible awk
program
and input datafile that reproduce the problem. Then send us the program
and datafile, some idea of what kind of Unix system you’re using, the
compiler you used to compile gawk
, and
the exact results gawk
gave you. Also
say what you expected to occur; this helps us decide whether the problem
is really in the documentation.
Make sure to include the version number of gawk
you are using. You can get this information
with the command ‘gawk
--version
’.
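Gathering the basic facts before writing the report saves a round trip with the maintainers. The following is a minimal sketch; the function name bug_report_header is hypothetical, and it assumes the compiler used to build gawk is recorded in the CC environment variable, which may not be true on your system.

```shell
# Hypothetical helper: print the basic facts a bug report should include.
bug_report_header() {
    echo "System: $(uname -a)"
    if command -v gawk >/dev/null 2>&1; then
        # The first line of the version output names the release.
        echo "gawk: $(gawk --version | head -n 1)"
    else
        echo "gawk: not found in PATH"
    fi
    # Assumption: CC names the compiler that built gawk.
    echo "Compiler: ${CC:-unknown}"
}

bug_report_header
```

You would still need to add the smallest program and datafile that reproduce the problem, the exact results you got, and what you expected instead.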
Once you have a precise problem description, send email to [email protected].
The gawk
maintainers subscribe to
this address, and thus they will receive your bug report. Although you can
send mail to the maintainers directly, the bug reporting address is
preferred because the email list is archived at the GNU Project.
All email must be in English. This is the only language
understood in common by all the maintainers.
Do not try to report bugs in gawk
by posting to the Usenet/Internet
newsgroup comp.lang.awk
. The gawk
developers do occasionally read this
newsgroup, but there is no guarantee that we will see your posting. The
steps described here are the only officially recognized way for
reporting bugs. Really.
Many distributions of GNU/Linux and the various BSD-based operating systems have their own bug reporting systems. If you report a bug using your distribution’s bug reporting system, you should also send a copy to [email protected].
This is for two reasons. First, although some distributions
forward bug reports “upstream” to the GNU mailing list, many don’t, so
there is a good chance that the gawk
maintainers won’t even see the bug report! Second, mail to the GNU list
is archived, and having everything at the GNU Project keeps things
self-contained and not dependent on other organizations.
Non-bug suggestions are always welcome as well. If you have questions about things that are unclear in the documentation or are just obscure features, ask on the bug list; we will try to help you out if we can.
If you find bugs in one of the non-Unix ports of gawk
, send an email to the bug list, with a copy
to the person who maintains that port. The maintainers are named in the
following list, as well as in the README
file in the gawk
distribution. Information in the README
file should be considered authoritative
if it conflicts with this book.
The people maintaining the various gawk
ports are:
Unix and POSIX systems | Arnold Robbins, [email protected] |
MS-DOS with DJGPP | Scott Deifik, [email protected] |
MS-Windows with MinGW | Eli Zaretskii, [email protected] |
OS/2 | Andreas Buening, [email protected] |
VMS | John Malmberg, [email protected] |
z/OS (OS/390) | Dave Pitts, [email protected] |
If your bug is also reproducible under Unix, send a copy of your report to the [email protected] email list as well.
It’s kind of fun to put comments like this in your awk code:
// Do C++ comments work? answer: yes! of course
—Michael Brennan
There are a number of other freely available awk
implementations. This section briefly describes where to get them:
awk
Brian Kernighan, one of the original designers of Unix
awk
, has made his implementation
of awk
freely
available. You can retrieve this version via his home page. It is
available in several archive formats:
tar file
You can also retrieve it from GitHub:
git clone git://github.com/onetrueawk/awk bwkawk
This command creates a copy of the Git repository in a directory named
bwkawk
. If you leave that
argument off the git
command
line, the repository copy is created in a directory named awk
.
This version requires an ISO C (1990 standard) compiler; the C compiler from GCC (the GNU Compiler Collection) works quite nicely.
See Common Extensions Summary for a list of
extensions in this awk
that are
not in POSIX awk
.
As a side note, Dan Bornstein has created a Git repository
tracking all the versions of BWK awk
that he could find. It’s available at
http://github.com/danfuzz/one-true-awk.
mawk
Michael Brennan wrote an independent implementation of awk, called mawk. It is available under the GPL, just as gawk is.
The original distribution site for the mawk source code no longer has it. A copy is available at http://www.skeeve.com/gawk/mawk1.3.3.tar.gz.
In 2009, Thomas Dickey took on mawk maintenance. Basic information is available on the project’s web page. The download URL is http://invisible-island.net/datafiles/release/mawk.tar.gz.
Once you have it, gunzip may be used to decompress this file. Installation is similar to gawk’s (see Compiling and Installing gawk on Unix-Like Systems).
See Common Extensions Summary for a list of extensions in mawk that are not in POSIX awk.
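As a rough sketch of the decompression step mentioned for the mawk archive, the helper below decompresses a gzip-compressed tar file with gunzip and passes the result to tar. The function name unpack_tgz is hypothetical and is not part of mawk or gawk.

```shell
# unpack_tgz ARCHIVE
# Decompress a gzip-compressed tar archive with gunzip and extract it
# into the current directory. (unpack_tgz is a hypothetical helper name.)
unpack_tgz() {
    gunzip -c "$1" | tar -xf -
}
```

Assuming the archive was saved as mawk.tar.gz, running ‘unpack_tgz mawk.tar.gz’ should leave the sources in a new subdirectory, after which the build is expected to follow the same general pattern as for gawk.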
awka
Written by Andrew Sumner, awka translates awk programs into C, compiles them, and links them with a library of functions that provide the core awk functionality. It also has a number of extensions.
The awk translator is released under the GPL, and the library is under the LGPL.
To get awka, go to http://sourceforge.net/projects/awka.
The project seems to be frozen; no new code changes have been made since approximately 2001.
pawk
Nelson H.F. Beebe at the University of Utah has modified BWK awk to provide timing and profiling information. It is different from gawk with the --profile option (see Profiling Your awk Programs) in that it uses CPU-based profiling, not line-count profiling. You may find it at either ftp://ftp.math.utah.edu/pub/pawk/pawk-20030606.tar.gz or http://www.math.utah.edu/pub/pawk/pawk-20030606.tar.gz.
BusyBox awk
BusyBox is a GPL-licensed program providing small versions of many applications within a single executable. It is aimed at embedded systems. It includes a full implementation of POSIX awk. When building it, be careful not to run ‘make install’, as it will overwrite copies of other applications in your /usr/local/bin. For more information, see the project’s home page.
The OpenSolaris POSIX awk
The versions of awk in /usr/xpg4/bin and /usr/xpg6/bin on Solaris are more or less POSIX-compliant. They are based on the awk from Mortice Kern Systems for PCs. We were able to make this code compile and work under GNU/Linux with 1–2 hours of work. Making it more generally portable (using GNU Autoconf and/or Automake) would take more work, and this has not been done, at least to our knowledge.
The source code used to be available from the OpenSolaris website. However, that project was ended and the website shut down. Fortunately, the Illumos project makes this implementation available. You can view the files one at a time from https://github.com/joyent/illumos-joyent/blob/master/usr/src/cmd/awk_xpg4.
jawk
This is an interpreter for awk written in Java. It claims to be a full interpreter, although because it uses Java facilities for I/O and for regexp matching, the language it supports is different from POSIX awk. More information is available on the project’s home page.
libmawk
This is an embeddable awk interpreter derived from mawk.
For more information, see http://repo.hu/projects/libmawk/.
pawk
This is a Python module that claims to bring awk-like features to Python. See https://github.com/alecthomas/pawk for more information. (This is not related to Nelson Beebe’s modified version of BWK awk, described earlier.)
QSE awk
This is an embeddable awk interpreter. For more information, see http://code.google.com/p/qse/ and http://awk.info/?tools/qse.
QTawk
This is an independent implementation of awk distributed under the GPL. It has a large number of extensions over standard awk and may not be 100% syntactically compatible with it. See http://www.quiktrim.org/QTawk.html for more information, including the manual and a download link.
The project may also be frozen; no new code changes have been made since approximately 2008.
See also the “Versions and implementations” section of the Wikipedia article on awk for information on additional versions.
The gawk distribution is available from the GNU Project’s main distribution site, ftp.gnu.org. The canonical build recipe is:
wget http://ftp.gnu.org/gnu/gawk/gawk-4.1.2.tar.gz
tar -xvpzf gawk-4.1.2.tar.gz
cd gawk-4.1.2
./configure && make && make check
gawk may be built on non-POSIX systems as well. The currently supported systems are MS-Windows using DJGPP, MSYS, MinGW, and Cygwin, and both Vax/VMS and OpenVMS. Instructions for each system are included in this appendix.
Bug reports should be sent via email to [email protected]. Bug reports should be in English and should include the version of gawk, how it was compiled, and a short program and datafile that demonstrate the problem.
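One possible way to gather most of these items in one place is sketched below. The function name collect_report is hypothetical and not part of gawk; it assumes the test program and datafile already exist, and it cannot determine how gawk was compiled, so that detail still has to be added by hand.

```shell
# collect_report PROGRAM DATAFILE
# Print the pieces a gawk bug report should include: the gawk version,
# the short test program, and the datafile that triggers the problem.
# (collect_report is a hypothetical helper name, not part of gawk.)
collect_report() {
    prog=$1
    data=$2

    echo "== gawk version =="
    if command -v gawk >/dev/null 2>&1; then
        # The first line of the version output names the release.
        gawk --version | head -n 1
    else
        echo "gawk not found in PATH"
    fi

    echo "== program: $prog =="
    cat "$prog"

    echo "== data: $data =="
    cat "$data"
}
```

For example, ‘collect_report bug.awk bug.dat > report.txt’ would write the collected text to report.txt, where bug.awk and bug.dat are placeholder filenames.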
There are a number of other freely available awk implementations. Many are POSIX-compliant; others are less so.