The awk Programming Language

1.4. The sed Editor

This section presents the following topics:

Conceptual overview of sed
Command-line syntax
Syntax of sed commands
Group summary of sed commands
Alphabetical summary of sed commands

1.4.1. Conceptual Overview

sed is a non-interactive, or stream-oriented, editor. It interprets a script and performs the actions in the script. sed is stream-oriented because, like many Unix programs, input flows through the program and is directed to standard output. For example, sort is stream-oriented; vi is not. sed's input typically comes from a file or pipe, but it can also be directed from the keyboard. Output goes to the screen by default but can be captured in a file or sent through a pipe instead.

The Free Software Foundation has a version of sed, available from ftp://gnudist.gnu.org/gnu/sed/sed-3.02.tar.gz. The somewhat older version, 2.05, is also available.

Typical uses of sed include:

Editing one or more files automatically
Simplifying repetitive edits to multiple files
Writing conversion programs

sed operates as follows:

Each line of input is copied into a "pattern space," an internal buffer where editing operations are performed.
All editing commands in a sed script are applied, in order, to each line of input.
Editing commands are applied to all lines (globally) unless line addressing restricts the lines affected.
If a command changes the input, subsequent commands and address tests will be applied to the current line in the pattern space, not the original input line.
The original input file is unchanged because the editing commands modify a copy of each original input line. The copy is sent to standard output (but can be redirected to a file).
sed also maintains the "hold space," a separate buffer that can be used to save data for later retrieval.
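
A minimal sketch of two of the points above: later commands see the already-edited pattern space, and the input file is left untouched. The file contents and the variable names (`sample`, `edited`, `original`) are made up for illustration.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf 'apple\nbanana\n' > "$sample"

# The first command rewrites "apple" to "cherry". The address /cherry/ on the
# second command is tested against the modified pattern space, so it matches.
edited=$(sed -e 's/apple/cherry/' -e '/cherry/s/$/ pie/' "$sample")
printf '%s\n' "$edited"

# The input file itself is unchanged; only the copy sent to stdout was edited.
original=$(cat "$sample")
printf '%s\n' "$original"

rm -f "$sample"
```

To keep the edited text, redirect standard output to a different file rather than to the input file.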

1.4.2. Command-Line Syntax

The syntax for invoking sed has two forms:

sed [-n] [-e] 'command' file(s)
sed [-n] -f scriptfile file(s)

The first form allows you to specify an editing command on the command line, surrounded by single quotes. The second form allows you to specify a scriptfile, a file containing sed commands. Both forms may be used together, and they may be used multiple times. If no files are specified, sed reads from standard input.

The following options are recognized:

-n: Suppress the default output; sed displays only those lines specified with the p command or with the p flag of the s command.
-e cmd: Next argument is an editing command. Useful if multiple scripts or commands are specified.
-f file: Next argument is a file containing editing commands.

If the first line of the script is #n, sed behaves as if -n had been specified.
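
The options can be tried out with a short sketch like the following. The input text, the temporary script file, and the variable names are illustrative assumptions, not part of the original text.

```shell
input='alpha
beta
gamma'

# -n suppresses the default output, so only lines printed by p appear.
only_beta=$(printf '%s\n' "$input" | sed -n '/beta/p')

# Several -e options are applied in order to each input line.
two_edits=$(printf '%s\n' "$input" | sed -e 's/alpha/ALPHA/' -e 's/gamma/GAMMA/')

# -f reads the editing commands from a script file (one command per line).
script=$(mktemp)
printf '%s\n' 's/beta/BETA/' '/gamma/d' > "$script"
from_file=$(printf '%s\n' "$input" | sed -f "$script")
rm -f "$script"

printf '%s\n' "$only_beta" "$two_edits" "$from_file"
```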

1.4.3. Syntax of sed Commands

sed commands have the general form:

[address[, address]][!]command [arguments]

sed copies each line of input into the pattern space. sed instructions consist of addresses and editing commands. If the address of the command matches the line in the pattern space, then the command is applied to that line. If a command has no address, then it is applied to each input line. If a command changes the contents of the pattern space, subsequent commands and addresses will be applied to the current line in the pattern space, not the original input line.

commands consist of a single letter or symbol; they are described later, alphabetically and by group. arguments include the label supplied to b or t, the filename supplied to r or w, and the substitution flags for s. addresses are described in the next section.

1.4.3.1. Pattern addressing

A sed command can specify zero, one, or two addresses. An address can be a line number, the symbol $ (for last line), or a regular expression enclosed in slashes (/pattern/). Regular expressions are described in Section 1.3. Additionally, \n can be used to match any newline in the pattern space (resulting from the N command), but not the newline at the end of the pattern space.

If the Command Specifies:	Then the Command Is Applied To:
No address	Each input line.
One address	Any line matching the address. Some commands accept only one address: `a`, `i`, `r`, `q`, and `=`.
Two comma-separated addresses	First matching line and all succeeding lines up to and including a line matching the second address.
An address followed by `!`	All lines that do not match the address.
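
Each kind of address in the table can be exercised with a small sketch such as this one. The five-line input and the variable names are assumptions made for the example.

```shell
input=$(printf 'one\ntwo\nthree\nfour\nfive')

# One address, given as a line number.
second=$(printf '%s\n' "$input" | sed -n '2p')

# One address, using $ for the last line.
last=$(printf '%s\n' "$input" | sed -n '$p')

# Two comma-separated addresses: from the first match through the second.
range=$(printf '%s\n' "$input" | sed -n '/two/,/four/p')

# An address followed by !: delete every line that does NOT start with "t".
negated=$(printf '%s\n' "$input" | sed '/^t/!d')

printf '%s\n' "$second" "$last" "$range" "$negated"
```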

1.4.3.2. Examples

`s/xx/yy/g`	Substitute on all lines (all occurrences).
`/BSD/d`	Delete lines containing `BSD`.
`/^BEGIN/,/^END/p`	Print between `BEGIN` and `END`, inclusive.
`/SAVE/!d`	Delete any line that doesn't contain `SAVE`.
`/BEGIN/,/END/!s/xx/yy/g`	Substitute on all lines, except between `BEGIN` and `END`.
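
As a rough illustration, three of the commands above can be run against a small made-up input. The sample text and variable names are hypothetical.

```shell
input='xx header
BEGIN
xx inside
END
SAVE xx footer'

# Substitute everywhere except between BEGIN and END.
outside=$(printf '%s\n' "$input" | sed '/BEGIN/,/END/!s/xx/yy/g')

# Keep only lines that contain SAVE.
saved=$(printf '%s\n' "$input" | sed '/SAVE/!d')

# Print only the BEGIN..END block (-n suppresses the default output).
block=$(printf '%s\n' "$input" | sed -n '/^BEGIN/,/^END/p')

printf '%s\n' "$outside" "$saved" "$block"
```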

Braces ({ }) are used in sed to nest one address inside another or to apply multiple commands at the same address.

[/pattern/[,/pattern/]]{
command1
command2
}

The opening curly brace must end its line, and the closing curly brace must be on a line by itself. Be sure there are no spaces after the braces.
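
A sketch of a brace group stored in a script file, following the layout rules just described. The script contents, the input, and the names `script` and `grouped` are assumptions for illustration.

```shell
# Script file: inside the BEGIN..END range, drop comment lines and
# replace foo with bar. Lines outside the range are not touched.
script=$(mktemp)
cat > "$script" <<'EOF'
/BEGIN/,/END/{
/^#/d
s/foo/bar/
}
EOF

grouped=$(printf 'foo\nBEGIN\n# comment\nfoo\nEND\nfoo\n' | sed -f "$script")
printf '%s\n' "$grouped"

rm -f "$script"
```

The quoted heredoc delimiter ('EOF') keeps the shell from interpreting anything inside the script.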

1.4.4. Group Summary of sed Commands

In the lists that follow, the sed commands are grouped by function and are described tersely. Full descriptions, including syntax and examples, can be found afterward in Section 1.4.5.

1.4.4.1. Basic editing

`a`	Append text after a line.
`c`	Replace text (usually a text block).
`i`	Insert text before a line.
`d`	Delete lines.
`s`	Make substitutions.
`y`	Translate characters (like Unix tr).
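
The basic editing commands can be combined in one script file, as in the sketch below. It assumes the traditional multi-line form of `a`, `i`, and `c`, where the command is followed by a backslash and the text starts on the next line. The input and all names are hypothetical.

```shell
script=$(mktemp)
cat > "$script" <<'EOF'
1i\
inserted before line 1
2a\
appended after line 2
3c\
replacement for line 3
4d
s/five/FIVE/
y/abc/xyz/
EOF

# The inserted, appended, and changed text bypasses the pattern space,
# so the later s and y commands do not alter it.
edited=$(printf 'one\ntwo\nthree\nfour\nfive\ncab\n' | sed -f "$script")
printf '%s\n' "$edited"

rm -f "$script"
```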

1.4.4.2. Line information

`=`	Display line number of a line.
`l`	Display control characters in ASCII.
`p`	Display the line.
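
A brief sketch of the three line-information commands; the inputs and variable names are made up for the example.

```shell
# = writes the line number of each addressed line.
line_no=$(printf 'a\nb\n' | sed -n '/b/=')

# l lists the pattern space unambiguously: the tab is shown as \t
# and the end of the line is marked with $.
listed=$(printf 'x\ty\n' | sed -n 'l')

# p without -n prints the line once explicitly and once by default.
doubled=$(printf 'a\n' | sed 'p')

printf '%s\n' "$line_no" "$listed" "$doubled"
```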

1.4.4.3. Input/output processing

`n`	Skip current line and go to line below.
`r`	Read another file's contents into the output stream.
`w`	Write input lines to another file.
`q`	Quit the sed script (no further output).
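
The input/output commands might be exercised as follows. This is a sketch only; the temporary files and variable names are assumptions, and the temporary paths are assumed to contain no spaces.

```shell
input=$(printf 'one\ntwo\nthree\nfour')

# q: print through line 2, then quit.
first_two=$(printf '%s\n' "$input" | sed '2q')

# n: fetch the next line, so p only ever sees the even-numbered lines.
evens=$(printf '%s\n' "$input" | sed -n 'n;p')

# r: copy another file's contents to the output after the matching line.
extra=$(mktemp)
printf 'EXTRA\n' > "$extra"
with_extra=$(printf '%s\n' "$input" | sed "/two/r $extra")

# w: write matching lines to another file.
matches=$(mktemp)
printf '%s\n' "$input" | sed -n "/^t/w $matches"
written=$(cat "$matches")

rm -f "$extra" "$matches"
printf '%s\n' "$first_two" "$evens" "$with_extra" "$written"
```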

1.4.4.4. Yanking and putting

`h`	Copy into hold space; wipe out what's there.
`H`	Copy into hold space; append to what's there.
`g`	Get the hold space back; wipe out the destination line.
`G`	Get the hold space back; append to the pattern space.
`x`	Exchange contents of the hold and pattern spaces.
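
Two small sketches of the hold space in use: reversing the order of lines, and double-spacing a file. The inputs and variable names are illustrative.

```shell
# Reverse line order: on every line but the first, append the hold space
# (all earlier lines, already reversed) to the current line; save the
# result back into the hold space; print only at the last line.
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')

# Double-space: G appends a newline plus the (empty) hold space.
spaced=$(printf 'a\nb\n' | sed 'G')

printf '%s\n' "$reversed" "$spaced"
```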

1.4.4.5. Branching commands

`b`	Branch to label or to end of script.
`t`	Same as `b`, but branch only after substitution.
`:label`	Label branched to by `t` or `b`.
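
A sketch of both branching commands. The label name `loop`, the inputs, and the variable names are assumptions chosen for the example.

```shell
# t: repeat the substitution until it no longer succeeds.
# Each pass removes one innermost "()" pair.
unnested=$(printf '((()))x\n' | sed -e ':loop' -e 's/()//' -e 't loop')

# b with no label: jump to the end of the script, so lines containing
# "skip" bypass the substitution that follows.
skipped=$(printf 'x\nskip x\n' | sed -e '/skip/b' -e 's/x/y/')

printf '%s\n' "$unnested" "$skipped"
```

Putting each piece in its own -e option keeps the label on a line by itself, which avoids differences between sed versions in how a label is terminated.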

1.4.4.6. Multiline input processing

`N`	Read another line of input (creates embedded newline).
`D`	Delete up to the embedded newline.
`P`	Print up to the embedded newline.
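
The multiline commands are easiest to follow in small sketches like these: joining pairs of lines, and dropping adjacent duplicate lines (similar to uniq). The inputs and variable names are made up; `$!N` is used so that `N` is never attempted on the last line.

```shell
# N appends the next line; the embedded newline is then replaced by a space.
joined=$(printf 'a\nb\nc\nd\n' | sed '$!N;s/\n/ /')

# Keep a two-line window in the pattern space. If both lines are equal,
# print nothing; otherwise P prints the first line. D then deletes the
# first line and restarts the script with what remains.
unique=$(printf 'a\na\nb\nb\nb\nc\n' | sed '$!N; /^\(.*\)\n\1$/!P; D')

printf '%s\n' "$joined" "$unique"
```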

1.4.5. Alphabetical Summary of sed Commands

sed Command	Description
#	`#` Begin a comment in a sed script. Valid only as the first character of the first line. (Some versions allow comments anywhere, but it is better not to rely on this.) If the first line of the script is `#n`, sed behaves as if -n had been specified.
:	`:label` Label a line in the script for the transfer of control by `b` or `t`. label may contain up to seven characters.
=	[`/pattern/`]`=` Write to standard output the line number of each line addressed by pattern.
a	[`address`]`a` `text` Append text following each line matched by address. If text goes over more than one line, newlines must be "hidden" by preceding them with a backslash. The text will be terminated by the first newline that is not hidden in this way. The text is not available in the pattern space, and subsequent commands cannot be applied to it. The results of this command are sent to standard output when the list of editing commands is finished, regardless of what happens to the current line in the pattern space.
b	[`address1`[`,address2`]]`b`[`label`] Unconditionally transfer control to `:label` elsewhere in script. That is, the command following the label is the next command applied to the current line. If no label is specified, control falls through to the end of the script, so no more commands are applied to the current line.
c	[`address1`[`,address2`]]`c` `text` Replace (change) the lines selected by the address(es) with text. (See a for details on text.) When a range of lines is specified, all lines are replaced as a group by a single copy of text. The contents of the pattern space are, in effect, deleted and no subsequent editing commands can be applied to the pattern space (or to text).
d	[`address1`[`,address2`]]`d` Delete the addressed line (or lines) from the pattern space. Thus, the line is not passed to standard output. A new line of input is read, and editing resumes with the first command in the script.
D	[`address1`[`,address2`]]`D` Delete the first part (up to embedded newline) of multi-line pattern space created by `N` command and resume editing with first command in script. If this command empties the pattern space, then a new line of input is read, as if the `d` command had been executed.
g	[`address1`[`,address2`]]`g` Paste the contents of the hold space (see h and H) back into the pattern space, wiping out the previous contents of the pattern space.
G	[`address1`[`,address2`]]`G` Same as `g`, except that a newline and the hold space are pasted to the end of the pattern space instead of overwriting it.
h	[`address1`[`,address2`]]`h` Copy the pattern space into the hold space, a special temporary buffer. The previous contents of the hold space are obliterated. You can use `h` to save a line before editing it.
H	[`address1`[`,address2`]]`H` Append a newline and then the contents of the pattern space to the contents of the hold space. Even if the hold space is empty, `H` still appends a newline. `H` is like an incremental copy.
i	[`address1`]`i` `text` Insert text before each line matched by address. (See a for details on text.)
l	[`address1`[`,address2`]]`l` List the contents of the pattern space, showing nonprinting characters as ASCII codes. Long lines are wrapped.
n	[`address1`[`,address2`]]`n` Read the next line of input into pattern space. The current line is sent to standard output, and the next line becomes the current line. Control passes to the command following `n` instead of resuming at the top of the script.
N	[`address1`[`,address2`]]`N` Append the next input line to contents of pattern space; the new line is separated from the previous contents of the pattern space by a newline. (This command is designed to allow pattern matches across two lines.) By using `\n` to match the embedded newline, you can match patterns across multiple lines.
p	[`address1`[`,address2`]]`p` Print the addressed line(s). Note that this can result in duplicate output unless default output is suppressed by using `#n` or the `-n` command-line option. Typically used before commands that change flow control (`d`, `n`, `b`), which might prevent the current line from being output.
P	[`address1`[`,address2`]]`P` Print first part (up to embedded newline) of multiline pattern space created by `N` command. Same as `p` if `N` has not been applied to a line.
q	[`address`]`q` Quit when address is encountered. The addressed line is first written to the output (if default output is not suppressed), along with any text appended to it by previous `a` or `r` commands.
r	[`address`]`r` `file` Read contents of file and append after the contents of the pattern space. There must be exactly one space between the `r` and the filename.
s	[`address1`[`,address2`]]`s/pat/repl/`[`flags`] Substitute repl for pat on each addressed line. If pattern addresses are used, the pattern `//` represents the last pattern address specified. Any delimiter may be used. Use `\` within pat or repl to escape the delimiter. The following flags can be specified: `n`: Replace nth instance of pat on each addressed line; n is any number in the range 1 to 512, and the default is 1. `g`: Replace all instances of pat on each addressed line, not just the first instance. `p`: Print the line if the substitution is successful; if several substitutions are successful, sed will print multiple copies of the line. `w` `file`: Write the line to file if a replacement was done; a maximum of 10 different files can be opened.
t	[`address1`[`,address2`]]`t` [`label`] Test if successful substitutions have been made on addressed lines, and if so, branch to the line marked by `:label`. (See b and :.) If label is not specified, control branches to the bottom of the script. The `t` command is like a case statement in the C programming language or the various shell programming languages. You test each case; when it's true, you exit the construct.
w	[`address1`[`,address2`]]`w` `file` Append contents of pattern space to file. This action occurs when the command is encountered rather than when the pattern space is output. Exactly one space must separate the `w` and the filename. A maximum of 10 different files can be opened in a script. This command will create the file if it does not exist; if the file exists, its contents will be overwritten each time the script is executed. Multiple write commands that direct output to the same file append to the end of the file.
x	[`address1`[`,address2`]]`x` Exchange the contents of the pattern space with the contents of the hold space.
y	[`address1`[`,address2`]]`y/abc/xyz/` Translate characters. Change every instance of a to x, b to y, c to z, etc.
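
The flags and delimiter rules of the `s` command can be checked with a short sketch. The inputs and variable names are assumptions for illustration.

```shell
# Numeric flag: replace only the 2nd instance on each line.
second=$(printf 'a-a-a\n' | sed 's/a/X/2')

# g flag: replace every instance.
all=$(printf 'a-a-a\n' | sed 's/a/X/g')

# p flag with -n: print only lines where a substitution happened.
printed=$(printf 'cat\ndog\n' | sed -n 's/cat/CAT/p')

# Alternate delimiter: | avoids escaping the slashes in a path.
path=$(printf '/usr/bin\n' | sed 's|/usr|/opt|')

# Empty pattern // reuses the pattern address /foo/.
reused=$(printf 'foo bar\n' | sed '/foo/s//baz/')

printf '%s\n' "$second" "$all" "$printed" "$path" "$reused"
```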
