Editing Streams with sed

If you need to modify text in a pipeline or in a file, sed is your best friend. Its name is short for “stream editor” and it’s very handy. While you can do many things with sed, the most common use is to replace some text with other text, similar to how you’d use the find and replace feature in your text editor.

Like other tools, sed can read its input from a file or from standard input. Try it out. Print out “Hello World” and use sed to replace “Hello” with “Goodbye”:

 $ echo "Hello World" | sed -e 's/Hello/Goodbye/'
 Goodbye World

In this example, you’re sending “Hello World” via a pipe and then using the -e flag to specify an expression. The expression s/Hello/Goodbye/ is a basic substitution. The / characters are the delimiter. “Hello” is the regular expression, and “Goodbye” is the string you’ll insert as the replacement.
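As a minimal sketch of the two input styles mentioned above, the following runs the same expression once against a pipe and once against a file. It assumes a POSIX shell, and greeting.txt is a hypothetical scratch file created only for this illustration:

```shell
# Work in a throwaway directory so nothing here touches real files.
workdir=$(mktemp -d)
printf 'Hello World\n' > "$workdir/greeting.txt"   # hypothetical sample file

# Input arrives on standard input through a pipe.
from_pipe=$(printf 'Hello World\n' | sed -e 's/Hello/Goodbye/')

# The same expression, with sed reading the file named as an argument.
from_file=$(sed -e 's/Hello/Goodbye/' "$workdir/greeting.txt")

echo "$from_pipe"
echo "$from_file"

rm -r "$workdir"
```

Both commands should print the same line, because sed treats the text the same way regardless of where it comes from.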

This basic substitution only works on the first occurrence on each line. Let’s take a closer look at this. Create a Markdown file called document.md that contains this text:

 This is *very* important text with *very* important words.
 
 * These words are not important.
 * Neither are these.
 
 You can *always* tell very important text because it's in italics.

Use cat to create the file:

 $ cat << 'EOF' > document.md
 > This is *very* important text with *very* important words.
 >
 > * These words are not important.
 > * Neither are these.
 >
 > You can *always* tell very important text because it's in italics.
 > EOF

Use sed to replace occurrences of the word “very” with “really”. (In reality, you should remove the superfluous adverb entirely, but humor me for a moment.)

 $ sed -e 's/very/really/' document.md
»This is *really* important text with *very* important words.
 
 * These words are not important.
 * Neither are these.
 
»You can *always* tell really important text because it's in italics.

Notice that one of the instances of “very” didn’t get replaced. On closer inspection, the first instance on the first line was replaced, but the second one was not. Add the g flag, short for “global,” to the end of the substitution expression to replace both:

 $ sed -e 's/very/really/g' document.md
»This is *really* important text with *really* important words.
 
 * These words are not important.
 * Neither are these.
 
»You can *always* tell really important text because it's in italics.
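To see the difference in isolation, here is a small sketch that runs both forms of the substitution against a single line. The sample text is made up for the illustration:

```shell
# One line with three occurrences of the word to replace.
line='very very very'

# Without g, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed -e 's/very/really/')

# With g, every match on the line is replaced.
every=$(printf '%s\n' "$line" | sed -e 's/very/really/g')

echo "$first_only"   # really very very
echo "$every"        # really really really
```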

sed doesn’t modify the original file. Let’s look at how to save changes.

Saving Changes

To keep the changes you’ve made with sed, you have two options. Your first option is to redirect the output to a new file.

Execute this command to transform all instances of “very” to “really” and save the results to the file new_document.md:

 $ sed -e 's/very/really/g' document.md > new_document.md

The new_document.md file contains the transformed text. Don’t try to redirect the output to the original filename though. The shell empties that file before sed gets a chance to read it, so you’ll end up with a blank file.
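If you want the result to end up under the original name while still using redirection, one common pattern is to write to a temporary file and rename it afterward. This is only a sketch. The scratch directory and the .tmp suffix are choices made for this example:

```shell
# Set up a scratch copy of a document to experiment on.
workdir=$(mktemp -d)
printf 'This is very important.\n' > "$workdir/document.md"

# Write to a temporary file first, then replace the original
# only if sed finished successfully.
sed -e 's/very/really/g' "$workdir/document.md" > "$workdir/document.md.tmp" &&
  mv "$workdir/document.md.tmp" "$workdir/document.md"

result=$(cat "$workdir/document.md")
echo "$result"

rm -r "$workdir"
```

Because the original file is never the target of the redirection, it isn’t emptied before sed reads it.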

The other option for saving changes is to modify the document in place by using the -i switch.

sed on macOS

On macOS, the version of sed you get is the BSD version. It has slightly different options, which can be confusing if you’re trying to follow along with examples you see online or even in this book. You can use Homebrew to install the GNU version of sed. See Installing GNU Versions of awk, sed, and grep, for how to do that.

Execute this command to replace all instances of “really” with “very” in the file new_document.md:

 $ sed -i -e 's/really/very/g' new_document.md

In practice, modifying the file in place is more dangerous, so you should run the command without the -i switch first and verify you’re getting what you want. Then, run the command again with the -i switch.
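Here is one way that check-then-apply workflow might look. It is a sketch rather than a prescription: it uses diff for the preview, and it attaches a .bak suffix to -i so that sed keeps a copy of the original file. The scratch directory is created just for this example:

```shell
workdir=$(mktemp -d)
printf 'This is really important.\n' > "$workdir/new_document.md"

# Dry run: compare the original with what sed would produce.
# diff exits with status 1 when its inputs differ, so "|| true"
# keeps a strict shell from stopping here.
sed -e 's/really/very/g' "$workdir/new_document.md" |
  diff "$workdir/new_document.md" - || true

# Apply the change in place, keeping the original as new_document.md.bak.
sed -i.bak -e 's/really/very/g' "$workdir/new_document.md"

changed=$(cat "$workdir/new_document.md")
backup=$(cat "$workdir/new_document.md.bak")

rm -r "$workdir"
```

Writing the suffix directly against the switch, as in -i.bak, is a form that both the GNU and BSD versions of sed accept.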

Ready to look at manipulating specific lines of a file?

Editing Specific Lines

In Markdown documents, you can italicize words or phrases by surrounding them with asterisks or underscores, depending on the Markdown processor you’re using. In your document, you’re using asterisks to italicize words:

 This is *very* important text with *very* important words.
 
 * These words are not important.
 * Neither are these.
 
 You can *always* tell very important text because it's in italics.

Try using a basic substitution to replace them:

 $ sed -e 's/\*/_/g' document.md
 This is _very_ important text with _very_ important words.
 
»_ These words are not important.
»_ Neither are these.
 
 You can _always_ tell very important text because it's in italics.

Whoops. That’s not quite right. Markdown uses asterisks for bulleted lists as well. The regular expression you used here was incredibly naive. To get the results you’re looking for, you’ll want to search for pairs of asterisks. That requires being a little more specific about how you search for things.

By default, sed applies each command to every line of its input. But by providing more context, you can tell sed to zero in on a specific part of the file. sed calls this an address. An address can be a line number, a range of lines, or a regular expression, which can be as simple as a literal string of text.

To explore this further, create a text file full of URLs. You’ll fiddle around with this file for a bit:

 $ cat << 'EOF' > urls.txt
 > http://example.com
 > http://facebook.com
 > http://twitter.com
 > https://pragprog.com
 > EOF

Notice that the last entry uses https while the others use http. Try to use sed to replace all the instances of http with https by using what you’ve learned so far about replacing text.

Your first attempt might look like this:

 $ sed -e 's/http/https/' urls.txt
 https://example.com
 https://facebook.com
 https://twitter.com
»httpss://pragprog.com

As you can see from the last entry in the output, this results in httpss, which isn’t quite right. You’ll need to be more specific. Including the colon in the match is good enough:

 $ sed -e 's/http:/https:/' urls.txt
 https://example.com
 https://facebook.com
 https://twitter.com
 https://pragprog.com

The command s/http:/https:/ is an example of a command without an address, so it operates on every line of the file.
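Matching on the colon works for this file, but you can tighten the pattern further. The sketch below anchors the match to the start of the line with ^ and uses | as the delimiter so the slashes in the URL don’t need escaping. The second URL, with another address embedded in its query string, is a hypothetical case invented to show the difference:

```shell
urls='http://example.com
https://pragprog.com/?ref=http://example.com'

# Unanchored: also rewrites the URL embedded in the second line.
loose=$(printf '%s\n' "$urls" | sed -e 's/http:/https:/')

# Anchored: only a scheme at the very start of a line is rewritten.
strict=$(printf '%s\n' "$urls" | sed -e 's|^http://|https://|')

printf '%s\n' "$loose"
printf '%s\n' "$strict"
```

Whether the embedded URL should change depends on what you’re trying to do. The point is that the anchored version makes the choice explicit.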

But you can target a specific line only. Use the following command to comment out the URL for Facebook by placing a hash mark in front:

 $ sed -e '/facebook/s/^/#/' urls.txt
 http://example.com
»#http://facebook.com
 http://twitter.com
 https://pragprog.com

This command finds the line that contains the word facebook and inserts the text # at the beginning of that line. In this case, the pattern /facebook/ is the address.

Remember that sed doesn’t modify the original file, so redirect the output to a new file to save this new list of URLs with a commented-out Facebook URL:

 $ sed -e '/facebook/s/^/#/' urls.txt > commented_urls.txt

Now, using the commented_urls.txt file, uncomment the Facebook URL. Use /facebook/ to find the line, then remove the comment like this:

 $ sed -e '/facebook/s/^#//' commented_urls.txt
 http://example.com
»http://facebook.com
 http://twitter.com
 https://pragprog.com

This command finds the line containing the string facebook and replaces the # at the beginning of the line with nothing. While you’re using the explicit string facebook here, you could use a more complex regular expression for the address.
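As a small illustration of a regular expression address, this sketch comments out only the lines that start with plain http: and leaves the https entry alone. The list of URLs is inlined here so the example stands on its own:

```shell
urls='http://example.com
http://facebook.com
http://twitter.com
https://pragprog.com'

# The address /^http:/ selects lines that begin with "http:".
# The substitution then runs only on those lines.
commented=$(printf '%s\n' "$urls" | sed -e '/^http:/s/^/#/')

printf '%s\n' "$commented"
```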

And that’s exactly what you have to do with your Markdown document to replace the asterisks with underscores. The easiest way to do that is to target lines of the file that contain the pairs of asterisks and only operate on those. Use a regular expression for the address, then use the substitution to replace the characters:

 $ sed -e '/\*.*\*/s/\*/_/g' document.md
»This is _very_ important text with _very_ important words.
 
 * These words are not important.
 * Neither are these.
 
»You can _always_ tell very important text because it's in italics.

That’s more like it. Addresses make it much easier to zero in on what you want to change, but you can be even more specific.
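The address-based approach assumes that a bulleted line never contains italic text of its own. If that assumption doesn’t hold for your documents, one alternative is to match each pair of asterisks directly and capture the text between them. Treat this as a sketch under a stated assumption: the opening asterisk must be followed by something other than a space, which is how it tells emphasis apart from a bullet. The second sample line is hypothetical:

```shell
text='This is *very* important text with *very* important words.
* These words are *not* important.'

# \*               the opening asterisk
# \([^* ][^*]*\)   captured text: starts with a non-space, contains no asterisk
# \*               the closing asterisk
# \1               puts the captured text back between underscores
converted=$(printf '%s\n' "$text" | sed -e 's/\*\([^* ][^*]*\)\*/_\1_/g')

printf '%s\n' "$converted"
```

The bullet at the start of the second line is left alone because it is followed by a space, while the emphasized word later on that line is still converted.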

Operating on Lines by Number

In addition to finding a line with text or a regular expression, you can find it by its number. Let’s go back to the urls.txt file and explore this further.

Let’s comment out the first line of the file. Use the following command to target the first line and insert the comment character:

 $ sed -e '1 {s/^/#/}' urls.txt
»#http://example.com
 http://facebook.com
 http://twitter.com
 https://pragprog.com

To comment out lines 2 through 4 of the file, use this command, which specifies a range:

 $ sed -e '2,4 {s/^/#/}' urls.txt
 http://example.com
»#http://facebook.com
»#http://twitter.com
»#https://pragprog.com
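The two ends of a range don’t have to be the same kind of address. As a sketch, this range starts at line 2 and ends at the next line that matches a pattern, which is handy when you know where a block starts but not how long it is. The URL list is inlined so the example is self-contained:

```shell
urls='http://example.com
http://facebook.com
http://twitter.com
https://pragprog.com'

# Start at line 2 and stop at the first following line containing "twitter".
ranged=$(printf '%s\n' "$urls" | sed -e '2,/twitter/ {s/^/#/}')

printf '%s\n' "$ranged"
```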

You can also manipulate the beginning and end of a file. Want to add a line to the top of the file? Use the number 1 to reference the first line of the file, followed by the letter i and a backslash to indicate the text to insert. Add the text “Bookmarks” to the top of the file:

 $ sed -e '1i\Bookmarks' urls.txt
»Bookmarks
 http://example.com
 http://facebook.com
 http://twitter.com
 https://pragprog.com

To append a line to the end of the file, use a dollar sign instead of a number, followed by the letter a:

 $ sed -e '$a\http://google.com' urls.txt
 http://example.com
 http://facebook.com
 http://twitter.com
 https://pragprog.com
»http://google.com

You can do multiple expressions in the same command too, which means you can prepend and append text to a file in a single command. Just specify both expressions:

 $ sed -e '1i\Bookmarks' -e '$a\http://google.com' urls.txt
»Bookmarks
 http://example.com
 http://facebook.com
 http://twitter.com
 https://pragprog.com
»http://google.com

You can prefix i and a with a line number to insert or append text anywhere in the file. If you use i or a without an address, sed applies the operation to every line of the file.
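The following sketch shows both behaviors. It assumes the GNU version of sed, which accepts the text on the same line as the command. The heading and separator text are made up for the example:

```shell
urls='http://example.com
http://facebook.com
http://twitter.com
https://pragprog.com'

# With an address: insert a line before line 2 only.
with_address=$(printf '%s\n' "$urls" | sed -e '2i\Social:')

# Without an address: append a separator after every line.
without_address=$(printf 'first\nsecond\n' | sed -e 'a\---')

printf '%s\n' "$with_address"
printf '%s\n' "$without_address"
```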

In addition, you can change a specific line of the file using c. Change example.com to github.com with this command:

 $ sed -e '1c\https://github.com' urls.txt
 https://github.com
 http://facebook.com
 http://twitter.com
 https://pragprog.com

You can delete a line with d:

 $ sed -e '1d' urls.txt
 http://facebook.com
 http://twitter.com
 https://pragprog.com
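The d command accepts the same kinds of addresses as the other commands you’ve used. As a brief sketch, this deletes a line by pattern and then deletes the last line using the dollar sign. The URL list is inlined to keep the example self-contained:

```shell
urls='http://example.com
http://facebook.com
http://twitter.com
https://pragprog.com'

# Delete whichever line contains "twitter", wherever it appears.
no_twitter=$(printf '%s\n' "$urls" | sed -e '/twitter/d')

# Delete the last line. Single quotes stop the shell from expanding $d.
no_last=$(printf '%s\n' "$urls" | sed -e '$d')

printf '%s\n' "$no_twitter"
printf '%s\n' "$no_last"
```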

Now, let’s look at how to add content from other files.

Replacing Text with Content from Another File

Using sed, you can create a template file with some placeholder text, and then replace that text with text from another file. Let’s create a new document and then use sed to inject our list of URLs inside of it.

First, create a new document with the placeholder LINKS where you want the links to appear:

 $ cat << 'EOF' > useful_links.md
 > Here are a few interesting links:
 > LINKS
 > I hope you find them useful.
 > EOF

The r command tells sed to read text from an external file. So you’ll search for LINKS in the file and then use r to place the contents of urls.txt below that line. Then you’ll delete the line containing the LINKS placeholder. Try it out:

 $ sed -e '/LINKS/r urls.txt' -e '/LINKS/d' useful_links.md
 Here are a few interesting links:
»http://example.com
»http://facebook.com
»http://twitter.com
»https://pragprog.com
 I hope you find them useful.

This worked nicely, but you can shorten the command by using braces to group the commands, like this:

 $ sed -e '/LINKS/{r urls.txt' -e 'd}' useful_links.md
 Here are a few interesting links:
»http://example.com
»http://facebook.com
»http://twitter.com
»https://pragprog.com
 I hope you find them useful.
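If you find yourself filling in templates often, you could wrap the grouped command in a small shell function. The function name fill_template and its argument order are hypothetical, invented for this sketch. It also assumes the placeholder contains no slashes or other characters that are special to sed, and that the filename contains no newlines:

```shell
# fill_template TEMPLATE PLACEHOLDER CONTENT_FILE  (hypothetical helper)
# Prints TEMPLATE with the line matching PLACEHOLDER replaced by the
# contents of CONTENT_FILE.
fill_template() {
  sed -e "/$2/{r $3" -e 'd}' "$1"
}

# Scratch files for the demonstration.
workdir=$(mktemp -d)
printf 'http://example.com\nhttps://pragprog.com\n' > "$workdir/urls.txt"
printf 'Here are a few interesting links:\nLINKS\nI hope you find them useful.\n' \
  > "$workdir/useful_links.md"

filled=$(fill_template "$workdir/useful_links.md" LINKS "$workdir/urls.txt")
printf '%s\n' "$filled"

rm -r "$workdir"
```

The first expression is wrapped in double quotes so the shell can substitute the placeholder and filename. The second stays in single quotes because it needs no substitution.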

With sed, you can use regular expressions, text, and even external files to modify the contents of any stream of text, whether it comes from a file or another program. Let’s look at another tool for parsing and manipulating files.
