Back in Chapter 6, we talked about sed and how to use it to search and replace throughout files, one file at a time. Although we're sure you're still coming down off of the power rush from doing that, we'll now show you how to combine sed with shell scripts and loops. In doing this, you can take your search-and-replace criteria and apply them to multiple documents. For example, you can search through all of the .html documents in a directory and make the same change to all of them. In this example (Figure 16.5), we strip out all of the <BLINK> tags, which are offensive to some HTML purists.
Before you get started, you might have a look at Chapter 6 for a review of sed basics and Chapter 10 for a review of scripts and loops.
To search and replace throughout multiple documents:
1. vi thestinkinblinkintag
   Use the editor of your choice to create a new script. Name the file whatever you want.
2. #! /bin/sh
   Start the shell script with the name of the program that should run the script.
3. for i in `ls -1 *.htm*`
   Start a loop. In this case, the loop will process all of the .htm or .html documents in the current directory. Note that the ls command is enclosed in backquotes (`), not single quotes, and that the flag is -1 (the number one), which lists one file name per line.
4. do
   Indicate the beginning of the loop content.
5. cp $i $i.bak
   Make a backup copy of each file before you change it. Remember, Murphy is watching you.
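6. sed "s/<\/*BLINK>//g" $i.bak > $i
   Specify your search criteria and replacement text. A lot is happening in this line, but don't panic. From the left, this command contains sed followed by the substitution expression in quotes: s (substitute), the pattern <\/*BLINK> (which matches both <BLINK> and </BLINK>, because \/* means zero or more slashes), an empty replacement, and g to change every occurrence on a line. The command reads from the backup copy, $i.bak, and writes the result to the original file name, $i. Reading from and writing to $i in the same command would empty the file before sed could read it.
Code Listing 16.1. You can even use sed to strip out bad HTML tags, as shown here.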
7. echo "$i is done."
   Optionally, print a status message onscreen, which can be reassuring if there are a lot of files to process.
8. done
   Indicate the end of the loop.
9. Save and close out of your script.
10. Try it out. Remember to make your script executable with chmod u+x and the file name, then run it with ./thestinkinblinkintag. In our example, we'll see the "success reports" for each of the HTML documents processed (Code Listing 16.1).
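Pulled together, the steps above might look something like the following. This is a minimal sketch, not the chapter's own listing: the function name strip_blink_dir and its optional directory argument are our assumptions, added so the loop can be pointed at any directory. It also quotes the file names and skips existing .bak files, which the *.htm* pattern would otherwise pick up on a second run.

```shell
#!/bin/sh
# Sketch only: strip <BLINK> and </BLINK> tags from every .htm/.html file
# in a directory, keeping a .bak copy of each original.
# strip_blink_dir is a hypothetical name, not something from the chapter.
# Usage: strip_blink_dir [directory]   (defaults to the current directory)
strip_blink_dir() {
    dir=${1:-.}
    for i in "$dir"/*.htm*
    do
        # Skip the literal pattern if nothing matched, and skip old backups.
        [ -f "$i" ] || continue
        case $i in
            *.bak) continue ;;
        esac
        # Back up first, then read the backup and write the cleaned file.
        cp "$i" "$i.bak"
        sed "s/<\/*BLINK>//g" "$i.bak" > "$i"
        echo "$i is done."
    done
}
```

Saved in a script with a final line such as strip_blink_dir . it could be made executable with chmod u+x and run just like the script in the steps above.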
Tip
You could perform any number of other operations on the files within the loop, if you wanted. For example, you could strip out other codes, replace a former Webmaster's address with your own, or automatically insert comments and last-update dates.
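As one illustration of this tip, sed accepts several -e expressions in a single command, so more than one change can be made in the same pass. The sketch below is hypothetical: the function name clean_html and both example.com addresses are placeholders we made up, not values from the chapter.

```shell
#!/bin/sh
# Sketch only: apply two substitutions in one sed command.
# clean_html and the two addresses are hypothetical placeholders.
clean_html() {
    # First -e strips <BLINK> and </BLINK>; second -e swaps the address.
    sed -e 's/<\/*BLINK>//g' \
        -e 's/oldmaster@example\.com/newmaster@example.com/g' "$1"
}
```

Inside the loop, a line such as clean_html $i.bak > $i would take the place of the single sed command.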
[ejr@hobbes /home]$ ls -la | awk '{print $9 " owned by " $3 } END { print NR " Total Files" }'
 owned by
. owned by root
.. owned by root
admin owned by admin
anyone owned by anyone
asr owned by asr
awr owned by awr
bash owned by bash
csh owned by csh
deb owned by deb
debray owned by debray
ejr owned by ejr
ejray owned by ejray
ftp owned by root
httpd owned by httpd
lost+found owned by root
merrilee owned by merrilee
oldstuff owned by 1000
pcguest owned by pcguest
raycomm owned by pcguest
samba owned by root
shared owned by root
22 Total Files
[ejr@hobbes /home]$