When you’re dealing with program output, you’ll often want to filter the results. The grep command lets you search text for characters or phrases. You can use grep to search through program output or a file. Let’s explore grep by working with some files.
Create a file named words.txt that contains several words, each on its own line:
| $ cat << 'EOF' > words.txt |
| > blue |
| > apple |
| > candy |
| > hand |
| > fork |
| > EOF |
Now use grep to search the file for the string and:
| $ grep 'and' words.txt |
| candy |
| hand |
This displays the two lines of the file that contain the string you specified. You get both results because they contain the string and somewhere on the line, even though neither line contains “and” as a separate word. This is the simplest form of searching. Surrounding the search term in quotes isn’t always necessary, but it’s a good habit to get into because you can run into some strange edge cases with spaces and special characters if you don’t.
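To see why quoting matters, here is a small sketch. It assumes a scratch directory created with mktemp and a hypothetical sample file named phrases.txt, neither of which is part of the exercise above. Without quotes, the shell splits the phrase at the space, and grep treats the second word as a file name.

```shell
# Work in a throwaway directory so nothing in your current folder is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# phrases.txt is a hypothetical sample file for this sketch.
printf '%s\n' 'hand in hand' 'fork in the road' 'candy bar' > phrases.txt

# Quoted: the whole phrase, space included, is the search pattern.
quoted=$(grep 'hand in' phrases.txt)
echo "$quoted"

# Unquoted: grep searches for "hand" in a file named "in" (which doesn't
# exist) and in phrases.txt. It reports an error for the missing file and
# still prints the match from phrases.txt, prefixed with the file name.
unquoted=$(grep hand in phrases.txt 2>&1) || true
echo "$unquoted"
```

The quoted search prints only the matching line, while the unquoted one mixes an error message into the results.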
You can also tell grep to remove lines containing that text. The -v option instructs grep to only show lines that don’t contain the search pattern you specified.
| $ grep 'and' -v words.txt |
| blue |
| apple |
| fork |
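One way to convince yourself that -v selects exactly the lines a normal search leaves out is to count both sets. The sketch below uses grep’s -c option, which isn’t covered above and prints a count of selected lines instead of the lines themselves. It recreates words.txt in a scratch directory so it can run on its own.

```shell
# Recreate the sample file in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' blue apple candy hand fork > words.txt

# -c prints a count of the selected lines rather than the lines themselves.
matching=$(grep -c 'and' words.txt)

# Combined with -v, it counts the lines that do NOT contain the pattern.
excluded=$(grep -v -c 'and' words.txt)

echo "matching: $matching, excluded: $excluded"   # matching: 2, excluded: 3
```

The two counts add up to the five lines in the file.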
grep reads the file in and processes its contents, but you’re not limited to using grep on just files. You can use it to process output from other programs, which means you can use it to filter the streams of text other programs display.
Try it out by using grep to show you all the ls commands in your history:
| $ history | grep 'ls' |
| ... |
| 471 ls |
| 479 ls |
| 484 ls |
| 500 history | grep 'ls' |
When you ran the command on your machine, you probably saw a lot of results, and the last result was the history | grep command. You can filter that last command out by piping the output to grep again:
| $ history | grep 'ls' | grep -v 'grep' |
| ... |
| 471 ls |
| 479 ls |
| 484 ls |
If there are too many commands for you to see, you can always pipe the output to less:
| $ history | grep 'ls' | grep -v 'grep' | less |
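Your own history will differ from the listings above, so here is a reproducible sketch of the same two-stage filter. The fake_history function and its entries are made up for illustration; they stand in for whatever history prints on your machine.

```shell
# fake_history is a hypothetical stand-in for the real `history` command,
# whose output differs from machine to machine.
fake_history() {
  printf '%s\n' \
    '  471  ls' \
    '  479  ls -alh' \
    '  484  cd /tmp' \
    "  500  history | grep 'ls'"
}

# First grep keeps lines containing "ls"; second grep drops the lines
# that mention grep itself.
filtered=$(fake_history | grep 'ls' | grep -v 'grep')
printf '%s\n' "$filtered"
```

Only the two ls entries survive both filters. The cd entry is dropped by the first grep, and the history | grep entry is dropped by the second.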
grep supports searching multiple files as well. Create another file with some more words:
| $ cat << 'EOF' > words2.txt |
| > blue car |
| > apple pie |
| > candy bar |
| > hand in hand |
| > fork in the road |
| > EOF |
Then, use grep to search both files for the word blue:
| $ grep 'blue' words.txt words2.txt |
| words.txt:blue |
| words2.txt:blue car |
This time, grep shows each matching line prefixed with the name of the file that contains it.
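If you only need to know which files contain a match, grep’s -l option (not covered above) prints just the file names. The following sketch recreates both word files in a scratch directory and adds a hypothetical third file, colors.txt, that contains no match.

```shell
# Recreate the sample files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' blue apple candy hand fork > words.txt
printf '%s\n' 'blue car' 'apple pie' 'candy bar' 'hand in hand' \
  'fork in the road' > words2.txt
printf '%s\n' red green > colors.txt   # hypothetical file with no match

# -l lists each file that contains at least one match, one name per line.
files=$(grep -l 'blue' words.txt words2.txt colors.txt)
echo "$files"
```

Here colors.txt is searched but never appears in the output, because none of its lines contain blue.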
The grep command only shows the exact line containing the match, but you can tell it to give you a little more context. Using the -A and -B switches, you can specify the number of lines to show after and before the match:
| $ grep 'candy' -A 2 -B 2 words* |
| words2.txt-blue car |
| words2.txt-apple pie |
| words2.txt:candy bar |
| words2.txt-hand in hand |
| words2.txt-fork in the road |
| -- |
| words.txt-blue |
| words.txt-apple |
| words.txt:candy |
| words.txt-hand |
| words.txt-fork |
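The output separates the matches clearly: a line containing -- divides each group, matching lines have a colon after the file name, and context lines have a hyphen.
In this example, you selected the same number of lines before and after the matched line. In cases like this, you can shorten the command by using the -C switch instead of specifying both -A and -B: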
| $ grep 'candy' -C 2 words* |
The resulting output is the same as before. The -C switch shows the “context” around the results.
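You can check that equivalence yourself by capturing both outputs and comparing them. This sketch recreates words.txt in a scratch directory and places the options before the pattern, which is a stylistic choice here rather than something the text requires. It also shows a case where -C doesn’t fit, because the amount of context differs on each side.

```shell
# Recreate the sample file in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' blue apple candy hand fork > words.txt

# Two lines of context on each side, written both ways.
with_ab=$(grep -A 2 -B 2 'candy' words.txt)
with_c=$(grep -C 2 'candy' words.txt)

if [ "$with_ab" = "$with_c" ]; then
  echo "-C 2 matches -A 2 -B 2"
fi

# Uneven context still needs -A and -B: one line before, two lines after.
uneven=$(grep -B 1 -A 2 'candy' words.txt)
echo "$uneven"
```

The last command prints apple, candy, hand, and fork, leaving out blue because only one line before the match was requested.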
Adding the -n flag will show you the line number where the match was found:
| $ grep 'candy' -C 2 -n words* |
| words2.txt-1-blue car |
| words2.txt-2-apple pie |
| words2.txt:3:candy bar |
| words2.txt-4-hand in hand |
| words2.txt-5-fork in the road |
| -- |
| words.txt-1-blue |
| words.txt-2-apple |
| words.txt:3:candy |
| words.txt-4-hand |
| words.txt-5-fork |
This is helpful when working with source code. You can use grep to look at your entire codebase and find phrases or keywords quickly, as grep can read directories recursively.
To demonstrate this, use grep to scan the contents of the /var/log folder for instances of your username:
| $ sudo grep 'brian' -r /var/log |
| ... |
| /var/log/auth.log:Mar 3 15:40:29 puzzles sudo: brian : TTY=pts/8 ; |
| PWD=/home/brian ; USER=root ; COMMAND=/bin/grep brian -r /var/log/ |
| /var/log/auth.log:Mar 3 15:40:29 puzzles sudo: pam_unix(sudo:session): |
| session opened for user root by brian(uid=0) |
| Binary file /var/log/btmp matches |
| Binary file /var/log/wtmp matches |
| Binary file /var/log/auth.log.1 matches |
You’ll see a stream of data returned, displaying events from your system logs.
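Because the contents of /var/log differ from machine to machine, here is a self-contained sketch of a recursive search that also uses -n. The project directory, its files, and the TODO markers are all invented for illustration; sort is added only to make the output order predictable.

```shell
# Build a small hypothetical project tree in a throwaway directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/project/src" "$workdir/project/docs"
printf '%s\n' '# build script' 'echo "building"' '# TODO: add tests' \
  > "$workdir/project/src/build.sh"
printf '%s\n' 'Project notes' 'TODO: write the README' \
  > "$workdir/project/docs/notes.txt"
printf '%s\n' 'nothing to see here' > "$workdir/project/docs/done.txt"

cd "$workdir/project" || exit 1

# -r descends into subdirectories; -n adds the line number of each match.
# The order in which grep visits files isn't guaranteed, so sort the result.
results=$(grep -rn 'TODO' . | sort)
echo "$results"
```

Each result shows the file path, the line number, and the matching line, which is enough to jump straight to the right spot in an editor.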
All of the searches you’ve performed so far are simple text searches, but you can use regular expressions, or regexes, as well. A regex is a sequence of characters that defines a pattern for finding text.
This book doesn’t go into a ton of detail on regular expressions. However, you’ll use regular expressions a few more times throughout this book, so I’ll explain what’s going on with each one.
If you’d like more information on regular expressions, lots of online resources will help get you started, including Regex101,[10] an interactive online tool for building and debugging regular expressions.
For now, let’s try out regular expressions with grep. If you search both files for the letter b, you get all of the lines containing that letter:
| $ grep 'b' words* |
| words.txt:blue |
| words2.txt:blue car |
| words2.txt:candy bar |
But if you use the regular expression ^b, which means “look for the lower-case letter b at the beginning of the line,” you only see two results: blue and blue car:
| $ grep '^b' words* |
| words.txt:blue |
| words2.txt:blue car |
Similarly, if you use the expression e$, which means “look for any line ending with the letter e,” you see these three results:
| $ grep 'e$' words* |
| words.txt:blue |
| words.txt:apple |
| words2.txt:apple pie |
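The two anchors can be combined in a single pattern. As a small sketch, the expression ^blue$ only matches lines that consist of the word blue and nothing else. The example recreates both word files in a scratch directory so it can run on its own.

```shell
# Recreate the sample files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' blue apple candy hand fork > words.txt
printf '%s\n' 'blue car' 'apple pie' 'candy bar' 'hand in hand' \
  'fork in the road' > words2.txt

# ^ pins the match to the start of the line and $ pins it to the end,
# so the line must be exactly "blue".
exact=$(grep '^blue$' words.txt words2.txt)
echo "$exact"   # words.txt:blue
```

The line blue car in words2.txt is skipped because it continues past the word blue.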
Likewise, use the regular expression blue|apple to search for lines that contain “blue” or “apple”. To use this regular expression with grep, use the -E switch:
| $ grep -E 'blue|apple' words* |
| words.txt:blue |
| words.txt:apple |
| words2.txt:blue car |
| words2.txt:apple pie |
The -E switch lets you use extended regular expressions, which means grep treats the characters |, ?, +, {, (, and ) as operators instead of as literal text. These characters let you create more advanced search patterns. For example, the expression a(n|r) will look for any lines containing either “an” or “ar”:
| $ grep -E 'a(n|r)' words* |
| words2.txt:blue car |
| words2.txt:candy bar |
| words2.txt:hand in hand |
| words.txt:candy |
| words.txt:hand |
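To see what -E changes, compare the same pattern with and without it. This sketch uses a hypothetical file, pipes.txt, that includes one line containing a literal | character. Without -E, grep looks for that literal text; with -E, the | means “or”.

```shell
# pipes.txt is a hypothetical sample file; one line contains a literal "|".
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'blue' 'apple' 'blue|apple' 'candy' > pipes.txt

# Basic regular expression: | is an ordinary character, so only the line
# containing the literal text "blue|apple" matches.
basic=$(grep 'blue|apple' pipes.txt)
echo "$basic"   # blue|apple

# Extended regular expression: | separates alternatives, so any line
# containing "blue" or "apple" matches.
extended=$(grep -E 'blue|apple' pipes.txt)
echo "$extended"
```

The extended search prints three lines (blue, apple, and blue|apple), while the basic search prints only one.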
grep is a general-purpose text search tool. Alternatives like ack[11] and ripgrep[12] have additional features aimed at working with source code, but you should be comfortable using grep since it’s universally available.
Next, you’ll look at how to remove characters from output.