One of the most powerful and widely used commands in the shell is grep. It searches its input and prints every line in which the given pattern is found. By default, the matched lines are printed on stdout, which is usually the terminal. We can also redirect the matched output to other streams, such as a file. Instead of reading its input from a file, grep can also take its input from the output of the command executed on the left-hand side of a pipe ('|').
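The three ways of feeding and capturing grep described above can be sketched as follows. This is only an illustration: the file names menu.txt and matches.txt and their content are made up for the example.

```shell
# Hypothetical sample input, created in a temporary directory.
workdir=$(mktemp -d)
printf 'apple pie\nbanana split\napple tart\n' > "$workdir/menu.txt"

# 1. Read from a file; matched lines go to stdout (the terminal).
grep 'apple' "$workdir/menu.txt"

# 2. Redirect the matched lines to a file instead of the terminal.
grep 'apple' "$workdir/menu.txt" > "$workdir/matches.txt"

# 3. Read from the output of the command on the left-hand side of '|'.
piped=$(printf 'apple pie\nbanana split\n' | grep 'banana')
echo "matched from pipe: $piped"
```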
The syntax of the grep command is as follows:
grep [OPTIONS] PATTERN [FILE...]
Here, FILE can be one or more files to search. If no file is given as an input, grep searches the standard input.
PATTERN can be any valid regular expression. Put PATTERN within single quotes (') or double quotes (") as needed: use single quotes to prevent any bash expansion and double quotes to allow it.
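The difference between the two quoting styles can be seen in a small sketch. The variable name word and the sample lines are assumptions made for this illustration; -F is used in the second command so that the dollar sign is treated as plain text rather than as a regular expression anchor.

```shell
word="shell"   # hypothetical variable, used only in this example

# Double quotes: bash expands $word first, so grep searches for "shell".
expanded=$(printf '%s\n' 'bash is a shell' 'price is $word' | grep "$word")

# Single quotes: no expansion, so grep receives the literal text $word.
# -F makes grep treat the pattern as a fixed string, not a regex.
literal=$(printf '%s\n' 'bash is a shell' 'price is $word' | grep -F '$word')

echo "double quotes matched: $expanded"
echo "single quotes matched: $literal"
```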
A lot of OPTIONS are available in grep. Some of the important and widely used options are discussed in the following table:
We often have to search for a given string or pattern in a file. The grep command lets us do that in a single line. Let's see the following example:
The input file for our example will be input1.txt:
$ cat input1.txt # Input file for our example
This file is a text file to show demonstration
of grep command. grep is a very important and
powerful command in shell.
This file has been used in chapter 2
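We will try to get the following information from the input1.txt file using the grep command:
Number of lines
Lines that start with a capital letter
Lines that end with a period (.)
Number of sentences
Occurrences of the sub-string sent
Lines that don't have a period
Number of times the string file is used
The following shell script demonstrates how to do the above-mentioned tasks: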
#!/bin/bash
#Filename: pattern_search.sh
#Description: Searching for a pattern using input1.txt file

# '.*' matches every line, including empty ones
echo "Number of lines = $(grep -c '.*' input1.txt)"
echo "Line starting with capital letter:"
grep -c '^[A-Z].*' input1.txt
echo
echo "Line ending with full stop (.):"
# The dot is escaped so that it matches a literal full stop
grep '.*\.$' input1.txt
echo
echo -n "Number of sentence = "
grep -c '\.' input1.txt
echo "Strings matching sub-string sent:"
grep -o "sent" input1.txt
echo
echo "Lines not having full stop are:"
grep -v '\.' input1.txt
echo
echo -n "Number of times string file used: = "
grep -o "file" input1.txt | wc -w
The output after running the pattern_search.sh shell script will be as follows:
Number of lines = 4
Line starting with capital letter:
2

Line ending with full stop (.):
powerful command in shell.

Number of sentence = 2
Strings matching sub-string sent:

Lines not having full stop are:
This file is a text file to show demonstration
This file has been used in chapter 2

Number of times string file used: = 3
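The last task above relies on the difference between counting matching lines and counting matches. The sketch below illustrates that difference; it assumes the four-line layout of input1.txt implied by the script output and recreates the file in a temporary directory.

```shell
workdir=$(mktemp -d)
# Assumed layout of input1.txt (four lines, as implied by the output above).
cat > "$workdir/input1.txt" <<'EOF'
This file is a text file to show demonstration
of grep command. grep is a very important and
powerful command in shell.
This file has been used in chapter 2
EOF

# -c counts matching LINES: two lines contain "file".
line_count=$(grep -c 'file' "$workdir/input1.txt")

# -o prints every match on its own line, so counting those lines
# gives the number of OCCURRENCES. tr strips padding some wc versions add.
match_count=$(grep -o 'file' "$workdir/input1.txt" | wc -l | tr -d ' ')

echo "lines with file: $line_count, occurrences of file: $match_count"
```

Counting with -c alone would under-report here, because the first line contains the string twice.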
The grep command also allows us to search for a pattern in multiple input files. To explain this in detail, we will head directly to the following example:
The input files, in our case, will be input1.txt and input2.txt.
We will reuse the content of the input1.txt file from the previous example.
The content of input2.txt is as follows:
$ cat input2.txt
Another file for demonstrating grep CommaNd usage. It allows us to do CASE Insensitive string test as well. We can also do recursive SEARCH in a directory using -R and -r Options. grep allows to give a regular expression to search for a PATTERN. Some special characters like . * ( ) { } $ ^ ? are used to form regexp. Range of digit can be given to regexp e.g. [3-6], [7-9], [0-9]
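We will try to get the following information from the input1.txt and input2.txt files using the grep command:
Search for the string command
Case-insensitive search for the string command
Line numbers where the string grep matches
Punctuation marks in the files
Each line matching the string important, together with the line that follows it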
The following shell script demonstrates how to follow the preceding steps:
#!/bin/bash
# Filename: multiple_file_search.sh
# Description: Demonstrating search in multiple input files

echo "This program searches in files input1.txt and input2.txt"
echo "Search result for string \"command\":"
grep "command" input1.txt input2.txt
echo
echo "Case insensitive search of string \"command\":"
# input{1,2}.txt will be expanded by bash to input1.txt input2.txt
grep -i "command" input{1,2}.txt
echo
echo "Search for string \"grep\" and print matching line number too:"
grep -n "grep" input{1,2}.txt
echo
echo "Punctuation marks in files:"
# The bracket expression is quoted so that bash does not treat it as a glob
grep -n '[[:punct:]]' input{1,2}.txt
echo
echo "Next line content whose previous line has string \"important\":"
grep -A 1 'important' input1.txt input2.txt
The following screenshot is the output after running the shell script multiple_file_search.sh. The matched pattern string has been highlighted:
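When grep is given more than one file, it prefixes each matched line with the name of the file it came from. A minimal sketch of that behaviour follows; the files a.txt and b.txt and their content are hypothetical, and -h and -A are assumed to be available as in GNU grep.

```shell
workdir=$(mktemp -d)
printf 'grep is important\nnext line\nlast line\n' > "$workdir/a.txt"
printf 'nothing here\nGrep again\n' > "$workdir/b.txt"

# Several files: every match is prefixed with "filename:".
with_names=$(cd "$workdir" && grep -i 'grep' a.txt b.txt)

# -h suppresses the filename prefix.
without_names=$(cd "$workdir" && grep -hi 'grep' a.txt b.txt)

# -A 1 prints each matching line plus one line of trailing context.
context=$(cd "$workdir" && grep -A 1 'important' a.txt)

printf '%s\n---\n%s\n---\n%s\n' "$with_names" "$without_names" "$context"
```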
The following subsections will cover a few more usages of the grep command.
So far, we have seen all the grep examples running on text files. We can also search for a pattern in binary files using grep. For this, we have to tell the grep command to treat a binary file as a text file. The option -a or --text tells grep to process a binary file as if it were a text file.
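The effect of -a can be tried without touching a system binary. The sketch below builds a small file containing a NUL byte, which is what leads grep to classify a file as binary; the file name sample.bin and its content are assumptions for illustration, and the exact notice printed without -a differs between grep versions.

```shell
workdir=$(mktemp -d)
# Hypothetical sample: the NUL byte (\0) makes grep treat the file as binary.
printf 'header\0payload with needle\n' > "$workdir/sample.bin"

# Without -a, grep only reports that a binary file matches (some versions
# print that notice on stderr), so the matched text itself is not shown.
plain=$(grep -o 'needle' "$workdir/sample.bin" 2>/dev/null)

# With -a (same as --text), the file is processed as text and the
# matched text is printed.
as_text=$(grep -a -o 'needle' "$workdir/sample.bin")

echo "without -a: '$plain'"
echo "with -a:    '$as_text'"
```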
We know that the grep command itself is a binary file that executes and gives a search result. One of the options in grep is --text, so the string --text should be available somewhere in the grep binary file. Let's search for it as follows:
$ grep --text '\-\-text' /usr/bin/grep
  -a, --text                equivalent to --binary-files=text
We saw that the string --text is found in the search path /usr/bin/grep. The backslash character ('\') is used to escape the special meaning of the hyphen: without it, grep would interpret a pattern that starts with a hyphen as an option.
Now, let's search for the -w string in the wc binary. We know that the wc command has an option -w that counts the number of words in an input text.
$ grep -a '\-w' /usr/bin/wc
  -w, --words            print the word counts
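Escaping is not the only way to search for a pattern that begins with a hyphen. As a sketch, the -e option and the -- end-of-options marker achieve the same result; the two sample lines below are hypothetical stand-ins for help text and are not read from a real binary.

```shell
# Hypothetical help-like text used as input.
sample=$(printf '%s\n' '  -w, --words   print the word counts' \
                       '  -l, --lines   print the newline counts')

# -e marks the next argument as the pattern, even if it starts with '-'.
with_e=$(printf '%s\n' "$sample" | grep -e '-w')

# A bare -- ends option parsing; what follows is the pattern (and files).
with_dashes=$(printf '%s\n' "$sample" | grep -- '--lines')

printf '%s\n%s\n' "$with_e" "$with_dashes"
```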
We can also tell grep to search all the files and subdirectories of a directory recursively using the option -R. This avoids the hassle of specifying each file as an input text file to grep.
For example, we are interested in knowing in how many places #include <stdio.h> is used in the standard include directory:
$ grep -R '#include <stdio.h>' /usr/include/ | wc -l
77
This means that the #include <stdio.h> string is found at 77 places in the /usr/include directory.
In another example, we want to know how many lines in the Python files (the extension .py) directly under /usr/lib64/python2.7/ contain "import os". We can check that as follows:
$ grep -R "import os" /usr/lib64/python2.7/*.py | wc -l
93
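Note that a shell glob such as *.py is expanded by bash before grep runs, so only the .py files at the top level of the directory are searched. To filter by file name across the whole tree, GNU grep offers --include. The sketch below compares the two approaches on a made-up directory tree; all file and directory names in it are hypothetical.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/pkg/sub"
printf 'import os\n' > "$workdir/pkg/top.py"
printf 'import os\nimport sys\n' > "$workdir/pkg/sub/nested.py"
printf 'import os\n' > "$workdir/pkg/notes.txt"

# Shell glob: only .py files directly inside pkg/ are handed to grep.
glob_count=$(grep -R 'import os' "$workdir"/pkg/*.py | wc -l | tr -d ' ')

# --include: grep walks the whole tree and keeps only names matching *.py.
tree_count=$(grep -R --include='*.py' 'import os' "$workdir/pkg" | wc -l | tr -d ' ')

echo "glob: $glob_count matching line(s), --include: $tree_count matching line(s)"
```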
We can also tell the grep command to exclude a particular directory or file from the search. This is useful when we don't want grep to look into a file or directory that has some confidential information. It is also useful when we are sure that searching a certain directory will be of no use, so excluding it will reduce the search time.
Suppose there is a source code directory called s0, which uses the git version control. Now, we are interested in searching for a text or pattern in source files. In this case, searching in the .git subdirectory will be of no use. We can exclude .git from the search as follows:
$ grep -R --exclude-dir=.git "search_string" s0
Here, we are searching for the search_string string in the s0 directory and telling grep not to search in the .git directory.
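To exclude files instead of a directory, use the --exclude=GLOB option, which skips files whose names match GLOB. The related --exclude-from=FILE option reads the glob patterns to exclude from FILE.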
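The exclusion options can be combined in a single command. The following sketch assumes GNU grep and a made-up s0 tree created in a temporary directory; the file names main.c and main.log are hypothetical.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/s0/.git" "$workdir/s0/src"
printf 'search_string\n' > "$workdir/s0/.git/config"
printf 'search_string\n' > "$workdir/s0/src/main.c"
printf 'search_string\n' > "$workdir/s0/src/main.log"

# Skip the .git directory entirely; -l lists only the matching file names.
no_git=$(grep -Rl --exclude-dir=.git 'search_string' "$workdir/s0" | sort)

# Additionally skip every file whose name matches the glob *.log.
no_git_no_logs=$(grep -Rl --exclude-dir=.git --exclude='*.log' \
                 'search_string' "$workdir/s0")

printf '%s\n---\n%s\n' "$no_git" "$no_git_no_logs"
```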
In some use cases, we don't care where the search matched or at how many places it matched in a file. Instead, we are interested in knowing only the names of the files in which at least one match was found.
For example, we may want to save the names of the files that contain a particular search pattern, or pass them to some other command for further processing. We can achieve this using the -l option:
$ grep -Rl "import os" /usr/lib64/python2.7/*.py > search_result.txt
$ wc -l search_result.txt
79
This example gets the names of the files in which import os is written and saves the result in the file search_result.txt.
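As a small sketch of the same idea on made-up data, the following example lists the files that contain a match with -l and, using the -L option of GNU grep, the files that do not. The three .py files are hypothetical and are created in a temporary directory.

```shell
workdir=$(mktemp -d)
printf 'import os\nimport os.path\n' > "$workdir/a.py"
printf 'import sys\n' > "$workdir/b.py"
printf 'import os\n' > "$workdir/c.py"

# -l prints each matching file once, however many lines match inside it.
grep -l 'import os' "$workdir"/*.py > "$workdir/search_result.txt"
file_count=$(wc -l < "$workdir/search_result.txt" | tr -d ' ')

# -L is the inverse: it lists the files that contain no match at all.
unmatched=$(grep -L 'import os' "$workdir"/*.py)

echo "files with a match: $file_count"
echo "files without a match: $unmatched"
```

Because a.py is listed only once even though two of its lines match, the file count can be lower than the line count reported by wc -l on plain grep output.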
An exact match of a word is also possible by placing a word boundary (\b) on both sides of the search pattern.
Here, we will reuse the input1.txt file and its content:
$ grep -i --color "\ba\b" input1.txt
The --color option allows colored printing of the matched search result.
The "\ba\b" pattern tells grep to look only for the character a standing alone as a word. In the search results, it won't match the character a when it is present as a sub-string inside a longer string.
The following screenshot shows the output:
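Whole-word matching can also be requested with the -w option, which GNU grep provides as a shorthand for surrounding the pattern with word boundaries. The sketch below contrasts a plain search with a whole-word search; the three sample lines are made up for the illustration.

```shell
# Hypothetical sample text.
text=$(printf '%s\n' 'This is a text file' 'grep command' 'A very important line')

# Plain search: every line containing the letter a (in any case) matches.
loose=$(printf '%s\n' "$text" | grep -ci 'a')

# -w: the pattern must form a whole word, so "command" no longer matches.
strict=$(printf '%s\n' "$text" | grep -iw 'a')

echo "lines containing a: $loose"
printf 'lines containing the word a:\n%s\n' "$strict"
```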