The main advantage of using extended grep is that additional regular expression metacharacters (see Table 3.4) have been added to the basic set. With the -E option, GNU grep allows the use of these new metacharacters.
Metacharacter | Function | Example | What It Matches |
---|---|---|---|
^ | Beginning of line anchor | ^love | Matches all lines beginning with love. |
$ | End of line anchor | love$ | Matches all lines ending with love. |
. | Matches one character | l..e | Matches lines containing an l, followed by two characters, followed by an e. |
* | Matches zero or more of the preceding character | ' *love' | Matches lines with zero or more spaces, followed by the pattern love. |
[ ] | Matches one character in the set | [Ll]ove | Matches lines containing love or Love. |
[^] | Matches one character not in the set | [^A-KM-Z]ove | Matches lines containing a character that is not A through K or M through Z, followed by ove. |
New with grep -E or egrep | |||
+ | Matches one or more of the preceding character | [a-z]+ove | Matches one or more lowercase letters, followed by ove. Would find move, approve, love, behoove, etc. |
? | Matches zero or one of the preceding character | lo?ve | Matches an l followed by zero or one o, then ve. Would find love or lve. |
a|b | Matches either a or b | love|hate | Matches either expression, love or hate. |
() | Groups characters | love(able|ly) (ov)+ | Matches loveable or lovely. Matches one or more occurrences of ov. |
x{m} x{m,} x{m,n}[a] | Repetition of character x: m times, at least m times, or between m and n times | o{5} o{5,} o{5,10} | Matches if the line has 5 o's, at least 5 o's, or between 5 and 10 o's. |
\w | Alphanumeric word character; [a-zA-Z0-9_] | l\w*e | Matches an l followed by zero or more word characters, and an e. |
\W | Nonalphanumeric (nonword) character; [^a-zA-Z0-9_] | | |
\b | Word boundary | \blove\b | Matches only the word love. |
[a] The { } metacharacters are not supported on all versions of UNIX or all pattern-matching utilities; they usually work with vi and grep when written with backslashes, \{ \}. They don't work with UNIX egrep at all.
The following examples illustrate how the extended set of regular expression metacharacters is used with grep -E and egrep. The grep examples presented earlier illustrate the standard metacharacters, which egrep also recognizes. With basic GNU grep (grep -G), any of the additional metacharacters can be used, provided that each one is preceded by a backslash.
In the following examples, all three variants of grep are shown to accomplish the same task.
    % cat datafile
    northwest   NW   Charles Main        3.0  .98  3  34
    western     WE   Sharon Gray         5.3  .97  5  23
    southwest   SW   Lewis Dalsass       2.7  .8   2  18
    southern    SO   Suan Chin           5.1  .95  4  15
    southeast   SE   Patricia Hemenway   4.0  .7   4  17
    eastern     EA   TB Savage           4.4  .84  5  20
    northeast   NE   AM Main Jr.         5.1  .94  3  13
    north       NO   Margot Weber        4.5  .89  5  9
    central     CT   Ann Stephens        5.7  .94  5  13
    1   % egrep 'NW|EA' datafile
        northwest   NW   Charles Main        3.0  .98  3  34
        eastern     EA   TB Savage           4.4  .84  5  20

    2   % grep -E 'NW|EA' datafile
        northwest   NW   Charles Main        3.0  .98  3  34
        eastern     EA   TB Savage           4.4  .84  5  20

    3   % grep 'NW|EA' datafile

    4   % grep 'NW\|EA' datafile
        northwest   NW   Charles Main        3.0  .98  3  34
        eastern     EA   TB Savage           4.4  .84  5  20
Explanation
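1. The egrep command prints each line that contains either NW or EA.
2. GNU grep is used with the -E option to enable the extended metacharacters. This is the same as egrep.
3. Regular grep does not normally support extended regular expressions; the vertical bar is an extended regular expression metacharacter used for alternation. Regular grep doesn't recognize it and searches for the literal string NW|EA. Nothing matches; nothing prints.
4. With regular GNU grep, the vertical bar is treated as the alternation metacharacter when it is preceded by a backslash, so the output is the same as with egrep and grep -E.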
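The difference between the dialects can also be checked by counting matches instead of reading output. The following is a minimal sketch, assuming a POSIX shell with mktemp available; the sample file, its third row, and the variable names are made up for illustration. The backslashed form in the last command is a GNU extension, so other versions of grep may report a different count there.

```shell
#!/bin/sh
# Sample file: two rows from datafile plus a hypothetical row that
# contains the literal text NW|EA.
tmp=$(mktemp)
printf '%s\n' \
    'northwest NW Charles Main 3.0 .98 3 34' \
    'eastern EA TB Savage 4.4 .84 5 20' \
    'literal NW|EA row' > "$tmp"

# grep -c prints the number of matching lines.
ere=$(grep -E -c 'NW|EA' "$tmp")    # extended: | means "or"
bre=$(grep -c 'NW|EA' "$tmp")       # basic: | is an ordinary character
gnu=$(grep -c 'NW\|EA' "$tmp")      # basic with \| (GNU extension, assumed)

printf 'extended: %s  basic: %s  basic with backslash: %s\n' \
    "$ere" "$bre" "$gnu"
rm -f "$tmp"
```

With extended syntax all three rows match; with basic syntax only the row holding the literal string does.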
    % cat datafile
    northwest   NW   Charles Main        3.0  .98  3  34
    western     WE   Sharon Gray         5.3  .97  5  23
    southwest   SW   Lewis Dalsass       2.7  .8   2  18
    southern    SO   Suan Chin           5.1  .95  4  15
    southeast   SE   Patricia Hemenway   4.0  .7   4  17
    eastern     EA   TB Savage           4.4  .84  5  20
    northeast   NE   AM Main Jr.         5.1  .94  3  13
    north       NO   Margot Weber        4.5  .89  5  9
    central     CT   Ann Stephens        5.7  .94  5  13

    % egrep '3+' datafile
    % grep -E '3+' datafile
    % grep '3\+' datafile
    northwest   NW   Charles Main        3.0  .98  3  34
    western     WE   Sharon Gray         5.3  .97  5  23
    northeast   NE   AM Main Jr.         5.1  .94  3  13
    central     CT   Ann Stephens        5.7  .94  5  13
Explanation
Prints all lines containing one or more 3s.
    % egrep '2\.?[0-9]' datafile
    % grep -E '2\.?[0-9]' datafile
    % grep '2\.\?[0-9]' datafile
    western     WE   Sharon Gray         5.3  .97  5  23
    southwest   SW   Lewis Dalsass       2.7  .8   2  18
    eastern     EA   TB Savage           4.4  .84  5  20
Explanation
Prints all lines containing a 2, followed by zero or one period, followed by a number in the range between 0 and 9.
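The backslash before the period matters: without it, the period matches any character, and the ? makes that arbitrary character optional. A small sketch of the difference, assuming a POSIX shell with mktemp; the four one-field lines are hypothetical test data, not rows from datafile.

```shell
#!/bin/sh
# Four hypothetical lines: only the first two contain a 2 followed by
# an optional literal period and then a digit.
tmp=$(mktemp)
printf '%s\n' '2.7' '23' '2x7' '2' > "$tmp"

escaped=$(grep -E -c '2\.?[0-9]' "$tmp")    # optional literal period
unescaped=$(grep -E -c '2.?[0-9]' "$tmp")   # optional arbitrary character

printf 'escaped: %s  unescaped: %s\n' "$escaped" "$unescaped"
rm -f "$tmp"
```

The unescaped pattern also accepts 2x7, which is why it reports one more matching line.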
    % egrep '(no)+' datafile
    % grep -E '(no)+' datafile
    % grep '\(no\)\+' datafile
    northwest   NW   Charles Main        3.0  .98  3  34
    northeast   NE   AM Main Jr.         5.1  .94  3  13
    north       NO   Margot Weber        4.5  .89  5  9
Explanation
Prints lines containing one or more occurrences of the pattern group no.
    % grep -E '\w+\W+[ABC]' datafile
    northwest   NW   Charles Main        3.0  .98  3  34
    southern    SO   Suan Chin           5.1  .95  4  15
    northeast   NE   AM Main Jr.         5.1  .94  3  13
    central     CT   Ann Stephens        5.7  .94  5  13
Explanation
Prints all lines containing one or more alphanumeric word characters (\w+), followed by one or more nonalphanumeric word characters (\W+), followed by one letter in the set A, B, C.
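The \w and \W shorthands are GNU extensions. Where they are unavailable, POSIX bracket expressions can express the same idea. The sketch below assumes a POSIX shell with mktemp and uses three rows copied from datafile; the variable names are illustrative only.

```shell
#!/bin/sh
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
central CT Ann Stephens 5.7 .94 5 13
EOF

# [[:alnum:]_] stands in for \w, and [^[:alnum:]_] for \W.
posix=$(grep -E -c '[[:alnum:]_]+[^[:alnum:]_]+[ABC]' "$tmp")

printf 'lines matched with POSIX classes: %s\n' "$posix"
rm -f "$tmp"
```

The northwest and central rows match (NW Charles, central CT); the western row has no A, B, or C following a nonword character.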
    % egrep 'S(h|u)' datafile
    % grep -E 'S(h|u)' datafile
    % grep 'S\(h\|u\)' datafile
    western     WE   Sharon Gray         5.3  .97  5  23
    southern    SO   Suan Chin           5.1  .95  4  15
Explanation
Prints all lines containing S, followed by either h or u; i.e., Sh or Su.
    % egrep 'Sh|u' datafile
    % grep -E 'Sh|u' datafile
    % grep 'Sh\|u' datafile
    western     WE   Sharon Gray         5.3  .97  5  23
    southwest   SW   Lewis Dalsass       2.7  .8   2  18
    southern    SO   Suan Chin           5.1  .95  4  15
    southeast   SE   Patricia Hemenway   4.0  .7   4  17
Explanation
Prints all lines containing the expression Sh or u.
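Alternation has the lowest precedence, so parentheses decide how much of the pattern each branch covers. The following sketch compares the grouped and ungrouped forms by match count; it assumes a POSIX shell with mktemp and uses four rows copied from datafile.

```shell
#!/bin/sh
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
southern SO Suan Chin 5.1 .95 4 15
eastern EA TB Savage 4.4 .84 5 20
EOF

grouped=$(grep -E -c 'S(h|u)' "$tmp")   # Sh or Su
ungrouped=$(grep -E -c 'Sh|u' "$tmp")   # Sh, or any u at all

printf 'S(h|u): %s  Sh|u: %s\n' "$grouped" "$ungrouped"
rm -f "$tmp"
```

The ungrouped pattern picks up the southwest row as well, because the u in southwest satisfies the second branch on its own.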
The variants of GNU grep supported by Linux are almost, but not exactly, the same as their UNIX namesakes. For example, the version of egrep found in Solaris or BSD UNIX does not support three metacharacter sets: { } for repetition, ( ) for tagging characters, and \< \>, the word anchors. Under Linux, these metacharacters are available with grep, grep -E, and egrep. The following examples illustrate these differences, in case you are running bash or tcsh under a UNIX system other than Linux and want to use grep and its family in your shell scripts.
    % cat datafile
    northwest   NW   Charles Main        3.0  .98  3  34
    western     WE   Sharon Gray         5.3  .97  5  23
    southwest   SW   Lewis Dalsass       2.7  .8   2  18
    southern    SO   Suan Chin           5.1  .95  4  15
    southeast   SE   Patricia Hemenway   4.0  .7   4  17
    eastern     EA   TB Savage           4.4  .84  5  20
    northeast   NE   AM Main Jr.         5.1  .94  3  13
    north       NO   Margot Weber        4.5  .89  5  9
    central     CT   Ann Stephens        5.7  .94  5  13

    (Linux GNU grep)
    1   % grep '<north>' datafile        # Must use backslashes

    2   % grep '\<north\>' datafile
        north       NO   Margot Weber        4.5  .89  5  9

    3   % grep -E '\<north\>' datafile
        north       NO   Margot Weber        4.5  .89  5  9

    4   % egrep '\<north\>' datafile
        north       NO   Margot Weber        4.5  .89  5  9

    (Solaris egrep)
    5   % egrep '\<north\>' datafile
        <no output; not recognized>
Explanation
1. No matter which variant of grep is used, the word anchor metacharacters, < and >, must each be preceded by a backslash. Without the backslashes, grep searches for the literal string <north>, and nothing matches.
2. This time, grep searches for north as a complete word. \< represents the beginning-of-word anchor and \> represents the end-of-word anchor.
3. grep with the -E option also recognizes the word anchors.
4. The GNU form of egrep recognizes the word anchors.
5. On Solaris (SVR4), egrep does not recognize the word anchors as regular expression metacharacters.
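For scripts that may run where the word anchors are not recognized, a whole-word match can be approximated in other ways. The sketch below shows two, assuming a POSIX shell with mktemp: the -w option (supported by GNU and BSD grep, but an assumption on other systems) and an extended expression built only from POSIX bracket classes. The variable names are illustrative.

```shell
#!/bin/sh
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
northwest NW Charles Main 3.0 .98 3 34
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Weber 4.5 .89 5 9
EOF

# -w: match only where the pattern forms a whole word.
with_w=$(grep -w -c 'north' "$tmp")

# Same idea without anchors: north must be bounded on each side by
# the line edge or by a character that cannot be part of a word.
with_ere=$(grep -E -c '(^|[^[:alnum:]_])north([^[:alnum:]_]|$)' "$tmp")

printf 'grep -w: %s  bracket classes: %s\n' "$with_w" "$with_ere"
rm -f "$tmp"
```

Both forms match only the north row; northwest and northeast are rejected because a word character follows north.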
    (Linux GNU grep)
    1   % grep 'w(es)t.*\1' datafile
        grep: Invalid back reference

    2   % grep 'w\(es\)t.*\1' datafile
        northwest   NW   Charles Main        3.0  .98  3  34

    3   % grep -E 'w(es)t.*\1' datafile
        northwest   NW   Charles Main        3.0  .98  3  34

    4   % egrep 'w(es)t.*\1' datafile
        northwest   NW   Charles Main        3.0  .98  3  34

    (Solaris egrep)
    5   % egrep 'w(es)t.*\1' datafile
        <no output; not recognized>
Explanation
1. With regular grep, the ( ) must be backslashed to act as tagging metacharacters. Without backslashes they are ordinary characters, so \1 refers to a tagged pattern that does not exist, and grep reports an error.
2. If the regular expression, w\(es\)t, is matched, the pattern, es, is saved and stored in memory register 1. The expression reads: if west is found, tag and save es, search for any number of characters (.*) after it, followed by es (\1) again, and print the line. The es in Charles is matched by the backreference.
3. This is the same as the previous example, except that with the -E switch the ( ) are not preceded by backslashes.
4. GNU egrep also uses the extended metacharacters, ( ), without backslashes.
5. With Solaris, egrep doesn't recognize any form of tagging and backreferencing.
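Because backreferences in extended expressions are an extension rather than part of POSIX, the backslashed basic form is the safer choice in a script meant to run on more than one system. A minimal sketch, assuming a POSIX shell with mktemp and three rows copied from datafile; the variable name is illustrative.

```shell
#!/bin/sh
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
EOF

# Basic syntax: \( \) tags "es", and \1 requires the same text again
# later on the line.
tagged=$(grep -c 'w\(es\)t.*\1' "$tmp")

printf 'lines where es repeats after west: %s\n' "$tagged"
rm -f "$tmp"
```

All three rows contain west, but only the northwest row has a second es (in Charles) after it.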
    (Linux GNU grep)
    1   % grep '\.[0-9]\{2\}[^0-9]' datafile
        northwest   NW   Charles Main        3.0  .98  3  34
        western     WE   Sharon Gray         5.3  .97  5  23
        southern    SO   Suan Chin           5.1  .95  4  15
        eastern     EA   TB Savage           4.4  .84  5  20
        northeast   NE   AM Main Jr.         5.1  .94  3  13
        north       NO   Margot Weber        4.5  .89  5  9
        central     CT   Ann Stephens        5.7  .94  5  13

    2   % grep -E '\.[0-9]{2}[^0-9]' datafile
        northwest   NW   Charles Main        3.0  .98  3  34
        western     WE   Sharon Gray         5.3  .97  5  23
        southern    SO   Suan Chin           5.1  .95  4  15
        eastern     EA   TB Savage           4.4  .84  5  20
        northeast   NE   AM Main Jr.         5.1  .94  3  13
        north       NO   Margot Weber        4.5  .89  5  9
        central     CT   Ann Stephens        5.7  .94  5  13

    3   % egrep '\.[0-9]{2}[^0-9]' datafile
        northwest   NW   Charles Main        3.0  .98  3  34
        western     WE   Sharon Gray         5.3  .97  5  23
        southern    SO   Suan Chin           5.1  .95  4  15
        eastern     EA   TB Savage           4.4  .84  5  20
        northeast   NE   AM Main Jr.         5.1  .94  3  13
        north       NO   Margot Weber        4.5  .89  5  9
        central     CT   Ann Stephens        5.7  .94  5  13

    (Solaris egrep)
    4   % egrep '\.[0-9]{2}[^0-9]' datafile
        <no output; not recognized with or without backslashes>
Explanation
1. The curly brace metacharacters are used for repetition. The GNU and UNIX versions of regular grep do not treat curly braces as repetition metacharacters unless they are preceded by backslashes, \{ \}. The whole expression reads: search for a literal period (\.), followed by a digit between 0 and 9 ([0-9]) repeated exactly two times (\{2\}), followed by a nondigit ([^0-9]).
2. With extended grep, grep -E, the repetition metacharacters, {2}, do not need to be preceded by backslashes as in the previous example.
3. Because GNU egrep and grep -E are functionally the same, this command produces the same output as the previous example.
4. This is the standard UNIX version of egrep. It does not recognize the curly braces as an extended metacharacter set either with or without backslashes.
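On a grep that accepts both spellings, the basic and extended forms of the repetition braces should select the same lines. The sketch below checks that by comparing counts, assuming a POSIX shell with mktemp and three rows copied from datafile; the variable names are illustrative.

```shell
#!/bin/sh
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
northwest NW Charles Main 3.0 .98 3 34
southwest SW Lewis Dalsass 2.7 .8 2 18
north NO Margot Weber 4.5 .89 5 9
EOF

# A period, exactly two digits, then a nondigit.
basic=$(grep -c '\.[0-9]\{2\}[^0-9]' "$tmp")      # braces backslashed
extended=$(grep -E -c '\.[0-9]{2}[^0-9]' "$tmp")  # braces bare

printf 'basic: %s  extended: %s\n' "$basic" "$extended"
rm -f "$tmp"
```

The southwest row is excluded by both forms because its decimals (.7 and .8) carry only one digit after the period.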