Capturing groups

Groups are a very useful feature of regular expressions and are supported in all regex flavors. A group combines multiple characters or smaller subpatterns into a single unit. We create a group by placing a series of characters or subpatterns inside parentheses, ( and ). For example, consider the following regex pattern:

    (blue|red) 

This pattern is a capturing group that uses alternation. It matches either the letters b, l, u, and e or the letters r, e, and d. In other words, it matches the string blue or red and, more importantly, it captures whichever of the two strings was matched. Each group becomes a single unit to which certain constructs can be applied as a whole. Anchors, boundary assertions, quantifiers, and alternation can all be restricted to the part of the regular expression enclosed by the group. For example, look at the following regex pattern:

    ^Regular(Expression)?$ 

This regular expression first matches the string Regular at the start of the input. After that comes one capturing group containing the string Expression. Because the ? quantifier is placed after the group, it matches zero or one occurrence of the whole group, which makes it an optional capturing group. Hence, this regex matches either the string Regular, with a first capturing group that captures nothing, or the string RegularExpression, with the substring Expression in the first capturing group.
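The two patterns above can be tried out with a short sketch. It assumes Python's standard re module, which is one flavor among many; the variable names and sample inputs are made up for illustration. Note that Python reports a group that captured nothing as None, while other flavors may report it differently.

```python
import re

# Alternation inside a capturing group: group 1 holds whichever word matched.
color = re.compile(r"(blue|red)")
m = color.search("a red door")
print(m.group(1))  # red

# Optional capturing group: the ? quantifier applies to the whole group.
optional = re.compile(r"^Regular(Expression)?$")

short = optional.match("Regular")
full = optional.match("RegularExpression")

print(short.group(1))  # None: the group took no part in the match
print(full.group(1))   # Expression

# A partial suffix is rejected: the group is all-or-nothing.
print(optional.match("RegularExp"))  # None
```

The last line shows why the group matters: without the parentheses, ? would apply only to the final letter n.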

If we need a regular expression that matches only input consisting of an even number of digits, we can use this pattern:

    ^([0-9]{2})+$ 

Since the + quantifier (one or more) follows the group that matches a pair of digits, it applies to the entire group. Hence, this regular expression matches one or more pairs of digits (2, 4, 6, 8, 10, ... digits in total). In other words, it matches an even number of digits.
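As a quick check of the even-digits pattern, here is a minimal sketch, again assuming Python's re module. The sample inputs are arbitrary. It also shows a detail worth knowing in this flavor: when a capturing group is repeated, the group holds only the text of its last repetition.

```python
import re

even_digits = re.compile(r"^([0-9]{2})+$")

# Illustrative inputs: even length, odd length, empty, and non-digit content.
samples = ["12", "1234", "123", "", "12a4"]
results = {s: bool(even_digits.match(s)) for s in samples}
print(results)
# {'12': True, '1234': True, '123': False, '': False, '12a4': False}

# The quantifier repeats the whole group; group 1 keeps the last pair matched.
last_pair = even_digits.match("123456").group(1)
print(last_pair)  # 56
```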

A regular expression can have multiple capturing groups, which can be nested inside each other as well.

For example, in the following regular expression, there are three capturing groups:

    ^((\d+)-([a-zA-Z]+))$

The preceding expression matches the input string 1234-aBc with the following groups:

  1. Group 1: 1234-aBc
  2. Group 2: 1234
  3. Group 3: aBc
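The group numbering listed above can be reproduced with the following sketch, assuming Python's re module and assuming that the digit class \d is what the pattern intends for its first inner group. Groups are numbered by the position of their opening parenthesis, counted from left to right, so the outer group is group 1.

```python
import re

nested = re.compile(r"^((\d+)-([a-zA-Z]+))$")
m = nested.match("1234-aBc")

print(m.group(0))  # 1234-aBc (group 0 is always the entire match)
print(m.group(1))  # 1234-aBc (outer group, first opening parenthesis)
print(m.group(2))  # 1234     (second opening parenthesis)
print(m.group(3))  # aBc      (third opening parenthesis)

# All numbered groups at once, as a tuple.
print(m.groups())  # ('1234-aBc', '1234', 'aBc')
```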