Exploring the quantifiers

Each of these quantifiers is greedy by default. A greedy quantifier grabs as much as it possibly can before allowing the regex engine to move on to the next character in the expression.

In the following example, the expression has been instructed to match everything it can, ending with a \ character. As a result, it takes everything up to the last \, because the expression is greedy:

PS> 'C:\long\path\to\some\files' -match '.*\\'; $matches[0]
True
C:\long\path\to\some\

The repetition operators can be made lazy by appending the ? character. A lazy expression, by contrast, takes as little as it can before it ends:

PS> 'C:\long\path\to\some\files' -match '.*?\\'; $matches[0]
True
C:\
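
The same suffix applies to the other repetition operators. The following is a minimal sketch rather than a definitive reference. The variable names and the short 'aaaa' and 'ab' test strings are assumptions chosen for illustration, and [regex]::Match is used in place of -match so that each result can be stored directly:

```powershell
$path = 'C:\long\path\to\some\files'

# * and *? : greedy runs to the last \, lazy stops at the first \
$greedyStar = [regex]::Match($path, '.*\\').Value     # C:\long\path\to\some\
$lazyStar   = [regex]::Match($path, '.*?\\').Value    # C:\

# {2,4} and {2,4}? : greedy takes the maximum, lazy takes the minimum
$greedyRange = [regex]::Match('aaaa', 'a{2,4}').Value   # aaaa
$lazyRange   = [regex]::Match('aaaa', 'a{2,4}?').Value  # aa

# ? and ?? : greedy takes the optional character, lazy skips it
$greedyOptional = [regex]::Match('ab', 'ab?').Value     # ab
$lazyOptional   = [regex]::Match('ab', 'ab??').Value    # a
```

In each pair, the lazy form returns the shortest text that still allows the whole expression to match.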

A possible use of a lazy quantifier is parsing HTML. The following line describes a very simple HTML table. The goal is to get the first table data (td) element:

<table><tr><td>Value1</td><td>Value2</td></tr></table> 

Using a greedy quantifier will potentially take too much:

PS> $html = '<table><tr><td>Value1</td><td>Value2</td></tr></table>'
PS> $html -match '<td>.+</td>'; $matches[0]
True
<td>Value1</td><td>Value2</td>

Using a negated character class is one possible way to solve this problem. The character class matches any character except >, so the match cannot extend beyond the > that closes the first </td> tag:

PS> $html = '<table><tr><td>Value1</td><td>Value2</td></tr></table>'
PS> $html -match '<td>[^>]+</td>'
True

PS> $matches[0]
<td>Value1</td>
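
The character class approach depends on the cell containing no tags of its own. The following sketch illustrates that limitation. The $nested string is a hypothetical input that is not part of the original example, and [regex]::Match is used so that the result can be stored in a variable:

```powershell
# Hypothetical input: the first cell contains nested markup.
$nested = '<td><b>Value1</b></td><td>Value2</td>'

# [^>]+ cannot cross the > that closes <b>, so no match can start at the
# first <td>. The engine moves on and matches the second cell instead.
$classMatch = [regex]::Match($nested, '<td>[^>]+</td>').Value
$classMatch   # <td>Value2</td>
```

The expression still reports a match, but not for the first cell, so the result should be checked when the content of a cell is not known in advance.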

Another way to solve the problem is to use a lazy quantifier:

PS> $html = '<table><tr><td>Value1</td><td>Value2</td></tr></table>'
PS> $html -match '<td>.+?</td>'
True

PS> $matches[0]
<td>Value1</td>
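
Because the lazy quantifier stops at the first closing tag, the same expression can be applied repeatedly to collect every cell. The following is a sketch under a few assumptions: the capture group, the $cells variable, and the use of the .NET [regex]::Matches method (instead of -match, which returns only the first match) are additions for illustration:

```powershell
$html = '<table><tr><td>Value1</td><td>Value2</td></tr></table>'

# Each match covers one <td>...</td> pair; group 1 holds the text between the tags.
$cells = @(
    [regex]::Matches($html, '<td>(.+?)</td>') |
        ForEach-Object { $_.Groups[1].Value }
)

$cells   # Value1, then Value2
```

With the greedy form, '<td>(.+)</td>', the same call would return a single match spanning both cells.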