Regular expressions provide us with a powerful method to locate an arbitrarily complex pattern within a string. The regexp command is similar to the Find function in a text editor: you search a given string for the character or pattern of characters you are looking for. It returns a Boolean value that indicates success or failure and populates an optional list of variables with any matched strings. The -indices and -inline switches modify this behavior by changing what is stored or returned. But it doesn't stop there; by providing other switches, you can control how regexp performs the match. The switches are as follows:
| Switch | Behavior |
|---|---|
| -about | No actual matching is made. Instead, regexp returns a list containing information about the regular expression. |
| -expanded | Allows the use of expanded regular expression syntax, in which whitespace and comments are ignored. |
| -indices | Instead of the matched text, stores in each match variable a list of two decimal strings containing the indices in the string of the first and last characters of the matching range. |
| -line | Enables newline-sensitive matching, similar to passing both the -linestop and -lineanchor switches. |
| -linestop | Changes the behavior of [^ bracket expressions and the "." character so that they stop at newline characters. |
| -lineanchor | Changes the behavior of ^ and $ (anchors) so that they match the beginning and end of a line, respectively. |
| -nocase | Treats uppercase characters in the search string as lowercase. |
| -all | Causes the command to match as many times as possible and returns the count of the matches found. |
| -inline | Causes regexp to return, as a list, the data that would otherwise be placed in match variables. Match variables may NOT be used if -inline is specified. |
| -start index | Allows us to specify a character index from which searching should start. |
| -- | Denotes the end of switches being passed to regexp. Any argument following this switch will be treated as an expression, even if it starts with a "-". |
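
To make the table more concrete, here is a minimal sketch that exercises a few of the switches. The sample string and the variable names (text, found, count, words, firstIs, nextIs, dashed) are assumptions chosen for illustration; only the switches themselves come from regexp.

```tcl
# Hypothetical sample string used only for this illustration.
set text "Tcl is fun. TCL is fast."

# -nocase: ignore case while matching; the result is 1 on success.
set found [regexp -nocase {tcl} $text]

# -all: match as many times as possible and return the match count.
set count [regexp -all -nocase {tcl} $text]

# -all -inline: return every match as a list instead of using variables.
set words [regexp -all -inline {[A-Za-z]+} $text]

# -indices: store the first and last index of the match, not its text.
regexp -indices {is} $text firstIs

# -start: begin searching at character index 5; the reported indices
# are still counted from the beginning of the string.
regexp -start 5 -indices {is} $text nextIs

# --: end of switches, so an expression starting with "-" is accepted.
set dashed [regexp -- {-v} "tclsh -v"]

puts "$found $count"   ;# 1 2
puts $words            ;# Tcl is fun TCL is fast
puts "$firstIs | $nextIs"   ;# 4 5 | 16 17
puts $dashed           ;# 1
```

Without the -- switch, the last call would fail, because regexp would try to interpret -v as one of its own switches.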
Now that we have a background in switches, let's look at the command:
regexp switches expression string submatchvar submatchvar…
The regexp command determines whether the expression matches part or all of the string and returns 1 if a match exists or 0 if it is not found. If variables (for example, myNumber or myData) are passed after the string, they are used to store the results: the first variable receives the entire match and each remaining variable receives one submatch. Keep in mind that if the -inline switch has been passed, no return variables should be included in the command.
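
The following sketch illustrates the return value and the match variables described above. The input string and the expression are assumptions made up for this illustration; myData and myNumber are the example variable names mentioned in the text, and lassign assumes Tcl 8.5 or later.

```tcl
# Hypothetical input line used only for this illustration.
set line "temp=21 units=C"

# With no variables, only the Boolean result is available.
set hasTemp [regexp {temp=([0-9]+)} $line]

# myData receives the entire match; myNumber receives the first
# parenthesized submatch.
set matched [regexp {temp=([0-9]+)} $line myData myNumber]
puts "$matched $myData $myNumber"   ;# 1 temp=21 21

# With -inline, no variables are passed; the same data comes back as a list.
set parts [regexp -inline {temp=([0-9]+)} $line]
lassign $parts whole number
puts "$whole $number"               ;# temp=21 21

# When nothing matches, regexp returns 0 and -inline returns an empty list.
set missing [regexp {pressure=([0-9]+)} $line]
set empty [regexp -inline {pressure=([0-9]+)} $line]
puts "$missing [llength $empty]"    ;# 0 0
```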
To complete the following example, we will need to create a Tcl script file in your working directory. Open the text editor of your choice and follow the next set of instructions.
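A common use for regexp is to accept a string containing multiple fields and to split it into its constituent parts. In the following example, we will create a string containing an IP address and assign its octets to named variables. The expression is enclosed in braces so that Tcl does not treat the square brackets as command substitution, and each dot is escaped so that it matches only a literal "." character. Enter the following commands:
% set ip 192.168.1.65
192.168.1.65
% regexp {([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})} $ip all first second third fourth
1
% puts "$all\n$first\n$second\n$third\n$fourth"
192.168.1.65
192
168
1
65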
As you can see, the IP address has been split into its individual octet values. What regexp has done is match four groups of decimal characters ([0-9]), each 1 to 3 characters long ({1,3}), delimited by a "." character. The complete match, in this case the original IP address, is assigned to the first variable (all), while the octet values are assigned to the remaining variables (first, second, third, and fourth).
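
Note that an expression of this kind checks only the number of digits in each group, so a string such as 300.1.1.1 would also match. As a rough sketch of how the example could be extended, the hypothetical helper below (parseIp is an assumed name, not part of Tcl) anchors the expression to the whole string and then checks that each octet is no greater than 255.

```tcl
# parseIp is a hypothetical helper written for this illustration.
# It returns the four octets as a list, or an empty list if the
# string is not a dotted-quad address with octets in the range 0-255.
proc parseIp {ip} {
    # ^ and $ require the entire string to be an address.
    set pattern {^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$}
    if {![regexp $pattern $ip all first second third fourth]} {
        return {}
    }
    set octets [list $first $second $third $fourth]
    foreach octet $octets {
        # scan with %d reads the digits as decimal, so a leading zero
        # is not interpreted as an octal prefix.
        if {[scan $octet %d] > 255} {
            return {}
        }
    }
    return $octets
}

puts [parseIp 192.168.1.65]            ;# 192 168 1 65
puts [llength [parseIp 300.1.1.1]]     ;# 0
puts [llength [parseIp "192.168.1"]]   ;# 0
```

Returning an empty list on failure is just one possible design; a script could equally raise an error or return a Boolean alongside the octets.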