String Pattern Functions

To simplify program development of shells and other related functions that require this pattern matching capability, the function fnmatch(3) was developed. The function synopsis for it is as follows:

#include <fnmatch.h>

int fnmatch(const char *pattern, const char *string, int flags);

The argument pattern is the input pattern string, which is compared with the input argument string. The argument pattern contains the meta-characters, if any. The argument string is the string that you want to test for a match. The argument flags enables and disables certain features of the fnmatch(3) function.

The return value from fnmatch(3) is zero if a match is made. Otherwise, the value FNM_NOMATCH is returned instead.

Note

When the argument pattern is the C string "*" and the argument string is the empty string "", the function fnmatch(3) considers this to be a match.
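
The return values described above can be folded into a small wrapper. The following is a minimal sketch: the helper name pattern_matches() is hypothetical and is not part of fnmatch(3). It also assumes, as POSIX describes, that a return value other than zero or FNM_NOMATCH indicates an error.

```c
#include <fnmatch.h>

/*
 * Hypothetical helper (not part of the C library):
 * returns 1 on a match, 0 on no match, and -1 when
 * fnmatch(3) reports any other (error) value.
 */
int
pattern_matches(const char *pattern, const char *string, int flags) {
    int z = fnmatch(pattern, string, flags);

    if ( z == 0 )
        return 1;               /* Matched */
    if ( z == FNM_NOMATCH )
        return 0;               /* Did not match */
    return -1;                  /* Assumed to be an error */
}
```

With this helper, pattern_matches("*", "", 0) yields 1, which agrees with the note above.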


The flags argument of the fnmatch(3) function accepts the following macros for various bit definitions, which may be ORed together:

FNM_NOESCAPE Treat the backslash (\) as a normal character (no quoting is performed).
FNM_PATHNAME Slashes (/) in string must match slashes in pattern.
FNM_PERIOD Leading periods (.) in string must only be matched by leading periods in pattern. This is affected by FNM_PATHNAME.
FNM_LEADING_DIR Match the leading directory pattern, but ignore any text in string that follows the matched portion, starting with a slash (/).
FNM_CASEFOLD Ignore case distinctions in the pattern and string arguments.

Each of these option flags will be discussed in detail in the sections that follow. To aid in discussing and experimenting with these flags, the program in Listing 20.1 will be used.

Code Listing 20.1. fnmatch.c: A Program to Exercise the fnmatch(3) Function
1:   /* fnmatch.c */
2:
3:   #include <stdio.h>
4:   #include <unistd.h>
5:   #include <fnmatch.h>
6:
7:   /*
8:    * Provide command usage instructions :
9:    */
10:  static void
11:  usage(void) {
12:
13:      puts("Usage: fnmatch [options] <pattern> <strings>...");
14:      puts("\nOptions:");
15:      puts("\t-n\tShow non-matches");
16:      puts("\t-e\tFNM_NOESCAPE");
17:      puts("\t-p\tFNM_PATHNAME");
18:      puts("\t-P\tFNM_PERIOD");
19:      puts("\t-d\tFNM_LEADING_DIR");
20:      puts("\t-c\tFNM_CASEFOLD");
21:  }
22:
23:  /*
24:   * Report the flag bits in use as confirmation :
25:   */
26:  static void
27:  report_flags(int flags) {
28:
29:      fputs("Flags:",stdout);
30:      if ( flags & FNM_NOESCAPE )
31:          fputs(" FNM_NOESCAPE",stdout);
32:      if ( flags & FNM_PATHNAME )
33:          fputs(" FNM_PATHNAME",stdout);
34:      if ( flags & FNM_PERIOD )
35:          fputs(" FNM_PERIOD",stdout);
36:      if ( flags & FNM_LEADING_DIR )
37:          fputs(" FNM_LEADING_DIR",stdout);
38:      if ( flags & FNM_CASEFOLD )
39:          fputs(" FNM_CASEFOLD",stdout);
40:      if ( !flags )
41:          puts(" NONE");
42:      else
43:          putchar('\n');
44:  }
45:
46:  /*
47:   * Main program :
48:   */
49:  int
50:  main(int argc,char **argv) {
51:      int x;                      /* Iterator variable */
52:      int z;                      /* General status variable */
53:      int flags = 0;              /* fnmatch(3) flags argument */
54:      int cmdopt_n = 0;           /* When true, report non-matches */
55:      char *pattern;              /* Pattern string for fnmatch(3) */
56:      const char cmdopts[] = "epPdchn"; /* Supported command options */
57:
58:      /*
59:       * Process any command options :
60:       */
61:      while ( (z = getopt(argc,argv,cmdopts)) != -1 )
62:          switch ( z ) {
63:          case 'e':
64:              flags |= FNM_NOESCAPE;      /* -e */
65:              break;
66:          case 'p':
67:              flags |= FNM_PATHNAME;      /* -p */
68:              break;
69:          case 'P':
70:              flags |= FNM_PERIOD;        /* -P */
71:              break;
72:          case 'd':
73:              flags |= FNM_LEADING_DIR;   /* -d */
74:              break;
75:          case 'c':
76:              flags |= FNM_CASEFOLD;      /* -c */
77:              break;
78:          case 'n':
79:              cmdopt_n = 1;               /* -n ; Show non-matches */
80:              break;
81:          case 'h':
82:          default :
83:              usage();
84:              return 1;
85:          }
86:
87:      /*
88:       * We must have a pattern and at least one trial string :
89:       */
90:      if ( optind + 1 >= argc ) {
91:          usage();
92:          return 1;
93:      }
94:
95:      /*
96:       * Pick the pattern string and report the flags that
97:       * are in effect for this run :
98:       */
99:      pattern = argv[optind++];
100:     report_flags(flags);
101:
102:     /*
103:      * Now try pattern against all remaining command
104:      * line arguments :
105:      */
106:     for ( x=optind; x<argc; ++x ) {
107:         z = fnmatch(pattern,argv[x],flags);
108:         /*
109:          * Report matches, or report all, if -n
110:          * option was used :
111:          */
112:         if ( !z || cmdopt_n )
113:             printf("%s: fnmatch('%s','%s',flags)\n",
114:                 !z ? "Matched" : "No match",
115:                 pattern,
116:                 argv[x]);
117:     }
118:
119:     return 0;
120: }

The first portion of the main program parses the command line for options (lines 58-93) and prepares for the test run (lines 99 and 100). The report_flags() function simply reports the flag option bits in effect as a confirmation.

The interesting code is in lines 106-117, where the function fnmatch(3) is called to test each command-line argument. By default, only the matches are reported unless the -n option has been supplied.

The following shows how to compile the program and provoke a usage display with the -h option:

$ make fnmatch
cc -c  -Wall fnmatch.c
cc -o fnmatch fnmatch.o
$ ./fnmatch -h
Usage: fnmatch [options] <pattern> <strings>...

Options:
        -n      Show non-matches
        -e      FNM_NOESCAPE
        -p      FNM_PATHNAME
        -P      FNM_PERIOD
        -d      FNM_LEADING_DIR
        -c      FNM_CASEFOLD
$

From the output you can see that all options except -n apply additional fnmatch(3) flag bits. Initially no flags are in effect.

To make it simpler to perform some of the tests in this chapter, alter your PATH variable as follows:

$ PATH=$PWD:$PATH

Repeating one of the earlier tests, we can use our fnmatch command in place of ls(1):

$ cd /etc
$ fnmatch '*[xyz]*' *
Flags: NONE
Matched: fnmatch('*[xyz]*','exports',flags)
Matched: fnmatch('*[xyz]*','gettytab',flags)
Matched: fnmatch('*[xyz]*','newsyslog.conf',flags)
Matched: fnmatch('*[xyz]*','security',flags)
Matched: fnmatch('*[xyz]*','skeykeys',flags)
Matched: fnmatch('*[xyz]*','syslog.conf',flags)
Matched: fnmatch('*[xyz]*','ttys',flags)
$

Please notice two important things here:

  • The pattern is in single quotes.

  • The remaining arguments are expanded by the shell before our fnmatch command is executed.

If you add the option -n to the command line, the output will also list all of the entries that did not match. Only the command is shown here:

$ fnmatch -n '*[xyz]*' *

Any command-line options must appear before the pattern. After the options, the pattern must be the first command-line argument. All remaining arguments are tested against the pattern.

The FNM_NOESCAPE Flag

The FNM_NOESCAPE flag bit disables the fnmatch(3) capability to escape meta-characters with a backslash (\). To test this, first change to the /tmp directory and create an empty test file named [file]:

$ cd /tmp
$ >'[file]'

Now test the fnmatch command using the escape characters:

$ fnmatch '\[file\]' *
Flags: NONE
Matched: fnmatch('\[file\]','[file]',flags)
$

Of all the files in the /tmp directory, the pattern matched only the filename [file], and it matched it literally. The pattern \[file\] matches because the backslash escape characters indicate that the meta-characters that follow them should be treated as normal characters. Adding the flag FNM_NOESCAPE (option -e) changes things:

$ fnmatch -e '\[file\]' *
Flags: FNM_NOESCAPE
$

In this case, no match is attained. This happens because the leading backslash must now match part of the string. The remainder of the pattern is now a range, since the backslashes are not acting as escape characters when FNM_NOESCAPE is used.
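
The same experiment can be made directly from C. The following is a minimal sketch: the helper name escaped_brackets_match() is hypothetical, and it hard-codes the pattern \[file\] (brackets escaped with backslashes) and the string [file] from the test above.

```c
#include <fnmatch.h>

/*
 * Hypothetical helper: test the escaped pattern \[file\]
 * against the literal name [file] using the given flags.
 * Returns 1 on a match and 0 otherwise. Each backslash is
 * doubled because it sits inside a C string literal.
 */
int
escaped_brackets_match(int flags) {
    return fnmatch("\\[file\\]", "[file]", flags) == 0;
}
```

Calling escaped_brackets_match(0) should report a match, while escaped_brackets_match(FNM_NOESCAPE) should not, because the first backslash is then an ordinary character that cannot match the leading [ of the string.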

The FNM_CASEFOLD Flag

The FNM_CASEFOLD flag allows the programmer to specify that fnmatch(3) ignore the case of the letters when performing the pattern match. This is confirmed with the help of the test program (option -c used):

$ cd /etc
$ fnmatch -c 'HOSTS*' *
Flags: FNM_CASEFOLD
Matched: fnmatch('HOSTS*','hosts',flags)
Matched: fnmatch('HOSTS*','hosts.allow',flags)
Matched: fnmatch('HOSTS*','hosts.equiv',flags)
Matched: fnmatch('HOSTS*','hosts.lpd',flags)
$

In this example, the pattern HOSTS* matches the file hosts, although the case differs.

Warning

The FNM_CASEFOLD flag appears to be a GNU C library extension and may not be available on every UNIX platform. It is supported by FreeBSD and Linux, however.
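
Because the flag is not universal, portable code may want to test for it at compile time. The following is a minimal sketch: the helper name matches_ignoring_case() is hypothetical, and defining _GNU_SOURCE before including <fnmatch.h> is an assumption about what the GNU C library needs in order to expose the macro.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* Assumption: lets glibc expose FNM_CASEFOLD */
#endif
#include <fnmatch.h>

/*
 * Hypothetical helper: case-insensitive pattern match.
 * Returns 1 on a match, 0 on no match, or -1 when this
 * platform does not define FNM_CASEFOLD at all.
 */
int
matches_ignoring_case(const char *pattern, const char *string) {
#ifdef FNM_CASEFOLD
    return fnmatch(pattern, string, FNM_CASEFOLD) == 0;
#else
    (void) pattern;     /* Unused when the flag is unavailable */
    (void) string;
    return -1;
#endif
}
```

Where the flag is available, matches_ignoring_case("HOSTS*", "hosts.allow") should report a match, as in the test run above.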


The FNM_PATHNAME Flag

The FNM_PATHNAME flag adds some pathname semantics to the fnmatch(3) function. This option requires that slashes (/) occurring in patterns must match slashes in the supplied input string. This makes it possible to perform directory and file pattern matches more intelligently. To perform this test, first create a temporary directory in /tmp as follows:

$ make one
mkdir /tmp/one
mkdir /tmp/one/log
mkdir /tmp/one/two
mkdir /tmp/one/two/log
date >/tmp/one/log/date1.log
date >/tmp/one/log/.date3
date >/tmp/one/two/log/date2.log
$

From this, you can see that a number of subdirectories are created, and that three files are created with the date(1) command: two log files and one file with a leading period (.date3). Now perform the following:

$ fnmatch '/tmp/*/log/*.log' `find /tmp/one`
Flags: NONE
Matched: fnmatch('/tmp/*/log/*.log','/tmp/one/log/date1.log',flags)
Matched: fnmatch('/tmp/*/log/*.log','/tmp/one/two/log/date2.log',flags)
$

If you look at this output carefully, you will see that one match is not intended. The first match makes sense because the subdirectory one matches the first *, and the filename date1 matches the second *.

In the second case, however, the first * actually matches the string one/two, and the second * matches the date2 in date2.log. The spirit of this match suggests that there should have only been one directory level between /tmp/ and /log/*.log.

To accomplish this, the FNM_PATHNAME flag (option -p) must be enabled:

$ fnmatch -p '/tmp/*/log/*.log' `find /tmp/one`
Flags: FNM_PATHNAME
Matched: fnmatch('/tmp/*/log/*.log','/tmp/one/log/date1.log',flags)
$

The results now agree with what was expected.

Note

FNM_FILE_NAME is provided on some UNIX platforms as a synonym for FNM_PATHNAME.
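
The effect of FNM_PATHNAME can also be checked from C without the test program. The following is a minimal sketch: the helper name matches_one_level() is hypothetical, and the paths in the usage note are the ones from the experiment above.

```c
#include <fnmatch.h>

/*
 * Hypothetical helper: match path against pattern with
 * FNM_PATHNAME set, so that * and ? cannot match a slash.
 * Returns 1 on a match and 0 otherwise.
 */
int
matches_one_level(const char *pattern, const char *path) {
    return fnmatch(pattern, path, FNM_PATHNAME) == 0;
}
```

With the pattern "/tmp/*/log/*.log", the path "/tmp/one/log/date1.log" should match, while "/tmp/one/two/log/date2.log" should not, because the first * is no longer allowed to span one/two.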


The FNM_PERIOD Flag

This flag causes strings that have leading periods to match only when the pattern has leading periods. Another way to say this is that * and ?, for example, will not match a leading period in the string with the flag FNM_PERIOD enabled. This also applies to ranges.

Usually the FNM_PERIOD flag is used in combination with the FNM_PATHNAME flag. The FNM_PATHNAME flag causes a period to be considered a leading period, if it follows a slash (/) character. Assuming that you still have the directory /tmp/one from the last experiment, perform the following pattern test using only the FNM_PATHNAME (-p) option:

$ fnmatch -p '/tmp/*/log/*' `find /tmp/one`
Flags: FNM_PATHNAME
Matched: fnmatch('/tmp/*/log/*','/tmp/one/log/date1.log',flags)
Matched: fnmatch('/tmp/*/log/*','/tmp/one/log/.date3',flags)
$

Notice that in this experiment the pattern specifies * for the last filename component. Using this pattern, two files matched: date1.log and .date3. Adding the FNM_PERIOD flag (option -P) causes the following results to be displayed instead:

$ fnmatch -pP '/tmp/*/log/*' `find /tmp/one`
Flags: FNM_PATHNAME FNM_PERIOD
Matched: fnmatch('/tmp/*/log/*','/tmp/one/log/date1.log',flags)
$

In this output, fnmatch(3) does not permit the leading period in .date3 to match the * pattern character. If your objective were to match the files prefixed with periods, you would alter the pattern string:

$ fnmatch -pP '/tmp/*/log/.*' `find /tmp/one`
Flags: FNM_PATHNAME FNM_PERIOD
Matched: fnmatch('/tmp/*/log/.*','/tmp/one/log/.date3',flags)
$

In this example, the period (.) was added to the pattern string in order to effect a match to .date3.
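
Combining the two flags in C is a matter of ORing them together. The following is a minimal sketch: the helper name matches_like_shell() is hypothetical, chosen because this combination resembles the way a shell hides dot files from wildcards.

```c
#include <fnmatch.h>

/*
 * Hypothetical helper: match with both FNM_PATHNAME and
 * FNM_PERIOD set, so that wildcards match neither a slash
 * nor a period that begins a pathname component.
 * Returns 1 on a match and 0 otherwise.
 */
int
matches_like_shell(const char *pattern, const char *path) {
    return fnmatch(pattern, path, FNM_PATHNAME | FNM_PERIOD) == 0;
}
```

Using the files from the experiment, the pattern "/tmp/*/log/*" should match "/tmp/one/log/date1.log" but not "/tmp/one/log/.date3", whereas "/tmp/*/log/.*" should match the latter.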

The FNM_LEADING_DIR Flag

This option causes the pattern match to occur on a directory component level. After the pattern match, anything that follows starting with a slash (/) is ignored for pattern matching purposes.

$ cd /tmp
$ fnmatch -d 'on*' `find one`
Flags: FNM_LEADING_DIR
Matched: fnmatch('on*','one',flags)
Matched: fnmatch('on*','one/log',flags)
Matched: fnmatch('on*','one/log/date1.log',flags)
Matched: fnmatch('on*','one/log/.date3',flags)
Matched: fnmatch('on*','one/two',flags)
Matched: fnmatch('on*','one/two/log',flags)
Matched: fnmatch('on*','one/two/log/date2.log',flags)
$

The documentation does not suggest that FNM_PATHNAME is required. Experiments suggest that FNM_LEADING_DIR works with or without the FNM_PATHNAME flag.

Warning

The FNM_LEADING_DIR flag appears to be a GNU C library extension and may not be available on every UNIX platform. It is supported by FreeBSD and Linux, however.
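
As with any platform-specific flag, a compile-time test keeps the code portable. The following is a minimal sketch: the helper name matches_leading_dir() is hypothetical, and defining _GNU_SOURCE is an assumption about what the GNU C library needs in order to expose the macro. Because a * already matches slashes when FNM_PATHNAME is not set, a pattern without wildcards, such as "one", shows the effect of the flag more clearly than "on*" does.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* Assumption: lets glibc expose FNM_LEADING_DIR */
#endif
#include <fnmatch.h>

/*
 * Hypothetical helper: match pattern against the leading
 * directory portion of path. Returns 1 on a match, 0 on
 * no match, or -1 when FNM_LEADING_DIR is not defined.
 */
int
matches_leading_dir(const char *pattern, const char *path) {
#ifdef FNM_LEADING_DIR
    return fnmatch(pattern, path, FNM_LEADING_DIR) == 0;
#else
    (void) pattern;     /* Unused when the flag is unavailable */
    (void) path;
    return -1;
#endif
}
```

Where the flag is available, matches_leading_dir("one", "one/log/date1.log") should report a match, because everything from the first slash after one is ignored.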

