The glob(3) Function

The glob(3) function is another way for a process to gather a list of file and directory names. Unlike the fnmatch(3) function, the glob(3) function actually performs directory searches. The function synopsis for glob(3) and globfree(3) is as follows:

#include <glob.h>

int glob(
    const char *pattern,
    int flags,
    int (*errfunc)(const char *, int),
    glob_t *pglob);

void globfree(glob_t *pglob);

typedef struct {
    int    gl_pathc;       /* count of total paths so far */
    int    gl_matchc;      /* count of paths matching pattern */
    int    gl_offs;        /* reserved at beginning of gl_pathv */
    int    gl_flags;       /* returned flags */
    char   **gl_pathv;     /* list of paths matching pattern */
} glob_t;

The first argument pattern for glob(3) is a shell pattern, like the patterns used by fnmatch(3). However, the argument flags uses a different set of flags that will be described shortly. Argument errfunc is an optional function pointer and must be a null pointer when it is not used. The final argument pglob is a pointer to a glob_t structure.

The function globfree(3) should be called after a successful call to glob(3), once the information contained in the glob_t structure is no longer required. This function releases the memory occupied by the array member gl_pathv and perhaps other implementation-defined storage.

The glob_t structure member gl_pathv is a returned array of matching filenames. The member gl_pathc is a count of how many string pointers are contained in gl_pathv. When gl_pathc is zero, there is no gl_pathv array allocated, and it should not be referenced. When the gl_pathv array is allocated, the last member of the array is followed by a null pointer.
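The rules for gl_pathc and gl_pathv can be sketched as a small helper. The function name copy_first_match() is hypothetical and not part of any library. The sketch also assumes a POSIX system, where a pattern that matches nothing is reported with the status GLOB_NOMATCH and where the glob_t structure is left in a defined state after a failed call, so that globfree(3) may still be called.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy the first path that matches pattern
 * into buf. Returns the number of matches, or -1 on failure.
 */
static int
copy_first_match(const char *pattern, char *buf, size_t len) {
    glob_t g;
    int n = -1;
    int z;

    assert(pattern != NULL && buf != NULL && len > 0);
    buf[0] = '\0';

    z = glob(pattern, 0, NULL, &g);
    if ( z == 0 ) {
        n = (int)g.gl_pathc;
        if ( n > 0 )            /* Touch gl_pathv only if gl_pathc > 0 */
            snprintf(buf, len, "%s", g.gl_pathv[0]);
    }
#ifdef GLOB_NOMATCH
    else if ( z == GLOB_NOMATCH )
        n = 0;                  /* POSIX reports "no match" as a status */
#endif
    globfree(&g);               /* Assumes g is defined even on failure */
    return n;
}
```

The test on the count before gl_pathv[0] is read mirrors the rule above: when no paths were matched, the array must not be referenced.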

The member gl_flags is used by glob(3) to return flag bits. Flag bit GLOB_MAGCHAR is one flag that may be returned in this member to indicate that the pattern argument contained at least one meta-character.

The member gl_matchc contains the number of pathnames matched by the current glob(3) call. Since glob(3) can be called to append to the gl_pathv array, gl_matchc is useful for determining how many paths were appended by the current function call.

The member gl_offs must be initialized prior to the first call to glob(3) for the given glob_t structure used, when the flag GLOB_DOOFFS is set. This member indicates how many initial gl_pathv array entries to reserve as null pointers. If you do not need to reserve any array entries, then initialize this value to zero (not using flag GLOB_DOOFFS also will work).

The flag GLOB_ERR causes glob(3) to stop the directory scan at the first sign of trouble. By default, glob(3) ignores directory scan errors and attempts to match as much as possible. Using flag GLOB_ERR changes this behavior so that glob(3) will exit with the first error encountered.

Multiple calls to glob(3) are permitted, to gather additional member entries. The GLOB_ERR flag applied in an earlier call will influence later calls when the same pglob argument is used. This is the result of the GLOB_ERR flag being saved in the gl_flags member of the glob_t structure.
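The interaction between the error callback and GLOB_ERR can be sketched as follows. The names count_errors() and scan_with_callback() are hypothetical. The callback returns zero, which asks glob(3) to continue, so the scan stops early only when the caller also requests GLOB_ERR. The unconditional globfree(3) call assumes a POSIX system, which leaves the glob_t structure in a defined state even after a failure.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>
#include <string.h>

static int scan_errors = 0;     /* Errors seen by the callback */

/*
 * Error callback: report the problem and count it. Returning
 * zero lets glob(3) continue, unless GLOB_ERR is in effect.
 */
static int
count_errors(const char *path, int e) {
    fprintf(stderr, "%s: %s\n", path, strerror(e));
    ++scan_errors;
    return 0;
}

/*
 * Hypothetical helper: scan pattern and return the glob(3) status.
 * The number of directory errors is stored in *nerrors.
 */
static int
scan_with_callback(const char *pattern, int stop_on_error, int *nerrors) {
    glob_t g;
    int flags = stop_on_error ? GLOB_ERR : 0;
    int z;

    assert(pattern != NULL && nerrors != NULL);

    scan_errors = 0;
    z = glob(pattern, flags, count_errors, &g);
    *nerrors = scan_errors;

    globfree(&g);               /* Assumes g is defined even on failure */
    return z;
}
```

A real program would normally use the matches before releasing them. Here they are discarded because only the error handling is of interest.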

Return Values for glob(3)

When glob(3) returns normally, the value zero is returned. However, when an error occurs, the value GLOB_NOSPACE or GLOB_ABEND is returned instead.

When GLOB_NOSPACE is returned, this indicates that glob(3) was unable to allocate or reallocate memory. This might be a sign that you are failing to call globfree(3).

The return value GLOB_ABEND indicates that the directory scan was stopped. An error may have occurred while scanning the directory and flag bit GLOB_ERR was set. Alternatively, the errfunc function may have returned non-zero to cause the scan to be stopped.

Before the individual glob(3) flags are discussed, an example program is presented in Listing 20.2. This program will permit you to experiment with the various glob(3) flags and patterns.

Code Listing 20.2. glob.c—Exerciser for the glob(3) and globfree(3) Functions
1:   /* glob.c */
2:
3:   #include <stdio.h>
4:   #include <stdlib.h>
5:   #include <unistd.h>
6:   #include <errno.h>
7:   #include <string.h>
8:   #include <glob.h>
9:
10:  /*
11:   * Provide command usage instructions :
12:   */
13:  static void
14:  usage(void) {
15:
16:      puts("Usage: glob [options] pattern...");
17:      puts("Options:");
18:      puts("	-a	GLOB_APPEND");
19:      puts("	-c	GLOB_NOCHECK");
20:      puts("	-o n	GLOB_DOOFFS");
21:      puts("	-e	GLOB_ERR");
22:      puts("	-m	GLOB_MARK");
23:      puts("	-n	GLOB_NOSORT");
24:      puts("	-B	GLOB_BRACE");
25:      puts("	-N	GLOB_NOMAGIC");
26:      puts("	-Q	GLOB_QUOTE");
27:      puts("	-T	GLOB_TILDE");
28:  }
29:
30:  /*
31:   * Report the flag bits in use as confirmation :
32:   */
33:  static void
34:  report_flags(int flags) {
35:
36:      fputs("Flags:",stdout);
37:      if ( flags & GLOB_APPEND )
38:          fputs(" GLOB_APPEND",stdout);
39:      if ( flags & GLOB_DOOFFS )
40:          fputs(" GLOB_DOOFFS",stdout);
41:      if ( flags & GLOB_ERR )
42:          fputs(" GLOB_ERR",stdout);
43:      if ( flags & GLOB_MARK )
44:          fputs(" GLOB_MARK",stdout);
45:      if ( flags & GLOB_NOSORT )
46:          fputs(" GLOB_NOSORT",stdout);
47:      if ( flags & GLOB_NOCHECK )
48:          fputs(" GLOB_NOCHECK",stdout);
49:      if ( flags & GLOB_BRACE )
50:          fputs(" GLOB_BRACE",stdout);
51:      if ( flags & GLOB_MAGCHAR )
52:          fputs(" GLOB_MAGCHAR",stdout);
53:      if ( flags & GLOB_NOMAGIC )
54:          fputs(" GLOB_NOMAGIC",stdout);
55:      if ( flags & GLOB_QUOTE )
56:          fputs(" GLOB_QUOTE",stdout);
57:      if ( flags & GLOB_TILDE )
58:          fputs(" GLOB_TILDE",stdout);
59:      if ( !flags )
60:          puts(" NONE");
61:      else
62:          putchar('\n');
63:  }
64:
65:  /*
66:   * Error callback function :
67:   */
68:  static int
69:  errfunc(const char *path,int e) {
70:      printf("%s: %s\n",strerror(e),path);
71:      return 0;
72:  }
73:
74:  /*
75:   * Report the glob_t results :
76:   */
77:  static void
78:  report_glob(glob_t *gp) {
79:      int x;
80:      int g_offs = 0;             /* glob offset */
81:
82:      if ( gp->gl_pathc < 1 ) {
83:          puts("There are no glob results.");
84:          return;
85:      }
86:      printf("There were %d matches returned:\n",gp->gl_pathc);
87:
88:      if ( gp->gl_flags & GLOB_DOOFFS )
89:          g_offs = gp->gl_offs;   /* Allow for offset */
90:
91:      for ( x=0; x < gp->gl_pathc + g_offs; ++x )
92:          printf("%3d: %s\n",
93:              x,
94:              gp->gl_pathv[x] ? gp->gl_pathv[x] : "<NULL>");
95:
96:      report_flags(gp->gl_flags);
97:      putchar('\n');
98:  }
99:
100: /*
101:  * Main program :
102:  */
103: int
104: main(int argc,char **argv) {
105:     int z;                      /* General status */
106:     glob_t g;                   /* The glob area */
107:     int flags = 0;              /* All other flags */
108:     int a = 0;                  /* GLOB_APPEND flag */
109:     int offs = 0;               /* Offset */
110:     const char cmdopts[] = "aco:emnBNQTh";
111:
112:     /*
113:      * Process any command options :
114:      */
115:     while ( (z = getopt(argc,argv,cmdopts)) != -1 )
116:         switch ( z ) {
117:         case 'a':
118:             a = GLOB_APPEND;
119:             break;
120:         case 'o':
121:             flags |= GLOB_DOOFFS;
122:             offs = atoi(optarg);
123:             break;
124:         case 'e':
125:             flags |= GLOB_ERR;
126:             break;
127:         case 'm':
128:             flags |= GLOB_MARK;
129:             break;
130:         case 'n':
131:             flags |= GLOB_NOSORT;
132:             break;
133:         case 'c':
134:             flags |= GLOB_NOCHECK;
135:             break;
136:         case 'B':
137:             flags |= GLOB_BRACE;
138:             break;
139:         case 'N':
140:             flags |= GLOB_NOMAGIC;
141:             break;
142:         case 'Q':
143:             flags |= GLOB_QUOTE;
144:             break;
145:         case 'T':
146:             flags |= GLOB_TILDE;
147:             break;
148:         case 'h':
149:         default :
150:             usage();
151:             return 1;
152:         }
153:
154:     /*
155:      * We must have at least one pattern :
156:      */
157:     if ( optind >= argc ) {
158:         usage();
159:         return 1;
160:     }
161:
162:     /*
163:      * Pick the pattern string and report the flags that
164:      * are in effect for this run :
165:      */
166:     report_flags(flags|a);
167:
168:     /*
169:      * Now try pattern against all remaining command
170:      * line arguments :
171:      */
172:     for ( ; optind < argc; ++optind, flags |= a ) {
173:         /*
174:          * Invoke glob(3) to scan directories :
175:          */
176:         g.gl_offs = offs;       /* Offset, if any */
177:         z = glob(argv[optind],flags,errfunc,&g);
178:         if ( z ) {
179:             if ( z == GLOB_NOSPACE )
180:                 fputs("glob(3) ran out of memory\n",stderr);
181:             else if ( z == GLOB_ABEND )
182:                 fputs("glob(3): GLOB_ERR/errfunc\n",stderr);
183:             return 1;
184:         }
185:
186:         /*
187:          * Report glob(3) findings, unless GLOB_APPEND :
188:          */
189:         if ( !a ) {              /* If not GLOB_APPEND */
190:             report_glob(&g);    /* Report matches */
191:             globfree(&g);       /* Free gl_pathv[] etc. */
192:         } else {
193:             /*
194:              * GLOB_APPEND requested. Just accumulate
195:              * glob(3) results, but here we report the
196:              * number of matches made with each pattern:
197:              */
198:             printf("Pattern '%s'got %d matches\n",
199:                 argv[optind],
200:                 g.gl_matchc);
201:         }
202:     }
203:
204:     /*
205:      * If GLOB_APPEND used, then report everything at
206:      * the end :
207:      */
208:     if ( a ) {                   /* If GLOB_APPEND */
209:         report_glob(&g);        /* Report appended matches */
210:         globfree(&g);           /* Free gl_pathv[] etc. */
211:     }
212:
213:     return 0;
214: }

The program in Listing 20.2 is similar in many respects to Listing 20.1. Lines 115–160 parse the command-line options, which enable various glob(3) flags. Note that option -a causes flag bit GLOB_APPEND to be stored in variable a, which is initialized to zero in line 108. This flag is kept separate from the other flags, which are stored in variable flags, because GLOB_APPEND cannot be used the first time that glob(3) is called (line 177). However, the for loop ORs a into flags at the end of each iteration, ensuring that GLOB_APPEND is used in successive iterations.

After all options are parsed from the command line, the flags in effect are reported in line 166 (note that the input argument is flags|a so that GLOB_APPEND is included).

The int variable optind will point to the first non-option command-line argument after the getopt(3) loop has completed. These remaining command-line arguments are used as input patterns to glob(3) in the for loop of lines 172–202.

If the offset option -o is used, the variable offs contains this offset. Line 176 assigns this offset value to g.gl_offs. This assignment is significant only if the GLOB_DOOFFS flag is set in variable flags, when the -o option is processed from the command line.

The function glob(3) is called in line 177. The return value z is tested and reported in lines 178–184. Line 189 tests to see if variable a is zero. When a is zero, GLOB_APPEND is not being used, and the results for each glob(3) pattern are reported immediately after each call (lines 190 and 191). Otherwise, when GLOB_APPEND has been requested, only the number of matches made by the current glob(3) call is reported in lines 198–200. The accumulated GLOB_APPEND results are reported after the for loop ends, in lines 209 and 210.

Now examine the report_glob() function in lines 77–98. The if statement in line 82 is important, because if the glob_t member gl_pathc is zero, then gl_pathv is not allocated and should not be referenced. The program executes the return statement in line 84, when there are no results to report.

Note also in line 88 that the if statement tests for flag GLOB_DOOFFS. If it is present, you must allow for the offset when iterating through the gl_pathv array of pointers. Notice how the for loop allows for the offset g_offs in its test. This allowance is necessary because the loop starts at x=0.

To compile and provoke usage information from the program in Listing 20.2, perform the following:

$ make glob
cc -c  -Wall glob.c
cc -o glob glob.o
$ ./glob -h
Usage: glob [options] pattern...
Options:
        -a      GLOB_APPEND
        -c      GLOB_NOCHECK
        -o n    GLOB_DOOFFS
        -e      GLOB_ERR
        -m      GLOB_MARK
        -n      GLOB_NOSORT
        -B      GLOB_BRACE
        -N      GLOB_NOMAGIC
        -Q      GLOB_QUOTE
        -T      GLOB_TILDE
$

Lowercase option letters represent standard flags that are available for glob(3). Uppercase options represent extension flags or non-universal ones. Option -o is the only option that takes an argument. It represents a numeric offset to be used with GLOB_DOOFFS. The -e option adds the GLOB_ERR flag, but this is not explored in the examples that follow. It is there for your own experimentation.

Flag GLOB_DOOFFS

This flag indicates that glob_t member gl_offs is being used to reserve a number of null pointers at the start of the gl_pathv array (allocated by glob(3)). When flag GLOB_DOOFFS is used, you must initialize gl_offs prior to calling glob(3).

This sounds like a strange thing to do, but it makes a lot of sense when you are about to invoke execvp(3) to start a new command. The following prepares to execute the command cc -c -g *.c:

glob_t g;

g.gl_offs = 3;
glob("*.c",GLOB_DOOFFS,NULL,&g);

g.gl_pathv[0] = "cc";
g.gl_pathv[1] = "-c";
g.gl_pathv[2] = "-g";

execvp("cc",g.gl_pathv);

The variable g is the glob_t structure being used. Three entries are reserved in the g.gl_pathv array by assigning the value 3 to g.gl_offs (flag GLOB_DOOFFS is present in the flags argument). The call to glob(3) searches the directory for the pattern *.c.

The point here is that g.gl_pathv[0] through g.gl_pathv[2] have been reserved for your own use. In this example, these reserved elements are used for the C compiler's first three arguments. This makes the result convenient to use with the library function execvp(3).
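A slightly more defensive variation on this idea is sketched below. The helper name build_cc_argv() is hypothetical. It checks the glob(3) return value and the match count before it stores anything in the reserved slots, because gl_pathv must not be referenced when nothing was matched. Calling globfree(3) after a failed call assumes a POSIX system, which leaves the structure in a defined state.

```c
#include <assert.h>
#include <glob.h>
#include <stddef.h>

/*
 * Hypothetical helper: expand pattern into an argument vector
 * that begins with "cc -c -g". Returns 0 on success, after which
 * the caller may pass g->gl_pathv to execvp(3) or release it with
 * globfree(3). Returns -1 on failure, with nothing left to free.
 */
static int
build_cc_argv(const char *pattern, glob_t *g) {
    static char cmd[] = "cc";
    static char opt_c[] = "-c";
    static char opt_g[] = "-g";

    assert(pattern != NULL && g != NULL);

    g->gl_offs = 3;             /* Reserve three leading entries */
    if ( glob(pattern, GLOB_DOOFFS, NULL, g) != 0 || g->gl_pathc == 0 ) {
        globfree(g);            /* Assumes g is defined even on failure */
        return -1;              /* No source files to compile */
    }

    g->gl_pathv[0] = cmd;       /* Fill in the reserved entries */
    g->gl_pathv[1] = opt_c;
    g->gl_pathv[2] = opt_g;
    return 0;
}
```

The reserved entries point to static storage rather than allocated memory. This works on the assumption that globfree(3) releases only the matched pathnames and the array itself, which is the behavior of the implementations the author of this sketch is aware of.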

Try one experiment without using GLOB_DOOFFS so that you can then compare results with the next experiment. Make sure to enclose your patterns in single quotes to keep the shell from expanding them:

$ ./glob '*.c'
Flags: NONE
There were 2 matches returned:
  0: fnmatch.c
  1: glob.c
Flags: GLOB_MAGCHAR
$

This example uses no command-line options and provides one pattern '*.c'. In this result, you see two filenames were returned with the glob_t member gl_flags containing the flag GLOB_MAGCHAR. The GLOB_MAGCHAR flag, when returned, indicates that at least one meta-character was found in the pattern.

Now try the same experiment, but add an offset using the -o option. This experiment uses an offset of 3:

$ ./glob -o3 '*.c'
Flags: GLOB_DOOFFS
There were 2 matches returned:
  0: <NULL>
  1: <NULL>
  2: <NULL>
  3: fnmatch.c
  4: glob.c
Flags: GLOB_DOOFFS GLOB_MAGCHAR

$

Notice how three null pointers were reserved at the start of the gl_pathv array for your own use. The gl_flags member also reports the additional flag GLOB_DOOFFS that was supplied as input to glob(3).

The GLOB_APPEND Flag

The flag GLOB_APPEND indicates that the glob_t structure is to have new matched pathnames appended to it instead of initializing it. The following example shows how this is done:

glob_t g;

g.gl_offs = 3;
glob("*.c",GLOB_DOOFFS,NULL,&g);
glob("*.C",GLOB_DOOFFS|GLOB_APPEND,NULL,&g);

The first call to glob(3) initializes the glob_t variable g and adds pathnames that match the pattern *.c. The second call with the flag GLOB_APPEND causes glob(3) to assume that g has already been initialized. Matches to *.C are then appended to the existing collection in g.gl_pathv.
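The same technique extends to any number of patterns. The following is a minimal sketch in which the helper name collect_matches() is hypothetical. It assumes a POSIX system, where a pattern that matches nothing is reported as GLOB_NOMATCH and the glob_t structure remains usable afterward. The sketch treats that status as harmless and moves on to the next pattern.

```c
#include <assert.h>
#include <glob.h>
#include <stddef.h>

/*
 * Hypothetical helper: gather the matches for n patterns into *g.
 * Returns 0 on success (the caller must call globfree(3)), or -1
 * on failure (the results have already been released).
 */
static int
collect_matches(const char *const patterns[], size_t n, glob_t *g) {
    int flags = 0;              /* No GLOB_APPEND for the first call */
    size_t i;

    assert(patterns != NULL && g != NULL && n > 0);

    for ( i = 0; i < n; ++i ) {
        int z = glob(patterns[i], flags, NULL, g);

#ifdef GLOB_NOMATCH
        if ( z == GLOB_NOMATCH )
            z = 0;              /* A pattern without matches is not fatal */
#endif
        if ( z != 0 ) {
            globfree(g);
            return -1;
        }
        flags = GLOB_APPEND;    /* g is initialized from now on */
    }
    return 0;
}
```

Note that GLOB_APPEND is left out of the first call only. This is the same arrangement that Listing 20.2 achieves by keeping the flag in a separate variable.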

Now test this feature as follows:

$ ./glob -a '*.c' '*.o'
Flags: GLOB_APPEND
Pattern '*.c'got 2 matches
Pattern '*.o'got 1 matches
There were 3 matches returned:
  0: fnmatch.c
  1: glob.c
  2: glob.o
Flags: GLOB_APPEND GLOB_MAGCHAR

$

The output shows how the first pattern '*.c' collected two matches, and the pattern '*.o' appended one more match. The result of all matches is reported at the end, and you can see that three final pathnames are reported.

The GLOB_MARK Flag

The GLOB_MARK flag marks directory entries by appending a slash (/) to them. Files are left as they are. The following example illustrates (note the -m option):

$ ./glob -am '/b*' '/etc/hosts'
Flags: GLOB_APPEND GLOB_MARK
Pattern '/b*'got 2 matches
Pattern '/etc/hosts'got 1 matches
There were 3 matches returned:
  0: /bin/
  1: /boot/
  2: /etc/hosts
Flags: GLOB_APPEND GLOB_MARK

$

Directories /bin and /boot were marked with a trailing slash. The filename /etc/hosts was not.
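One practical use of GLOB_MARK is to tell directories from files without a separate stat(2) call. The following is a minimal sketch; the helper name marked_as_directory() is hypothetical, and the unconditional globfree(3) call assumes a POSIX system that leaves the structure defined after a failure.

```c
#include <assert.h>
#include <glob.h>
#include <string.h>

/*
 * Hypothetical helper: expand pattern with GLOB_MARK and examine
 * the first result. Returns 1 if it was marked as a directory,
 * 0 if it was not, or -1 if there was no result at all.
 */
static int
marked_as_directory(const char *pattern) {
    glob_t g;
    int result = -1;

    assert(pattern != NULL);

    if ( glob(pattern, GLOB_MARK, NULL, &g) == 0 && g.gl_pathc > 0 ) {
        const char *first = g.gl_pathv[0];
        size_t len = strlen(first);

        /* A directory has had a slash appended to its name */
        result = len > 0 && first[len - 1] == '/';
    }
    globfree(&g);               /* Assumes g is defined even on failure */
    return result;
}
```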

The GLOB_NOSORT Flag

The GLOB_NOSORT flag disables the sort feature of glob(3). The following example shows the default sorted result:

$ ./glob '/etc/h*'
Flags: NONE
There were 5 matches returned:
  0: /etc/host.conf
  1: /etc/hosts
  2: /etc/hosts.allow
  3: /etc/hosts.equiv
  4: /etc/hosts.lpd
Flags: GLOB_MAGCHAR

$

Adding the GLOB_NOSORT flag by using the -n option yields unsorted results:

$ ./glob -n '/etc/h*'
Flags: GLOB_NOSORT
There were 5 matches returned:
  0: /etc/hosts
  1: /etc/hosts.allow
  2: /etc/host.conf
  3: /etc/hosts.equiv
  4: /etc/hosts.lpd
Flags: GLOB_NOSORT GLOB_MAGCHAR

$

However, note that sorting and not sorting affect only the current glob(3) call when GLOB_APPEND is used. Consequently, while the default is to sort, appended results are not sorted ahead of earlier results. You can test this for yourself:

$ ./glob -a '/etc/h*' '/b*'
Flags: GLOB_APPEND
Pattern '/etc/h*'got 5 matches
Pattern '/b*'got 2 matches
There were 7 matches returned:
  0: /etc/host.conf
  1: /etc/hosts
  2: /etc/hosts.allow
  3: /etc/hosts.equiv
  4: /etc/hosts.lpd
  5: /bin
  6: /boot
Flags: GLOB_APPEND GLOB_MAGCHAR

$

Although the default suggests that the gl_pathv array should be sorted, it is sorted only within pattern groups. The first pattern matches for '/etc/h*' are sorted, but the later matches for pattern '/b*' are not sorted ahead of the earlier match set.
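If your program depends on the ordering, you can verify it rather than assume it. The sketch below uses a hypothetical helper named results_are_sorted(). It compares neighboring entries with strcoll(3), on the assumption that glob(3) orders its results according to the current collating sequence, as POSIX describes. In a program that never calls setlocale(3), this is plain byte order.

```c
#include <assert.h>
#include <glob.h>
#include <string.h>

/*
 * Hypothetical helper: expand pattern with the given flags and
 * check the order of the results. Returns 1 if they are in
 * collating order, 0 if not, or -1 if glob(3) failed.
 */
static int
results_are_sorted(const char *pattern, int flags) {
    glob_t g;
    int sorted = -1;
    size_t i;

    assert(pattern != NULL);

    if ( glob(pattern, flags, NULL, &g) == 0 ) {
        sorted = 1;
        for ( i = 1; i < g.gl_pathc; ++i )
            if ( strcoll(g.gl_pathv[i - 1], g.gl_pathv[i]) > 0 )
                sorted = 0;     /* A neighboring pair is out of order */
    }
    globfree(&g);               /* Assumes g is defined even on failure */
    return sorted;
}
```

Calling this helper with GLOB_NOSORT may return either 0 or 1, since unsorted results can happen to arrive in order.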

The GLOB_QUOTE Flag

By default, there is no quoting capability in glob(3). Applying the flag GLOB_QUOTE allows glob(3) to interpret a backslash (\) as a quote meta-character. The quote character causes the character following to be treated literally, even if it is a meta-character. The example illustrates this:

$ date >'*.c'
$ ./glob '*.c'
Flags: NONE
There were 3 matches returned:
  0: *.c
  1: fnmatch.c
  2: glob.c
Flags: GLOB_MAGCHAR

$

The example has carefully created a file named *.c that contains the current date and time. Without any special options, the ./glob program picks up all files ending in the suffix .c. To select only the file *.c, you need the quoting capability of GLOB_QUOTE (option -Q) and a backslash in front of the asterisk:

$ ./glob -Q '\*.c'
Flags: GLOB_QUOTE
There were 1 matches returned:
  0: *.c
Flags: GLOB_QUOTE

$

Here, glob(3) interprets the asterisk (*) literally, because it is preceded by the quote character backslash (\) while the option GLOB_QUOTE is active.
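When a filename comes from user input or from a directory listing, the backslashes must be added by the program. The sketch below shows one way to do this with a hypothetical helper named quote_for_glob(). The set of characters it quotes (the asterisk, question mark, brackets, and the backslash itself) is an assumption that covers the basic shell pattern characters only. Note also that POSIX systems enable backslash quoting by default and provide the flag GLOB_NOESCAPE to turn it off, so GLOB_QUOTE may be unnecessary or unavailable there.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy name into out, placing a backslash
 * in front of each pattern meta-character. Returns 0 on success,
 * or -1 if out is too small to hold the result.
 */
static int
quote_for_glob(const char *name, char *out, size_t outlen) {
    size_t used = 0;

    assert(name != NULL && out != NULL);

    for ( ; *name != '\0'; ++name ) {
        if ( strchr("*?[]\\", *name) != NULL ) {
            if ( used + 1 >= outlen )
                return -1;
            out[used++] = '\\';     /* Quote the next character */
        }
        if ( used + 1 >= outlen )
            return -1;
        out[used++] = *name;
    }
    if ( used >= outlen )
        return -1;
    out[used] = '\0';
    return 0;
}
```

Given the filename *.c, this helper produces the same quoted pattern that was typed by hand in the session above.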

The GLOB_NOCHECK Flag

Normally, when a pattern does not match, no results are returned. If you want to have the pattern returned as a result when no matches are found, add the GLOB_NOCHECK flag (option -c below):

$ ./glob '*.xyz'
Flags: NONE
There are no glob results.
$ ./glob -c '*.xyz'
Flags: GLOB_NOCHECK
There were 1 matches returned:
  0: *.xyz
Flags: GLOB_NOCHECK GLOB_MAGCHAR

$

In the first example, notice how no matches were found. Adding option -c causes the pattern itself (*.xyz) to be returned instead of no results.
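This is the behavior that a shell shows when a word matches no files: the word is passed along unchanged. A minimal sketch of a check for this case follows, using a hypothetical helper named left_unexpanded().

```c
#include <assert.h>
#include <glob.h>
#include <string.h>

/*
 * Hypothetical helper: expand pattern with GLOB_NOCHECK. Returns 1
 * if the only result is the pattern text itself, 0 if the results
 * differ from the pattern, or -1 if glob(3) failed.
 *
 * A result of 1 is also produced by a literal name that exists,
 * such as ".", since it expands to itself.
 */
static int
left_unexpanded(const char *pattern) {
    glob_t g;
    int result = -1;

    assert(pattern != NULL);

    if ( glob(pattern, GLOB_NOCHECK, NULL, &g) == 0 && g.gl_pathc > 0 )
        result = g.gl_pathc == 1
              && strcmp(g.gl_pathv[0], pattern) == 0;

    globfree(&g);               /* Assumes g is defined even on failure */
    return result;
}
```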

The GLOB_ALTDIRFUNC Flag

This flag is documented by FreeBSD as an extension to glob(3) to enable programs such as restore(8) to provide globbing from directories stored on other media. The following additional glob_t members can be initialized with function pointers. When the flag GLOB_ALTDIRFUNC is used, these function pointers will be used in place of the glob(3) default functions for searching directories:

void *(*gl_opendir)(const char * name);
struct dirent *(*gl_readdir)(void *);
void (*gl_closedir)(void *);
int (*gl_lstat)(const char *name, struct stat *st);
int (*gl_stat)(const char *name, struct stat *st);

The program in Listing 20.2 does not support the GLOB_ALTDIRFUNC flag.

The GLOB_BRACE Flag

The GLOB_BRACE flag enables glob(3) to support csh(1) pattern groups that are specified between braces. The following example illustrates GLOB_BRACE (option -B):

$ ./glob -B '{*.c,*.o}'
Flags: GLOB_BRACE
There were 4 matches returned:
  0: fnmatch.c
  1: glob.c
  2: fnmatch.o
  3: glob.o
Flags: GLOB_BRACE GLOB_MAGCHAR

$

By using the GLOB_BRACE flag and the pattern '{*.c,*.o}', the glob(3) function was able to combine two patterns into one result. Notice that only the individual pattern results are sorted.

The GLOB_MAGCHAR Flag

The GLOB_MAGCHAR flag is never used as input to glob(3). However, it is returned in the glob_t member gl_flags when at least one meta-character exists in the pattern.

The GLOB_NOMAGIC Flag

The GLOB_NOMAGIC flag causes no results to be returned if the pattern did not make any matches and the pattern had meta-characters present. However, if no meta-characters exist in the pattern, then the pattern is returned in the same manner as GLOB_NOCHECK when no results are found. The following session shows the difference between GLOB_NOMAGIC (option -N) and GLOB_NOCHECK (option -c):

$ ./glob -N '*.z'
Flags: GLOB_NOMAGIC
There are no glob results.
$ ./glob -c '*.z'
Flags: GLOB_NOCHECK
There were 1 matches returned:
  0: *.z
Flags: GLOB_NOCHECK GLOB_MAGCHAR

$

The first command shows GLOB_NOMAGIC and a pattern with meta-characters present. The run with GLOB_NOMAGIC did not return any results, while the run with GLOB_NOCHECK returned the pattern *.z as a result. Now examine another experiment:

$ ./glob -N 'z.z'
Flags: GLOB_NOMAGIC
There were 1 matches returned:
  0: z.z
Flags: GLOB_NOMAGIC

$

In this experiment, flag GLOB_NOMAGIC causes pattern z.z to be returned, although this was not a match. The flag GLOB_NOCHECK would return the same result in this case.

The GLOB_TILDE Flag

This flag is used to enable glob(3) to interpret the Korn shell tilde (~) feature. The following illustrates (using option -T):

$ ./glob -T '~postgres/*'
Flags: GLOB_TILDE
There were 9 matches returned:
  0: /home/postgres/bin
  1: /home/postgres/data
  2: /home/postgres/errlog
  3: /home/postgres/include
  4: /home/postgres/lib
  5: /home/postgres/odbcinst.ini
  6: /home/postgres/pgsql-support.tar.gz
  7: /home/postgres/postgresql-7.0beta1.tar.gz
  8: /home/postgres/psqlodbc-025.tar.gz
Flags: GLOB_MAGCHAR GLOB_TILDE

$

In this example, the glob(3) function looked up the home directory for the postgres account and searched that home directory /home/postgres.
