The V7 ls command nicely ties together everything we’ve seen so far. It uses almost all of the APIs we’ve covered, touching on many aspects of Unix programming: memory allocation, file metadata, dates and times, user names, directory reading, and sorting.
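In comparison to modern versions of ls, the V7 ls accepted only a handful of options, and the meaning of some of them is different for V7 than for current ls. The options are as follows:
-a  All directory entries.
-c  Inode change time, not modification time.
-d  Directory info, not contents.
-f  Force reading each argument as a directory.
-g  Group name instead of user name.
-i  Include inode number.
-l  Long listing.
-r  Reverse sort order.
-s  Size in blocks.
-t  Sort by time, not name.
-u  Access time, not modification time.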
The biggest differences between V7 ls and modern ls concern the -a option and the -l option. Modern systems omit all dot files unless -a is given, and they include both user and group names in the -l long listing. On modern systems, -g is taken to mean print only the group name, and -o means print only the user name. For what it’s worth, GNU ls has over 50 options!
The file /usr/src/cmd/ls.c in the V7 distribution contains the code. It is all of 425 lines long.
  1  /*
  2   * list file or directory
  3   */
  4
  5  #include <sys/param.h>
  6  #include <sys/stat.h>
  7  #include <sys/dir.h>
  8  #include <stdio.h>
  9
 10  #define NFILES 1024
 11  FILE *pwdf, *dirf;
 12  char stdbuf[BUFSIZ];
 13
 14  struct lbuf {                    Collects needed info
 15      union {
 16          char lname[15];
 17          char *namep;
 18      } ln;
 19      char ltype;
 20      short lnum;
 21      short lflags;
 22      short lnl;
 23      short luid;
 24      short lgid;
 25      long lsize;
 26      long lmtime;
 27  };
 28
 29  int aflg, dflg, lflg, sflg, tflg, uflg, iflg, fflg, gflg, cflg;
 30  int rflg = 1;
 31  long year;                       Global variables: auto init to 0
 32  int flags;
 33  int lastuid = -1;
 34  char tbuf[16];
 35  long tblocks;
 36  int statreq;
 37  struct lbuf *flist[NFILES];
 38  struct lbuf **lastp = flist;
 39  struct lbuf **firstp = flist;
 40  char *dotp = ".";
 41
 42  char *makename();                char *makename(char *dir, char *file);
 43  struct lbuf *gstat();            struct lbuf *gstat(char *file, int argfl);
 44  char *ctime();                   char *ctime(time_t *t);
 45  long nblock();                   long nblock(long size);
 46
 47  #define ISARG 0100000
The program starts with file inclusions (lines 5–8) and variable declarations. The struct lbuf (lines 14–27) encapsulates the parts of the struct stat that are of interest to ls. We see later how this structure is filled.
The variables aflg, dflg, and so on (lines 29 and 30) all indicate the presence of the corresponding option. This variable naming style is typical of V7 code. The flist, lastp, and firstp variables (lines 37–39) represent the files that ls reports information about. Note that flist is a fixed-size array, allowing no more than 1024 files to be processed. We see shortly how all these variables are used.
After the variable declarations come function declarations (lines 42–45), and then the definition of ISARG, which distinguishes a file named on the command line from a file found when a directory is read.
 49  main(argc, argv)                 int main(int argc, char **argv)
 50  char *argv[];
 51  {
 52      int i;
 53      register struct lbuf *ep, **ep1;    Variable and function declarations
 54      register struct lbuf **slastp;
 55      struct lbuf **epp;
 56      struct lbuf lb;
 57      char *t;
 58      int compar();
 59
 60      setbuf(stdout, stdbuf);
 61      time(&lb.lmtime);                   Get current time
 62      year = lb.lmtime - 6L*30L*24L*60L*60L; /* 6 months ago */
The main() function starts by declaring variables and functions (lines 52–58), setting the buffer for standard output, retrieving the time of day (lines 60–61), and computing the seconds-since-the-Epoch value for approximately six months ago (line 62). Note that all the constants have the L suffix, indicating the use of long arithmetic.
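The cutoff computation on line 62 can be sketched in modern C as follows. This is only an illustration: the helper name six_months_ago() and the macro SIX_MONTHS_SECS are hypothetical and do not appear in ls.c, and time_t is assumed in place of V7's long. The L suffixes still matter on machines such as the PDP-11, where a plain int was 16 bits and the product would not fit.

```c
#include <time.h>

/* "Six months" as V7 ls approximates it: six months of 30 days each. */
#define SIX_MONTHS_SECS (6L * 30L * 24L * 60L * 60L)   /* 15,552,000 seconds */

/* Hypothetical helper: the timestamp six 30-day months before 'now'. */
time_t six_months_ago(time_t now)
{
    return now - (time_t) SIX_MONTHS_SECS;
}
```

A caller would pass the result of time(NULL) and compare each file's timestamp against the returned value, as pentry() does with the year variable.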
 63      if (--argc > 0 && *argv[1] == '-') {
 64          argv++;
 65          while (*++*argv) switch (**argv) {    Parse options
 66
 67          case 'a':                    All directory entries
 68              aflg++;
 69              continue;
 70
 71          case 's':                    Size in blocks
 72              sflg++;
 73              statreq++;
 74              continue;
 75
 76          case 'd':                    Directory info, not contents
 77              dflg++;
 78              continue;
 79
 80          case 'g':                    Group name instead of user name
 81              gflg++;
 82              continue;
 83
 84          case 'l':                    Long listing
 85              lflg++;
 86              statreq++;
 87              continue;
 88
 89          case 'r':                    Reverse sort order
 90              rflg = -1;
 91              continue;
 92
 93          case 't':                    Sort by time, not name
 94              tflg++;
 95              statreq++;
 96              continue;
 97
 98          case 'u':                    Access time, not modification time
 99              uflg++;
100              continue;
101
102          case 'c':                    Inode change time, not modification time
103              cflg++;
104              continue;
105
106          case 'i':                    Include inode number
107              iflg++;
108              continue;
109
110          case 'f':                    Force reading each arg as directory
111              fflg++;
112              continue;
113
114          default:                     Ignore unknown option letters
115              continue;
116          }
117          argc--;
118      }
Lines 63–118 parse the command-line options. Note the manual parsing code: getopt() hadn’t been invented yet. The statreq variable is set to true when an option requires the use of the stat() system call.
Avoiding an unnecessary stat() call on each file is a big performance win. The stat() call was particularly expensive, because it could involve a disk seek to the inode location, a disk read to read the inode, and then a disk seek back to the location of the directory contents (in order to continue reading directory entries).
Modern systems have the inodes in groups, spread out throughout a filesystem instead of clustered together at the front. This makes a noticeable performance improvement. Nevertheless, stat() calls are still not free; you should use them as needed, but not any more than that.
119      if (fflg) {                      -f overrides -l, -s, -t, adds -a
120          aflg++;
121          lflg = 0;
122          sflg = 0;
123          tflg = 0;
124          statreq = 0;
125      }
126      if(lflg) {                       Open password or group file
127          t = "/etc/passwd";
128          if(gflg)
129              t = "/etc/group";
130          pwdf = fopen(t, "r");
131      }
132      if (argc==0) {                   Use current dir if no args
133          argc++;
134          argv = &dotp - 1;
135      }
Lines 119–125 handle the -f option, turning off -l, -s, -t, and statreq. Lines 126–131 handle -l, setting the file to be read for user or group information. Remember that the V7 ls shows only one or the other, not both.
If no arguments are left, lines 132–135 set up argv such that it points at a string representing the current directory. The assignment ‘argv = &dotp - 1’ is valid, although unusual. The ‘- 1’ compensates for the ‘++argv’ on line 137. This avoids special case code for ‘argc == 1’ in the main part of the program.
136      for (i=0; i < argc; i++) {       Get info about each file
137          if ((ep = gstat(*++argv, 1))==NULL)
138              continue;
139          ep->ln.namep = *argv;
140          ep->lflags |= ISARG;
141      }
142      qsort(firstp, lastp - firstp, sizeof *lastp, compar);
143      slastp = lastp;
144      for (epp=firstp; epp<slastp; epp++) {    Main code, see text
145          ep = *epp;
146          if (ep->ltype=='d' && dflg==0 || fflg) {
147              if (argc>1)
148                  printf("\n%s:\n", ep->ln.namep);
149              lastp = slastp;
150              readdir(ep->ln.namep);
151              if (fflg==0)
152                  qsort(slastp,lastp - slastp,sizeof *lastp,compar);
153              if (lflg || sflg)
154                  printf("total %D\n", tblocks);
155              for (ep1=slastp; ep1<lastp; ep1++)
156                  pentry(*ep1);
157          } else
158              pentry(ep);
159      }
160      exit(0);
161  }                                    End of main()
Lines 136–141 loop over the arguments, gathering information about each one. The second argument to gstat() is a boolean: true if the name is a command-line argument, false otherwise. Line 140 adds the ISARG flag to the lflags field for each command-line argument.
The gstat() function adds each new struct lbuf into the global flist array (line 137). It also updates the lastp global pointer to point into this array at the current last element.
Lines 142–143 sort the array, using qsort(), and save the current value of lastp in slastp. Lines 144–159 loop over each element in the array, printing file or directory info, as appropriate.
The code for directories deserves further explication:
if (ep->ltype=='d' && dflg==0 || fflg) ...
Line 146. If the file type is directory and if -d was not provided or if -f was, then ls has to read the directory instead of printing information about the directory itself.
if (argc>1) printf("\n%s:\n", ep->ln.namep)
Lines 147–148. Print the directory name and a colon if multiple files were named on the command line.
lastp = slastp; readdir(ep->ln.namep)
Lines 149–150. Reset lastp from slastp. The flist array acts as a two-level stack of filenames. The command-line arguments are kept in firstp through slastp - 1. When readdir() reads a directory, it puts the struct lbuf structures for the directory contents onto the stack, starting at slastp and going through lastp - 1. This is illustrated in Figure 7.1.
if (fflg==0) qsort(slastp,lastp - slastp,sizeof *lastp,compar)
Lines 151–152. Sort the subdirectory entries if -f is not in effect.
if (lflg || sflg) printf("total %D\n", tblocks)
Lines 153–154. Print the total number of blocks used by files in the directory, for -l or -s. This total is kept in the variable tblocks, which is reset for each directory. The %D format string for printf() is equivalent to %ld on modern systems; it means “print a long integer.” (V7 also had %ld, see line 192.)
for (ep1=slastp; ep1<lastp; ep1++) pentry(*ep1)
Lines 155–156. Print the information about each file in the subdirectory. Note that the V7 ls descends only one level in a directory tree. It lacks the modern -R “recursive” option.
163  pentry(ap)                           void pentry(struct lbuf *ap)
164  struct lbuf *ap;
165  {
166      struct { char dminor, dmajor;};  Unused historical artifact from V6 ls
167      register t;
168      register struct lbuf *p;
169      register char *cp;
170
171      p = ap;
172      if (p->lnum == -1)
173          return;
174      if (iflg)
175          printf("%5u ", p->lnum);     Inode number
176      if (sflg)
177          printf("%4D ", nblock(p->lsize));    Size in blocks
The pentry() routine prints information about a file. Lines 172–173 check whether the lnum field is -1, and return if so. When ‘p->lnum == -1’ is true, the struct lbuf is not valid. Otherwise, this field is the file’s inode number.
Lines 174–175 print the inode number if -i is in effect. Lines 176–177 print the file’s size in blocks if -s is in effect. (As we see below, this number may not be accurate.)
178      if (lflg) {                      Long listing:
179          putchar(p->ltype);           – File type
180          pmode(p->lflags);            – Permissions
181          printf("%2d ", p->lnl);      – Link count
182          t = p->luid;
183          if(gflg)
184              t = p->lgid;
185          if (getname(t, tbuf)==0)
186              printf("%-6.6s", tbuf);  – User or group
187          else
188              printf("%-6d", t);
189          if (p->ltype=='b' || p->ltype=='c')    – Device: major and minor numbers
190              printf("%3d,%3d", major((int)p->lsize), minor((int)p->lsize));
191          else
192              printf("%7ld", p->lsize);    – Size in bytes
193          cp = ctime(&p->lmtime);
194          if(p->lmtime < year)         – Modification time
195              printf(" %-7.7s %-4.4s ", cp+4, cp+20); else
196              printf(" %-12.12s ", cp+4);
197      }
198      if (p->lflags&ISARG)             – Filename
199          printf("%s\n", p->ln.namep);
200      else
201          printf("%.14s\n", p->ln.lname);
202  }
Lines 178–197 handle the -l option. Lines 179–181 print the file’s type, permissions, and number of links. Lines 182–184 set t to the user ID or the group ID, based on the -g option. Lines 185–188 retrieve the corresponding name and print it if available. Otherwise, the program prints the numeric value.
Lines 189–192 check whether the file is a block or character device. If it is, they print the major and minor device numbers, extracted with the major() and minor() macros. Otherwise, they print the file’s size.
Lines 193–196 print the time of interest. If it’s older than six months, the code prints the month, day, and year. Otherwise, it prints the month, day, and time (see Section 6.1.3.1, “Simple Time Formatting: asctime() and ctime(),” page 170, for the format of ctime()’s result).
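The two printf() formats on lines 195–196 rely on the fixed layout of the string that ctime() returns. The following sketch isolates that logic so it can be tried on a known string; the function name ls_time_field() is hypothetical, and snprintf() is used instead of printf() only so that the result can be inspected.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the time field the way V7 ls -l does.
 * 'ct' is a string in ctime() format, "Www Mmm dd hh:mm:ss yyyy\n".
 * Offset 4 is the start of the month; offset 20 is the start of the year.
 */
void ls_time_field(const char *ct, int is_old, char *out, size_t outlen)
{
    if (is_old)     /* older than the cutoff: month, day, and year */
        snprintf(out, outlen, " %-7.7s %-4.4s ", ct + 4, ct + 20);
    else            /* recent: month, day, hours and minutes */
        snprintf(out, outlen, " %-12.12s ", ct + 4);
}
```

For the string "Thu Jan  1 00:00:00 1970\n", the first form yields " Jan  1  1970 " and the second yields " Jan  1 00:00 ".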
Finally, lines 198–201 print the filename. For a command-line argument, we know it’s a zero-terminated string, and %s can be used. For a file read from a directory, it may not be zero-terminated, and thus an explicit precision, %.14s, must be used.
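The effect of the precision can be seen in a small sketch. The macro V7_DIRSIZ and the function dirname_to_string() are hypothetical names used only here; the point is that a precision limits how many bytes the printf() family reads, so a name that fills all 14 bytes with no terminating zero is still handled safely.

```c
#include <stdio.h>

#define V7_DIRSIZ 14    /* width of the V7 directory name field (assumed name) */

/* Copy a possibly unterminated 14-byte name into an ordinary C string. */
int dirname_to_string(const char name[V7_DIRSIZ], char *out, size_t outlen)
{
    /* "%.*s" takes the precision from the argument list; the formatter
       stops after V7_DIRSIZ bytes or at a '\0', whichever comes first. */
    return snprintf(out, outlen, "%.*s", V7_DIRSIZ, name);
}
```

With a plain %s, a full 14-byte name would cause the formatter to keep reading past the end of the field.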
204  getname(uid, buf)                    int getname(int uid, char buf[])
205  int uid;
206  char buf[];
207  {
208      int j, c, n, i;
209
210      if (uid==lastuid)                Simple caching, see text
211          return(0);
212      if(pwdf == NULL)                 Safety check
213          return(-1);
214      rewind(pwdf);                    Start at front of file
215      lastuid = -1;
216      do {
217          i = 0;                       Index in buf array
218          j = 0;                       Counts fields in line
219          n = 0;                       Converts numeric value
220          while((c=fgetc(pwdf)) != '\n') {    Read lines
221              if (c==EOF)
222                  return(-1);
223              if (c==':') {            Count fields
224                  j++;
225                  c = '0';
226              }
227              if (j==0)                First field is name
228                  buf[i++] = c;
229              if (j==2)                Third field is numeric ID
230                  n = n*10 + c - '0';
231          }
232      } while (n != uid);              Keep searching until ID found
233      buf[i++] = '\0';
234      lastuid = uid;
235      return(0);
236  }
The getname() function converts a user or group ID number into the corresponding name. It implements a simple caching scheme; if the passed-in uid is the same as the global variable lastuid, then the function returns 0, for OK; the buffer will already contain the name (lines 210–211). lastuid is initialized to -1 (line 33), so this test fails the first time getname() is called.
pwdf is already open on either /etc/passwd or /etc/group (see lines 126–130). The code here checks that the open succeeded and returns -1 if it didn’t (lines 212–213).
Surprisingly, ls does not use getpwuid() or getgrgid(). Instead, it takes advantage of the facts that the format of /etc/passwd and /etc/group is identical for the first three fields (name, password, numeric ID) and that both use a colon as separator.
Lines 216–232 implement a linear search through the file. j counts the number of colons seen so far: 0 for the name and 2 for the ID number. Thus, while scanning the line, it fills in both the name and the ID number.
Lines 233–235 terminate the name buffer, set the global lastuid to the found ID number, and return 0 for OK.
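The same linear search can be sketched line by line with standard library calls. This is not how ls.c does it; lookup_name() is a hypothetical function, it assumes each line fits in a 512-byte buffer, and it omits the lastuid cache. It does rely on the same observation as the original: the first three colon-separated fields have the same meaning in both files.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find the name for a numeric ID in a stream laid
 * out as "name:password:id:...".  Returns 0 and fills buf on success,
 * or -1 if no line has that ID.  Over-long lines are not handled.
 */
int lookup_name(FILE *fp, long id, char *buf, size_t buflen)
{
    char line[512];

    rewind(fp);                             /* always search from the top */
    while (fgets(line, sizeof line, fp) != NULL) {
        char *pw = strchr(line, ':');       /* end of field 1 (name) */
        char *idp = pw ? strchr(pw + 1, ':') : NULL;    /* end of field 2 */
        char *end;
        long found;

        if (idp == NULL)
            continue;                       /* malformed line: skip it */
        found = strtol(idp + 1, &end, 10);  /* field 3: numeric ID */
        if (end == idp + 1 || found != id)
            continue;                       /* no digits, or a different ID */
        *pw = '\0';                         /* cut the line after the name */
        snprintf(buf, buflen, "%s", line);
        return 0;
    }
    return -1;
}
```

On a modern system, getpwuid() and getgrgid() remain the better choice, since user and group data may not live in flat files at all.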
238  long                                 long nblock(long size)
239  nblock(size)
240  long size;
241  {
242      return((size+511)>>9);
243  }
The nblock() function reports how many disk blocks the file uses. This calculation is based on the file’s size as returned by stat(). The V7 block size was 512 bytes, the size of a physical disk sector.
The calculation on line 242 looks a bit scary. The ‘>>9’ is a right-shift by nine bits. This divides by 512, to give the number of blocks. (On early hardware, a right-shift was much faster than division.) So far, so good. Now, a file of even one byte still takes up a whole disk block. However, ‘1 / 512’ comes out as zero (integer division truncates), which is incorrect. This explains the ‘size+511’. By adding 511, the code ensures that the sum produces the correct number of blocks when it is divided by 512.
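A few sample values make the rounding easier to see. The sketch below restates nblock() with a modern prototype and adds a division-based twin, nblock_div(), which is a hypothetical name used only for comparison; both assume a nonnegative size.

```c
/* Round a byte count up to whole 512-byte blocks, as V7 ls does. */
long nblock(long size)
{
    return (size + 511) >> 9;   /* shift right 9 bits: divide by 512 */
}

/* The same rounding written with division, for comparison. */
long nblock_div(long size)
{
    return (size + 511) / 512;
}
```

A size of 0 gives 0 blocks, sizes 1 through 512 give 1 block, and 513 gives 2 blocks, with both versions agreeing.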
This calculation is only approximate, however. Very large files also have indirect blocks. Despite the claim in the V7 ls(1) manpage, this calculation does not account for indirect blocks.
Furthermore, consider the case of a file with large holes (created by seeking way past the end of the file with lseek()). Holes don’t occupy disk blocks; however, this is not reflected in the size value. Thus, the calculation produced by nblock(), while usually correct, could produce results that are either smaller or larger than the real case.
For these reasons, the st_blocks member was added into the struct stat at 4.2 BSD, and then picked up for System V and POSIX.
245  int m1[] = { 1, S_IREAD>>0, 'r', '-' };
246  int m2[] = { 1, S_IWRITE>>0, 'w', '-' };
247  int m3[] = { 2, S_ISUID, 's', S_IEXEC>>0, 'x', '-' };
248  int m4[] = { 1, S_IREAD>>3, 'r', '-' };
249  int m5[] = { 1, S_IWRITE>>3, 'w', '-' };
250  int m6[] = { 2, S_ISGID, 's', S_IEXEC>>3, 'x', '-' };
251  int m7[] = { 1, S_IREAD>>6, 'r', '-' };
252  int m8[] = { 1, S_IWRITE>>6, 'w', '-' };
253  int m9[] = { 2, S_ISVTX, 't', S_IEXEC>>6, 'x', '-' };
254
255  int *m[] = { m1, m2, m3, m4, m5, m6, m7, m8, m9};
256
257  pmode(aflag)                         void pmode(int aflag)
258  {
259      register int **mp;
260
261      flags = aflag;
262      for (mp = &m[0]; mp < &m[sizeof(m)/sizeof(m[0])];)
263          select(*mp++);
264  }
265
266  select(pairp)                        void select(register int *pairp)
267  register int *pairp;
268  {
269      register int n;
270
271      n = *pairp++;
272      while (--n>=0 && (flags&*pairp++)==0)
273          pairp++;
274      putchar(*pairp);
275  }
Lines 245–275 print the file’s permissions. The code is compact and rather elegant; it requires careful study.
Lines 245–253: The arrays m1 through m9 encode the permission bits to check for along with the corresponding characters to print. There is one array per character to print in the file mode. The first element of each array is the number of (permission, character) pairs encoded in that particular array. The final element is the character to print in the event that none of the given permission bits are found.
Note also how the permissions are specified as ‘S_IREAD>>0’, ‘S_IREAD>>3’, ‘S_IREAD>>6’, and so on. The individual constants for each bit (S_IRUSR, S_IRGRP, etc.) had not been invented yet. (See Table 4.5 in Section 4.6.1, “Specifying Initial File Permissions”, page 106.)
Line 255: The m array points to each of the m1 through m9 arrays.
Lines 257–264: The pmode() function first sets the global variable flags to the passed-in parameter aflag. It then loops through the m array, passing each element to the select() function. The passed-in element represents one of the m1 to m9 arrays.
Lines 266–275: The select() function understands the layout of each m1 through m9 array. n is the number of pairs in the array (the first element); line 271 sets it. Lines 272–273 look for permission bits, checking the global variable flags set previously on line 261.
Note the use of the ++ operator, both in the loop test and in the loop body. The effect is to skip over pairs in the array as long as the permission bit in the first element of the pair is not found in flags.
When the loop ends, either the permission bit has been found, in which case pairp points at the second element of the pair, which is the correct character to print, or it has not been found, in which case pairp points at the default character. In either case, line 274 prints the character that pairp points to.
A final point worth noting is that in C, character constants (such as 'x') have type int, not char.[1] So there’s no problem putting such constants into an integer array; everything works correctly.
277  char *                               char *makename(char *dir, char *file)
278  makename(dir, file)
279  char *dir, *file;
280  {
281      static char dfile[100];
282      register char *dp, *fp;
283      register int i;
284
285      dp = dfile;
286      fp = dir;
287      while (*fp)
288          *dp++ = *fp++;
289      *dp++ = '/';
290      fp = file;
291      for (i=0; i<DIRSIZ; i++)
292          *dp++ = *fp++;
293      *dp = 0;
294      return(dfile);
295  }
Lines 277–295 define the makename() function. Its job is to concatenate a directory name and a filename, separated by a slash character, and produce a string. It does this in the static buffer dfile. Note that dfile is only 100 characters long and that no error checking is done.
The code itself is straightforward, copying characters one at a time. makename() is used by the readdir() function.
297  readdir(dir)                         void readdir(char *dir)
298  char *dir;
299  {
300      static struct direct dentry;
301      register int j;
302      register struct lbuf *ep;
303
304      if ((dirf = fopen(dir, "r")) == NULL) {
305          printf("%s unreadable\n", dir);
306          return;
307      }
308      tblocks = 0;
309      for(;;) {
310          if (fread((char *) &dentry, sizeof(dentry), 1, dirf) != 1)
311              break;
312          if (dentry.d_ino==0
313              || aflg==0 && dentry.d_name[0]=='.' && (dentry.d_name[1]=='\0'
314              || dentry.d_name[1]=='.' && dentry.d_name[2]=='\0'))
315              continue;
316          ep = gstat(makename(dir, dentry.d_name), 0);
317          if (ep==NULL)
318              continue;
319          if (ep->lnum != -1)
320              ep->lnum = dentry.d_ino;
321          for (j=0; j<DIRSIZ; j++)
322              ep->ln.lname[j] = dentry.d_name[j];
323      }
324      fclose(dirf);
325  }
Lines 297–325 define the readdir() function, whose job is to read the contents of directories named on the command line.
Lines 304–307 open the directory for reading, returning if fopen() fails. Line 308 initializes the global variable tblocks to 0. This was used earlier (lines 153–154) to print the total number of blocks used by files in a directory.
Lines 309–323 are a loop that reads directory entries and adds them to the flist array. Lines 310–311 read one entry, exiting the loop upon end-of-file.
Lines 312–315 skip uninteresting entries. If the inode number is zero, this slot isn’t used. Otherwise, if -a was not given and the filename is either ‘.’ or ‘..’, skip it.
Lines 316–318 call gstat() with the full name of the file, and a second argument of false, indicating that it’s not from the command line. gstat() updates the global lastp pointer and the flist array. A NULL return value indicates some sort of failure.
Lines 319–322 save the inode number and name in the struct lbuf. If ep->lnum comes back from gstat() set to -1, it means that the stat() operation on the file failed. Finally, line 324 closes the directory.
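On current systems a directory cannot be read as an ordinary file with fopen() and fread(); the opendir(), readdir(), and closedir() routines take the place of that loop. As a rough sketch of the same filtering rule, the hypothetical function count_entries() below counts the entries V7 ls would list, skipping only ‘.’ and ‘..’ unless show_all is set. It counts instead of building an flist array purely to keep the example short.

```c
#include <dirent.h>
#include <string.h>

/*
 * Hypothetical helper: count the entries of 'dir' that V7 ls would list.
 * Without show_all, only "." and ".." are skipped (the V7 rule).
 * Returns -1 if the directory cannot be opened.
 */
int count_entries(const char *dir, int show_all)
{
    DIR *dp = opendir(dir);
    struct dirent *ent;
    int count = 0;

    if (dp == NULL)
        return -1;
    while ((ent = readdir(dp)) != NULL) {
        if (!show_all && (strcmp(ent->d_name, ".") == 0 ||
                          strcmp(ent->d_name, "..") == 0))
            continue;           /* same test as lines 313-314, via strcmp() */
        count++;
    }
    closedir(dp);
    return count;
}
```

Because d_name is a proper zero-terminated string in struct dirent, strcmp() can replace the character-by-character tests in the original.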
The following function, gstat() (lines 327–398), is the core function for the operation of retrieving and storing file information.
327  struct lbuf *                        struct lbuf *gstat(char *file, int argfl)
328  gstat(file, argfl)
329  char *file;
330  {
331      extern char *malloc();
332      struct stat statb;
333      register struct lbuf *rep;
334      static int nomocore;
335
336      if (nomocore)                    Ran out of memory earlier
337          return(NULL);
338      rep = (struct lbuf *)malloc(sizeof(struct lbuf));
339      if (rep==NULL) {
340          fprintf(stderr, "ls: out of memory\n");
341          nomocore = 1;
342          return(NULL);
343      }
344      if (lastp >= &flist[NFILES]) {   Check whether too many files given
345          static int msg;
346          lastp--;
347          if (msg==0) {
348              fprintf(stderr, "ls: too many files\n");
349              msg++;
350          }
351      }
352      *lastp++ = rep;                  Fill in information
353      rep->lflags = 0;
354      rep->lnum = 0;
355      rep->ltype = '-';                Default file type
The static variable nomocore [sic] indicates that malloc() failed upon an earlier call. Since it’s static, it’s automatically initialized to 0 (that is, false). If it’s true upon entry, gstat() just returns NULL. Otherwise, if malloc() fails, ls prints an error message, sets nomocore to true, and returns NULL (lines 334–343).
Lines 344–351 make sure that there is still room left in the flist array. If not, ls prints a message (but only once; note the use of the static variable msg), and then reuses the last slot in flist.
Line 352 makes the slot lastp points to point to the new struct lbuf (rep). This also updates lastp, which is used for sorting in main() (lines 142 and 152). Lines 353–355 set default values for the flags, inode number, and type fields in the struct lbuf.
356      if (argfl || statreq) {
357          if (stat(file, &statb)<0) {  stat() failed
358              printf("%s not found\n", file);
359              statb.st_ino = -1;
360              statb.st_size = 0;
361              statb.st_mode = 0;
362              if (argfl) {
363                  lastp--;
364                  return(0);
365              }
366          }
367          rep->lnum = statb.st_ino;    stat() OK, copy info
368          rep->lsize = statb.st_size;
369          switch(statb.st_mode&S_IFMT) {
370
371          case S_IFDIR:
372              rep->ltype = 'd';
373              break;
374
375          case S_IFBLK:
376              rep->ltype = 'b';
377              rep->lsize = statb.st_rdev;
378              break;
379
380          case S_IFCHR:
381              rep->ltype = 'c';
382              rep->lsize = statb.st_rdev;
383              break;
384          }
385          rep->lflags = statb.st_mode & ~S_IFMT;
386          rep->luid = statb.st_uid;
387          rep->lgid = statb.st_gid;
388          rep->lnl = statb.st_nlink;
389          if(uflg)
390              rep->lmtime = statb.st_atime;
391          else if (cflg)
392              rep->lmtime = statb.st_ctime;
393          else
394              rep->lmtime = statb.st_mtime;
395          tblocks += nblock(statb.st_size);
396      }
397      return(rep);
398  }
Lines 356–396 handle the call to stat(). If this is a command-line argument or if statreq is true because of an option, the code fills in the struct lbuf as follows:
Lines 357–366: Call stat() and if it fails, print an error message and set values as appropriate. If the file was a command-line argument, also give back its slot in flist and return NULL (expressed as 0).
Lines 367–368: Set the inode number and size fields from the struct stat. After a failed stat() on a file that was not a command-line argument, these are the placeholder values set on lines 359–360.
Lines 369–384: Handle the special cases of directory, block device, and character device. In all cases the code updates the ltype field. For devices, the lsize value is replaced with the st_rdev value.
Lines 385–388: Fill in the lflags, luid, lgid, and lnl fields from the corresponding fields in the struct stat. Line 385 removes the file-type bits, leaving the 12 permissions bits (read/write/execute for user/group/other, and setuid, setgid, and save-text).
Lines 389–394: Based on command-line options, use one of the three time fields from the struct stat for the lmtime field in the struct lbuf.
Line 395: Update the global variable tblocks with the number of blocks in the file.
400  compar(pp1, pp2)                     int compar(struct lbuf **pp1,
401  struct lbuf **pp1, **pp2;                       struct lbuf **pp2)
402  {
403      register struct lbuf *p1, *p2;
404
405      p1 = *pp1;
406      p2 = *pp2;
407      if (dflg==0) {
408          if (p1->lflags&ISARG && p1->ltype=='d') {
409              if (!(p2->lflags&ISARG && p2->ltype=='d'))
410                  return(1);
411          } else {
412              if (p2->lflags&ISARG && p2->ltype=='d')
413                  return(-1);
414          }
415      }
416      if (tflg) {
417          if(p2->lmtime == p1->lmtime)
418              return(0);
419          if(p2->lmtime > p1->lmtime)
420              return(rflg);
421          return(-rflg);
422      }
423      return(rflg * strcmp(p1->lflags&ISARG? p1->ln.namep: p1->ln.lname,
424                  p2->lflags&ISARG? p2->ln.namep: p2->ln.lname));
425  }
The compar() function is dense: There’s a lot happening in little space. The first thing to remember is the meaning of the return value: A negative value means that the first file should sort to an earlier spot in the array than the second, zero means the files are equal, and a positive value means that the second file should sort to an earlier spot than the first.
The next thing to understand is that ls prints the contents of directories after it prints information about files. Thus the result of sorting should be that all directories named on the command line follow all files named on the command line.
Finally, the rflg variable helps implement the -r option, which reverses the sorting order. It is initialized to 1 (line 30). If -r is used, rflg is set to -1 (lines 89–91).
The following pseudocode describes the logic of compar(); the line numbers in the left margin correspond to those of ls.c:
407  if ls has to read directories          # dflg == 0
408      if p1 is a command-line arg and p1 is a directory
409          if p2 is not a command-line directory
410              return 1        # first comes after second
                 else fall through to time test
411      else        # p1 is not a command-line directory
412          if p2 is a command-line arg and is a directory
413              return -1       # first comes before second
                 else fall through to time test
416  if sorting is based on time            # tflg is true
         # compare times:
417      if p2's time is equal to p1's time
418          return 0
419      if p2's time > p1's time
420          return the value of rflg (positive or negative)
         # p2's time < p1's time
421      return opposite of rflg (negative or positive)
423  Multiply rflg by the result of strcmp()
424      on the two names and return the result
The arguments to strcmp() on lines 423–424 look messy. What’s going on is that different members of the ln union in the struct lbuf must be used, depending on whether the filename is a command-line argument or was read from a directory.
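The shape of compar() can be tried out in isolation with a cut-down structure. In the sketch below, struct entry, compare_entries(), and sort_entries() are hypothetical stand-ins: the structure keeps only a name and a flag for “directory named on the command line,” the time comparison is left out, and the comparator uses the const void * signature that modern qsort() expects.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct lbuf. */
struct entry {
    const char *name;
    int is_arg_dir;     /* nonzero: a directory named on the command line */
};

static int reverse = 1; /* 1 normally, -1 for reversed order (like rflg) */

/* qsort() comparator over an array of pointers to struct entry. */
static int compare_entries(const void *a, const void *b)
{
    const struct entry *p1 = *(const struct entry *const *) a;
    const struct entry *p2 = *(const struct entry *const *) b;

    /* Command-line directories always sort after everything else;
       as in V7, this rule is not affected by the reverse flag. */
    if (!p1->is_arg_dir != !p2->is_arg_dir)
        return p1->is_arg_dir ? 1 : -1;
    return reverse * strcmp(p1->name, p2->name);
}

/* Sort 'n' entry pointers; 'reversed' plays the role of the -r option. */
void sort_entries(struct entry **list, size_t n, int reversed)
{
    reverse = reversed ? -1 : 1;
    qsort(list, n, sizeof *list, compare_entries);
}
```

Sorting the entries "zeta", "alpha", and a command-line directory "dir" gives alpha, zeta, dir in normal order and zeta, alpha, dir when reversed.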
The V7 ls is a relatively small program, yet it touches on many of the fundamental aspects of Unix programming: file I/O, file metadata, directory contents, users and groups, time and date values, sorting, and dynamic memory management.
The most notable external difference between V7 ls and modern ls is the treatment of the -a and -l options. The V7 version has many fewer options than do modern versions; a noticeable lack is the -R recursive option.
The management of flist is a clean way to use the limited memory of the PDP-11 architecture yet still provide as much information as possible. The struct lbuf nicely abstracts the information of interest from the struct stat; this simplifies the code considerably. The code for printing the nine permission bits is compact and elegant.
Some parts of ls use surprisingly small limits, such as the upper bound of 1024 on the number of files or the buffer size of 100 in makename().
Consider the getname() function. What happens if the requested ID number is 216 and the following two lines exist in /etc/passwd, in this order:
joe:xyzzy:2160:10:Joe User:/usr/joe:/bin/sh
jane:zzyxx:216:12:Jane User:/usr/jane:/bin/sh
Consider the makename() function. Could it use sprintf() to make the concatenated name? Why or why not?
Are lines 319–320 in readdir() really necessary?
Take the stat program you wrote for the exercises in “Exercises” for Chapter 6, page 205. Add the nblock() function from the V7 ls, and print the results along with the st_blocks field from the struct stat. Add a visible marker when they’re different.
How would you grade the V7 ls on its use of malloc()? (Hint: how often is free() called? Where should it be called?)
How would you grade the V7 ls for code clarity? (Hint: how many comments are there?)
Outline the steps you would take to adapt the V7 ls for modern systems.
[1] This is different in C++: There, character constants do have type char. This difference does not affect this particular code.