Working with Files Under UNIX

A file or device under UNIX is opened with the open(2) system call. Before open(2) is considered in detail, let's first examine the way UNIX references open files in general.

When you want to read from a file, such as /etc/hosts, you must indicate which file you want to read. However, if you had to name the path as a C string "/etc/hosts" each time you wanted to read part of the file, this would not only be tedious and inefficient, it would also be inflexible. How would you read from different parts of the same file? Obviously, a method by which the file can be opened more than once is much more flexible.

When you open a file under UNIX, you are given a reference to that file. You already know (since this is review) that it is a number. This is also known as a file unit number or a file descriptor. Conceptually, this number is a handle that refers back to the file that you named in the open(2) call.

File descriptors returned from an open(2) call allow you to name the path of the file system object once. After you have a file descriptor, you can read the /etc/hosts file one line at a time by providing the file descriptor to the read(2) function. The UNIX kernel then knows which file you mean, because it remembers it from the earlier open(2) call.

This provides flexibility also, since open(2) can be called a second (or nth) time for the same file. In this way, one part of your application can be reading one part of the file while another part is reading another. Neither read disturbs the other. The read(2) call can manage this because file state information is associated with each different file descriptor.
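The independence of two separate opens can be checked with a small sketch. The function name independent_offsets is hypothetical, and the use of lseek(2) to inspect each descriptor's current position is an assumption made for illustration; it is not part of the discussion above.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the same path twice, read through the first
 * descriptor only, then compare the two file positions.
 * Returns 1 if the first descriptor advanced while the second stayed at
 * offset 0, 0 if not, and -1 on error (including an empty file).
 */
int independent_offsets(const char *path)
{
    char buf[4];
    int result = -1;
    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);

    if (fd1 != -1 && fd2 != -1 && read(fd1, buf, sizeof buf) > 0) {
        /* Only fd1 has been read from; fd2 keeps its own position. */
        off_t pos1 = lseek(fd1, 0, SEEK_CUR);
        off_t pos2 = lseek(fd2, 0, SEEK_CUR);

        if (pos1 != (off_t)-1 && pos2 != (off_t)-1)
            result = (pos1 > 0 && pos2 == 0);
    }

    if (fd1 != -1)
        close(fd1);
    if (fd2 != -1)
        close(fd2);
    return result;
}
```

Calling independent_offsets("/etc/hosts") on a system where that file is readable and not empty would be expected to return 1.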

Finally, it should be apparent that an open file descriptor eventually needs to be closed. The close(2) function fills that need. When a process terminates because of a signal or for any other reason, including a normal exit, any file descriptors that are still open are closed by the UNIX kernel. If this were not done, the UNIX kernel would suffer from a serious memory leak, among other problems.

Less obvious is that, when execve(2) is called to start a new program within a process, some file descriptors can be closed automatically, while others are left open. See fcntl(2) and the F_SETFD operation if this is of interest. The execve(2) call is covered in Chapter 19, "Forked Processes."

Opening and Closing Files

Files under UNIX are opened and closed with the following functions:

#include <sys/types.h>                 /* for mode_t */
#include <sys/stat.h>                  /* for mode_t */
#include <fcntl.h>                     /* For open */

int open(const char *path, int flags, ... /* mode_t mode */);


#include <unistd.h>

int close(int d);

The open(2) call accepts a C string that represents the pathname of the file system object to be opened, some flags, and optionally some permission bits in the mode argument. The return value is either -1 (with errno set) if the call fails, or a file descriptor, which is a non-negative integer (zero or greater).

Note

The handling of errno is covered in Chapter 3, "Error Handling and Reporting," if you need to know more about this variable.


The returned file descriptor is always the lowest unused file descriptor number. If you have standard input already open (file unit 0), standard output (file unit 1), and standard error (file unit 2), then the next successful open(2) call will return file unit 3.
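The "lowest unused number" rule can be observed by opening a file, closing it, and opening it again. The following is a minimal sketch under the assumption that nothing else in the process (another thread, for instance) opens or closes descriptors in between; the function name reuses_lowest_descriptor is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path, close it, and open it again.
 * Returns 1 if the second open(2) returned the same descriptor number
 * as the first, 0 if it did not, and -1 on error.
 */
int reuses_lowest_descriptor(const char *path)
{
    int first, second;

    first = open(path, O_RDONLY);
    if (first == -1)
        return -1;
    if (close(first) == -1)         /* first is now unused again */
        return -1;

    second = open(path, O_RDONLY);  /* should land on the same number */
    if (second == -1)
        return -1;
    close(second);

    return first == second;
}
```

In a single-threaded program, reuses_lowest_descriptor("/dev/null") would be expected to return 1.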

When you are finished with a file descriptor (an open file system object), you must close it with a call to close(2).

Flags for open(2)

The second argument to open(2) can consist of several flag bits. These are given in Table 2.1.

Table 2.1. FreeBSD open(2) Flag Bits

Flag          Description
O_RDONLY      Open for read only
O_WRONLY      Open for write only
O_RDWR        Open for read and write
O_NONBLOCK    Do not block on open
O_APPEND      Append with each write
O_CREAT       Create file if necessary
O_TRUNC       Truncate file to 0 bytes
O_EXCL        Error if creating and the file already exists
O_SHLOCK      Atomically obtain a shared lock
O_EXLOCK      Atomically obtain an exclusive lock

The flag O_NONBLOCK causes the open(2) call not to block while waiting for the device to be ready. For example, opening a modem device can cause it to wait until a carrier is detected. On some UNIX platforms such as SGI's IRIX 6.5, there is also the O_NDELAY flag, which has special semantics when combined with the O_NONBLOCK flag.

The O_APPEND flag will cause each write to the file to be appended to the end of the file. This applies to all write(2) calls, not just the first one (intervening appends can be done by other processes).
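As an illustration of O_APPEND, the following sketch appends a string to a file, creating the file if needed. The function name append_text and the permission bits 0644 are assumptions chosen for the example.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: append text to the end of path.
 * Returns 0 on success and -1 on error (a short write counts as an error).
 */
int append_text(const char *path, const char *text)
{
    size_t len = strlen(text);
    ssize_t n;
    int rc;
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);

    if (fd == -1)
        return -1;

    n = write(fd, text, len);       /* always lands at the current end */
    rc = (n == (ssize_t)len) ? 0 : -1;

    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```

Two successive calls such as append_text("log.txt", "one\n") and append_text("log.txt", "two\n") leave both lines in the file, in that order, because each write is positioned at the end of the file.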

The O_CREAT flag can be used to cause the file to be created, if necessary. However, when combined with the O_EXCL flag, if the file already exists, the open(2) call returns an error. A special case of this is when flags O_CREAT and O_EXCL are used and the pathname given is a symbolic link. The call will fail even if the pathname resolved by the symbolic link does not exist. Another way to state this is that if the symbolic link exists, the open call treats this as if the file already exists and returns an error.
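The combination of the create and exclusive flags is commonly used to make sure that a file is newly created by the caller. A minimal sketch follows; the function name create_exclusive and the permission bits 0600 are assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create path only if it does not already exist.
 * Returns an open descriptor on success. Returns -1 on failure; errno is
 * EEXIST when the path (or a symbolic link at that path) already exists.
 */
int create_exclusive(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

A caller would typically test the result for -1 and then compare errno against EEXIST to distinguish "already exists" from other failures such as a missing directory or insufficient permission.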

When opening a file in order to overwrite it, you can specify the O_TRUNC flag. This causes the file to be emptied prior to open(2) returning successfully. Any prior content of the file is lost.

Flags O_SHLOCK and O_EXLOCK are permitted on FreeBSD 3.4 Release and cause certain flock(2) semantics to be applied. Chapter 5, "File Locking," will cover the topic of locking files under UNIX.

Closing Files Automatically

All files are closed when the current process terminates. However, by default they remain open across calls to the execve(2) function. If you need the open file descriptor to close prior to executing a new program (with execve(2)), then you should apply a call to fcntl(2) using the F_SETFD operation.

#include <fcntl.h>

int fcntl(int fd, int cmd, ...);

To change a file descriptor given by variable fd to close automatically before another executable is started by execve(2), perform the following:

int fd;                                /* Open file descriptor */
int b0;                                /* Original setting */

if ( (b0 = fcntl(fd,F_GETFD)) == -1 )  /* Get original setting */
    /* Error handling... */

if ( fcntl(fd,F_SETFD,1) == -1 )       /* Set the flag TRUE */
    /* Error handling... */

Here both the fetching of the current setting and the setting of the close-on-exec flag are shown. The example assumes that the flag is the least significant bit. POSIX defines the FD_CLOEXEC macro in <fcntl.h> to identify this bit, and portable code should use that macro rather than the literal value 1. SGI's IRIX 6.5, for example, documents FD_CLOEXEC.
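The two fcntl(2) calls can be combined into one small routine that preserves whatever descriptor flags are already set. This is a sketch only; the function name set_close_on_exec is hypothetical, and it uses the POSIX FD_CLOEXEC macro rather than a literal bit value.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: mark fd to be closed automatically by execve(2).
 * Returns -1 on error and a value other than -1 on success.
 */
int set_close_on_exec(int fd)
{
    int flags = fcntl(fd, F_GETFD);     /* Get original setting */

    if (flags == -1)
        return -1;

    /* Keep the existing descriptor flags and add close-on-exec. */
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}
```

After a successful call, fcntl(fd, F_GETFD) & FD_CLOEXEC is nonzero for that descriptor.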

Opening Special Files

There is actually nothing unusual about opening a special file. You open it as you would any other file. For example, if you have permission to open a disk partition, your program can use the open(2) call to open it for reading and writing:

int fd;

fd = open("/dev/wd0s2f",O_RDWR);
if ( fd == -1 )
    /* Error handling... */

From this point on, this sample program would have access to the entire disk or disk partition, assuming that the open call succeeded. File systems have their special files protected so that normal users cannot open them this way. If they could, they could seriously corrupt the file system.

Tip

The open(2) and close(2) functions can return the error EINTR. It is easy to overlook this fact for the close(2) function. See Chapter 15, "Signals," for a discussion of this error code.


Working with Sockets

Sockets require special treatment. They are not opened with the normal open(2) call. Instead, sockets are created with the socket(2) or socketpair(2) call. Other socket function calls are used to establish socket addresses and other operating modes. Socket programming is outside the scope of this book.

It should be noted, however, that once a socket is created and a connection is established (at least for connection-oriented protocols), reading and writing to a socket can occur like any open file, with calls to read(2) and write(2). Sockets are like bi-directional pipes, and seeking is not permitted.

Duplicating File Descriptors

UNIX provides the capability to have one open file available through two (or more) separate file descriptors. Additionally, it is possible to take an open file descriptor and cause it to be available on a specific file unit number. If that number is already in use, dup2(2) closes it first.

The function synopses for dup(2) and dup2(2) are as follows:

#include <unistd.h>

int dup(int oldfd);

int dup2(int oldfd, int newfd);

In the case of dup(2), the returned file descriptor when successful is the lowest unused file unit number available in the current process. For dup2(2), however, the new file descriptor value is specified in the argument newfd. When dup2(2) returns successfully, the return value should match newfd.

Tip

On some UNIX platforms, the dup(2) and dup2(2) calls can return the error EINTR (known to be documented for SGI's IRIX 6.5). See Chapter 15 for a discussion of this error code.


One situation in which dup(2) is helpful is in opening FILE streams to work with an existing socket. The following example takes the socket s and creates one input stream rx and another tx stream for writing:

int s;                                 /* Open socket */
FILE *rx;                              /* Read stream */
FILE *tx;                              /* Write stream */

...
rx = fdopen(s,"r");                    /* Open stream for reading on s */
tx = fdopen(dup(s),"w");               /* Open stream for writing on s */

Did you spot the dup(2) call? Why is it necessary? The dup(2) call is necessary because when the fclose(3) call is later made to close the rx stream, it will also close the file descriptor (socket) s. The dup(2) call ensures that the tx stream will have its own file descriptor to use, regardless of whether stream rx is still open.

If the dup(2) were omitted from the example, the final data held in the buffers for tx would fail to be written to the socket when fclose(3) was called for tx (assuming rx has been closed first). The reason is that the underlying file descriptor will already have been closed. The dup(2) call solves an otherwise thorny problem.

Changing Standard Input

If you need to change your standard input, how is this accomplished? This may be necessary before running the sort(1) command, for example, since it processes the data presented on its standard input.

Assume that the input file to be sorted has been opened on unit 3 and held in variable fd. You can place this open file on standard input as follows:

int fd;                                /* Open input file for sort(1) */

close(0);                              /* Close my standard input */
if ( dup2(fd,0) == -1 )                /* Make fd available on 0 */
    /* Error handling... */
close(fd);                             /* This fd is no longer required */

The basic principle here is that once you close unit 0 (standard input), you can make the file that is open on unit 3 available as unit 0 by calling dup2(2). Once you have accomplished that, you can close unit 3, since it is not needed any longer.

You can apply this principle for standard output, standard error, or any other file unit you would like to control.

Warning

Note that the example avoided testing for errors for close(2), which should be done. Test for the error EINTR, and retry the close(2) call if the EINTR error occurs.

