File Operations

A few of the file operations have different prototypes in 2.1 than they had in 2.0. This is mainly due to the upcoming need to handle files whose size can’t fit in 32 bits. The differences are handled by the header sysdep-2.1.h, which defines a few pseudo-types according to the kernel version being used. The only serious innovation introduced in the file operations is the poll method, which replaces select with a completely different implementation.
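To see why the wider types matter, consider the following minimal sketch. It only checks whether an offset fits in a signed 32-bit quantity, which is what int is on the most common platforms; the function name is hypothetical and is not part of any kernel header.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if "offset" can be represented by a
 * signed 32-bit value, 0 otherwise. A method returning a 32-bit int
 * can't report positions outside this range.
 */
int sk_fits_in_32_bits(long long offset)
{
    return offset >= INT32_MIN && offset <= INT32_MAX;
}
```

A position 3 GB into a file, for instance, fails the check, while it is easily represented by the long long returned by llseek.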

Prototype Differences

Four file operations feature a new prototype; they are:

long long (*llseek) (struct inode *, struct file *, long long, int);
long (*read) (struct inode *, struct file *, char *, unsigned long);
long (*write) (struct inode *, struct file *, const char *,
               unsigned long);
int (*release) (struct inode *, struct file *);

The 2.0 counterparts were:

int (*lseek) (struct inode *, struct file *, off_t, int);
int (*read) (struct inode *, struct file *, char *, int);
int (*write) (struct inode *, struct file *, const char *, int);
void (*release) (struct inode *, struct file *);

The differences, as you can see, lie in the return types (which now allow for a greater range) and in the types of the count and offset arguments. The header sysdep-2.1.h handles the differences by defining the following macros:

read_write_t

This macro expands to the type of the count argument and the return value of read and write.

lseek_t

This macro expands to the return value’s type in llseek. The change in the method’s name (from lseek to llseek) is not a problem, as you won’t usually assign the field by name in your file_operations, but will rather declare a static structure.

lseek_off_t

This macro expands to the type of the offset argument to lseek.

release_t

The return value of the release method; either void or int.

release_return(int return_value);

This macro can be used to return from the release method. Its argument is used to return an error code: 0 for success and a negative value for failure. With kernels older than 2.1.31, the macro just expands to return, as the method returns void.

Using the previous macros, the prototypes of a portable driver are:

lseek_t my_lseek(struct inode *, struct file *, lseek_off_t, int);
read_write_t my_read(struct inode *, struct file *, char *, count_t);
read_write_t my_write(struct inode *, struct file *, const char *,
                      count_t);
release_t my_release(struct inode *, struct file *);
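As an illustration of how release_t and release_return fit together, here is a minimal sketch that compiles outside the kernel. The SKETCH_ symbols, the sketch_ names, and the open counter are hypothetical stand-ins; the real definitions live in sysdep-2.1.h and may well be written differently. Only the 2.1.31 threshold comes from the description above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's version macros. */
#define SKETCH_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#ifndef SKETCH_KERNEL_VERSION
#  define SKETCH_KERNEL_VERSION SKETCH_VERSION(2, 1, 31)
#endif

struct inode;  /* opaque here; the real types come from kernel headers */
struct file;

#if SKETCH_KERNEL_VERSION < SKETCH_VERSION(2, 1, 31)
#  define release_t void
#  define release_return(x) return        /* the method returns void */
#else
#  define release_t int
#  define release_return(x) return (x)    /* 0 or a negative error code */
#endif

/* Hypothetical usage count, just to give the method something to do. */
int sketch_open_count = 1;

release_t sketch_release(struct inode *inode, struct file *filp)
{
    (void)inode;
    (void)filp;
    sketch_open_count--;
    release_return(0);
}
```

The body of the method is the same for both kernel families; only the macros change their expansion.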

The poll Method

Version 2.1.23 introduced the poll system call, which is the System V counterpart of select (which was introduced in BSD Unix). Unfortunately, it is not possible to implement poll functionality on top of a select device method, so the whole implementation was replaced with a different one, which serves as a back-end to both select and poll.

With current versions of the kernel, the device method in file_operations is called poll like the system call, because its internals resemble the system call. The prototype of the method is:

unsigned int (*poll) (struct file *, poll_table *);

The device-specific implementation in the driver should perform two tasks:

  • Queue the current process in any wait queue that may awaken it in the future. Usually, this means queueing the process in both the input and the output queues. The function poll_wait is used for this purpose and works exactly like select_wait (see Section 5.3 in Chapter 5 for details).

  • Build a bitmask describing the status of the device and return it to the caller. The values of the bits are platform-specific and are defined in <linux/poll.h>, which must be included by the driver.

Before describing the individual bits of the bitmask, I’d better show what a typical implementation looks like. The following function is part of v2.1/scull/pipe.c and is the implementation of the poll method for /dev/scullpipe, whose internals were described in Chapter 5:

unsigned int scull_p_poll (struct file *filp, poll_table *wait)
{
    Scull_Pipe *dev = filp->private_data;
    unsigned int mask = 0;

    /*
     * The buffer is circular: "left" is 0 when it is empty and 1 when
     * it is full (wp sits right behind rp).
     */
    int left = (dev->rp + dev->buffersize - dev->wp) % dev->buffersize;

    poll_wait(&dev->inq,  wait);
    poll_wait(&dev->outq, wait);
    if (dev->rp != dev->wp) mask |= POLLIN | POLLRDNORM;  /* readable */
    if (left != 1)          mask |= POLLOUT | POLLWRNORM; /* writable */

    return mask;
}

As you can see, the code is simple, and simpler than the corresponding select method. As far as select is concerned, the status bits count as either “readable,” “writable,” or “exception occurred” (the third condition of select).
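The mask computation can be tried outside the kernel with a sketch like the following. It assumes the circular-buffer convention in which the buffer is empty when rp equals wp and full when wp sits one slot behind rp. The SK_ bit values and the sk_ names are hypothetical stand-ins; the real values are platform-specific and come from <linux/poll.h>.

```c
#include <stddef.h>

/* Stand-in bit values, for illustration only. */
enum {
    SK_POLLIN     = 0x001,
    SK_POLLOUT    = 0x004,
    SK_POLLRDNORM = 0x040,
    SK_POLLWRNORM = 0x100
};

/* Hypothetical, stripped-down version of the pipe device. */
struct sk_pipe {
    size_t rp;          /* read position */
    size_t wp;          /* write position */
    size_t buffersize;  /* size of the circular buffer */
};

unsigned int sk_pipe_mask(const struct sk_pipe *dev)
{
    unsigned int mask = 0;
    /* 0 means the buffer is empty, 1 means it is full */
    size_t left = (dev->rp + dev->buffersize - dev->wp) % dev->buffersize;

    if (dev->rp != dev->wp) mask |= SK_POLLIN | SK_POLLRDNORM;  /* readable */
    if (left != 1)          mask |= SK_POLLOUT | SK_POLLWRNORM; /* writable */
    return mask;
}
```

With these assumptions an empty buffer is reported as writable only, a full buffer as readable only, and a partially filled one as both.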

The full list of poll bits is shown below. “Input” bits are listed first, “output” bits follow, and the single “exception” bit comes at the end.

POLLIN

This bit must be set if the device can be read without blocking.

POLLRDNORM

This bit must be set if “normal” data is available for reading. A readable device returns (POLLIN | POLLRDNORM).

POLLRDBAND

This bit is currently unused in the kernel sources. Unix System V uses the bit to report that data of non-zero priority is available for reading. The concept of data priority is related to the “Streams” package.

POLLHUP

When a process reading this device sees end-of-file, the driver must set POLLHUP (hang-up). A process calling select will be told that the device is readable, as dictated by the select functionality.

POLLERR

An error condition has occurred on the device. When poll is invoked by the select system call, the device is reported as both readable and writable, as either read or write will return an error code without blocking.

POLLOUT

This bit is set in the return value if the device can be written to without blocking.

POLLWRNORM

This bit has the same meaning as POLLOUT, and sometimes it actually is the same number. A writable device returns (POLLOUT | POLLWRNORM).

POLLWRBAND

Like POLLRDBAND, this bit means that data with non-zero priority can be written to the device. Only the “datagram” implementation of poll uses this bit, as a datagram can transmit “out of band data.” select reports that the device is writable.

POLLPRI

High-priority data (“out of band”) can be read without blocking. This bit causes select to report an exception condition on the file, because that is how select reports out-of-band data.
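The way select interprets the bits, as described in the list above, can be summarized by a sketch like the following. It is not the kernel's code: the SK_ values and names are hypothetical stand-ins, and the function encodes only what the descriptions above state (POLLRDBAND is left out because the list says nothing about how select treats it).

```c
/* Stand-in bit values, for illustration only; the real ones are
 * platform-specific and come from <linux/poll.h>. */
enum {
    SK_POLLIN     = 0x001,
    SK_POLLPRI    = 0x002,
    SK_POLLOUT    = 0x004,
    SK_POLLERR    = 0x008,
    SK_POLLHUP    = 0x010,
    SK_POLLRDNORM = 0x040,
    SK_POLLWRNORM = 0x100,
    SK_POLLWRBAND = 0x200
};

/* The three conditions select can report (hypothetical names). */
enum { SK_SEL_READ = 1, SK_SEL_WRITE = 2, SK_SEL_EXCEPT = 4 };

unsigned int sk_select_view(unsigned int mask)
{
    unsigned int sets = 0;

    /* end-of-file and errors count as readable: read won't block */
    if (mask & (SK_POLLIN | SK_POLLRDNORM | SK_POLLHUP | SK_POLLERR))
        sets |= SK_SEL_READ;
    /* errors count as writable too: write returns at once */
    if (mask & (SK_POLLOUT | SK_POLLWRNORM | SK_POLLWRBAND | SK_POLLERR))
        sets |= SK_SEL_WRITE;
    /* out-of-band data is the exception condition */
    if (mask & SK_POLLPRI)
        sets |= SK_SEL_EXCEPT;
    return sets;
}
```

Note how POLLERR lands in both the readable and the writable set, while POLLPRI is the only bit that raises the exception condition.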

The main problem with poll is that it has nothing to do with the select method used by 2.0 kernels. The best way to deal with the difference, therefore, is to use conditional compilation to compile the proper function, while including both of them in the source file.

The header sysdep-2.1.h defines the symbol __USE_OLD_SELECT__ if the current version supports select instead of poll. This relieves you of the need to refer to LINUX_VERSION_CODE in the source file. The sample drivers in the v2.1 directory use code similar to the following:

#include "sysdep-2.1.h"

#ifdef __USE_OLD_SELECT__
int sample_poll(struct inode *inode, struct file *filp,
               int mode, select_table *table)
{
/* ... 2.0 (select) implementation ... */
}
#else
unsigned int sample_poll (struct file *filp, poll_table *wait)
{
/* ... 2.1 (poll) implementation ... */
}
#endif

The two functions are called with the same name because sample_poll is referenced in the sample_fops structure, where the poll file operation replaced the select method in place.
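The effect of sharing one name can be shown with a sketch that compiles outside the kernel. Everything prefixed with sk_ or SK_ is a hypothetical stand-in: SK_USE_OLD_SELECT plays the role of __USE_OLD_SELECT__, and sk_file_operations is a one-field imitation of the real structure, whose slot changes type together with the method.

```c
#include <stddef.h>

struct inode;   /* opaque stand-ins for the kernel types */
struct file;
typedef struct sk_select_table sk_select_table;
typedef struct sk_poll_table sk_poll_table;

enum { SK_POLLIN = 0x001, SK_POLLRDNORM = 0x040 };  /* stand-in values */

#ifdef SK_USE_OLD_SELECT
typedef int (*sk_poll_slot)(struct inode *, struct file *,
                            int, sk_select_table *);

int sample_poll(struct inode *inode, struct file *filp,
                int mode, sk_select_table *table)
{
    (void)inode; (void)filp; (void)mode; (void)table;
    return 1;   /* select style: the requested condition holds */
}
#else
typedef unsigned int (*sk_poll_slot)(struct file *, sk_poll_table *);

unsigned int sample_poll(struct file *filp, sk_poll_table *wait)
{
    (void)filp; (void)wait;
    return SK_POLLIN | SK_POLLRDNORM;   /* poll style: a status bitmask */
}
#endif

/* The same slot holds whichever method the kernel version expects. */
struct sk_file_operations {
    sk_poll_slot poll;
};

struct sk_file_operations sample_fops = { sample_poll };
```

The initializer of sample_fops needs no conditional of its own, which is the point of reusing the name.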
