A few of the file operations have different prototypes in 2.1 than they had in 2.0. This is mainly due to the upcoming need to handle files whose size can’t fit in 32 bits. The differences are handled by the header sysdep-2.1.h, which defines a few pseudo-types according to the kernel version being used. The only serious innovation introduced in the file operations is the poll method, which replaces select with a completely different implementation.
Four file operations feature a new prototype; they are:
    long long (*llseek) (struct inode *, struct file *, long long, int);
    long (*read) (struct inode *, struct file *, char *, unsigned long);
    long (*write) (struct inode *, struct file *, const char *, unsigned long);
    int (*release) (struct inode *, struct file *);
The corresponding prototypes in 2.0 were:
    int (*lseek) (struct inode *, struct file *, off_t, int);
    int (*read) (struct inode *, struct file *, char *, int);
    int (*write) (struct inode *, struct file *, const char *, int);
    void (*release) (struct inode *, struct file *);
The difference, as you see, lies in their return values (which allow for a greater range) and in the count and offset arguments. The header sysdep-2.1.h handles the differences by defining the following macros:
read_write_t
This macro expands to the type of the count argument and the return value of read and write.
lseek_t
This macro expands to the return value’s type in llseek. The change in the method’s name (from lseek to llseek) is not a problem, as you won’t usually assign the field by name in your file_operations, but will rather declare a static structure.
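lseek_off_t
This macro expands to the type of the offset argument of llseek.
release_t
This macro expands to the return type of the release method, which is either void or int depending on the kernel version.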
release_return(int return_value);
This macro can be used to return from the release method. Its argument is used to return an error code: 0 for success and a negative value for failure. With kernels older than 2.1.31, the macro just expands to return, as the method returns void.
Using the previous macros, the prototypes of a portable driver are:
    lseek_t my_lseek(struct inode *, struct file *, lseek_off_t, int);
    read_write_t my_read(struct inode *, struct file *, char *, count_t);
    read_write_t my_write(struct inode *, struct file *, const char *, count_t);
    release_t my_release(struct inode *, struct file *);
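As a rough illustration of how a compatibility header can select these types, the sketch below switches on the kernel version. It is not the actual content of sysdep-2.1.h: the VERSION_CODE helper, the stand-in value of LINUX_VERSION_CODE, and the 2.1.0 cutoff for the read, write, and seek types are assumptions made for the example. Only the 2.1.31 boundary for release comes from the text. count_t is defined here because the portable prototypes use it.

```c
#include <sys/types.h>  /* off_t */

/* Helper that packs a version number into one integer (assumed name). */
#define VERSION_CODE(v, p, s) (((v) << 16) + ((p) << 8) + (s))

/* Stand-in so the sketch compiles outside a kernel tree (assumption). */
#ifndef LINUX_VERSION_CODE
#  define LINUX_VERSION_CODE VERSION_CODE(2, 1, 43)
#endif

#if LINUX_VERSION_CODE < VERSION_CODE(2, 1, 0)   /* cutoff is assumed */
#  define read_write_t int
#  define count_t      int
#  define lseek_t      int
#  define lseek_off_t  off_t
#else
#  define read_write_t long
#  define count_t      unsigned long
#  define lseek_t      long long
#  define lseek_off_t  long long
#endif

#if LINUX_VERSION_CODE < VERSION_CODE(2, 1, 31)
#  define release_t         void
#  define release_return(x) return       /* the argument is discarded */
#else
#  define release_t         int
#  define release_return(x) return (x)
#endif

struct inode;
struct file;

/* A release method written once for both interfaces. */
release_t my_release(struct inode *inode, struct file *filp)
{
    (void)inode;
    (void)filp;
    release_return(0);
}
```

With the stand-in version above, my_release is compiled as a function returning int; building with an older LINUX_VERSION_CODE turns the same source into a void function.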
Version 2.1.23 introduced the poll system call, which is the System V counterpart of select (which was introduced in BSD Unix). Unfortunately, it is not possible to implement poll functionality on top of a select device method, so the whole implementation was replaced with a different one, which serves as a back-end to both select and poll.
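From the application side, the system call can be exercised with a few lines of user-space code. The following is a minimal sketch, not taken from the sample drivers; has_status is a hypothetical helper name. It queries one descriptor with a zero timeout, so it never blocks.

```c
#include <poll.h>

/*
 * Return 1 if the kernel currently reports status bit `bit`
 * (for example POLLIN or POLLOUT) for descriptor `fd`, 0 otherwise.
 * A timeout of 0 makes poll return immediately.
 */
static int has_status(int fd, short bit)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = bit;
    pfd.revents = 0;

    /* poll returns the number of descriptors with a non-zero revents */
    if (poll(&pfd, 1, 0) != 1)
        return 0;
    return (pfd.revents & bit) != 0;
}
```

For instance, has_status(fd, POLLIN) tells whether a read on fd would return without blocking, which is the same question select answers through its set of readable descriptors.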
With current versions of the kernel, the device method in file_operations is called poll like the system call, because its internals resemble the system call. The prototype of the method is:
    unsigned int (*poll) (struct file *, poll_table *);
The device-specific implementation in the driver should perform two tasks:
Queue the current process in any wait queue that may awaken it in the future. Usually, this means queueing the process in both the input and the output queues. The function poll_wait is used for this purpose and works exactly like select_wait (see Section 5.3 in Chapter 5, for details).
Build a bitmask describing the status of the device and return it to the caller. The values of the bits are platform-specific and are defined in <linux/poll.h>, which must be included by the driver.
Before describing the individual bits of the bitmask, I’d better show what a typical implementation looks like. The following function is part of v2.1/scull/pipe.c and is the implementation of the poll method for /dev/scullpipe, whose internals were described in Chapter 5:
    unsigned int scull_p_poll (struct file *filp, poll_table *wait)
    {
        Scull_Pipe *dev = filp->private_data;
        unsigned int mask = 0;
        /*
         * Distance from the write pointer to the read pointer:
         * 0 if the buffer is empty, 1 if it is completely full
         * (one byte is always left unused)
         */
        int left = (dev->rp + dev->buffersize - dev->wp) % dev->buffersize;

        /* queue the process on both wait queues */
        poll_wait(&dev->inq, wait);
        poll_wait(&dev->outq, wait);
        if (dev->rp != dev->wp) mask |= POLLIN | POLLRDNORM;  /* readable */
        if (left != 1)          mask |= POLLOUT | POLLWRNORM; /* writable */
        return mask;
    }
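The pointer arithmetic is easier to check outside the kernel. The following user-space model is only a sketch: struct pipe_model and its helper functions are hypothetical names, offsets stand in for the driver's pointers, and the model assumes that one slot of the circular buffer is always left unused, so that equal read and write positions can only mean an empty buffer.

```c
/* Expose POLLRDNORM and POLLWRNORM even under strict -std= modes. */
#ifndef _XOPEN_SOURCE
#  define _XOPEN_SOURCE 700
#endif
#include <poll.h>
#include <stddef.h>

/* User-space stand-in for the device structure (hypothetical). */
struct pipe_model {
    size_t rp;          /* read position */
    size_t wp;          /* write position */
    size_t buffersize;  /* size of the circular buffer */
};

/* Bytes waiting to be read. */
static size_t bytes_used(const struct pipe_model *p)
{
    return (p->wp + p->buffersize - p->rp) % p->buffersize;
}

/* Bytes that can still be written; one slot is never used (assumption). */
static size_t bytes_free(const struct pipe_model *p)
{
    return p->buffersize - 1 - bytes_used(p);
}

/* Build the same kind of bitmask a poll method returns. */
static unsigned int model_poll_mask(const struct pipe_model *p)
{
    unsigned int mask = 0;

    if (bytes_used(p) > 0)
        mask |= POLLIN | POLLRDNORM;   /* readable */
    if (bytes_free(p) > 0)
        mask |= POLLOUT | POLLWRNORM;  /* writable */
    return mask;
}
```

An empty model reports only the output bits, a full one only the input bits, and a partially filled one reports both.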
As you see, the code is pretty easy. It’s easier than the corresponding select method. As far as select is concerned, the status bits count as either “readable,” “writable,” or “exception occurred” (the third condition of select).
The full list of poll bits is shown below. “Input” bits are listed first, “output” bits follow, and the single “exception” bit comes at the end.
POLLIN
This bit must be set if the device can be read without blocking.
POLLRDNORM
This bit must be set if “normal” data is available for reading. A readable device returns (POLLIN | POLLRDNORM).
POLLRDBAND
This bit is currently unused in the kernel sources. Unix System V uses the bit to report that data of non-zero priority is available for reading. The concept of data priority is related to the “Streams” package.
POLLHUP
When a process reading this device sees end-of-file, the driver must set POLLHUP (hang-up). A process calling select will be told that the device is readable, as dictated by the select functionality.
POLLERR
An error condition has occurred on the device. When poll is invoked by the select system call, the device is reported as both readable and writable, as either read or write will return an error code without blocking.
POLLOUT
This bit is set in the return value if the device can be written to without blocking.
POLLWRNORM
This bit has the same meaning as POLLOUT, and sometimes it actually is the same number. A writable device returns (POLLOUT | POLLWRNORM).
POLLWRBAND
Like POLLRDBAND, this bit means that data with non-zero priority can be written to the device. Only the “datagram” implementation of poll uses this bit, as a datagram can transmit “out of band data.” select reports that the device is writable.
POLLPRI
High priority data (“out of band”) can be read without blocking. This bit causes select to report that an exception condition occurred on the file because select reports out-of-band data as an exception condition.
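The way select interprets the bitmask can be summarized in a few lines of C. This is a user-space sketch of the rules given in the list, not the kernel's own code: the SEL_ constants and the function name are hypothetical, and grouping POLLRDBAND with the input bits follows the ordering of the list rather than an explicit statement in the text.

```c
/* Expose the NORM and BAND bits even under strict -std= modes. */
#ifndef _XOPEN_SOURCE
#  define _XOPEN_SOURCE 700
#endif
#include <poll.h>

/* Hypothetical flags naming the three conditions select can report. */
#define SEL_READABLE  0x1u
#define SEL_WRITABLE  0x2u
#define SEL_EXCEPTION 0x4u

/* Translate a poll bitmask into the conditions select would report. */
static unsigned int select_conditions(unsigned int mask)
{
    unsigned int result = 0;

    /* input bits, plus hang-up and error, make the file readable */
    if (mask & (POLLIN | POLLRDNORM | POLLRDBAND | POLLHUP | POLLERR))
        result |= SEL_READABLE;
    /* output bits, plus error, make the file writable */
    if (mask & (POLLOUT | POLLWRNORM | POLLWRBAND | POLLERR))
        result |= SEL_WRITABLE;
    /* out-of-band data is the only exception condition */
    if (mask & POLLPRI)
        result |= SEL_EXCEPTION;
    return result;
}
```

Note how POLLERR lands in both the readable and the writable group, since either read or write returns the error code without blocking.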
The main problem with poll is that it has nothing to do with the select method used by 2.0 kernels. The best way to deal with the difference, therefore, is to use conditional compilation to compile the proper function, while including both of them in the source file.
The header sysdep-2.1.h defines the symbol __USE_OLD_SELECT__ if the current version supports select instead of poll. This relieves you of the need to refer to LINUX_VERSION_CODE in the source file. The sample drivers in the v2.1 directory use code similar to the following:
    #include "sysdep-2.1.h"

    #ifdef __USE_OLD_SELECT__
    int sample_poll(struct inode *inode, struct file *filp,
                    int mode, select_table *table)
    {
        /* ... 2.0 (select) implementation ... */
    }
    #else
    unsigned int sample_poll (struct file *filp, poll_table *wait)
    {
        /* ... 2.1 (poll) implementation ... */
    }
    #endif
The two functions are called with the same name because sample_poll is referenced in the sample_fops structure, where the poll file operation replaced the select method in place.
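The effect of replacing a method in place can be reproduced with a small user-space mock. Everything in this sketch is a hypothetical stand-in (struct mock_file, struct mock_fops, and the MOCK_USE_OLD_SELECT symbol); the real kernel structures have many more fields and different argument lists. It only shows why a positional initializer lets one source line serve both interfaces.

```c
/* Hypothetical stand-in for struct file. */
struct mock_file {
    int data_ready;
};

#ifdef MOCK_USE_OLD_SELECT
/* 2.0-style table: the slot is named select */
struct mock_fops {
    int (*select)(struct mock_file *filp, int mode);
};

static int sample_poll(struct mock_file *filp, int mode)
{
    (void)mode;
    return filp->data_ready;
}
#else
/* 2.1-style table: the slot at the same position is named poll */
struct mock_fops {
    unsigned int (*poll)(struct mock_file *filp);
};

static unsigned int sample_poll(struct mock_file *filp)
{
    return filp->data_ready ? 1u : 0u;
}
#endif

/*
 * Positional initializer: the name of the slot never appears, so this
 * line compiles unchanged in both configurations.
 */
static struct mock_fops sample_fops = { sample_poll };
```

Only the function body and its prototype sit inside the conditional; the declaration of the table stays outside it.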