Example 4-1 is a complete character driver that lets you manipulate a doubly linked list through its d_ioctl function. You can add or remove an item from the list, determine whether an item is on the list, or print every item on the list. Example 4-1 also contains some synchronization problems.
Take a quick look at this code and try to identify the synchronization problems.
Example 4-1. race.c
#include <sys/param.h>
#include <sys/module.h>
#include <sys/kernel.h>
#include <sys/systm.h>

#include <sys/conf.h>
#include <sys/uio.h>
#include <sys/malloc.h>
#include <sys/ioccom.h>
#include <sys/queue.h>
#include "race_ioctl.h"

MALLOC_DEFINE(M_RACE, "race", "race object");

struct race_softc {
	LIST_ENTRY(race_softc) list;
	int unit;
};

static LIST_HEAD(, race_softc) race_list =
    LIST_HEAD_INITIALIZER(&race_list);

static struct race_softc *	race_new(void);
static struct race_softc *	race_find(int unit);
static void			race_destroy(struct race_softc *sc);
static d_ioctl_t		race_ioctl;

static struct cdevsw race_cdevsw = {
	.d_version =	D_VERSION,
	.d_ioctl =	race_ioctl,
	.d_name =	RACE_NAME
};

static struct cdev *race_dev;

static int
race_ioctl(struct cdev *dev, u_long cmd, caddr_t data, int fflag,
    struct thread *td)
{
	struct race_softc *sc;
	int error = 0;

	switch (cmd) {
	case RACE_IOC_ATTACH:
		sc = race_new();
		*(int *)data = sc->unit;
		break;
	case RACE_IOC_DETACH:
		sc = race_find(*(int *)data);
		if (sc == NULL)
			return (ENOENT);
		race_destroy(sc);
		break;
	case RACE_IOC_QUERY:
		sc = race_find(*(int *)data);
		if (sc == NULL)
			return (ENOENT);
		break;
	case RACE_IOC_LIST:
		uprintf(" UNIT\n");
		LIST_FOREACH(sc, &race_list, list)
			uprintf(" %d\n", sc->unit);
		break;
	default:
		error = ENOTTY;
		break;
	}

	return (error);
}

static struct race_softc *
race_new(void)
{
	struct race_softc *sc;
	int unit, max = -1;

	LIST_FOREACH(sc, &race_list, list) {
		if (sc->unit > max)
			max = sc->unit;
	}
	unit = max + 1;

	sc = (struct race_softc *)malloc(sizeof(struct race_softc), M_RACE,
	    M_WAITOK | M_ZERO);
	sc->unit = unit;
	LIST_INSERT_HEAD(&race_list, sc, list);

	return (sc);
}

static struct race_softc *
race_find(int unit)
{
	struct race_softc *sc;

	LIST_FOREACH(sc, &race_list, list) {
		if (sc->unit == unit)
			break;
	}

	return (sc);
}

static void
race_destroy(struct race_softc *sc)
{
	LIST_REMOVE(sc, list);
	free(sc, M_RACE);
}

static int
race_modevent(module_t mod __unused, int event, void *arg __unused)
{
	int error = 0;

	switch (event) {
	case MOD_LOAD:
		race_dev = make_dev(&race_cdevsw, 0, UID_ROOT, GID_WHEEL,
		    0600, RACE_NAME);
		uprintf("Race driver loaded.\n");
		break;
	case MOD_UNLOAD:
		destroy_dev(race_dev);
		uprintf("Race driver unloaded.\n");
		break;
	case MOD_QUIESCE:
		if (!LIST_EMPTY(&race_list))
			error = EBUSY;
		break;
	default:
		error = EOPNOTSUPP;
		break;
	}

	return (error);
}

DEV_MODULE(race, race_modevent, NULL);
Before I identify Example 4-1’s synchronization problems, let’s walk through it. Example 4-1 begins by defining and initializing a doubly linked list of race_softc structures named race_list. Each race_softc structure contains a (unique) unit number and a structure that maintains a pointer to the previous and next race_softc structure in race_list.
Next, Example 4-1’s character device switch table is defined. The constant RACE_NAME is defined in the race_ioctl.h header as follows:
#define RACE_NAME "race"
Note how Example 4-1’s character device switch table doesn’t define d_open and d_close. Recall, from Chapter 1, that if a d_foo function is undefined, the corresponding operation is unsupported. However, d_open and d_close are unique; when they’re undefined, the kernel will automatically define them as follows:
int
nullop(void)
{
	return (0);
}
This ensures that every registered character device can be opened and closed.
Drivers commonly forgo defining a d_open and d_close function when they don’t need to prepare their devices for I/O, as is the case in Example 4-1.
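The default-filling behavior described above can be sketched in userland. This is only a model: struct toy_cdevsw, toy_nullop, toy_enodev, and toy_fill_defaults are hypothetical names, and returning ENODEV for an unsupported operation is an assumption made for illustration, not a statement about the kernel's exact code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a character device switch table. */
struct toy_cdevsw {
	int (*d_open)(void);
	int (*d_close)(void);
	int (*d_read)(void);
};

/* Mirrors the nullop shown above: it always succeeds. */
static int
toy_nullop(void)
{
	return (0);
}

/* Assumed stand-in for "this operation is unsupported". */
static int
toy_enodev(void)
{
	return (ENODEV);
}

/*
 * Fill in every entry the driver left undefined. Open and close get a
 * do-nothing success function; any other entry gets the "unsupported" stub.
 */
static void
toy_fill_defaults(struct toy_cdevsw *sw)
{
	assert(sw != NULL);

	if (sw->d_open == NULL)
		sw->d_open = toy_nullop;
	if (sw->d_close == NULL)
		sw->d_close = toy_nullop;
	if (sw->d_read == NULL)
		sw->d_read = toy_enodev;
}
```

With a table that defines nothing, calling toy_fill_defaults leaves d_open and d_close returning 0 while d_read reports an error, which is the asymmetry the text describes.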
Next, Example 4-1’s d_ioctl function, named race_ioctl, is defined. This function is like the main function for Example 4-1. It uses three helper functions to do its work:
race_new
race_find
race_destroy
Before I describe race_ioctl, I’ll describe these helper functions.
The race_new function creates a new race_softc structure, which is then inserted at the head of race_list. Here is the function definition for race_new (again):
static struct race_softc *
race_new(void)
{
	struct race_softc *sc;
	int unit, max = -1;

	LIST_FOREACH(sc, &race_list, list) {
		if (sc->unit > max)
			max = sc->unit;
	}
	unit = max + 1;

	sc = (struct race_softc *)malloc(sizeof(struct race_softc), M_RACE,
	    M_WAITOK | M_ZERO);
	sc->unit = unit;
	LIST_INSERT_HEAD(&race_list, sc, list);

	return (sc);
}
This function first iterates through race_list looking for the largest unit number, which it stores in max. Next, unit is set to max plus one. Then race_new allocates memory for a new race_softc structure, assigns it the unit number unit, and inserts it at the head of race_list. Lastly, race_new returns a pointer to the new race_softc structure.
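To see the "largest unit plus one" rule in isolation, here is a minimal userland sketch of the same logic. It assumes a system whose <sys/queue.h> provides the LIST macros; toy_softc, toy_head, toy_new, and toy_destroy are hypothetical names, and calloc stands in for the kernel's malloc with M_WAITOK | M_ZERO.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Userland stand-in for race_softc. */
struct toy_softc {
	LIST_ENTRY(toy_softc) list;
	int unit;
};

LIST_HEAD(toy_head, toy_softc);

/* Same steps as race_new: find the largest unit, then insert max + 1. */
static struct toy_softc *
toy_new(struct toy_head *head)
{
	struct toy_softc *sc;
	int max = -1;

	LIST_FOREACH(sc, head, list) {
		if (sc->unit > max)
			max = sc->unit;
	}

	sc = calloc(1, sizeof(*sc));
	/* Userland shortcut: the kernel's M_WAITOK sleeps instead of failing. */
	assert(sc != NULL);
	sc->unit = max + 1;
	LIST_INSERT_HEAD(head, sc, list);

	return (sc);
}

/* Same steps as race_destroy: unlink, then free. */
static void
toy_destroy(struct toy_softc *sc)
{
	LIST_REMOVE(sc, list);
	free(sc);
}
```

One consequence worth noticing: if units 0, 1, and 2 exist and unit 1 is destroyed, the next call to toy_new hands out 3, not 1. Gaps are never reused, because only the maximum is consulted.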
The race_find function takes a unit number and finds the associated race_softc structure on race_list.
static struct race_softc *
race_find(int unit)
{
	struct race_softc *sc;

	LIST_FOREACH(sc, &race_list, list) {
		if (sc->unit == unit)
			break;
	}

	return (sc);
}
If race_find is successful, a pointer to the race_softc structure is returned; otherwise, NULL is returned.
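It may not be obvious where the NULL comes from, since race_find never assigns it. LIST_FOREACH stops when its iteration variable becomes NULL, so a loop that runs to completion without hitting break leaves sc equal to NULL. The following userland sketch (hypothetical toy_ names, assuming <sys/queue.h> is available) shows the same pattern:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Userland stand-in for race_softc. */
struct toy_softc {
	LIST_ENTRY(toy_softc) list;
	int unit;
};

LIST_HEAD(toy_head, toy_softc);

/*
 * Same shape as race_find. When no entry matches, the loop ends because
 * sc has walked off the end of the list, so sc is NULL at the return.
 */
static struct toy_softc *
toy_find(struct toy_head *head, int unit)
{
	struct toy_softc *sc;

	assert(head != NULL);

	LIST_FOREACH(sc, head, list) {
		if (sc->unit == unit)
			break;
	}

	return (sc);
}
```

Searching an empty list, or searching for a unit that was never inserted, returns NULL for the same reason.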
The race_destroy function destroys a race_softc structure on race_list. Here is its function definition (again):
static void
race_destroy(struct race_softc *sc)
{
	LIST_REMOVE(sc, list);
	free(sc, M_RACE);
}
This function takes a pointer to a race_softc structure and removes that structure from race_list. Then it frees the allocated memory for that structure.
Before I walk through race_ioctl, I need to explain its ioctl commands, which are defined in race_ioctl.h:
#define RACE_IOC_ATTACH		_IOR('R', 0, int)
#define RACE_IOC_DETACH		_IOW('R', 1, int)
#define RACE_IOC_QUERY		_IOW('R', 2, int)
#define RACE_IOC_LIST		_IO('R', 3)
As you can see, three of race_ioctl’s ioctl commands transfer an integer value. As you’ll see, this integer value is a unit number.
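The macro names encode the direction of that transfer from the caller's point of view: _IOR means the caller reads a value back from the driver, _IOW means the caller writes a value to the driver, and _IO carries no data. The sketch below is a userland model of that idea only. The enum, toy_cmd_direction, and toy_transfer are hypothetical, and the real copying is done by the kernel's ioctl path rather than by anything resembling this code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>

#define RACE_IOC_ATTACH		_IOR('R', 0, int)
#define RACE_IOC_DETACH		_IOW('R', 1, int)
#define RACE_IOC_QUERY		_IOW('R', 2, int)
#define RACE_IOC_LIST		_IO('R', 3)

enum toy_dir {
	TOY_NO_DATA,
	TOY_KERNEL_TO_USER,
	TOY_USER_TO_KERNEL,
	TOY_UNKNOWN
};

/* Classify each command by the direction its macro implies. */
static enum toy_dir
toy_cmd_direction(unsigned long cmd)
{
	switch (cmd) {
	case RACE_IOC_ATTACH:		/* _IOR: unit number comes back */
		return (TOY_KERNEL_TO_USER);
	case RACE_IOC_DETACH:		/* _IOW: unit number goes in */
	case RACE_IOC_QUERY:
		return (TOY_USER_TO_KERNEL);
	case RACE_IOC_LIST:		/* _IO: nothing is transferred */
		return (TOY_NO_DATA);
	default:
		return (TOY_UNKNOWN);
	}
}

/* Model of one call: move the int in the direction the command implies. */
static void
toy_transfer(unsigned long cmd, int *user_buf, int *kernel_buf)
{
	assert(user_buf != NULL && kernel_buf != NULL);

	switch (toy_cmd_direction(cmd)) {
	case TOY_KERNEL_TO_USER:
		*user_buf = *kernel_buf;
		break;
	case TOY_USER_TO_KERNEL:
		*kernel_buf = *user_buf;
		break;
	default:
		break;
	}
}
```

This matches how the commands are used: RACE_IOC_ATTACH hands the new unit number back to the caller, while RACE_IOC_DETACH and RACE_IOC_QUERY receive a unit number from the caller.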
Here is the function definition for race_ioctl (again):
static int
race_ioctl(struct cdev *dev, u_long cmd, caddr_t data, int fflag,
    struct thread *td)
{
	struct race_softc *sc;
	int error = 0;

	switch (cmd) {
	case RACE_IOC_ATTACH:
		sc = race_new();
		*(int *)data = sc->unit;
		break;
	case RACE_IOC_DETACH:
		sc = race_find(*(int *)data);
		if (sc == NULL)
			return (ENOENT);
		race_destroy(sc);
		break;
	case RACE_IOC_QUERY:
		sc = race_find(*(int *)data);
		if (sc == NULL)
			return (ENOENT);
		break;
	case RACE_IOC_LIST:
		uprintf(" UNIT\n");
		LIST_FOREACH(sc, &race_list, list)
			uprintf(" %d\n", sc->unit);
		break;
	default:
		error = ENOTTY;
		break;
	}

	return (error);
}
This function can perform one of four ioctl-based operations. The first, RACE_IOC_ATTACH, creates a new race_softc structure, which is then inserted at the head of race_list. Afterward, the unit number of the new race_softc structure is returned.
The second operation, RACE_IOC_DETACH, removes a user-specified race_softc structure from race_list.
The third operation, RACE_IOC_QUERY, determines whether a user-specified race_softc structure is on race_list.
Lastly, the fourth operation, RACE_IOC_LIST, prints the unit number of every race_softc structure on race_list.
The race_modevent function is the module event handler for Example 4-1. Here is its function definition (again):
static int
race_modevent(module_t mod __unused, int event, void *arg __unused)
{
	int error = 0;

	switch (event) {
	case MOD_LOAD:
		race_dev = make_dev(&race_cdevsw, 0, UID_ROOT, GID_WHEEL,
		    0600, RACE_NAME);
		uprintf("Race driver loaded.\n");
		break;
	case MOD_UNLOAD:
		destroy_dev(race_dev);
		uprintf("Race driver unloaded.\n");
		break;
	case MOD_QUIESCE:
		if (!LIST_EMPTY(&race_list))
			error = EBUSY;
		break;
	default:
		error = EOPNOTSUPP;
		break;
	}

	return (error);
}
As you can see, this function includes a new case: MOD_QUIESCE.
Because MOD_LOAD and MOD_UNLOAD are extremely rudimentary and because you’ve seen similar code elsewhere, I’ll omit discussing them.
When one issues the kldunload(8) command, MOD_QUIESCE is run before MOD_UNLOAD. If MOD_QUIESCE returns an error, MOD_UNLOAD does not get executed. In other words, MOD_QUIESCE verifies that it is safe to unload your module.
The kldunload -f command ignores every error returned by MOD_QUIESCE. So you can always unload a module, but it may not be the best idea.
Here, MOD_QUIESCE guarantees that race_list is empty (before Example 4-1 is unloaded). This is done to prevent memory leaks from any potentially unclaimed race_softc structures.
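The quiesce-then-unload ordering, including the effect of -f, can be modeled in a few lines of userland C. Everything here is hypothetical (struct toy_module, toy_quiesce, toy_kldunload); it sketches only the decision logic described above and is not how the kernel linker is implemented.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a loaded module; not a kernel structure. */
struct toy_module {
	int nitems;	/* stand-in for the number of entries on race_list */
	bool loaded;
};

/* Plays the role of MOD_QUIESCE: refuse while items remain. */
static int
toy_quiesce(const struct toy_module *m)
{
	return (m->nitems != 0 ? EBUSY : 0);
}

/*
 * Plays the role of kldunload: quiesce first, and only skip a quiesce
 * error when the caller forces the unload (the -f case).
 */
static int
toy_kldunload(struct toy_module *m, bool force)
{
	int error;

	assert(m->loaded);

	error = toy_quiesce(m);
	if (error != 0 && !force)
		return (error);

	m->loaded = false;	/* plays the role of MOD_UNLOAD */
	return (0);
}
```

With one item on the list, an unforced unload returns EBUSY and the module stays loaded; a forced unload succeeds anyway and, in the real driver, would leave that item's memory behind.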
Now that we’ve walked through Example 4-1, let’s run it and see if we can identify its synchronization problems.
Example 4-2 presents a command-line utility designed to invoke the race_ioctl function in Example 4-1:
Example 4-2. race_config.c
#include <sys/types.h>
#include <sys/ioctl.h>

#include <err.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include "../race/race_ioctl.h"

static enum {UNSET, ATTACH, DETACH, QUERY, LIST} action = UNSET;

/*
 * The usage statement: race_config -a | -d unit | -q unit | -l
 */

static void
usage()
{
	/*
	 * Arguments for this program are "either-or." For example,
	 * 'race_config -a' or 'race_config -d unit' are valid; however,
	 * 'race_config -a -d unit' is invalid.
	 */

	fprintf(stderr, "usage: race_config -a | -d unit | -q unit | -l\n");
	exit(1);
}

/*
 * This program manages the doubly linked list found in /dev/race. It
 * allows you to add or remove an item, query the existence of an item,
 * or print every item on the list.
 */

int
main(int argc, char *argv[])
{
	int ch, fd, i, unit;
	char *p;

	/*
	 * Parse the command line argument list to determine
	 * the correct course of action.
	 *
	 * -a: add an item.
	 * -d unit: detach an item.
	 * -q unit: query the existence of an item.
	 * -l: list every item.
	 */

	while ((ch = getopt(argc, argv, "ad:q:l")) != -1)
		switch (ch) {
		case 'a':
			if (action != UNSET)
				usage();
			action = ATTACH;
			break;
		case 'd':
			if (action != UNSET)
				usage();
			action = DETACH;
			unit = (int)strtol(optarg, &p, 10);
			if (*p)
				errx(1, "illegal unit -- %s", optarg);
			break;
		case 'q':
			if (action != UNSET)
				usage();
			action = QUERY;
			unit = (int)strtol(optarg, &p, 10);
			if (*p)
				errx(1, "illegal unit -- %s", optarg);
			break;
		case 'l':
			if (action != UNSET)
				usage();
			action = LIST;
			break;
		default:
			usage();
		}

	/*
	 * Perform the chosen action.
	 */

	if (action == ATTACH) {
		fd = open("/dev/" RACE_NAME, O_RDWR);
		if (fd < 0)
			err(1, "open(/dev/%s)", RACE_NAME);

		i = ioctl(fd, RACE_IOC_ATTACH, &unit);
		if (i < 0)
			err(1, "ioctl(/dev/%s)", RACE_NAME);
		printf("unit: %d\n", unit);

		close (fd);
	} else if (action == DETACH) {
		fd = open("/dev/" RACE_NAME, O_RDWR);
		if (fd < 0)
			err(1, "open(/dev/%s)", RACE_NAME);

		i = ioctl(fd, RACE_IOC_DETACH, &unit);
		if (i < 0)
			err(1, "ioctl(/dev/%s)", RACE_NAME);

		close (fd);
	} else if (action == QUERY) {
		fd = open("/dev/" RACE_NAME, O_RDWR);
		if (fd < 0)
			err(1, "open(/dev/%s)", RACE_NAME);

		i = ioctl(fd, RACE_IOC_QUERY, &unit);
		if (i < 0)
			err(1, "ioctl(/dev/%s)", RACE_NAME);

		close (fd);
	} else if (action == LIST) {
		fd = open("/dev/" RACE_NAME, O_RDWR);
		if (fd < 0)
			err(1, "open(/dev/%s)", RACE_NAME);

		i = ioctl(fd, RACE_IOC_LIST, NULL);
		if (i < 0)
			err(1, "ioctl(/dev/%s)", RACE_NAME);

		close (fd);
	} else
		usage();

	return (0);
}
Example 4-2 is a bog-standard command-line utility. As such, I won’t cover its program structure.
The following shows an example execution of Example 4-2:
$ sudo kldload ./race.ko
Race driver loaded.
$ sudo ./race_config -a & sudo ./race_config -a &
[1] 2378
[2] 2379
$ unit: 0
unit: 0
Above, two threads simultaneously add a race_softc structure to race_list, which results in two race_softc structures with the “unique” unit number 0. This is a problem, yes?
Here’s another example:
$ sudo kldload ./race.ko
Race driver loaded.
$ sudo ./race_config -a & sudo kldunload race.ko &
[1] 2648
[2] 2649
$ unit: 0
Race driver unloaded.
[1]-  Done                    sudo ./race_config -a
[2]+  Done                    sudo kldunload race.ko
$ dmesg | tail -n 1
Warning: memory type race leaked memory on destroy (1 allocations, 16 bytes leaked).
Above, one thread adds a race_softc structure to race_list while another thread unloads race.ko, which causes a memory leak. Recall that MOD_QUIESCE is supposed to prevent this, but it didn’t. Why?
The problem, in both examples, is a race condition. Race conditions are errors caused by a particular sequence of events, typically when the outcome depends on the order in which threads happen to run. In the first example, both threads check race_list simultaneously, discover that it is empty, and assign 0 as the unit number. In the second example, MOD_QUIESCE returns error-free, a race_softc structure is then added to race_list, and finally MOD_UNLOAD completes.
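The first interleaving can be replayed deterministically in userland by splitting the race_new logic into its two steps and running them in the bad order by hand. This is a single-threaded sketch under stated assumptions: no real threads are involved, the toy_ names are hypothetical, and <sys/queue.h> is assumed to be available.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Userland stand-in for race_softc. */
struct toy_softc {
	LIST_ENTRY(toy_softc) list;
	int unit;
};

LIST_HEAD(toy_head, toy_softc);

/* Step 1 of race_new: scan the list and choose max + 1. */
static int
toy_pick_unit(struct toy_head *head)
{
	struct toy_softc *sc;
	int max = -1;

	LIST_FOREACH(sc, head, list) {
		if (sc->unit > max)
			max = sc->unit;
	}

	return (max + 1);
}

/* Step 2 of race_new: allocate an entry and insert it at the head. */
static struct toy_softc *
toy_insert(struct toy_head *head, int unit)
{
	struct toy_softc *sc;

	sc = calloc(1, sizeof(*sc));
	assert(sc != NULL);
	sc->unit = unit;
	LIST_INSERT_HEAD(head, sc, list);

	return (sc);
}

/*
 * Replay the bad ordering: both callers finish step 1 before either
 * starts step 2. Returns how many entries ended up with unit 0.
 */
static int
toy_replay_attach_race(void)
{
	struct toy_head head;
	struct toy_softc *sc, *a, *b;
	int unit_a, unit_b, dups = 0;

	LIST_INIT(&head);

	unit_a = toy_pick_unit(&head);	/* caller A sees an empty list */
	unit_b = toy_pick_unit(&head);	/* caller B also sees it empty */
	a = toy_insert(&head, unit_a);
	b = toy_insert(&head, unit_b);

	LIST_FOREACH(sc, &head, list) {
		if (sc->unit == 0)
			dups++;
	}

	LIST_REMOVE(a, list);
	free(a);
	LIST_REMOVE(b, list);
	free(b);

	return (dups);
}
```

Calling toy_replay_attach_race returns 2: both entries carry unit 0. Nothing in either step is wrong on its own; the error comes entirely from the gap between choosing the unit number and inserting the entry.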
One characteristic of race conditions is that they’re hard to reproduce. Ergo, the results were doctored in the preceding examples. That is, I caused the threads to context switch at specific points to achieve the desired outcome. Under normal conditions, it would have taken literally millions of attempts before those race conditions would occur, and I didn’t want to spend that much time.