In this chapter
10.1 Introduction page 348
10.2 Signal Actions page 348
10.3 Standard C Signals: signal() and raise() page 349
10.4 Signal Handlers in Action page 353
10.5 The System V Release 3 Signal APIs: sigset() et al. page 365
10.6 POSIX Signals page 367
10.7 Signals for Interprocess Communication page 379
10.8 Important Special-Purpose Signals page 382
10.9 Signals Across fork() and exec() page 398
10.10 Summary page 399
Exercises page 401
This chapter covers the ins and outs of signals, an important but complicated part of the GNU/Linux API.
A signal is an indication that some event has happened, for example, an attempt to reference a memory address that isn’t part of your program’s address space, or when a user presses CTRL-C to stop your program (called generating an interrupt).
Your program can tell only that a particular signal has happened at least once. Generally, you can’t tell if the same signal has happened multiple times. You can distinguish one signal from another, and control the way in which your program reacts to different signals.
Signal handling mechanisms have evolved over time. As is the case with almost all such mechanisms, both the original and the newer APIs are standardized and available. However, of the fundamental APIs, signal handling displays possibly the broadest change; there’s a lot to get a handle on to be able to use the most capable APIs. As a result, this is perhaps the most difficult chapter in the book. We’ll do our best to make a coherent presentation, but it’ll help if you work your way through this chapter more carefully than usual.
Unlike most of the chapters in this book, our presentation here is historical, covering the APIs as they evolved, including some APIs that you should never use in new code. We do this because it simplifies the presentation, making it straightforward to understand why the POSIX sigaction() API supports all the facilities that it does.
Every signal (we provide a full list shortly) has a default action associated with it. POSIX terms this the signal’s disposition. This action is what the kernel does for the process when a particular signal arrives. The default actions vary:
Termination
The process is terminated.
Ignored
The signal is ignored.
Core dump
The process is terminated, and the kernel creates a core file (in the process’s current directory) containing the image of the running program at the time the signal arrived. The core dump can be used later with a debugger for examination of the state of the program (see Chapter 15, “Debugging,” page 567).
By default, GNU/Linux systems create files named core.pid, where pid is the process ID of the killed process. (This can be changed; see sysctl(8).) This naming lets you store multiple core files in the same directory, at the expense of the disk space involved.[1] Traditional Unix systems name the file core, and it’s up to you to save any core files for later reexamination if there’s a chance that more will be created in the same directory.
Stopped
The process is stopped.
The ISO C standard defines the original V7 signal management API and a new API for sending signals. You should use them for programs that have to work on non-POSIX systems, or for cases in which the functionality provided by the ISO C APIs is adequate.
You change a signal’s action with the signal() function. You can change the action to one of “ignore this signal,” “restore the system’s default action for this signal,” or “call my function with the signal number as a parameter when the signal occurs.”
A function you provide to deal with the signal is called a signal handler (or just a handler), and putting a handler in place is arranging to catch the signal.
With that introduction, let’s proceed to the APIs. The <signal.h> header file provides macro definitions for supported signals and declares the signal management function provided by Standard C:
#include <signal.h> ISO C
void (*signal(int signum, void (*func)(int)))(int);
This declaration for signal() is almost impossible to read. Thus, the GNU/Linux signal(2) manpage defines it this way:
typedef void (*sighandler_t)(int);
sighandler_t signal(int signum, sighandler_t handler);
Now it’s more intelligible. The type sighandler_t is a pointer to a function returning void, which accepts a single integer argument. This integer is the number of the arriving signal.
The signal() function accepts a signal number as its first argument and a pointer to a function (the new handler) as its second argument. If not a function pointer, the second argument may be either SIG_DFL, which means “restore the default action,” or SIG_IGN, which means “ignore the signal.”
signal() changes the action for signum and returns the previous action. (This allows you to later restore the previous action if you so desire.) The return value may also be SIG_ERR, which indicates that something went wrong. (Some signals can’t be caught or ignored; supplying a signal handler for them, or an invalid signum, generates this error return.) Table 10.1 lists the signals available under GNU/Linux, their numeric values, each one’s default action, the formal standard or modern operating system that defines them, and each one’s meaning.
Table 10.1. GNU/Linux signals
Name | Value | Default | Source | Meaning |
---|---|---|---|---|
SIGHUP | 1 | Term | POSIX | Hangup. |
SIGINT | 2 | Term | ISO C | Interrupt. |
SIGQUIT | 3 | Core | POSIX | Quit. |
SIGILL | 4 | Core | ISO C | Illegal instruction. |
SIGTRAP | 5 | Core | POSIX | Trace trap. |
SIGABRT | 6 | Core | ISO C | Abort. |
SIGIOT | 6 | Core | BSD | IOT trap. |
SIGBUS | 7 | Core | BSD | Bus error. |
SIGFPE | 8 | Core | ISO C | Floating-point exception. |
SIGKILL | 9 | Term | POSIX | Kill, unblockable. |
SIGUSR1 | 10 | Term | POSIX | User-defined signal 1. |
SIGSEGV | 11 | Core | ISO C | Segmentation violation. |
SIGUSR2 | 12 | Term | POSIX | User-defined signal 2. |
SIGPIPE | 13 | Term | POSIX | Broken pipe. |
SIGALRM | 14 | Term | POSIX | Alarm clock. |
SIGTERM | 15 | Term | ISO C | Termination. |
SIGSTKFLT | 16 | Term | Linux | Stack fault on a processor (unused). |
SIGCHLD | 17 | Ignr | POSIX | Child process status changed. |
SIGCLD | 17 | Ignr | System V | Same as SIGCHLD. |
SIGCONT | 18 | | POSIX | Continue if stopped. |
SIGSTOP | 19 | Stop | POSIX | Stop, unblockable. |
SIGTSTP | 20 | Stop | POSIX | Keyboard stop. |
SIGTTIN | 21 | Stop | POSIX | Background read from tty. |
SIGTTOU | 22 | Stop | POSIX | Background write to tty. |
SIGURG | 23 | Ignr | BSD | Urgent condition on socket. |
SIGXCPU | 24 | Core | BSD | CPU limit exceeded. |
SIGXFSZ | 25 | Core | BSD | File size limit exceeded. |
SIGVTALRM | 26 | Term | BSD | Virtual alarm clock. |
SIGPROF | 27 | Term | BSD | Profiling alarm clock. |
SIGWINCH | 28 | Ignr | BSD | Window size change. |
SIGIO | 29 | Term | BSD | I/O now possible. |
SIGPOLL | 29 | Term | System V | Pollable event occurred: same as SIGIO. |
SIGPWR | 30 | Term | System V | Power failure restart. |
SIGSYS | 31 | Core | POSIX | Bad system call. |
Key:
Core: Terminate the process and produce a core file.
Ignr: Ignore the signal.
Stop: Stop the process.
Term: Terminate the process.
Older versions of the Bourne shell (/bin/sh) associated traps, which are shell-level signal handlers, directly with signal numbers. Thus, the well-rounded Unix programmer needed to know not only the signal names for use from C code but also the corresponding signal numbers! POSIX requires the trap command to understand symbolic signal names (without the ’SIG’ prefix), so this is no longer necessary. However (mostly against our better judgment), we have provided the numbers in the interest of completeness and because you may one day have to deal with a pre-POSIX shell script or ancient C code that uses signal numbers directly.
For some of the newer signals, from 16 on up, the association between signal number and signal name isn’t necessarily the same across platforms! Check your system header files and manpages. Table 10.1 is correct for GNU/Linux.
Some systems also define other signals, such as SIGEMT, SIGLOST, and SIGINFO. The GNU/Linux signal(7) manpage provides a complete listing; if your program needs to handle signals not supported by GNU/Linux, the way to do it is with an #ifdef:
#ifdef SIGLOST
... handle SIGLOST here ...
#endif
With the exception of SIGSTKFLT, the signals listed in Table 10.1 are widely available and don’t need to be bracketed with #ifdef.
SIGKILL and SIGSTOP cannot be caught or ignored (or blocked, as described later in the chapter). They always perform the default action listed in Table 10.1.
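The return-value conventions of signal() can be sketched in a few lines. The two helper functions below, with_sigint_ignored() and cannot_change(), are hypothetical names of our own invention; the sketch assumes a POSIX system where SIGKILL and SIGSTOP are defined. It shows saving the previous action so that it can be restored, and checking for SIG_ERR when the action of a signal cannot be changed.

```c
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: ignore SIGINT while work() runs, then put back
   whatever action was in effect before.  Returns 0 on success, -1 if
   signal() reported an error. */
int with_sigint_ignored(void (*work)(void))
{
    void (*old_action)(int);

    old_action = signal(SIGINT, SIG_IGN);   /* returns the previous action */
    if (old_action == SIG_ERR)
        return -1;

    if (work != NULL)
        work();

    if (signal(SIGINT, old_action) == SIG_ERR)  /* restore previous action */
        return -1;
    return 0;
}

/* Hypothetical helper: returns 1 if signal() refuses to change the action
   for signum, 0 if it accepts (in which case the old action is restored). */
int cannot_change(int signum)
{
    void (*old_action)(int) = signal(signum, SIG_IGN);

    if (old_action == SIG_ERR)
        return 1;
    (void) signal(signum, old_action);
    return 0;
}
```

On a GNU/Linux system we would expect cannot_change(SIGKILL) and cannot_change(SIGSTOP) to report failure, while cannot_change(SIGINT) succeeds.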
You can use ’kill -l’ to see a list of supported signals. From one of our GNU/Linux systems:
$ kill -l
1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL
5) SIGTRAP 6) SIGABRT 7) SIGBUS 8) SIGFPE
9) SIGKILL 10) SIGUSR1 11) SIGSEGV 12) SIGUSR2
13) SIGPIPE 14) SIGALRM 15) SIGTERM 17) SIGCHLD
18) SIGCONT 19) SIGSTOP 20) SIGTSTP 21) SIGTTIN
22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ
26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO
30) SIGPWR 31) SIGSYS 32) SIGRTMIN 33) SIGRTMIN+1
34) SIGRTMIN+2 35) SIGRTMIN+3 36) SIGRTMIN+4 37) SIGRTMIN+5
38) SIGRTMIN+6 39) SIGRTMIN+7 40) SIGRTMIN+8 41) SIGRTMIN+9
42) SIGRTMIN+10 43) SIGRTMIN+11 44) SIGRTMIN+12 45) SIGRTMIN+13
46) SIGRTMIN+14 47) SIGRTMIN+15 48) SIGRTMAX-15 49) SIGRTMAX-14
50) SIGRTMAX-13 51) SIGRTMAX-12 52) SIGRTMAX-11 53) SIGRTMAX-10
54) SIGRTMAX-9 55) SIGRTMAX-8 56) SIGRTMAX-7 57) SIGRTMAX-6
58) SIGRTMAX-5 59) SIGRTMAX-4 60) SIGRTMAX-3 61) SIGRTMAX-2
62) SIGRTMAX-1 63) SIGRTMAX
The SIGRTXXX signals are real-time signals, an advanced topic that we don’t cover.
Signals need not be generated externally; a program can send itself a signal directly, using the Standard C function raise():
#include <signal.h> ISO C
int raise(int sig);
This function sends the signal sig to the calling process. (This action has its uses; we show an example shortly.)
Because raise() is defined by Standard C, it is the most portable way for a process to send itself a signal. There are other ways, which we discuss further on in the chapter.
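As a minimal sketch of raise() at work, the following fragment installs a handler for SIGTERM, sends that signal to itself, and reports which signal the handler saw. The names note_signal() and raise_demo() are ours, chosen for illustration, and the flag uses the volatile sig_atomic_t type that is explained later in the chapter.

```c
#include <signal.h>

/* Set by the handler; volatile sig_atomic_t is the type Standard C
   provides for flags shared with a signal handler. */
static volatile sig_atomic_t got_signal = 0;

static void note_signal(int signum)
{
    got_signal = signum;
}

/* Hypothetical demo: install a handler for SIGTERM, send SIGTERM to
   ourselves with raise(), and report which signal the handler saw.
   Returns the signal number seen, or -1 on error. */
int raise_demo(void)
{
    void (*old_action)(int) = signal(SIGTERM, note_signal);

    if (old_action == SIG_ERR)
        return -1;

    got_signal = 0;
    if (raise(SIGTERM) != 0)        /* raise() returns nonzero on failure */
        return -1;

    (void) signal(SIGTERM, old_action);     /* put the old action back */
    return (int) got_signal;
}
```

Only Standard C facilities are used here, so the fragment should behave the same way on any hosted C implementation: the handler runs before raise() returns.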
Much of the complication and variation shows up once a signal handler is in place, as it is invoked, and after it returns.
After putting a signal handler in place, your program proceeds on its merry way. Things don’t get interesting until a signal comes in (for example, the user pressed CTRL-C to interrupt your program or a call to raise() was made).
Upon receipt of the signal, the kernel stops the process wherever it may be. It then simulates a procedure call to the signal handler, passing it the signal number as its sole argument. The kernel arranges things such that a normal return from the signal handler function (either through return or by falling off the end of the function) returns to the point in the program at which the signal happened.
Once a signal has been handled, what happens the next time the same signal comes in? Does the handler remain in place? Or is the signal’s action reset to its default? The answer, for historical reasons, is “it depends.” In particular, the C standard leaves it as implementation defined.
In practice, V7 and traditional System V systems, such as Solaris, reset the signal’s action to the default.
Let’s see a simple signal handler in action under Solaris. The following program, ch10-catchint.c, catches SIGINT. You normally generate this signal by typing CTRL-C at the keyboard.
 1  /* ch10-catchint.c --- catch a SIGINT, at least once. */
 2
 3  #include <signal.h>
 4  #include <string.h>
 5  #include <unistd.h>
 6
 7  /* handler --- simple signal handler. */
 8
 9  void handler(int signum)
10  {
11      char buf[200], *cp;
12      int offset;
13
14      /* Jump through hoops to avoid fprintf(). */
15      strcpy(buf, "handler: caught signal ");
16      cp = buf + strlen(buf);     /* cp points at terminating '\0' */
17      if (signum >= 100)          /* unlikely */
18          offset = 3;
19      else if (signum >= 10)
20          offset = 2;
21      else
22          offset = 1;
23      cp += offset;
24
25      *cp-- = '\0';               /* terminate string */
26      while (signum > 0) {        /* work backwards, filling in digits */
27          *cp-- = (signum % 10) + '0';
28          signum /= 10;
29      }
30      strcat(buf, "\n");
31      (void) write(2, buf, strlen(buf));
32  }
33
34  /* main --- set up signal handling and go into infinite loop */
35
36  int main(void)
37  {
38      (void) signal(SIGINT, handler);
39
40      for (;;)
41          pause();    /* wait for a signal, see later in the chapter */
42
43      return 0;
44  }
Lines 9–32 define the signal handling function (cleverly named handler()). All this function does is print the caught signal’s number and return. It does a lot of manual labor to generate the message, since fprintf() is not “safe” for calling from within a signal handler. (This is described shortly, in Section 10.4.6, “Additional Caveats,” page 363.)
The main() function sets up the signal handler (line 38) and then goes into an infinite loop (lines 40–41). Here’s what happens when it’s run:
$ ssh solaris.example.com                       Log in to a handy Solaris system
Last login: Fri Sep 19 04:33:25 2003 from 4.3.2.1.
Sun Microsystems Inc. SunOS 5.9 Generic May 2002
$ gcc ch10-catchint.c                           Compile the program
$ a.out                                         Run it
^Chandler: caught signal 2                      Type ^C, handler is called
^C                                              Try again, but this time ...
$                                               The program dies
Because V7 and other traditional systems reset the signal’s action to the default, when you wish to receive the signal again in the future, the handler function should immediately reinstall itself:
void handler(int signum)
{
char buf[200], *cp;
int offset;
(void) signal (signum, handler); /* reinstall handler */
... rest of function as before ...
}
4.2 BSD changed the way signal() worked.[2] On BSD systems, the signal handler remains in place after the handler returns. GNU/Linux systems follow the BSD behavior. Here’s what happens under GNU/Linux:
$ ch10-catchint                                 Run the program
handler: caught signal 2                        Type ^C, handler is called
handler: caught signal 2                        And again...
handler: caught signal 2                        And again!
handler: caught signal 2                        Help!
handler: caught signal 2                        How do we stop this?!
Quit (core dumped)                              ^\, generate SIGQUIT. Whew
On a BSD or GNU/Linux system, a signal handler doesn’t need the extra ’signal(signum, handler)’ call to reinstall the handler. However, the extra call also doesn’t hurt anything, since it maintains the status quo.
In fact, POSIX provides a bsd_signal() function, which is identical to signal(), except that it guarantees that the signal handler stays installed:
#include <signal.h> XSI, Obsolescent
void (*bsd_signal(int sig, void (*func) (int)))(int);
This eliminates the portability issues. If you know your program will run only on POSIX systems, you may wish to use bsd_signal() instead of signal().
One caveat is that this function is also marked “obsolescent,” meaning that it can be withdrawn from a future standard. In practice, even if it’s withdrawn, vendors will likely continue to support it for a long time. (As we’ll see, the POSIX sigaction() API provides enough facilities to let you write a workalike version, should you need to.)
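To make that last remark concrete, here is one way such a workalike might look. This is a sketch under our own assumptions, not a library implementation: my_bsd_signal(), count_hit(), and handler_stays_installed() are hypothetical names, and the sigaction() calls it relies on are explained later in the chapter.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Hypothetical workalike for bsd_signal(), built on sigaction().
   SA_RESTART asks for BSD-style restarting of interrupted system calls;
   leaving SA_RESETHAND unset keeps the handler installed after it runs. */
void (*my_bsd_signal(int sig, void (*func)(int)))(int)
{
    struct sigaction act, oact;

    act.sa_handler = func;
    act.sa_flags = SA_RESTART;
    sigemptyset(&act.sa_mask);
    sigaddset(&act.sa_mask, sig);   /* block sig while its handler runs */

    if (sigaction(sig, &act, &oact) == -1)
        return SIG_ERR;
    return oact.sa_handler;
}

static volatile sig_atomic_t hits = 0;

static void count_hit(int signum)
{
    (void) signum;
    hits++;
}

/* Hypothetical demo: deliver SIGUSR1 twice and count the handler calls.
   Returns the number of calls, or -1 on error. */
int handler_stays_installed(void)
{
    hits = 0;
    if (my_bsd_signal(SIGUSR1, count_hit) == SIG_ERR)
        return -1;
    raise(SIGUSR1);
    raise(SIGUSR1);     /* the second signal is still caught */
    return (int) hits;
}
```

If the action were reset to the default after the first delivery, the second raise() would terminate the program instead of being counted.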
More practically, when a signal handler is invoked, it usually means that the program should finish up and exit. It would be annoying if most programs, upon receipt of a SIGINT, printed a message and continued; the point of the signal is that they should stop!
For example, consider the sort program. sort may have created any number of temporary files for use in intermediate stages of the sorting process. Upon receipt of a SIGINT, sort should remove the temporary files and then exit. Here is a simplified version of the signal handler from the GNU Coreutils sort.c:
/* Handle interrupts and hangups. */            Simplified for presentation

static void
sighandler (int sig)
{
  signal (sig, SIG_IGN);                        Ignore this signal from now on

  cleanup ();                                   Clean up after ourselves

  signal (sig, SIG_DFL);                        Restore default action
  raise (sig);                                  Now resend the signal
}
Setting the action to SIG_IGN ensures that any further SIGINT signals that come in won’t affect the clean-up action in progress. Once cleanup() is done, resetting the action to SIG_DFL allows the system to dump core if the signal that came in would do so. Calling raise() regenerates the signal. The regenerated signal then invokes the default action, which most likely terminates the program. (We show the full sort.c signal handler later in this chapter.)
The EINTR value for errno (see Section 4.3, “Determining What Went Wrong,” page 86) indicates that a system call was interrupted. While a large number of system calls can fail with this error value, the two most important ones are read() and write(). Consider the following code:
void handler(int signal) { /* handle signals */ }

int main(int argc, char **argv)
{
    signal(SIGINT, handler);
    ...
    while ((count = read(fd, buf, sizeof buf)) > 0) {
        /* process the buffer */
    }
    if (count == 0)
        /* end of file, clean up etc. */
    else if (count == -1)
        /* failure */
    ...
}
Suppose that the system has successfully read (and filled in) part of the buffer when a SIGINT occurs. The read() system call has not yet returned from the kernel to the program, but the kernel decides that it can deliver the signal. handler() is called, runs, and returns into the middle of the read(). What does read() return?
In days of yore (V7, earlier System V systems), read() would return -1 and set errno to EINTR. There was no way to tell that data had been transferred. In this case, V7 and System V act as if nothing happened: No data are transferred to or from the user’s buffer, and the file offset isn’t changed.
4.2 BSD changed this. There were two cases:
Slow devices
Regular files
The BSD behavior is clearly valuable; you can always tell how much data you’ve read.
The POSIX behavior is similar, but not identical, to the original BSD behavior. POSIX indicates that read()[3] fails with EINTR only if a signal occurred before any data were transferred. Although POSIX doesn’t say anything about “slow devices,” in practice, this condition only occurs on such devices.
Otherwise, if a signal interrupts a partially successful read(), the return is the number of bytes read so far. For this reason (as well as being able to handle short files), you should always check the return value from read() and never assume that it read the full number of bytes requested. (The POSIX sigaction() API, described later, allows you to get the behavior of BSD restartable system calls if you want it.)
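The advice to always check read()’s return value can be sketched as a small loop. The function read_fully() below is a hypothetical helper of our own, not part of any standard library; it assumes POSIX read() semantics, retrying when read() fails with EINTR and continuing after a short read.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to count bytes from fd into buf, retrying
   when read() fails with EINTR and continuing after short reads.
   Returns the number of bytes actually read (less than count only at
   end of file), or -1 on an error other than EINTR. */
ssize_t read_fully(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(fd, p + total, count - total);

        if (n > 0)
            total += (size_t) n;    /* partial or complete transfer */
        else if (n == 0)
            break;                  /* end of file */
        else if (errno != EINTR)
            return -1;              /* real error */
        /* else: interrupted before any data arrived; try again */
    }
    return (ssize_t) total;
}
```

One design choice to be aware of: on a real error this sketch returns -1 even if some bytes were already read, so a caller that cares about partial data would need a different interface.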
The GNU Coreutils use two routines, safe_read() and safe_write(), to handle the EINTR case on traditional systems. The code is a bit complicated by the fact that the same file, by means of #include and macros, implements both functions. From lib/safe-read.c in the Coreutils distribution:
 1  /* An interface to read and write that retries after interrupts.
 2     Copyright (C) 1993, 1994, 1998, 2002 Free Software Foundation, Inc.
    ... lots of boilerplate stuff omitted ...
56
57  #ifdef SAFE_WRITE
58  # include "safe-write.h"
59  # define safe_rw safe_write                   Create safe_write()
60  # define rw write                             Use write() system call
61  #else
62  # include "safe-read.h"
63  # define safe_rw safe_read                    Create safe_read()
64  # define rw read                              Use read() system call
65  # undef const
66  # define const /* empty */
67  #endif
68
69  /* Read (write) up to COUNT bytes at BUF from(to) descriptor FD, retrying if
70     interrupted. Return the actual number of bytes read(written), zero for EOF,
71     or SAFE_READ_ERROR(SAFE_WRITE_ERROR) upon error. */
72  size_t
73  safe_rw (int fd, void const *buf, size_t count)
74  {
75    ssize_t result;
76
77    /* POSIX limits COUNT to SSIZE_MAX, but we limit it further, requiring
78       that COUNT <= INT_MAX, to avoid triggering a bug in Tru64 5.1.
79       When decreasing COUNT, keep the file pointer block-aligned.
80       Note that in any case, read(write) may succeed, yet read(write)
81       fewer than COUNT bytes, so the caller must be prepared to handle
82       partial results. */
83    if (count > INT_MAX)
84      count = INT_MAX & ~8191;
85
86    do
87      {
88        result = rw (fd, buf, count);
89      }
90    while (result < 0 && IS_EINTR (errno));
91
92    return (size_t) result;
93  }
Lines 57–67 handle the definitions, creating safe_read() and safe_write(), as appropriate (see safe-write.c, below).
Lines 77–84 are indicative of the kinds of complications found in the real world. Here, one particular Unix variant can’t handle count values greater than INT_MAX, so lines 83–84 perform two operations at once: reducing the count to below INT_MAX and keeping the amount a multiple of 8192. The latter operation maintains the efficiency of the I/O operations: Doing I/O in multiples of the fundamental disk block size is always more efficient than doing it in odd amounts. As the comment notes, the code maintains the semantics of read() and write(), where the returned count may be less than the requested count.
Note that the count parameter can indeed be greater than INT_MAX, since count is a size_t, which is unsigned. INT_MAX is a plain int, which on all modern systems is signed.
Lines 86–90 are the actual loop, performing the operation repeatedly, as long as it fails with EINTR. The IS_EINTR() macro isn’t shown, but it handles the case for systems on which EINTR isn’t defined. (There must be at least one out there or the code wouldn’t bother setting up the macro; it was probably done for a Unix or POSIX emulation on top of a non-Unix system.)
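Since the excerpt omits IS_EINTR(), here is one plausible shape for it. This is our assumption about how such a macro could be written, based only on the description above, and is not quoted from the Coreutils source.

```c
#include <errno.h>

/* Assumed definition, not quoted from the Coreutils source shown above:
   compare against EINTR where it exists, and never match where it
   does not. */
#ifdef EINTR
# define IS_EINTR(x) ((x) == EINTR)
#else
# define IS_EINTR(x) 0
#endif
```

With a definition along these lines, the loop on lines 86–90 compiles unchanged on a system without EINTR; there the condition is always false and the operation is attempted exactly once.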
Here is safe-write.c:
 1  /* An interface to write that retries after interrupts.
 2     Copyright (C) 2002 Free Software Foundation, Inc.
    ... lots of boilerplate stuff omitted ...
17
18  #define SAFE_WRITE
19  #include "safe-read.c"
The #define on line 18 defines SAFE_WRITE; this ties in to lines 57–60 in safe-read.c.
The GLIBC <unistd.h> file defines a macro, TEMP_FAILURE_RETRY(), that you can use to encapsulate any system call that can fail and set errno to EINTR. Its “declaration” is as follows:
#include <unistd.h>                             GLIBC
long int TEMP_FAILURE_RETRY(expression);
Here is the macro’s definition:
/* Evaluate EXPRESSION, and repeat as long as it returns -1 with 'errno'
   set to EINTR. */
# define TEMP_FAILURE_RETRY(expression) \
  (__extension__ \
    ({ long int __result; \
       do __result = (long int) (expression); \
       while (__result == -1L && errno == EINTR); \
       __result; }))
The macro uses a GCC extension to the C language (as marked by the __extension__ keyword) which allows brace-enclosed statements inside parentheses to return a value, thus acting like a simple expression.
Using this macro, we might rewrite safe_read() as follows:
size_t safe_read(int fd, void *buf, size_t count)
{
    ssize_t result;

    /* Limit count as per comment earlier. */
    if (count > INT_MAX)
        count = INT_MAX & ~8191;

    result = TEMP_FAILURE_RETRY(read(fd, buf, count));

    return (size_t) result;
}
So far, handling one signal at a time looks straightforward: install a signal handler in main() and (optionally) have the signal handler reinstall itself (or set the action to SIG_IGN) as the first thing it does.
What happens, though, if two identical signals come in, one right after the other? In particular, what if your system resets the signal’s action to the default, and the second one comes in after the signal handler is called but before it can reinstall itself?
Or, suppose you’re using bsd_signal(), so the handler stays installed, but the second signal is different from the first one? Usually, the first signal handler needs to complete its job before the second one runs, and every signal handler shouldn’t have to temporarily ignore all other possible signals!
Both of these are race conditions. One workaround for these problems is to make signal handlers as simple as possible. You can do this by creating flag variables that indicate that a signal occurred. The signal handler sets the variable to true and returns. Then the main logic checks the flag variable at strategic points:
int sig_int_flag = 0;   /* signal handler sets to true */

void int_handler(int signum)
{
    sig_int_flag = 1;
}

int main(int argc, char **argv)
{
    bsd_signal(SIGINT, int_handler);

    ... program proceeds on ...

    if (sig_int_flag) {
        /* SIGINT occurred, handle it */
    }

    ... rest of logic ...
}
(Note that this strategy reduces the window of vulnerability but does not eliminate it.)
Standard C introduces a special type, sig_atomic_t, for use with such flag variables. The idea behind the name is that assignments to variables of this type are atomic: That is, they happen in one indivisible action. For example, on most machines, assignment to an int value happens atomically, whereas a structure assignment is likely to be done either by copying all the bytes with a (compiler-generated) loop, or by issuing a “block move” instruction that can be interrupted. Since assignment to a sig_atomic_t value is atomic, once started, it completes before another signal can come in and interrupt it.
Having a special type is only part of the story. sig_atomic_t variables should also be declared volatile:
volatile sig_atomic_t sig_int_flag = 0; /* signal handler sets to true */
...rest of code as before...
The volatile keyword tells the compiler that the variable can be changed externally, behind the compiler’s back, so to speak. This keeps the compiler from doing optimizations that might otherwise affect the code’s correctness.
Structuring an application exclusively around sig_atomic_t variables is not reliable. The correct way to deal with signals is shown later, in Section 10.7, “Signals for Interprocess Communication,” page 379.
The POSIX standard provides several caveats for signal handlers:
It is undefined what happens when handlers for SIGFPE, SIGILL, SIGSEGV, or any other signals that represent “computation exceptions” return.
If a handler was invoked as a result of calls to abort(), raise(), or kill(), the handler cannot call raise(). abort() is described in Section 12.4, “Committing Suicide: abort(),” page 445, and kill() is described later in this chapter. (The sigaction() API, with the three-argument signal handler described later, makes it possible to tell if this is the case.)
Signal handlers can only call the functions in Table 10.2. In particular, they should avoid <stdio.h> functions. The problem is that an interrupt may come in while a <stdio.h> function is running, when the internal state of the library is in the middle of being updated. Further calls to <stdio.h> functions could corrupt the internal state.
Table 10.2. Functions that can be called from a signal handler
The list in Table 10.2 comes from Section 2.4 of the System Interfaces volume of the 2001 POSIX standard. Many of these functions are advanced APIs not otherwise covered in this volume.
Signals are a complicated topic, and it’s about to get more confusing. So, let’s pause for a moment, take a step back, and summarize what we’ve discussed so far:
Signals are an indication that some external event has occurred.
raise() is the ISO C function for sending signals to the current process. We have yet to describe how to send signals to other processes.
signal() controls the disposition of a signal: that is, the process’s reaction to the signal when it comes in. The signal may be left set to the system default, ignored, or caught.
A handler function runs when a signal is caught. Here is where complexity starts to rear its ugly head:
ISO C leaves as unspecified whether signal disposition is restored to its default before the handler runs or whether the disposition remains in place. The former is the behavior of V7 and modern System V systems such as Solaris. The latter is the BSD behavior also found on GNU/Linux. (The POSIX bsd_signal() function may be used to force BSD behavior.)
What happens when a system call is interrupted by a signal also varies along the traditional vs. BSD line. Traditional systems return -1 with errno set to EINTR. BSD systems restart the system call after the handler returns. The GLIBC TEMP_FAILURE_RETRY() macro can help you write code to handle system calls that return -1 with errno set to EINTR.
POSIX requires that a system call that has partially completed return a success value indicating how much succeeded. A system call that hasn’t started yet is restarted.
The signal() mechanism provides fertile ground for growing race conditions. The ISO C sig_atomic_t data type helps with this situation but doesn’t solve it, and the mechanism as defined can’t be made safe from race conditions.
A number of additional caveats apply, and in particular, only a subset of the standard library functions can be safely called from within a signal handler.
Despite the problems, for simple programs, the signal() interface is adequate, and it is still widely used.
4.0 BSD (circa 1980) introduced additional APIs to provide “reliable” signals.[4] In particular, it became possible to block signals. In other words, a program could tell the kernel, “hang on to these particular signals for the next little while, and then deliver them to me when I’m ready to take them.” A big advantage is that this feature simplifies signal handlers, which automatically run with their own signal blocked (to avoid the two-signals-in-a-row problem) and possibly with others blocked as well.
System V Release 3 (circa 1984) picked up these APIs and popularized them; in most Unix-related documentation and books, you’ll probably see these APIs referred to as being from System V Release 3. The functions are as follows:
#include <signal.h>                             XSI

int sighold(int sig);                           Add sig to process signal mask
int sigrelse(int sig);                          Remove sig from process signal mask
int sigignore(int sig);                         Short for sigset(sig, SIG_IGN)
int sigpause(int sig);                          Suspend process, allow sig to come in
void (*sigset(int sig, void (*disp)(int)))(int);
sighandler_t sigset(int sig, sighandler_t disp);
The POSIX standard for these functions describes their behavior in terms of each process’s process signal mask. The process signal mask tracks which signals (if any) a process currently has blocked. This is described in more detail in Section 10.6.2, “Signal Sets: sigset_t and Related Functions,” page 368. In the System V Release 3 API there is no way to retrieve or modify the process signal mask as a whole. The functions work as follows:
int sighold(int sig)
Adds sig to the list of blocked signals (the process signal mask).
int sigrelse(int sig)
Removes (releases) sig from the process signal mask.
int sigignore(int sig)
Ignores sig. This is a convenience function.
int sigpause(int sig)
Removes sig from the process signal mask, and then suspends the process until a signal comes in (see Section 10.7, “Signals for Interprocess Communication,” page 379).
sighandler_t sigset(int sig, sighandler_t disp)
Is a replacement for signal(). (We’ve used the GNU/Linux manpage notation here to make the declaration easier to read.)
For sigset(), the disp argument can be SIG_DFL, SIG_IGN, or a function pointer, just as for signal(). However, it may also be SIG_HOLD. In this case, sig is added to the process’s process signal mask, but its associated action is otherwise unchanged. (In other words, if it had a handler, the handler is still in place; if it was the default action, that has not changed.)
When sigset() is used to install a signal handler and the signal comes in, the kernel first adds the signal to the process signal mask, blocking any additional receipt of that signal. The handler runs, and when it returns, the kernel restores the process signal mask to what it was before the handler ran. (In the POSIX model, if a signal handler changes the signal mask, that change is overridden by the restoration of the previous mask when the handler returns.)
sighold() and sigrelse() may be used together to bracket so-called critical sections of code: chunks of code that should not be interrupted by particular signals so that no data structures are corrupted by code from a signal handler.
POSIX standardizes these APIs, since a major goal of POSIX is to formalize existing practice, wherever possible. However, the sigaction() APIs described shortly let you do everything that these APIs do, and more. You should not use these APIs in new programs. Instead, use sigaction(). (We note that there isn’t even a sigset(2) GNU/Linux manpage!)
The POSIX API is based on the sigvec() API from 4.2 and 4.3 BSD. With minor changes, this API was able to subsume the functionality of both the V7 and System V Release 3 APIs. POSIX made these changes and renamed the API sigaction(). Because the sigvec() interface was not widely used, we don’t describe it. Instead, this section describes only sigaction(), which is what you should use anyway. (Indeed, the 4.4 BSD manuals from 1994 mark sigvec() as obsolete, pointing the reader to sigaction().)
What’s wrong with the System V Release 3 APIs? After all, they provide signal blocking, so signals aren’t lost and any given signal can be handled reliably.
The answer is that the API works with only one signal at a time. Programs generally handle more than one signal. And when you’re in the middle of handling one signal, you don’t want to have to worry about handling another one. (Suppose you’ve just answered your office phone when your cell phone starts ringing: You’d prefer to have the phone system tell your caller you’re on another line and you’ll be there shortly, instead of having to do it yourself.)
With the sigset()
API, each signal handler would have to temporarily block all the other signals, do its job, and then unblock them. The problem is that in the interval between any two calls to sighold()
, a not-yet-blocked signal could come up. The scenario is rife, once again, with race conditions.
The solution is to make it possible to work with groups of signals atomically, that is, with one system call. You effect this by working with signal sets and the process signal mask.
The process signal mask is a list of signals that a process currently has blocked. The strength of the POSIX API is that the process signal mask can be manipulated atomically, as a whole.
The process signal mask is represented programmatically with a signal set. This is the sigset_t
type. Conceptually, it’s just a bitmask, with 0
and 1
values in the mask representing a particular signal’s absence or presence in the mask:
/* Signal mask manipulated directly. DO NOT DO THIS! */
int mask = (1 << SIGHUP) | (1 << SIGINT); /* bitmask for SIGHUP and SIGINT */
However, because a system can have more signals than can be held in a single int or long and because heavy use of the bitwise operators is hard to read, several APIs exist to manipulate signal sets:
#include <signal.h> POSIX
int sigemptyset(sigset_t *set);
int sigfillset(sigset_t *set);
int sigaddset(sigset_t *set, int signum);
int sigdelset(sigset_t *set, int signum);
int sigismember(const sigset_t *set, int signum);
int sigemptyset (sigset_t *set)
Empties out a signal set. Upon return, *set
has no signals in it. Returns 0
on success or -1
on error.
int sigfillset (sigset_t *set)
Completely fills in a signal set. Upon return, *set
contains all the signals defined by the system. Returns 0
on success or -1
on error.
int sigaddset (sigset_t *set, int signum)
Adds signum to the signal set in *set. Returns 0 on success or -1 on error.
int sigdelset (sigset_t *set, int signum)
Removes signum from the signal set in *set. Returns 0 on success or -1 on error.
int sigismember (const sigset_t *set, int signum)
Returns 1 if signum is present in *set, 0 if it isn’t, or -1 on error.
You must always call one of sigemptyset()
or sigfillset()
before doing anything else with a sigset_t
variable. Both interfaces exist because sometimes you want to start out with an empty set and then just work with one or two signals, and other times you want to work with all signals, possibly taking away one or two.
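As a minimal sketch of both starting points, the following helpers build one set from empty and one from full. The function names build_termination_set() and build_all_but_int() are ours, invented purely for illustration; only the sigset_t calls themselves come from POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/*
 * build_termination_set --- hypothetical helper: start from an empty
 * set and add SIGHUP, SIGINT, and SIGTERM.
 * Returns 0 on success, -1 on error.
 */
int build_termination_set(sigset_t *set)
{
    /* Always initialize first; an uninitialized sigset_t is unusable. */
    if (sigemptyset(set) < 0)
        return -1;

    if (sigaddset(set, SIGHUP) < 0
        || sigaddset(set, SIGINT) < 0
        || sigaddset(set, SIGTERM) < 0)
        return -1;

    return 0;
}

/*
 * build_all_but_int --- hypothetical helper: start from the full set
 * and take SIGINT back out.
 * Returns 0 on success, -1 on error.
 */
int build_all_but_int(sigset_t *set)
{
    if (sigfillset(set) < 0)
        return -1;

    return sigdelset(set, SIGINT);
}
```

Neither helper touches the process signal mask; each only fills in a sigset_t variable that can later be handed to the functions that do.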
The process signal mask starts out empty—initially, no signals are blocked. (This is a simplification; see Section 10.9, “Signals Across fork() and exec(),” page 398.) Three functions let you work directly with the process signal mask:
#include <signal.h> POSIX
int sigprocmask(int how, const sigset_t *set, sigset_t *oldset);
int sigpending(sigset_t *set);
int sigsuspend(const sigset_t *set);
The functions are as follows:
int sigprocmask(int how, const sigset_t *set, sigset_t *oldset)
If oldset
is not NULL
, the current process signal mask is retrieved and placed in *oldset
. The process signal mask is then updated, according to the contents of set
and the value of how
, which must be one of the following:
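| Value of how | Meaning |
|---|---|
| SIG_BLOCK | Merge the signals in *set with the current process signal mask. The new mask is the union of the two. |
| SIG_UNBLOCK | Remove the signals in *set from the process signal mask. |
| SIG_SETMASK | Replace the process signal mask with the contents of *set. |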
If set
is NULL
and oldset
isn’t, the value of how
isn’t important. This combination retrieves the current process signal mask without changing it. (This is explicit in the POSIX standard but isn’t clear from the GNU/Linux manpage.)
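The following sketch shows the usual way to bracket a critical section with sigprocmask(): block, do the work, then restore the saved mask. It also shows the query-only form with set equal to NULL. The helper names (sigint_is_blocked(), with_sigint_blocked(), record_mask_state()) and the variable blocked_during_critical are our own assumptions for the sake of the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

int blocked_during_critical = -1;   /* filled in by record_mask_state() */

/*
 * sigint_is_blocked --- query the mask without changing it.
 * Since set is NULL, the value of how is not important.
 * Returns 1 if SIGINT is blocked, 0 if not, -1 on error.
 */
int sigint_is_blocked(void)
{
    sigset_t current;

    if (sigprocmask(SIG_BLOCK, NULL, &current) < 0)
        return -1;
    return sigismember(&current, SIGINT);
}

/*
 * with_sigint_blocked --- hypothetical helper: run critical() with
 * SIGINT blocked, then put back whatever mask was in effect before.
 * Returns 0 on success, -1 on error.
 */
int with_sigint_blocked(void (*critical)(void))
{
    sigset_t block, saved;

    sigemptyset(&block);
    sigaddset(&block, SIGINT);

    /* Merge SIGINT into the mask; the previous mask lands in saved. */
    if (sigprocmask(SIG_BLOCK, &block, &saved) < 0)
        return -1;

    critical();     /* a SIGINT sent now stays pending */

    /* Restore the old mask instead of blindly unblocking SIGINT. */
    return sigprocmask(SIG_SETMASK, &saved, NULL);
}

/* record_mask_state --- sample critical section for demonstration */
void record_mask_state(void)
{
    blocked_during_critical = sigint_is_blocked();
}
```

Restoring with SIG_SETMASK, rather than calling SIG_UNBLOCK, keeps the helper correct even when the caller already had SIGINT blocked on entry.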
int sigpending(sigset_t *set)
Retrieves the set of pending signals; that is, *set is filled in with those signals that have been sent but not yet delivered because they are blocked.
int sigsuspend(const sigset_t *set)
This function temporarily replaces the process signal mask with *set, and then suspends the process until a signal is received. By definition, only a signal not in *set can cause the function to return (see Section 10.7, “Signals for Interprocess Communication,” page 379).
Finally, we’re ready to look at the sigaction()
function. This function is complicated, and we intentionally omit many details that are only for advanced uses. The POSIX standard and the sigaction(2) manpage provide full details, although you must carefully read both to fully absorb everything.
#include <signal.h> POSIX
int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact);
The arguments are as follows:
int signum
The signal of interest, as with the other signal handling functions.
const struct sigaction *act
The new handler specification for signal signum.
struct sigaction *oldact
The current handler specification. If oldact is not NULL, the system fills in *oldact before installing *act. act can be NULL, in which case *oldact is filled in, but nothing else changes.
Thus, sigaction()
both sets the new handler and retrieves the old one, in one shot. The struct sigaction
looks like this:
/* NOTE: Order in struct may vary. There may be other fields too! */
struct sigaction {
    sigset_t sa_mask;                                /* Additional signals to block */
    int sa_flags;                                    /* Control behavior */
    void (*sa_handler) (int);                        /* May be union with sa_sigaction */
    void (*sa_sigaction) (int, siginfo_t *, void *); /* May be union with sa_handler */
};
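sigset_t sa_mask
A set of additional signals to block while the handler function runs.
int sa_flags
Flags that control how the signal is handled; the values are listed in Table 10.3.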
void (*sa_handler) (int)
A pointer to a “traditional” handler function. It has the same signature (return type and parameter list) as the handler functions for signal(), bsd_signal()
, and sigset()
.
void (*sa_sigaction) (int, siginfo_t *, void *)
A pointer to a “new style” handler function. The function takes three arguments, as described shortly.
Which of act->sa_handler
and act->sa_sigaction
is used depends on the SA_SIGINFO
flag in act->sa_flags
. When present, act->sa_sigaction
is used; otherwise, act->sa_handler
is used. Both POSIX and the GNU/Linux manpage point out that these two fields may overlap in storage (that is, be part of a union
). Thus, you should never use both fields in the same struct sigaction
.
The sa_flags
field is the bitwise OR of one or more of the flag values listed in Table 10.3.
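Table 10.3. Flag values for sa_flags
| Flag | Meaning |
|---|---|
| SA_NOCLDSTOP | This flag is only meaningful for SIGCHLD. When it is set, the parent is not sent the signal when a child merely stops (described later in this chapter). |
| SA_NOCLDWAIT | This flag is only meaningful for SIGCHLD. When it is set, children that terminate do not become zombies (described later in this chapter). |
| SA_NODEFER | Normally, the given signal is blocked while the signal handler runs. When one of these flags is set, the given signal is not blocked while the signal handler runs. |
| SA_NOMASK | An alternative name for SA_NODEFER.[*] |
| SA_SIGINFO | The signal handler takes three arguments. As mentioned, with this flag set, the sa_sigaction field is used instead of sa_handler. |
| SA_ONSTACK | This is an advanced feature. Signal handlers can be called, using user-provided memory as an “alternative signal stack.” Such memory is given to the kernel for this use with sigaltstack(). |
| SA_RESETHAND | This flag provides the V7 behavior: The signal’s action is reset to its default when the handler is called. |
| SA_ONESHOT | An alternative name for SA_RESETHAND.[*] |
| SA_RESTART | This flag provides BSD semantics: System calls that can fail with EINTR, and that receive this signal, are restarted. |
[*] As far as we could determine, the names SA_NOMASK and SA_ONESHOT are specific to GNU/Linux.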
When the SA_SIGINFO
flag is set in act->sa_flags
, then the act->sa_sigaction
field is a pointer to a function declared as follows:
void action_handler(int sig, siginfo_t *info, void *context)
{
    /* handler body here */
}
The siginfo_t
structure provides a wealth of information about the signal:
/* POSIX 2001 definition. Actual contents likely to vary across systems. */
typedef struct {
    int si_signo;           /* signal number */
    int si_errno;           /* <errno.h> value if an error */
    int si_code;            /* signal code; see text */
    pid_t si_pid;           /* process ID of process that sent signal */
    uid_t si_uid;           /* real UID of sending process */
    void *si_addr;          /* address of instruction that faulted */
    int si_status;          /* exit value, may include death-by-signal */
    long si_band;           /* band event for SIGPOLL/SIGIO */
    union sigval si_value;  /* signal value (advanced) */
} siginfo_t;
The si_signo, si_code
, and si_value
fields are available for all signals. The other fields can be members of a union
and thus should be used only for the signals for which they’re defined. There may also be other fields in the siginfo_t
structure.
Almost all the fields are for advanced uses. The full details are in the POSIX standard and in the sigaction(2) manpage. However, we can describe a straightforward use of the si_code
field.
For SIGBUS, SIGCHLD, SIGFPE, SIGILL, SIGPOLL, SIGSEGV
, and SIGTRAP
, the si_code
field can take on any of a set of predefined values specific to each signal, indicating the cause of the signal. Frankly, the details are a bit overwhelming; everyday code doesn’t really need to deal with them (although we’ll look at the values for SIGCHLD
later on). For all other signals, the si_code
member has one of the values in Table 10.4.
Table 10.4. Signal origin values for si_code
In particular, the SI_USER
value is useful; it allows a signal handler to tell if the signal was sent by raise()
or kill()
(described later). You can use this information to avoid calling raise()
or kill()
a second time.
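Here is a small sketch of a three-argument handler that inspects si_code. The names usr1_action(), install_usr1_action(), last_code_was_user, and sender_was_self are hypothetical. We send the signal with kill() in the usage note because, on current GNU/Linux systems, raise() may be implemented with a thread-directed system call that reports a different si_code; treat that detail as something to verify on your own system.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

volatile sig_atomic_t last_code_was_user = 0;  /* si_code was SI_USER */
volatile sig_atomic_t sender_was_self = 0;     /* si_pid was our own PID */

/* usr1_action --- three-argument handler; used because SA_SIGINFO is set */
static void usr1_action(int sig, siginfo_t *info, void *context)
{
    (void) sig;
    (void) context;

    last_code_was_user = (info->si_code == SI_USER);

    /* si_pid is only meaningful when the signal came from a process. */
    sender_was_self = (info->si_code == SI_USER
                       && info->si_pid == getpid());
}

/*
 * install_usr1_action --- hypothetical helper: install usr1_action()
 * for SIGUSR1. Returns 0 on success, -1 on error.
 */
int install_usr1_action(void)
{
    struct sigaction sa;

    sa.sa_sigaction = usr1_action;  /* set this field only, never both */
    sa.sa_flags = SA_SIGINFO;       /* selects sa_sigaction over sa_handler */
    sigemptyset(&sa.sa_mask);

    return sigaction(SIGUSR1, &sa, NULL);
}
```

After install_usr1_action() succeeds, a call such as ‘kill(getpid(), SIGUSR1)’ runs the handler, which records that the signal originated from a process and that the sender was the process itself.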
The third argument to a three-argument signal handler, void *context
, is an advanced feature, not otherwise discussed in this volume.
Finally, to see sigaction()
in use, examine the full text of the signal handler for sort.c
:
static void
sighandler (int sig)
{
#ifndef SA_NOCLDSTOP                      /* On old style system ... */
  signal (sig, SIG_IGN);                  /* Use signal() to ignore sig */
#endif                                    /* Otherwise, sig automatically blocked */

  cleanup ();                             /* Run cleanup code */

#ifdef SA_NOCLDSTOP                       /* On POSIX style system ... */
  {
    struct sigaction sigact;

    sigact.sa_handler = SIG_DFL;          /* Set action to default */
    sigemptyset (&sigact.sa_mask);        /* No additional signals to block */
    sigact.sa_flags = 0;                  /* No special action to take */
    sigaction (sig, &sigact, NULL);       /* Put it in place */
  }
#else                                     /* On old style system ... */
  signal (sig, SIG_DFL);                  /* Set action to default */
#endif

  raise (sig);                            /* Resend the signal */
}
Here is the code in main()
that puts the handler in place:
2214 #ifdef SA_NOCLDSTOP                        /* On a POSIX system ... */
2215   {
2216     unsigned i;
2217     sigemptyset (&caught_signals);
2218     for (i = 0; i < nsigs; i++)            /* Block all signals */
2219       sigaddset (&caught_signals, sigs[i]);
2220     newact.sa_handler = sighandler;        /* Signal handling function */
2221     newact.sa_mask = caught_signals;       /* Set process signal mask for handler */
2222     newact.sa_flags = 0;                   /* No special flags */
2223   }
2224 #endif
2225
2226   {
2227     unsigned i;
2228     for (i = 0; i < nsigs; i++)            /* For all signals ... */
2229       {
2230         int sig = sigs[i];
2231 #ifdef SA_NOCLDSTOP
2232         sigaction (sig, NULL, &oldact);    /* Retrieve old handler */
2233         if (oldact.sa_handler != SIG_IGN)  /* If not ignoring this signal */
2234           sigaction (sig, &newact, NULL);  /* Install our handler */
2235 #else
2236         if (signal (sig, SIG_IGN) != SIG_IGN)
2237           signal (sig, sighandler);        /* Same logic with old API */
2238 #endif
2239       }
2240   }
We note that lines 2216–2219 and 2221 could be replaced with the single call:
sigfillset(& newact.sa_mask);
We don’t know why the code is written the way it is.
Also of interest are lines 2233–2234 and 2236–2237, which show the correct way to check whether a signal is being ignored and to install a handler only if it’s not.
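The same check can be packaged as a small reusable function. This is only a sketch under our own naming: catch_unless_ignored(), note_signal(), and signal_seen do not come from sort.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

volatile sig_atomic_t signal_seen = 0;

/* note_signal --- sample handler: just remember that a signal came in */
void note_signal(int sig)
{
    (void) sig;
    signal_seen = 1;
}

/*
 * catch_unless_ignored --- hypothetical helper: install handler for
 * sig unless sig is currently being ignored.
 * Returns 1 if the handler was installed, 0 if the signal was left
 * ignored, -1 on error.
 */
int catch_unless_ignored(int sig, void (*handler)(int))
{
    struct sigaction oldact, newact;

    /* act is NULL: only retrieve the current disposition. */
    if (sigaction(sig, NULL, &oldact) < 0)
        return -1;

    if (oldact.sa_handler == SIG_IGN)
        return 0;       /* respect whoever arranged to ignore it */

    newact.sa_handler = handler;
    sigemptyset(&newact.sa_mask);
    newact.sa_flags = 0;

    if (sigaction(sig, &newact, NULL) < 0)
        return -1;

    return 1;
}
```

Leaving an ignored signal alone matters for programs started with a signal deliberately ignored by the parent, for example by a shell running a background job on a system without job control.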
The sigaction()
API and the signal()
API should not be used together for the same signal. Although POSIX goes to great lengths to make it possible to use signal()
initially, retrieve a struct sigaction
representing the disposition from signal()
, and restore it, it’s still a bad idea. Code will be easier to read, write, and understand if you use one API or the other, exclusively.
The sigpending()
system call, described earlier, lets you retrieve the set of signals that are pending, that is, those that have come in, but are not yet delivered because they were blocked:
#include <signal.h> POSIX
int sigpending(sigset_t *set);
Besides unblocking the pending signals so that they get delivered, you may choose to ignore them. Setting the action for a pending signal to SIG_IGN
causes the pending signal to be discarded (even if it was blocked). Similarly, for those signals for which the default action is to ignore the signal, setting the action to SIG_DFL
causes such a pending signal to also be discarded.
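To make the pending state and the discard rule concrete, here is a hedged sketch using SIGUSR1. The helper names usr1_is_pending() and discard_pending_usr1() are ours; the sketch assumes the caller has already blocked SIGUSR1 with sigprocmask().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/*
 * usr1_is_pending --- returns 1 if SIGUSR1 is pending, 0 if it
 * is not, -1 on error.
 */
int usr1_is_pending(void)
{
    sigset_t pending;

    if (sigpending(&pending) < 0)
        return -1;
    return sigismember(&pending, SIGUSR1);
}

/*
 * discard_pending_usr1 --- hypothetical helper: throw away a pending
 * SIGUSR1 by briefly ignoring it, then put the previous action back.
 * Returns 0 on success, -1 on error.
 */
int discard_pending_usr1(void)
{
    struct sigaction ignore, saved;

    ignore.sa_handler = SIG_IGN;
    sigemptyset(&ignore.sa_mask);
    ignore.sa_flags = 0;

    /* Setting SIG_IGN discards the pending instance, blocked or not. */
    if (sigaction(SIGUSR1, &ignore, &saved) < 0)
        return -1;

    return sigaction(SIGUSR1, &saved, NULL);
}
```

A typical sequence is: block SIGUSR1, let it arrive (it becomes pending), observe it with usr1_is_pending(), and then either unblock it so that it is delivered or call discard_pending_usr1() to drop it.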
As a convenience, the siginterrupt()
function can be used to make functions interruptible for a particular signal or to make them restartable, depending on the value of the second argument. The declaration is:
#include <signal.h> XSI
int siginterrupt(int sig, int flag);
According to the POSIX standard, the behavior of siginterrupt()
is equivalent to the following code:
int siginterrupt(int sig, int flag)
{
    int ret;
    struct sigaction act;

    (void) sigaction(sig, NULL, &act);      /* Retrieve old setting */
    if (flag)                               /* If flag is true ... */
        act.sa_flags &= ~SA_RESTART;        /* Disable restarting */
    else                                    /* Otherwise ... */
        act.sa_flags |= SA_RESTART;         /* Enable restarting */
    ret = sigaction(sig, &act, NULL);       /* Put new setting in place */
    return ret;                             /* Return result */
}
The return value is 0
on success or -1
on error.
The traditional Unix function for sending a signal is named kill()
. The name is something of a misnomer; all it does is send a signal. (Often, the result is that the signal’s recipient dies, but that need not be true. However, it’s way too late now to change the name.) The killpg()
function sends a signal to a specific process group. The declarations are:
#include <sys/types.h> POSIX
#include <signal.h>
int kill(pid_t pid, int sig);
int killpg(int pgrp, int sig); XSI
The sig
argument is either a signal name or 0
. In the latter case, no signal is sent, but the kernel still performs error checking. In particular, this is the correct way to verify that a given process or process group exists, as well as to verify that you have permission to send signals to the process or process group. kill()
returns 0
on success and -1
on error; errno
then indicates the problem.
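The existence check with a signal value of 0 might be sketched as follows. The function name process_exists() is our invention, and treating EPERM as “exists” is our reading of the error codes: a permission failure means there is a process, just not one we may signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * process_exists --- hypothetical helper: probe pid with signal 0.
 * Returns 1 if the process exists, 0 if it does not, -1 on any
 * other error.
 */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;       /* exists, and we are allowed to signal it */

    if (errno == EPERM)
        return 1;       /* exists, but we lack permission */

    if (errno == ESRCH)
        return 0;       /* no such process or process group */

    return -1;
}
```

Keep in mind that the answer can be stale by the time it is used: the process may exit, and its PID may eventually be reused, between the check and whatever the caller does next.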
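The rules for the pid value are a bit complicated:
| pid value | Meaning |
|---|---|
| pid > 0 | The signal is sent to the process whose process ID is pid. |
| pid = 0 | The signal is sent to every process in the sending process’s process group. |
| pid = -1 | The signal is sent to every process that the sender has permission to signal, except for special system processes. |
| pid < -1 | The signal is sent to every process in the process group whose ID is the absolute value of pid. |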
The meanings of pid
for kill()
are similar to those of waitpid()
(see Section 9.1.6.1, “Using POSIX Functions: wait() and waitpid(),” page 306).
The Standard C function raise()
is essentially equivalent to
int raise(int sig) { return kill(getpid(), sig); }
The C standards committee chose the name raise()
because C also has to work in non-Unix environments, and kill()
was considered specific to Unix. It was also a good opportunity to use a more descriptive name for the function.
killpg() sends a signal to a process group. As long as the pgrp value is greater than 1, it’s equivalent to ‘kill(-pgrp, sig)’. The GNU/Linux killpg(2) manpage states that if pgrp is 0, the signal is sent to the sending process’s process group. (This is the same as kill().)
As you might imagine, you cannot send signals to arbitrary processes (unless you are the superuser, root
). For ordinary users, the real or effective UID of the sending process must match the real or saved set-user-ID of the receiving process. (The different UIDs are described in Section 11.1.1, “Real and Effective IDs,” page 405.)
However, SIGCONT
is a special case: As long as the receiving process is a member of the same session as the sender, the signal will go through. (Sessions were described briefly in Section 9.2.1, “Job Control Overview,” page 312.) This special rule allows a job control shell to continue a stopped descendant process, even if that stopped process is running with a different user ID.
The System V Release 3 API was intended to remedy the various problems presented by the original V7 signal APIs. The notion of signal blocking, in particular, is an important additional concept.
However, those APIs didn’t go far enough, since they worked on only one signal at a time, leaving wide open plenty of windows through which undesired signals could arrive. The POSIX APIs, by working atomically on multiple signals (the process signal mask, represented programmatically by the sigset_t type), solve this problem, closing the windows.
The first set of functions we examined manipulate sigset_t
values: sigfillset()
, sigemptyset()
, sigaddset()
, sigdelset()
, and sigismember()
.
The next set works with the process signal mask: sigprocmask()
sets and retrieves the process signal mask. sigpending()
retrieves the set of pending signals, and sigsuspend()
puts a process to sleep, temporarily replacing the process signal mask with the one in its parameter.
The POSIX sigaction() API is (severely) complicated by the need to supply:
Backward-compatible behavior: SA_RESETHAND and SA_RESTART in the sa_flags field.
A choice as to whether or not the received signal is also blocked: SA_NODEFER for sa_flags.
The ability to have two different kinds of signal handlers: one-argument or three-argument.
A choice of behaviors for managing SIGCHLD: SA_NOCLDSTOP and SA_NOCLDWAIT for sa_flags.
The siginterrupt()
function is a convenience API for enabling or disabling restartable system calls for a given signal.
Finally, kill()
and killpg()
can be used to send signals, not just to the current process but to other processes as well (permissions permitting, of course).
“THIS IS A TERRIBLE IDEA! SIGNALS ARE NOT MEANT FOR THIS! Just say NO.” —Geoff Collyer—
One of the primary mechanisms for interprocess communication (IPC) is the pipe, which is described in Section 9.3, “Basic Interprocess Communication: Pipes and FIFOs,” page 315. It is possible to use signals for very simple IPC as well.[5] Doing so is rather clumsy; the recipient can only tell that a particular signal came in. While the sigaction()
API does allow the recipient to learn the PID and owner of the process that sent the signal, such information usually isn’t terribly helpful.
As the opening quote indicates, using signals for IPC is almost always a bad idea. We recommend avoiding it if possible. But our goal is to teach you how to use the Linux/Unix facilities, including their negative points, leaving it to you to make an informed decision about what to use.
Signals as IPC may sometimes be the only choice for many programs. In particular, pipes are not an option if two communicating programs were not started by a common parent, and FIFO files may not be an option if one of the communicating programs only works with standard input and output. (One instance in which signals are commonly used is with certain system daemon programs, such as xinetd
, which accepts several signals advising that it should reread its control file, do a consistency check, and so on. See xinetd(8) on a GNU/Linux system, and inetd(8) on a Unix system.)
The typical high-level structure of a signal-based application looks like this:
for (;;) {
    /* Wait for signal */
    /* Process signal */
}
The original V7 interface to wait for a signal is pause()
:
#include <unistd.h> POSIX
int pause (void);
pause()
suspends a process; it only returns after both a signal has been delivered and the signal handler has returned. pause()
, by definition, is only useful with caught signals—ignored signals are ignored when they come in, and signals with a default action that terminates the process (with or without a core
file) still do so.
The problem with the high-level application structure just described is the “Process signal” part. When that code is running, you don’t want to have to handle another signal; you want to finish processing the current signal before going on to the next one. One solution is to structure the signal handler to set a flag and check for that flag within the main loop:
volatile sig_atomic_t signal_waiting = 0; /* true if undealt-with signals */

void handler(int sig)
{
    signal_waiting = 1;
    /* Set up any other data indicating which signal */
}
In the mainline code, check the flag:
for (;;) {
    if (! signal_waiting) {     /* If another signal came in, */
        pause();                /* this code is skipped */
        signal_waiting = 1;
    }
    /* Determine which signal came in */
    signal_waiting = 0;
    /* Process the signal */
}
Unfortunately, this code is rife with race conditions:
for (;;) {
    if (! signal_waiting) {
        /* <--- Signal could arrive here, after condition checked! */
        pause();                /* pause() would be called anyway */
        signal_waiting = 1;
    }
    /* Determine which signal came in   <--- A signal here could overwrite global data */
    signal_waiting = 0;
    /* Process the signal               <--- Same here, especially if multiple signals */
}
The solution is to keep the signal of interest blocked at all times, except when waiting for it to arrive. For example, suppose SIGINT
is the signal of interest:
void handler(int sig)
{
    /* sig is automatically blocked with sigaction() */
    /* Set any global data about this signal */
}

int main(int argc, char **argv)
{
    sigset_t set;
    struct sigaction act;

    /* ... usual setup, process options, etc. ... */

    sigemptyset(& set);                     /* Initialize set to empty */
    sigaddset(& set, SIGINT);               /* Add SIGINT to set */
    sigprocmask(SIG_BLOCK, & set, NULL);    /* Block it */

    act.sa_mask = set;                      /* Set up handler */
    act.sa_handler = handler;
    act.sa_flags = 0;
    sigaction(SIGINT, & act, NULL);         /* Install it */

    /* ... Possibly install separate handlers for other signals ... */

    sigemptyset(& set);                     /* Reset to empty, allows SIGINT to arrive */

    for (;;) {
        sigsuspend(& set);                  /* Wait for SIGINT to arrive */
        /* Process signal; SIGINT is again blocked here */
    }

    /* ... any other code ... */
    return 0;
}
The key to this working is that sigsuspend()
temporarily replaces the process signal mask with the one passed in as its argument. This allows SIGINT
to arrive. Once it does, it’s handled; the signal handler returns and then sigsuspend()
returns as well. By the time sigsuspend()
returns, the original process signal mask is back in place.
You can easily extend this paradigm to multiple signals by blocking all signals of interest during main()
and during the signal handlers, and unblocking them only in the call to sigsuspend()
.
Given all this, you should not use pause()
in new code. pause()
is standardized by POSIX primarily to support old code. The same is true of the System V Release 3 sigpause()
. Rather, if you need to structure your application to use signals for IPC, use the sigsuspend()
and sigaction()
APIs exclusively.
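A compact, compilable version of this pattern might look like the following sketch, which uses SIGUSR1 rather than SIGINT. The names usr1_arrived, usr1_handler(), prepare_usr1(), and wait_for_usr1() are hypothetical. Unlike the outline above, it builds the sigsuspend() mask from the current process signal mask instead of an empty set, so that signals blocked for other reasons stay blocked while waiting; that is a design choice of ours, not something the outline requires.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

volatile sig_atomic_t usr1_arrived = 0;

/* usr1_handler --- only records that the signal came in */
static void usr1_handler(int sig)
{
    (void) sig;
    usr1_arrived = 1;
}

/*
 * prepare_usr1 --- hypothetical helper: block SIGUSR1 in the mainline
 * code and install the handler. Returns 0 on success, -1 on error.
 */
int prepare_usr1(void)
{
    sigset_t block;
    struct sigaction sa;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, NULL) < 0)
        return -1;

    sa.sa_handler = usr1_handler;
    sa.sa_mask = block;
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/*
 * wait_for_usr1 --- hypothetical helper: suspend until SIGUSR1 has
 * been handled. Returns 0 on success, -1 on error.
 */
int wait_for_usr1(void)
{
    sigset_t waitmask;

    /* Start from the current mask and open a window for SIGUSR1 only. */
    if (sigprocmask(SIG_BLOCK, NULL, &waitmask) < 0)
        return -1;
    sigdelset(&waitmask, SIGUSR1);

    while (! usr1_arrived)
        sigsuspend(&waitmask);  /* returns after the handler has run */

    usr1_arrived = 0;           /* safe: SIGUSR1 is blocked again here */
    return 0;
}
```

Because SIGUSR1 stays blocked between calls, a signal that arrives while the program is busy simply remains pending and is delivered the next time wait_for_usr1() suspends; the flag test and the suspension can no longer be separated by an unexpected delivery.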
Several signals serve special purposes. We describe the most important ones here.
It is often necessary to write programs of the form
while (some condition isn't true) {
    wait for a while
}
This need comes up frequently in shell scripting, for example, to wait until a particular user has logged in:
until who | grep '^arnold' > /dev/null
do
    sleep 10
done
Two mechanisms, one lower level and one higher level, let a running process know when a given number of seconds have passed.
The most basic building block is the alarm()
system call:
#include <unistd.h> POSIX
unsigned int alarm(unsigned int seconds);
After alarm()
returns, the program keeps running. However, when seconds
seconds have elapsed, the kernel sends a SIGALRM
to the process. The default action is to terminate the process, but most likely, you will instead have installed a signal handler for SIGALRM
.
The return value is either 0
, or if a previous alarm had been set, the number of seconds remaining before it would have gone off. However, there is only one such alarm for a process; the previous alarm is canceled and the new one is put in place.
The advantage here is that with your own handler in place, you can do anything you wish when the signal comes in. The disadvantage is that you have to be prepared to work in multiple contexts: that of the mainline program and that of the signal handler.
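As an illustration, the following sketch waits for an alarm by combining alarm() with the POSIX signal APIs. The names alarm_fired, alarm_handler(), wait_for_alarm(), and cancel_alarm() are ours. The sketch rejects a seconds value of 0, since ‘alarm(0)’ cancels any alarm instead of setting one.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

volatile sig_atomic_t alarm_fired = 0;

static void alarm_handler(int sig)
{
    (void) sig;
    alarm_fired = 1;
}

/*
 * wait_for_alarm --- hypothetical helper: arrange for SIGALRM after
 * the given number of seconds and suspend until it arrives.
 * Returns 0 on success, -1 on error.
 */
int wait_for_alarm(unsigned int seconds)
{
    struct sigaction sa, old_sa;
    sigset_t block, saved, waitmask;

    if (seconds == 0)
        return -1;

    /* Block SIGALRM first so it cannot slip in before we suspend. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &saved) < 0)
        return -1;

    sa.sa_handler = alarm_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, &old_sa) < 0) {
        sigprocmask(SIG_SETMASK, &saved, NULL);
        return -1;
    }

    alarm_fired = 0;
    alarm(seconds);         /* returns at once; the program keeps running */

    waitmask = saved;
    sigdelset(&waitmask, SIGALRM);
    while (! alarm_fired)
        sigsuspend(&waitmask);

    /* Put the previous action and mask back. */
    sigaction(SIGALRM, &old_sa, NULL);
    return sigprocmask(SIG_SETMASK, &saved, NULL);
}

/*
 * cancel_alarm --- cancel any outstanding alarm; returns the number
 * of seconds that were left, or 0 if no alarm was set.
 */
unsigned int cancel_alarm(void)
{
    return alarm(0);
}
```

A real program would normally do useful work between the call to alarm() and the point where it checks alarm_fired; suspending immediately, as here, just keeps the sketch short.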
An easier way to wait a fixed amount of time is with sleep()
:
#include <unistd.h> POSIX
unsigned int sleep(unsigned int seconds);
The return value is 0
if the process slept for the full amount of time. Otherwise, the return value is the remaining time left to sleep. This latter return value can occur if a signal came in while the process was napping.
The sleep() function is often implemented with a combination of signal(), alarm(), and pause(). This approach makes it dangerous to mix sleep() with your own calls to alarm() (or the setitimer() advanced function, described in Section 14.3.3, “Interval Timers: setitimer() and getitimer(),” page 546). To learn about the nanosleep() function now, see Section 14.3.4, “More Exact Pauses: nanosleep(),” page 550.
Several signals are used to implement job control—the ability to start and stop jobs, and move them to and from the background and foreground. At the user level, you have undoubtedly done this: using CTRL-Z to stop a job, bg
to put it in the background, and occasionally using fg
to move a background or stopped job into the foreground.
Section 9.2.1, “Job Control Overview,” page 312, describes generally how job control works. This section completes the overview by describing the job control signals, since you may occasionally wish to catch them directly:
SIGTSTP
The default action for SIGTSTP
is to stop (suspend) the process. However, you can catch this signal, just like any other. It is a good idea to do so if your program changes the state of the terminal. For example, consider the vi
or Emacs screen editors, which put the terminal into character-at-a-time mode. Upon receipt of SIGTSTP
, they should restore the terminal to its normal line-at-a-time mode, and then suspend themselves.
SIGSTOP
This signal also stops a process, but it cannot be caught, blocked, or ignored. It can be used manually (with the kill
command) as a last resort, or programmatically. For example, the SIGTSTP
handler just discussed, after restoring the terminal’s state, could then use ’raise(SIGSTOP
)’ to stop the process.
SIGTTIN, SIGTTOU
These signals were defined earlier as “background read from tty” and “background write to tty.” A tty is a terminal device. On job control systems, processes running in the background are blocked from reading from or writing to the terminal. When a process attempts either operation, the kernel sends it the appropriate signal. For both of them, the default action is to stop the process. You may catch these signals if you wish, but there is rarely a reason to do so.
SIGCONT
This signal continues a stopped process. It is ignored if the process is not stopped. You can catch it if you wish, but again, for most programs, there’s little reason to do so. Continuing our example, the SIGCONT
handler for a screen editor should put the terminal back into character-at-a-time mode before returning.
When a process is stopped, any other signals sent to it become pending. The exception to this is SIGKILL
, which is always delivered to the process and which cannot be caught, blocked, or ignored. Assuming that signals besides SIGKILL
have been sent, upon receipt of a SIGCONT
, the pending signals are delivered and the process then continues execution after they’ve been handled.
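The screen editor scenario can be sketched as a pair of handlers. Everything here is illustrative: restore_terminal() and enter_editor_mode() are stand-ins that only flip a flag (a real editor would change the terminal settings there), and the names tstp_handler(), cont_handler(), install_job_control_handlers(), and char_at_a_time are our own.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Stand-in for the real terminal state (hypothetical). */
volatile sig_atomic_t char_at_a_time = 1;

static void restore_terminal(void)  { char_at_a_time = 0; }
static void enter_editor_mode(void) { char_at_a_time = 1; }

/* tstp_handler --- tidy up the terminal, then really stop */
static void tstp_handler(int sig)
{
    (void) sig;
    restore_terminal();     /* back to line-at-a-time mode */
    raise(SIGSTOP);         /* cannot be caught: the process stops here */
}

/* cont_handler --- runs when the process is continued */
static void cont_handler(int sig)
{
    (void) sig;
    enter_editor_mode();    /* back to character-at-a-time mode */
}

/*
 * install_job_control_handlers --- hypothetical helper: catch SIGTSTP
 * and SIGCONT. Returns 0 on success, -1 on error.
 */
int install_job_control_handlers(void)
{
    struct sigaction sa;

    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    sa.sa_handler = tstp_handler;
    if (sigaction(SIGTSTP, &sa, NULL) < 0)
        return -1;

    sa.sa_handler = cont_handler;
    return sigaction(SIGCONT, &sa, NULL);
}
```

Since sa_mask is empty, SIGCONT is not blocked while tstp_handler() is stopped inside raise(); when the shell continues the job, cont_handler() runs and then tstp_handler() returns to the interrupted code.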
As described in Section 9.1.1, “Creating a Process: fork(),” page 284, one side effect of calling fork()
is the creation of parent-child relationships among processes. A parent process can wait for one or more of its children to die and recover the child’s exit status by one of the wait()
family of system calls.
Dead child processes that haven’t been waited for are termed zombies. Normally, every time a child process dies, the kernel sends a SIGCHLD
signal to the parent process.[6] The default action is to ignore this signal. In this case, zombie processes accrue until the parent does a wait()
or until the parent itself dies. In the latter case, the zombie children are reparented to the init
system process (PID 1), which reaps them as part of its normal work. Similarly, active children are also reparented to init
and will be reaped when they exit.
SIGCHLD
is used for more than death-of-children notification. Any time a child is stopped (by one of the job control signals discussed earlier), SIGCHLD
is also sent to the parent. The POSIX standard indicates that SIGCHLD
“may be sent” when a child is continued as well; apparently there are differences among historical Unix systems.
A combination of flags for the sa_flags
field in the struct sigaction
, and the use of SIG_IGN
as the action for SIGCHLD
allows you to change the way the kernel deals with children stopping, continuing, or dying.
As with signals in general, the interfaces and mechanisms described here are complicated because they have evolved over time.
The simplest thing you can do is to change the action for SIGCHLD
to SIG_IGN
. In this case, children that terminate do not become zombies. Instead, their exit status is thrown away, and they are removed from the system entirely. Another option that produces the same effect is use of the SA_NOCLDWAIT
flag. In code:
/* Old style: */
signal (SIGCHLD, SIG_IGN);

/* New style: */
struct sigaction sa;

sa.sa_handler = SIG_IGN;
sa.sa_flags = SA_NOCLDWAIT;
sigemptyset (& sa.sa_mask);
sigaction (SIGCHLD, & sa, NULL);
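One way to observe the effect is to try to wait for such a child. The following sketch, with the invented name child_leaves_no_zombie(), relies on the documented behavior that wait() fails with ECHILD once all children are gone and there is nothing left to reap.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * child_leaves_no_zombie --- hypothetical demo: with SIGCHLD ignored
 * and SA_NOCLDWAIT set, a terminated child cannot be waited for.
 * Returns 1 if wait() failed with ECHILD, 0 if it did not, -1 on error.
 */
int child_leaves_no_zombie(void)
{
    struct sigaction sa, saved;
    pid_t pid;
    int result;

    sa.sa_handler = SIG_IGN;
    sa.sa_flags = SA_NOCLDWAIT;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &saved) < 0)
        return -1;

    pid = fork();
    if (pid == 0)
        _exit(0);           /* child: exit immediately */

    if (pid < 0) {
        result = -1;        /* fork() failed */
    } else {
        /* Blocks until the child is gone, then fails: nothing to reap. */
        result = (wait(NULL) == -1 && errno == ECHILD) ? 1 : 0;
    }

    sigaction(SIGCHLD, &saved, NULL);   /* restore previous action */
    return result;
}
```

The function restores the previous SIGCHLD action before returning, so that it doesn’t change how the rest of the program deals with its children.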
Alternatively, you may only care about child termination and not be interested in simple state changes (stopped and continued). In this case, use the SA_NOCLDSTOP
flag, and set up a signal handler that calls wait()
(or one of its siblings) to reap the process.
In general, you cannot expect to get one SIGCHLD
per child that dies. You should treat SIGCHLD
as meaning “at least one child has died” and be prepared to reap as many children as possible whenever you process SIGCHLD
.
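A common way to follow this advice, when the program doesn’t need to track individual PIDs, is to loop on ‘waitpid(-1, ..., WNOHANG)’ until no more children are ready. The sketch below is an alternative to keeping a table of PIDs; the names children_reaped, reap_children(), and spawn_and_reap() are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

volatile sig_atomic_t children_reaped = 0;

/* reap_children --- SIGCHLD handler: collect every child that is ready */
static void reap_children(int sig)
{
    int saved_errno = errno;    /* waitpid() may change errno */

    (void) sig;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        children_reaped++;

    errno = saved_errno;
}

/*
 * spawn_and_reap --- hypothetical demo: start count children that exit
 * at once, and return after all of them have been reaped.
 * Returns the number of children started, or -1 on error.
 */
int spawn_and_reap(int count)
{
    struct sigaction sa, old_sa;
    sigset_t block, saved, waitmask;
    int i, started = 0;

    /* Keep SIGCHLD blocked except while suspended. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &saved) < 0)
        return -1;

    sa.sa_handler = reap_children;
    sa.sa_flags = SA_NOCLDSTOP;     /* termination only, not stops */
    sigfillset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0) {
        sigprocmask(SIG_SETMASK, &saved, NULL);
        return -1;
    }

    children_reaped = 0;
    for (i = 0; i < count; i++) {
        pid_t pid = fork();

        if (pid == 0)
            _exit(0);               /* child */
        if (pid > 0)
            started++;
    }

    waitmask = saved;
    sigdelset(&waitmask, SIGCHLD);
    while (children_reaped < started)
        sigsuspend(&waitmask);      /* one SIGCHLD may cover several children */

    sigaction(SIGCHLD, &old_sa, NULL);
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return started;
}
```

The loop in the handler is what makes this robust: even if several children die while SIGCHLD is blocked and only one signal is delivered, every available child is still collected.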
The following program, ch10-reap1.c
, blocks SIGCHLD
until it’s ready to recover the children.
 1  /* ch10-reap1.c --- demonstrate SIGCHLD management, using a loop */
 2
 3  #include <stdio.h>
 4  #include <errno.h>
 5  #include <signal.h>
 6  #include <string.h>
 7  #include <sys/types.h>
 8  #include <sys/wait.h>
 9
10  #define MAX_KIDS 42
11  #define NOT_USED -1
12
13  pid_t kids [MAX_KIDS];
14  size_t nkids = 0;
The kids array tracks the process IDs of child processes. If an element is NOT_USED, then it doesn’t represent an unreaped child. (Lines 89–90, below, initialize it.) nkids indicates how many values in kids should be checked.
16  /* format_num --- helper function since can't use [sf]printf() */
17
18  const char *format_num(int num)
19  {
20  #define NUMSIZ 30
21      static char buf[NUMSIZ];
22      int i;
23
24      if (num <= 0) {
25          strcpy(buf, "0");
26          return buf;
27      }
28
29      i = NUMSIZ - 1;
30      buf[i--] = '\0';
31
32      /* Generate digits backwards into string. */
33      do {
34          buf[i--] = (num % 10) + '0';
35          num /= 10;
36      } while (num > 0);
37
38      return & buf[i+1];
39  }
Because signal handlers should not call any member of the printf()
family, we provide a simple “helper” function, format_num()
, to turn a decimal signal or PID number into a string. This is primitive, but it works.
41  /* childhandler --- catch SIGCHLD, reap all available children */
42
43  void childhandler(int sig)
44  {
45      int status, ret;
46      int i;
47      char buf[100];
48      static const char entered[] = "Entered childhandler\n";
49      static const char exited[] = "Exited childhandler\n";
50
51      write(1, entered, strlen(entered));
52      for (i = 0; i < nkids; i++) {
53          if (kids[i] == NOT_USED)
54              continue;
55
56      retry:
57          if ((ret = waitpid(kids[i], & status, WNOHANG)) == kids[i]) {
58              strcpy(buf, "\treaped process ");
59              strcat(buf, format_num(ret));
60              strcat(buf, "\n");
61              write(1, buf, strlen(buf));
62              kids[i] = NOT_USED;
63          } else if (ret == 0) {
64              strcpy(buf, "\tpid ");
65              strcat(buf, format_num(kids[i]));
66              strcat(buf, " not available yet\n");
67              write(1, buf, strlen(buf));
68          } else if (ret == -1 && errno == EINTR) {
69              write(1, "\tretrying\n", 10);
70              goto retry;
71          } else {
72              strcpy(buf, "\twaitpid() failed: ");
73              strcat(buf, strerror(errno));
74              strcat(buf, "\n");
75              write(1, buf, strlen(buf));
76          }
77      }
78      write(1, exited, strlen(exited));
79  }
Lines 51 and 78 print “entered” and “exited” messages, so that we can clearly see when the signal handler is invoked. Other messages start with a leading TAB character.
The main part of the signal handler is a large loop, lines 52–77. Lines 53–54 check for NOT_USED and continue the loop if the current slot isn’t in use.
Line 57 calls waitpid() on the PID in the current element of kids. We supply the WNOHANG option, which causes waitpid() to return immediately if the requested child isn’t available. This option is necessary since it’s possible that not all of the children have exited.
Based on the return value, the code takes the appropriate action. Lines 57–62 handle the case in which the child is found, by printing a message and marking the appropriate slot in kids as NOT_USED.
Lines 63–67 handle the case in which the requested child is not available. The return value is 0 in this case, so we print a message and keep going.
Lines 68–70 handle the case in which the system call was interrupted. In this case, a goto back to the waitpid() call is the cleanest way to handle things. (Since main() causes all signals to be blocked when the signal handler runs [line 96], this interruption shouldn’t happen. But this example shows you how to deal with all the cases.)
Lines 71–76 handle any other error, printing an appropriate error message.
 81  /* main --- set up child-related information and signals, create children */
 82
 83  int main(int argc, char **argv)
 84  {
 85      struct sigaction sa;
 86      sigset_t childset, emptyset;
 87      int i;
 88
 89      for (i = 0; i < MAX_KIDS; i++)
 90          kids[i] = NOT_USED;
 91
 92      sigemptyset(& emptyset);
 93
 94      sa.sa_flags = SA_NOCLDSTOP;
 95      sa.sa_handler = childhandler;
 96      sigfillset(& sa.sa_mask);    /* block everything when handler runs */
 97      sigaction(SIGCHLD, & sa, NULL);
 98
 99      sigemptyset(& childset);
100      sigaddset(& childset, SIGCHLD);
101
102      sigprocmask(SIG_SETMASK, & childset, NULL);  /* block it in main code */
103
104      for (nkids = 0; nkids < 5; nkids++) {
105          if ((kids[nkids] = fork()) == 0) {
106              sleep(3);
107              _exit(0);
108          }
109      }
110
111      sleep(5);    /* give the kids a chance to terminate */
112
113      printf("waiting for signal\n");
114      sigsuspend(& emptyset);
115
116      return 0;
117  }
Lines 89–90 initialize kids. Line 92 initializes emptyset. Lines 94–97 set up and install the signal handler for SIGCHLD. Note the use of SA_NOCLDSTOP on line 94, while line 96 blocks all signals when the handler is running.
Lines 99–100 create a signal set representing just SIGCHLD, and line 102 installs it as the process signal mask for the program.
Lines 104–109 create five child processes, each of which sleeps for three seconds. Along the way, the loop updates the kids array and the nkids variable.
Line 111 then gives the children a chance to terminate by sleeping longer than they did. (This doesn’t guarantee that the children will terminate, but the chances are pretty good.)
Finally, lines 113–114 print a message and then pause, replacing the process signal mask that blocks SIGCHLD with an empty one. This allows the SIGCHLD signal to come through, in turn causing the signal handler to run. Here’s what happens:
$ ch10-reap1                                 Run the program
waiting for signal
Entered childhandler
        reaped process 23937
        reaped process 23938
        reaped process 23939
        reaped process 23940
        reaped process 23941
Exited childhandler
The signal handler reaps all of the children in one go.
The following program, ch10-reap2.c, is similar to ch10-reap1.c. The difference is that it allows SIGCHLD to arrive at any time. This behavior increases the chance of receiving more than one SIGCHLD but does not guarantee it. As a result, the signal handler still has to be prepared to reap multiple children in a loop.
 1  /* ch10-reap2.c --- demonstrate SIGCHLD management, one signal per child */
 2
    ...unchanged code omitted...
12
13  pid_t kids[MAX_KIDS];
14  size_t nkids = 0;
15  size_t kidsleft = 0;    /* <<< Added */
16
    ...unchanged code for format_num() omitted...
41
42  /* childhandler --- catch SIGCHLD, reap all available children */
43
44  void childhandler(int sig)
45  {
46      int status, ret;
47      int i;
48      char buf[100];
49      static const char entered[] = "Entered childhandler\n";
50      static const char exited[] = "Exited childhandler\n";
51
52      write(1, entered, strlen(entered));
53      for (i = 0; i < nkids; i++) {
54          if (kids[i] == NOT_USED)
55              continue;
56
57      retry:
58          if ((ret = waitpid(kids[i], & status, WNOHANG)) == kids[i]) {
59              strcpy(buf, "\treaped process ");
60              strcat(buf, format_num(ret));
61              strcat(buf, "\n");
62              write(1, buf, strlen(buf));
63              kids[i] = NOT_USED;
64              kidsleft--;    /* <<< Added */
65          } else if (ret == 0) {
    ...unchanged code omitted...
80      write(1, exited, strlen(exited));
81  }
This is identical to the previous version, except we have a new variable, kidsleft, indicating how many unreaped children there are. Lines 15 and 64 flag the new code.
 83  /* main --- set up child-related information and signals, create children */
 84
 85  int main(int argc, char **argv)
 86  {
     ...unchanged code omitted...
100
101      sigemptyset(& childset);
102      sigaddset(& childset, SIGCHLD);
103
104  /*  sigprocmask(SIG_SETMASK, & childset, NULL); /* block it in main code */
105
106      for (nkids = 0; nkids < 5; nkids++) {
107          if ((kids[nkids] = fork()) == 0) {
108              sleep(3);
109              _exit(0);
110          }
111          kidsleft++;    /* <<< Added */
112      }
113
114  /*  sleep(5);    /* give the kids a chance to terminate */
115
116      while (kidsleft > 0) {    /* <<< Added */
117          printf("waiting for signals\n");
118          sigsuspend(& emptyset);
119      }    /* <<< Added */
120
121      return 0;
122  }
Here too, the code is almost identical. Lines 104 and 114 are commented out from the earlier version, and lines 111, 116, and 119 were added. Surprisingly, when run, the behavior varies by kernel version!
$ uname -a                                   Display system version
Linux example1 2.4.20-8 #1 Thu Mar 13 17:54:28 EST 2003 i686 i686 i386 GNU/Linux
$ ch10-reap2                                 Run the program
waiting for signals
Entered childhandler                         Reap one child
        reaped process 2702
        pid 2703 not available yet
        pid 2704 not available yet
        pid 2705 not available yet
        pid 2706 not available yet
Exited childhandler
waiting for signals
Entered childhandler                         And the next
        reaped process 2703
        pid 2704 not available yet
        pid 2705 not available yet
        pid 2706 not available yet
Exited childhandler
waiting for signals
Entered childhandler                         And so on
        reaped process 2704
        pid 2705 not available yet
        pid 2706 not available yet
Exited childhandler
waiting for signals
Entered childhandler
        reaped process 2705
        pid 2706 not available yet
Exited childhandler
waiting for signals
Entered childhandler
        reaped process 2706
Exited childhandler
In this example, exactly one SIGCHLD is delivered per child process! While this is lovely, and completely reproducible on this system, it’s also unusual. On both an earlier and a later kernel and on Solaris, the program receives one signal for more than one child:
$ uname -a                                   Display system version
Linux example2 2.4.22-1.2115.nptl #1 Wed Oct 29 15:42:51 EST 2003 i686 i686 i386 GNU/Linux
$ ch10-reap2                                 Run the program
waiting for signals
Entered childhandler                         Signal handler only called once
        reaped process 9564
        reaped process 9565
        reaped process 9566
        reaped process 9567
        reaped process 9568
Exited childhandler
The code for ch10-reap2.c has one important flaw: a race condition. Take another look at lines 106–112 in ch10-reap2.c. What happens if a SIGCHLD comes in while this code is running? It’s possible for the kids array and the nkids and kidsleft variables to become corrupted: The main code adds in a new process, but the signal handler takes one away.
This piece of code is an excellent example of a critical section; it must run uninterrupted. The correct way to manage this code is to bracket it with calls that first block, and then unblock, SIGCHLD.
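A minimal sketch of such bracketing follows. It assumes the same kids, nkids, and kidsleft variables as ch10-reap2.c; the create_kids() function is our own hypothetical helper, and its children exit at once instead of sleeping, just to keep the example short.

```c
#define _POSIX_C_SOURCE 200809L    /* request POSIX.1-2008 declarations */
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_KIDS 42

pid_t kids[MAX_KIDS];
size_t nkids = 0;
size_t kidsleft = 0;

/* create_kids --- fork up to count children with SIGCHLD blocked throughout;
   returns the number created, or -1 if the mask could not be changed
   (hypothetical helper, not part of ch10-reap2.c) */

int create_kids(size_t count)
{
    sigset_t childset, oldset;

    sigemptyset(&childset);
    sigaddset(&childset, SIGCHLD);

    /* Enter the critical section: add SIGCHLD to the blocked signals. */
    if (sigprocmask(SIG_BLOCK, &childset, &oldset) < 0)
        return -1;

    for (nkids = 0; nkids < count && nkids < MAX_KIDS; nkids++) {
        pid_t pid = fork();

        if (pid == -1)
            break;              /* fork failed; stop creating children */
        if (pid == 0)
            _exit(0);           /* child: a real program does its work here */

        kids[nkids] = pid;      /* parent: bookkeeping, safe from the handler */
        kidsleft++;
    }

    /* Leave the critical section: restore the mask in effect on entry. */
    if (sigprocmask(SIG_SETMASK, &oldset, NULL) < 0)
        return -1;

    return (int) nkids;
}
```

Restoring the saved mask, instead of unconditionally unblocking SIGCHLD, leaves the signal blocked if the caller had already blocked it. Any SIGCHLD that arrived during the loop is delivered as soon as the mask is restored.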
The siginfo_t structure and the three-argument signal catcher make it possible to learn what happened to a child. For SIGCHLD, the si_code field of the siginfo_t indicates the reason the signal was sent (child stopped, continued, exited, etc.). Table 10.5 presents the full list of values. All of these are defined as an XSI extension in the POSIX standard.
Table 10.5. XSI si_code values for SIGCHLD
Value | Meaning |
---|---|
CLD_CONTINUED | A stopped child has been continued. |
CLD_DUMPED | Child terminated abnormally and dumped core. |
CLD_EXITED | Child exited normally. |
CLD_KILLED | Child was killed by a signal. |
CLD_STOPPED | The child process was stopped. |
CLD_TRAPPED | A child being traced has stopped. (This condition occurs if a program is being traced, either from a debugger or for real-time monitoring. In any case, you’re not likely to see it in run-of-the-mill situations.) |
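As a small illustration of working with these values, the following sketch maps an si_code to a short description and prints it with write(). The function names cld_reason() and report_reason() and the message wording are our own assumptions; only the CLD_ constants come from the standard.

```c
#define _POSIX_C_SOURCE 200809L    /* request POSIX.1-2008 declarations */
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* cld_reason --- map a SIGCHLD si_code value to a short description
   (hypothetical helper) */

const char *cld_reason(int code)
{
    switch (code) {
    case CLD_CONTINUED: return "continued";
    case CLD_DUMPED:    return "terminated abnormally and dumped core";
    case CLD_EXITED:    return "exited normally";
    case CLD_KILLED:    return "killed by a signal";
    case CLD_STOPPED:   return "stopped";
    case CLD_TRAPPED:   return "traced child stopped";
    default:            return "unknown si_code";
    }
}

/* report_reason --- print the description on standard output, using only
   write(), in the style of this chapter's handlers; returns 0 or -1 */

int report_reason(int code)
{
    const char *msg = cld_reason(code);

    if (write(STDOUT_FILENO, msg, strlen(msg)) < 0
        || write(STDOUT_FILENO, "\n", 1) < 0)
        return -1;
    return 0;
}
```

A three-argument handler could call report_reason(si->si_code) on its siginfo_t argument.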
The following program, ch10-status.c, demonstrates the use of the siginfo_t structure.
 1  /* ch10-status.c --- demonstrate SIGCHLD management, use 3 argument handler */
 2
 3  #include <stdio.h>
 4  #include <errno.h>
 5  #include <signal.h>
 6  #include <string.h>
 7  #include <sys/types.h>
 8  #include <sys/wait.h>
 9
10  void manage(siginfo_t *si);
11
    ...unchanged code for format_num() omitted...
Lines 3–8 include standard header files, line 10 declares manage(), which deals with the child’s status changes, and the format_num() function is unchanged from before.
37  /* childhandler --- catch SIGCHLD, reap just one child */
38
39  void childhandler(int sig, siginfo_t *si, void *context)
40  {
41      int status, ret;
42      int i;
43      char buf[100];
44      static const char entered[] = "Entered childhandler\n";
45      static const char exited[] = "Exited childhandler\n";
46
47      write(1, entered, strlen(entered));
48  retry:
49      if ((ret = waitpid(si->si_pid, & status, WNOHANG)) == si->si_pid) {
50          strcpy(buf, "\treaped process ");
51          strcat(buf, format_num(si->si_pid));
52          strcat(buf, "\n");
53          write(1, buf, strlen(buf));
54          manage(si);    /* deal with what happened to it */
55      } else if (ret > 0) {
56          strcpy(buf, "\treaped unexpected pid ");
57          strcat(buf, format_num(ret));
58          strcat(buf, "\n");
59          write(1, buf, strlen(buf));
60          goto retry;    /* why not? */
61      } else if (ret == 0) {
62          strcpy(buf, "\tpid ");
63          strcat(buf, format_num(si->si_pid));
64          strcat(buf, " changed status\n");
65          write(1, buf, strlen(buf));
66          manage(si);    /* deal with what happened to it */
67      } else if (ret == -1 && errno == EINTR) {
68          write(1, "\tretrying\n", 10);
69          goto retry;
70      } else {
71          strcpy(buf, "\twaitpid() failed: ");
72          strcat(buf, strerror(errno));
73          strcat(buf, "\n");
74          write(1, buf, strlen(buf));
75      }
76
77      write(1, exited, strlen(exited));
78  }
The signal handler is similar to those shown earlier. Note the argument list (line 39), and that there is no loop.
Lines 49–54 handle process termination, including calling manage() to print the status.
Lines 55–60 handle the case of an unexpected child dying. This case shouldn’t happen, since this signal handler is passed information specific to a particular child process.
Lines 61–66 are what interest us: The return value is 0 for status changes. manage() deals with the details (line 66).
Lines 67–69 handle interrupts, and lines 70–75 deal with errors.
80  /* child --- what to do in the child */
81
82  void child(void)
83  {
84      raise(SIGCONT);    /* should be ignored */
85      raise(SIGSTOP);    /* go to sleep, parent wakes us back up */
86      printf("\t---> child restarted <---\n");
87      exit(42);          /* normal exit, let parent get value */
88  }
The child() function handles the child’s behavior, taking actions of the sort to cause the parent to be notified.[7] Line 84 sends SIGCONT, which might cause the parent to get a CLD_CONTINUED event. Line 85 sends a SIGSTOP, which stops the process (the signal is uncatchable) and causes a CLD_STOPPED event for the parent. Once the parent restarts the child, the child prints a message to show it’s active again and then exits with a distinguished exit status.
 90  /* main --- set up child-related information and signals, create child */
 91
 92  int main(int argc, char **argv)
 93  {
 94      pid_t kid;
 95      struct sigaction sa;
 96      sigset_t childset, emptyset;
 97
 98      sigemptyset(& emptyset);
 99
100      sa.sa_flags = SA_SIGINFO;
101      sa.sa_sigaction = childhandler;
102      sigfillset(& sa.sa_mask);    /* block everything when handler runs */
103      sigaction(SIGCHLD, & sa, NULL);
104
105      sigemptyset(& childset);
106      sigaddset(& childset, SIGCHLD);
107
108      sigprocmask(SIG_SETMASK, & childset, NULL);  /* block it in main code */
109
110      if ((kid = fork()) == 0)
111          child();
112
113      /* parent executes here */
114      for (;;) {
115          printf("waiting for signals\n");
116          sigsuspend(& emptyset);
117      }
118
119      return 0;
120  }
The main() program sets everything up. Lines 100–103 put the handler in place. Line 100 sets the SA_SIGINFO flag so that the three-argument handler is used. Lines 105–108 block SIGCHLD.
Line 110 creates the child process. Lines 113–117 continue in the parent, using sigsuspend() to wait for signals to come in.
123  /* manage --- deal with different things that could happen to child */
124
125  void manage(siginfo_t *si)
126  {
127      char buf[100];
128
129      switch (si->si_code) {
130      case CLD_STOPPED:
131          write(1, "\tchild stopped, restarting\n", 27);
132          kill(si->si_pid, SIGCONT);
133          break;
134
135      case CLD_CONTINUED:    /* not sent on Linux */
136          write(1, "\tchild continued\n", 17);
137          break;
138
139      case CLD_EXITED:
140          strcpy(buf, "\tchild exited with status ");
141          strcat(buf, format_num(si->si_status));
142          strcat(buf, "\n");
143          write(1, buf, strlen(buf));
144          exit(0);    /* we're done */
145          break;
146
147      case CLD_DUMPED:
148          write(1, "\tchild dumped\n", 14);
149          break;
150
151      case CLD_KILLED:
152          write(1, "\tchild killed\n", 14);
153          break;
154
155      case CLD_TRAPPED:
156          write(1, "\tchild trapped\n", 15);
157          break;
158      }
159  }
Through the manage() function, the parent deals with the status change in the child. manage() is called when the status changes and when the child has exited.
Lines 130–133 handle the case in which the child stopped; the parent restarts the child by sending SIGCONT.
Lines 135–137 print a notification that the child continued. This event doesn’t happen on GNU/Linux systems, and the POSIX standard uses wishy-washy language about it, merely saying that this event can occur, not that it will.
Lines 139–145 handle the case in which the child exits, printing the exit status. For this program, the parent is done too, so the code exits, although in a larger program, that’s not the right action to take.
The other cases are more specialized. In the event of CLD_KILLED, the status value filled in by waitpid() would be useful in determining more details.
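For instance, the status value says which signal killed the child. The sketch below is our own illustration and is not part of ch10-status.c; death_signal() and kill_and_wait() are hypothetical names, and the second function exists only to produce a status value to examine.

```c
#define _POSIX_C_SOURCE 200809L    /* request POSIX.1-2008 declarations */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* death_signal --- return the signal that killed a child, or 0 if the child
   was not killed by a signal; status is as filled in by waitpid() */

int death_signal(int status)
{
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}

/* kill_and_wait --- demonstration only: create a child, kill it with sig,
   and return the status that waitpid() reports (-1 on error) */

int kill_and_wait(int sig)
{
    int status = 0;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();            /* child: wait around to be killed */
    }

    if (kill(pid, sig) < 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

With these, death_signal(kill_and_wait(SIGKILL)) yields SIGKILL, whereas the status of a child that called exit() yields 0.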
Here is what happens when ch10-status runs:
$ ch10-status                                Run the program
waiting for signals
Entered childhandler                         Signal handler entered
        pid 24279 changed status
        child stopped, restarting            Handler takes action
Exited childhandler
waiting for signals
        ---> child restarted <---            From the child
Entered childhandler
        reaped process 24279                 Parent's handler reaps child
        child exited with status 42
Unfortunately, because there is no way to guarantee the delivery of one SIGCHLD per process, your program has to be prepared to recover multiple children at one shot.
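One common way to recover several children at once, sketched here under our own assumptions, differs from the per-PID table used in ch10-reap1.c: it asks waitpid() for any child (PID -1) with WNOHANG, and repeats until nothing more is ready. The names reap_available() and spawn_exiting_children() are hypothetical; the second exists only so that the first has something to reap.

```c
#define _POSIX_C_SOURCE 200809L    /* request POSIX.1-2008 declarations */
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* reap_available --- collect every child that has already terminated;
   returns how many were reaped and never blocks (hypothetical helper) */

int reap_available(void)
{
    int status, count = 0;
    int saved_errno = errno;    /* a handler should leave errno undisturbed */
    pid_t pid;

    for (;;) {
        pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0)
            count++;            /* reaped one; look for more */
        else if (pid == -1 && errno == EINTR)
            continue;           /* interrupted; try again */
        else
            break;              /* 0: none ready yet; -1: no children left */
    }

    errno = saved_errno;
    return count;
}

/* spawn_exiting_children --- demonstration only: fork n children that
   exit immediately; returns how many were created */

int spawn_exiting_children(int n)
{
    int i, started = 0;

    for (i = 0; i < n; i++) {
        pid_t pid = fork();

        if (pid == 0)
            _exit(0);
        else if (pid > 0)
            started++;
    }
    return started;
}
```

A SIGCHLD handler that calls reap_available() does not care whether one signal arrived for five children or five signals arrived for five children.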
When a program calls fork(), the signal situation in the child is almost identical to that of the parent. Installed handlers remain in place, blocked signals remain blocked, and so on. However, any signals pending for the parent are cleared for the child, including time left as set by alarm(). This is straightforward, and it makes sense.
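The following sketch lets a child report what it sees after fork(): whether a signal the parent blocked is still blocked, and whether a signal pending in the parent is pending in the child. The function name fork_signal_state() and its bit-mask return convention are our own assumptions.

```c
#define _POSIX_C_SOURCE 200809L    /* request POSIX.1-2008 declarations */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* fork_signal_state --- report what a child sees for sig after fork().
   Returns a bit mask: bit 0 is set if sig is blocked in the child,
   bit 1 is set if sig is pending in the child; returns -1 on error.
   (hypothetical helper) */

int fork_signal_state(int sig)
{
    sigset_t set, oldset, childmask, childpending;
    int status, got;
    pid_t pid;

    sigemptyset(&set);
    sigaddset(&set, sig);
    if (sigprocmask(SIG_BLOCK, &set, &oldset) < 0)
        return -1;

    raise(sig);                 /* blocked, so it stays pending in the parent */

    pid = fork();
    if (pid == 0) {
        int result = 0;

        sigprocmask(SIG_BLOCK, NULL, &childmask);   /* query only */
        sigpending(&childpending);
        if (sigismember(&childmask, sig))
            result |= 1;
        if (sigismember(&childpending, sig))
            result |= 2;
        _exit(result);
    }

    sigwait(&set, &got);        /* consume the parent's pending signal */
    sigprocmask(SIG_SETMASK, &oldset, NULL);

    if (pid == -1 || waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Given the rules just described, fork_signal_state(SIGUSR1) should return 1: the signal is blocked in the child, but not pending there.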
When a process calls one of the exec() functions, the disposition in the new program is as follows:
Signals set to their default action stay set to their default.
Signals that are being caught are reset to their default action, since the handler functions don’t exist in the new program.
Signals that are ignored stay ignored. SIGCHLD is a special case. If SIGCHLD is ignored before the exec(), it may stay ignored after it. Alternatively, it may be reset to the default action. What actually happens is purposely unspecified by POSIX. (The GNU/Linux manpages don’t state what Linux does, and because POSIX leaves it as unspecified, any code you write that uses SIGCHLD should be prepared to handle either case.)
Signals that are blocked before the exec() remain blocked after it. In other words, the new program inherits the process’s existing process signal mask.
Any pending signals (those that have arrived but that were blocked) remain pending. The new program gets them if and when it unblocks them.
The time remaining for an alarm() remains in place. (In other words, if a process sets an alarm and then calls exec() directly, the new image will eventually get the SIGALRM. If it does a fork() first, the parent keeps the alarm setting, while the child, which does the exec(), does not.)
“Our story so far, Episode III.”
--Arnold Robbins
Signal handling interfaces have evolved from simple but prone to race conditions to complicated but reliable. Unfortunately, the multiplicity of interfaces makes them harder to learn than many other Linux/Unix APIs.
Each signal has an action associated with it. The action is one of the following: ignore the signal; perform the system default action; or call a user-provided handler. The system default action, in turn, is one of the following: ignore the signal; kill the process; kill the process and dump core; stop the process; or continue the process if stopped.
signal() and raise() are standardized by ISO C. signal() manages actions for particular signals; raise() sends a signal to the current process. Whether signal handlers stay installed upon invocation or are reset to their default values is up to the implementation. signal() and raise() are the simplest interfaces, and they suffice for many applications.
POSIX defines the bsd_signal() function, which is like signal() but guarantees that the handler stays installed.
What happens after a signal handler returns varies according to the type of system. Traditional systems (V7, Solaris, and likely others) reset signal dispositions to their default. On those systems, interrupted system calls return -1, setting errno to EINTR. BSD systems leave the handler installed and only return -1 with errno set to EINTR when no data were transferred; otherwise, they restart the system call.
GNU/Linux follows POSIX, which is similar but not identical to BSD. If no data were transferred, the system call returns -1/EINTR. Otherwise, it returns a count of the amount of data transferred. The BSD “always restart” behavior is available in the sigaction() interface but is not the default.
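When the “always restart” behavior isn’t in effect, the caller deals with -1/EINTR itself. A minimal sketch of one conventional way to do so follows; the wrapper name read_retry() is our own.

```c
#define _POSIX_C_SOURCE 200809L    /* request POSIX.1-2008 declarations */
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* read_retry --- like read(), but try again whenever a signal interrupts
   the call before any data were transferred (hypothetical wrapper) */

ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);

    return n;    /* bytes read, 0 at end of file, or -1 for a real error */
}
```

A short count is returned to the caller as is; only the “interrupted with nothing transferred” case is retried.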
Signal handlers used with signal() are prone to race conditions. The only variables that a signal handler should access are those of type volatile sig_atomic_t. (For expositional purposes, we did not follow this rule in some of our examples.) Similarly, only the functions in Table 10.2 are safe to call from within a signal handler.
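A handler that follows the rule does nothing except set a flag, leaving the real work to the main code. The sketch below shows the pattern; the names got_signal, flag_handler(), and install_flag_handler() are our own.

```c
#define _POSIX_C_SOURCE 200809L    /* request POSIX.1-2008 declarations */
#include <signal.h>
#include <stddef.h>

volatile sig_atomic_t got_signal = 0;    /* the only data the handler touches */

/* flag_handler --- note that the signal arrived, and nothing else */

void flag_handler(int sig)
{
    (void) sig;                 /* unused */
    got_signal = 1;
}

/* install_flag_handler --- catch sig with flag_handler();
   returns 0 on success, -1 on error */

int install_flag_handler(int sig)
{
    struct sigaction sa;

    sa.sa_handler = flag_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}
```

The main code tests got_signal at a convenient point, resets it to zero, and then does whatever the signal calls for.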
The System V Release 3 signal API (lifted from 4.0 BSD) was an initial attempt at reliable signals. Don’t use it in new code.
The POSIX API has multiple components:
the process signal mask, which lists the currently blocked signals,
the sigset_t type to represent signal masks, and the sigfillset(), sigemptyset(), sigaddset(), sigdelset(), and sigismember() functions for working with it,
the sigprocmask() function to set and retrieve the process signal mask,
the sigpending() function to retrieve the set of pending signals,
the sigaction() API and struct sigaction in all their glory.
These facilities together use signal blocking and the process signal mask to provide reliable signals. Furthermore, through various flags, it’s possible to get restartable system calls and a more capable signal handler that receives more information about the reason for a particular signal (the siginfo_t structure).
kill() and killpg() are the POSIX mechanisms for sending signals. These differ from raise() in two ways: (1) one process may send a signal to another process or an entire process group (permissions permitting, of course), and (2) sending signal 0 does not send anything but does do the checking. Thus, these functions provide a way to verify the existence of a particular process or process group, and the ability to send it (them) a signal.
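The existence check can be sketched as follows. The function name process_exists() is our own, and treating EPERM as “exists” is our reading of the error codes: the process is there, but we may not signal it.

```c
#define _POSIX_C_SOURCE 200809L    /* request POSIX.1-2008 declarations */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* process_exists --- return 1 if pid names an existing process, 0 if not
   (hypothetical helper) */

int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;               /* exists, and we may send it signals */

    return errno == EPERM;      /* exists, but we lack permission */
}
```

Two cautions apply: a terminated child that hasn’t been reaped yet still counts as existing, and the answer can be out of date by the time the caller acts on it.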
Signals can be used as an IPC mechanism, although such use is a poor way to structure your application and is prone to race conditions. If someone holds a gun to your head to make you work that way, use careful signal blocking and the sigaction() interface to do it correctly.
SIGALRM and the alarm() system call provide a low-level mechanism for notification after a certain number of seconds have passed. pause() suspends a process until any signal comes in. sleep() uses these to put a process to sleep for a given amount of time; for that reason, sleep() and alarm() should not be used together. pause() itself opens up race conditions; signal blocking and sigsuspend() should be used instead.
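A hedged sketch of the blocking-plus-sigsuspend() approach follows. It waits for a single alarm without the window that pause() leaves open; wait_for_alarm(), alarm_handler(), and alarm_fired are hypothetical names of our own, and the function makes no attempt to preserve an alarm that was already scheduled.

```c
#define _POSIX_C_SOURCE 200809L    /* request POSIX.1-2008 declarations */
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_fired = 0;

/* alarm_handler --- note that SIGALRM arrived */

static void alarm_handler(int sig)
{
    (void) sig;                 /* unused */
    alarm_fired = 1;
}

/* wait_for_alarm --- suspend the caller until a SIGALRM scheduled secs
   seconds from now arrives; returns 0 on success, -1 on error */

int wait_for_alarm(unsigned int secs)
{
    struct sigaction sa, oldsa;
    sigset_t alrmset, oldmask, waitmask;

    if (secs == 0)
        return 0;               /* alarm(0) would cancel, not schedule */

    /* Block SIGALRM first, so that it cannot arrive before we wait. */
    sigemptyset(&alrmset);
    sigaddset(&alrmset, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &alrmset, &oldmask) < 0)
        return -1;

    sa.sa_handler = alarm_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &oldsa) < 0) {
        sigprocmask(SIG_SETMASK, &oldmask, NULL);
        return -1;
    }

    alarm_fired = 0;
    alarm(secs);

    /* Wait with the caller's mask minus SIGALRM. sigsuspend() installs
       the mask and suspends the process in one atomic step. */
    waitmask = oldmask;
    sigdelset(&waitmask, SIGALRM);
    while (! alarm_fired)
        sigsuspend(&waitmask);

    sigaction(SIGALRM, &oldsa, NULL);           /* put things back */
    sigprocmask(SIG_SETMASK, &oldmask, NULL);
    return 0;
}
```

Had the code called alarm() and then pause() with SIGALRM unblocked, a signal arriving between the two calls would leave pause() waiting for a signal that had already come and gone.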
Job control signals implement job control for shells. Most of the time you should leave them set to their default, but it’s helpful to understand that occasionally it makes sense to catch them.
Catching SIGCHLD lets a parent know what its child processes are doing. Using ‘signal(SIGCHLD, SIG_IGN)’ (or sigaction() with SA_NOCLDWAIT) ignores children altogether. Using sigaction() with SA_NOCLDSTOP provides notification only about termination. In the latter case, whether or not SIGCHLD is blocked, signal handlers for SIGCHLD should be prepared to reap multiple children at once. Finally, using sigaction() without SA_NOCLDSTOP with a three-argument signal handler gives you the reason for receipt of the signal. (Whew!)
After a fork(), signal disposition in the child remains the same, except that pending signals and alarms are cleared. After an exec(), it’s a little more complicated: essentially everything that can be left alone is; anything else is reset to its defaults.
Implement bsd_signal() by using sigaction().
If you’re not running GNU/Linux, run ch10-catchint on your system. Is your system traditional or BSD?
Implement the System V Release 3 functions sighold(), sigrelse(), sigignore(), sigpause(), and sigset() by using sigaction() and the other related functions in the POSIX API.
Practice your bit-bashing skills. Assuming that there is no signal 0 and that there are no more than 31 signals, provide a typedef for sigset_t and write sigemptyset(), sigfillset(), sigaddset(), sigdelset(), and sigismember().
Practice your bit-bashing skills some more. Repeat the previous exercise, this time assuming that the highest signal is 42.
Now that you’ve done the previous two exercises, find sigemptyset() et al. in your <signal.h> header file. (You may have to search for them; they could be in files #included by <signal.h>.) Are they macros or functions?
In Section 10.7, “Signals for Interprocess Communication,” page 379, we mentioned that production code should work with the initial process signal mask, adding signals to be blocked and removing them except in the call to sigsuspend(). Rewrite the example, using the appropriate calls to do this.
Write your own version of the kill command. The interface should be
kill [ -s signal-name ] pid ...
Without a specific signal, the program should send SIGTERM.
Why do you think modern shells such as Bash and ksh93 have kill as a built-in command?
(Hard.) Implement sleep(), using alarm(), signal(), and pause(). What if a signal handler for SIGALRM is already in place?
Experiment with ch10-reap1.c, changing the amount of time each child sleeps and arranging to call sigsuspend() enough times to reap all the children.
See if you can get ch10-reap2.c to corrupt the information in kids, nkids, and kidsleft. Now add blocking/unblocking around the critical section and see if it makes a difference.
[1] At least one vendor of GNU/Linux distributions disables the creation of core files “out of the box.” To reenable them, put the line ‘ulimit -S -c unlimited’ into your ~/.profile file.
[2] Changing the behavior was a bad idea, thoroughly criticized at the time, but it was too late. Changing the semantics of a defined interface always leads to trouble, as it did here. While especially true for operating system designers, anyone designing a general-purpose library should keep this lesson in mind as well.
[3] Although we are describing read(), the rules apply to all system calls that can fail with EINTR, such as those of the wait() family.
[4] The APIs required linking with a separate library, -ljobs, in order to be used.
[5] Our thanks to Ulrich Drepper for helping us understand the issues involved.
[6] Historically, BSD systems used the name SIGCHLD, and this is what POSIX uses. System V had a similar signal named SIGCLD. GNU/Linux #defines the latter to be the former; see Table 10.1.
[7] Perhaps child_at_school() would be a better function name.