Using the timer interrupt and the value of jiffies, it’s easy to generate time intervals that are multiples of the timer tick, but for smaller delays, the programmer must resort to software loops, which are introduced last in this section.
I’ll show you all the fancy techniques, but I think it’s best to become familiar with timing issues by looking at simple code first, even though the first implementations I’m going to show are not the best ones.
If you want to delay execution by a multiple of the clock tick or you don’t require strict precision (for example, if you want to delay an integer number of seconds), the easiest implementation (and the most brain-dead) is the following, also known as busy waiting:
    unsigned long j = jiffies + jit_delay * HZ;

    while (jiffies < j)
        /* nothing */;
This kind of implementation should definitely be avoided.[19] I’m showing it here because on occasion you might want to run this code to better understand the internals of other code (I’ll suggest how to test using busy waiting toward the end of this chapter).
But let’s look at how this code works. The loop is guaranteed to work because jiffies is declared as volatile by the kernel headers and therefore is reread any time some C code accesses it. Though “correct,” this busy loop completely locks the computer for the duration of the delay; the scheduler never interrupts a process that is running in kernel space. Since the kernel is non-reentrant in the current implementation, a busy loop in the kernel locks all the processors of an SMP machine. Still worse, if interrupts happen to be disabled when you enter the loop, jiffies won’t be updated, and the while condition remains true forever. You’ll be forced to hit the big red button.
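One caveat the loop glosses over: jiffies is an unsigned counter, and on a 32-bit machine with HZ equal to 100 it wraps around after roughly 497 days of uptime. If the deadline is computed just before the wrap, a plain "jiffies < j" test fails at once and the delay ends immediately. The following user-space sketch shows the usual signed-difference idiom. The names before_deadline and simulated_wait are hypothetical and are not part of the jit module; the variable ticks merely stands in for jiffies.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper (not a kernel function): returns nonzero while
 * `now` has not yet reached `deadline`, even if the counter wrapped
 * between the two values. It assumes the two values are less than
 * half the counter range apart and that converting an out-of-range
 * unsigned value to long wraps around, as it does on common compilers.
 */
static int before_deadline(unsigned long now, unsigned long deadline)
{
    return (long)(deadline - now) > 0;
}

/*
 * Simulated busy wait: `ticks` stands in for jiffies and is advanced
 * by hand instead of by the timer interrupt. Returns the number of
 * simulated ticks that elapsed before the loop ended.
 */
static unsigned long simulated_wait(unsigned long start, unsigned long delay)
{
    unsigned long ticks = start;
    unsigned long deadline = start + delay;   /* may wrap past ULONG_MAX */
    unsigned long elapsed = 0;

    assert(delay <= (unsigned long)LONG_MAX);

    while (before_deadline(ticks, deadline)) {
        ticks++;        /* the timer interrupt would do this */
        elapsed++;
    }
    return elapsed;
}
```

Starting six ticks before the counter overflows, simulated_wait still waits for the full delay, whereas a direct "less than" comparison would see a small deadline and a huge current value and return at once.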
This implementation of delaying code is available, like the following ones, in the jit module. The /proc/jit* files created by the module delay a whole second every time they are read. If you want to test the busy wait code, you can read /proc/jitbusy, which busy-loops for one second whenever its read method is called; a command like dd if=/proc/jitbusy bs=1 delays one second each time it reads a character. As you may suspect, reading /proc/jitbusy is terrible for system performance, as the computer can run other processes only once a second.
A better solution that allows other processes to run during the time interval is the following, although this method can’t be used in hard real-time tasks or other time-critical situations:
    while (jiffies < j)
        schedule();
The variable j in this example and the following ones is the value of jiffies at the expiration of the delay and is always calculated as shown for busy waiting.
This loop (which can be tested by reading /proc/jitsched) still isn’t optimal. The system can schedule other tasks; the current process does nothing but release the CPU, but it remains in the run queue. If it is the only runnable process, it will actually run (it calls the scheduler, which selects the same process, which calls the scheduler, which...). In other words, the load of the machine (the average number of running processes) will be at least 1, and the idle task (process number 0, also called “swapper” for historical reasons) will never run. Though this issue may seem irrelevant, running the idle task when the computer is idle relieves the processor’s workload, decreasing its temperature and increasing its lifetime, as well as the duration of the batteries if the computer happens to be your laptop. Moreover, since the process is actually executing during the delay, it will be accounted for all the time it consumes. You can see this by running time cat /proc/jitsched.
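The same effect can be observed without loading any module. The sketch below is a user-space analogy, not kernel code: it replaces jiffies with the POSIX monotonic clock and schedule() with sched_yield(), and it reports how much CPU time the "delay" consumed. The function names wall_seconds and yield_delay are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Monotonic wall-clock time, in seconds. */
static double wall_seconds(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/*
 * User-space analogy of "while (jiffies < j) schedule();".
 * Yields the processor until `seconds` have elapsed and returns the
 * CPU time, in seconds, charged to this process meanwhile.
 */
static double yield_delay(double seconds)
{
    assert(seconds >= 0.0);

    double deadline = wall_seconds() + seconds;
    clock_t cpu_start = clock();

    while (wall_seconds() < deadline)
        sched_yield();      /* release the CPU, but stay runnable */

    return (double)(clock() - cpu_start) / CLOCKS_PER_SEC;
}
```

On an otherwise idle machine, the value returned by yield_delay tends to be close to the requested delay, because the process is rescheduled as soon as it yields. That is the accounting cost described above.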
Despite its drawbacks, the previous loop can provide a quick and dirty way to monitor the workings of a driver. If a bug in your module locks the system solid, adding a small delay after each debugging printk statement ensures that every message you print before the processor hits your nasty bug reaches the system log before the system locks. Without such delays, the messages are correctly printed to the memory buffer, but the system locks before klogd can do its job.
Arguably, there is a better way to implement delays. The correct way to put a process to sleep in kernel mode is to set current->timeout and sleep on a wait queue. The timeout value for the process is compared with jiffies every time the scheduler runs. If timeout is smaller than or equal to the current time, the process is awakened independent of what happens to its wait queue. If no system event wakes the process and takes it off the queue, the timeout is reached and the scheduler wakes the process.
Here’s what such a delay looks like:
    struct wait_queue *wait = NULL;

    current->timeout = j;
    interruptible_sleep_on(&wait);
It’s important to call interruptible_sleep_on, not simply sleep_on, because the timeout value is never checked for non-interruptible processes; their sleep can’t be interrupted, even by timing out. Therefore, if you called sleep_on, you would have no way to interrupt a sleeping process. You can test the code shown above by reading /proc/jitqueue.
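The timeout check that the scheduler performs can be modeled in a few lines of user-space C. This is a toy model based only on the description in this section, not the kernel's scheduler code; every name in it (toy_task, toy_state, toy_check_timeout) is invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

enum toy_state { TOY_RUNNING, TOY_INTERRUPTIBLE, TOY_UNINTERRUPTIBLE };

struct toy_task {
    enum toy_state state;
    unsigned long timeout;      /* 0 means "no timeout set" */
};

/*
 * Toy version of the check made each time the scheduler runs.
 * `now` plays the role of jiffies. Returns 1 if the task was woken
 * because its timeout expired, 0 otherwise.
 */
static int toy_check_timeout(struct toy_task *task, unsigned long now)
{
    assert(task != NULL);

    /* The timeout is looked at only for interruptible sleepers. */
    if (task->state != TOY_INTERRUPTIBLE)
        return 0;

    if (task->timeout != 0 && task->timeout <= now) {
        task->timeout = 0;              /* the field is reset on expiry */
        task->state = TOY_RUNNING;      /* the task becomes runnable */
        return 1;
    }
    return 0;
}
```

A task in the TOY_UNINTERRUPTIBLE state is never woken by this function, no matter how far `now` has moved past its timeout, which mirrors the reason for preferring interruptible_sleep_on.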
The timeout field is an interesting system resource. It can be used to implement a timeout for blocking system calls, in addition to calculating delays. If your hardware guarantees a response within some predefined time unless an error occurs, the device driver should set the timeout value for the process before it goes to sleep. For example, when you request a data transfer from or to mass storage, the disk is expected to honor the request within, say, one second. If you’ve set the timeout and it is reached, the process is awakened, and the driver can properly handle the missing transfer. If you use this technique, the timeout value should be reset to 0 if the process is awakened normally. If the timeout expires, the scheduler resets the field and the driver doesn’t need to.
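The reset rule suggests a way for a driver to tell the two kinds of wakeup apart once the sleep returns. The sketch below models that decision in user-space C, assuming the driver stored a nonzero deadline before sleeping, so that a zero field afterward can only mean the scheduler cleared it. The names toy_task, toy_result, and toy_after_sleep are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct toy_task {
    unsigned long timeout;      /* stands in for current->timeout */
};

enum toy_result { TOY_TRANSFER_DONE, TOY_TRANSFER_TIMED_OUT };

/*
 * Called by the toy driver right after its sleep returns.
 * Assumes a nonzero timeout was stored before going to sleep.
 */
static enum toy_result toy_after_sleep(struct toy_task *task)
{
    assert(task != NULL);

    if (task->timeout == 0)     /* cleared on expiry: the deadline passed */
        return TOY_TRANSFER_TIMED_OUT;

    task->timeout = 0;          /* normal wakeup: the driver resets it */
    return TOY_TRANSFER_DONE;
}
```

In the timed-out case the driver would go on to handle the missing transfer, for instance by returning an error to the caller.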
You may have noticed that using a wait queue may be overkill when the aim is just to insert a delay. Actually, you can use current->timeout without wait queues, as follows:
    current->timeout = j;
    current->state = TASK_INTERRUPTIBLE;
    schedule();
    current->timeout = 0; /* reset the timeout */
These statements change the status of the process before calling the scheduler. Making the process TASK_INTERRUPTIBLE (as opposed to TASK_RUNNING) ensures that it won’t be run again until its timeout expires (or some other event, like a signal, wakes it). This way of delaying is implemented in /proc/jitself; its name emphasizes the fact that the reading process is “sleeping by itself,” without calling sleep_on.
Sometimes a real driver needs to calculate very short delays in order to synchronize with the hardware. In this case, using the jiffies value is definitely not the solution.
The kernel function udelay serves this purpose.[20] Its prototype is:
    #include <linux/delay.h>

    void udelay(unsigned long usecs);
The function is compiled inline on most supported architectures and uses a software loop to delay execution for the required number of microseconds. This is where the BogoMips value is used: udelay uses the integer value loops_per_second, which in turn is the result of the BogoMips calculation performed at boot time. The udelay function should be called only for short time lapses, because the precision of loops_per_second is only 8 bits and noticeable errors accumulate when calculating long delays. Even though the maximum allowable delay is nearly one second (since calculations overflow for longer delays), the suggested maximum value for udelay is 1000 microseconds (one millisecond). It’s also important to remember that udelay is a busy-waiting function and that other tasks can’t be run during the time lapse. The source is in <asm/delay.h>.
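To see why long delays overflow, consider one plausible way of converting microseconds to loop iterations with 32-bit arithmetic. This is a sketch under stated assumptions, not a copy of the code in <asm/delay.h>, which is architecture-specific and may differ: the scale factor USEC_SCALE (2^32 / 1000000, truncated) and the function names usecs_overflow and delay_loops are chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed scale factor: 2^32 / 1000000, truncated to an integer. */
#define USEC_SCALE 4294u

/* Nonzero if usecs * USEC_SCALE no longer fits in 32 bits. */
static int usecs_overflow(uint32_t usecs)
{
    return (uint64_t)usecs * USEC_SCALE > UINT32_MAX;
}

/*
 * Loop iterations needed for a delay of `usecs` microseconds, given a
 * calibration value of `loops_per_second`. The result is the high
 * 32 bits of a 32x32-bit product, i.e.
 * usecs * loops_per_second / 1000000, rounded down slightly.
 */
static uint32_t delay_loops(uint32_t usecs, uint32_t loops_per_second)
{
    assert(!usecs_overflow(usecs));

    uint32_t scaled = usecs * USEC_SCALE;   /* about usecs * 2^32 / 10^6 */

    return (uint32_t)(((uint64_t)scaled * loops_per_second) >> 32);
}
```

Under this scheme the scaled value stops fitting in 32 bits slightly past one million microseconds, which is consistent with a ceiling of about one second. The truncated scale factor also makes every result a little short, one more reason to keep the argument small.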
There’s currently no support in the kernel for delays shorter than a timer tick but longer than one millisecond. This is not an issue, because delays need to be just long enough to be noticed by humans or by the hardware. One hundredth of a second is a suitable precision for human-related time intervals, while one millisecond is a long enough delay for hardware activities. If you really need a delay in between, you can easily build a loop around udelay(1000).
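Such a loop might look like the following sketch. So that it can be compiled and tried in user space, udelay is replaced here by a stand-in that merely records the microseconds requested; in a driver you would include <linux/delay.h> and call the real function instead. The helper name delay_msecs is hypothetical.

```c
#include <assert.h>

/*
 * Stand-in for the kernel's udelay() so that the sketch runs in user
 * space: it only adds up the microseconds that were requested.
 */
static unsigned long requested_usecs;

static void udelay(unsigned long usecs)
{
    assert(usecs <= 1000);      /* stay within the suggested maximum */
    requested_usecs += usecs;
}

/* Hypothetical helper: busy-wait for `msecs` milliseconds. */
static void delay_msecs(unsigned int msecs)
{
    while (msecs-- > 0)
        udelay(1000);           /* one millisecond per iteration */
}
```

Because each iteration is a busy wait, the whole delay locks the processor just as a single udelay call does, so the loop is best kept well below one timer tick.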