It’s high time now to begin programming. This chapter is going to introduce all the essential concepts about modules and kernel programming. In these few pages, we’ll build and run a complete module. Building such expertise is an essential foundation for any kind of modularized driver. To avoid throwing in too many concepts, this chapter only talks about modules, without referring to any device class.
All the kernel items (functions, variables, header files, and macros) that are introduced here are described in a reference section at the end of the chapter.
For the impatient reader, the following code is a complete ``Hello, World'' module (which does nothing in particular). This code will compile and run under Linux 2.0 and later versions, but not under 1.2, as explained later in this chapter.[2]
#define MODULE
#include <linux/module.h>

int init_module(void)
{
    printk("<1>Hello, world\n");
    return 0;
}

void cleanup_module(void)
{
    printk("<1>Goodbye cruel world\n");
}
The printk function is defined in the Linux kernel and
behaves similarly to printf; the module can call printk,
because after insmod has loaded it, the module is linked to the kernel
and can access its symbols. The string <1>
is the
priority of the message. I’ve specified a high priority in this
module because a message with the default priority might not show on the
console if you use version 2.0.x of the kernel and an old klogd
daemon (you can ignore this issue for now; we’ll explain it in
Section 4.1.1, in Chapter 4).
You can test the module by calling insmod and rmmod, as shown in the screen dump below. Note that only the superuser can load and unload a module.
root# gcc -c hello.c
root# insmod hello.o
Hello, world
root# rmmod hello
Goodbye cruel world
root#
As you see, writing a module is easy. We’ll go deeper into the topic throughout this chapter.
Before we go further, it’s worth underlining the various differences between a kernel module and an application.
While an application performs a single task from beginning to end, a module registers itself in order to serve future requests, and its ``main'' function terminates immediately. In other words, the task of init_module() (the module’s entry point) is to prepare for later invocation of the module’s functions; it’s as though the module is saying, ``Here I am, and this is what I can do.'' The second entry point of a module, cleanup_module, gets invoked just before the module is unloaded. It should tell the kernel, ``I’m not there any more, don’t ask me to do anything else.'' The ability to unload a module is one of the features of modularization that you’ll most appreciate, because it helps cut down development time; you can test successive versions of your new driver without going through the lengthy shutdown/reboot cycle each time.
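This register-then-serve pattern can be sketched as follows. The name skull, the use of register_chrdev as the registration call, and the empty fops structure are all illustrative assumptions; a real driver would register whatever facility it provides and fill in its methods. The prototypes match the 2.0-era API used throughout this chapter.

    #define MODULE
    #include <linux/module.h>
    #include <linux/fs.h>    /* register_chrdev, unregister_chrdev */

    static struct file_operations skull_fops; /* methods filled in by a real driver */
    static int skull_major;

    int init_module(void)
    {
        /* "Here I am": ask for a dynamic major number */
        int result = register_chrdev(0, "skull", &skull_fops);
        if (result < 0) {
            printk("<1>skull: can't get a major number\n");
            return result;    /* a nonzero return makes insmod fail */
        }
        skull_major = result;
        return 0;
    }

    void cleanup_module(void)
    {
        /* "I'm not here any more": undo the registration */
        unregister_chrdev(skull_major, "skull");
    }

Returning a negative value from init_module makes insmod fail cleanly, so a module that can’t register its facility never stays loaded half-initialized.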
As a programmer, you know that an application can call functions it
doesn’t define: the linking stage resolves external references using
the appropriate library of functions.
printf is one of those callable functions and is defined in libc.
A module, on
the other hand, is linked only to the kernel, and the only functions
it can call are the ones exported by the kernel. The printk
function used in hello.c
above, for example, is the version of
printf defined within the kernel and exported to modules; it
behaves exactly like the original function, except that it has no
floating-point support.
Figure 2.1 shows how function calls and function pointers are used in a module to add new functionality to a running kernel.
Since no library is linked to modules, source files should never include the usual header files. Anything related to the kernel is declared in headers found in /usr/include/linux and /usr/include/asm. The header files that reside in these directories are also used indirectly when compiling applications; kernel code is thus protected by #ifdef __KERNEL__.
The two directories of kernel headers are usually symbolic links
to the place where the kernel sources reside. If you don’t want the
complete Linux source tree on your system, you still need at least these
two directories of header files. In recent kernels, you
also find the net
and scsi
header directories in the
kernel sources,
but it’s very unusual for modules to need them.
The role of the kernel headers will be introduced later, as each of them is needed.
Kernel modules also differ from applications in requiring that you watch out for ``namespace pollution.'' When writing small programs, programmers frequently don’t care about the program’s namespace, but this causes problems when the small programs are going to become part of a huge application. Namespace pollution is what happens when there are many functions and global variables, and their names aren’t meaningful enough to be easily distinguished. The programmer who is forced to deal with such an application expends much mental energy just to remember the ``reserved'' names and to find unique names for new symbols.
We can’t afford to fall into such an error when writing kernel code,
because even the smallest module is going to be linked to the whole
kernel. The best approach to prevent namespace pollution is to declare
all your symbols as static
and to use a well-defined prefix
for the symbols you leave global. Alternatively, you can avoid
declaring static
symbols by declaring a symbol table, as
described in Section 2.3.1, later in this chapter. Using
the chosen prefix even for private symbols within the module can
sometimes simplify debugging. Prefixes used in the kernel are, by
convention, all lowercase, and we’ll stick to the same convention.
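In practice the convention looks like the following fragment. The prefix skull_ is a made-up example name; the code is ordinary C, so the visibility rules apply equally inside or outside the kernel.

```c
/* Everything private is static; the one symbol left global
 * carries the module's prefix ("skull_" is a made-up example). */
static int skull_count;            /* invisible outside this file */

static void skull_tick(void)       /* private helper, prefixed anyway:
                                      the prefix also helps debugging */
{
    skull_count++;
}

int skull_get_count(void)          /* the only global symbol */
{
    skull_tick();
    return skull_count;
}
```

Only skull_get_count can collide with a name elsewhere in the kernel, and the prefix makes such a collision unlikely.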
The last difference between kernel programming and application programming is in how faults are handled: while a segmentation fault is harmless during application development and a debugger can always be used to trace the error to the problem in the source code, a kernel fault is fatal at least for the current process, if not for the whole system. We’ll see how to trace kernel errors in Section 4.4, in Chapter 4.
We can summarize our discussion by saying that a module runs in the so-called ``kernel space,'' while applications run in ``user space.'' This concept is at the basis of operating systems theory.
The role of the operating system, in practice, is to provide programs with a consistent view of the computer’s hardware. In addition, the operating system must account for independent operation of programs and protection against unauthorized access to resources. This non-trivial task is only possible if the CPU enforces protection of system software from the applications.
Every modern processor is able to enforce this behavior. The chosen approach is to implement different operating modalities (or levels) in the CPU itself. The levels have different roles, and some operations are disallowed at the lowest levels; program code can switch from one level to another only through a limited number of ``gates.'' Unix systems are designed to take advantage of this hardware feature, but they only use two such levels (while, for example, Intel processors have four levels). Under Unix, the kernel executes in the highest level (also called ``supervisor mode''), where everything is allowed, while applications execute in the lowest level (the so-called ``user mode''), where the processor inhibits direct access to hardware and unauthorized access to memory.
As mentioned before, when dealing with software, we usually refer to the execution modes as ``kernel space'' and ``user space,'' with reference to the different memory mappings, and thus the different ``address spaces'' used by program code.
Unix transfers execution from user space to kernel space through system calls and hardware interrupts. Kernel code executing a system call is working in the context of a process—it operates on behalf of the calling process and is able to access data in the process’s address space. Code that handles interrupts, on the other hand, is asynchronous with respect to processes and is not related to any particular process.
The role of a module is to extend kernel functionality; modularized code runs in kernel space. Usually a driver performs both the tasks outlined above: some functions in the module are executed as part of system calls, and some are in charge of interrupt handling.
One of the first questions new kernel programmers ask is how multitasking is managed. Actually, there’s nothing special about multitasking except in the scheduler proper, and the scheduler is beyond the scope of the average programmer’s activity. You can face this task, but module writers don’t need to know anything about it, except to learn the following principles.
Unlike application programs, which run sequentially, the kernel works asynchronously, executing system calls on behalf of the applications. The kernel is in charge of input/output and resource management for every process in the system.
Kernel (and module) functions are completely executed in a single thread, usually in the context of a single process, unless they ``go to sleep''—a driver should be able to support concurrency by allowing the interwoven execution of different tasks. For example, a device can be read by two processes at the same time. The driver responds sequentially to several read calls, each belonging to either process. Since the code needs to keep each flow of data distinct, the kernel (and driver) code must maintain internal data structures to be able to tell the different operations apart. That’s not unlike the way a student keeps track of the interweaving of lessons: a different notebook is devoted to each course. An alternative way of dealing with the problem of multiple access would be to avoid it by prohibiting concurrent access to a device, but this lazy technique isn’t even worth discussing here.
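One common way to keep those "notebooks" apart is to allocate a per-open structure when the device is opened. The sketch below assumes a char device; scull_dev is a made-up example name, the open prototype is the 2.0-era one, and kmalloc comes from <linux/malloc.h> in those kernels.

    #include <linux/fs.h>
    #include <linux/malloc.h>    /* kmalloc, on 2.0-era kernels */

    /* One structure per open file keeps concurrent readers apart;
     * "scull_dev" is a made-up example name. */
    struct scull_dev {
        long read_pos;               /* where this reader got to */
        /* ... other per-open state ... */
    };

    int scull_open(struct inode *inode, struct file *filp)
    {
        struct scull_dev *dev = kmalloc(sizeof(struct scull_dev), GFP_KERNEL);
        if (!dev)
            return -ENOMEM;
        dev->read_pos = 0;
        filp->private_data = dev;    /* each open gets its own "notebook" */
        return 0;
    }

Because the kernel passes the same struct file pointer to every operation on that open file, each read call can find its own state in filp->private_data, no matter how the calls from different processes interleave.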
Context switches can’t happen unexpectedly while kernel code is executing, so driver functions don’t need to be reentrant, unless they call schedule by themselves. Functions that must wait for data call sleep_on, which in turn calls schedule. However, you must be careful, as there are other functions that can unexpectedly sleep, notably any access to user space. Making use of ``natural non-preemption'' is generally a bad practice. I won’t deal with reentrant functions until Section 5.2.2, in Chapter 5.
As far as multiple access to the driver is concerned, there are various approaches to keeping things separate, all relying on task-specific data. This data can be either global kernel variables or process-specific arguments to driver functions.
The most important global variable that can be used for tracking processes is current: a pointer to struct task_struct, which is declared in <linux/sched.h>. The current pointer refers to the user process currently executing. During the execution of a system call, such as open or read, the current process is the one that invoked the call.[3] Kernel code can use process-specific information by using current, if it needs to do so. An example of this technique is presented in Section 5.6, in Chapter 5.
The compiler handles current just like the external reference printk. A module can refer to current wherever it wants, and all the references are resolved by insmod at load time. For example, the following statement prints the process ID and the command name of the current process by accessing certain fields in struct task_struct:
printk("The process is \"%s\" (pid %i)\n", current->comm, current->pid);
The command name stored in current->comm is the basename of the executable file that was last executed by the current process.
[2] This example, and all the others presented in this book, are available on the O’Reilly FTP site, as explained in Chapter 1.
[3] In version 2.0, current is a macro that expands to current_set[this_cpu], to be SMP-compliant. Kernel 2.1.37 optimized access to current by storing the value on the stack, thus removing any global symbol.