The rest of this chapter is devoted to writing a complete, though typeless, module. That is, the module will not belong to any of the classes listed in Section 1.3, in Chapter 1. The sample driver shown in this chapter is called skull, short for “Simple Kernel Utility for Loading Localities.” You can reuse the skull source to load your own local code to the kernel, after removing the sample functionality it offers.[4]
Before we deal with the roles of init_module and cleanup_module, however, we’ll write a Makefile that builds object code that the kernel can load.
First, we need to define the __KERNEL__ symbol in the preprocessor before we include any headers. This symbol is used to select which parts of the headers are actually used. Applications end up including kernel headers because libc includes them,[5] but the applications don’t need all the kernel prototypes. Therefore, __KERNEL__ is used to mask the extra ones out via #ifdef. Exporting kernel symbols and macros to user-space programs would greatly contribute to program namespace pollution. If you are compiling for an SMP (Symmetric Multi-Processor) machine, you also need to define __SMP__ before including the kernel headers. This requirement may seem unfriendly, but is going to disappear as soon as the developers find the right way to be SMP-transparent.
Another important symbol is MODULE, which must be defined before including <linux/module.h>. This symbol is always defined, except when compiling drivers that are directly linked to the kernel image. Since none of the drivers covered in this book are directly linked to the kernel, they all define the symbol.
A module writer must also specify the -O flag to the compiler, because many functions are declared as inline in the header files. gcc doesn’t expand inlines unless optimization is enabled, but it can accept both the -g and -O options, allowing you to debug code that uses inline functions.[6]
Finally, in order to prevent unpleasant errors, I suggest that you use the -Wall (all warnings) compiler flag, and also that you fix every construct in the code that triggers a compiler warning, even if this requires changing your usual programming style.
All the definitions and flags I’ve introduced so far are best located within the CFLAGS variable used by make.
In addition to a suitable CFLAGS, the Makefile being built needs a rule for joining different object files. The rule is needed only if the module is split into different source files, but that is not uncommon with modules. The modules are joined through the ld -r command, which is not really a linking operation, even though it uses the linker. This is because the output is another object file, which incorporates all the code from the input files. The -r option means “relocatable”; the output file is relocatable because it doesn’t yet embed absolute addresses.
The following Makefile implements all the features described above, and it builds a module made up of two source files. If your module is made up of a single source file, just skip the entry containing ld -r.
# Change it here or specify it on the "make" commandline
INCLUDEDIR = /usr/include

CFLAGS = -D__KERNEL__ -DMODULE -O -Wall -I$(INCLUDEDIR)

# Extract version number from headers.
VER = $(shell awk -F\" '/REL/ {print $$2}' $(INCLUDEDIR)/linux/version.h)

OBJS = skull.o

all: $(OBJS)

skull.o: skull_init.o skull_clean.o
	$(LD) -r $^ -o $@

install:
	install -d /lib/modules/$(VER)/misc /lib/modules/misc
	install -c skull.o /lib/modules/$(VER)/misc
	install -c skull.o /lib/modules/misc

clean:
	rm -f *.o *~ core
The tricky install rule in the file above is meant to install the module in a version-dependent directory, as explained below. The VER variable in Makefile is set to the correct version number, extracted from <linux/version.h>.
Next, after the module is built, it must be loaded into the kernel. As I’ve already suggested, insmod does the job for you. The program is like ld, as it links any unresolved symbol in the module to the symbol table of the running kernel. Unlike the linker, however, it doesn’t modify the disk file, but rather the in-memory image. insmod accepts a number of command-line options (for details, see the man page), and it can change the value of integer and string variables in your module before linking the module to the current kernel. Thus, if a module is correctly designed, it can be configured at load time; load-time configuration gives the user more flexibility than compile-time configuration, which unfortunately is still used sometimes. Load-time configuration is explained in Section 2.6, later in this chapter.
Interested readers may want to look at how the kernel supports insmod: it relies on a few system calls defined in kernel/module.c. sys_create_module allocates kernel memory to hold a module (this memory is allocated with vmalloc; see Section 7.3 in Chapter 7), the system call get_kernel_syms returns the kernel symbol table in order to link the module, and sys_init_module copies the relocated object code to kernel space and calls the module’s initialization function.
If you actually look in the kernel source, you’ll find that the names of the system calls are prefixed with sys_. This is true for all system calls and no other functions; it’s useful to keep this in mind when grepping for the system calls in the sources.
Bear in mind that your module’s code has to be recompiled for each version of the kernel that it will be linked to. Each module defines a symbol called kernel_version, which insmod matches against the version number of the current kernel. Recent kernels define the symbol for you in <linux/module.h> (that’s why hello.c above didn’t declare it). That also means that if your module is made up of multiple source files, you only have to include <linux/module.h> from one of your sources. When compiling against Linux 1.2, on the other hand, kernel_version must be defined in your sources.
In case of version mismatch, you can still try to load a module against a different kernel version, by specifying the -f (“force”) switch to insmod, but this operation isn’t safe and can fail. It’s also difficult to tell in advance what will happen. Loading can fail because of mismatching symbols, in which case you’ll get an error message, or it can fail because of an internal change in the kernel. If that happens, you’ll get serious errors at run time and possibly a system panic, which is a good reason to be wary of version mismatches. Actually, version mismatches can be handled more gracefully by using “versioning” in the kernel (a topic that is more advanced and is introduced later in Section 11.2 in Chapter 11).
If you want to compile your module for a particular kernel version, you have to include the specific header files for that kernel (for example, by declaring a different INCLUDEDIR) in the Makefile above.
In order to deal with version dependency at load time, insmod follows a particular search path; if it doesn’t find the module in the current directory, it looks for it in a version-dependent directory, and then in /lib/modules/misc if that fails. The install rule in the Makefile above follows this convention.
The tricky task is writing code that can be compiled and run on any kernel version from 1.2.13 to 2.0.x and on. The interface to modularization has changed to make setup easier. You can see in hello.c above that there’s no need to declare anything, as long as you deal only with recent kernels. A portable interface, on the other hand, looks like the following:
#define __NO_VERSION__ /* don't define kernel_version in module.h */
#include <linux/module.h>
#include <linux/version.h>

char kernel_version [] = UTS_RELEASE;
In 2.0 and newer kernels, the file version.h is included by module.h, which also defines kernel_version unless __NO_VERSION__ is defined.
The __NO_VERSION__ symbol can also be used if you need to include <linux/module.h> in several source files that will be linked together to form a single module (if you need preprocessor macros declared in module.h, for example). Declaring __NO_VERSION__ before including module.h prevents automatic declaration of the string kernel_version in source files where you don’t want it (ld -r would complain about the multiple definition of the symbol). Sample modules in this book use __NO_VERSION__ to this aim.
Other dependencies based on the kernel version can be solved with preprocessor conditionals: version.h defines the integer macro LINUX_VERSION_CODE. The macro expands to the binary representation of the kernel version, one byte for each part of the version release number. For example, the code for 1.3.5 is 66309 (i.e., 0x10305).[7] With this information, you can easily determine what version of the kernel you are dealing with.
Writing the number in decimal isn’t too practical when you have to check against a particular version. In order to support multiple kernel versions from the same source file, I’ll use the following macro to build a version code from the three component parts of the version number:
#define VERSION_CODE(vers,rel,seq) (((vers)<<16) | ((rel)<<8) | (seq))
[4] I use the word “local” here to denote personal changes to the system, in the good old Unix tradition of /usr/local.
[5] This is true for version 5 and previous versions of the library. With version 6 (glibc) this may change, but discussion is not over as I write this.
[6] Note, however, that using any optimization more than -O2 is risky, as the compiler might inline functions that are not declared as inline in the source. This may be a problem with kernel code, as some functions expect to find a standard stack layout when they are called.
[7] This allows up to 256 development versions between stable versions.