So far, we’ve talked about the Linux kernel from the perspective of writing device drivers. Once you begin playing with the kernel, however, you may find that you want to ``understand it all.'' In fact, you may find yourself passing whole days navigating through the source code and grepping your way through the source tree to uncover the relationships among the different parts of the kernel.
This kind of ``heavy grepping'' is one of the tasks my home computer has been set up to specialize in, and it is an efficient way to retrieve information from the source code. However, acquiring a little knowledge-base before sitting down in front of your preferred shell prompt can be helpful. This chapter presents a quick overview of the Linux kernel source files, based on version 2.0.x. The file layout hasn’t changed much from version to version, although I can’t guarantee that it won’t change in the future. So the following information should be useful, even if not authoritative, for browsing other versions of the kernel.
In this chapter, every pathname is given relative to the source root (usually /usr/src/linux), while filenames with no directory component are assumed to reside in the ``current'' directory--the one being discussed. Header files (when named with angle brackets, < and >) are given relative to the include directory of the source tree. I won’t introduce the Documentation directory, as its role should be clear.
The usual way to look at a program is to start where execution begins. As far as Linux is concerned, it’s hard to tell where execution begins--it depends on how you define ``beginning.''
The architecture-independent starting point is start_kernel, in init/main.c. This function is invoked from architecture-specific code, to which it never returns. It is in charge of spinning the wheel and can thus be considered the ``mother of all functions,'' the first breath in the computer’s life. Before start_kernel, there was chaos.
By the time start_kernel is invoked, the processor has been initialized, protected mode (if any) has been activated, the processor is executing at the highest priority (what is sometimes called ``supervisor mode''), and interrupts are disabled. The start_kernel function is in charge of initializing all the kernel data structures. It does this by calling external functions to perform subtasks, since each setup function is defined in the appropriate kernel subsystem. start_kernel also calls parse_options (defined in the same init/main.c file) to decode the command line passed by the user or program that booted the system. The command line (along with memory_start and memory_end) is retrieved from the computer memory by setup_arch, which, as the name suggests, is architecture-specific code.
The code in init/main.c consists mostly of #ifdefs. This happens because initialization takes place in steps, and many of the steps can be run or skipped, depending on the compile-time configuration of the kernel. Command-line parsing also depends heavily on conditionals, as many arguments are meaningful only if a particular driver is present in the kernel being compiled.
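To make the role of those conditionals concrete, here is a minimal user-space sketch of a table-driven option parser whose entries are compiled in or out. It is not the code found in init/main.c: the CONFIG_DEMO_NET and CONFIG_DEMO_SOUND switches, the boot_options table, and the check_option and demo_net_setup functions are all hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical configuration switches. In a real build they would be
 * generated by the kernel configuration step, not written by hand. */
#define CONFIG_DEMO_NET 1
/* #define CONFIG_DEMO_SOUND 1 */

/* One entry per recognized command-line argument. */
struct boot_option {
    const char *prefix;               /* text that introduces the argument */
    void (*setup)(const char *arg);   /* receives the text after the prefix */
};

#ifdef CONFIG_DEMO_NET
static char demo_net_arg[32];

static void demo_net_setup(const char *arg)
{
    /* Keep a bounded, always-terminated copy of the argument. */
    strncpy(demo_net_arg, arg, sizeof(demo_net_arg) - 1);
    demo_net_arg[sizeof(demo_net_arg) - 1] = '\0';
}
#endif

#ifdef CONFIG_DEMO_SOUND
static char demo_sound_arg[32];

static void demo_sound_setup(const char *arg)
{
    strncpy(demo_sound_arg, arg, sizeof(demo_sound_arg) - 1);
    demo_sound_arg[sizeof(demo_sound_arg) - 1] = '\0';
}
#endif

/* The table only contains the drivers that were configured in. */
static const struct boot_option boot_options[] = {
#ifdef CONFIG_DEMO_NET
    { "demo_net=", demo_net_setup },
#endif
#ifdef CONFIG_DEMO_SOUND
    { "demo_sound=", demo_sound_setup },
#endif
    { NULL, NULL }                    /* terminator */
};

/* Returns 1 if the word was consumed by a configured driver, 0 otherwise. */
static int check_option(const char *word)
{
    assert(word != NULL);
    for (const struct boot_option *o = boot_options; o->prefix; o++) {
        size_t n = strlen(o->prefix);
        if (strncmp(word, o->prefix, n) == 0) {
            o->setup(word + n);
            return 1;
        }
    }
    return 0;
}
```

With the switches set as shown, check_option("demo_net=eth0") hands "eth0" to the network setup function, while check_option("demo_sound=5") is simply not recognized, because the sound entry was never compiled into the table.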
Initialization functions called by start_kernel come in two flavors. Some of the functions take no arguments and return void, while the others take two unsigned long arguments and return another unsigned long value. The arguments are the current values of memory_start and memory_end, the bounds of the not-yet-allocated physical memory; the return value is the new memory_start (as you already know, the kernel refers to memory addresses as unsigned longs). This technique allows subsystems to allocate a persistent (and contiguous) memory area at the beginning of physical memory, as outlined in "Section 7.4" in Chapter 7.
The big disadvantage of this allocation technique is that it
can only happen at boot time and is thus not available to modules
that need a huge memory region suitable for DMA.
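The two flavors of initialization function, and the way the second one claims memory by moving memory_start forward, can be sketched as follows. This is a user-space model that only does arithmetic on addresses. The functions demo_subsys_init, demo_other_init, and demo_start_kernel, the 4096-byte size, and the alignment policy are assumptions made for the example and are not taken from the kernel sources.

```c
#include <assert.h>

/* Address of the area claimed at "boot" by the hypothetical subsystem. */
static unsigned long demo_table_addr;
static int demo_other_ready;

/* First flavor: no arguments, no return value, no boot-time memory. */
void demo_other_init(void)
{
    demo_other_ready = 1;
}

/* Second flavor: receives the bounds of the still-unallocated memory and
 * returns the new lower bound after taking what it needs. */
unsigned long demo_subsys_init(unsigned long memory_start,
                               unsigned long memory_end)
{
    const unsigned long size = 4096;             /* assumed table size */
    const unsigned long align = sizeof(unsigned long);
    unsigned long start;

    assert(memory_start <= memory_end);

    /* Round the start address up to a multiple of the word size. */
    start = (memory_start + align - 1) & ~(align - 1);

    /* Not enough room: claim nothing and leave the bound untouched. */
    if (start > memory_end || memory_end - start < size)
        return memory_start;

    demo_table_addr = start;                     /* the area is ours for good */
    return start + size;                         /* everything above is free */
}

/* Model of the caller: each subsystem is invoked in turn, and the updated
 * memory_start is threaded through the calls of the second flavor. */
unsigned long demo_start_kernel(unsigned long memory_start,
                                unsigned long memory_end)
{
    demo_other_init();
    memory_start = demo_subsys_init(memory_start, memory_end);
    return memory_start;
}
```

Calling demo_start_kernel(0x1000, 0x100000) leaves the table at 0x1000 and returns 0x2000 as the new memory_start. Nothing in this scheme can give the area back or hand out a new one later, which is why the technique is limited to boot time.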
After initialization is complete, start_kernel prints the banner string, which includes the Linux version number and compile time, and then forks an init process by calling kernel_thread.
The start_kernel function then continues as task 0 (the so-called ``idle'' task) and calls cpu_idle, an endless loop that in turn calls idle. Things work slightly differently at this point on Symmetric Multi-Processor machines, but I won’t describe the differences. The exact behavior of the idle function is architecture-dependent, and a few greps in the sources will take you to the place where you can study it.