Handling Kernel-Space Faults

Version 2.1 of the Linux kernel introduced a major improvement in the way segmentation faults from kernel space are handled. In this section, I’m going to give a quick overview of the principle. The way source code is affected by the new mechanism has already been described in "Section 17.3."

As suggested earlier, recent versions of the kernel fully exploit the ELF binary format, in particular its ability to carry user-defined sections in the compiled files. The compiler and linker guarantee that all code fragments belonging to the same section will be contiguous in the executable file and therefore in memory when the file is loaded.
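The section mechanism can be tried out from user space. What follows is only a sketch, assuming GCC (or Clang) and the GNU linker on an ELF system. The section name demo_pairs, the struct, the macro, and the helper functions are all hypothetical names invented for this illustration. The __start_ and __stop_ symbols are a GNU linker feature that applies to sections whose names are valid C identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair type; both fields hold arbitrary demo values. */
struct demo_pair {
    unsigned long key;
    unsigned long value;
};

/* Place one entry in the custom ELF section "demo_pairs".
 * "used" keeps the compiler from discarding the unreferenced object. */
#define DEMO_PAIR(name, k, v)                                    \
    static const struct demo_pair name                           \
    __attribute__((section("demo_pairs"), used)) = { (k), (v) }

/* Entries can be defined anywhere in the source file... */
DEMO_PAIR(pair_a, 1, 100);
DEMO_PAIR(pair_b, 2, 200);
DEMO_PAIR(pair_c, 3, 300);

/* ...and the linker gathers them into one contiguous block.
 * GNU ld provides these two symbols marking its bounds. */
extern const struct demo_pair __start_demo_pairs[];
extern const struct demo_pair __stop_demo_pairs[];

/* Number of entries the linker collected in the section. */
size_t demo_pair_count(void)
{
    return ((uintptr_t)__stop_demo_pairs - (uintptr_t)__start_demo_pairs)
           / sizeof(struct demo_pair);
}

/* Walk the section as if it were an ordinary array; 0 means "not found". */
unsigned long demo_pair_lookup(unsigned long key)
{
    const struct demo_pair *p;

    for (p = __start_demo_pairs; p < __stop_demo_pairs; p++)
        if (p->key == key)
            return p->value;
    return 0;
}
```

Because the entries end up adjacent in memory, the whole section can be scanned as a single table even though no array was ever declared in the source.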

Exception handling is implemented by defining two new sections in the kernel executable image (vmlinux). Each time source code accesses user space via copy_to_user, put_user, or their reading counterparts, some code is added to both of these sections. Although this might look like a non-negligible amount of overhead, one of the outcomes of the new implementation is that there’s no longer any need to use the expensive verify_area mechanism. Moreover, if the user address being used is valid, the execution flow will see no jumps at all.

When the user address being accessed is invalid, the hardware issues a page fault. The fault handler (do_page_fault, in the architecture-specific source tree) identifies the fault as an "incorrect address" fault (as opposed to "page not present") and takes proper action using the following ELF sections:

__ex_table

This section is a table of pointer pairs. The first pointer of each pair refers to an instruction that can fail due to a wrong user-space address, and the second value points to an address where the processor will find a few instructions that deal with this error.

.fixup

This section contains the instructions that deal with each possible error described by the __ex_table section. The second pointer of each pair in the table refers to code that lives in .fixup.
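The role of the table can be modeled in a few lines of user-space C. This is a minimal sketch and not the kernel's own code: the struct layout, the field names insn and fixup, the function search_fixup, and the assumption that the table is sorted by instruction address are all choices made for this illustration. The middle entry reuses the two addresses from the sample console message shown later in this section; the other entries are made up.

```c
#include <stddef.h>

/* One pointer pair, as described for __ex_table (illustrative layout). */
struct ex_entry {
    unsigned long insn;   /* address of an instruction that may fault */
    unsigned long fixup;  /* address of the matching code in .fixup   */
};

/* Sample table, sorted by insn so that it can be binary-searched.
 * Only the middle pair comes from the text; the rest are invented. */
static const struct ex_entry sample_table[] = {
    { 0xc2807010UL, 0xc2807100UL },
    { 0xc28070b7UL, 0xc2807115UL },
    { 0xc28070f0UL, 0xc2807130UL },
};

/* Return the fixup address for a faulting instruction,
 * or 0 if the address is not covered by any entry. */
unsigned long search_fixup(const struct ex_entry *tbl, size_t n,
                           unsigned long addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tbl[mid].insn == addr)
            return tbl[mid].fixup;
        if (tbl[mid].insn < addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

In this model, a return value of 0 stands for a fault at an instruction that was never expected to touch user space, which the fault handler cannot recover from through the table.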

The header file <asm/uaccess.h> takes care of building the needed ELF sections. Each function that accesses user space (such as put_user) expands to assembly instructions that add pointers to __ex_table and handle the error in .fixup.

When the code runs, the actual execution path consists of the following steps: the processor register used for the "return value" of the function is initialized to 0 (i.e., no error), data is transferred, and the return value is passed back to the caller. Normal operation is very fast indeed. If an exception occurs, do_page_fault prints a message, looks in __ex_table, and jumps to .fixup, where the fix consists of setting the return value to -EFAULT and jumping back just after the instruction that accessed user space.
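The control flow just described can be mimicked with a toy model in plain C. This is only a sketch: model_put_user, user_mem, and USER_WORDS are hypothetical names, and the explicit bounds test merely stands in for the hardware page fault. The real mechanism performs no such test, which is why the normal path is so cheap.

```c
#include <errno.h>
#include <stddef.h>

#define USER_WORDS 16

/* Pretend user space: only these words are "mapped". */
int user_mem[USER_WORDS];

int model_put_user(int value, size_t index)
{
    int ret = 0;                /* return register starts at 0: no error */

    if (index >= USER_WORDS)    /* stand-in for the hardware page fault  */
        goto fixup;
    user_mem[index] = value;    /* the one access that touches user space */

resume:                         /* "just after" the faulting instruction */
    return ret;

fixup:                          /* plays the role of the .fixup code     */
    ret = -EFAULT;
    goto resume;
}
```

A caller in this model would test the result exactly as a driver does with put_user: a return value of 0 means the data was transferred, and -EFAULT means the address was bad.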

The new behavior can be checked by using the faulty module as it appears in the v2.1/misc-modules directory. faulty was described in "Section 4.4" in Chapter 4. faulty’s device node transfers data to user space by reading beyond the bounds of a short buffer, thus causing a page fault when reading from an address above the module’s page. It’s interesting to note that this fault depends on using an incorrect address in kernel space, while in most cases the exception is caused by a faulty user-space address.

When using the cat command to read faulty on a PC, the following messages are printed on the console:

read: inode c1188348, file c16decf0, buf 0804cbd0, count 4096
cat: Exception at [<c28070b7>] (c2807115)

The first line is printed by faulty’s read method, and the second is printed by the fault handler. In the second line, the first number is the address of the faulty instruction, while the number in parentheses is the address of the fixup code (in the .fixup section).
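When collecting such messages from a log, the two addresses can be pulled out with a few lines of user-space C. This is a small sketch that assumes the message always has the shape shown above; parse_exception_line is a hypothetical helper, not part of any kernel or library interface.

```c
#include <stdio.h>
#include <string.h>

/* Extract the faulting-instruction and fixup addresses from a line such as
 *   "cat: Exception at [<c28070b7>] (c2807115)".
 * Returns 0 on success, -1 if the line does not have that shape. */
int parse_exception_line(const char *line,
                         unsigned long *insn, unsigned long *fixup)
{
    const char *p = strstr(line, "[<");   /* start of the first address */

    if (p == NULL)
        return -1;
    if (sscanf(p, "[<%lx>] (%lx)", insn, fixup) != 2)
        return -1;
    return 0;
}
```

The first value can then be compared against the kernel's symbol map to find the function that faulted.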
