This appendix describes, in more depth than in the text, some vulnerability classes, exploitation techniques, and common issues that can lead to bugs.
Buffer overflows are memory corruption vulnerabilities that can be categorized by type (also known as generation). Today the most relevant ones are stack buffer overflows and heap buffer overflows. A buffer overflow happens if more data is copied into a buffer or array than the buffer or array can hold. It’s that simple.
As the name implies, stack buffer overflows occur in the stack area of a process’s memory. The stack is a special memory area of a process that holds both data and metadata associated with procedure invocation. If more data is written to a buffer declared on the stack than that buffer can hold, adjacent stack memory may be overwritten. If the user controls both the data and the amount of data, it is possible to manipulate the stack data or metadata to gain control of the execution flow of the process.
The following descriptions of stack buffer overflows are related to the 32-bit Intel platform (IA-32).
Every function of a process that is executed is represented on the stack. The organization of this information is called a stack frame. A stack frame includes the data and metadata of the function, as well as a return address used to find the caller of the function. When a function returns to its caller, the return address is popped from the stack into the instruction pointer (program counter) register. If you can overflow a stack buffer and overwrite the return address with a value of your choosing, you gain control over the instruction pointer when the function returns.
There are many other ways to take advantage of a stack buffer overflow, for example, by manipulating function pointers, function arguments, or other important data and metadata on the stack.
Let’s look at an example program:
Example A-1. Example program stackoverflow.c
01 #include <string.h>
02
03 void
04 overflow (char *arg)
05 {
06   char buf[12];
07
08   strcpy (buf, arg);
09 }
10
11 int
12 main (int argc, char *argv[])
13 {
14   if (argc > 1)
15     overflow (argv[1]);
16
17   return 0;
18 }
The example program in Example A-1 contains a simple stack buffer overflow. The first command-line argument (line 15) is used as a parameter for the function called overflow(). In overflow(), the user-derived data is copied into a stack buffer with a fixed size of 12 bytes (see lines 6 and 8). If we supply more data than the buffer can hold (more than 12 bytes), the stack buffer will overflow, and the adjacent stack data will be overwritten with our input data.
Figure A-1 illustrates the stack layout right before and after the buffer overflow. The stack grows downward (toward lower memory addresses), and the return address (RET) is followed by another piece of metadata called the saved frame pointer (SFP). Below that is the buffer that is declared in the overflow() function. In contrast to the stack, which grows downward, the data that is filled into a stack buffer grows toward higher memory addresses. If we supply a sufficient amount of data for the first command-line argument, then our data will overwrite the buffer, the SFP, the RET, and adjacent stack memory. If the function then returns, we control the value of RET, which gives us control over the instruction pointer (EIP register).
To test the program from Example A-1 under Linux (Ubuntu 9.04), I compiled it without stack canary support (see Section C.1):
linux$ gcc -fno-stack-protector -o stackoverflow stackoverflow.c
Then, I started the program in the debugger (see Section B.4 for more information about gdb) while supplying 20 bytes of user input as a command-line argument (12 bytes to fill the stack buffer plus 4 bytes for the SFP plus 4 bytes for the RET):
linux$ gdb -q ./stackoverflow
(gdb) run $(perl -e 'print "A"x12 . "B"x4 . "C"x4')
Starting program: /home/tk/BHD/stackoverflow $(perl -e 'print "A"x12 . "B"x4 . "C"x4')
Program received signal SIGSEGV, Segmentation fault.
0x43434343 in ?? ()
(gdb) info registers
eax            0xbfab9fac       -1079271508
ecx            0xbfab9fab       -1079271509
edx            0x15             21
ebx            0xb8088ff4       -1207398412
esp            0xbfab9fc0       0xbfab9fc0
ebp            0x42424242       0x42424242
esi            0x8048430        134513712
edi            0x8048310        134513424
eip            0x43434343       0x43434343
eflags         0x10246          [ PF ZF IF RF ]
cs             0x73             115
ss             0x7b             123
ds             0x7b             123
es             0x7b             123
fs             0x0              0
gs             0x33             51
I gained control over the instruction pointer (see the EIP register), as the return address was successfully overwritten with the four Cs supplied from the user input (hexadecimal value of the four Cs: 0x43434343).
I compiled the vulnerable program from Example A-1 without security cookie (/GS) support under Windows Vista SP2 (see Section C.1):
C:\Users\tk\BHD>cl /nologo /GS- stackoverflow.c
stackoverflow.c
Then, I started the program in the debugger (see Section B.2 for more information about WinDbg) while supplying the same input data as in the Linux example above.
As Figure A-2 shows, I got the same result as under Linux: control over the instruction pointer (see the EIP register).
This was only a short introduction to the world of buffer overflows. Numerous books and white papers are available on this topic. If you want to learn more, I recommend Jon Erickson’s Hacking: The Art of Exploitation, 2nd edition (No Starch Press, 2008), or you can type buffer overflows into Google and browse the enormous amount of material available online.