Chapter 7. Environment Variable and Argument Fuzzing

 

“This foreign policy stuff is a little frustrating.”

 
 --George W. Bush, as quoted by the New York Daily News, April 23, 2002

Local fuzzing is arguably the simplest type of fuzzing. Although many attackers and researchers will have more impressive results exploiting remote and client-side vulnerabilities, local privilege escalation is still an important topic. Even when a remote attack is leveraged to gain access to a targeted machine, local attacks are often used as a secondary attack vector to obtain required privileges.

Introduction to Local Fuzzing

A user can introduce variables into a program in two main ways. Other than the obvious standard input device, which is usually the keyboard, command-line arguments and process-environment variables represent input vectors. We first present command-line arguments as a vector for fuzzing.

Command-Line Arguments

Except for the most sheltered Windows user, everyone has at one time or another used a program that requires command-line arguments. Command-line arguments are passed into a program and addressed via the pointer argv, which is declared as a parameter of the main function in C programs. The variable argc is also passed into main. It holds the count of arguments that were passed to the program, plus one, because the name of the program as it was invoked is counted as an argument. Let’s go through a simple example.

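#include <stdio.h>

int main(int argc, char *argv[])
{
    int ix;
    /* argv[0] is the program name; argv[1] to argv[argc-1] are the arguments. */
    for (ix = 0; ix < argc; ix++)
        printf("argv[%d] == %s\n", ix, argv[ix]);
    return 0;
}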

When we run this a few times with varying arguments, we get the results shown in Figure 7.1.

Figure 7.1. A demonstration of how command-line arguments are stored

Environment Variables

Another way a user can introduce variables into a process is to use environment variables. Every process contains what is called an environment, which is made up of environment variables. Environment variables are global values that define the behavior of applications. They can be set or unset by a user, but are typically set to standard values either during a software package installation or by an administrator. Most command interpreters will cause all new processes to inherit the current environment. The command.com shell is an example of a command interpreter in Windows. UNIX systems typically have multiple command interpreters such as sh, csh, ksh, and bash.

Some examples of commonly used environment variables include HOME, PATH, PS1, and USER. These values hold the home directory of the user, the current executable search path, the command prompt, and the current username, respectively. These particular variables are fairly standard; however, many other common variables, including those created by software vendors, are used only in the operation of their applications. When an application needs the value of a certain variable, it simply calls the getenv function, which takes the variable name as its argument. Although Windows processes have an environment just as UNIX processes do, we focus primarily on the UNIX side of things because Windows does not have a concept of setuid applications, which can be started by an unprivileged user and gain privileges during execution. Figure 7.2 demonstrates what a typical UNIX environment might look like. You can view the current shell environment in the bash shell by typing the command set.

Figure 7.2. An example of some environment variables used by the bash shell

Each variable in this list can be manipulated by the user using the export command. Armed with an understanding of how command-line arguments and environment variables are used, we can now move on to the basic principles of fuzzing them.

Local Fuzzing Principles

The idea behind environment variable fuzzing and command-line fuzzing is simple: If an environment variable or command-line option contains an unexpected value, how will the application respond when that value is received? Of course, we are only interested in misbehaving privileged applications. This is because local fuzzing requires local access to the machine. Therefore, simply causing an application to crash is of limited value—you would be performing a denial-of-service attack on yourself. There could be some risk if an overflow were discovered in an environment variable that caused the system or a shared application to crash if the system were shared among multiple users. However, what we’re most interested in is finding a buffer overflow in a privileged application that will allow a restricted user to elevate his or her privileges. Finding privileged targets is discussed later in this chapter in the section titled “Finding Targets.”
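To make the target of this kind of fuzzing concrete, the following is a minimal sketch of the bug class in question. The unsafe pattern appears only in a comment; the function copy_env and its parameters are hypothetical names chosen for illustration, showing what a length-checked lookup might look like by contrast.

```c
#include <stdlib.h>
#include <string.h>

/*
 * The unsafe pattern that long-string fuzzing hopes to reach looks like:
 *
 *     char home[256];
 *     strcpy(home, getenv("HOME"));    (no length check, no NULL check)
 *
 * copy_env (a hypothetical helper) is a bounded alternative.
 * It returns 0 on success, or -1 if the variable is unset or does not fit.
 */
int copy_env(char *dst, size_t dstsize, const char *name)
{
    const char *value = getenv(name);
    size_t len;

    if (value == NULL)
        return -1;              /* variable is not set */
    len = strlen(value);
    if (len >= dstsize)
        return -1;              /* value plus terminator would overflow dst */
    memcpy(dst, value, len + 1);
    return 0;
}
```

A privileged program written like the commented-out fragment is what a 10,000-byte HOME value is meant to expose; a program written like copy_env simply rejects the oversized value.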

Many applications are designed to accept command-line arguments from the user when they are invoked. The application then uses this data to determine what actions it should take. A perfect example of this is the ‘su’ application found on nearly all UNIX systems. When users invoke the application with no arguments, it is assumed they would like to authenticate to the root user; however, if the users specify a different username as the first argument, it is understood they would like to switch to that user instead of the root user.

Consider the following C language code, which is a simplification of how the su command might behave differently with different arguments:

int main(int argc,char *argv[])
{
   [...]
   if (argc >1)
    become(argv[1]);
   else
    become("root");
   [...]
}

Command-line arguments and environment variables are essentially just two different vectors for introducing variables into a program. The basic idea behind fuzzing these is simple. What happens when we pass bogus data to an application on the command line? Does this behavior lead to a security risk?

Finding Targets

There are usually only a handful of desirable binary targets on a system when performing local fuzzing. These programs have higher privileges when executed. On UNIX-based systems, these programs are easily recognizable, as they will have the setuid or setgid bits set.

The setuid and setgid bits indicate that when a program runs, it can acquire elevated privileges. In the case of the setuid bit, the process will have the privileges of the owner of the file, and not the person running it. In the case of the setgid bit, the process will have the privileges of the group owner of the file. For example, successful exploitation of a program that is setuid root and setgid staff might yield a shell with those permissions.

It is trivial to construct a list of setuid and setgid binaries using the find command, which is a standard tool on UNIX and UNIX-like operating systems. The following command is sufficient to dump a list of all of the setuid and setgid binaries on the system. It should be run as root to avoid permission errors while reading the file system:

find / -type f \( -perm -4000 -o -perm -2000 \)

The find command is a very powerful tool that can be used to find very specific types of files, devices, and directories on a file system. In this example, we use just a few of the options the find command supports. The first argument specifies that we will be searching the entire system and everything below /, the root directory. The -type option tells find that we are only interested in regular files. This means no symbolic links, directories, or devices will be returned. The -perm options describe the permissions we are interested in; the leading dash means that at least the listed bits must be set. The -o option allows find to use or logic, and the escaped parentheses group the two -perm tests so that -type f applies to both of them. Without the parentheses, -type f would apply only to the setuid test, because the implicit and binds more tightly than -o. If a file has the setgid bit or the setuid bit set, the expression will evaluate to true and find will print the path of that file. In summary, this command will find all regular files that have either the setuid bit (4) or the setgid bit (2) set. Here is a sample of the output from this command on a default Fedora Core 4 installation:

[root@localhost /]# find / -type f \( -perm -4000 -o -perm -2000 \)
/bin/traceroute6
/bin/traceroute
/bin/mount
/bin/su
/bin/ping6
/bin/ping
/bin/umount
/usr/bin/lppasswd
/usr/bin/gtali
/usr/bin/wall
/usr/bin/chsh
/usr/bin/passwd
/usr/bin/glines
/usr/bin/gnibbles            ← everyone knows gnibbles is absolutely required for a functional system...
/usr/bin/at
/usr/bin/gnotravex
/usr/bin/gnobots2
/usr/bin/sudo
/usr/bin/same-gnome
/usr/bin/gataxx
/usr/bin/rcp
/usr/bin/mahjongg
/usr/bin/iagno
/usr/bin/rlogin
/usr/bin/gnotski
/usr/bin/chage
/usr/bin/lockfile
/usr/bin/write
/usr/bin/gpasswd
/usr/bin/ssh-agent
/usr/bin/crontab
/usr/bin/gnomine
/usr/bin/sudoedit
/usr/bin/chfn
/usr/bin/slocate
/usr/bin/newgrp
/usr/bin/rsh
/usr/X11R6/bin/Xorg
/usr/lib/vte/gnome-pty-helper
/usr/libexec/openssh/ssh-keysign
/usr/sbin/userhelper
/usr/sbin/userisdnctl
/usr/sbin/sendmail.sendmail
/usr/sbin/usernetctl
/usr/sbin/lockdev
/usr/sbin/utempter
/sbin/pam_timestamp_check
/sbin/netreport
/sbin/unix_chkpwd
/sbin/pwdb_chkpwd

UNIX File Permissions Explained

In UNIX, the file permission model allows for three different types of basic access: read, write, and execute. There are also three sets of permissions for each file. They pertain to the user, the group, and those that don’t fit into either (other). In any given situation, only one of these sets is actually applied. For example, if you own a file, the set of permissions that will be used is that of the user. If you do not own the file, but are in the group that the file is owned by, the group permissions will be applied. For all other cases, the other permissions will be applied. An example follows:

-r-x--x--- 2 dude staff  2048 Jan  2 2002  File

In this example, the user dude owns the file. The permissions this user is allowed include read and execute access. Of course, this user owns the file, so he or she may modify these permissions in any way.

If another member of the staff group attempts to access this file, he or she will only be able to execute this file, but not read it. Attempts to read the file will fail with a permission error. Finally, all other users will be denied access as they do not have read, write, or execute access.

In UNIX, a special way of describing absolute file permissions exists. Under this system, permissions are represented in octal form. That is, each combination of permissions has a value from 0 to 7. The read flag has the octal value 4, the write flag has the value 2, and the execute flag has the value 1. These numbers are then added up to get the overall permission. For example, a file that allows read and write access to the user, the group, and other users would be referenced as 666. The example file owned by dude would be represented as 510: user (5) = read (4) + execute (1), group (1) = execute (1), and other (0) grants nothing.

A fourth digit, written in front of the other three, represents special flags such as the setuid and setgid bits. The setuid bit is represented by a 4 and the setgid bit by a 2. Therefore, a file that is setuid and setgid might have 6755 permissions. If the special flag digit is left off, it is assumed to be zero, and the file has no extended permissions.
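The octal arithmetic can be checked with a short routine. This is a minimal sketch; perm_string is a hypothetical name, and the use of s and S for the setuid and setgid bits follows the convention used by ls.

```c
/* Convert an octal mode such as 0510 or 06755 into the nine-character
 * form printed by ls. out must have room for 10 bytes. */
void perm_string(unsigned int mode, char out[10])
{
    static const char flags[] = "rwxrwxrwx";
    int i;

    /* Walk the nine permission bits from user-read (0400) to other-execute (0001). */
    for (i = 0; i < 9; i++)
        out[i] = (mode & (0400u >> i)) ? flags[i] : '-';

    /* Special flags replace the execute position: 's' if execute is also set, 'S' if not. */
    if (mode & 04000u)
        out[2] = (mode & 0100u) ? 's' : 'S';
    if (mode & 02000u)
        out[5] = (mode & 0010u) ? 's' : 'S';

    out[9] = '\0';
}
```

Calling perm_string with 0510 yields r-x--x---, which matches the listing for the file owned by dude, and 06755 yields rwsr-sr-x.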

Local Fuzzing Methods

Environment variables and command-line arguments can be provided easily by a user, and because they are almost always simple ASCII strings, it is practical to perform some basic manual testing against a target. The most trivial test might be setting your HOME variable to a long string, and then running the target to see what happens. This can be accomplished very quickly utilizing Perl, which is available by default on most UNIX systems:

HOME=`perl -e 'print "X"x10000'` /usr/bin/target

This is a very rudimentary way to test the application to see if it can handle a long HOME variable. However, this example assumes that you already knew of an application that utilizes the HOME variable. What if you don’t know what variables are used by the application? How can you determine which ones are used?
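The long-string test can also be driven from C, which becomes useful once the test is folded into a larger tool. The following is a minimal sketch; make_long_string and set_long_env are hypothetical helper names, and the fill character X simply mirrors the Perl one-liner.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate a NUL-terminated string of length copies of fill.
 * The caller frees the result. Returns NULL on allocation failure. */
char *make_long_string(size_t length, char fill)
{
    char *s = malloc(length + 1);

    if (s == NULL)
        return NULL;
    memset(s, fill, length);
    s[length] = '\0';
    return s;
}

/* Set the named environment variable to a string of length 'X' characters.
 * Returns 0 on success and -1 on failure. */
int set_long_env(const char *name, size_t length)
{
    char *value = make_long_string(length, 'X');
    int rc;

    if (value == NULL)
        return -1;
    rc = setenv(name, value, 1);   /* setenv copies the value */
    free(value);
    return rc;
}
```

After a call such as set_long_env("HOME", 10000), launching the target with one of the exec functions that inherits the current environment passes the oversized value along to it.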

Enumerating Environment Variables

At least two automated methods can determine what environment variables a program uses. If the system supports library preloading, you can hook the getenv library call. Providing a new getenv function that performs the standard getenv functionality while also logging the call to a file will effectively record all variables requested by the application. An extension of this method is described in more detail later in this chapter in the section titled “Automating Environment Variable Fuzzing.”
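A logging hook of this kind might look like the following minimal sketch. It makes two assumptions that are not taken from any particular tool: names are logged to standard error rather than to a file, and the lookup is done by scanning environ directly instead of calling through to the original getenv.

```c
#include <string.h>
#include <unistd.h>

extern char **environ;

/* Write a buffer to standard error, ignoring errors. write is used
 * instead of stdio so that logging cannot itself call getenv. */
static void log_bytes(const char *buf, size_t len)
{
    ssize_t rc = write(STDERR_FILENO, buf, len);
    (void)rc;
}

/* Replacement getenv: log the requested name, then perform a normal lookup. */
char *getenv(const char *name)
{
    size_t len = strlen(name);
    char **ep;

    log_bytes("getenv: ", 8);
    log_bytes(name, len);
    log_bytes("\n", 1);

    /* Each environ entry has the form NAME=value. */
    for (ep = environ; ep != NULL && *ep != NULL; ep++)
        if (strncmp(*ep, name, len) == 0 && (*ep)[len] == '=')
            return *ep + len + 1;
    return NULL;    /* variable is not set */
}
```

Because every request is logged before the lookup, variables that the program asks for but that are not currently set still appear in the log, which is exactly the list a fuzzer needs.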

The GNU Debugger (GDB) Method

Another method we can use requires a debugger. Using GDB, one can set a breakpoint inside the getenv function and dump the first argument. An example using GDB scripting to automate this on Solaris 10 follows:

(08:55AM)[user@unknown:~]$gdb -q /usr/bin/id
(no debugging symbols found)...(gdb)
(gdb) break getenv
Function "getenv" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (getenv) pending.
(gdb) commands
Type commands for when breakpoint 1 is hit, one per line.
End with a line saying just "end".
>silent
>x/s $i0
>cont
>end
(gdb) r
Starting program: /usr/bin/id
[...]
Breakpoint 2 at 0xff2c4610
Pending breakpoint "getenv" resolved
(no debugging symbols found)...
0xff0a9064:      "LIBCTF_DECOMPRESSOR"
0xff0a9078:      "LIBCTF_DEBUG"
0xff24b940:      "LIBPROC_DEBUG"
0xff351940:      "LC_ALL"
0xff351948:      "LANG"
0xff3518d8:      "LC_CTYPE"
0xff3518e4:      "LC_NUMERIC"
0xff3518f0:      "LC_TIME"
0xff3518f8:      "LC_COLLATE"
0xff351904:      "LC_MONETARY"
0xff351910:      "LC_MESSAGES"
uid=100(user) gid=1(other)

Program exited normally.
(gdb)

If you are not familiar with the commands used in this GDB session, they can be summarized in a few words:

  • The break command sets a breakpoint on a specified function or address. Here, we use it to cause the execution of the program to stop inside the call to getenv.

  • The commands command specifies certain actions that will occur when a breakpoint is hit. In this case, we tell it to be silent and print out the value of the i0 register as a string using x/s. On SPARC, i0 is the register that holds the first argument to the function being entered.

  • The cont command simply continues execution so we do not have to tell it to continue after each break. We then use r, a shortcut for the run command, to start execution of the program.

Using this method, we can immediately see the list of 11 environment variables that are requested by the /usr/bin/id program. Note that this method should work on all systems; however, you will need to change how you locate the first argument, because calling conventions and register names differ between architectures. For example, you would print $i0 on SPARC, $a0 on MIPS, and $rdi on x86-64. On 32-bit x86 the argument is passed on the stack, so it must be read relative to the stack pointer rather than from a register.

Now that we have covered what variables your target will use, we will explore ways of testing them in a more automated fashion.

Automating Environment Variable Fuzzing

Recall that we briefly mentioned using library preloading in the previous section, “Enumerating Environment Variables”; this is also useful for automated fuzzing. To retrieve the value of an environment variable, an application normally invokes the getenv function. If we hook the getenv function and return long strings for all calls to it, we do not even need to know a list of variables that are used; we can simply fuzz every one by intercepting all calls to getenv. This is very useful when performing a quick check for unsafe environment variable use.

The following function is a trivial implementation of the getenv function. It uses the global variable environ, which points to the start of the environment. The code simply steps through the environ array and checks to see if the value being requested is within the environ array. If it is, it returns a pointer to the value it holds; if not, it returns NULL to indicate the variable is not set:

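#include <stdio.h>
#include <string.h>

extern char **environ;

/* Trivial getenv: scan environ for an entry of the form variable=value. */
char *getenv(const char *variable)
{
  size_t len = strlen(variable);
  int ix=0;
  while (environ[ix])
    {
      if ( ! ( strncmp(variable,environ[ix],len)) &&
      (environ[ix][len] == '=') )
      {
        /* Print the value that was found, then return a pointer to it. */
        printf("%s\n",environ[ix]+len+1);
        return environ[ix]+len+1;
      }
      ix++;
    }
  return NULL;  /* the variable is not set */
}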

Library Preloading

The next topic we cover is library preloading. Library preloading provides an easy way to hook functions by using the operating system linker to essentially replace functions with user-supplied functions. Although the specifics vary from system to system, the general concept is the same. The user typically sets a certain environment variable to the path of a user-compiled library. The library is then loaded when the program executes. If the library contains symbols that are duplicates of the symbols in the program, they are used instead of the original symbols. When we say symbols, we are primarily referring to functions. For example, if a user builds a library with a function called strcpy , and preloads it when running a binary, the binary will call the user’s version of strcpy instead of the system copy of strcpy . This has many legitimate purposes, such as wrapping calls to do profiling and auditing. It also has some uses for finding vulnerabilities. Consider the uses of wrapping or completely replacing the getenv routine; this routine is used to request environment variable values.

The following function is a simple replacement for getenv that can be used to find simple long string issues. You can force this function to override the real getenv by using library preloading:

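#include <string.h>

#define BUFFSIZE 20000

/* Fuzzing replacement for getenv: every lookup yields a long string. */
char *getenv(const char *variable)
{
   /* static, so the buffer is still valid after the function returns */
   static char buff[BUFFSIZE];
   (void)variable;            /* the requested name is ignored */
   memset(buff,'A',BUFFSIZE);
   buff[BUFFSIZE-1] = 0x0;
   return buff;
}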

It is easy to see that this function returns a long string for every variable request that is made. It does not use the environ array at all as we are not concerned with returning correct values.

This method is used by Dave Aitel’s GPL sharefuzz utility, which has been used to find numerous vulnerabilities in setuid applications. To initiate this trivial fuzz test, simply compile the C code into a shared library and use your operating system’s library preloading functionality (assuming it has such a functionality). For Linux, this can be done in two steps as follows:

gcc -shared -fPIC -o my_getenv.so my_getenv.c
LD_PRELOAD=./my_getenv.so /usr/bin/target

When /usr/bin/target is executed, all calls to getenv will use our modified getenv function.

Detecting Problems

Now that you are familiar with the basic methods of local fuzzing, you will need to know how to recognize when an interesting misbehavior has occurred in your target; many times this is very obvious. For example, the program might crash and print “Segmentation Fault” or some other fatal signal message.

However, as our ultimate goal is automation, we cannot rely on the user recognizing a crash manually. For our purposes, we require a way to do this reliably and programmatically. There are at least two good ways this can be done. The simplest way is by checking the return code of the application. On modern UNIX and Linux systems, if an application terminates due to an unhandled signal, the shell return code will be equal to 128 plus the signal number. For example, a segmentation fault will cause the shell to receive a return code of 139 decimal as SIGSEGV has a value of 11. If the program terminates on an illegal instruction, the shell will receive a return code of 132 as the value of SIGILL is 4. The logic here is simple: If the shell return code of the application is 132 or 139, flag it as a possibly interesting crash.

You also might want to consider the abort signal. SIGABRT is interesting as well due to the introduction of stricter heap checking in newer versions of glibc. Abort is a signal that can be raised in a process to terminate it and dump core. Although the process might abort on heap corruption in a particular case, there are clever ways of getting around this.
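The return code logic can be sketched in a few lines. The function names below are hypothetical, and treating SIGABRT as interesting alongside SIGSEGV and SIGILL is a choice based on the discussion above rather than a fixed rule.

```c
#include <signal.h>

/* Convert a shell return code into a signal number, using the
 * 128-plus-signal convention. Returns 0 if the code does not
 * indicate termination by a signal. */
int shell_status_to_signal(int status)
{
    return (status > 128) ? status - 128 : 0;
}

/* Return nonzero if the shell return code suggests a crash worth
 * investigating: SIGSEGV, SIGILL, or SIGABRT. */
int is_interesting_crash(int status)
{
    int sig = shell_status_to_signal(status);

    return sig == SIGSEGV || sig == SIGILL || sig == SIGABRT;
}
```

Using the symbolic constants from signal.h rather than hard-coded values such as 139 keeps the check correct on systems that number their signals differently.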

Using the shell return code makes sense if you are doing your fuzzing with a hacked-together shell script. However, if you are using, for example, a proper fuzzer written in C or some other language, you would want to use the wait or waitpid functions. The general method for local fuzzing in this manner is a simple fork with an execve in the child in tandem with a wait or waitpid in the parent. When used correctly, you can easily determine if the child process crashed by checking the status that is returned via wait or waitpid. A simplified snippet from iFUZZ, a local fuzzing tool, is included in the next chapter to illustrate this method.

If you are concerned about catching signals that might be handled by the application (and thus go undetected by the previous method), there is at least one alternative aside from hooking the signal routine. You will have to use the system’s debugging API to attach to the process and intercept the signals it receives before a signal handler is invoked. For most UNIX operating systems, you will use ptrace for this. The general method here is a fork with a ptrace and execve in the child, in tandem with waitpid and ptrace inside the parent in a loop to continuously monitor the process’s execution and intercept and pass along any signals that might occur. Each time waitpid returns in the parent, it means the program has received a signal or has terminated. You will have to check the status returned by waitpid to determine which has occurred. You will also have to explicitly tell the application to continue execution and pass the signal through in most cases. This is also done using ptrace. The implementations in SPIKEfile and notSPIKEfile can be used as references for this general method. These two tools are used for file fuzzing and are explained in detail in Chapter 12, “File Format Fuzzing: Automation on UNIX.” A code snippet is provided in the next chapter that demonstrates this method.
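The following is a minimal sketch of the ptrace approach, assuming Linux; other systems spell the requests differently (for example, PT_TRACE_ME and PT_CONTINUE on the BSDs). It is not the SPIKEfile or notSPIKEfile implementation, and the function name trace_for_crash is hypothetical. The child requests tracing and executes the target, while the parent loops on waitpid and either reports a crash signal or passes other signals through.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run path under ptrace. Returns SIGSEGV or SIGILL if the target
 * receives one (even if it installed a handler), the terminating
 * signal if it dies from another signal, 0 on normal exit, -1 on error. */
int trace_for_crash(const char *path, char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: ask to be traced, then become the target. */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execve(path, argv, envp);
        _exit(127);
    }

    for (;;) {
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status))
            return 0;
        if (WIFSIGNALED(status))
            return WTERMSIG(status);
        if (WIFSTOPPED(status)) {
            int sig = WSTOPSIG(status);

            if (sig == SIGSEGV || sig == SIGILL) {
                /* Seen before any handler runs: stop the target and report. */
                kill(pid, SIGKILL);
                waitpid(pid, &status, 0);
                return sig;
            }
            /* The SIGTRAP raised by execve is swallowed; all other
             * signals are passed through to the target. */
            ptrace(PTRACE_CONT, pid, NULL, (void *)(long)(sig == SIGTRAP ? 0 : sig));
        }
    }
}
```

If tracing is not permitted in a given environment, the PTRACE_TRACEME request fails and the loop behaves like a plain waitpid check, so the function still reports crashes that terminate the target.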

In many cases, the ptrace method is overkill for local fuzzing. Very few setuid UNIX applications make much use of signal handlers for signals like SIGSEGV and SIGILL. Also, once you start using ptrace, you are introducing code that will not necessarily be compatible across different operating systems and architectures. Consider this if you are designing an application that can be used on many platforms without modification.

In the next chapter, we present an implementation of a simple command-line fuzzer that was designed to compile and run on just about any UNIX system with a C compiler. The tool also includes a simple shared library fuzzer for the getenv hooking method.

Summary

Although there is far less glory in the discovery of local vulnerabilities, there is still value in a good privilege escalation bug. We have laid the foundation by demonstrating various ways of automating the discovery of these types of vulnerabilities, and in the next chapter we implement some of these methods to actually find some bugs.
