2.3 Improving Disassembly Using IDA

In this section, we will explore various features of IDA, and you will learn how to combine the knowledge you gained in the previous chapter with the capabilities offered by IDA to enhance the disassembly process. Consider the following trivial program, which copies the content of one local variable to another:

int main()
{
    int x = 1;
    int y;
    y = x;
    return 0;
}

After compiling the preceding code and loading it in IDA, the program disassembles to the following:

.text:00401000 ; Attributes: bp-based frame ➊
.text:00401000
.text:00401000 ; ➋ int __cdecl main(int argc, const char **argv, const char **envp)
.text:00401000 ➐ _main proc near
.text:00401000
.text:00401000 var_8= dword ptr -8 ➌
.text:00401000 var_4= dword ptr -4 ➌
.text:00401000 argc= dword ptr 8 ➌
.text:00401000 argv= dword ptr 0Ch ➌
.text:00401000 envp= dword ptr 10h ➌
.text:00401000
.text:00401000 push ebp ➏
.text:00401001 mov ebp, esp ➏
.text:00401003 sub esp, 8 ➏
.text:00401006 mov ➍ [ebp+var_4], 1
.text:0040100D mov eax, [ebp+var_4] ➍
.text:00401010 mov ➎ [ebp+var_8], eax
.text:00401013 xor eax, eax
.text:00401015 mov esp, ebp ➏
.text:00401017 pop ebp ➏
.text:00401018 retn

When an executable is loaded, IDA analyzes every function that it disassembles to determine the layout of its stack frame. It also uses various signatures and runs pattern-matching algorithms to determine whether a disassembled function matches any signature known to IDA. At ➊, notice that after its initial analysis IDA added a comment (comments start with a semicolon) telling you that an ebp-based stack frame is used; this means that the ebp register is used to reference the local variables and the function parameters (ebp-based stack frames were covered in the discussion of functions in the previous chapter). At ➋, IDA identified the function as main and inserted a comment containing the function prototype. During analysis, this prototype is useful for determining how many parameters a function accepts and what their data types are.

At ➌, IDA gives you a summary of the stack view; IDA was able to identify the local variables and function arguments. In the main function, IDA identified two local variables, which it automatically named var_4 and var_8. IDA also tells you that var_4 corresponds to the value -4 and var_8 corresponds to the value -8. These values are offsets from ebp (the frame pointer); this is IDA's way of saying that it has substituted the name var_4 for -4 and var_8 for -8 in the code. In the instructions at ➍ and ➎, you can see that IDA replaced the memory reference [ebp-4] with [ebp+var_4] and [ebp-8] with [ebp+var_8].

If IDA had not replaced the values, the instructions at ➍ and ➎ would look like the ones shown here, and you would have to label all of these addresses manually (as covered in the previous chapter):

.text:00401006 mov dword ptr [ebp-4], 1
.text:0040100D mov eax, [ebp-4]
.text:00401010 mov [ebp-8], eax
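
The substitution can be checked with a small model. The following is only a sketch, not how IDA works internally: it assumes a byte array standing in for the stack frame, and the helpers store32 and load32 and the function names_match_raw_offsets are hypothetical names introduced for illustration. It runs the three mov instructions once with the symbolic names and once with the raw offsets, then compares the two frames.

```c
#include <stdint.h>
#include <string.h>

/* Offsets exactly as IDA lists them in the function header. */
enum { var_8 = -8, var_4 = -4 };

/* Hypothetical helper: write a 32-bit value at ebp + offset. */
void store32(uint8_t *ebp, int offset, uint32_t value)
{
    memcpy(ebp + offset, &value, sizeof value);
}

/* Hypothetical helper: read a 32-bit value from ebp + offset. */
uint32_t load32(const uint8_t *ebp, int offset)
{
    uint32_t value;
    memcpy(&value, ebp + offset, sizeof value);
    return value;
}

/* Returns 1 if the named and raw forms leave identical frames behind. */
int names_match_raw_offsets(void)
{
    uint8_t named[8] = {0};                     /* room for two dword locals */
    uint8_t raw[8] = {0};
    uint8_t *ebp_named = named + sizeof named;  /* locals sit below ebp */
    uint8_t *ebp_raw = raw + sizeof raw;
    uint32_t eax;

    /* Symbolic form, as IDA displays it. */
    store32(ebp_named, var_4, 1);               /* mov [ebp+var_4], 1   */
    eax = load32(ebp_named, var_4);             /* mov eax, [ebp+var_4] */
    store32(ebp_named, var_8, eax);             /* mov [ebp+var_8], eax */

    /* Raw form, as it would look without the names. */
    store32(ebp_raw, -4, 1);                    /* mov dword ptr [ebp-4], 1 */
    eax = load32(ebp_raw, -4);                  /* mov eax, [ebp-4]         */
    store32(ebp_raw, -8, eax);                  /* mov [ebp-8], eax         */

    return memcmp(named, raw, sizeof named) == 0;
}
```

Because var_4 and var_8 are nothing more than names for -4 and -8, both sequences touch the same two slots below ebp.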

IDA automatically generated dummy names for the variables and arguments and used these names in the code; this saves you the manual work of labeling the addresses and makes local variables and arguments easy to recognize by the var_xxx and arg_xxx prefixes that IDA adds. You can now treat the [ebp+var_4] at ➍ as just [var_4], so the instruction mov [ebp+var_4],1 can be treated as mov [var_4],1, and you can read it as var_4 being assigned the value 1 (in other words, var_4 = 1). Similarly, the instruction mov [ebp+var_8],eax can be treated as mov [var_8],eax (in other words, var_8 = eax). This feature of IDA makes reading assembly code much easier.
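
Read this way, the three mov instructions map almost line for line onto C assignments. The following sketch is illustrative only: the struct main_locals and the function read_as_assignments are hypothetical names, and eax is modeled as an ordinary int variable.

```c
/* Hypothetical container so a caller can inspect both locals. */
struct main_locals {
    int var_4;
    int var_8;
};

struct main_locals read_as_assignments(void)
{
    struct main_locals locals;
    int eax;                 /* stands in for the eax register */

    locals.var_4 = 1;        /* mov [ebp+var_4], 1    ->  var_4 = 1   */
    eax = locals.var_4;      /* mov eax, [ebp+var_4]  ->  eax = var_4 */
    locals.var_8 = eax;      /* mov [ebp+var_8], eax  ->  var_8 = eax */
    return locals;
}
```

Comparing this with the original source suggests that var_4 plays the role of x and var_8 the role of y, with eax acting as a temporary for the copy.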

The preceding program can be simplified by ignoring the function prologue, the function epilogue, and the instruction used to allocate space for the local variables, all marked ➏. From the concepts covered in the previous chapter, we know that these instructions are used only to set up and tear down the function environment. After the cleanup, we are left with the following code:

.text:00401006 mov [ebp+var_4], 1
.text:0040100D mov eax, [ebp+var_4]
.text:00401010 mov [ebp+var_8], eax
.text:00401013 xor eax, eax
.text:00401018 retn
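
In the cleaned-up listing, xor eax, eax sets eax to 0, and eax holds the return value under the __cdecl convention named in the prototype comment, so this instruction corresponds to return 0 in the source. A minimal sketch of why the result is always 0, using the hypothetical function name xor_eax_eax:

```c
#include <stdint.h>

/* Models `xor eax, eax`: every bit is XORed with itself, so the
 * result is 0 no matter what eax held before the instruction. */
uint32_t xor_eax_eax(uint32_t eax)
{
    return eax ^ eax;
}
```

In this program eax still holds the value copied from var_4 when the xor executes, and the instruction clears it before retn.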