8.4 Function Parameters And Return Values

In the x86 architecture, the parameters that a function accepts are pushed onto the stack, and the return value is placed in the eax register.

To understand how this works, consider a simple C program. When the following program is executed, the main() function calls the test() function and passes two integer arguments: 2 and 3. Inside test(), the values of the arguments are copied to the local variables x and y, and the function returns 0 (the return value):

int test(int a, int b)
{
    int x, y;
    x = a;
    y = b;
    return 0;
}

int main()
{
    test(2, 3);
    return 0;
}

First, let's see how the statements inside the main() function are translated into assembly instructions:

push 3  ➊
push 2 ➋
call test ➌
add esp, 8 ; after test is executed, control returns here
xor eax, eax

The first three instructions, ➊, ➋, and ➌, represent the function call test(2,3). The arguments (2 and 3) are pushed onto the stack before the function call in reverse order (from right to left), so the second argument, 3, is pushed before the first argument, 2. After the arguments are pushed, the test() function is called at ➌; as a result, the address of the next instruction, add esp,8, is pushed onto the stack (this is the return address), and control is then transferred to the start address of the test function. Let's assume that before executing the instructions ➊, ➋, and ➌, esp (the stack pointer) was pointing to the top of the stack at the address 0xFE50. The following diagram depicts what happens before and after executing ➊, ➋, and ➌:
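The esp arithmetic for these three instructions can also be checked with a small C model. This is only a sketch: the Stack type and the push32, load32, and simulate_call_setup helpers are illustrative names that do not come from any real tool, and the return address is whatever placeholder value the caller passes in. The starting address 0xFE50 is the one assumed in the text.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 32-bit stack; every name here is illustrative. */
#define STACK_TOP 0xFE50u   /* esp before instruction 1, as assumed in the text */
#define SLOTS     16        /* number of 4-byte slots modeled below STACK_TOP */

typedef struct {
    uint32_t esp;           /* simulated stack pointer */
    uint32_t mem[SLOTS];    /* mem[i] holds the dword at STACK_TOP - 4*(i+1) */
} Stack;

/* push: decrement esp by 4, then store the value at the new top. */
static void push32(Stack *s, uint32_t value)
{
    s->esp -= 4;
    uint32_t slot = (STACK_TOP - s->esp) / 4 - 1;
    assert(slot < SLOTS);   /* stay inside the modeled region */
    s->mem[slot] = value;
}

/* Read the dword stored at a simulated stack address. */
static uint32_t load32(const Stack *s, uint32_t addr)
{
    uint32_t slot = (STACK_TOP - addr) / 4 - 1;
    assert(slot < SLOTS);
    return s->mem[slot];
}

/* Replays push 3, push 2, call test and returns esp afterwards. */
static uint32_t simulate_call_setup(Stack *s, uint32_t return_address)
{
    s->esp = STACK_TOP;
    push32(s, 3);               /* second argument is pushed first */
    push32(s, 2);               /* then the first argument */
    push32(s, return_address);  /* call pushes the return address */
    return s->esp;
}
```

With a placeholder return address, simulate_call_setup leaves esp at 0xFE44, with 3 stored at 0xFE4C, 2 at 0xFE48, and the return address at 0xFE44.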

Now, let's focus on the test function, as shown here:

int test(int a, int b)
{
    int x, y;
    x = a;
    y = b;
    return 0;
}

The following is the assembly translation of the test function:

push ebp  ➍
mov ebp, esp ➎
sub esp, 8 ➑
mov eax, [ebp+8]
mov [ebp-4], eax
mov ecx, [ebp+0Ch]
mov [ebp-8], ecx
xor eax, eax ➒
mov esp, ebp ➏
pop ebp ➐
ret ➓

The first instruction ➍ saves the ebp (also called the frame pointer) on the stack; this is done so that it can be restored when the function returns. As a result of pushing the value of ebp onto the stack, the esp register will be decremented by 4. In the next instruction, at ➎, the value of esp is copied into ebp; as a result, both esp and ebp point at the top of the stack, shown as follows. The ebp from now on will be kept at a fixed position, and the application will use ebp to reference function arguments and the local variables:

You will normally find push ebp and mov ebp, esp at the start of most functions; these two instructions are called the function prologue. They are responsible for setting up the environment for the function. At ➏ and ➐, the two instructions (mov esp,ebp and pop ebp) perform the reverse operation of the function prologue. These instructions are called the function epilogue, and they restore the environment after the function is executed.

At ➑, sub esp,8 further decrements the esp register. This is done to allocate space for the local variables (x and y). Now, the stack looks as follows:

Notice that ebp is still at a fixed position, and function arguments can be accessed at a positive offset from ebp (ebp + some value). The local variables can be accessed at a negative offset from ebp (ebp - some value). For example, in the preceding diagram, the first argument (2) can be accessed at the address ebp+8 (which holds the value of a), and the second argument (3) can be accessed at the address ebp+0Ch (which holds the value of b). The local variables can be accessed at the addresses ebp-4 (local variable x) and ebp-8 (local variable y).
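The offsets above can be turned into concrete addresses with a few lines of C. This is a minimal sketch, assuming esp is 0xFE44 when test is entered (0xFE50 minus the two 4-byte arguments and the 4-byte return address); FrameLayout and frame_layout are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses of interest inside test's stack frame (illustrative model). */
typedef struct {
    uint32_t ebp;      /* frame pointer after mov ebp, esp */
    uint32_t esp;      /* stack pointer after sub esp, 8 */
    uint32_t arg_a;    /* ebp+8   : first argument (a) */
    uint32_t arg_b;    /* ebp+0Ch : second argument (b) */
    uint32_t local_x;  /* ebp-4   : local variable x */
    uint32_t local_y;  /* ebp-8   : local variable y */
} FrameLayout;

/* esp_at_entry is esp when test starts, pointing at the return address. */
static FrameLayout frame_layout(uint32_t esp_at_entry)
{
    FrameLayout f;
    uint32_t esp = esp_at_entry;

    assert(esp >= 12);      /* room for saved ebp plus two locals */
    esp -= 4;               /* push ebp */
    f.ebp = esp;            /* mov ebp, esp */
    esp -= 8;               /* sub esp, 8 */
    f.esp = esp;

    f.arg_a   = f.ebp + 8;    /* skips saved ebp and return address */
    f.arg_b   = f.ebp + 0xC;
    f.local_x = f.ebp - 4;
    f.local_y = f.ebp - 8;
    return f;
}
```

For an entry esp of 0xFE44, this gives ebp = 0xFE40, a at 0xFE48, b at 0xFE4C, x at 0xFE3C, and y at 0xFE38, which is also where esp points after the locals are allocated.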

Many compilers (such as the Microsoft Visual C/C++ compiler) make use of fixed ebp-based stack frames to reference the function arguments and the local variables. The GNU compilers (such as gcc) may omit the ebp-based stack frame, typically when optimization is enabled, and use a different technique, where the esp (stack pointer) register is used to reference the function parameters and local variables.

The actual code inside the function is between ➑ and ➒, which is shown here:

mov eax, [ebp+8]
mov [ebp-4], eax
mov ecx, [ebp+0Ch]
mov [ebp-8], ecx

We can rename the argument ebp+8 as a and ebp+0Ch as b. The address ebp-4 can be renamed as the variable x, and ebp-8 as the variable y, as shown here:

mov eax, [a]
mov [x], eax
mov ecx, [b]
mov [y], ecx

Using the techniques covered previously, the preceding statements can be translated to the following pseudocode:

x = a
y = b

At ➒, xor eax,eax sets the value of eax to 0. This is the return value (return 0). On 32-bit x86, an integer return value such as this one is stored in the eax register. The function epilogue instructions at ➏ and ➐ restore the function environment. The instruction mov esp,ebp at ➏ copies the value of ebp into esp; as a result, esp will point to the address where ebp is pointing. The pop ebp at ➐ restores the old ebp from the stack; after this operation, esp will be incremented by 4. After the execution of the instructions at ➏ and ➐, the stack will look like the one shown here:

At ➓, when the ret instruction is executed, the return address on top of the stack is popped and placed in the eip register, and control is transferred to the return address (which is add esp,8 in the main function). As a result of popping the return address, esp is incremented by 4. At this point, control is returned to the main function from the test function. The instruction add esp,8 inside main cleans up the stack, and esp is returned to its original position (the address 0xFE50 from where we started), as follows. At this point, all of the values on the stack are logically removed, even though they are physically present. This is how the function works:
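The whole round trip can be summarized by tracing esp through every instruction that changes it. The sketch below is a simplified model rather than an emulator: it tracks only esp and ebp, and the record and trace_esp helpers are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append the current esp to the trace and return the new entry count. */
static size_t record(uint32_t *trace, size_t n, size_t capacity, uint32_t esp)
{
    assert(n < capacity);
    trace[n] = esp;
    return n + 1;
}

/* Records esp after each instruction that modifies it; returns the count. */
static size_t trace_esp(uint32_t esp, uint32_t *trace, size_t capacity)
{
    size_t n = 0;
    uint32_t ebp;

    esp -= 4;  n = record(trace, n, capacity, esp);  /* push 3 */
    esp -= 4;  n = record(trace, n, capacity, esp);  /* push 2 */
    esp -= 4;  n = record(trace, n, capacity, esp);  /* call test */
    esp -= 4;  n = record(trace, n, capacity, esp);  /* push ebp */
    ebp = esp;                                       /* mov ebp, esp (esp unchanged) */
    esp -= 8;  n = record(trace, n, capacity, esp);  /* sub esp, 8 */
    esp = ebp; n = record(trace, n, capacity, esp);  /* mov esp, ebp */
    esp += 4;  n = record(trace, n, capacity, esp);  /* pop ebp */
    esp += 4;  n = record(trace, n, capacity, esp);  /* ret */
    esp += 8;  n = record(trace, n, capacity, esp);  /* add esp, 8 */
    return n;
}
```

Starting from 0xFE50, the nine recorded values are 0xFE4C, 0xFE48, 0xFE44, 0xFE40, 0xFE38, 0xFE40, 0xFE44, 0xFE48, and 0xFE50, so esp ends where it began.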

In the previous example, the main function called the test function and passed the parameters to the test function by pushing them onto the stack (in right-to-left order). The main function is known as the caller (or the calling function) and test is the callee (or the called function). After the function call, the main function (the caller) cleaned up the stack using the add esp,8 instruction. This instruction removes the parameters that were pushed onto the stack and adjusts the stack pointer (esp) back to where it was before the function call; such a function is said to be using the cdecl calling convention.

The calling convention dictates how the parameters should be passed and who (the caller or the callee) is responsible for removing them from the stack once the called function has completed. Most compiled C programs follow the cdecl calling convention. In the cdecl convention, the caller pushes the parameters onto the stack in right-to-left order, and the caller itself cleans up the stack after the function call.

There are other calling conventions, such as stdcall and fastcall. In stdcall, the parameters are pushed onto the stack (in right-to-left order) by the caller, and the callee (the called function) is responsible for cleaning up the stack. Microsoft Windows uses the stdcall convention for the functions (API) exported by the DLL files. In the fastcall calling convention, the first few parameters are passed to a function by placing them in registers, any remaining parameters are placed on the stack in right-to-left order, and the callee cleans up the stack, similar to the stdcall convention. You will typically see 64-bit programs passing the first few parameters in registers, in a manner similar to the fastcall calling convention.
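The difference between caller cleanup (cdecl) and callee cleanup (stdcall) comes down to who adds the argument bytes back to esp. The following sketch models only that bookkeeping for stack-passed 4-byte arguments. The Convention, CallResult, and simulate_call names are hypothetical, and the model assumes that a stdcall callee releases its arguments as part of returning (for example, with ret 8 for two arguments), which is not shown in the listing above.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { CONV_CDECL, CONV_STDCALL } Convention;

typedef struct {
    uint32_t esp_after_ret;   /* esp right after the callee returns */
    uint32_t caller_cleanup;  /* bytes the caller still has to add to esp */
    uint32_t esp_final;       /* esp once all cleanup is done */
} CallResult;

/* Models esp across one call with argc 4-byte stack arguments.
   The callee's prologue and epilogue cancel out, so they are omitted. */
static CallResult simulate_call(Convention conv, uint32_t esp, uint32_t argc)
{
    CallResult r;
    uint32_t arg_bytes = argc * 4;

    assert(esp >= arg_bytes + 4);
    esp -= arg_bytes;             /* caller pushes arguments right to left */
    esp -= 4;                     /* call pushes the return address */
    esp += 4;                     /* ret pops the return address */

    if (conv == CONV_STDCALL) {
        esp += arg_bytes;         /* callee cleanup, e.g. ret 8 */
        r.caller_cleanup = 0;
    } else {
        r.caller_cleanup = arg_bytes;  /* caller cleanup, e.g. add esp, 8 */
    }
    r.esp_after_ret = esp;
    r.esp_final = esp + r.caller_cleanup;
    return r;
}
```

For two arguments and a starting esp of 0xFE50, the cdecl case returns with esp at 0xFE48 and leaves 8 bytes for the caller to remove, while the stdcall case returns with esp already back at 0xFE50.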
