Dereferencing pointers in C and assembly

Pointers, as a reference data type, are considered low-level because their values are memory addresses. A pointer points at a datum, so the value of the pointer is the memory address of that datum. Using the pointer to access the datum at that address is called dereferencing. Let's take a look at a sample C program that plays around with pointers and dereferencing, and then a quick peek at the assembly of the compiled program:

#include <stdio.h>

int main(int argc, char **argv)
{
    int x = 10;
    int *point = &x;
    int deref = *point;
    printf(" Variable x is currently %d. *point is %d. ", x, deref);
    *point = 20;
    int dereftwo = *point;
    printf("After assigning 20 to the address referenced by point, *point is now %d. ", dereftwo);
    printf("x is now %d. ", x);
    return 0;
}

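The compiled program generates this output:

 Variable x is currently 10. *point is 10. After assigning 20 to the address referenced by point, *point is now 20. x is now 20. 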

The following assembly examples are 64-bit (hence, for example, RBP), but the concepts are the same as in 32-bit code. We're also sticking with Intel syntax even though we're working in Linux, where the GNU tools default to AT&T syntax; this is to stay consistent with the previous chapter's introduction to assembly. Remember, source and destination operands are reversed in AT&T notation!

Take a look at what happens at key points in the compiled program. Declaring integer x causes a spot in memory (here, on the stack) to be set aside for it. int x = 10; looks like this in assembly:

mov    DWORD PTR [rbp-20], 10

Thus, the value 10 is moved into the 4-byte location at the base pointer minus 20. Easy enough. (Note that the size of the memory operand for our variable is stated right in the instruction: DWORD. A double-word is 32 bits, or 4 bytes, long.) But now, check out what happens when we get to int *point = &x; where we declare the int pointer, point, and assign it the actual memory location of x:

lea    rax, [rbp-20]
mov    QWORD PTR [rbp-8], rax

The lea instruction means load effective address. Here, the RAX register is the destination, so what's really being said is compute the address given by the base pointer minus 20 and put that address (not the value stored there) into the RAX register. Next, the value in RAX is moved to the quadword of memory (8 bytes) at the base pointer minus 8. So far, we set aside 4 bytes of memory at the base pointer minus 20 and placed the integer 10 there. Then, we took the 64-bit address of this integer's location in memory and placed that value into memory at the base pointer minus 8. In short, integer x is now at RBP - 20, and the address RBP - 20 is now stored as a pointer at RBP - 8.

When we dereference the pointer with int deref = *point;, we see this in assembly:

mov    rax, QWORD PTR [rbp-8]
mov    eax, DWORD PTR [rax]
mov    DWORD PTR [rbp-12], eax

To understand these instructions, let's quickly review the registers. Remember that EAX is a 32-bit register in the IA-32 architecture; it's an extension of the 16-bit AX. RAX is a 64-bit register in the x64 architecture, but recall that, being backward-compatible, it follows the same principle: RAX is an extension of EAX. In other words, EAX names the lower 32 bits of RAX, just as AX names the lower 16 bits of EAX.

 

The square brackets, [ ], mean the contents of memory at the address inside them, whether that address is computed from a register (as in [rbp-8]) or held in one (as in [rax]). So first, we're putting the quadword stored at RBP - 8 into the RAX register; then, we're loading into the EAX register the DWORD value that RAX is pointing to; finally, the DWORD in EAX is placed in a DWORD-sized chunk of memory at the base pointer minus 12.

Remember that RBP - 8 contained the address of our integer, x. So, as you can see in the assembly code, we managed to copy that integer to another place in memory by reading the pointer and then following the address it holds to our integer.
