Pointers, as a reference data type, are considered low-level because their values are used as memory addresses. A pointer points at a datum, and the actual memory address of the datum is therefore the value of the pointer. The action of using the pointer to access the datum at the defined memory address is called dereferencing. Let's take a look at a sample C program that plays around with pointers and dereferencing, and then a quick peek at the assembly of the compiled program:
#include <stdio.h>

int main(int argc, char **argv)
{
    int x = 10;
    int *point = &x;        /* point holds the address of x */
    int deref = *point;     /* read the int stored at that address */
    printf(" Variable x is currently %d. *point is %d. ", x, deref);
    *point = 20;            /* write through the pointer; this changes x */
    int dereftwo = *point;
    printf("After assigning 20 to the address referenced by point, *point is now %d. ", dereftwo);
    printf("x is now %d. ", x);
}
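The compiled program reports that x and *point both start out as 10 and that, after 20 is assigned through the pointer, *point and x are both 20.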
Take a look at what happens at key points in the assembled program. Declaring integer x causes a spot in memory to be allocated for it. int x = 10; looks like this in assembly:
mov DWORD PTR [rbp-20], 10
Thus, the value 10 is moved into the 4-byte location at the base pointer minus 20. Easy enough. (Note that the size of the memory used for our variable is stated here: DWORD. A double word is 32 bits, or 4 bytes, long.) But now, check out what happens when we get to int *point = &x;, where we declare the int pointer, point, and assign it the memory address of x:
lea rax, [rbp-20]
mov QWORD PTR [rbp-8], rax
The lea instruction means load effective address. Here, the RAX register is the destination, so what's really being said is: compute the address RBP - 20 and put it into the RAX register. Note that lea only calculates the address; it doesn't read the memory there. Next, the value in RAX is moved to the quadword of memory (8 bytes) at the base pointer minus 8. So far, we set aside 4 bytes of memory at the base pointer minus 20 and placed the integer 10 there. Then, we took the 64-bit address of this integer's location in memory and placed that value into memory at the base pointer minus 8. In short, integer x is now at RBP - 20, and the address RBP - 20 is now stored as a pointer at RBP - 8.
When we dereference the pointer with int deref = *point;, we see this in assembly:
mov rax, QWORD PTR [rbp-8]
mov eax, DWORD PTR [rax]
mov DWORD PTR [rbp-12], eax
To understand these instructions, let's quickly review the registers. Remember that EAX is a 32-bit register in the IA-32 architecture; it's an extension of the 16-bit AX. RAX is a 64-bit register in the x64 architecture, and because x64 is backward-compatible, it follows the same principle: RAX is an extension of EAX. In other words, EAX is the lower 32 bits of RAX, and AX is the lower 16 bits of EAX.
The square brackets, [ ], denote the contents of memory at the address given by the expression inside them. So first, we're putting the quadword stored at RBP - 8 into the RAX register; then, we're loading into the EAX register the DWORD value that RAX is pointing to; finally, the DWORD in EAX is placed in a DWORD-sized chunk of memory at the base pointer minus 12.
Remember that RBP - 8 contained the address of our integer, x. So, as you can see in the assembly code, we managed to get a copy of that integer stored in another place in memory by reading the pointer and then following it to our integer.