© Jo Van Hoey 2019
J. Van Hoey, Beginning x64 Assembly Programming, https://doi.org/10.1007/978-1-4842-5076-1_8

8. Memory

Jo Van Hoey, Hamme, Belgium

Memory is used by the processor as a storage room for data and instructions. We have already discussed registers, which are high-speed access storage places. Accessing memory is a lot slower than accessing registers, but the number of registers is limited. The memory size has a theoretical limit of 2^64 addresses, which is 18,446,744,073,709,551,616 bytes, or 16 exabytes (strictly speaking, 16 EiB). You cannot use that much memory because of practical design issues! It is time to investigate memory in more detail.

Exploring Memory

Listing 8-1 shows an example we will use during our discussion of memory.
; memory.asm
section .data
      bNum        db    123
      wNum        dw    12345
      warray      times        5 dw 0      ; array of 5 words
                                           ; containing 0
      dNum        dd    12345
      qNum1       dq    12345
      text1       db    "abc",0
      qNum2       dq    3.141592654
      text2       db    "cde",0
section .bss
      bvar  resb  1
      dvar  resd  1
      wvar  resw  10
      qvar  resq  3
section .text
      global main
main:
      push   rbp
      mov   rbp, rsp
      lea   rax, [bNum]      ;load address of bNum in rax
      mov   rax, bNum        ;load address of bNum in rax
      mov   rax, [bNum]      ;load value at bNum in rax
      mov   [bvar], rax      ;store rax (8 bytes) at address bvar
      lea   rax, [bvar]      ;load address of bvar in rax
      lea   rax, [wNum]      ;load address of wNum in rax
      mov   rax, [wNum]      ;load content of wNum in rax
      lea   rax, [text1]     ;load address of text1 in rax
      mov   rax, text1       ;load address of text1 in rax
      mov   rax, text1+1     ;load address of second character in rax
      lea   rax, [text1+1]   ;load address of second character in rax
      mov   rax, [text1]     ;load starting at text1 in rax
      mov   rax, [text1+1]   ;load starting at text1+1 in rax
      mov   rsp,rbp
      pop   rbp
      ret
Listing 8-1. memory.asm

Make this program. There is no output for this program; use a debugger to step through each instruction. SASM is helpful here.

We defined some variables of different sizes, including an array of five words filled with zeros. We also reserved space for some items in section .bss. Look in your debugger for rsp, the stack pointer; it holds a very high value. The stack pointer refers to an address in high memory. The stack is an area in memory used for temporarily storing data. The stack will grow as more data is stored in it, and it will grow in the downward direction, from higher addresses to lower addresses. The stack pointer rsp will decrease every time you put data on the stack. We will discuss the stack in a separate chapter; for now, remember that the stack is located somewhere in high memory. See Figure 8-1.
Figure 8-1. rsp contains an address in high memory
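
The downward growth of the stack can be observed directly. The following is a minimal sketch, not one of the book's listings: the file name stackdir.asm is hypothetical, and it assumes the same build setup as memory.asm (NASM with ELF64 output, linked with gcc).

```nasm
; stackdir.asm (hypothetical name): rsp decreases when data is pushed
section .text
      global main
main:
      push  rbp
      mov   rbp, rsp
      mov   rax, rsp         ; rax = stack pointer before the push
      push  qword 42         ; put 8 bytes on the stack
      mov   rcx, rsp         ; rcx = stack pointer after the push
      sub   rax, rcx         ; rax = old rsp - new rsp = 8
      pop   rdx              ; take the value off again; rdx = 42
      mov   rsp, rbp
      pop   rbp
      ret                    ; main returns the difference in rax
```

When you step through this in the debugger, rcx should be exactly 8 lower than the value first copied into rax, which shows that the push moved rsp toward lower addresses.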

We used the lea instruction, which means “load effective address,” to load the memory address of bNum into rax. We can obtain the same result with mov, without the square brackets around bNum. If we use the square brackets, [ ], with the mov instruction, we are loading the value stored at bNum into rax, not the address. But we are not loading only bNum into rax. Because rax is a 64-bit (or 8-byte) register, 8 bytes are loaded, starting at bNum. Our bNum ends up in the rightmost (least significant) byte of rax (little endian); here we are only interested in the register al. If you require rax to contain only the value 123, you first have to clear rax, as shown here:
      xor rax, rax
Then instead of this:
      mov rax, [bNum]
use this:
      mov al, [bNum]
Be careful about the sizes of data you are moving to and from memory. Look, for instance, at the following:
      mov [bvar],rax
With this instruction, you are moving the 8 bytes in rax to the address bvar. If you only intended to write 123 to bvar, you can check with your debugger that you overwrite another 7 bytes in memory (choose type d for bvar in the SASM memory window)! This can introduce nasty bugs in your program. To avoid that, replace the instruction with the following:
      mov [bvar],al
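
As an alternative to clearing rax and then loading al, the movzx instruction loads a small value and zero-extends it in one step. The book has not used movzx here, so treat the following as a sketch under assumptions: the file name sizes.asm and the variable wvar1 are hypothetical, and the build is assumed to be the same as for memory.asm (linked without PIE).

```nasm
; sizes.asm (hypothetical name): move exactly as many bytes as the variable holds
section .data
      bNum  db    123
      wNum  dw    12345
section .bss
      bvar  resb  1
      wvar1 resw  1                  ; hypothetical one-word variable
section .text
      global main
main:
      push  rbp
      mov   rbp, rsp
      movzx eax, byte [bNum]         ; load 1 byte, zero-extended: rax = 123
      mov   [bvar], al               ; store exactly 1 byte
      movzx eax, word [wNum]         ; load 2 bytes, zero-extended: rax = 12345
      mov   [wvar1], ax              ; store exactly 2 bytes
      xor   eax, eax                 ; return 0
      mov   rsp, rbp
      pop   rbp
      ret
```

Writing to the 32-bit register eax clears the upper 32 bits of rax, so no separate xor is needed before each load. The size of the register used in each store (al, ax) determines how many bytes are written to memory.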

When loading content from memory address text1 into rax, note how the value in rax is in little-endian notation. Step through the program to investigate the different instructions, and change values and sizes to see what happens.
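
To make the little-endian byte order easier to recognize in the debugger, it can help to load eight known characters. This is a minimal sketch with a hypothetical file name (endian.asm) and a hypothetical variable (text8), assuming the same build as memory.asm.

```nasm
; endian.asm (hypothetical name): byte order when loading 8 bytes
section .data
      text8 db    "abcdefgh"         ; bytes 0x61 to 0x68 at increasing addresses
section .text
      global main
main:
      push  rbp
      mov   rbp, rsp
      mov   rax, [text8]             ; rax = 0x6867666564636261
      movzx ecx, al                  ; rcx = 0x61: 'a', from the lowest address
      xor   eax, eax                 ; return 0
      mov   rsp, rbp
      pop   rbp
      ret
```

The byte at the lowest address ('a') lands in the least significant byte of rax, so the hexadecimal value reads "backward" compared with the string in memory.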

There are two ways to load a memory address: mov and lea. Using lea can make your code more readable, as everybody can immediately see that you are handling addresses here. You can also use lea to speed up calculations, but we will not use lea for that purpose here.
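
The difference between an address and the content at that address can be sketched as follows. The file name address.asm is hypothetical, and the example assumes the same non-PIE build as memory.asm.

```nasm
; address.asm (hypothetical name): addresses versus content
section .data
      text1 db    "abc",0
section .text
      global main
main:
      push  rbp
      mov   rbp, rsp
      mov   rax, text1               ; rax = address of 'a'
      lea   rcx, [text1]             ; rcx = the same address
      lea   rdx, [text1+1]           ; rdx = address of 'b' (rax + 1)
      movzx esi, byte [rdx]          ; rsi = 0x62, the character 'b' itself
      xor   eax, eax                 ; return 0
      mov   rsp, rbp
      pop   rbp
      ret
```

In the debugger, rax and rcx should hold the same value, rdx should be one higher, and only rsi holds a character rather than an address.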

Start GDB with gdb memory, then type disass main and look at the left column with memory addresses (Figure 8-2). Do not forget to first delete the line added by SASM for correct debugging, as we explained in the previous chapter. In our case, the first instruction is located at address 0x4004a0.
Figure 8-2. GDB disassemble main

Now we will use readelf at the command line. Remember that we asked NASM to assemble using the ELF format (see the makefile). readelf is a CLI tool used to obtain more information about the executable file. If you feel the irresistible urge to know more about linkers, here is an interesting source of information:
  • Linkers and Loaders, John R. Levine, 1999, The Morgan Kaufmann Series in Software Engineering and Programming

As you probably guessed, at the CLI you can also type the following:
man elf
For our purposes, at the CLI type the following:
      readelf --file-header ./memory
You will get some general information about our executable memory. Look at Entry point address: 0x4003b0. That is the memory location of the start of our program. So, between the program entry and the start of the code, as shown in GDB (0x4004a0), there is some overhead. The header provides us with additional information about the OS and the executable code. See Figure 8-3.
Figure 8-3. readelf header

readelf is convenient for exploring a binary executable. Figure 8-4 shows some more examples.
Figure 8-4. readelf symbols

With grep we specify that we are looking for all lines containing the word main. Here you see that the main function starts at 0x4004a0, as we saw in GDB. In the following example, we look in the symbols table for every occurrence of the label start. We see the start addresses of section .data, section .bss, and the start of the program itself. See Figure 8-5.
Figure 8-5. readelf symbols

Let’s see what we have in memory with the following command:
      readelf --symbols ./memory |tail +10|sort -k 2 -r

The tail command skips some lines that are not interesting to us right now. If your version of tail rejects the older +10 syntax, tail -n +10 is the equivalent form. We sort on the second column (the memory addresses) in reverse order. As you see, some basic knowledge of Linux commands comes in handy!

The start of the program is at some low address, and the start of main is at 0x004004a0. Look for the start of section .data (0x00601018), with the addresses of all its variables, and the start of section .bss (0x00601051), with the addresses reserved for its variables.

Let’s summarize our findings: we found at the beginning of this chapter that the stack is in high memory (see rsp). With readelf, we found that the executable code is at the lower side of memory. On top of the executable code, we have section .data and on top of that section .bss. The stack in high memory can grow; it grows in the downward direction toward section .bss. The available free memory between the stack and the other sections is called the heap.

The memory in section .bss is assigned at runtime; you can easily check that. Take note of the size of the executable, and then change, for example, the following:
      qvar       resq      3
to the following:
      qvar      resq      30000
Rebuild the program and look again at the size of the executable. The size will be the same, so no additional space is reserved in the executable at assembly/link time. See Figure 8-6.
Figure 8-6. Output of readelf --symbols ./memory |tail +10|sort -k 2 -r
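
The contrast with initialized data can be made visible with a small experiment. The following sketch uses hypothetical names (bsssize.asm, dbig, qbig) and assumes the same build as memory.asm. Build it once as shown and once with the dbig line commented out, and compare the sizes of the two executables.

```nasm
; bsssize.asm (hypothetical name): initialized data versus reserved space
section .data
      dbig  times 30000 dq 0         ; 240,000 bytes stored in the executable
section .bss
      qbig  resq  30000              ; 240,000 bytes, only the size is recorded
section .text
      global main
main:
      push  rbp
      mov   rbp, rsp
      mov   qword [qbig], 1          ; the reserved memory exists at runtime
      mov   rax, [qbig]              ; rax = 1
      xor   eax, eax                 ; return 0
      mov   rsp, rbp
      pop   rbp
      ret
```

With the dbig line present, the executable should be roughly 240,000 bytes (30000 x 8) larger than without it, even though every byte is zero. The equally large qbig in section .bss should not change the file size noticeably.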

To summarize, Figure 8-7 shows how the memory looks when an executable is loaded.
Figure 8-7. Memory map
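
The memory map can also be checked from inside a running program by loading one address from each region into a register. This is a minimal sketch with hypothetical names (layout.asm, dataVar, bssVar); it assumes a non-PIE build as used for memory.asm, and the actual addresses will differ from system to system.

```nasm
; layout.asm (hypothetical name): one address from each memory region
section .data
      dataVar dq    1
section .bss
      bssVar  resq  1
section .text
      global main
main:
      push  rbp
      mov   rbp, rsp
      mov   rax, main                ; an address in the executable code
      mov   rcx, dataVar             ; an address in section .data
      mov   rdx, bssVar              ; an address in section .bss
      mov   rsi, rsp                 ; an address in the stack
      xor   eax, eax                 ; inspect the registers before this line
      mov   rsp, rbp
      pop   rbp
      ret
```

Under these assumptions, the debugger should show the values increasing from rax to rcx to rdx, with rsi far above all of them, which matches the order in the memory map.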

Why is it important to know about memory structure? It is important to know that the stack grows in the downward direction. When we exploit the stack later in this book, you will need this knowledge. Also, if you are into forensics or malware investigation, being able to analyze memory is an essential skill. We only touched on some basics here; if you want to know more, refer to the previously mentioned sources.

Summary

In this chapter, you learned about the following:
  • The structure of the process memory

  • How to avoid overwriting memory unintentionally

  • How to use readelf to analyze binary code
