J. Van Hoey, Beginning x64 Assembly Programming, https://doi.org/10.1007/978-1-4842-5076-1_8

8. Memory

Jo Van Hoey, Hamme, Belgium

Memory is used by the processor as a storage room for data and instructions. We have already discussed registers, which are high-speed storage places. Accessing memory is a lot slower than accessing registers, but the number of registers is limited. With 64-bit addresses, the memory size has a theoretical limit of 2⁶⁴ addresses, which is 18,446,744,073,709,551,616 bytes, or 16 exabytes. In practice you cannot use that much memory because of design limits in real hardware. It is time to investigate memory in more detail.

Exploring Memory

Listing 8-1 shows an example we will use during our discussion of memory.

; memory.asm
section .data
    bNum    db  123
    wNum    dw  12345
    warray  times 5 dw 0        ; array of 5 words containing 0
    dNum    dd  12345
    qNum1   dq  12345
    text1   db  "abc",0
    qNum2   dq  3.141592654
    text2   db  "cde",0
section .bss
    bvar    resb 1
    dvar    resd 1
    wvar    resw 10
    qvar    resq 3
section .text
    global main
main:
    push rbp
    mov rbp, rsp
    lea rax, [bNum]         ; load address of bNum in rax
    mov rax, bNum           ; load address of bNum in rax
    mov rax, [bNum]         ; load 8 bytes starting at bNum in rax
    mov [bvar], rax         ; store the 8 bytes of rax at address bvar
    lea rax, [bvar]         ; load address of bvar in rax
    lea rax, [wNum]         ; load address of wNum in rax
    mov rax, [wNum]         ; load 8 bytes starting at wNum in rax
    lea rax, [text1]        ; load address of text1 in rax
    mov rax, text1          ; load address of text1 in rax
    mov rax, text1+1        ; load address of second character in rax
    lea rax, [text1+1]      ; load address of second character in rax
    mov rax, [text1]        ; load 8 bytes starting at text1 in rax
    mov rax, [text1+1]      ; load 8 bytes starting at text1+1 in rax
    mov rsp,rbp
    pop rbp
    ret

Listing 8-1

memory.asm

Build this program with make. It produces no output, so use a debugger to step through each instruction. SASM is helpful here.

We defined some variables of different sizes, including an array of five words filled with zeros. We also defined some items in section .bss. Look in your debugger for rsp, the stack pointer; it holds a very high value. The stack pointer refers to an address in high memory. The stack is an area in memory used for temporarily storing data. The stack grows as more data is stored in it, and it grows in the downward direction, from higher addresses to lower addresses. The stack pointer rsp decreases every time you put data on the stack. We will discuss the stack in a separate chapter, but remember for now that the stack is a place somewhere in high memory. See Figure 8-1.

Figure 8-1. rsp contains an address in high memory
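The downward growth of the stack can be observed directly. The following is a minimal sketch, not part of the book's listings: the file name stackdemo.asm is hypothetical, and it assumes the same build as the chapter's makefile (NASM producing ELF64, linked with gcc). It copies rsp before and after a push and subtracts the two values.

```nasm
; stackdemo.asm (hypothetical file name, not from the book)
section .text
    global main
main:
    push rbp
    mov  rbp, rsp
    mov  rcx, rsp       ; rcx = stack pointer before the push
    push qword 42       ; put 8 bytes on the stack
    mov  rdx, rsp       ; rdx = stack pointer after the push
    sub  rcx, rdx       ; rcx = before - after = 8, so rsp moved down
    pop  rax            ; take the 42 off the stack again
    mov  rax, rcx       ; return the difference as the exit status
    mov  rsp, rbp
    pop  rbp
    ret
```

Stepping through this in a debugger shows rsp dropping by 8 at the push and rising by 8 at the pop. Because main returns the difference, running the program and then typing echo $? at the CLI should print 8.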

We used the lea instruction, which means “load effective address,” to load the memory address of bNum into rax. We can obtain the same result with mov, without the square brackets around bNum. If we use the square brackets, [ ], with the mov instruction, we load the value stored at bNum into rax, not its address. But we are not loading only bNum into rax. Because rax is a 64-bit (or 8-byte) register, the 7 bytes that follow bNum in memory are loaded as well. Our bNum ends up in the least significant byte of rax (little endian), which is the register al, and that is the only part we are interested in here. When you require rax to contain only the value 123, you first have to clear rax, as shown here:

xor rax, rax

Then instead of this:

mov rax, [bNum]

use this:

mov al, [bNum]
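As an alternative that the chapter does not use, the clear-then-load pair can be written as a single instruction with movzx, which loads a smaller value and fills the remaining bits with zeros. This is a minimal sketch under the same build assumptions as memory.asm; only bNum is taken from Listing 8-1.

```nasm
; movzx sketch (illustration only, not from the book)
section .data
    bNum db 123
section .text
    global main
main:
    push rbp
    mov  rbp, rsp
    movzx eax, byte [bNum]  ; load exactly 1 byte and zero-extend it to 32 bits;
                            ; writing eax also clears bits 32-63 of rax
                            ; rax now contains only 123
    mov  rsp, rbp
    pop  rbp
    ret
```

The byte keyword is required here because the destination register alone does not tell the assembler how many bytes to read from memory.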

Be careful about the sizes of data you are moving to and from memory. Look, for instance, at the following:

mov [bvar],rax

With this instruction, you are moving the 8 bytes in rax to the address bvar. If you only intended to write 123 to bvar, you can check with your debugger that you also overwrite the 7 bytes that follow bvar in memory (choose type d for bvar in the SASM memory window)! This can introduce nasty bugs in your program. To avoid that, replace the instruction with the following:

mov [bvar],al
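The rule behind this fix is that the register you name determines how many bytes are written. The sketch below illustrates this with one variable per size; the variable names mirror the chapter's naming style but the program itself is an assumption for illustration, built the same way as memory.asm.

```nasm
; sizes sketch (illustration only, not from the book)
section .bss
    bvar resb 1
    wvar resw 1
    dvar resd 1
    qvar resq 1
section .text
    global main
main:
    push rbp
    mov  rbp, rsp
    mov  rax, 123
    mov  [bvar], al         ; writes 1 byte
    mov  [wvar], ax         ; writes 2 bytes
    mov  [dvar], eax        ; writes 4 bytes
    mov  [qvar], rax        ; writes 8 bytes
    mov  byte [bvar], 123   ; with an immediate there is no register to
                            ; infer the size from, so state it explicitly
    xor  eax, eax           ; return 0
    mov  rsp, rbp
    pop  rbp
    ret
```

Matching the register width to the size of the destination variable keeps each store inside the bytes that were reserved for it.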

When loading content from memory address text1 into rax, note how the value in rax is in little-endian notation. Step through the program to investigate the different instructions, and change values and sizes to see what happens.
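Little-endian order is easiest to see with a string of exactly 8 bytes, so that no neighboring data ends up in the register. This is a minimal sketch with a hypothetical variable named text, assuming the same build as memory.asm.

```nasm
; little-endian sketch (illustration only, not from the book)
section .data
    text db "abcdefgh"      ; bytes 0x61, 0x62, ... 0x68 at increasing addresses
section .text
    global main
main:
    push rbp
    mov  rbp, rsp
    mov  rax, [text]        ; rax = 0x6867666564636261
                            ; the byte at the lowest address ('a' = 0x61)
                            ; lands in the least significant byte, al
    movzx eax, al           ; keep only al, so rax = 0x61
    mov  rsp, rbp
    pop  rbp
    ret
```

In the debugger, rax reads as the string backward in hexadecimal after the first mov. The program returns 0x61, so echo $? should print 97, the ASCII code of a.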

There are two ways to load a memory address: mov and lea. Using lea can make your code more readable, as everybody can immediately see that you are handling addresses here. You can also use lea to speed up calculations, but we will not use lea for that purpose here.
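For completeness, here is what using lea for calculations looks like, even though the chapter does not rely on it. The sketch is an illustration with arbitrary values: lea evaluates the address expression inside the brackets and stores the result, without accessing memory at all.

```nasm
; lea arithmetic sketch (illustration only, not from the book)
section .text
    global main
main:
    push rbp
    mov  rbp, rsp
    mov  rdi, 7
    lea  rax, [rdi + rdi*4] ; rax = 7 + 7*4 = 35, no memory is read
    lea  rax, [rax + 10]    ; rax = 35 + 10 = 45
    mov  rsp, rbp
    pop  rbp
    ret
```

Each lea combines a multiplication or addition into one instruction and leaves the flags untouched, which is why it is sometimes chosen over separate arithmetic instructions.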

Start GDB with gdb memory, then enter disass main and look at the left column with the memory addresses (Figure 8-2). Do not forget to first delete the line added by SASM for correct debugging, as we explained in the previous chapter. In our case, the first instruction is located at address 0x4004a0.

Figure 8-2. GDB disassemble main

Now we will use readelf at the command line. Remember that we asked NASM to assemble using the ELF format (see the makefile). readelf is a CLI tool used to obtain more information about the executable file. If you feel the irresistible urge to know more about linkers, here is an interesting source of information:

Linkers and Loaders, John R. Levine, 1999, The Morgan Kaufmann Series in Software Engineering and Programming

Here is a shorter treatment of the ELF format:

https://linux-audit.com/elf-binaries-on-linux-understanding-and-analysis/

https://www.cirosantilli.com/elf-hello-world/

As you probably guessed, at the CLI you can also type the following:

man elf

For our purposes, at the CLI type the following:

readelf --file-header ./memory

You will get some general information about our executable, memory. Look at Entry point address: 0x4003b0. That is the memory location where execution of our program starts. So, between the program entry and the start of our own code in main, as shown in GDB (0x4004a0), there is some overhead: startup code added when the program is linked. The header provides us with additional information about the OS and the executable code. See Figure 8-3.

Figure 8-3. readelf header

readelf is convenient for exploring a binary executable. Figure 8-4 shows some more examples.

Figure 8-4. readelf symbols

With grep we specify that we are looking for all lines that contain the word main. Here you see that the main function starts at 0x4004a0, as we saw in GDB. In the following example, we look in the symbol table for every occurrence of the word start. We see the start addresses of section .data, section .bss, and the start of the program itself. See Figure 8-5.

Figure 8-5. readelf symbols

Let’s see what we have in memory with the following command:

readelf --symbols ./memory |tail +10|sort -k 2 -r

The tail command skips the first lines of the output, which are not interesting to us right now (if your version of tail does not accept +10, use the POSIX form tail -n +10). We sort on the second column (the memory addresses) in reverse order. As you see, some basic knowledge of Linux commands comes in handy!

The start of the program is at some low address, and the start of main is at 0x004004a0. Look for the start of section .data (0x00601018), with the addresses of all its variables, and the start of section .bss (0x00601051), with the addresses reserved for its variables.

Let’s summarize our findings: we found at the beginning of this chapter that the stack is in high memory (see rsp). With readelf, we found that the executable code is at the lower side of memory. On top of the executable code, we have section .data and on top of that section .bss. The stack in high memory can grow; it grows in the downward direction toward section .bss. The available free memory between the stack and the other sections is called the heap.
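The ordering described in this summary can be checked from inside a program by loading one address from each region. The sketch below is an illustration, not one of the book's listings: dataVar and bssVar are hypothetical names, and it assumes a non-position-independent build like the chapter's, so that the addresses are fixed at link time.

```nasm
; layout sketch (illustration only, not from the book)
section .data
    dataVar dq 1
section .bss
    bssVar  resq 1
section .text
    global main
main:
    push rbp
    mov  rbp, rsp
    mov  rax, main          ; address of code in section .text
    mov  rcx, dataVar       ; address in section .data
    mov  rdx, bssVar        ; address in section .bss
    mov  rsi, rsp           ; current stack address
    xor  eax, eax           ; set a breakpoint on this line, then return 0
    mov  rsp, rbp
    pop  rbp
    ret
```

With a breakpoint on the xor line, the registers should show increasing values in the order rax, rcx, rdx, rsi, matching the layout found with readelf: code lowest, then section .data, then section .bss, and the stack far above them.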

The memory in section .bss is assigned at runtime; you can easily check that. Take note of the size of the executable, and then change, for example, the following:

qvar resq 3

to the following:

qvar resq 30000

Rebuild the program and look again at the size of the executable. The size will be the same, so no additional memory is reserved at assembly/link time. See Figure 8-6.

Figure 8-6. Output of readelf --symbols ./memory |tail +10|sort -k 2 -r
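The experiment with qvar becomes more telling when it is compared with the same amount of initialized data. In the sketch below, which is an illustration with the hypothetical names bigData and bigBss, both definitions describe 30000 quadwords (240,000 bytes), but only one of them has to be stored in the executable file.

```nasm
; data versus bss sketch (illustration only, not from the book)
section .data
    bigData times 30000 dq 0    ; 240,000 bytes of zeros stored in the file
section .bss
    bigBss  resq 30000          ; 240,000 bytes reserved, only the size
                                ; is recorded in the file
section .text
    global main
main:
    xor eax, eax                ; return 0
    ret
```

Building this once as shown and once with the bigData line removed should change the size of the executable by roughly 240,000 bytes, whereas removing the bigBss line should leave the size essentially unchanged.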

To summarize, Figure 8-7 shows how the memory looks when an executable is loaded.

Figure 8-7. Memory map

Why is it important to know about memory structure? It is important to know that the stack grows in the downward direction. When we exploit the stack later in this book, you will need this knowledge. Also, if you are into forensics or malware investigation, being able to analyze memory is an essential skill. We only touched on some basics here; if you want to know more, refer to the previously mentioned sources.

Summary

In this chapter, you learned about the following:

The structure of the process memory
How to avoid overwriting memory unintentionally
How to use readelf to analyze binary code
