© Jo Van Hoey 2019
J. Van Hoey, Beginning x64 Assembly Programming, https://doi.org/10.1007/978-1-4842-5076-1_4

4. Your Next Program: Alive and Kicking!

Jo Van Hoey, Hamme, Belgium

Now that you have a firm grasp of GDB and know what an assembly program looks like, let’s add some complexity. In this chapter, we will show how to obtain the length of a string variable. We will show how to print integer and floating-point values using printf. And we will expand your knowledge of GDB commands.

Listing 4-1 contains the example code that we will use to show how we can find the length of a string and how numeric values are stored in memory.
;alive.asm
section .data
      msg1   db    "Hello, World!",10,0       ; string with NL and 0
      msg1Len      equ    $-msg1-1     ; measure the length, minus the 0
      msg2   db    "Alive and Kicking!",10,0  ; string with NL and 0
      msg2Len      equ    $-msg2-1     ; measure the length, minus the 0
      radius dq    357                 ; no string, not displayable?
      pi     dq    3.14                ; no string, not displayable?
section .bss
section .text
      global main
main:
    push           rbp            ; function prologue
    mov            rbp,rsp        ; function prologue
    mov            rax, 1         ; 1 = write
    mov            rdi, 1         ; 1 = to stdout
    mov            rsi, msg1      ; string to display
    mov            rdx, msg1Len   ; length of the string
    syscall                       ; display the string
    mov            rax, 1         ; 1 = write
    mov            rdi, 1         ; 1 = to stdout
    mov            rsi, msg2      ; string to display
    mov            rdx, msg2Len   ; length of the string
    syscall                       ; display the string
    mov            rsp,rbp        ; function epilogue
    pop            rbp            ; function epilogue
    mov            rax, 60        ; 60 = exit
    mov            rdi, 0         ; 0 = success exit code
    syscall                       ; quit
Listing 4-1

alive.asm

Type this program into your favorite editor and save it as alive.asm. Create the makefile containing the lines in Listing 4-2.
#makefile for alive.asm
# note: the indented command lines must start with a tab character, not spaces
alive: alive.o
	gcc -o alive alive.o -no-pie
alive.o: alive.asm
	nasm -f elf64 -g -F dwarf alive.asm -l alive.lst
Listing 4-2

makefile for alive.asm

Save this file and quit the editor.

At the command prompt, type make to assemble and build the program and then run the program by typing ./alive at the command prompt. If you see the output shown in Figure 4-1 displayed at the prompt, then everything worked as planned; otherwise, you made some typo or other error. Happy debugging!
Figure 4-1

alive.asm output

Analysis of the Alive Program

In our first program, hello.asm, we put the length of msg, 13 characters, in rdx in order to display msg. In alive.asm, we use a nice feature to calculate the length of our variables, as shown here:
      msg1Len equ $-msg1-1

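The $-msg1-1 part means this: take the current memory location ($), which at this point is the first byte after msg1, and subtract the memory location of msg1. The result is the number of bytes in msg1, including the terminating zero: 13 characters, the newline, and the 0, or 15 bytes. We subtract 1 so that the terminating zero is not counted, and the result, 14, is stored in the constant msg1Len. This works only because the equ line immediately follows the string it measures.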
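The length calculation can be checked without a debugger. The following is a minimal sketch, not part of the original listings: the file name lencheck.asm is hypothetical, and the program simply hands msg1Len to the exit system call so that the shell can display it.

```nasm
; lencheck.asm (hypothetical file name, for illustration only)
section .data
      msg1    db    "Hello, World!",10,0   ; 13 characters + NL + 0 = 15 bytes
      msg1Len equ   $-msg1-1               ; 15 - 1 = 14
section .text
      global main
main:
      mov   rax, 60          ; 60 = exit
      mov   rdi, msg1Len     ; use the computed length as the exit code
      syscall                ; quit
```

Assemble and link it in the same way as alive.asm, run ./lencheck, and then type echo $? at the prompt. The shell should report 14.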

Note the use of a function prologue and function epilogue in the code. These are needed for GDB to function correctly, as pointed out in the previous chapter. The prologue and epilogue code will be explained in a later chapter.

Let’s do some memory digging with GDB! Type the following:
      gdb alive
Then at the (gdb) prompt, type the following:
      disassemble main
Figure 4-2 shows the output.
Figure 4-2

alive disassemble

So, on our computer, it seems that variable msg1 sits at memory location 0x601030; you can check that with this:
      x/s 0x601030
Figure 4-3 shows the output.
Figure 4-3

Memory location of msg1

The \n stands for “new line.” Another way to verify variables in GDB is as follows:
x/s &msg1
Figure 4-4 shows the output.
Figure 4-4

Memory location of msg1

How about the numeric values?
      x/dw      &radius
      x/xw      &radius
Figure 4-5 shows the output.
Figure 4-5

Numeric values

So, you get the decimal and hexadecimal values stored at memory location radius.

For a floating-point variable, use the following:
      x/fg &pi
      x/fx &pi
Figure 4-6 shows the output.
Figure 4-6

Floating-point values

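(Notice the floating-point error? The value 3.14 has no exact binary representation, so the nearest representable value is stored instead.)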

There is a subtlety that you should be aware of here. To demonstrate, open the alive.lst file that was generated. See Figure 4-7.
Figure 4-7

alive.lst

Look at lines 10 and 11, where on the left you can find the hexadecimal representation of radius and pi. Instead of 0165, you find 6501, and instead of 40091EB851EB851F, you find 1F85EB51B81E0940. So, the bytes (1 byte is two hex digits) are in reverse order!

This characteristic is called endianness. The big-endian format stores numbers the way we are used to writing them, with the most significant byte first, at the lowest memory address. The little-endian format stores the least significant byte first, at the lowest memory address. Intel processors use little-endian, and that can be very confusing when looking at hexadecimal dumps of memory.
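The byte order can also be observed from inside a program. The sketch below is an illustration under stated assumptions rather than one of the book's listings: the file name endian.asm is hypothetical, and the program reads only the single byte stored at the address of radius and returns it as the exit code.

```nasm
; endian.asm (hypothetical file name, for illustration only)
section .data
      radius dq    357                 ; 357 = 0x0000000000000165
section .text
      global main
main:
      movzx edi, byte [radius]         ; load only the byte at the lowest address
      mov   eax, 60                    ; 60 = exit
      syscall                          ; exit code = the byte that was loaded
```

Build it like alive.asm, run it, and type echo $?. On a little-endian Intel processor the shell should show 101, which is 0x65, the least significant byte of 0x165. A big-endian machine would have stored 0x00 at that address.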

Why do they have such strange names like big-endian and little-endian?

In 1726, Jonathan Swift wrote a famous novel, Gulliver’s Travels. The novel features two fictional islands, Lilliput and Blefuscu. The inhabitants of Lilliput are at war with the people of Blefuscu about how to break eggs: on the smaller end or on the bigger end. The Lilliputians are little endians, preferring to break their eggs on the smaller end. The people of Blefuscu are big endians. Now you see that modern computing has traditions rooted in the distant past!

Take the time to single-step through the program (break main, run, next, next, next…). You can see that GDB steps over the function prologue. Edit the source code, delete the function prologue and epilogue, and re-make the program. Single-step again with GDB. In our case, GDB refuses to single-step and executes the program completely. When assembling with YASM, another assembler based on NASM, we can safely omit the prologue and epilogue code and still step through the code with GDB. Sometimes it is necessary to experiment, tinker, and Google around!

Printing

Our alive program prints these two strings:
      Hello, World!
      Alive and Kicking!

However, there are two other variables that were not defined as strings: radius and pi. Printing these variables is a bit more complex than printing strings. To print them in the same way as we did with msg1 and msg2, we would have to convert the values of radius and pi into strings. It is perfectly doable to add code for this conversion to our program, but it would make our small program too complicated at this point, so we are going to cheat a little bit. We will borrow printf, a common function, from the programming language C and use it in our program. If this upsets you, have patience. When you become a more advanced assembly programmer, you can write your own function for converting and printing numbers. Or you could conclude that writing your own printf function is too much of a waste of time....

To introduce printf in assembler, we will start with a simple program. Modify the first program, hello.asm, as shown in Listing 4-3.
; hello4.asm
extern      printf     ; declare the function as external
section .data
      msg    db   "Hello, World!",0
      fmtstr db   "This is our string: %s",10,0 ; printformat
section .bss
section .text
      global main
main:
      push  rbp
      mov   rbp,rsp
      mov   rdi, fmtstr      ; first argument for printf
      mov   rsi, msg         ; second argument for printf
      mov   rax, 0           ; no xmm registers involved
      call  printf           ; call the function
      mov   rsp,rbp
      pop   rbp
      mov   rax, 60          ; 60 = exit
      mov   rdi, 0           ; 0 = success exit code
      syscall                ; quit
Listing 4-3

hello4.asm

So, we start by telling the assembler (and the linker) that we are going to use an external function called printf. We created a string for formatting how printf will display msg. The syntax for the format string is similar to the syntax in C; if you have any experience with C, you will certainly recognize the format string. %s is a placeholder for the string to be printed.

Do not forget the function prologue and epilogue. Move the address of msg into rsi, and move the address of fmtstr into rdi. Clear rax: printf accepts a variable number of arguments, and for such functions rax tells the function how many xmm registers hold floating-point arguments, so 0 means there are no floating-point numbers to be printed. Floating-point numbers and xmm registers will be explained later in Chapter 11.

Listing 4-4 shows the makefile.
#makefile for hello4.asm
# note: the indented command lines must start with a tab character, not spaces
hello4: hello4.o
	gcc -o hello4 hello4.o -no-pie
hello4.o: hello4.asm
	nasm -f elf64 -g -F dwarf hello4.asm -l hello4.lst
Listing 4-4

makefile for hello4.asm

Make sure the -no-pie flag is added in the makefile; otherwise, the use of printf will cause an error. Remember from Chapter 1 that the current gcc compiler generates position-independent executable (pie) code to make it more hacker-safe. One of the consequences is that we cannot simply use external functions anymore. To avoid this complication, we use the flag -no-pie.

Build and run the program. Google the C printf function to get an idea of the possible formats. As you will see, printf gives us the flexibility to print integers, floating-point values, strings, hexadecimal data, and so on. The printf function requires that a string is terminated with a 0 byte (NUL). If you omit the 0, printf will keep displaying whatever follows the string in memory until it finds a 0. Terminating a string with a 0 is not a requirement in assembly, but it is necessary with printf, GDB, and also some SIMD instructions (SIMD will be covered in Chapter 26).

Figure 4-8 shows the output.
Figure 4-8

hello4 output
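To give an idea of the formatting flexibility of printf mentioned above, here is a minimal sketch that prints the same integer in three ways with a single call. It is an illustration, not one of the book's listings: the file name formats.asm and the label fmt are hypothetical, and the program returns from main with ret (instead of calling the exit system call) so that the C library can flush its output buffers.

```nasm
; formats.asm (hypothetical file name, for illustration only)
extern      printf
section .data
      radius dq   357
      fmt    db   "decimal: %d, hex: %x, padded: %08d",10,0
section .text
      global main
main:
      push  rbp
      mov   rbp,rsp
      mov   rdi, fmt         ; first argument: the format string
      mov   rsi, [radius]    ; second argument: value for %d
      mov   rdx, [radius]    ; third argument: value for %x
      mov   rcx, [radius]    ; fourth argument: value for %08d
      mov   rax, 0           ; no xmm registers involved
      call  printf
      mov   rsp,rbp
      pop   rbp
      mov   rax, 0           ; return value of main = exit code 0
      ret                    ; return to the C runtime, which exits for us
```

Built with a makefile like the one for hello4, this should print: decimal: 357, hex: 165, padded: 00000357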

Back to our alive program! With printf we can now print the variables radius and pi. Listing 4-5 shows the source code. By now you know what to do: create the source code, copy or create/modify a makefile, and there you go.
; alive2.asm
section .data
      msg1        db    "Hello, World!",0
      msg2        db    "Alive and Kicking!",0
      radius      dq    357      ; quadword, so mov rsi,[radius] reads exactly 8 bytes
      pi          dq    3.14
      fmtstr      db    "%s",10,0 ;format for printing a string
      fmtflt      db    "%lf",10,0 ;format for a float
      fmtint      db    "%d",10,0 ;format for an integer
section .bss
section .text
extern     printf
      global main
main:
    push   rbp
    mov    rbp,rsp
; print msg1
    mov    rax, 0            ; no floating point
    mov    rdi, fmtstr
    mov    rsi, msg1
    call   printf
; print msg2
    mov    rax, 0            ; no floating point
    mov    rdi, fmtstr
    mov    rsi, msg2
    call   printf
; print radius
    mov    rax, 0            ; no floating point
    mov    rdi, fmtint
    mov    rsi, [radius]
    call   printf
; print pi
    mov    rax, 1            ; 1 xmm register used
    movq   xmm0, [pi]
    mov    rdi, fmtflt
    call   printf
    mov    rsp,rbp
    pop    rbp
ret
Listing 4-5

alive2.asm

We added three strings for formatting the printout. Put the format string in rdi, put the item to be printed in rsi (the address for a string, the value itself for an integer), put 0 into rax to indicate that no floating-point numbers are involved, and then call printf. For printing a floating-point number, move the floating-point value to be displayed into xmm0 with the instruction movq. We use one xmm register, so we put 1 into rax. In later chapters, we will talk more about XMM registers for floating-point calculations and about SIMD instructions.

Note the square brackets, [ ], around radius and pi.
      mov rsi, [radius]

This means: take the content at address radius and put it in rsi. The function printf wants a memory address for strings, but for numbers it expects a value, not a memory address. Keep that in mind.
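The integer and floating-point conventions can be combined in a single call. The following sketch assumes the same variables as alive2.asm. The file name mixed.asm and the label fmt are hypothetical. The integer value travels in rsi, the floating-point value in xmm0, and rax announces that one xmm register is in use.

```nasm
; mixed.asm (hypothetical file name, for illustration only)
extern      printf
section .data
      radius dq   357
      pi     dq   3.14
      fmt    db   "radius = %d, pi = %lf",10,0
section .text
      global main
main:
      push  rbp
      mov   rbp,rsp
      mov   rdi, fmt         ; format string (an address)
      mov   rsi, [radius]    ; integer argument (a value, note the brackets)
      movq  xmm0, [pi]       ; floating-point argument (a value)
      mov   rax, 1           ; 1 xmm register used
      call  printf
      mov   rsp,rbp
      pop   rbp
      mov   rax, 0           ; return value of main
      ret
```

This should print: radius = 357, pi = 3.140000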

The exit of our program is something new. Instead of the familiar code shown here:
      mov  rax, 60    ; 60 = exit
      mov  rdi, 0     ; 0 = success exit code
      syscall         ; quit
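we return from main, which is nearly equivalent. Because the program is linked with gcc, the C runtime called main, and it exits the program when main returns. It uses the value left in rax as the exit code (in alive2, that is the return value of the last printf call, not 0), and it flushes any output that printf still holds in its buffers: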
      ret
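When main ends with ret, the value in rax at that moment becomes the exit code of the program. The sketch below makes this visible. The file name retcode.asm and the value 7 are arbitrary choices for illustration.

```nasm
; retcode.asm (hypothetical file name, for illustration only)
section .text
      global main
main:
      mov   rax, 7      ; return value of main
      ret               ; the C runtime exits with this value
```

Link it with gcc as before, run ./retcode, and type echo $? to see 7. To report success, put 0 into rax before the ret.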

A warning about printf: printf takes a format string, and it interprets every argument according to the corresponding format specifier (integer, double, string, etc.), whatever the value really is. A mismatched specifier therefore prints a converted or misleading value without any warning, which can be confusing. If you really want to know the value of a register or variable (memory location) in your program, use a debugger and examine the register or memory location.

Figure 4-9 shows the output of the alive2 program.
Figure 4-9

alive2 output

Summary

In this chapter, you learned about the following:
  • Additional GDB functionality

  • Function prologue and epilogue

  • Big endian versus little endian

  • Using the C library function printf for printing strings, integers, and floating-point numbers
