© Stephen Smith 2020
S. Smith, Programming with 64-Bit ARM Assembly Language, https://doi.org/10.1007/978-1-4842-5881-1_2

2. Loading and Adding

Stephen Smith, Gibsons, BC, Canada

In this chapter, we will go slowly through the MOV and ADD instructions to lay the groundwork on how they work, especially in the way they handle parameters (operands), so that, in the following chapters, we can proceed at a faster pace as we encounter the rest of the ARM instruction set.

Before getting into the MOV and ADD instructions, we will discuss the representation of negative numbers and the concepts of shifting and rotating bits.

Negative Numbers

In the previous chapter, we discussed how computers represent positive integers as binary numbers, called unsigned integers, but what about negative numbers? Our first thought might be to make one bit represent whether the number is positive or negative. This is simple, but it turns out it requires extra logic to implement, since now the CPU must look at the sign bits, then decide whether to add or subtract and in which order.

It turns out there is a simple representation of negative numbers that works without any special cases or special logic; it is called two’s complement.

About Two’s Complement

The great mathematician John von Neumann, of the Manhattan Project, came up with the idea of the two’s complement representation for negative numbers in 1945, when working on the Electronic Discrete Variable Automatic Computer (EDVAC), one of the earliest electronic computers.

Two’s complement came about by observing how addition overflows. Consider a 1-byte hexadecimal number like 01. If we add 0xFF (all binary ones) to it, we get 0x100:
0x01 + 0xFF = 0x100

However, if we are limited to 1-byte numbers, then the 1 is lost and we are left with 00:
0x01 + 0xFF = 0x00
The mathematical definition of a number’s negative is a number that when added to it makes zero; therefore, mathematically, FF is -1. You can get the two’s complement form for any number by taking
2^N - number
where N is the number of bits in our integer. In our example, the two’s complement of 1 is
2^8 - 1 = 256 - 1 = 255 = 0xFF
This is why it’s called two’s complement. An easier way to calculate the two’s complement is to change all the 1s to 0s and all the 0s to 1s and then add 1. If we do that to 1, we get
0xFE + 1 = 0xFF
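
Both ways of computing the two’s complement can be checked with a few lines of Python. This is only an illustrative sketch: the function names twos_complement and invert_and_add_one are made up for this example, and Python is used here purely as a calculator for the bit patterns.

```python
def twos_complement(value: int, bits: int = 8) -> int:
    """Bit pattern for -value in a `bits`-wide integer: 2^N - number."""
    return (2**bits - value) % 2**bits


def invert_and_add_one(value: int, bits: int = 8) -> int:
    """Same pattern via the shortcut: flip every bit, then add 1."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask


print(hex(twos_complement(1)))     # 0xff
print(hex(invert_and_add_one(1)))  # 0xff
print(hex(invert_and_add_one(3)))  # 0xfd
```

The two functions agree for every 1-byte value, which is why the shortcut can be used in place of the formula.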

Two’s complement is an interesting mathematical oddity of fixed-size integers, whose maximum value is one less than a power of two (which is true of all computer representations of integers).

Why would we want to represent negative integers this way on computers? As it turns out, this makes addition simple for the computer to execute. Adding signed integers is the same as adding unsigned integers. There are no special cases; all you do is discard the overflow, and everything works out. This means less circuitry is required to perform the addition, and as a result, it can be performed faster. Consider
5 + -3

3 in 1 byte is 0x03 or 0000 0011.

Inverting the bits is
1111 1100
Add 1 to get
1111 1101 = 0xFD
Now add
0x05 + 0xFD = 0x102

which is 0x02 = 2 once the overflow is discarded, since we are limited to 1 byte or 8 bits.
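
The same 1-byte addition can be modeled in Python. This is a sketch under the assumption that masking with 0xFF stands in for the CPU discarding the overflow; add8 and as_signed8 are hypothetical helper names.

```python
def add8(a: int, b: int) -> int:
    """Add two 1-byte patterns and discard anything above 8 bits."""
    return (a + b) & 0xFF


def as_signed8(pattern: int) -> int:
    """Interpret a 1-byte pattern as a two's complement signed value."""
    return pattern - 0x100 if pattern & 0x80 else pattern


minus_three = (-3) & 0xFF                # 0xFD, the pattern computed above
print(hex(minus_three))                  # 0xfd
print(add8(5, minus_three))              # 2
print(as_signed8(add8(3, (-5) & 0xFF)))  # -2
```

Note that add8 never looks at the sign; the same unsigned addition gives the right answer for signed values.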

Performing these computations by hand is educational, but in practice a tool to do this is handy.

About Gnome Programmer’s Calculator

Fortunately, we have computers to do the conversions and arithmetic for us, but when we see signed numbers in memory, we need to recognize what they are. The Gnome programmer’s calculator can calculate two’s complement for you. Figure 2-1 shows the Gnome calculator representing -3.

Note

The Gnome programmer’s calculator uses 64-bit representations.

Figure 2-1

The Gnome programmer’s calculator calculating the two’s complement of 3

Two’s complement is the standard representation of negative integers; however, just reversing all the bits does have its uses.

About One’s Complement

If we don’t add 1, and just change all the 1s to 0s and vice versa, then this is called one’s complement. There are uses for the one’s complement form, and we will encounter it in how some instructions process their operands.

Now let’s return to the order the bytes that make up an integer are stored in memory.

Big vs. Little Endian

At the end of Chapter 1, “Getting Started,” we saw that the words of our compiled program had their bytes stored in the reverse order to what we might expect they should be stored as. In fact, if we look at a 32-bit representation of 1 stored in memory, it is
01 00 00 00
rather than
00 00 00 01
Most processors pick one format or the other to store numbers. Motorola processors and IBM mainframes use what is called big endian, where numbers are stored in order from the most significant byte to the least significant byte, in this case
00 00 00 01
Intel processors use little-endian format and store the numbers in reverse order with the least significant byte first, namely:
01 00 00 00
Figure 2-2 shows how the bytes in integers are copied into memory in both little- and big-endian formats. Notice how the bytes end up in the reverse order to each other.
Figure 2-2

How integers are stored in memory in little- vs. big-endian format
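
To see both byte orders without a debugger, Python’s built-in int.to_bytes can lay out an integer either way. This is a small sketch; stored_bytes is a hypothetical helper name.

```python
def stored_bytes(value: int, order: str, size: int = 4) -> str:
    """Return the bytes of `value` in memory order, as space-separated hex."""
    raw = value.to_bytes(size, byteorder=order)
    return " ".join(f"{b:02x}" for b in raw)


print(stored_bytes(1, "little"))           # 01 00 00 00
print(stored_bytes(1, "big"))              # 00 00 00 01
print(stored_bytes(0x12345678, "little"))  # 78 56 34 12
```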

The designers of the ARM processor didn’t want to take sides in the little- vs. big-endian debate, so they made the ARM processor support both.

About Bi-endian

The ARM CPU is called bi-endian, because it can do either. Most ARM-based computers use little-endian format. This includes all the systems we’ll cover in this book.

Now let’s look at why most ARM-based computers use little endian rather than big endian.

Pros of Little Endian

The advantage of little-endian format is that it makes it easy to change the size of integers, without requiring any address arithmetic. If you want to convert a 4-byte integer to a 1-byte integer, you take the first byte, assuming the integer is in the range of 0–255 so that the other three bytes are zero. For example, if memory contains the 4 bytes or word for 1, in little endian, the memory contains
01 00 00 00

If we want the 1-byte representation of this number, we take the first byte; for the 16-bit representation, we take the first two bytes. The key point is that the memory address we use is the same in all cases, saving us an instruction cycle adjusting it.
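
The following Python sketch models this with a 4-byte buffer standing in for memory. The helper read_uint and the use of a list index as an “address” are assumptions made for illustration only.

```python
def read_uint(mem: bytes, address: int, size: int, order: str = "little") -> int:
    """Read a `size`-byte unsigned integer starting at `address`."""
    return int.from_bytes(mem[address:address + size], order)


little_mem = (1).to_bytes(4, "little")  # 01 00 00 00
big_mem = (1).to_bytes(4, "big")        # 00 00 00 01

# Little endian: the same address works for every width.
print([read_uint(little_mem, 0, size) for size in (1, 2, 4)])  # [1, 1, 1]

# Big endian: the 1-byte value lives at address 3, not address 0.
print(read_uint(big_mem, 0, 1, "big"), read_uint(big_mem, 3, 1, "big"))  # 0 1
```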

When we are in the debugger, we will see more representations, and these will be pointed out again as we run into them.

Note

Even though Linux uses little endian, many protocols like TCP/IP used on the Internet use big endian and so require a transformation when moving data from the computer to the outside world.
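
As a rough illustration of that transformation, Python’s standard struct module can pack the same value in little-endian order ("<") and in network order ("!", which is big endian). The value 0x12345678 is just an example.

```python
import struct

value = 0x12345678

host_bytes = struct.pack("<I", value)     # little endian, as stored in memory
network_bytes = struct.pack("!I", value)  # network (big-endian) byte order

print(host_bytes.hex())     # 78563412
print(network_bytes.hex())  # 12345678

# Unpacking with the matching format recovers the original value.
received = struct.unpack("!I", network_bytes)[0]
print(hex(received))        # 0x12345678
```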

We’ve looked at how integers are represented and how addition works. It turns out that another useful simple manipulation is shifting the bits right or left and rotating them around inside a register.

Shifting and Rotating

We have 31 64-bit registers, and much of programming consists of manipulating the bits in these registers. Two extremely useful bit manipulations are shifting and rotating. Mathematically, shifting all the bits left one spot is the same as multiplying by 2, and generally shifting left by n bits is equivalent to multiplying by 2^n. Conversely, shifting bits to the right by n bits is equivalent to dividing by 2^n. For example, consider shifting the number 3 left by 4 bits:
0000 0011   (the binary representation of the number 3)
Shift the bits left by 4 bits and we get
0011 0000
which is
0x30 = 3 * 16 = 3 * 2^4

Now if we shift 0x30 right by 4 bits, we undo what we just did and see how it is equivalent to dividing by 16.
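
The equivalence is easy to confirm with Python’s shift operators, shown here as a quick check rather than as ARM code:

```python
value = 3

shifted = value << 4      # shift left by 4 bits
print(hex(shifted))       # 0x30
print(shifted == value * 2**4)   # True

restored = shifted >> 4   # shift right by 4 bits undoes it
print(restored)           # 3
print(restored == shifted // 2**4)  # True
```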

When we shift and rotate, it turns out to be useful to include the carry flag. This means we can do conditional logic based on the last bit shifted out of the register.

About Carry Flag

When instructions execute, they can optionally set some flags that contain useful information on what happened. Then other instructions can test these flags and process accordingly. One of these is the carry flag. This is normally used when performing addition of larger numbers. If you add two 64-bit numbers and the result is larger than 64 bits, the carry flag is set. We’ll see how to use this when we look at addition in detail later in this chapter.

Let’s look at how shifting is implemented in an ARM processor.

About the Barrel Shifter

The ARM processor has circuitry for shifting, called a barrel shifter. There are instructions to access this directly, which we will cover. But more often shifting can be incorporated into other instructions like the MOVK instruction. The reason for this is that the barrel shifter is outside the arithmetic logic unit (ALU); instead it’s part of the circuitry that loads the second operand to an instruction. We’ll see this in action when we cover Operand2 for the MOV instruction. Figure 2-3 shows the location of the barrel shifter in relation to the ALU.
Figure 2-3

The location of the barrel shifter to perform shifts as part of loading Operand2

Let’s get into the details of shifting and rotating.

Basics of Shifting and Rotating

We have four cases to cover, as follows:
  • Logical shift left

  • Logical shift right

  • Arithmetic shift right

  • Rotate right

Logical Shift Left

This is quite straightforward; as we shift the bits left by the indicated number of places, zeros come in from the right. The last bit shifted out ends up in the carry flag.

Logical Shift Right

This is just as easy as logical shift left: we shift the bits right, zeros come in from the left, and the last bit shifted out ends up in the carry flag.

Arithmetic Shift Right

The problem with logical shift right is that if the value is a negative number, having a zero come in from the left suddenly turns the number positive. If we want to preserve the sign bit, we use arithmetic shift right. Here a 1 comes in from the left if the number is negative, and a 0 if it is positive. This is the correct form if you are shifting signed integers.

Rotate Right

Rotating is like shifting, except the bits don’t go off the end; instead they wrap around and reappear from the other side. So, rotate right shifts right, but the bits that leave on the right reappear on the left.
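
The four operations can be modeled on a 64-bit value in Python. This is a sketch of the resulting register contents only (it does not model any flags), and the function names simply borrow the ARM mnemonics; shift amounts are assumed to be in the range 0 to 63.

```python
MASK64 = (1 << 64) - 1


def lsl(x: int, n: int) -> int:
    """Logical shift left: zeros come in from the right."""
    return (x << n) & MASK64


def lsr(x: int, n: int) -> int:
    """Logical shift right: zeros come in from the left."""
    return (x & MASK64) >> n


def asr(x: int, n: int) -> int:
    """Arithmetic shift right: copies of the sign bit come in from the left."""
    x &= MASK64
    if x >> 63:  # negative, so fill the vacated bits with ones
        return (x >> n) | ((MASK64 << (64 - n)) & MASK64)
    return x >> n


def ror(x: int, n: int) -> int:
    """Rotate right: bits leaving on the right reappear on the left."""
    x &= MASK64
    return ((x >> n) | (x << (64 - n))) & MASK64


minus_eight = (-8) & MASK64
print(hex(lsl(3, 4)))            # 0x30
print(hex(lsr(minus_eight, 1)))  # 0x7ffffffffffffffc (no longer negative)
print(hex(asr(minus_eight, 1)))  # 0xfffffffffffffffc (-4)
print(hex(ror(1, 1)))            # 0x8000000000000000
```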

That concludes the theory part of the chapter; now we return to writing Assembly Language code by going into the details of loading values into the registers.

Loading Registers

In this section, we look at various ways to load registers with values contained in instructions or other registers. We’ll look at loading registers from memory in Chapter 5, “Thanks for the Memories.”

First, the ARM engineers worked hard to minimize the number of instructions required, and we’ll look at another technique they used to accomplish this.

Instruction Aliases

In Chapter 1, “Getting Started,” in our Hello World sample program, we used the MOV instruction to load the values we needed into registers. However, MOV isn’t an ARM Assembly instruction; it’s an alias. You’re telling the Assembler what you want to do; then the Assembler finds a real ARM instruction to do the job. If it can’t find an instruction to do what you specified, then you get an error.

Consider
ADD X0, XZR, X1
This instruction adds the contents of register X1 to the zero register and puts the result in X0. This essentially moves X1 to X0. Thus, we don’t need an instruction:
MOV X0, X1

(MOV X0, X1 actually translates to ORR X0, XZR, X1, and we’ll talk about the ORR instruction in Chapter 4, “Controlling Program Flow,” but the idea is the same.)

Remember that with ARM instructions being only 32 bits, we can’t waste any of them. Hence the ARM designers were careful to avoid redundancy. It would’ve been a waste of valuable bits to have such a MOV instruction.

Using all these tricks directly would make programs unreadable and put a lot of pressure on programmers to know all the clever tricks the ARM designers used to reduce the number of real instructions in the processor. The solution is to have the GNU Assembler know all these tricks and do the translations for you.

In this book, we use instruction aliases to make our programs readable, but point out when they’re used to help understand what’s going on. If you use objdump, it might show the same alias you used, another alternate alias, or the real instruction. There is a “-M no-aliases” option for objdump where you can see the true underlying instruction.

Let’s get into the details and forms of the MOV instruction to load the registers.

MOV/MOVK/MOVN

In this section, we look at several forms of the MOV instruction:
  1. MOVK XD, #imm16{, LSL #shift}
  2. MOV XD, #imm16{, LSL #shift}
  3. MOV XD, XS
  4. MOV XD, operand2
  5. MOVN XD, operand2

We’ve seen examples of MOV, when putting a small number into a register. Here the immediate value can be any 16-bit quantity, and it will be placed in the lower 16 bits of the specified register unless an optional shift component is included. The shift can only be one of four values: 0, 16, 32, or 48. The shift value allows us to put our 16-bit value in any of the four quarters of the 64-bit register.

We’ve listed the registers as X 64-bit registers here. But all these instructions can take W 32-bit registers. Remember that these are the same registers; you are just dealing with half of the register rather than the full register.

The first form is the move keep (MOVK) instruction.

About MOVK

The MOVK instruction answers our question of how to load the full 64 bits of a register. MOVK, the move keep instruction, loads the 16-bit immediate operand into one of four positions in the register without disturbing the other 48 bits. Suppose we want to load register X2 with the 64-bit hex value 0x1234FEDC4F5D6E3A. We could use
MOV    X2, #0x6E3A
MOVK   X2, #0x4F5D, LSL #16
MOVK   X2, #0xFEDC, LSL #32
MOVK   X2, #0x1234, LSL #48

Only four instructions are required, so not too painful, but a bit annoying.

This is our first example of adding a shift operator to the second operand. This saves us valuable instructions, since we don’t need to load the value and then shift it in a separate instruction and then combine it with the desired register in a third instruction.

The first MOV instruction is an alias and assembled as a MOVZ instruction, identical to the MOVK instruction, except it zeros the other 48 bits rather than keeping them. We could’ve used four MOVK instructions, but I like to start with a MOV instruction to guarantee we’ve initialized all the bits.
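
The effect of this MOV/MOVK sequence on the register can be traced with a small Python model. The functions movz and movk below are hypothetical stand-ins that imitate what the instructions do to a 64-bit value; they are not an emulator.

```python
MASK64 = (1 << 64) - 1


def movz(imm16: int, shift: int = 0) -> int:
    """Place imm16 at `shift` and zero the other 48 bits."""
    return (imm16 & 0xFFFF) << shift


def movk(reg: int, imm16: int, shift: int = 0) -> int:
    """Replace the 16 bits at `shift`, keeping the other 48 bits."""
    keep = reg & ~(0xFFFF << shift) & MASK64
    return keep | ((imm16 & 0xFFFF) << shift)


x2 = movz(0x6E3A)            # MOV  X2, #0x6E3A
x2 = movk(x2, 0x4F5D, 16)    # MOVK X2, #0x4F5D, LSL #16
x2 = movk(x2, 0xFEDC, 32)    # MOVK X2, #0xFEDC, LSL #32
x2 = movk(x2, 0x1234, 48)    # MOVK X2, #0x1234, LSL #48
print(hex(x2))               # 0x1234fedc4f5d6e3a
```

Starting with movz matters in the model for the same reason it does in the real code: it guarantees that no stale bits from an earlier value survive.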

Register to Register MOV

In the third form of the MOV instruction, we have a version that moves one register into another. For example:
MOV   X1, X2

copies register X2 into register X1.

For the remaining two forms of the MOV instruction, we need to study what is allowed as the second operand.

About Operand2

All the ARM’s data processing instructions have the option of taking a flexible Operand2 as one of their parameters. At this point, it won’t be clear why you want some of this functionality, but as we encounter more instructions, and start to build small programs, we’ll see how they help us. At the bit level, there is a lot of complexity here, but the people who designed the Assembler did a good job of providing syntax to hide a lot of this from us. Still, when doing Assembly programming, it’s good to always know what is going on under the covers.

There are three formats for Operand2:
  1. A register and a shift
  2. A register and an extension operation
  3. A small number and a shift

Due to the low number of bits for each instruction, the size of each component can differ. In the preceding MOVK case, the immediate is 16 bits and the shift is 2 bits. Rather than make the shift be 0, 1, 2, or 3 positions, instead these four values map to 0, 16, 32, or 48 bits. The possible values represent what the ARM designers felt were the most common use cases.
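
To make the bit budget concrete, here is a Python sketch that packs the fields of a 64-bit MOVZ or MOVK into a 32-bit instruction word. The field positions (2-bit shift selector at bits 21–22, 16-bit immediate at bits 5–20, 5-bit register number at bits 0–4) and the two base opcodes are my reading of the ARM architecture manual, and the names are made up for this example; the results agree with the encodings shown in the disassembly in Listing 2-2.

```python
MOVZ_X = 0xD2800000  # base opcode bits for 64-bit MOVZ
MOVK_X = 0xF2800000  # base opcode bits for 64-bit MOVK


def encode_mov_wide(base: int, rd: int, imm16: int, shift: int = 0) -> int:
    """Pack register, immediate, and shift into a 32-bit instruction word."""
    if shift not in (0, 16, 32, 48):
        raise ValueError("shift must be 0, 16, 32, or 48")
    hw = shift // 16  # the 2-bit field: 0, 1, 2, 3 means 0, 16, 32, 48
    return base | (hw << 21) | ((imm16 & 0xFFFF) << 5) | (rd & 0x1F)


print(f"{encode_mov_wide(MOVZ_X, 2, 0x6E3A):08x}")      # d28dc742
print(f"{encode_mov_wide(MOVK_X, 2, 0x4F5D, 16):08x}")  # f2a9eba2
print(f"{encode_mov_wide(MOVK_X, 2, 0x1234, 48):08x}")  # f2e24682
```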

Register and Shift

First of all, you can specify a register and a shift. For this, you specify a register, which takes 5 bits, and then a shift amount, which takes 6 bits (enough for any shift from 0 to 63 bits). For example:
MOV   X1, X2, LSL #1    // Logical shift left
is how we specify take X2, logically shift it left by 1 bit, and put the result in X1. We can then handle the other shift and rotate scenarios we mentioned previously with
MOV   X1, X2, LSR #1    // Logical shift right
MOV   X1, X2, ASR #1    // Arithmetic shift right
MOV   X1, X2, ROR #1    // Rotate right
Since shifting and rotating are quite common, the Assembler provides mnemonics (aliases) for these, so you can specify
LSL   X1, X2, #1    // Logical shift left
LSR   X1, X2, #1    // Logical shift right
ASR   X1, X2, #1    // Arithmetic shift right
ROR   X1, X2, #1    // Rotate right

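These produce the same results as the MOV forms, although, as the disassembly in Listing 2-2 shows, the Assembler translates them to different underlying instructions. The intent is that it makes the code a little more readable, since it is clear you’re doing a shift or rotate operation and not just loading a register.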

Register and Extension

The extension operations let us extract a byte, halfword, or word from the second register. You can then either zero extend or sign extend the extracted value. Further you can shift this value left by 0–4 bits before it is used. The extension operations are listed in Table 2-1.
Table 2-1

Extension operators

Extension Operator    Description
uxtb                  Unsigned extend byte
uxth                  Unsigned extend halfword
uxtw                  Unsigned extend word
sxtb                  Sign-extend byte
sxth                  Sign-extend halfword
sxtw                  Sign-extend word

If you are using the 32-bit W registers, then you would only use the byte and halfword variants of this.

The extension operators aren’t available for the MOV instruction, but we’ll see them shortly with the ADD instruction.
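
What each operator in Table 2-1 does to a value can be sketched in Python. The extend function and its table are illustrative names, the result is modeled as a 64-bit pattern, and the optional left shift of 0–4 bits is applied after the extension.

```python
MASK64 = (1 << 64) - 1

# Operator name -> (number of bits extracted, whether to sign-extend)
EXTENSIONS = {
    "uxtb": (8, False), "uxth": (16, False), "uxtw": (32, False),
    "sxtb": (8, True), "sxth": (16, True), "sxtw": (32, True),
}


def extend(op: str, value: int, shift: int = 0) -> int:
    """Extract the low bits of `value`, extend to 64 bits, then shift left."""
    bits, signed = EXTENSIONS[op]
    low = value & ((1 << bits) - 1)
    if signed and low >> (bits - 1):  # top extracted bit set: negative
        low -= 1 << bits
    return (low << shift) & MASK64


print(hex(extend("uxtb", 0x1234FD)))     # 0xfd
print(hex(extend("sxtb", 0x1234FD)))     # 0xfffffffffffffffd (-3)
print(hex(extend("uxth", 0x6E3A, 2)))    # 0x1b8e8 (0x6E3A * 4)
```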

Small Number and Shift

The other form of operand2 consists of a small number and an optional shift amount. We saw this used with the preceding MOVK instruction. The size of this small number varies by instruction, and if a shift is allowed, there will be limited values. You can check the ARM Instruction Reference manual for the valid values for each instruction.

Fortunately, we don’t need to figure this all out. We just specify a number and the Assembler figures out how to represent it. Since there are only limited bits, not all 64-bit numbers can be represented, so if you specify something that can’t be dealt with, then the Assembler gives you an error message. You then need to use MOVK instructions as outlined previously.

MOV has the advantage that it can take an #imm16 operand, which can usually get us out of trouble. However, other instructions that must specify a third register, like the ADD instruction, don’t have this luxury.

Frequently, programmers deal with small integers like loop indexes, say to loop from 1 to 10. These simple cases are handled easily, and we don’t need to be concerned. Some larger values work too. For example,
// Too big for #imm16
     MOV    X1, #0xAB000000
will be translated by the Assembler to
MOV   x1, #0xAB00, LSL #16
for us, saving us figuring out the instruction complexities.
// Too big for #imm16 and can't be represented.
     MOV    X1, #0xABCDEF11
This instruction gives the error
Error: immediate cannot be moved by a single instruction

when you run your program through the Assembler. This means the Assembler tried all its tricks and failed to represent the number. To load this, you need to use multiple MOV/MOVK instructions.

MOVN

This is the Move Not instruction. It works just like MOV, except it reverses all the 1s and 0s as it loads the register. This means it loads the register with the one’s complement form of what you specified. Another way to say it is that it applies a logical NOT operation to each bit in the word you are loading into the register.

MOVN is a distinct opcode, and not an alias for another instruction with cryptic parameters. The ARM 64-bit instruction set has a limited number of opcodes, so this is an important instruction with three main uses:
  1. To calculate the one’s complement of something for you. This has its uses, but does it warrant its own opcode?
  2. Multiply by -1. We saw that with the shift operations, we can multiply or divide by powers of 2. This instruction gets us halfway to multiplying by -1. Remember that the negative of a number is the two’s complement of the number, or the one’s complement plus one. This means we can multiply by -1 by doing this instruction and then adding one. Why would we do this rather than use the multiply (MUL) instruction? The same applies for shifting: why do that rather than using MUL? The answer is that the MUL instruction is quite slow and can take quite a few clock cycles to do its work. Shifting only takes one cycle, and using MOVN and ADD, we can multiply by -1 in only two clock cycles. Multiplying by -1 is very common, and now we can do it quickly.
  3. You get twice the number of values due to the extra bit: 17 vs. 16. It turns out that all the numbers obtained by using a 16-bit value and shift are different for MOVN and MOV. This means that if the Assembler sees that the number you specified can’t be represented in a MOV instruction, then it tries to change it to a MOVN instruction and vice versa. So, you really have 17 bits of immediate data, rather than 16.
Note

It still might not be able to represent your number, and you may still need to use multiple MOVK instructions.
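
The search the Assembler performs can be approximated in Python. This sketch only checks the two forms discussed here, a shifted 16-bit value (MOVZ) and its bitwise inverse (MOVN); a real assembler may also try other encodings, which are deliberately left out. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def movz_form(value: int):
    """Return (imm16, shift) if value is one 16-bit chunk with zeros elsewhere."""
    for shift in (0, 16, 32, 48):
        if (value & ~(0xFFFF << shift) & MASK64) == 0:
            return (value >> shift, shift)
    return None


def movn_form(value: int):
    """Return (imm16, shift) if the inverse of value fits the MOVZ pattern."""
    return movz_form(~value & MASK64)


print(movz_form(0xAB000000))          # (43776, 16), that is #0xAB00, LSL #16
print(movn_form(0xFFFFFFFFFFFFFFFE))  # (1, 0), that is MOVN with #1
print(movz_form(0xABCDEF11), movn_form(0xABCDEF11))  # None None
```

When both functions return None, as for 0xABCDEF11, the value has to be built up with a MOV followed by MOVK instructions.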

MOV Examples

In this section, we will write a short program to exercise a selection of the MOV instructions. Create a file called
movexamps.s
containing Listing 2-1.
//
// Examples of the MOV instruction.
//
.global _start    // Provide program starting address
// Load X2 with 0x1234FEDC4F5D6E3A first using MOV and MOVK
_start: MOV     X2, #0x6E3A
     MOVK  X2, #0x4F5D, LSL #16
     MOVK  X2, #0xFEDC, LSL #32
     MOVK  X2, #0x1234, LSL #48
// Just move W2 into W1
     MOV   W1, W2
// Now let's see all the shift versions of MOV
     MOV   X1, X2, LSL #1   // Logical shift left
     MOV   X1, X2, LSR #1   // Logical shift right
     MOV   X1, X2, ASR #1   // Arithmetic shift right
     MOV   X1, X2, ROR #1   // Rotate right
// Repeat the above shifts using mnemonics.
     LSL   X1, X2, #1   // Logical shift left
     LSR   X1, X2, #1   // Logical shift right
     ASR   X1, X2, #1   // Arithmetic shift right
     ROR   X1, X2, #1   // Rotate right
// Example that works with 16-bit immediate and shift
     MOV   X1, #0xAB000000  // Too big for #imm16
// Example that can't be represented and results in an error
// Uncomment the instruction if you want to see the error
//   MOV   X1, #0xABCDEF11    // Too big for #imm16 and can't be represented.
// Example of MOVN
     MOVN  W1, #45
// Example of a MOV that the Assembler will change to MOVN
     MOV  W1, #0xFFFFFFFE   // (-2)
// Setup the parameters to exit the program
// and then call Linux to do it.
      MOV     X0, #0   // Use 0 return code
      MOV     X8, #93  // Service command code 93 terminates
      SVC     0        // Call linux to terminate
Listing 2-1

MOV examples

You can assemble and link this program with the following commands:
as -o movexamps.o movexamps.s
ld -o movexamps movexamps.o

You can run the program after building it.

Note

This program doesn’t do anything besides move various numbers into registers.

We will look at how to see what is going on in Chapter 3, “Tooling Up,” when we cover the GNU Debugger (GDB).

If we disassemble the program using
objdump -s -d -M no-aliases movexamps.o
we get Listing 2-2.
Disassembly of section .text:
0000000000000000 <_start>:
   0: d28dc742   movz   x2, #0x6e3a
   4: f2a9eba2   movk   x2, #0x4f5d, lsl #16
   8: f2dfdb82   movk   x2, #0xfedc, lsl #32
   c: f2e24682   movk   x2, #0x1234, lsl #48
  10: 2a0203e1   orr    w1, wzr, w2
  14: aa0207e1   orr    x1, xzr, x2, lsl #1
  18: aa4207e1   orr    x1, xzr, x2, lsr #1
  1c: aa8207e1   orr    x1, xzr, x2, asr #1
  20: aac207e1   orr    x1, xzr, x2, ror #1
  24: d37ff841   ubfm   x1, x2, #63, #62
  28: d341fc41   ubfm   x1, x2, #1, #63
  2c: 9341fc41   sbfm   x1, x2, #1, #63
  30: 93c20441   extr   x1, x2, x2, #1
  34: d2b56001   movz   x1, #0xab00, lsl #16
  38: 128005a1   movn   w1, #0x2d
  3c: 12800021   movn   w1, #0x1
  40: d2800000   movz   x0, #0x0
  44: d2800ba8   movz   x8, #0x5d
  48: d4000001   svc    #0x0
Listing 2-2

Disassembly of the MOV examples

Here we can see the true ARM 64-bit instructions that are produced by the Assembler. We’ve talked about how MOV instructions can be converted into ORR or MOVZ instructions.

We see the shift instructions were converted into UBFM, SBFM, and EXTR instructions. These are the underlying shift and rotate instructions. These instructions have more functionality than the aliases we are using, but we won’t need that advanced functionality and will stick with the straightforward alias versions.

Now that we’ve loaded numbers into our registers, let’s perform some arithmetic on them.

ADD/ADC

We can now put any value we like in a register, so let’s start doing some computing. Let’s start with addition. The instructions we will cover are
  1. ADD{S} Xd, Xs, Operand2
  2. ADC{S} Xd, Xs, Operand2

These instructions all add their second and third parameters and put the result in their first parameter, the destination register (Rd). We already know about operand2. The registers Rd and source register (Rs) can be the same. Let’s look at some examples of the forms of operand2:
// the immediate value can be 12-bits, so 0-4095
// X2 = X1 + 4000
   ADD   X2, X1, #4000
// the shift on an immediate can be 0 or 12
// X2 = X1 + 0x20000
   ADD   X2, X1, #0x20, LSL #12
// simple addition of two registers
// X2 = X1 + X0
   ADD   X2, X1, X0
// addition of a register with a shifted register
// X2 = X1 + (X0 * 4)
   ADD   X2, X1, X0, LSL #2
// With register extension options; the extended
// register is written as W0, the lower half of X0
// X2 = X1 + sign-extended byte(W0)
   ADD   X2, X1, W0, SXTB
// X2 = X1 + zero-extended halfword(W0) * 4
   ADD   X2, X1, W0, UXTH #2
We haven’t developed the code to print out a number yet, as we must first convert the number to an ASCII string. We will get to this after we cover loops and conditional statements. In the meantime, we can get one number from our program via the program’s return code. This is a 1-byte unsigned integer. Let’s look at an example of multiplying a number by -1 and see the output. Listing 2-3 is the code to do this.
//
// Examples of the ADD/MOVN instructions.
//
.global _start   // Provide program starting address
// Multiply 2 by -1 by using MOVN and then adding 1
_start:    MOVN  W0, #2
           ADD   W0, W0, #1
// Setup the parameters to exit the program
// and then call Linux to do it.
// W0 is the return code and will be what we
// calculated above.
        MOV     X8, #93   // Service command code 93
        SVC     0         // Call linux to terminate
Listing 2-3

An example of MOVN and ADD

Here we use the MOVN instruction to calculate the one’s complement of our number, in this case 2; then we add 1 to get the two’s complement form. We use W0 since this will be the return code returned via the Linux terminate command. To see the return code, type
echo $?

After running the program, it prints out 254. If you examine the bits, you will see this is the two’s complement form for -2 in 1 byte.
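
The path from the two instructions to the printed 254 can be traced in Python. This is a model of the arithmetic only, with hypothetical helper names; the last step reflects that the return code is a 1-byte unsigned integer.

```python
MASK32 = 0xFFFFFFFF


def movn32(imm16: int) -> int:
    """Model MOVN on a W register: load the one's complement of imm16."""
    return ~imm16 & MASK32


def add32(a: int, b: int) -> int:
    """Model ADD on W registers: add and keep the low 32 bits."""
    return (a + b) & MASK32


w0 = movn32(2)        # MOVN W0, #2      -> 0xFFFFFFFD
w0 = add32(w0, 1)     # ADD  W0, W0, #1  -> 0xFFFFFFFE, which is -2
exit_status = w0 & 0xFF   # only the low byte survives as the return code

print(hex(w0))        # 0xfffffffe
print(exit_status)    # 254
```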

With the ARM processor, we can combine multiple ADD instructions to add arbitrarily large integers. The key to this is the carry flag.

Add with Carry

The new concepts in this section are what the {S} after the instruction means along with why we have both ADD and ADC. This will be our first use of a condition flag.

Think back to how we learned to add numbers:
   17
 + 78
 ----
   95

  1. We first add 7 + 8 and get 15.
  2. We put 5 in our sum and carry the 1 to the tens column.
  3. Now we add 1 + 7 + the carry from the ones column, so we add 1+7+1 and get 9 for the tens column.

This is the idea behind the carry flag. When an addition overflows, it sets the carry flag, so we can include that in the sum of the next part.

Note

A carry is always 0 or 1, so we only need a 1-bit flag for this.

The ARM processor adds 64 bits at a time, so we only need the carry flag if we are dealing with numbers larger than what will fit into 64 bits. This means we can easily add 128-bit or even larger integers.

In Chapter 1, “Getting Started,” we quickly mentioned that bit 29 in the instruction format specifies whether an instruction alters the condition flags. So far, we haven’t set that bit, so none of the instructions we’ve written so far will alter any condition flags. If we want an instruction to alter them, then we place an “S” on the end of the opcode, and the Assembler will set bit 29 when it builds the binary version of the instruction. This applies only to instructions that have a flag-setting form, such as ADD, ADC, SUB, and SBC; the 64-bit MOV instructions we just looked at don’t have one.
ADDS    X0, X0, #1
is just like
ADD X0, X0, #1

except that it sets various condition flags. We’ll cover all the flags when we cover conditional statements in Chapter 4, “Controlling Program Flow.” For now, we are interested in the carry flag that is designated C. If the result of an addition is too large, then the C flag is set to 1; otherwise it is set to 0.

To add two 128-bit integers, we use two registers to hold each number. In our example, we’ll use registers X2 and X3 for the first number, X4 and X5 for the second, and then X0 and X1 for the result. The code would then be
ADDS   X1, X3, X5    // Lower order 64-bits
ADC    X0, X2, X4    // Higher order 64-bits

The first ADDS adds the lower order 64 bits and sets the carry flag, if needed. It might set other flags, but we’ll worry about those later. The second instruction, ADC, adds the higher-order words, plus the carry flag.

The nice thing here is that in 64-bit mode, we can do a 128-bit addition in only two clock cycles. Let’s look at a simple complete example in Listing 2-4.
//
// Example of 128-Bit addition with the ADD/ADC instructions.
//
.global _start    // Provide program starting address
// Load the registers with some data
// First 128-bit number is 0x0000000000000003FFFFFFFFFFFFFFFF
_start: MOV  X2, #0x0000000000000003
        MOV  X3, #0xFFFFFFFFFFFFFFFF //Assem will change to MOVN
// Second 128-bit number is 0x00000000000000050000000000000001
        MOV  X4, #0x0000000000000005
        MOV  X5, #0x0000000000000001
        ADDS X1, X3, X5 // Lower order 64-bits
        ADC  X0, X2, X4 // Higher order 64-bits
// Setup the parameters to exit the program
// and then call Linux to do it.
// W0 is the return code and will be what we
// calculated above.
        MOV     X8, #93    // Service command code 93 terminates
        SVC     0          // Call linux to terminate the program
Listing 2-4

Example of 128-bit addition with ADD and ADC

Here we are adding
  0000000000000003 FFFFFFFFFFFFFFFF
+ 0000000000000005 0000000000000001
  ---------------------------------
  0000000000000009 0000000000000000
We’ve rigged this example to demonstrate the carry flag, and to produce an answer we can see in the return code. The largest 64-bit unsigned integer is
0xFFFFFFFFFFFFFFFF
and adding 1 results in
0x10000000000000000
which doesn’t fit in 64 bits, so we get
0x0000000000000000
with a carry. The high-order words add 3 + 5 + carry to yield 9. The high-order word is in X0, so it is the return code when the program exits. If we type
echo $?

we get 9 as expected.
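
The same addition can be modeled in Python, where the carry is returned explicitly instead of living in a flag. The functions adds and adc are hypothetical stand-ins for the instructions, and the final check uses Python’s arbitrary-size integers to confirm the 128-bit result.

```python
MASK64 = (1 << 64) - 1


def adds(a: int, b: int):
    """Model ADDS: return the low 64 bits of a + b and the carry (0 or 1)."""
    total = a + b
    return total & MASK64, total >> 64


def adc(a: int, b: int, carry: int) -> int:
    """Model ADC: add two 64-bit values plus the incoming carry."""
    return (a + b + carry) & MASK64


x2, x3 = 0x3, 0xFFFFFFFFFFFFFFFF   # first number, high and low halves
x4, x5 = 0x5, 0x1                  # second number, high and low halves

x1, carry = adds(x3, x5)   # ADDS X1, X3, X5
x0 = adc(x2, x4, carry)    # ADC  X0, X2, X4

print(hex(x1), carry, x0)  # 0x0 1 9
print(((x2 << 64) | x3) + ((x4 << 64) | x5) == (x0 << 64) | x1)  # True
```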

Learning about MOV was difficult, because this was the first time we encountered both shifting and Operand2. With these behind us, learning about ADD was much easier. We still have some complicated topics to cover, but as we become more experienced with how to manipulate bits and bytes, the learning should become easier.

Covering addition wouldn’t be complete without covering its inverse: subtraction.

SUB/SBC

Subtraction is the inverse of addition. We have
  1. SUB{S} Xd, Xs, Operand2
  2. SBC{S} Xd, Xs, Operand2

The operands are the same as those for addition, only now we are calculating Xs – Operand2. The carry flag is used to indicate when a borrow is necessary. SUBS clears the carry flag if a borrow was needed (the first operand, treated as unsigned, is smaller than the second) and sets it otherwise; SBC then subtracts an extra one if the carry flag is clear.
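
A Python model makes the inverted sense of the carry flag easier to see. As before, subs and sbc are hypothetical stand-ins, and the carry is returned as a value; the example undoes the 128-bit addition from Listing 2-4.

```python
MASK64 = (1 << 64) - 1


def subs(a: int, b: int):
    """Model SUBS: return a - b (64 bits) and carry, where 1 means no borrow."""
    carry = 1 if a >= b else 0
    return (a - b) & MASK64, carry


def sbc(a: int, b: int, carry: int) -> int:
    """Model SBC: subtract b, and one more if the carry is clear."""
    return (a - b - (1 - carry)) & MASK64


# (9, 0) minus (5, 1) as high and low 64-bit halves
low, carry = subs(0x0, 0x1)   # SUBS on the low halves; needs a borrow
high = sbc(0x9, 0x5, carry)   # SBC on the high halves

print(hex(low), carry, high)  # 0xffffffffffffffff 0 3
```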

Summary

In this chapter, we learned how negative integers are represented in a computer. We went on to discuss big- vs. little-endian byte ordering. We then looked at the concept of shifting and rotating the bits in a register.

Next, we looked in detail at the MOV instruction that allows us to move data around the CPU registers or load constants from the MOV instruction into a register. We discovered the tricks of operand2 on how ARM represents a large range of values, given the limited number of bits it has at its disposal.

We covered the ADD and ADC instructions and discussed how to add both 64- and 128-bit numbers. Finally, we quickly covered the SUB and SBC instructions.

In Chapter 3, “Tooling Up,” we will look at better ways to build our programs and start debugging our programs with the GNU Debugger (gdb).

Exercises

  1. Compute the 8-bit two’s complement for -79 and -23.
  2. What are the negative decimal numbers represented by the bytes 0xF2 and 0x83?
  3. Write out the bytes in the little-endian representation of 0x12345678.
  4. Write out the bytes for 0x23 shifted left by 3 bits.
  5. Write out the bytes for 0x4300 shifted right by 5 bits.
  6. Write a program to add two 192-bit numbers. You will need to use the ADCS instruction for this. Remember that ADC, like ADD, has a flag-setting form.
  7. Write a program that performs 128-bit subtraction. Convince yourself that the way it sets and interprets the carry flag is what you need in this situation. Use it to reverse the operations from the preceding 128-bit example.