In current computers, bits are the smallest piece of information; a bit can have a value of 1 or 0. In this chapter, we will investigate how bits are combined to represent data, such as integers or floating-point values. The decimal representation of values, which is so intuitive to humans, is not ideal for computers to work with. When you have a binary system, with only two possible values (1 or 0), it is much more efficient to work with powers of 2. Historically, computer generations went from 8-bit CPUs (2^3) to 16-bit CPUs (2^4) and 32-bit CPUs (2^5), and currently most CPUs are 64-bit (2^6). However, for humans, dealing with long strings of 1s and 0s is impractical or even impossible. In this chapter, we will show how to convert bits into decimal or hexadecimal values that we can more easily work with. After that, we will discuss registers, data storage areas that assist the processor in executing logical and arithmetic instructions.
A Short Course on Binary Numbers
Computers use binary digits (0s and 1s) to do the work. Eight binary digits grouped together are called a byte. However, binary numbers are too long for humans to work with, let alone to remember. Hexadecimal numbers are (only slightly) more user-friendly, not least because every 8-bit byte can be represented by only two hexadecimal digits.
Decimal | Hexadecimal | Binary |
---|---|---|
0 | 0 | 0000 |
1 | 1 | 0001 |
2 | 2 | 0010 |
3 | 3 | 0011 |
4 | 4 | 0100 |
5 | 5 | 0101 |
6 | 6 | 0110 |
7 | 7 | 0111 |
8 | 8 | 1000 |
9 | 9 | 1001 |
10 | a | 1010 |
11 | b | 1011 |
12 | c | 1100 |
13 | d | 1101 |
14 | e | 1110 |
15 | f | 1111 |
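The table above can be reproduced, and extended to whole bytes, with a few lines of Python. This is only a sketch for checking conversions by hand; the helper name `show` and its `bits` parameter are our own illustrative choices and do not come from any assembler tool.

```python
def show(n: int, bits: int = 8) -> str:
    """Return decimal, hexadecimal, and binary text for a non-negative integer."""
    hex_digits = bits // 4  # one hexadecimal digit covers 4 bits
    return f"{n} | 0x{n:0{hex_digits}x} | {n:0{bits}b}"

print(show(10))        # 10 | 0x0a | 00001010
print(show(255))       # 255 | 0xff | 11111111

# Converting back from text to an integer
print(int("1010", 2))  # 10
print(int("ff", 16))   # 255
```

Note how a full byte (8 bits) always fits in exactly two hexadecimal digits.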
Integers
To obtain the binary representation of a negative integer, follow these steps:
1. Write the binary of the absolute value.
2. Take the complement (change all the 1s to 0s and the 0s to 1s).
3. Add 1.
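The three steps can be sketched in Python as follows. The function name `twos_complement` and the 16-bit default width are assumptions made for this illustration; a real register can of course be 8, 32, or 64 bits wide.

```python
def twos_complement(value: int, bits: int = 16) -> int:
    """Return the bit pattern (as an unsigned integer) of a negative value."""
    mask = (1 << bits) - 1          # e.g. 0xffff for 16 bits
    pattern = abs(value)            # step 1: binary of the absolute value
    pattern = ~pattern & mask       # step 2: flip every bit within the width
    pattern = (pattern + 1) & mask  # step 3: add 1
    return pattern

print(f"{twos_complement(-17):016b}")  # 1111111111101111
print(hex(twos_complement(-17)))       # 0xffef
```

The mask is needed because Python integers have no fixed width, whereas a register does.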
Hexadecimal numbers are normally preceded with 0x in order to distinguish them from decimal numbers, so -17 in hexadecimal is 0xffef. If you investigate a machine language listing, a .lst file, and you see the number 0xffef, you have to find out from the context if it is a signed or unsigned integer. If it is a signed integer, it means -17 in decimal. If it is an unsigned integer, it means 65519. Of course, if it is a memory address, it is unsigned (you get that, right?). Sometimes you will see other notations in assembler code, such as 0800h, which is also a hexadecimal number; 10010111b, a binary number; or 420o, an octal number. Yes, indeed, octal numbers can also be used. We will use octal numbers when we write our code for file I/O. If you need to convert integer numbers, don’t sweat it; use the previously mentioned websites.
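If you prefer not to rely on a website, the signed and unsigned readings of a bit pattern can be checked with a small Python sketch. The helper `as_signed` and its 16-bit default are hypothetical names chosen for this example; the `int(text, base)` calls stand in for the `h`, `b`, and `o` suffixes, which Python itself does not understand.

```python
def as_signed(pattern: int, bits: int = 16) -> int:
    """Interpret an unsigned bit pattern as a signed (two's complement) value."""
    sign_bit = 1 << (bits - 1)
    if pattern & sign_bit:             # most significant bit set: negative
        return pattern - (1 << bits)
    return pattern

word = 0xFFEF
print(word)             # 65519 (unsigned reading)
print(as_signed(word))  # -17   (signed reading)

# The other notations, converted to decimal
print(int("0800", 16))      # 0800h      -> 2048
print(int("10010111", 2))   # 10010111b  -> 151
print(int("420", 8))        # 420o       -> 272
```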
Floating-Point Numbers
Again, if you need to convert floating-point numbers, use the previously mentioned websites; we will not go into further detail here.
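As an offline alternative to a website, Python's standard `struct` module can show the bit pattern of a floating-point value. This sketch assumes the IEEE 754 single-precision and double-precision formats used by x86-64 processors; the helper names `float_bits` and `double_bits` are our own.

```python
import struct

def float_bits(x: float) -> str:
    """Return the 32-bit single-precision pattern of x as hexadecimal text."""
    return struct.pack(">f", x).hex()

def double_bits(x: float) -> str:
    """Return the 64-bit double-precision pattern of x as hexadecimal text."""
    return struct.pack(">d", x).hex()

print(float_bits(1.0))    # 3f800000
print(float_bits(-2.0))   # c0000000
print(double_bits(1.0))   # 3ff0000000000000
```

The `>` in the format string only fixes the byte order of the printed text so that the most significant byte comes first.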
A Short Course on Registers
The CPU, the brain of the computer, executes the program instructions by making extensive use of the registers and memory of the computer, doing mathematical and logical operations on these registers and memory. Therefore, it is important to have a basic knowledge of registers and memory and how they are used. Here we give a short overview of the registers; more details about the usage of registers will become clear in later chapters. Registers are storage locations, used by the CPU to store data, instructions, or memory addresses. There are only a small number of registers, but the CPU can read and write them extremely quickly. You can consider registers as sort of a scratchpad for the processor to store temporary information. One rule to keep in mind if speed is important is that the CPU can access registers much faster than it can access memory.
Do not worry if this section is above your head; things will start making sense when we use registers in the upcoming chapters.
General-Purpose Registers
64-bit | 32-bit | 16-bit | low 8-bit | high 8-bit | comment |
---|---|---|---|---|---|
rax | eax | ax | al | ah | |
rbx | ebx | bx | bl | bh | |
rcx | ecx | cx | cl | ch | |
rdx | edx | dx | dl | dh | |
rsi | esi | si | sil | - | |
rdi | edi | di | dil | - | |
rbp | ebp | bp | bpl | - | Base pointer |
rsp | esp | sp | spl | - | Stack pointer |
r8 | r8d | r8w | r8b | - | |
r9 | r9d | r9w | r9b | - | |
r10 | r10d | r10w | r10b | - | |
r11 | r11d | r11w | r11b | - | |
r12 | r12d | r12w | r12b | - | |
r13 | r13d | r13w | r13b | - | |
r14 | r14d | r14w | r14b | - | |
r15 | r15d | r15w | r15b | - | |
Although rbp and rsp are called general-purpose registers, they should be handled with care, as they are used by the processor during the program execution. We will use rbp and rsp quite a bit in the more advanced chapters.
As an example, the number 60 in a 64-bit register has the following binary representation: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00111100.
A 32-bit register is the set of the 32 lower (rightmost) bits of a 64-bit register. Similarly, a 16-bit register and an 8-bit register consist of the lowest 16 and lowest 8 bits, respectively, of the 64-bit register.
Remember, the “lower” bits are always the rightmost bits.
Bit number 0 is the rightmost bit; we start counting from the right and start with index 0, not 1. Thus, the leftmost bit of a 64-bit register has index 63, not 64.
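The relation between a 64-bit register and its smaller parts can be modeled with bit masks. The following Python sketch only imitates reading the sub-registers; the variable names mirror the register names in the table, and the helpers `low_bits` and `bit` as well as the example value are made up for illustration.

```python
def low_bits(value: int, n: int) -> int:
    """Keep only the n lowest (rightmost) bits of value."""
    return value & ((1 << n) - 1)

def bit(value: int, index: int) -> int:
    """Return bit number index, counting from 0 at the right."""
    return (value >> index) & 1

rax = 0x1122334455667788   # arbitrary 64-bit example value
eax = low_bits(rax, 32)    # lowest 32 bits
ax = low_bits(rax, 16)     # lowest 16 bits
al = low_bits(rax, 8)      # lowest 8 bits
ah = (rax >> 8) & 0xFF     # bits 8 to 15, the high byte of ax

print(hex(eax), hex(ax), hex(al), hex(ah))  # 0x55667788 0x7788 0x88 0x77

# The number 60 as 64 bits, and the indexes of the bits that are set
print(f"{60:064b}")
print([i for i in range(64) if bit(60, i)])  # [2, 3, 4, 5]
```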
Instruction Pointer Register (rip)
The processor keeps track of the next instruction to be executed by storing the address of the next instruction in rip. You can change the value in rip to whatever you want at your own peril; you have been warned. A safer way of changing the value in rip is by using jump instructions. This will be discussed in a later chapter.
Flag Register
Name | Symbol | Bit | Content |
---|---|---|---|
Carry | CF | 0 | Previous instruction had a carry |
Parity | PF | 2 | Last byte has even number of 1s |
Adjust | AF | 4 | BCD operations |
Zero | ZF | 6 | Previous instruction resulted in zero |
Sign | SF | 7 | Previous instruction resulted in most significant bit equal to 1 |
Direction | DF | 10 | Direction of string operations (increment or decrement) |
Overflow | OF | 11 | Previous instruction resulted in overflow |
We will explain and use flags quite a bit in this book.
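To get a feel for the bit positions, here is a minimal Python sketch that lists which of these flags are set in a given flag register value. The dictionary `FLAG_BITS`, the function `set_flags`, and the sample values are hypothetical; the sign flag is placed at bit 7, its position in the x86-64 flag register.

```python
# Bit position of each flag within the flag register
FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "DF": 10, "OF": 11}

def set_flags(rflags: int) -> list:
    """Return the symbols of the flags whose bit is 1 in rflags."""
    return [name for name, pos in FLAG_BITS.items() if (rflags >> pos) & 1]

print(set_flags(0x246))  # ['PF', 'ZF']
print(set_flags(0x881))  # ['CF', 'SF', 'OF']
```

The value 0x246 also has bits 1 and 9 set; those bits are not among the flags listed in the table, so the sketch ignores them.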
There is another flag register, called MXCSR, that will be used in the single instruction, multiple data (SIMD) instruction chapters; we will explain MXCSR there in more detail.
xmm and ymm Registers
These registers are used for floating-point calculations and SIMD. We will use the xmm and corresponding ymm registers extensively later, starting with the floating-point instructions.
In addition to the previously explained registers, there are more registers, but we will not use the others in this book.
Put the theory aside for now; it’s time for the real work!
Summary
How to display values in decimal, binary, and hexadecimal formats
How to use registers and flags