You already know about integer arithmetic; now we will introduce some floating-point computations. There is nothing difficult here: a floating-point value has an integer part, a decimal point, and a fractional part. There are two kinds of floating-point numbers: single precision and double precision. Double precision is more accurate because it can store more significant digits. With that information, you know enough to run and analyze the sample program in this chapter.
Single vs. Double Precision
For those more curious, here is the story. Floating-point numbers are stored in the IEEE 754 format, which splits a value into three fields: a sign bit, an exponent, and a fraction (significand). Single precision uses 32 bits (1 sign bit, 8 exponent bits, 23 fraction bits); double precision uses 64 bits (1 sign bit, 11 exponent bits, 52 fraction bits).
The sign bit is simple. When the number is positive, it is 0; when the number is negative, the sign bit is 1.
The exponent comes from writing the number in scientific notation. In decimal:

200 = 2.0 × 10²
5000.30 = 5.0003 × 10³

The same works in binary:

1101010.01011 = 1.0101001011 × 2⁶ (we moved the point six places to the left)
However, the exponent can be positive, negative, or zero. To store all three cases as an unsigned field, a fixed bias is added to the exponent. In single precision the bias is 127, so an exponent of 0 is stored as 127, and our exponent of 6 would be stored as 133. With double-precision values, the bias is 1023.
In the example above, 1.0101001011 is called the significand or mantissa. For a normalized number the first bit of the significand is always 1, so it is not stored; this is sometimes called the hidden bit.
Here is a simple example to show how it works. Use, for example, https://babbage.cs.qc.cuny.edu/IEEE-754/ to verify and experiment:
- Decimal 10 is 1010 as a binary integer.
- The sign bit is 0, because the number is positive.
- Normalize to the form b.bbbb: 1010 = 1.010 × 2³. The significand is 1.010, with the required leading 1; that leading 1 will not be stored.
- The exponent is 3, because we moved the point three places. Add the bias of 127 to obtain 130, which in binary is 10000010.
- Thus, the decimal number 10 will be stored in single precision as:

0 10000010 01000000000000000000000
S EEEEEEEE FFFFFFFFFFFFFFFFFFFFFFF
or 41200000 in hexadecimal.
Note that the hexadecimal representation of the same value differs between single and double precision. Why not always use double precision and benefit from the higher accuracy? Double-precision calculations can be slower than single-precision calculations, and the operands use twice as much memory.
If you think this is complicated, you are right. Find an appropriate tool on the Internet to do or at least verify the conversions.
You may encounter 80-bit floating-point numbers in older programs; these have their own instruction set, the x87 FPU instructions. This functionality is a legacy feature and should not be used in new development, but you will still come across FPU instructions in articles on the Internet from time to time.
Let’s do some interesting things.
Coding with Floating-Point Numbers
fcalc.asm
This is a simple program; in fact, the printing takes more effort than the floating-point calculations.
Use a debugger to step through the program and investigate the registers and memory. Note, for example, how 9.0 and 73.0 are stored in memory addresses number1 and number2; these are the double-precision floating-point values.
Remember that when debugging in SASM, the xmm registers are at the bottom of the register window, in the leftmost part of the ymm registers.
movsd means "move a double-precision floating-point value." There is also movss for single precision. Similarly, there are addss, subss, mulss, divss, and sqrtss instructions.
Now that you know about the stack, try this: comment out push rbp at the beginning and pop rbp at the end. Make and run the program and see what happens: a crash! The cause of the crash will become clear later; it has to do with stack alignment.
Summary
- The basic use of xmm registers for floating-point calculations
- The difference between single precision and double precision
- The instructions movsd, addsd, subsd, mulsd, divsd, and sqrtsd