CHAPTER 5
Fixed-Point vs. Floating-Point
One important feature that distinguishes different processors is whether their CPUs perform
fixed-point or floating-point arithmetic. In a fixed-point processor, numbers are represented and
manipulated in integer format. In a floating-point processor, in addition to integer arithmetic,
floating-point arithmetic can be handled. This means that numbers are represented by the
combination of a mantissa (or a fractional part) and an exponent part, and the CPU possesses the
necessary hardware for manipulating both of these parts. As a result, in general, floating-point
operations involve more logic elements (larger ALU) and more cycles (more time) to manipulate
floating-point values.
In a fixed-point processor, one needs to be concerned with the dynamic range of numbers,
since a much narrower range of numbers can be represented in integer format as compared to
floating-point format. For most applications, such a concern can be virtually ignored when using
a floating-point processor. Consequently, fixed-point processors usually demand more coding
effort than do floating-point processors.
5.1 Q-FORMAT NUMBER REPRESENTATION
The decimal value of a 2’s-complement number $B = b_{N-1} b_{N-2} \ldots b_1 b_0$, $b_i \in \{0, 1\}$, is given by
$$D(B) = -b_{N-1} 2^{N-1} + b_{N-2} 2^{N-2} + \cdots + b_1 2^{1} + b_0 2^{0}. \tag{5.1}$$
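As a quick check of (5.1), the following Python sketch evaluates the weighted sum for a bit string written with the most significant bit first. The function name `twos_complement_value` and the string input format are assumptions made for illustration only.

```python
def twos_complement_value(bits: str) -> int:
    """Evaluate (5.1) for a bit string given MSB first, e.g. '1101'."""
    n = len(bits)
    # The MSB b_(N-1) carries the negative weight -2^(N-1).
    value = -int(bits[0]) * 2 ** (n - 1)
    # The remaining bits b_(N-2) ... b_0 carry positive weights.
    for k, bit in enumerate(bits[1:], start=1):
        value += int(bit) * 2 ** (n - 1 - k)
    return value


print(twos_complement_value("0111"))  # 7
print(twos_complement_value("1000"))  # -8
print(twos_complement_value("1101"))  # -3
```

With N = 4, the pattern 1101 evaluates to -8 + 4 + 0 + 1 = -3, which shows that only the leading bit contributes a negative term.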
The 2’s-complement representation allows a processor to perform integer addition and
subtraction by using the same hardware. When using unsigned integer representation, the sign bit is
treated as an extra magnitude bit. Only nonnegative numbers can be represented this way.
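To illustrate why a single adder suffices, the sketch below models an N-bit adder as addition modulo 2^N and performs subtraction by adding the 2’s complement of the second operand. The 16-bit width and the helper names (`add16`, `sub16`, `to_signed16`) are assumptions chosen for this example and do not describe any particular processor.

```python
N = 16
MASK = (1 << N) - 1  # keeps only the low N bits, like an N-bit register


def add16(a: int, b: int) -> int:
    """Model an N-bit adder: add the bit patterns and drop the carry-out."""
    return (a + b) & MASK


def sub16(a: int, b: int) -> int:
    """Subtract by adding the 2's complement of b (invert the bits, add 1)."""
    return add16(a, add16(~b & MASK, 1))


def to_signed16(pattern: int) -> int:
    """Interpret an N-bit pattern as a 2's-complement integer."""
    return pattern - (1 << N) if pattern & (1 << (N - 1)) else pattern


minus_three = (-3) & MASK                   # bit pattern 0xFFFD
print(to_signed16(add16(5, minus_three)))   # 2
print(to_signed16(sub16(5, 7)))             # -2
```

The adder itself never inspects the sign. Whether a result is read as signed or unsigned is purely a matter of how the bit pattern is interpreted afterwards.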
There is a limitation to the dynamic range of the foregoing integer representation scheme.
For example, in a 16-bit system it is not possible to represent numbers larger than
$+2^{15} - 1 = 32{,}767$ or smaller than $-2^{15} = -32{,}768$. To cope with this limitation,
numbers are normalized between $-1$ and $1$. In other words, they are represented as fractions.
This normalization is achieved by the programmer moving the implied or imaginary binary point
(note that there is no physical memory allocated to this point), as indicated in Figure 5.1.
This way, the fractional value is given by
$$F(B) = -b_{N-1} 2^{0} + b_{N-2} 2^{-1} + \cdots + b_1 2^{-(N-2)} + b_0 2^{-(N-1)}. \tag{5.2}$$
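Comparing (5.2) with (5.1), every weight is divided by 2^(N-1), so the fractional value equals the 2’s-complement integer value scaled by 2^-(N-1). The sketch below applies this scaling to 16-bit words. The function names and the rounding and saturation choices in `float_to_frac16` are assumptions for illustration, not a prescribed conversion.

```python
N = 16
SCALE = 1 << (N - 1)  # 2^(N-1) = 32768


def frac16_to_float(word: int) -> float:
    """Fractional value of a 16-bit 2's-complement integer, per (5.2)."""
    return word / SCALE


def float_to_frac16(x: float) -> int:
    """Quantize x to a 16-bit word: scale, round, then saturate."""
    word = round(x * SCALE)
    return max(-SCALE, min(SCALE - 1, word))


print(frac16_to_float(16384))    # 0.5
print(frac16_to_float(-32768))   # -1.0
print(float_to_frac16(0.25))     # 8192
print(float_to_frac16(1.0))      # 32767 (saturated; +1 is not representable)
```

The representable range is therefore -1 up to 1 - 2^-15. The value +1 itself falls just outside it, which is why the sketch saturates instead of wrapping around.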
This representation scheme is referred to as Q-format or fractional representation. The
programmer needs to keep track of the implied binary point when manipulating Q-format num-