Floating point numbers

Floating point numbers follow the IEEE 754 standard and represent numbers with a decimal point, such as 3.14, or in exponent notation, such as 4e-14. They come in the types Float16, Float32, and Float64, the last one providing double precision.
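For example, in the REPL (output shown as printed by a typical Julia session):

julia> typeof(3.14)
Float64

julia> 4e-14
4.0e-14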

Single precision is achieved through the Float32 type. Single precision float literals must be written in scientific notation, such as 3.14f0, with f where one would normally use e. That is, 2.5f2 indicates 2.5*10^2 with single precision, while 2.5e2 indicates 2.5*10^2 with double precision. Julia also has a BigFloat type for arbitrary-precision floating point computations.
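A quick REPL check illustrates the literal syntax (the printed output here assumes a standard Julia session):

julia> typeof(2.5f2)
Float32

julia> 2.5f2
250.0f0

julia> typeof(big(2.5))
BigFloat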

A built-in type promotion system lets all the numeric types work together seamlessly, so that no explicit conversion is needed. Special values also exist: Inf and -Inf represent infinity, and NaN represents a "not a number" value, such as the result of 0/0 or Inf - Inf.
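The following REPL session sketches both behaviors, mixing an integer with a Float64 and producing the special values just described:

julia> 1 + 2.5
3.5

julia> 1/0
Inf

julia> 0/0
NaN

julia> Inf - Inf
NaN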

Floating point arithmetic is a frequent source of subtle bugs and counter-intuitive behavior in all programming languages. For instance, note the following:

julia> 0.1 + 0.2
0.30000000000000004

This happens because of the way floating point numbers are stored internally. Most decimal numbers cannot be represented exactly in a finite number of binary digits, just as 1/3 has no finite representation in base 10. The computer chooses the closest number it can represent, introducing a small round-off error. These errors can accumulate over the course of long computations, creating subtle problems.
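A small REPL experiment makes this accumulation visible: adding 0.1 ten times does not produce exactly 1.0 (the final value shown is what a standard Float64 build prints):

julia> x = 0.0;

julia> for i in 1:10
           x += 0.1
       end

julia> x
0.9999999999999999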

Perhaps the most important consequence of this is the need to avoid testing floating point numbers for equality:

julia> 0.1 + 0.2 == 0.3 
false 

A better solution is to use >= or <= comparisons in logical tests that involve floating point numbers, wherever possible.
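As a sketch of this advice, an exact equality test can be replaced by a <= comparison against a small tolerance; the tolerance 1e-10 below is an arbitrary illustrative choice:

julia> abs((0.1 + 0.2) - 0.3) <= 1e-10
true

Julia also provides the built-in isapprox function (the ≈ operator in the REPL) for such tolerance-based comparisons, so that 0.1 + 0.2 ≈ 0.3 evaluates to true.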
