Limited Accuracy of Floating-Point Types

Before we describe double and float, we need to point out two limitations of floating-point arithmetic. People are still getting Ph.D.s for probing the mathematics underlying floats, but the short story is:

  • Floating-point numbers hold only a limited number of significant figures. The float type holds only about six to seven significant decimal digits. So you can hold the number 123,456 exactly, and you can hold the number 0.123456 to within a tiny error, but you cannot hold the number 123,456.123456 accurately in a float variable because that would require 12 significant figures. You can write the number in your program, and you'll get a value that is close to the one you want but not exactly equal to it. (In fact, you'll get 123456.125.)

  • Floating-point numbers may contain tiny inaccuracies that accumulate as you repeat a calculation. Don't expect ten iterations of adding 0.1 to a float variable to make it exactly equal 1.0F!
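Both limitations are easy to see in a few lines of C++. The following is a minimal sketch, assuming a platform where float is the usual IEEE 754 single-precision type; the function names (store_in_float, add_tenths, show_float_limits) are made up for this illustration.

```cpp
#include <cstdio>

// Store a 12-significant-figure literal in a float and return what is kept.
float store_in_float() {
    float f = 123456.123456F;   // more digits than a float can hold
    return f;
}

// Add 0.1F to a float accumulator `count` times and return the total.
float add_tenths(int count) {
    float sum = 0.0F;
    for (int i = 0; i < count; ++i) {
        sum += 0.1F;            // each addition is rounded to the nearest float
    }
    return sum;
}

// Print both results with enough digits to expose the rounding.
void show_float_limits() {
    std::printf("stored: %.6f\n", store_in_float());
    std::printf("sum:    %.9f\n", add_tenths(10));
    std::printf("sum == 1.0F? %s\n", add_tenths(10) == 1.0F ? "yes" : "no");
}
```

Calling show_float_limits() from main on such a platform should report the stored value as 123456.125000 and the sum as roughly 1.000000119, so the comparison with 1.0F comes out "no". A compiler that keeps intermediate results in extra precision could behave differently.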

Floating-point numbers have these limitations in every programming language; they are inherent in the type. You are trying to represent an infinite set of real numbers in a fixed number of bits. The only way to do this is to pick certain points on the real-number continuum, represent those exactly, and use those model values as approximations to all other real numbers. It's as though we could store only tenths: everything from 0.0 up to just under 0.05 becomes 0.0, everything from 0.05 up to just under 0.15 becomes 0.1, and so on. With floats, the steps are far finer (on the order of one part in ten million of the number's size, not tenths), but most numbers still cannot be stored exactly.

With this background about the limitations, let's review the two floating-point types.
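Before turning to the types themselves, the idea of model values can be made concrete by measuring the gap between one float and the next representable float above it. This is a rough sketch, assuming C++11 or later (for std::nextafter in <cmath>) and IEEE 754 single precision; gap_above is a hypothetical helper name, not a standard function.

```cpp
#include <cmath>

// Distance from x to the next representable float above it.
// Assumes x is finite and below the largest float.
float gap_above(float x) {
    return std::nextafter(x, INFINITY) - x;
}
```

Under those assumptions, gap_above(1.0F) is 2^-23 (about 0.00000012), while gap_above(123456.0F) is 2^-7, or 0.0078125. The model values get farther apart as the numbers get larger, which is why 123,456.123456 lands on 123456.125, the nearest multiple of 0.0078125.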
