Insight 6 -- Machine Epsilon


Rounding is a procedure for choosing the representation of a real number in a floating-point number system. The machine epsilon, also called macheps or unit roundoff, is the maximum relative error of the chosen rounding procedure.

Recall that a real number x can be represented as a floating-point number:

x = σ • y • b^e
where
    • σ = ± 1 is the sign
    • e is an integer, the exponent
    • 1 ≤ y < b, the significand or mantissa

When using such a floating-point system, it is important to note that the numbers we can store are not equally spaced, even though a wide range of variably spaced numbers can be represented exactly.
For instance, 2.125 = 2^1 + 2^-3 has an exact representation in binary, whereas 3.1 does not:

3.1 ≈ 2^1 + 2^0 + 2^-4 + 2^-5 + 2^-8 + ...

And, of course, there are numbers (like π) that have no finite representation in either decimal or binary number system.
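A quick way to see this (a sketch in Python, whose float is an IEEE double) is float.hex(), which prints the exact bits that were stored:

    # 2.125 = 2^1 + 2^-3 has a finite binary expansion, so it is stored exactly:
    print((2.125).hex())     # 0x1.1000000000000p+1
    # 3.1 has an infinite binary expansion, so the stored value is rounded:
    print((3.1).hex())       # 0x1.8cccccccccccdp+1  (note the final rounded digit)
    print(0.1 + 0.2 == 0.3)  # False: each term carries its own rounding error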

If we limit the mantissa to four decimal digits and the exponent to the range [-99, 99], we say that a computer with such a representation has four-digit decimal floating-point arithmetic. This implies we cannot accurately store more than the first four digits of a number; even the fourth digit may be changed by rounding.
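Such a machine can be emulated (a sketch using Python's decimal module as a stand-in; the value 123456 is just an illustrative input):

    from decimal import Decimal, getcontext

    getcontext().prec = 4   # keep four significant decimal digits
    x = Decimal("123456")   # six digits -- more than the format can hold
    print(Decimal(1) * x)   # 1.235E+5: only four digits survive, and the
                            # fourth (4) was changed to 5 by rounding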

The IEEE established standards (notably IEEE 754) for floating-point numbers, which are used in almost all computers.

  • IEEE single-precision floating-point representation (4 bytes -- 32 bits):
    • precision of 24 binary digits
    • the exponent e is limited by -126 ≤ e ≤ 127 (in binary, -(1111110)₂ ≤ e ≤ (1111111)₂)

x = σ • (1.y1y2...y23) • 2^e

  • IEEE double-precision floating-point representation (8 bytes -- 64 bits):
    • precision of 53 binary digits
    • the exponent e is limited by -1022 ≤ e ≤ 1023

x = σ • (1.y1y2...y52) • 2^e
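This layout can be inspected directly. The sketch below (standard-library Python only; decompose is a hypothetical helper name) unpacks the 64 bits of a double into the sign, the 11-bit biased exponent, and the 52 fraction bits y1...y52:

    import struct

    def decompose(x):
        # Reinterpret the 8 bytes of a double as one 64-bit unsigned integer.
        bits = struct.unpack(">Q", struct.pack(">d", x))[0]
        sign = bits >> 63                          # 1 sign bit (sigma)
        exponent = ((bits >> 52) & 0x7FF) - 1023   # 11 bits, stored with bias 1023
        fraction = bits & ((1 << 52) - 1)          # 52 bits: y1 ... y52
        return sign, exponent, fraction

    print(decompose(2.125))   # (0, 1, 281474976710656): 2.125 = +(1.0625) • 2^1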

How accurately can a number be stored in a floating-point representation? How can this be measured?

Machine epsilon:
For any format, the machine epsilon is the difference between 1 and the next larger number that can be stored in that format.

In IEEE single precision, the next larger number that can be stored is

1.00000000000000000000001 = 1 + 2^-23

(1 + 2^-24 cannot be stored exactly!)

so the machine epsilon is 2^-23 ≈ 1.19 × 10^-7

i.e. single precision stores approximately 7 significant decimal digits of a number x.
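This can be checked numerically; the sketch below assumes NumPy, since a plain Python float is a double and we need a true 32-bit type:

    import numpy as np

    one = np.float32(1.0)
    print(one + np.float32(2.0**-24) == one)   # True:  1 + 2^-24 rounds back to 1
    print(one + np.float32(2.0**-23) == one)   # False: 1 + 2^-23 is representable
    print(np.finfo(np.float32).eps)            # 1.1920929e-07, i.e. 2^-23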


In double precision IEEE format the machine epsilon is


2^-52 ≈ 2.22 × 10^-16

i.e. the double-precision format stores approximately 16 significant decimal digits of a number x.
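For doubles no extra library is needed; a plain Python sketch recovers the same value with the classic halving loop and compares it to what the interpreter reports:

    import sys

    eps = 1.0
    while 1.0 + eps / 2 != 1.0:    # halve until 1 + eps/2 rounds back to 1
        eps /= 2
    print(eps)                     # 2.220446049250313e-16, i.e. 2^-52
    print(sys.float_info.epsilon)  # the same value, reported by the platform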


Another way to measure the accuracy of a floating-point format:

- look for the largest integer M such that every integer x with 0 ≤ x ≤ M can be stored and represented exactly in floating-point form.

If n is the number of binary digits in the significand, all integers less than or equal to

(1.11...1)₂ • 2^(n-1) = (1 + 2^-1 + 2^-2 + ... + 2^-(n-1)) • 2^(n-1)
= 2^n - 1

are exactly representable; in fact 2^n itself is also stored exactly (as (1.00...0)₂ • 2^n), while 2^n + 1 is not, so M = 2^n.

In IEEE single precision format:

M = 2^24 = 16777216

and all 7-digit decimal integers will store exactly.

In IEEE double precision format:

M = 2^53 ≈ 9.0 × 10^15

and all 15-digit decimal integers and most 16-digit ones will store exactly.
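A quick check of both claims (a sketch; the single-precision line again assumes NumPy):

    import numpy as np

    # Doubles: consecutive integers survive exactly up to M = 2^53 ...
    print(float(2**53 - 1) == 2**53 - 1)     # True: still exact
    print(float(2**53 + 1) == float(2**53))  # True: 2^53 + 1 rounds down to 2^53

    # ... and single precision tells the same story at M = 2^24.
    print(np.float32(2**24 + 1) == np.float32(2**24))  # True: 16777217 is not representable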


