Floating-point formats |
---|
IEEE 754 |
|
Other |
Alternatives |
Tapered floating point |
Computer architecture bit widths |
---|
Bit |
Application |
Binary floating-point precision |
Decimal floating-point precision |
In computing, quadruple precision (or quad precision) is a binary floating-point–based computer number format that occupies 16 bytes (128 bits) with precision at least twice the 53-bit double precision.
This 128-bit quadruple precision is designed not only for applications requiring results in higher than double precision,[1] but also, as a primary function, to allow the computation of double precision results more reliably and accurately by minimising overflow and round-off errors in intermediate calculations and scratch variables. William Kahan, primary architect of the original IEEE 754 floating-point standard noted, "For now the 10-byte Extended format is a tolerable compromise between the value of extra-precise arithmetic and the price of implementing it to run fast; very soon two more bytes of precision will become tolerable, and ultimately a 16-byte format ... That kind of gradual evolution towards wider precision was already in view when IEEE Standard 754 for Floating-Point Arithmetic was framed."[2]
In IEEE 754-2008 the 128-bit base-2 format is officially referred to as binary128.