This is a documentation subpage for Template:Floating-point.
It may contain usage information, categories and other content that is not part of the original template page.

Floating-point formats

IEEE 754
16-bit: Half (binary16)
32-bit: Single (binary32), decimal32
64-bit: Double (binary64), decimal64
128-bit: Quadruple (binary128), decimal128
256-bit: Octuple (binary256)
Extended precision
Other
Minifloat
bfloat16
TensorFloat-32
Microsoft Binary Format
IBM floating-point architecture
PMBus Linear-11
G.711 8-bit floats
Alternatives
Arbitrary precision
Tapered floating point
Posit