Half precision floating-point format
In computing, half precision is a binary floating-point computer number format that occupies 16 bits (two bytes in modern computers) in computer memory.

In IEEE 754-2008 the 16-bit base 2 format is officially referred to as binary16. It is intended for storage (of many floating-point values where higher precision need not be stored), not for performing arithmetic computations.

Half-precision floating point is a relatively new binary floating-point format. It was created concurrently by Nvidia and Industrial Light & Magic (ILM). Nvidia defined the half datatype in the Cg language, released in early 2002, and was the first to implement 16-bit floating point in silicon, with the GeForce FX, released in late 2002. ILM was searching for an image format that could handle a wide dynamic range, but without the hard drive and memory cost of the floating-point representations commonly used for floating-point computation (single and double precision).

This format is used in several computer graphics environments including OpenEXR, OpenGL, Cg, and D3DX. The advantage over 8-bit or 16-bit binary integers is that the increased dynamic range allows for more detail to be preserved in highlights and shadows for images. The advantage over 32-bit single-precision binary formats is that it requires half the storage and bandwidth (at the expense of precision).

IEEE 754 half-precision binary floating-point format: binary16

The IEEE 754 standard specifies a binary16 as having:
  • Sign bit: 1 bit
  • Exponent width: 5 bits
  • Significand precision: 11 bits (10 explicitly stored)

The format is assumed to have an implicit lead bit with value 1 unless the exponent field is stored with all zeros. Thus only 10 bits of the significand appear in the memory format, but the total precision is 11 bits. In IEEE 754 parlance, there are 10 bits of significand, but there are 11 bits of significand precision (log10(2^11) ≈ 3.311 decimal digits). The bits are laid out as follows, from most significant to least significant:

  • Bit 15: sign
  • Bits 14 to 10: exponent
  • Bits 9 to 0: significand (stored fraction)
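The field boundaries (sign in bit 15, exponent in bits 14 to 10, stored significand in bits 9 to 0) can be illustrated with a minimal Python sketch. The function name split_binary16 is a hypothetical helper chosen for this illustration, not part of any standard library.

```python
def split_binary16(bits: int) -> tuple:
    """Split a 16-bit pattern into (sign, biased exponent, stored significand)."""
    if not 0 <= bits <= 0xFFFF:
        raise ValueError("expected a 16-bit unsigned integer")
    sign = bits >> 15                # bit 15
    exponent = (bits >> 10) & 0x1F   # bits 14..10, still biased
    fraction = bits & 0x3FF          # bits 9..0; the implicit lead bit is not stored
    return sign, exponent, fraction


print(split_binary16(0x3C00))  # (0, 15, 0)
print(split_binary16(0xC000))  # (1, 16, 0)
```

The exponent returned here is the stored (biased) field, so it still needs the offset described in the next section before it can be used as a power of two.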


Exponent encoding

The half-precision binary floating-point exponent is encoded using an offset-binary representation, with the zero offset being 15; this offset is known as the exponent bias in the IEEE 754 standard.
  • Emin = 01h−0Fh = −14
  • Emax = 1Eh−0Fh = 15
  • Exponent bias = 0Fh = 15


Thus, as defined by the offset binary representation, in order to get the true exponent the offset of 15 has to be subtracted from the stored exponent.

The stored exponents 00h and 1Fh are interpreted specially.
Exponent        Significand zero    Significand non-zero       Equation
00h             zero, −0            subnormal numbers          (−1)^signbit × 2^−14 × 0.significandbits_2
01h, ..., 1Eh   normalized value    normalized value           (−1)^signbit × 2^(exponent−15) × 1.significandbits_2
1Fh             ±infinity           NaN (quiet, signalling)
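The three cases above (exponent 00h, 01h to 1Eh, and 1Fh) translate fairly directly into code. The following is a sketch only: decode_binary16 and reference are hypothetical names, and the cross-check assumes Python 3.6 or later, where the standard struct module accepts the "e" (binary16) format.

```python
import math
import struct


def decode_binary16(bits: int) -> float:
    """Decode a 16-bit pattern by following the three exponent cases."""
    sign = -1.0 if bits >> 15 else 1.0
    exponent = (bits >> 10) & 0x1F
    fraction = bits & 0x3FF
    if exponent == 0x00:
        # Zero or subnormal: no implicit lead bit, fixed exponent of -14.
        return sign * 2.0 ** -14 * (fraction / 1024)
    if exponent == 0x1F:
        # Infinity when the significand is zero, NaN otherwise.
        return sign * math.inf if fraction == 0 else math.nan
    # Normalized: implicit lead bit of 1, exponent bias of 15.
    return sign * 2.0 ** (exponent - 15) * (1 + fraction / 1024)


def reference(bits: int) -> float:
    """Decode the same pattern with the standard library, for comparison."""
    return struct.unpack(">e", bits.to_bytes(2, "big"))[0]


# Compare the two decoders over every possible 16-bit pattern.
mismatches = 0
for bits in range(0x10000):
    ours, theirs = decode_binary16(bits), reference(bits)
    if not ((math.isnan(ours) and math.isnan(theirs)) or ours == theirs):
        mismatches += 1
print("mismatches:", mismatches)
```

The sketch does not distinguish quiet from signalling NaNs; both decode to a plain Python NaN.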


The minimum strictly positive (subnormal) value is 2^−24 ≈ 5.96 × 10^−8.
The minimum positive normal value is 2^−14 ≈ 6.10 × 10^−5.
The maximum representable value is (2 − 2^−10) × 2^15 = 65504.
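These three limits can be checked by evaluating the formulas and packing the results as binary16. This is a small sketch that assumes Python's struct module with the "e" format (Python 3.6 or later); to_bits is a hypothetical helper name.

```python
import struct


def to_bits(value: float) -> int:
    """Return the binary16 bit pattern that struct produces for value."""
    return int.from_bytes(struct.pack(">e", value), "big")


smallest_subnormal = 2.0 ** -24
smallest_normal = 2.0 ** -14
largest = (2 - 2.0 ** -10) * 2.0 ** 15

print(largest)                            # 65504.0
print(hex(to_bits(smallest_subnormal)))   # 0x1
print(hex(to_bits(smallest_normal)))      # 0x400
print(hex(to_bits(largest)))              # 0x7bff
```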

Half precision examples

These examples are given in bit representation, in hexadecimal, of the floating-point value. This includes the sign, (biased) exponent, and significand.

3c00 = 1
c000 = −2

7bff = 6.5504 × 10^4 (max half precision)

0400 = 2^−14 ≈ 6.10352 × 10^−5 (minimum positive normal)

0001 = 2^−24 ≈ 5.96046 × 10^−8 (minimum strictly positive subnormal)

0000 = 0
8000 = −0

7c00 = infinity
fc00 = −infinity

3555 ≈ 0.33325... ≈ 1/3

By default, 1/3 rounds down, as it does for double precision, because of the odd number of bits in the significand: the bits beyond the rounding point are 0101..., which is less than 1/2 of a unit in the last place.
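The example bit patterns and the rounding of 1/3 can be reproduced with a short Python sketch. It assumes the struct module's "e" format (Python 3.6 or later), which rounds to the nearest binary16 value when packing; from_hex is a hypothetical helper name.

```python
import struct
from fractions import Fraction


def from_hex(text: str) -> float:
    """Interpret four hexadecimal digits as a big-endian binary16 value."""
    (value,) = struct.unpack(">e", bytes.fromhex(text))
    return value


for text in ["3c00", "c000", "7bff", "0400", "0001",
             "0000", "8000", "7c00", "fc00", "3555"]:
    print(text, "=", from_hex(text))

# Rounding of 1/3: pack the double closest to 1/3 and inspect what was stored.
packed = struct.pack(">e", 1 / 3)
stored = Fraction(from_hex(packed.hex()))
ulp = Fraction(1, 4096)                  # spacing of binary16 values in [1/4, 1/2)

print(packed.hex())                      # 3555
print(stored)                            # 1365/4096
print((Fraction(1, 3) - stored) / ulp)   # 1/3, i.e. less than 1/2 ulp
```

Using exact fractions shows that the stored value lies below 1/3 by one third of a unit in the last place, which is why round-to-nearest rounds down here.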

Precision limitations on integer values

  • Integers between 0 and 2048 can be exactly represented
  • Integers between 2049 and 4096 round to the nearest multiple of 2 (even number)
  • Integers between 4097 and 8192 round to the nearest multiple of 4
  • Integers between 8193 and 16384 round to the nearest multiple of 8
  • Integers between 16385 and 32768 round to the nearest multiple of 16
  • Integers between 32769 and 65519 round to the nearest multiple of 32
  • Integers equal to or above 65520 round to infinity
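The spacing of representable integers can be observed by storing an integer as binary16 and reading it back. The sketch below assumes Python's struct "e" format (Python 3.6 or later), which rounds to the nearest representable value with ties going to the even significand; round_trip is a hypothetical helper name.

```python
import struct


def round_trip(n: int) -> float:
    """Store n as binary16 and read it back."""
    (value,) = struct.unpack(">e", struct.pack(">e", float(n)))
    return value


# Every integer up to 2048 survives unchanged.
print(all(round_trip(n) == n for n in range(2049)))  # True

# Above 2048 only multiples of 2 remain, above 4096 only multiples of 4.
print(round_trip(2049))  # 2048.0 (tie, resolved toward the even significand)
print(round_trip(2051))  # 2052.0 (tie, resolved toward the even significand)
print(round_trip(4097))  # 4096.0
print(round_trip(4099))  # 4100.0
```

Values near the top of the range are deliberately left out of the sketch, since struct reports an error for inputs that are too large for binary16 rather than returning infinity.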


See also

  • IEEE Standard for Floating-Point Arithmetic (IEEE 754)
  • ISO/IEC 10967, Language Independent Arithmetic
  • Primitive data type
  • RGBE image format
  • minifloat



The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 