Appendix A. GL_HALF_FLOAT

GL_HALF_FLOAT is a vertex and texture data type supported by OpenGL ES 3.0. The GL_HALF_FLOAT data type is used to specify 16-bit floating-point values. This can be useful, for example, in specifying vertex attributes such as texture coordinates, normals, binormals, and tangent vectors. Using GL_HALF_FLOAT rather than GL_FLOAT halves the memory bandwidth the GPU needs to read vertex or texture data.
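
To make the bandwidth saving concrete, here is a minimal sketch that compares two vertex layouts holding the same attributes. The struct names, the attribute set, and the helper vertexBufferBytes are assumptions made for illustration only; the sketch also assumes a 4-byte float.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t hfloat;      /* raw 16-bit half-float bits */

/* Hypothetical layout: texture coordinate (2), normal (3), tangent (3). */
typedef struct {
    float texCoord[2];
    float normal[3];
    float tangent[3];
} VertexF32;                  /* 8 components * 4 bytes = 32 bytes */

typedef struct {
    hfloat texCoord[2];
    hfloat normal[3];
    hfloat tangent[3];
} VertexF16;                  /* 8 components * 2 bytes = 16 bytes */

/* Bytes the GPU must fetch to read vertexCount vertices of a given size. */
size_t vertexBufferBytes(size_t vertexCount, size_t vertexSize)
{
    assert(vertexSize > 0);
    return vertexCount * vertexSize;
}
```

With these layouts, a mesh of 1000 vertices occupies 32,000 bytes as floats and 16,000 bytes as half-floats. In an application, the half-float layout would be described to OpenGL ES by passing GL_HALF_FLOAT as the type argument of glVertexAttribPointer.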

One might argue that we can use GL_SHORT or GL_UNSIGNED_SHORT instead of a 16-bit floating-point data type and get the same memory footprint and bandwidth savings. However, with that approach, you will need to scale the data or matrices appropriately and apply a transform in the vertex shader. For example, consider the case where a texture pattern is to be repeated four times horizontally and vertically over a quad. GL_SHORT can be used to store the texture coordinates. The texture coordinates could be stored in a 4.12 or 8.8 fixed-point format. The texture coordinate values stored as GL_SHORT are scaled by (1 << 12) or (1 << 8) to give us a fixed-point representation that uses 4 bits or 8 bits of integer and 12 bits or 8 bits of fraction. Because OpenGL ES does not understand such a format, the vertex shader will then need to apply a matrix to unscale these values, which affects the vertex shading performance. These additional transforms are not required if a 16-bit floating-point format is used. Further, values represented as floating-point numbers have a larger dynamic range than fixed-point values because of the use of an exponent in the representation.
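
The fixed-point alternative can be sketched as follows. The function names and the choice of a signed 4.12 layout are assumptions for illustration; fixed412ToTexCoord simply mirrors on the CPU the unscale step that the vertex shader would otherwise have to perform.

```c
#include <assert.h>
#include <stdint.h>

#define TEXCOORD_FRAC_BITS 12   /* 4.12 format: 12 bits of fraction */

/* Scale a texture coordinate by (1 << 12) and store it as a GL_SHORT value. */
int16_t texCoordToFixed412(float t)
{
    /* A signed 16-bit 4.12 value covers the range [-8.0, 8.0). */
    assert(t >= -8.0f && t < 8.0f);
    return (int16_t)(t * (float)(1 << TEXCOORD_FRAC_BITS));
}

/* Undo the scale: the work a vertex shader must do for this format. */
float fixed412ToTexCoord(int16_t v)
{
    return (float)v / (float)(1 << TEXCOORD_FRAC_BITS);
}
```

For the repeated-pattern quad described above, the coordinate 4.0 is stored as 16384 and must be divided by 4096 again before it can be used for texturing.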


Note

Fixed-point values have a different error metric than floating-point values. The absolute error in a floating-point number is proportional to the magnitude of the value, whereas the absolute error in a fixed-point format is constant. Developers need to be aware of these precision issues when choosing which data type to use when generating coordinates for a particular format.
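
The difference between the two error metrics can be illustrated with a short sketch that computes the spacing between adjacent representable values. The function names are hypothetical, and halfFloatStep handles only the normalized half-float range.

```c
#include <assert.h>
#include <math.h>

/* Spacing between adjacent normalized half-float values near v. */
double halfFloatStep(double v)
{
    int e;
    v = fabs(v);
    /* normalized half-float range only: 2^-14 up to (not including) 2^16 */
    assert(v >= ldexp(1.0, -14) && v < 65536.0);
    frexp(v, &e);                     /* v = f * 2^e with f in [0.5, 1) */
    /* unbiased exponent is e - 1; the mantissa has 10 fraction bits */
    return ldexp(1.0, (e - 1) - 10);
}

/* Spacing of a fixed-point format with fracBits bits of fraction. */
double fixedStep(int fracBits)
{
    return ldexp(1.0, -fracBits);
}
```

Near 0.01 the half-float spacing (2^-17) is finer than the constant 4.12 spacing (2^-12), whereas near 4.0 the half-float spacing (2^-8) is coarser.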


16-Bit Floating-Point Number

Figure A-1 describes the representation of a half-float number. A half-float is a 16-bit floating-point number with 10 bits of mantissa m, 5 bits of exponent e, and a sign bit s.

Figure A-1 A 16-Bit Floating-Point Number

The following rules should be used when interpreting a 16-bit floating-point number:

• If exponent e is between 1 and 30, the half-float value is computed as $(-1)^{s} \times 2^{e-15} \times (1 + m/1024)$.

• If exponent e and mantissa m are both 0, the half-float value is 0.0. The sign bit is used to represent –ve 0.0 or +ve 0.0.

• If exponent e is 0 and mantissa m is not 0, the half-float value is a denormalized number.

• If exponent e is 31, the half-float value is either infinity (+ve or –ve) or a NaN (“not a number”) depending on whether the mantissa m is zero.

A few examples follow:

0      00000      0000000000    = 0.0
0      00000      0000001111    = a denorm value
0      11111      0000000000    = positive infinity
1      11111      0000000000    = negative infinity
0      11111      0000011000    = NaN
1      11111      1111111111    = NaN
0      01111      0000000000    = 1.0
1      01110      0000000000    = −0.5
0      10100      1010101010    = 53.3125
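
The interpretation rules and the examples above can be checked with a small decoder. This is a minimal sketch: the names decodeHalf and checkExamples are hypothetical, and the denormalized case uses the standard IEEE 754 half-precision interpretation, 2^-14 * (m/1024), which the rules above do not spell out.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

typedef uint16_t hfloat;   /* raw 16-bit half-float bits */

/* Decode a half-float bit pattern into a double. */
double decodeHalf(hfloat h)
{
    unsigned s = (h >> 15) & 0x1;    /* sign bit            */
    unsigned e = (h >> 10) & 0x1F;   /* 5 bits of exponent  */
    unsigned m = h & 0x3FF;          /* 10 bits of mantissa */
    double sign = s ? -1.0 : 1.0;

    if (e == 0)                      /* zero or denorm: 2^-14 * (m/1024) */
        return sign * ldexp((double)m, -24);
    if (e == 31)                     /* infinity if m is zero, else NaN  */
        return m ? NAN : sign * INFINITY;
    return sign * ldexp(1.0 + m / 1024.0, (int)e - 15);
}

/* Check the decoder against bit patterns from the list of examples. */
void checkExamples(void)
{
    assert(decodeHalf(0x0000) == 0.0);       /* 0 00000 0000000000 */
    assert(decodeHalf(0x3C00) == 1.0);       /* 0 01111 0000000000 */
    assert(decodeHalf(0xB800) == -0.5);      /* 1 01110 0000000000 */
    assert(decodeHalf(0x52AA) == 53.3125);   /* 0 10100 1010101010 */
}
```

The denorm example 0 00000 0000001111 decodes to 15 * 2^-24 under this interpretation.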

OpenGL ES 3.0 implementations must be able to accept input half-float data values that are infinity, NaN, or denormalized numbers. They do not have to support 16-bit floating-point arithmetic operations with these values. Most implementations will convert denormalized numbers and NaN values to zero.
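
An application that wants predictable results across implementations could flush such values itself before uploading the data. The following is a sketch under stated assumptions: the function names are hypothetical, and the choices to keep the sign of a flushed denorm and to map NaN to positive zero are assumptions, not behavior required by OpenGL ES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t hfloat;   /* raw 16-bit half-float bits */

/* Flush denormalized numbers and NaN values to zero. */
hfloat flushHalfToZero(hfloat h)
{
    unsigned e = h & 0x7C00;   /* exponent bits */
    unsigned m = h & 0x03FF;   /* mantissa bits */

    if (e == 0 && m != 0)
        return (hfloat)(h & 0x8000);   /* denorm: keep sign (assumption) */
    if (e == 0x7C00 && m != 0)
        return 0;                      /* NaN: +0.0 (assumption)         */
    return h;                          /* zero, normal, or infinity      */
}

/* Apply the flush in place to a buffer of half-float values. */
void flushHalfBuffer(hfloat *data, size_t count)
{
    size_t i;
    assert(data != NULL || count == 0);
    for (i = 0; i < count; i++)
        data[i] = flushHalfToZero(data[i]);
}
```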

Converting a Float to a Half-Float

The following routines describe how to convert a single-precision floating-point number to a half-float value, and vice versa. The conversion routines are useful when vertex attributes are generated using single-precision floating-point calculations but then converted to half-floats before they are used as vertex attributes:

#include <string.h>   // memcpy

// -15 stored using a single-precision bias of 127
const unsigned int  HALF_FLOAT_MIN_BIASED_EXP_AS_SINGLE_FP_EXP = 0x38000000;
// max exponent value in single precision that will be converted
// to Inf or NaN when stored as a half-float
const unsigned int  HALF_FLOAT_MAX_BIASED_EXP_AS_SINGLE_FP_EXP = 0x47800000;

// 255 is the max exponent biased value
const unsigned int  FLOAT_MAX_BIASED_EXP = (0xFF << 23);

const unsigned int  HALF_FLOAT_MAX_BIASED_EXP = (0x1F << 10);

typedef unsigned short   hfloat;

hfloat
convertFloatToHFloat(float *f)
{
   unsigned int   x;
   unsigned int   sign;
   unsigned int   mantissa;
   unsigned int   exp;
   hfloat         hf;

   // copy the bit pattern of the float into an integer
   memcpy(&x, f, sizeof(x));
   sign = x >> 31;

   // get mantissa
   mantissa = x & ((1 << 23) - 1);
   // get exponent bits
   exp = x & FLOAT_MAX_BIASED_EXP;
   if (exp >= HALF_FLOAT_MAX_BIASED_EXP_AS_SINGLE_FP_EXP)
   {
      // check if the original single-precision float number
      // is a NaN
      if (mantissa && (exp == FLOAT_MAX_BIASED_EXP))
      {
         // we have a single-precision NaN
         mantissa = (1 << 23) - 1;
      }
      else
      {
         // 16-bit half-float representation stores number
         // as Inf
         mantissa = 0;
      }
      hf = (((hfloat)sign) << 15) |
            (hfloat)(HALF_FLOAT_MAX_BIASED_EXP) |
            (hfloat)(mantissa >> 13);
   }
   // check if exponent is <= -15
   else if (exp <= HALF_FLOAT_MIN_BIASED_EXP_AS_SINGLE_FP_EXP)
   {
      // store a denorm half-float value or zero
      exp = (HALF_FLOAT_MIN_BIASED_EXP_AS_SINGLE_FP_EXP - exp)
             >> 23;
      if (exp < 10)
      {
         // restore the implicit leading 1 of the single-precision
         // mantissa before shifting it into the denorm position
         mantissa = (mantissa | (1 << 23)) >> (14 + exp);
      }
      else
      {
         // too small to be represented as a denorm; store zero
         mantissa = 0;
      }

      hf = (((hfloat)sign) << 15) | (hfloat)(mantissa);
   }
   else
   {
      hf = (((hfloat)sign) << 15) |
            (hfloat)
            ((exp - HALF_FLOAT_MIN_BIASED_EXP_AS_SINGLE_FP_EXP)
            >> 13) |
            (hfloat)(mantissa >> 13);
   }
   return hf;
}

float
convertHFloatToFloat(hfloat hf)
{
   unsigned int   sign = (unsigned int)(hf >> 15);
   unsigned int   mantissa = (unsigned int)(hf &
                  ((1 << 10) - 1));
   unsigned int   exp = (unsigned int)(hf &
                  HALF_FLOAT_MAX_BIASED_EXP);
   unsigned int   f;
   float          result;

   if (exp == HALF_FLOAT_MAX_BIASED_EXP)
   {
      // we have a half-float NaN or Inf
      // half-float NaNs will be converted to a single-
      // precision NaN
      // half-float Infs will be converted to a single-
      // precision Inf
      exp = FLOAT_MAX_BIASED_EXP;
      if (mantissa)
          mantissa = (1 << 23) - 1;   // set all bits to
                                      // indicate a NaN
   }
   else if (exp == 0x0)
   {
      // convert half-float zero/denorm to single-precision
      // value
      if (mantissa)
      {
         mantissa <<= 1;
         exp = HALF_FLOAT_MIN_BIASED_EXP_AS_SINGLE_FP_EXP;
         // check for leading 1 in denorm mantissa
         while ((mantissa & (1 << 10)) == 0)
         {
            // for every leading 0, decrement single-
            // precision exponent by 1
            // and shift half-float mantissa value to the
            // left
            mantissa <<= 1;
            exp -= (1 << 23);
         }
         // clamp the mantissa to 10 bits
         mantissa &= ((1 << 10) - 1);
         // shift left to generate single-precision mantissa
         // of 23-bits
         mantissa <<= 13;
      }
   }
   else
   {
      // shift left to generate single-precision mantissa of
      // 23-bits
      mantissa <<= 13;
      // generate single-precision biased exponent value
      exp = (exp << 13) +
            HALF_FLOAT_MIN_BIASED_EXP_AS_SINGLE_FP_EXP;
   }
   f = (sign << 31) | exp | mantissa;
   // copy the assembled bit pattern back into a float
   memcpy(&result, &f, sizeof(result));
   return result;
}
