Aug 4, 2010

Crytek's Best Fit Normals

Among the SIGGRAPH presentations was one about Crytek's rendering methods. One interesting technique, presented only briefly, was best fit normals (BFN). This method aims to improve the precision of normals stored in the RGB8 format.

When normals are stored in RGB8 with the traditional scale&bias encoding, accuracy is lost because each component is quantized to only 8 bits. For example, a 256*256*6 cube map can represent 393216 directions. However, after scale&bias into RGB8, only 219890 (55.9%) of these directions remain distinct, because many similar directions compress to the same value. In this case, I have computed that only 1.31% of the 256*256*256 voxels of the RGB8 volume are used, and that each used voxel represents 1.788 directions on average.
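The counting described above can be sketched as follows. This is a minimal illustration, not the code used for the figures in this post: the function names, the texel-centre convention, and the rounding rule are all assumptions, so the exact counts it produces may differ slightly from the numbers quoted.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def encode_scale_bias(n):
    """Map each component from [-1, 1] to an 8-bit integer in [0, 255]."""
    return tuple(int(round((c * 0.5 + 0.5) * 255.0)) for c in n)

def decode_scale_bias(rgb):
    """Map each 8-bit component back to [-1, 1] (no renormalization)."""
    return tuple(c / 255.0 * 2.0 - 1.0 for c in rgb)

def cube_map_directions(size):
    """Yield one unit direction per texel of a size*size*6 cube map.

    Assumption: directions go through texel centres.
    """
    for face in range(6):
        axis = face // 2                      # 0 = x, 1 = y, 2 = z
        sign = 1.0 if face % 2 == 0 else -1.0
        for j in range(size):
            for i in range(size):
                u = (i + 0.5) / size * 2.0 - 1.0
                v = (j + 0.5) / size * 2.0 - 1.0
                d = [u, v]
                d.insert(axis, sign)          # put the major axis in place
                yield normalize(d)

def count_distinct_codes(size):
    """Return (number of distinct RGB8 codes, number of directions)."""
    codes = {encode_scale_bias(d) for d in cube_map_directions(size)}
    return len(codes), size * size * 6
```

Calling `count_distinct_codes(256)` walks all 393216 directions; the ratio of the two returned values is the fraction of directions that survive the encoding.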

The idea behind the BFN approach is to search for the voxel that best represents (fits) a given direction; the vector stored in that voxel may be non-normalized. Using this method with a 256*256*6 cube map, I have found that 387107 (98.4%) of the directions are effectively represented. Furthermore, in this case, each used voxel represents 1.016 directions on average. Thus, this method gives a more accurate reconstruction of normals (see screenshots). Moreover, compression is a single cube map lookup, and reconstruction is an unbias followed by a normalize.
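The reconstruction step (unbias, then normalize) can be sketched like this. The function names are hypothetical, and the unbias formula assumes the usual mapping of [0, 255] to [-1, 1].

```python
import math

def decode_bfn(rgb):
    """Unbias an RGB8 triple to [-1, 1]^3, then normalize.

    With this mapping every component is an odd multiple of 1/255, so it is
    never exactly zero and the vector length is never zero.
    """
    v = [c / 255.0 * 2.0 - 1.0 for c in rgb]
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def angular_error_deg(a, b):
    """Angle in degrees between two unit vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # guard acos against rounding
    return math.degrees(math.acos(dot))

# Two voxels at very different distances from the volume centre
# decode to the same direction once normalized.
far = decode_bfn((255, 255, 255))
near = decode_bfn((128, 128, 128))
```

Because only the direction of the stored vector matters after normalization, the encoder is free to pick any voxel along the ray, which is what gives BFN its extra precision.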

Left: BFN cubemap, Right: scale&bias cubemap

Reconstruction error (absolute value scaled by 70) for BFN (left) and
scale&bias normal (right)

So how do you generate each cube face? Currently, I am using the brute-force method, which is horribly slow: for each direction on the cube face, I visit every voxel of the RGB8 volume to find the one that matches best. A faster method I plan to implement later is to use the Amanatides ray marching method to march through the voxel volume along the ray direction and find the most representative voxel.
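A minimal sketch of the brute-force search, under some assumptions: the `bits` parameter is hypothetical and only there so the example can be run quickly on a reduced volume, and "best match" is taken to mean the largest cosine between the requested direction and the unbiased voxel vector.

```python
import math

def best_fit_voxel(direction, bits=8):
    """Exhaustively search a (2**bits)^3 volume for the best-fitting voxel.

    `direction` is a unit 3-vector. Returns ((r, g, b), cosine), where the
    cosine is measured between `direction` and the unbiased voxel vector.
    """
    dx, dy, dz = direction
    levels = 2 ** bits
    top = levels - 1
    best, best_cos = None, -2.0
    for r in range(levels):
        x = 2 * r - top          # unbiased component scaled by `top` (odd, never 0)
        for g in range(levels):
            y = 2 * g - top
            for b in range(levels):
                z = 2 * b - top
                cos = (x * dx + y * dy + z * dz) / math.sqrt(x * x + y * y + z * z)
                if cos > best_cos:
                    best, best_cos = (r, g, b), cos
    return best, best_cos

# Reduced 4-bit volume (16^3 voxels) so the search finishes instantly.
voxel, cosine = best_fit_voxel((0.0, 0.0, 1.0), bits=4)
```

At the full 8 bits this visits 16777216 voxels per direction, which is why building a 256*256*6 cube map this way is so slow.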

How can this method be used?
  • Better normal map encoding: when computing an object's normal map, instead of converting from a floating-point normal map to a scale&bias normal, do a texture lookup in the BFN cube map texture.
  • Deferred rendering: high quality normal buffer in RGB8! :) It could also be possible to pack a normal in a 32F channel.
  • Any other ideas?
So when I have time (I don't know when, because the end of my PhD is approaching quickly), I plan to implement the Amanatides method to accelerate the computation. Then, maybe, I will use a better representation than a cube map.

If you have questions or want access to the cube map textures, send me an email. As always, feel free to discuss this method here.


  1. Clever, why didn't we think of that sooner? I guess the unpack requiring a normalization used to cost too much. That may not be the case with today's GPU performance.

    But if you have to generate a normal map in real time, fetching the precomputed 3d texture may take a little longer than just scale&bias :) Maybe they only use that technique in DCC Tools plugins.

    Anyway, we were not using the whole RGB range, that's a fact :) That changes the "look" of normal maps, they get dirty.

  2. Dear matumbo,

    I love this funny fractal-like look of the texture :)

    And it is a cubemap, not a 3D texture, so no problem for real-time use. :) The currently recommended ALU op/texFetch ratio is 20:1 if I remember correctly, so the normalization should be acceptable, as you said.

    In fact, a 256*256 cubemap is not the best solution, because it still uses only 2.31% of the voxels of the full RGB8 volume (1.31% for the scale&bias approach, based on my computation). I would like to compute the total number of directions that can be represented using the RGB8 volume (some directions can be represented by several voxels) to know the real capacity of the RGB8 volume.

    Yes, implementing this in DCC tools would be great!

  3. On the consoles, the recommended ALU/texFetch is probably lower, which I'm guessing is Crytek's main motivation. The cost is the memory to store the cube-map.

    A colleague of mine pointed out that there is probably some symmetry in the cube-map that could be exploited, no? Like different faces would be the same values just swizzled. I wonder if there is an efficient way to reduce the storage requirements and take advantage of that symmetry.

  4. I totally agree with this method being oriented toward a console use.

    And yes, there is a symmetry for each cube face and around each major axis. That was something I was thinking about when I said "better representation instead of the cube map". However, I wonder how many ALU ops it will cost to handle this in the shader considering a single 2D texture storing the replicable symmetry pattern (face detection to swizzle the components, mirror repetition, etc).

  5. I may be mistaken, as I've only quickly browsed through the Crytek paper, but they seem to provide the "better representation" through symmetries. I understand that they actually encode their "Normals Fitting Texture" in a 512x512 2D texture (some code is given at the end of the SIGGRAPH course paper).

    Using this storage for regular normal maps may prove difficult, as it certainly doesn't play nice with bilinear filtering and mipmapping, no?


  6. Indeed, you are right! I was a bit too enthusiastic in this first post on BFN! I wanted to correct this in my more recent posts but forgot about it... Interpolating non-normalized normals can give errors (wrong directions). The same goes for mipmapping...

    Example of wrong interpolation:
    A=(0,6) and A'=(0,2) (A' has the same direction as A), with B=(6,0)

    A+0.5*(B-A)= (3,3)
    A'+0.5*(B-A')= (3,1)
    Not the same direction...

    So BFN can only be applied to compress normals in a g-buffer... Or any other ideas?
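The interpolation example in this comment can be checked numerically. In this small sketch, B is not stated in the comment; B=(6,0) is an assumption inferred from the two results given. The helper names are hypothetical.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between two vectors."""
    return tuple(x + t * (y - x) for x, y in zip(a, b))

def angle_deg(v):
    """Angle of a 2D vector measured from the +x axis, in degrees."""
    return math.degrees(math.atan2(v[1], v[0]))

A = (0.0, 6.0)
A_short = (0.0, 2.0)   # same direction as A, different length
B = (6.0, 0.0)         # assumed: the value that reproduces (3,3) and (3,1)

mid_long = lerp(A, B, 0.5)         # (3.0, 3.0), at 45 degrees
mid_short = lerp(A_short, B, 0.5)  # (3.0, 1.0), at about 18.4 degrees
```

The two midpoints point in clearly different directions even though A and A' decode to the same normal, which is why hardware filtering of BFN-encoded texels is unsafe.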

    1. That's a shame, about the interpolation issue. It seems like a lot of effort to get precision out of your g-buffer normals, when you could simply use something like R10G10B10 to get even more precision without any encoding effort. All BFN gets you is an extra 8-bit channel in your g-buffer which may not really be all that useful for some.