Commit Graph

3 Commits

Author SHA1 Message Date
Aras Pranckevicius
c6f5c89669 BLI: faster float<->half array conversions, use in Vulkan
In addition to float<->half functions to convert one number (#127708), add
float_to_half_array and half_to_float_array functions:
- On x64, this uses SSE2 4-wide implementation to do the conversion
  (2x faster half->float, 4x faster float->half compared to scalar),
  - There's also an AVX2 codepath that uses CPU hardware F16C instructions
    (8-wide), to be used when/if blender codebase will start to be built
    for AVX2 (today it is not yet).
- On arm64, this uses NEON VCVT instructions to do the conversion.

Use these functions in Vulkan buffer/texture conversion code. Time taken to
convert float->half texture while viewing EXR file in image space (22M
numbers to convert): 39.7ms -> 10.1ms (would be 6.9ms if building for AVX2)

Pull Request: https://projects.blender.org/blender/blender/pulls/127838
2024-09-22 17:39:54 +02:00
Campbell Barton
4bd0cc888e Cleanup: various non functional changes
- Reduce variable scope.
- Function style casts.
- Avoid variable shadowing.
- Quiet unused assignment warnings.
- Remove redundant call in GHOST_WindowNULL constructor.
2024-09-22 18:25:40 +10:00
Aras Pranckevicius
92544d6d76 BLI: add float<->half conversion functions with correct math, use in Vulkan
Blender codebase had two ways to convert half (FP16) to float (FP32):

- BLI_math_bits.h half_to_float. Out of 64k possible half values, it converts
  4096 of them incorrectly. Mostly denormals and NaNs, which is perhaps not too
  relevant. But more importantly, it converts half zero to float 0.000030517578
  which does not sound ideal.
- Functions in Vulkan vk_data_conversion.hh. This one converts 2046 possible
  half values incorrectly.

Function to convert float (FP32) to half (FP16) was in Vulkan
vk_data_conversion.hh, and it got a bunch of possible inputs wrong. I guess it
did not do proper "round to nearest even" that CPU/GPU hardware does.

This PR:

- Adds BLI_math_half.hh with float_to_half and half_to_float functions.
    - Documentation and test coverage.
    - When compiling on ARM NEON, use hardware VCVT instructions.
- Removes the incorrect half_to_float from BLI_math_bits.h and replaces single
  usage of it in View3D color picking to use the new function.
- Changes Vulkan FP32<->FP16 conversion code to use the new functions, to fix
  correctness issues (makes eevee_next_bsdf_vulkan test pass). This makes it
  faster too.

Pull Request: https://projects.blender.org/blender/blender/pulls/127708
2024-09-18 13:15:00 +02:00