Jeroen Bakker 8dce2a422b EEVEE-Next: Specialization Constants for Film Accumulation
On lower-end hardware the film accumulation performs poorly, sometimes taking
up to 10ms. This PR improves performance by adding specialization constants
for the render passes that are actually needed for rendering, the number of
samples, and whether reprojection is enabled.

`enabled_categories`: Based on the enabled render passes, the outer loops that
handle specific render passes are enabled or disabled. This improves performance
as no memory will be reserved for branches that are never accessed.
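
As a rough CPU-side analogy (not the actual shader code from this PR), a C++ template parameter can stand in for the `enabled_categories` specialization constant. The category names, bit values, and the `count_active_branches` helper below are hypothetical and only illustrate how a compile-time mask removes unused branches entirely.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical category bits; the real names and values may differ.
enum PassCategory : uint32_t {
  PASS_CATEGORY_DATA = 1u << 0,
  PASS_CATEGORY_COLOR = 1u << 1,
  PASS_CATEGORY_AOV = 1u << 2,
};

// `EnabledCategories` plays the role of the specialization constant: it is
// known at compile time, so disabled branches are discarded, not just skipped.
template<uint32_t EnabledCategories>
constexpr int count_active_branches()
{
  int active = 0;
  if constexpr ((EnabledCategories & PASS_CATEGORY_DATA) != 0) {
    active += 1;  // Stand-in for the data pass accumulation loop.
  }
  if constexpr ((EnabledCategories & PASS_CATEGORY_COLOR) != 0) {
    active += 1;  // Stand-in for the color pass accumulation loop.
  }
  if constexpr ((EnabledCategories & PASS_CATEGORY_AOV) != 0) {
    active += 1;  // Stand-in for the AOV accumulation loop.
  }
  return active;
}
```

Instantiating `count_active_branches<PASS_CATEGORY_COLOR>()` compiles only the color branch, which mirrors the idea that a specialized shader variant contains only the passes that are enabled.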

`samples_len` & `use_reprojection`: GPU compilers tend to optimize texture fetches
by moving them to the outer loop. This is only possible when the inner loop can be unrolled.
In the film accumulation shader the inner loop couldn't be unrolled. Adding
specialization constants allows the inner loop to be unrolled.
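
The unrolling argument can be sketched in C++ under the same analogy, with template parameters standing in for `samples_len` and `use_reprojection`. The `Sample` struct and the `accumulate` function are hypothetical and are not taken from the shader; they only show that a loop bound known at compile time is something the compiler is free to unroll.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical sample record; `texel` stands in for a texture fetch result.
struct Sample {
  float weight;
  float texel;
};

// `SamplesLen` and `UseReprojection` are compile-time constants, so the loop
// has a fixed trip count and the reprojection branch is resolved statically.
template<std::size_t SamplesLen, bool UseReprojection>
constexpr float accumulate(const std::array<Sample, SamplesLen> &samples, float history)
{
  float weighted_sum = 0.0f;
  float weight_total = 0.0f;
  for (std::size_t i = 0; i < SamplesLen; i++) {
    weighted_sum += samples[i].texel * samples[i].weight;
    weight_total += samples[i].weight;
  }
  if constexpr (UseReprojection) {
    // Blend in the history value with unit weight (illustrative choice).
    weighted_sum += history;
    weight_total += 1.0f;
  }
  return weight_total > 0.0f ? weighted_sum / weight_total : 0.0f;
}
```

With a runtime sample count the compiler has to keep the loop and its fetches as they are; with `SamplesLen` fixed per variant, it can unroll the loop and reorder the fetches.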

On old or low-end devices the improvement is around 40%. On newer devices
the improvement is 50+%. Performance of this shader is similar to that of
Godot.

| GPU                  | Before | New   |
|----------------------|--------|-------|
| NVIDIA GTX 760       | 3.5ms  | 2.4ms |
| GFX1036 (RDNA2 iGPU) | 9.9ms  | 6.2ms |
| AMD Radeon Pro W7500 | 2.1ms  | 0.9ms |

Pull Request: https://projects.blender.org/blender/blender/pulls/118385
2024-02-26 16:19:26 +01:00