These containers (Set, Vector, Map, Span, etc.) have default constructors,
making the braces unnecessary for default initialization. Better to depend
on that consistently rather than having braces in some places and not others.
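
As a minimal illustration of the point above, the sketch below uses `std::vector` and `std::map` as stand-ins for the Blender containers (an assumption made so the example compiles on its own; `MeshCacheExample` and the helper names are hypothetical). Both members start out empty because the default constructor runs with or without braces.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

/* Hypothetical struct; std containers stand in for Blender's Vector and Map. */
struct MeshCacheExample {
  /* With braces: value-initialization calls the default constructor. */
  std::vector<int> with_braces{};
  /* Without braces: default-initialization calls the same default constructor. */
  std::vector<int> without_braces;
  std::map<std::string, int> lookup;
};

/* Returns true when every container member is empty. */
inline bool starts_empty(const MeshCacheExample &cache)
{
  return cache.with_braces.empty() && cache.without_braces.empty() && cache.lookup.empty();
}

/* Usage: a freshly constructed instance has only empty containers. */
inline void check_default_initialization()
{
  MeshCacheExample cache;
  assert(starts_empty(cache));
}
```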
* Only works on machines with a Qualcomm Snapdragon 8cx Gen3 or above.
Older generation devices are not and will not be supported due to
some driver issues.
* Requires VS2022 for building.
* Uses new MSVC preprocessor for sse2neon compatibility.
* SIMD is not enabled, waiting on conversion of blenlib to C++.
Ref #119126
Pull Request: https://projects.blender.org/blender/blender/pulls/117036
Use the original Light radius to compute the shadowmap projection.
Avoid unnecessary padding in shadowmaps, increasing the perceived
shadow resolution when the shadow softness is not 0.
Pull Request: https://projects.blender.org/blender/blender/pulls/118860
The reported armature had two bones with a `bMotionPath` whose length
is zero (which causes trouble in the `motion_path_cache` drawing code due
to zero-sized allocations).
Not exactly sure how we got there, something like
`animviz_verify_motionpaths` should take care of this already in current
code, but this might be from a time where there were not enough sanity
checks.
So now early out in `motion_path_cache` if we encounter such a
"corrupted" motion path.
Pull Request: https://projects.blender.org/blender/blender/pulls/119081
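
The early out for zero-length motion paths described above could look roughly like the sketch below. `MotionPathExample` and `motion_path_vertex_count` are hypothetical stand-ins, not the actual `bMotionPath` struct or the `motion_path_cache` function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical stand-in for `bMotionPath`; only the length matters here. */
struct MotionPathExample {
  int length = 0;
  std::vector<float> points; /* Three floats per frame when valid. */
};

/* Returns the number of vertices that would be drawn, or 0 when the path is skipped. */
inline int motion_path_vertex_count(const MotionPathExample &path)
{
  /* Early out: a zero-length path would otherwise lead to a zero-sized allocation. */
  if (path.length <= 0 || path.points.empty()) {
    return 0;
  }
  assert(path.points.size() == std::size_t(path.length) * 3);
  return path.length;
}
```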
Add percentage closer filtering to shadowmap sampling and a
`shadow_filter_radius` property to lights to control it.
Notes:
* This adds PCF to `eevee_shadow_tracing_lib`, but not to
`eevee_shadow_lib`, which is used by volumes (not required) and
thickness.
* PCF is computed based on the LOD0 size. This assumes that higher
LODs are only used when the shadowmap resolution is actually good
enough to match the render resolution.
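
For readers unfamiliar with percentage closer filtering, the following CPU-side sketch shows the general technique: the binary depth test is averaged over a window of texels instead of being evaluated once. It is not the shader code from this patch; `ShadowMapExample`, `shadow_pcf` and the integer texel radius are assumptions made for illustration, and LOD selection is left out entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

/* Hypothetical CPU-side depth map; each texel stores the closest occluder depth. */
struct ShadowMapExample {
  int size;
  std::vector<float> depth;

  float texel(int x, int y) const
  {
    /* Clamp to the border so taps outside the map reuse the edge texel. */
    x = std::max(0, std::min(x, size - 1));
    y = std::max(0, std::min(y, size - 1));
    return depth[y * size + x];
  }
};

/* Percentage closer filtering: average the binary depth test over a
 * (2 * radius + 1)^2 texel window. Returns the lit fraction in [0, 1]. */
inline float shadow_pcf(
    const ShadowMapExample &map, int x, int y, float receiver_depth, int radius)
{
  assert(radius >= 0 && map.size > 0);
  int lit = 0;
  int taps = 0;
  for (int dy = -radius; dy <= radius; dy++) {
    for (int dx = -radius; dx <= radius; dx++) {
      lit += (receiver_depth <= map.texel(x + dx, y + dy)) ? 1 : 0;
      taps++;
    }
  }
  return float(lit) / float(taps);
}
```

With a radius of 0 this degenerates to the usual hard-edged test; larger radii trade sharpness for smoother shadow borders.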
Pull Request: https://projects.blender.org/blender/blender/pulls/118220
Tiles tagged for update in eevee_shadow_tag_update_comp can be
untagged in eevee_shadow_tilemap_init_comp, since those tiles
might be tagged as rendered.
Regression from 9e015f703c
Currently, we don't support this. Depending on the geometry type, the rotations
are either displayed as black, magenta or there is a crash. Better disable this for
now until we have a proper implementation. It's not quite obvious how rotation
values should be converted to a color, so this also needs some design work.
Pull Request: https://projects.blender.org/blender/blender/pulls/118808
Performing an off-screen draw call while drawing the viewport isn't
supported. Add a check that raises an exception when called from Python
instead of crashing.
Ref: !118780
Enums are stored as uints, but due to a missing implementation they
were stored in shaders as ints. As draw manager now supports uints
as specialization constants we can update these constants to be
stored as uints on the shader side as well.
Pull Request: https://projects.blender.org/blender/blender/pulls/118788
On lower end hardware the film accumulation has bad performance, sometimes
up to 10ms. This PR improves the performance somewhat by adding
specialization constants for the render passes that are actually needed for
rendering, the number of samples and whether reprojection is enabled.
`enabled_categories`: Based on the enabled render passes some outer loops are
enabled/disabled that handle the specific render passes. This improves the performance
as no memory will be reserved for branches that are never accessed.
`samples_len` & `use_reprojection`: GPU compilers tend to optimize texture fetches
by moving them to the outer loop. This is only possible when the inner loop can be unrolled.
In the case of the film accumulation the inner loop couldn't be unrolled. Adding a
specialization constant allows the inner loop to be unrolled.
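
The effect can be pictured with a C++ analogy, where a template parameter plays the role of the specialization constant. This is only a sketch under that assumption: the real code is a shader, and `accumulate_samples` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

/* `SamplesLen` plays the role of the `samples_len` specialization constant:
 * it is fixed when the function is instantiated, not when it runs. */
template<std::size_t SamplesLen>
inline float accumulate_samples(const std::array<float, SamplesLen> &samples,
                                const std::array<float, SamplesLen> &weights)
{
  static_assert(SamplesLen > 0, "at least one sample is required");
  float color = 0.0f;
  float weight_sum = 0.0f;
  /* The trip count is a compile-time constant, so the compiler is free to unroll. */
  for (std::size_t i = 0; i < SamplesLen; i++) {
    color += samples[i] * weights[i];
    weight_sum += weights[i];
  }
  assert(weight_sum > 0.0f);
  return color / weight_sum;
}
```

A call such as `accumulate_samples<4>(samples, weights)` produces a variant specialized for four samples, much like a shader variant is produced per constant value.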
On old or low-end devices the improvement is around 40%. On newer devices
the improvement is 50+%. Performance of this shader is similar to
that of Godot.
| GPU | Before | New |
|----------------------|--------|-------|
| NVIDIA GTX 760 | 3.5ms | 2.4ms |
| GFX1036 (RDNA2 iGPU) | 9.9ms | 6.2ms |
| AMD Radeon Pro W7500 | 2.1ms | 0.9ms |
Pull Request: https://projects.blender.org/blender/blender/pulls/118385
When implementing film accumulation specialization constants we came
across a missing implementation for uint as specialization constant.
This is a split-off from the original patch to add support for uint.
When using it, it is important to compile with asserts on. A uint can be cast
to int without noticing. There are assert mechanisms that point you to
these cases.
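
One way such an assert could catch a silent uint-to-int mix-up is sketched below. `SpecConstantExample` is a hypothetical tagged value written for this example; it is not the type used by the GPU module.

```cpp
#include <cassert>
#include <cstdint>

/* Hypothetical tagged value that remembers which type it was created with. */
struct SpecConstantExample {
  enum class Type { Int, Uint };
  Type type;
  union {
    int32_t i;
    uint32_t u;
  } value;

  static SpecConstantExample from_int(int32_t v)
  {
    SpecConstantExample constant;
    constant.type = Type::Int;
    constant.value.i = v;
    return constant;
  }

  static SpecConstantExample from_uint(uint32_t v)
  {
    SpecConstantExample constant;
    constant.type = Type::Uint;
    constant.value.u = v;
    return constant;
  }

  /* The asserts fire in debug builds when the stored and requested types differ. */
  int32_t as_int() const
  {
    assert(type == Type::Int);
    return value.i;
  }

  uint32_t as_uint() const
  {
    assert(type == Type::Uint);
    return value.u;
  }
};
```

In a release build without asserts, reading a uint through `as_int()` would go unnoticed, which is why compiling with asserts enabled matters here.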
Pull Request: https://projects.blender.org/blender/blender/pulls/118750
Currently there are two vertex buffers that contain mesh normals. First, the
normals are extracted and stored interleaved with positions. Then there is
a second vertex buffer for just normals. Interleaving them makes some
sense, since they change together, but it fights with the contiguous storage
benefits of `Mesh` and generally makes code more difficult to optimize.
This PR removes the normals interleaved with the positions and changes
the code for extracting positions and normals from meshes to be simpler
and faster, mainly by not using the "extract iterators" as described by the
#116901 design task. That moves most of the branching outside of hot
loops, so we don't do the same work for every mesh element. This also
gives us the option of not calculating or extracting normals in more
situations like wireframe display in the future.
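
The idea of moving the branching out of the hot loops can be sketched as follows. All names (`Float3Example`, `NormalDomainExample`, `extract_corner_normals`) and the choice between vertex and face normals are assumptions for illustration, not the actual extraction code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Float3Example {
  float x, y, z;
};

/* Hypothetical per-mesh choice that used to be re-checked for every element. */
enum class NormalDomainExample { Vertex, Face };

inline std::vector<Float3Example> extract_corner_normals(
    NormalDomainExample domain,
    const std::vector<Float3Example> &vert_normals,
    const std::vector<Float3Example> &face_normals,
    const std::vector<int> &corner_verts,
    const std::vector<int> &corner_faces)
{
  assert(corner_verts.size() == corner_faces.size());
  std::vector<Float3Example> result(corner_verts.size());
  /* Decide once per mesh, then run a loop without per-element branching. */
  switch (domain) {
    case NormalDomainExample::Vertex:
      for (std::size_t i = 0; i < result.size(); i++) {
        result[i] = vert_normals[corner_verts[i]];
      }
      break;
    case NormalDomainExample::Face:
      for (std::size_t i = 0; i < result.size(); i++) {
        result[i] = face_normals[corner_faces[i]];
      }
      break;
  }
  return result;
}
```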
This is only a small part of the work for #116901, so the state of the code
after this PR will have more design inconsistencies. I'll keep working to
resolve those in the future.
In general I observed a 5-40% improvement in FPS during playback
of files with large meshes.
Pull Request: https://projects.blender.org/blender/blender/pulls/116902
This adds support by just reusing the GGX reflection LTC
look-up table. This avoids using more memory for another
table.
This is quite a hack and has no real physical basis.
We already have a roughness remapping function for
reusing sphere-probe for refraction and matching the
blur level. We can reuse this function and use it
for sampling the reflection LUT.
Then getting the theta LUT parameter is done by
computing the angle between the refraction direction
and the reversed normal.
This works because the table is parametrized using the
angle between the view vector and the normal. This angle
is the same as the angle between the reflection vector
and the normal. So to get the equivalent lobe in the
refraction direction we get the angle between the
refraction direction and the reversed normal.
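
The angle computation described above can be sketched on the CPU as follows. This is a sketch under stated assumptions: `V` is taken to point away from the surface, `refract_dir` follows the formula of the GLSL `refract` built-in, all names are hypothetical, and the roughness remapping is left out.

```cpp
#include <cassert>
#include <cmath>

struct Vec3Example {
  float x, y, z;
};

inline float dot(const Vec3Example &a, const Vec3Example &b)
{
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Same formula as the GLSL `refract` built-in. `I` points toward the surface. */
inline Vec3Example refract_dir(const Vec3Example &I, const Vec3Example &N, float eta)
{
  const float n_dot_i = dot(N, I);
  const float k = 1.0f - eta * eta * (1.0f - n_dot_i * n_dot_i);
  if (k < 0.0f) {
    /* Total internal reflection: no refracted direction exists. */
    return {0.0f, 0.0f, 0.0f};
  }
  const float s = eta * n_dot_i + std::sqrt(k);
  return {eta * I.x - s * N.x, eta * I.y - s * N.y, eta * I.z - s * N.z};
}

/* Cosine of the table angle: refraction direction against the reversed normal. */
inline float refraction_lut_cos_theta(const Vec3Example &V, const Vec3Example &N, float ior)
{
  assert(ior > 0.0f);
  const Vec3Example I = {-V.x, -V.y, -V.z};
  const Vec3Example R = refract_dir(I, N, 1.0f / ior);
  const Vec3Example reversed_N = {-N.x, -N.y, -N.z};
  return dot(R, reversed_N);
}
```

At normal incidence the refraction direction coincides with the reversed normal, so the cosine is 1, the same value the reflection lookup would use for a view vector aligned with the normal.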
Note: This has issues with shadow-map tagging but it should
be fixed separately.
Pull Request: https://projects.blender.org/blender/blender/pulls/118589
This optimizes a few loops that become significant bottlenecks during
viewport rendering of scenes with large numbers of curves.
To render a curves object, Blender needs to generate a potentially
very large (but trivial) index buffer. As previously implemented,
this index buffer is generated in an extremely inefficient manner,
with a single-threaded loop and an explicit function call per entry.
The buffer then needs to be pushed onto the GPU, which is also a fairly
slow task.
The PR generates the index buffer directly on the GPU with a compute
shader.
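
To show why such a buffer suits a compute shader, the sketch below computes every entry from its own position alone, so each entry could be produced by an independent GPU invocation. The layout (line strips separated by a restart index, a fixed number of points per curve) and all names are assumptions for illustration; the actual buffer layout is not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

/* Hypothetical restart marker separating one curve's strip from the next. */
constexpr uint32_t RESTART_INDEX_EXAMPLE = 0xFFFFFFFFu;

/* Computes one entry from its position only, as one shader invocation would. */
inline uint32_t curve_index_entry(uint32_t entry, uint32_t points_per_curve)
{
  assert(points_per_curve > 0);
  const uint32_t stride = points_per_curve + 1;
  const uint32_t curve = entry / stride;
  const uint32_t local = entry % stride;
  return (local == points_per_curve) ? RESTART_INDEX_EXAMPLE :
                                       curve * points_per_curve + local;
}

/* Serial reference loop; on the GPU each iteration would run in parallel. */
inline std::vector<uint32_t> build_curve_indices(uint32_t curves_num, uint32_t points_per_curve)
{
  std::vector<uint32_t> indices(std::size_t(curves_num) * (points_per_curve + 1));
  for (std::size_t i = 0; i < indices.size(); i++) {
    indices[i] = curve_index_entry(uint32_t(i), points_per_curve);
  }
  return indices;
}
```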
Pull Request: https://projects.blender.org/blender/blender/pulls/116617
The goal of this task is to remove noise in the most common material
layering configuration.
Subsequently, this also splits the evaluation of different closures into
their own buffers to avoid discontinuities when denoising them.
This commit does a few things:
- [x] Remove use of global for closure random number.
- [x] Refactor the forward evaluation to be closure type agnostic.
- [x] Refactor the gbuffer lib to be closure type agnostic.
- [x] Reduce the number of picked closures to a maximum of 3.
- [x] Use GPU_MATFLAG_COAT to tag the use of multiple glossy BSDFs.
- [x] Use two closure bins for Glossy when there is more than one.
- [x] Set closure bin per type for best noise level for most materials.
- [x] Change the gbuffer header to put the closures at their bin index.
- [x] Add a method to get a closure from the gbuffer from a specific bin.
- [x] Split lighting passes per Closure.
Pull Request: https://projects.blender.org/blender/blender/pulls/118079