The goal of this PR is to remove the per face, NDC space
tracing for shadow maps.
This requires custom occluder extrapolation but in return
removes quite a lot of complexity in other areas, namely:
- No more per face transform before the tracing
- No more per face jittering (fix#119565)
- No more frustum padding (increased maximum precision)
- Better use of 32bit precision shadow map
- Fix#121343
This improve softness at relatively low step count (default 6
is better) and reduces light leaking at very low sample
count (sharper). It makes it more intuitive now that
higher sample count is smoother.
Pull Request: https://projects.blender.org/blender/blender/pulls/121317
The goal of this PR is to remove any user facing parameter
bias that fixes issues that are caused by inherent nature of
the shadow map (aliasing or discretization).
Compute shadow bias in the normal direction to avoid
both shadow leaking at certain angles or shadow
acnee.
This bias is computed automatically based on the minimum
bias amount required to remove all errors. The render
setting is removed as of no use for now.
#### Normal bias
We do the bias in world space instead of shadow space
for speed and simplicity. This requires us to bias using
the upper bound of biases for the same location in space
(using biggest texel world radius instead of UVZ bias).
This isn't much of an issue since the bias is still less
than 2 texel. The bias is still modulated by facing
ratio to the light so surfaces facing the light have no
biase.
This fixes both self shadowing and flat occluders aliasing
artifacts at the cost of moving the shadow a bit on the
side. This is the blue arrow in the diagram.
We always bias toward the normal direction instead of the
light direction. This is alike Cycles geometric offset for
the shadow terminator fix. This is better since it does'nt
modify the shading at all.
#### Slope bias
To avoid aliasing issue on zero slope receiver, we still
have to use the slope bias with a size of 1 pixel.
#### PCF filtering
We now parametrize the filter around the normal instead of
using the shadow map local space. This requires to use
a disk filter instead of box, which is also more pleasant
for most light shapes (all except rectangle lights).
Setting the filter around normal avoid overshadowing from
zero slope occluders. This cannot be fixed by more slope bias
in light space PCF. We could fix it in light space by projecting
onto the normal plane but that gives an unbounded bias when `N.L`
is near 0 which causes either missing shadows or self shadow if
using an arbitrary max offset value.
To avoid overshadowing from any surface behind the shading
point, we reflect the offset to always face the light.
Doing so instead of using the perpendicular direction
is better for very sharp geometric angles, has less
numerical precision issue, is symetrical and is cheaper.
To avoid any self shadowing artifact on zero slope receivers
with angled neighbors (like a wall and the floor), we have
to increase the slope bias according to the filter size.
This might be overkill in most situation but I don't feel
this should become a setting and should be kept in sync
with the filter. If it has to become an option, it should
simply a factor between unbiased filter and best bias.
#### Shadow terminator
The remaining artifacts are all related to shadow terminator
one way or another. It is always caused by the shading
normal we use for biasing and visibility computation not
being aligned with the geometric normal.
This is still something we need a setting for somewhere.
Pull Request: https://projects.blender.org/blender/blender/pulls/121088
This introduces fragment clipping for the shadow pipeline.
The fragments are culled using a sphere distance test
for sphere lights, which is then re-purposed as a simple
distance test for area lights.
However, in the case of area lights, this introduces some
light bleeding. To mitigate this, we bring the shadow origin
back towards the light center so that this artifact only
appears if using larger shadow tracing.
Fixes#119334
Pull Request: https://projects.blender.org/blender/blender/pulls/120992
This act in multiple phases:
- A shader scan the whole clipmap after tilemap finalize to gather
where valid tiles from lower LODs are available and write the page
location and LOD offset at invalid tiles location.
- At sampling time we add the LOD offset before the pixel page
modulo operation. This offset is equal to the amount of pages of
the **sampling** LOD needed to have the same modulo result as the
**sampled** LOD.
The whole thing being very tricky, I added a lot of unit testing.
This has no use for now but the system is needed to implement:
- Shadows from Volumetrics at lower cost & memory footprint.
- Fixing soft shadows artifacts.
These will be implemented in separate PRs.
Pull Request: https://projects.blender.org/blender/blender/pulls/120031
This limits the number of tilemaps per LOD that can be fed to avoid the
easy to hit "Too many shadow updates" (#119757).
This allows for a max 64 tilemaps to be updated at once at their lowest
requested LOD (so ~10.6667 point lights if every faces of the punctual
shadow map is needed, but likely more in practice).
Unfortunately this is still quite low and will surely be hit quite soon
with directional shadow added to it. One idea to workaround this would
be to time slice the update of some lights, but this opens a whole can
of worms that I'm not ready to open for now so I created #119890 for
future reference.
Some notes, most lights seems to request around 3 LODs. It might help
to allow requesting at least 2 LODs if we are rendering since volumes
might want lower LOD available for volumes.
I added a very simplistic heuristic that also lowers the max tilemaps
when transforming, animation playback or navigating the 3D view to
improve the responsiveness of the engine. Note that this doesn't
only lowers the resolution to the minimum requested one. So it should
be good enough in most cases.
Pull Request: https://projects.blender.org/blender/blender/pulls/119889
Now that all relevant code is C++, the indirection from the C struct
`GPUVertBuf` to the C++ `blender::gpu::VertBuf` class just adds
complexity and necessitates a wrapper API, making more cleanups like
use of RAII or other C++ types more difficult.
This commit replaces the C wrapper structs with direct use of the
vertex and index buffer base classes. In C++ we can choose which parts
of a class are private, so we don't risk exposing too many
implementation details here.
Pull Request: https://projects.blender.org/blender/blender/pulls/119825
The test uses a none points shader to draw points, which is incorrect
and asserts when using Vulkan. It also didn't test the gpu part of the
pipeline (PassMain).
When using PassMain the order of the resources are expected to be
different as the draw calls are ordered based on the primitive type and
handles.
Pull Request: https://projects.blender.org/blender/blender/pulls/117714
Fix an issue in `find_first_valid` where Nvidia would incorrectly
increment `src` after return.
This should fix the issue where some tiles would stay corrupted
until resetting EEVEE.
Initialize tiles_data in `TestAlloc` so it doesn't fail in debug builds.
Pull Request: https://projects.blender.org/blender/blender/pulls/114550
Optimization of EEVEE Next's Virtual Shadow Maps for TBDRs.
The core of these optimizations lie in eliminating use of
atomic shadow atlas writes and instead utilise tile memory to
perform depth accumulation as a secondary pass once all
geometry updates for a given shadow view have been updated.
This also allows use of fast on-tile depth testing/sorting, reducing
overdraw and redundant fragment operations, while also allowing
for tile indirection calculations to be offloaded into the vertex
shader to increase fragment storage efficiency and throughput.
Authored by Apple: Michael Parkin-White
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Co-authored-by: Clément Foucault <foucault.clem@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/111283
Shadow Map Ray Tracing is a technique that ray cast against the shadow
depth buffer. The technique is described in "Soft Shadows by
Ray Tracing Multilayer Transparent Shadow Maps".
Note that we only implement the single layer approach since storing
multiple depth is prohibitively expensive.
Pull Request: https://projects.blender.org/blender/blender/pulls/111809
This adds a new entry to the split sum LUT to isolate
the effect of the F82 tint.
The application of the tint part is similar to cycles
and uses the same way for precomputing the `b` factor.
Results matches almost perfectly to the extent of the
split sum approximation.
Note that this removes the unused LTC MAG LUT for
EEVEE next to make space for the new table. It can still
be added back if needed.
Pull Request: https://projects.blender.org/blender/blender/pulls/112881
This moves the pre-computation offline and store the pre-computed
table in the binary. The pre-computed tables are quite small and are
not a concern with respect to binary size increase.
This rewrites the precomputation to use manually fitted
approximations for both burley and random walk.
The approximations fix a discrepancy between cycles and EEVEE
SSS translucency look. The absolute maximum error is below 2%.
I believe better results could be achieved with automatic fitting
tools.
Note that Cycles Burley translucency profile has some issues as it
does not give a smooth profile. The profile is biased near the end
of the lower radii. For this reason, the fit was done on a white
diffuse with (1,1,1) radii which does not exhibit this artifact.
Note that while this adds the profile for random walk, it isn't
currently used because the profile type is not yet passed down
the deferred path.
The fitting data can be found attached to this PR.
Pull Request: https://projects.blender.org/blender/blender/pulls/112512
because it contains reflectance and transmittance, so BSDF would be a
morep proper name.
Also rename BSDF to BRDF at places where only reflectance is returned.
Split shadow rendering per LOD per tilemap and improve
fragment shader invocation rate by using multi-viewport.
Also changes the layout of the atlas to be 4 x 4 x Layers.
This allow to grow the atlas while keeping the content
and page indirection correct, but this isn't implemented
in this patch.
# First attempt
Shadow rendering using atomic proved to be less than ideal
and performance were not quite to an acceptable level.
The previous method had issue with atomic contention when
a lot of triangle would overlap and too many fragment shader
invocations with quite complex indirection rules and biases
which made the technique costly.
The new implementation leverage multi viewport and
layered rendeing to effectively replace the need for atomic
and render directly to the shadow atlas. Using the well
supported extension these are free on modern hardware and
do not need a geometry shader.
One view per tile is needed since we use the viewport index
and the layer index as a way to index a specific tile in the
array.
# Geometric Complexity Problem
The counterpart of this is that we need to draw one geometry
instance per tile which is 32x32 time more instances (at most)
than with the previous method.
This means that we will have to find a way to mitigate this
geometry cost by either reducing the number of tiles per
tilemaps (in other words, making the system less memory efficient)
or splitting complex objects' geometry into smaller, more
cull friendly chunks (for example, like the sculpt PBVH nodes).
The later seems to be a longer term solution as it requires
way too much engineering time we have right now.
# Update Lag Problem
This also mean we can only update up to 64 tile per redraw
which is not enough even in the most basic cases. This leads
to missing or over shadowing when a light updates until there
is no updates and the shadow rendering can catch up.
One possible solution is to update a lower LODs first waiting
until there is no update to render. This would allow no artifact
during the transforms (unless there is too many light updates
even for lowest LOD, but that was an issue also for the
previous implementation). This could also help with the
geometric complexity.
# Solution
In the end, we decided to have one view per lod. This limits
the complexity of the fragment shader (improve speed),
reduces the number of views per tilemap (fix update lag),
and reduces the number of instances.
This also mean we cannot render directly to the atlas anymore
and reverted to the atomic solution. Using the smallest
possible viewport, we assure that there isn't that much fragment
shader invocations which was one of the bottleneck. And also
reduces the amount of geometry instances that pass the
clipping test.
Pull Request: https://projects.blender.org/blender/blender/pulls/110979
Listing the "Blender Foundation" as copyright holder implied the Blender
Foundation holds copyright to files which may include work from many
developers.
While keeping copyright on headers makes sense for isolated libraries,
Blender's own code may be refactored or moved between files in a way
that makes the per file copyright holders less meaningful.
Copyright references to the "Blender Foundation" have been replaced with
"Blender Authors", with the exception of `./extern/` since these this
contains libraries which are more isolated, any changed to license
headers there can be handled on a case-by-case basis.
Some directories in `./intern/` have also been excluded:
- `./intern/cycles/` it's own `AUTHORS` file is planned.
- `./intern/opensubdiv/`.
An "AUTHORS" file has been added, using the chromium projects authors
file as a template.
Design task: #110784
Ref !110783.
This PR enabled the draw manager test cases when compiling with
`WITH_VULKAN_BACKEND=On`. Currently they should pass all the tests
in draw_pass_test.cc that also pass for OpenGL. The draw_visibility
test seems to be faulty (also for OpenGL).
The vulkan backend doesn't have all the features implemented to
pass the Eevee testcases and are expected to fail.
Pull Request: https://projects.blender.org/blender/blender/pulls/110994
Ensure correct SSBO bindings are present for shadow tests.
Metal validation errors occur if SSBO bindings that are expected are
not bound. In this case, we can bind empty SSBOs, but these should
be of the correct type for the tests.
Also adding missing zero-initializations for required members within
LightData. Without these, unit tests fail with various issues including
prevalence of OOB reads.
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/109645
A lot of files were missing copyright field in the header and
the Blender Foundation contributed to them in a sense of bug
fixing and general maintenance.
This change makes it explicit that those files are at least
partially copyrighted by the Blender Foundation.
Note that this does not make it so the Blender Foundation is
the only holder of the copyright in those files, and developers
who do not have a signed contract with the foundation still
hold the copyright as well.
Another aspect of this change is using SPDX format for the
header. We already used it for the license specification,
and now we state it for the copyright as well, following the
FAQ:
https://reuse.software/faq/
See: https://projects.blender.org/blender/blender/issues/103343
Changes:
1. Added `BKE_node.hh` file. New file includes old one.
2. Functions moved to new file. Redundant `(void)`, `struct` are removed.
3. All cpp includes replaced from `.h` on `.hh`.
4. Everything in `BKE_node.hh` is on `blender::bke` namespace.
5. All implementation functions moved in namespace.
6. Function names (`BKE_node_*`) changed to `blender::bke::node_*`.
7. `eNodeSizePreset` now is a class, with renamed items.
Pull Request: https://projects.blender.org/blender/blender/pulls/107790
Both the shader_builder and existing shader tests eventually
tested the same aspects. shader_builder is more modern and
handles more cases.
The old shader test requires a full backend in order to run
This commit replaces the old tests to just use the
shader builder for validation.
Shader builder can still be run at compile time, this is
just a convenience to have as a test case as well for CI/CD.
Ref: #105482
Add overlay option for retopology, which hides the shaded mesh akin to Hidden Wire, and offsets the edit mesh overlay towards the view.
Related Task #70267
Pull Request #104599
Implements virtual shadow mapping for EEVEE-Next primary shadow solution.
This technique aims to deliver really high precision shadowing for many
lights while keeping a relatively low cost.
The technique works by splitting each shadows in tiles that are only
allocated & updated on demand by visible surfaces and volumes.
Local lights use cubemap projection with mipmap level of detail to adapt
the resolution to the receiver distance.
Sun lights use clipmap distribution or cascade distribution (depending on
which is better) for selecting the level of detail with the distance to
the camera.
Current maximum shadow precision for local light is about 1 pixel per 0.01
degrees.
For sun light, the maximum resolution is based on the camera far clip
distance which sets the most coarse clipmap.
## Limitation:
Alpha Blended surfaces might not get correct shadowing in some corner
casses. This is to be fixed in another commit.
While resolution is greatly increase, it is still finite. It is virtually
equivalent to one 8K shadow per shadow cube face and per clipmap level.
There is no filtering present for now.
## Parameters:
Shadow Pool Size: In bytes, amount of GPU memory to dedicate to the
shadow pool (is allocated per viewport).
Shadow Scaling: Scale the shadow resolution. Base resolution should
target subpixel accuracy (within the limitation of the technique).
Related to #93220
Related to #104472
Straightforward port. I took the oportunity to remove some C vector
functions (ex: copy_v2_v2).
This makes some changes to DRWView to accomodate the alignement
requirements of the float4x4 type.
Straightforward port. I took the oportunity to remove some C vector
functions (ex: `copy_v2_v2`).
This makes some changes to DRWView to accomodate the alignement
requirements of the float4x4 type.