Performing an off-screen draw call while drawing the viewport isn't
supported. Add a check that raises an exception when called from Python
instead of crashing.
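
A minimal sketch of this kind of guard, for illustration only. The `Viewport` class, its `is_drawing` flag and the method names are hypothetical and are not Blender's API:

```python
class Viewport:
    """Hypothetical stand-in for the drawing state; not Blender's API."""

    def __init__(self):
        self.is_drawing = False

    def draw_offscreen(self):
        # Refuse the nested draw with an exception instead of crashing.
        if self.is_drawing:
            raise RuntimeError(
                "Off-screen drawing is not supported while drawing the viewport"
            )
        return "offscreen drawn"

    def draw(self, callback):
        # Mark the viewport as busy for the duration of the draw callback.
        self.is_drawing = True
        try:
            return callback(self)
        finally:
            self.is_drawing = False
```

Calling `draw_offscreen()` from inside a `draw()` callback raises `RuntimeError`, which a Python caller can catch.
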
Ref: !118780
Enums are stored as uints, but due to a missing implementation they
were stored in shaders as ints. As the draw manager now supports uints
as specialization constants, we can update these constants to be
stored as uints on the shader side as well.
Pull Request: https://projects.blender.org/blender/blender/pulls/118788
On lower-end hardware the film accumulation has bad performance, sometimes
up to 10ms. This PR improves the performance somewhat by adding
specialization constants for the render passes that are actually needed for
rendering, the number of samples, and whether reprojection is enabled.
`enabled_categories`: Based on the enabled render passes, the outer loops that
handle specific render passes are enabled or disabled. This improves the performance
as no memory will be reserved for branches that are never accessed.
`samples_len` & `use_reprojection`: GPU compilers tend to optimize texture fetches
by moving them to the outer loop. This is only possible when the inner loop can be unrolled.
In the case of the film accumulation the inner loop couldn't be unrolled. Adding a
specialization constant allows unrolling of the inner loop.
On old or low-end devices the improvement is around 40%. On newer devices
the improvement is over 50%. Performance of this shader is similar to
that of Godot.
| GPU | Before | New |
|----------------------|--------|-------|
| NVIDIA GTX 760 | 3.5ms | 2.4ms |
| GFX1036 (RDNA2 iGPU) | 9.9ms | 6.2ms |
| AMD Radeon Pro W7500 | 2.1ms | 0.9ms |
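
For reference, the relative reduction in frame time can be computed from the table. This is only one possible reading of "improvement"; the text does not say which measure its percentages refer to, and the function name is made up for this sketch:

```python
# Timings copied from the table above, in milliseconds: (before, new).
timings_ms = {
    "NVIDIA GTX 760": (3.5, 2.4),
    "GFX1036 (RDNA2 iGPU)": (9.9, 6.2),
    "AMD Radeon Pro W7500": (2.1, 0.9),
}


def time_reduction_percent(before, new):
    """Relative reduction in time, as a percentage of the old time."""
    return (before - new) / before * 100.0


reductions = {
    gpu: round(time_reduction_percent(before, new), 1)
    for gpu, (before, new) in timings_ms.items()
}
```
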
Pull Request: https://projects.blender.org/blender/blender/pulls/118385
When implementing film accumulation specialization constants we came
across a missing implementation for uint as specialization constant.
This is a split-off from the original patch to add support for uint.
When using it, it is important to compile with asserts enabled. A uint can be
cast to int without you noticing. There are assert mechanisms that point you to
these cases.
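
Why a silent uint-to-int cast matters can be sketched as follows. The helper names are hypothetical, and the `assert` only stands in for the assert mechanisms mentioned above:

```python
import struct


def reinterpret_uint_as_int(value):
    """Reinterpret the 32 bits of an unsigned value as a signed int."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


def checked_uint_as_int(value):
    # Hypothetical guard: fail loudly when the value would change sign.
    assert 0 <= value <= 0x7FFFFFFF, "uint value does not fit in an int"
    return reinterpret_uint_as_int(value)
```

Small values survive the cast unchanged, but any value with the top bit set turns negative, which is the case the guard catches.
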
Pull Request: https://projects.blender.org/blender/blender/pulls/118750
Currently there are two vertex buffers that contain mesh normals. First, the
normals are extracted and stored interleaved with positions. Then there is
a second vertex buffer for just normals. Interleaving them makes some
sense, since they change together, but it fights with the contiguous storage
benefits of `Mesh` and generally makes code more difficult to optimize.
This PR removes the normals interleaved with the positions and changes
the code for extracting positions and normals from meshes to be simpler
and faster, mainly by not using the "extract iterators" as described by the
#116901 design task. That moves most of the branching outside of hot
loops, so we don't do the same work for every mesh element. This also
gives us the option of not calculating or extracting normals in more
situations like wireframe display in the future.
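
The idea of moving branches out of hot loops can be sketched generically. The `flip` option is a made-up example condition and is not taken from the extraction code:

```python
def extract_normals_branch_inside(normals, flip):
    out = []
    for n in normals:
        # The same condition is re-evaluated for every element.
        if flip:
            out.append((-n[0], -n[1], -n[2]))
        else:
            out.append(n)
    return out


def extract_normals_branch_hoisted(normals, flip):
    # Decide once, then run a loop without per-element branching.
    if flip:
        return [(-x, -y, -z) for x, y, z in normals]
    return list(normals)
```

Both functions return the same result; the second only evaluates the condition once per mesh instead of once per element.
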
This is only a small part of the work for #116901, so the state of the code
after this PR will have more design inconsistencies. I'll keep working to
resolve those in the future.
In general I observed a 5-40% improvement in FPS during playback
of files with large meshes.
Pull Request: https://projects.blender.org/blender/blender/pulls/116902
This adds support by just reusing the GGX reflection LTC
look-up table. This avoids the extra memory usage of another
table.
This is quite a hack and has no real physical basis.
We already have a roughness remapping function for
reusing sphere probes for refraction and matching the
blur level. We can reuse this function
for sampling the reflection LUT.
Then getting the theta LUT parameter is done by
computing the angle between the refraction direction
and the reversed normal.
This works because the table is parametrized using the
angle between the view vector and the normal. This angle
is the same as the angle between the reflection vector
and the normal. So to get the equivalent lobe in the
refraction direction we get the angle between the
refraction direction and the reversed normal.
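
A minimal sketch of the angle computation described above, assuming unit-length vectors and a view vector that points away from the surface. The function names are illustrative; `refract` follows the same formula as the GLSL built-in, except that it returns `None` on total internal reflection:

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def refract(incident, normal, eta):
    """Refraction direction; None on total internal reflection."""
    n_dot_i = dot(normal, incident)
    k = 1.0 - eta * eta * (1.0 - n_dot_i * n_dot_i)
    if k < 0.0:
        return None
    scale = eta * n_dot_i + math.sqrt(k)
    return tuple(eta * i - scale * n for i, n in zip(incident, normal))


def refraction_lut_cos_theta(view, normal, eta):
    """Cosine of the angle between the refraction direction and the reversed normal."""
    incident = tuple(-v for v in view)  # the view vector points away from the surface
    refracted = refract(incident, normal, eta)
    if refracted is None:
        return None
    reversed_normal = tuple(-n for n in normal)
    return dot(refracted, reversed_normal)
```

With `eta` equal to 1 the ray is not bent, so the result equals the cosine of the angle between the view vector and the normal, which is the parametrization the reflection table already uses.
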
Note: This has issues with shadow-map tagging, but that should
be fixed separately.
Pull Request: https://projects.blender.org/blender/blender/pulls/118589
This optimizes a few loops that become significant bottlenecks during
viewport rendering of scenes with large numbers of curves.
To render a curves object, Blender needs to generate a potentially
very large (but trivial) index buffer. As previously implemented,
this index buffer is generated in an extremely inefficient manner,
with a single-threaded loop and an explicit function call per entry.
The buffer then needs to be pushed onto the GPU, which is also a fairly
slow task.
The PR generates the index buffer directly on the GPU with a compute
shader.
Pull Request: https://projects.blender.org/blender/blender/pulls/116617
The goal of this task is to remove noise in the most common material
layering configuration.
Subsequently, this also splits the evaluation of different closures into
their own buffers to avoid discontinuities when denoising them.
This commit does a few things:
- [x] Remove the use of a global for the closure random number.
- [x] Refactor the forward evaluation to be closure type agnostic.
- [x] Refactor the gbuffer lib to be closure type agnostic.
- [x] Reduce the number of picked closures to a maximum of 3.
- [x] Use GPU_MATFLAG_COAT to tag the use of multiple glossy BSDFs.
- [x] Use two closure bins for Glossy when there is more than one.
- [x] Set closure bin per type for best noise level for most materials.
- [x] Change the gbuffer header to put the closures at their bin index.
- [x] Add a method to get a closure from the gbuffer from a specific bin.
- [x] Split lighting passes per closure.
Pull Request: https://projects.blender.org/blender/blender/pulls/118079
This fixes an issue where the number of viewport samples is set
to 1 and reprojection is deactivated. In this case the sample that
has the data to update the probes is ignored as all samples were
already rendered. A tweak in the viewport was needed to fix this issue.
Pull Request: https://projects.blender.org/blender/blender/pulls/118654
When disabling/enabling reflection probes the atlas texture can be
recreated, removing the existing content of the texture. When this
happens the world probe needs to be re-rendered.
Pull Request: https://projects.blender.org/blender/blender/pulls/118656
The issue described was that the motion path didn't display the last frame
of a scene.
This PR makes the user facing motion path range inclusive on both ends.
E.g. when the user specifies a motion path from 1-24 they will now get all 24
frames, whereas previously the motion path would end at frame 23.
This also makes the `Scene Frame Range` option work properly since that
had the same issue. Now it displays the actual full scene range.
Internally, the `bMotionPath` is still exclusive on the upper bound.
It is just the `bAnimVizSettings` range that has been modified.
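
The relationship between the inclusive user-facing range and the exclusive internal upper bound can be sketched as follows. The function name is hypothetical:

```python
def motion_path_frames(start, end_inclusive):
    """Frames covered by a user-facing range that includes both ends."""
    # The internal upper bound is exclusive, so it is one past the last frame.
    end_exclusive = end_inclusive + 1
    return list(range(start, end_exclusive))
```

A range of 1-24 then yields 24 frames ending at frame 24, whereas using 24 directly as the exclusive bound would stop at frame 23.
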
Pull Request: https://projects.blender.org/blender/blender/pulls/118611
This allows better roughness approximation for glass
materials when not using raytracing.
The fit was done by rendering a checkerboard
with a refractive plane and an orthographic camera.
Same setup with a reflective plane gave the
reference roughness to match.
This involved rendering a few hundred small images with
Cycles and then finding a curve fit manually
by matching blur level.
Rel #118256
Pull Request: https://projects.blender.org/blender/blender/pulls/118533
When the retopology overlay is enabled, the edit mesh is not drawn
in solid mode. When you disabled overlays however, it would not be
drawn in any mode, which understandably confused users.
Now it checks whether overlays are enabled before it hides the solid mesh.
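
The corrected condition can be sketched as a small predicate. The function and parameter names are illustrative and not the actual overlay code:

```python
def hide_solid_edit_mesh(overlays_enabled, retopology_enabled):
    """Return True when the solid edit mesh should be hidden."""
    # Only hide the solid mesh when the retopology overlay can actually be drawn.
    return overlays_enabled and retopology_enabled
```
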
Pull Request: https://projects.blender.org/blender/blender/pulls/118422
This patch adds a new `Stretching Opacity` slider to the overlays panel in the UV Editor.
This allows users to tweak the opacity of the UV stretching overlay, so the image texture
can still be visible through it.
Pull Request: https://projects.blender.org/blender/blender/pulls/117381
This adds back sphere probe pre-convolution.
The difference is that we use a spherical
Gaussian instead of the GGX NDF.
This allows us to reuse the previous mip as
a source for the convolution and thus reduce
the sample count and give a noiseless result.
However since we don't use filtered importance
sampling anymore, we have to compensate with
some more samples. This could be addressed in
a follow up PR if needed.
This also changes the octahedral mapping
procedure to avoid padding texels and
interpolation artifacts.
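
For context, a sketch of the textbook octahedral mapping between unit directions and a square. This is the standard formulation, not necessarily the exact procedure used in this change, and it does not show the padding or interpolation handling:

```python
import math


def _sign(x):
    # Treat zero as positive so the fold stays well defined on the axes.
    return 1.0 if x >= 0.0 else -1.0


def octahedral_encode(direction):
    """Map a unit direction to UV coordinates in [0, 1]^2."""
    x, y, z = direction
    l1 = abs(x) + abs(y) + abs(z)
    x, y, z = x / l1, y / l1, z / l1
    if z < 0.0:
        # Fold the lower hemisphere outwards over the edges of the inner diamond.
        x, y = (1.0 - abs(y)) * _sign(x), (1.0 - abs(x)) * _sign(y)
    return (x * 0.5 + 0.5, y * 0.5 + 0.5)


def octahedral_decode(uv):
    """Inverse of octahedral_encode(); returns a unit direction."""
    x, y = uv[0] * 2.0 - 1.0, uv[1] * 2.0 - 1.0
    z = 1.0 - abs(x) - abs(y)
    if z < 0.0:
        # Undo the fold for the lower hemisphere.
        x, y = (1.0 - abs(y)) * _sign(x), (1.0 - abs(x)) * _sign(y)
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length, z / length)
```

Encoding and then decoding a direction returns the original direction up to floating-point error.
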
Also clean up to make sure all functions
related to mapping are in the same file.
The change to Spherical Gaussian has some impact
on the look. The resulting visual is less "foggy"
but most of the energy is where it should be.
Only the characteristic "GGX tail" is missing.
These convolved sphere light-probe mips are only
used when raytracing is off or unavailable (forward
surfaces).
Ref #118256
Pull Request: https://projects.blender.org/blender/blender/pulls/118354
The depsgraph CoW mechanism is a bit of a misnomer. It creates an
evaluated copy for data-blocks regardless of whether the copy will
actually be written to. The point is to have physical separation between
original and evaluated data. This is in contrast to the commonly used
performance improvement of keeping a user count and copying data
implicitly when it needs to be changed. In Blender code we call this
"implicit sharing" instead. Importantly, the dependency graph has no
idea about the _actual_ CoW behavior in Blender.
Renaming this functionality in the depsgraph removes some of the
confusion that comes up when talking about this, and will hopefully
make the depsgraph less confusing to understand initially too. Wording
like "the evaluated copy" (as opposed to the original data-block) has
also become common anyway.
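
The distinction between the two mechanisms can be sketched as follows. `SharedArray` and `make_evaluated_copy` are hypothetical names used only for illustration; they are not Blender classes or functions:

```python
import copy


class SharedArray:
    """Hypothetical implicitly shared buffer with a user count."""

    def __init__(self, values):
        self._storage = {"values": list(values), "users": 1}

    def share(self):
        # A new handle to the same storage; no data is copied yet.
        other = SharedArray.__new__(SharedArray)
        other._storage = self._storage
        self._storage["users"] += 1
        return other

    def write(self, index, value):
        # Copy only when another user still references the storage.
        if self._storage["users"] > 1:
            self._storage["users"] -= 1
            self._storage = {"values": list(self._storage["values"]), "users": 1}
        self._storage["values"][index] = value

    def read(self):
        return tuple(self._storage["values"])


def make_evaluated_copy(original):
    # The depsgraph-style copy is made up front, whether or not it is written to.
    return copy.deepcopy(original)
```

The first mechanism delays the copy until a write happens; the second always produces a physically separate evaluated copy.
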
Pull Request: https://projects.blender.org/blender/blender/pulls/118338
The compositor assumes the entire viewport as its compositing space even
in camera view. The current design decision was to limit the compositing
space by the camera region only if the camera passepartout is opaque,
that is, areas outside of the camera are not visible.
This patch changes that behavior to always limit the compositing space
by the camera region. The downside is that areas outside of the camera
will be left uncomposited.
This is useful to match viewport compositing to final render compositing
in terms of maintaining the same space, but not necessarily the same
resolution. However, this still has the limitation that space will be
different when the camera region intersects the viewport, since we only
composite their intersection in that case.
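
The region logic described above amounts to a rectangle intersection. A minimal sketch, assuming regions are `(xmin, ymin, xmax, ymax)` tuples in pixels; the function names are hypothetical:

```python
def intersect_regions(a, b):
    """Intersect two (xmin, ymin, xmax, ymax) rectangles; None if they do not overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)


def compositing_region(viewport, camera):
    # Always limit the compositing space to the camera region,
    # clipped to what is actually inside the viewport.
    return intersect_regions(viewport, camera)
```

When the camera region lies fully inside the viewport, the result is the camera region itself. When it extends past the viewport edge, only the visible part is composited, which is the limitation noted above.
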
Pull Request: https://projects.blender.org/blender/blender/pulls/118241
The `object_to_world` and `world_to_object` matrices are set during
depsgraph evaluation, calculated from the object's animated location,
rotation, scale, parenting, and constraints. It's confusing and
unnecessary to store them with the original data in DNA.
This commit moves them to `ObjectRuntime` and moves the matrices to
use the C++ `float4x4` type, giving the potential for simplified code
using the C++ abstractions. The matrices are accessible with functions
on `Object` directly since they are used so commonly. Though for write
access, directly using the runtime struct is necessary.
The inverse `world_to_object` matrix is often calculated before it's
used, even though it's calculated as part of depsgraph evaluation.
Long term we might not want to store this in `ObjectRuntime` at all,
and just calculate it on demand. Or at least we should remove the
redundant calculations. That should be done separately though.
Pull Request: https://projects.blender.org/blender/blender/pulls/118210