Since ee1b2f53cc the ffmpeg libraries for Windows x64 are built effectively
without CPU specific SIMD optimizations. `--arch=x64` is not an architecture
that ffmpeg configure understands, so it falls back to the "nothing is known,
turn any architecture specific bits off" code path.
Pull Request: https://projects.blender.org/blender/blender/pulls/126396
When tracing a shadow ray through a volume and no hit is registered, we
consider the whole ray segment to be inside the volume.
However, no hit being registered could also happen when the volume is
invisible to shadow rays. We should explicitly check this case and skip
rendering the volume segment instead.
Pull Request: https://projects.blender.org/blender/blender/pulls/126139
This PR implements #126353. In short: keep the discard list as part of the swap chain images. This allows
better determination of when resources are actually no longer in use.
## Resource pool
Resource pools keep track of the resources for a swap chain image.
In Blender this is a bit more complicated due to the way GPUContexts work. A single thread can have
multiple contexts. Some of them have a swap chain (GHOST Window), others don't (draw manager). The
resource pool should be shared between the contexts running on the same thread.
When opening multiple windows there are also multiple swap chains to consider.
### Discard pile
Resource handles that are deleted are stored in the discard pile. When we are sure that these
resources are no longer used by the GPU, they are destroyed.
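
As a rough illustration of the idea, the sketch below keeps one discard pile per swap chain image and empties it when that image comes around again. All names here (`ResourcePool`, `SwapChainResourcePools`, `begin_image`) are hypothetical and not the backend's actual classes, and the sketch assumes the caller has already waited for the GPU work of the image's previous use.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not the Vulkan backend's real classes.
struct ResourcePool {
  // Destroy callbacks of handles deleted while this swap chain image was current.
  std::vector<std::function<void()>> discard_pile;
};

class SwapChainResourcePools {
 public:
  explicit SwapChainResourcePools(std::size_t image_count) : pools_(image_count) {}

  // Queue a deleted handle on the pool of the image that is currently in use.
  void discard(std::function<void()> destroy)
  {
    pools_[current_].discard_pile.push_back(std::move(destroy));
  }

  // Call when swap chain image `index` is acquired again. Assumes the caller
  // has already waited for the GPU work of the previous use of that image,
  // so everything on its discard pile can be destroyed safely.
  // Returns the number of destroyed resources.
  std::size_t begin_image(std::size_t index)
  {
    current_ = index;
    std::vector<std::function<void()>> pile = std::move(pools_[index].discard_pile);
    pools_[index].discard_pile.clear();
    for (std::function<void()> &destroy : pile) {
      destroy();
    }
    return pile.size();
  }

 private:
  std::vector<ResourcePool> pools_;
  std::size_t current_ = 0;
};
```

In this sketch a handle discarded while image 0 is current survives the use of image 1 and is only destroyed once image 0 is acquired again.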
### Reusable resources
There are other resources as well, such as:
- Descriptor sets
- Descriptor pools
## Open issues
There are some limitations that require future PRs to fix including:
- Background rendering
- Handling multiple windows
- Improve CPU/GPU synchronization
- Reuse staging buffers
Pull Request: https://projects.blender.org/blender/blender/pulls/126353
This only impacted configurations that don't include large cursor sizes,
which aren't so common.
However, when it does happen the cursors are small enough that they're
difficult to see.
Hair objects did not take the curves into account that could go
outside the bounds set by the keys of the curves. These bounds
are used in the dynamic bvh, leading to clipped curves in the
viewport.
Pull Request: https://projects.blender.org/blender/blender/pulls/126157
OptiX has accepted Catmull-Rom curve data natively since OptiX 7.4, but due to the previous conversion to B-Spline code, the format in which that data is fed to OptiX wasn't optimal.
Each curve segment was put in the vertex buffer as four independent control points, even though continuous segments actually share control points between each other. This patch compacts that so shared control points only occur once in the vertex buffer.
This compact form uses less memory and also allows OptiX to easily identify segments that belong together into a curve (those where the step between indices is one).
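
A minimal sketch of the compact layout, not the actual Cycles code: each curve's control points are written once, and every segment only stores the index of its first control point. The `float3` and `CompactCurves` types and the duplication of the end keys as phantom points are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <vector>

struct float3 {
  float x, y, z;
};

// Hypothetical container for the compact layout.
struct CompactCurves {
  std::vector<float3> vertices;           // shared control points
  std::vector<unsigned> segment_indices;  // first of the four control points per segment
};

CompactCurves compact_curves(const std::vector<std::vector<float3>> &curves)
{
  CompactCurves result;
  for (const std::vector<float3> &keys : curves) {
    if (keys.size() < 2) {
      continue;  // no segment to build
    }
    const unsigned base = unsigned(result.vertices.size());
    // Assumption: the end keys are duplicated so the first and last segment
    // also have four control points.
    result.vertices.push_back(keys.front());
    result.vertices.insert(result.vertices.end(), keys.begin(), keys.end());
    result.vertices.push_back(keys.back());
    // Segment i uses vertices [base + i, base + i + 3], so consecutive
    // segments of one curve have indices that differ by exactly one.
    const unsigned num_segments = unsigned(keys.size() - 1);
    for (unsigned i = 0; i < num_segments; i++) {
      result.segment_indices.push_back(base + i);
    }
  }
  return result;
}
```

For a curve with four keys this stores 6 vertices instead of the 12 needed when each of the three segments carries its own four control points.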
Pull Request: https://projects.blender.org/blender/blender/pulls/125899
On X11 windowing systems there's no `MessageBox` prompt like on
win32 that could tell users that they have an unsupported GPU.
This confuses users, who typically don't open Blender from a
command line and therefore never see any of the messages.
This patch utilizes `system->showMessageBox` to display
a GUI message box telling the user that the GPU is unsupported.
Pull Request: https://projects.blender.org/blender/blender/pulls/126220
Overload resolution must have changed and is causing issues for one
particular code path attempting to use `isfinite(ccl::uchar)`.
Compiler output attached.
It turns out that the code in question can be simplified to just remove
the ambiguity because only the float codepath wants to check for finite
values.
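
One way to keep `isfinite` away from integer types can be sketched as follows. The helper name `pixel_is_finite` is hypothetical and this is not the actual change; it only illustrates confining the check to the float path.

```cpp
#include <cmath>

// Hypothetical helper: integer pixel types cannot hold Inf or NaN, so the
// generic version never calls isfinite() and no overload has to be resolved.
template<typename T> inline bool pixel_is_finite(T /*value*/)
{
  return true;
}

// Only the float code path performs the actual check.
template<> inline bool pixel_is_finite<float>(float value)
{
  return std::isfinite(value);
}
```

With this shape, a call with an `unsigned char` argument never reaches `std::isfinite`, so there is no ambiguity for the compiler to resolve.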
----
Reduced repro: https://godbolt.org/z/YWz3Yc3x8
Pull Request: https://projects.blender.org/blender/blender/pulls/125348
This patch improves the isotropic Gabor noise UI controls such that
variations happen in both directions of the base orientation, as opposed
to being biased in the positive direction only.
Thanks to Charlie Jolly for suggesting this improvement.
This patch optimizes the Gabor noise standard deviation estimation by
computing the upper limit of the integral as the frequency approaches
infinity, since the integral is mostly constant for the relevant
frequency range. The limits are 0.25 for the 2D case and 1 / (4 * sqrt(2))
for the 3D case.
This also improves normalization for low frequencies, possibly due to
the effect of windowing.
Thanks to Charlie Jolly for spotting the optimization.
Optimize the Gabor noise texture code with an early exit for points that
are further away from the kernel center. This was already done for the
kernel, but is now being done earlier before computing the weight, so
its computation is now skipped.
Thanks to Charlie Jolly for the suggestion.
Fixes missing intersections on straight 3D curves with the
Metal backend, with BVH2.
This issue could have manifested on other devices, but didn't seem to
in practice.
Pull Request: https://projects.blender.org/blender/blender/pulls/126197
This gets Windows ARM64 to compile with clang-cl, which gives up to 40% performance improvements in certain scenes rendered with Cycles, compared to MSVC.
This is all tested using LLVM 18.1.8 and a VS2022 `vcvarsall` window.
Subsequent PRs with various lib version updates, etc. will go in at a later point.
Pull Request: https://projects.blender.org/blender/blender/pulls/124182
The GPU packed state is a static check from the Cycles core perspective,
and it is disabled for non-Apple Silicon GPUs. However, the Metal kernel
always used the packed integrator state.
This change makes it so the Host and Device side checks for the Host CPU
are aligned, and that Device-side packed state check does not differ from
the Host side.
Pull Request: https://projects.blender.org/blender/blender/pulls/126082
Fixes #91369, where a pointer to a deleted btTypedConstraint instance
was being dereferenced, causing a crash.
What was happening was:
1. When the animation starts the first time, BKE_rigidbody_rebuild_sim
eventually calls btDiscreteDynamicsWorld::addConstraint, which in turn
will store a pointer to the btTypedConstraint in
btRigidBody::m_constraintRefs for each body in the constraint
2. When undoing, the btDynamicsWorld is deleted, then the Object
containing the btTypedConstraint (taking the btTypedConstraint with
it) - however, the pointer to the btTypedConstraint is still in the
btRigidBody!
3. When playing the animation a second time, rigidbody_update_simulation
will rebuild the simulation, which causes RB_body_delete to be called,
which iterates over all the body's m_constraintRefs and dereferences
the deleted pointer.
Co-authored-by: Eoin Mcloughlin <hkeoin@eoinrul.es>
Pull Request: https://projects.blender.org/blender/blender/pulls/126079
The issue was caused by an attempt to write a buffer pass which is
actually supposed to be calculated as part of compositing (either summing
direct/indirect lights, optionally dividing by albedo).
The fact that the crash was only observed on Metal is a lucky
coincidence: it just so happened that writing at offset
-1 to the render buffer did not trigger obvious issues elsewhere.
Pull Request: https://projects.blender.org/blender/blender/pulls/126057
This decreases BSDF_ROUGHNESS_SQ_THRESH so that the microfacet
roughness has a cutoff at much lower values and fixes a precision
issue in the bsdf_sample code that prevented this previously.
Pull Request: https://projects.blender.org/blender/blender/pulls/125919
I ran into this in a test scene - somehow the normalization here can result
in NaN (so presumably a zero vector). I don't think this has a notable
performance impact from some basic tests.
Pull Request: https://projects.blender.org/blender/blender/pulls/125930
The code snippet is supposed to compute the maximal `isect.t` in the
array, which is used to determine if subsequent intersections should be
added.
However, the previous implementation includes the old `isect.t` which is
going to be replaced, resulting in an overestimation of `tmax_hits` and
thus missing closer intersections.
For BVH2, the issue is fixed by computing the `max_t` after a new entry
is inserted.
For Embree, the issue is fixed by finding the `second_largest_t` as well, and
comparing that with the new insertion to find the new `max_t`.
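
The BVH2 variant of the fix can be sketched as below. This is a simplified stand-in, not the kernel code: `ClosestHits` and `record` are hypothetical names, and only the `t` values of the intersections are stored.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical recorder that keeps the `capacity` closest hits of a ray.
struct ClosestHits {
  std::size_t capacity;
  float max_t;  // hits at or beyond this distance are rejected
  std::vector<float> t_values;

  ClosestHits(std::size_t capacity, float ray_tmax) : capacity(capacity), max_t(ray_tmax) {}

  // Returns true when the hit was stored.
  bool record(float t)
  {
    if (capacity == 0 || t >= max_t) {
      return false;
    }
    if (t_values.size() < capacity) {
      t_values.push_back(t);
    }
    else {
      // Array is full: overwrite the farthest recorded hit.
      *std::max_element(t_values.begin(), t_values.end()) = t;
    }
    if (t_values.size() == capacity) {
      // Compute the maximum after the insertion, so the replaced entry no
      // longer contributes to it.
      max_t = *std::max_element(t_values.begin(), t_values.end());
    }
    return true;
  }
};
```

With a capacity of 2 and hits recorded at 5, 3 and 1, the limit ends up at 3. Taking the maximum before the replacement would leave it at 5, which is the overestimation described above.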
Pull Request: https://projects.blender.org/blender/blender/pulls/125739
Fixes a crash that could occur when motion blur was on and the scene
contained a deforming mesh with deformable motion blur turned on,
with BVH time steps set >0.
Render results in my test scene appear to match CPU Embree.
Pull Request: https://projects.blender.org/blender/blender/pulls/125854
A phase function is normalized over the sphere, so it is
incorrect to sum two phase functions together when evaluating for NEE.
It should be a weighted sum with normalized weights, which, according to
`volume_shader_phase_pick()`, is `sample_weight / sum_sample_weight`.
Also corrects an error in `volume_shader_phase_pick()`.
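
A small sketch of the weighted sum, assuming Henyey-Greenstein as a stand-in phase function. The `PhaseClosure` struct and `eval_phase_mix` are hypothetical names for this illustration and not the kernel's data layout.

```cpp
#include <cmath>
#include <vector>

// Hypothetical closure: a sampling weight plus the anisotropy of the phase function.
struct PhaseClosure {
  float sample_weight;
  float g;
};

// Henyey-Greenstein phase function, normalized over the sphere.
inline float henyey_greenstein(float cos_theta, float g)
{
  const float pi = 3.14159265358979f;
  const float denom = 1.0f + g * g - 2.0f * g * cos_theta;
  return (1.0f - g * g) / (4.0f * pi * denom * std::sqrt(denom));
}

// Evaluate the mix as a weighted sum with weights
// sample_weight / sum_sample_weight, so the result stays normalized.
inline float eval_phase_mix(const std::vector<PhaseClosure> &closures, float cos_theta)
{
  float sum_sample_weight = 0.0f;
  for (const PhaseClosure &closure : closures) {
    sum_sample_weight += closure.sample_weight;
  }
  if (sum_sample_weight <= 0.0f) {
    return 0.0f;
  }
  float result = 0.0f;
  for (const PhaseClosure &closure : closures) {
    result += (closure.sample_weight / sum_sample_weight) *
              henyey_greenstein(cos_theta, closure.g);
  }
  return result;
}
```

For two isotropic closures (`g = 0`) the weighted sum gives 1 / (4 * pi), the same value as a single isotropic phase function, whereas a plain sum would give twice that.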
Fix a NaN when rendering glossy materials that can appear due to a
division by zero in bsdf_D when rendering materials with low roughness.
Thank you to Weizhen for the fix after my incorrect
first attempt.
Pull Request: https://projects.blender.org/blender/blender/pulls/125756
Align Cycles SVM and EEVEE's rendering of the vector math node
in reflect mode with OSL when the normal vector is 0,0,0.
This is done by using safe_normalize rather than normalize on the
normal vector, which also fixes a NaN in the reflect mode in this
specific configuration.
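
The difference can be sketched with a self-contained example. The `float3` type and the helpers below are simplified stand-ins written for this illustration, not the Cycles or EEVEE implementations.

```cpp
#include <cmath>

struct float3 {
  float x, y, z;
};

inline float dot(const float3 &a, const float3 &b)
{
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Returns the zero vector instead of dividing by a zero length.
inline float3 safe_normalize(const float3 &v)
{
  const float len = std::sqrt(dot(v, v));
  if (len == 0.0f) {
    return float3{0.0f, 0.0f, 0.0f};
  }
  return float3{v.x / len, v.y / len, v.z / len};
}

// Reflect `incident` around `normal`; the normal is normalized safely first,
// so a 0,0,0 normal leaves the incident vector unchanged rather than
// producing NaN.
inline float3 vector_math_reflect(const float3 &incident, const float3 &normal)
{
  const float3 n = safe_normalize(normal);
  const float d = 2.0f * dot(n, incident);
  return float3{incident.x - d * n.x, incident.y - d * n.y, incident.z - d * n.z};
}
```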
Pull Request: https://projects.blender.org/blender/blender/pulls/125688
This type of projection is often used e.g. in exhibitions that leverage big
curved screens.
Effectively, the frame is mapped onto a cylinder, with the x axis becoming the
longitude and y axis becoming the height.
Users can configure the min/max longitude, the min/max height and the radius of
the cylinder.
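
As a rough sketch of the mapping, under assumed conventions: the frame coordinates `u` and `v` are normalized to [0, 1], the cylinder axis is Z, and the camera sits at the origin. The struct, the function name and the axis choice are assumptions for illustration and may differ from the actual implementation.

```cpp
#include <cmath>

struct float3 {
  float x, y, z;
};

// Hypothetical parameter block mirroring the user-facing settings.
struct CylinderParams {
  float longitude_min, longitude_max;  // radians
  float height_min, height_max;
  float radius;  // assumed to be greater than zero
};

// Map normalized frame coordinates to a ray direction through the cylinder.
inline float3 cylinder_ray_direction(const CylinderParams &params, float u, float v)
{
  // x axis of the frame becomes the longitude, y axis becomes the height.
  const float longitude = params.longitude_min +
                          u * (params.longitude_max - params.longitude_min);
  const float height = params.height_min + v * (params.height_max - params.height_min);
  // Point on the cylinder surface, seen from a camera at the origin.
  const float3 point = {params.radius * std::cos(longitude),
                        params.radius * std::sin(longitude),
                        height};
  const float len = std::sqrt(point.x * point.x + point.y * point.y + point.z * point.z);
  return float3{point.x / len, point.y / len, point.z / len};
}
```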
Co-authored-by: Lukas Stockner <lukas.stockner@freenet.de>
Pull Request: https://projects.blender.org/blender/blender/pulls/123046
Add a new API to store data that is guaranteed to not be freed
before the memleak detector has run.
This will be used in the next commit by the readfile code to improve
reporting on leaks from the blendfile reading process.
This is done by a two-layer approach:
A new templated `MEM_construct_leak_detection_data` allows creating
any type of data. Its ownership and lifetime are handled
internally, and guaranteed to not be destroyed before the memleak
detector has run.
Add a new template-based 'allocation string storage' system to
`intern/memutil`. This uses the new `Guardedalloc Persistent Storage`
system to store all 'complex' allocation messages, that cannot be
defined as literals.
Internally, the storage is done through an owning reference (a
`shared_ptr`) of the created data into a mutex-protected static
vector.
`MEM_init_memleak_detection` code ensures that this static storage
is created before the memleak detection data, so that it is destructed
after the memleak detector has run.
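
The storage mechanism described above can be sketched roughly as follows. The function names are hypothetical and this is not the guardedalloc code; it only shows the pattern of an owning `shared_ptr` kept in a mutex-protected static vector.

```cpp
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical accessors; function-local statics are created on first use.
inline std::mutex &persistent_storage_mutex()
{
  static std::mutex mutex;
  return mutex;
}

inline std::vector<std::shared_ptr<void>> &persistent_storage()
{
  static std::vector<std::shared_ptr<void>> storage;
  return storage;
}

// Construct data of any type. The static vector holds an owning reference,
// so the data lives until the vector itself is destructed at program exit.
template<typename T, typename... Args> T *construct_persistent_data(Args &&...args)
{
  std::shared_ptr<T> data = std::make_shared<T>(std::forward<Args>(args)...);
  std::lock_guard<std::mutex> lock(persistent_storage_mutex());
  persistent_storage().push_back(data);
  return data.get();
}
```

Static objects are destructed in the reverse order of their construction, which is why the storage has to be created before the leak detector for the data to outlive it.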
The main container (`AllocStringStorageContainer`) is wrapping a
map of `{string -> AllocStringStorage<key_type, hash_type>}`.
The key is a storage identifier.
Each storage is also a map wrapped into a simple templated API
class (`AllocStringStorage`), where the values are the alloc strings,
and the key type is defined by the user code.
Pull Request: https://projects.blender.org/blender/blender/pulls/125320