TLAS wasn't being refreshed when empty.
This PR removes a spurious early-exit during BVH build that was preventing
the TLAS from being recreated when it was empty.
Pull Request: https://projects.blender.org/blender/blender/pulls/141215
GPU devices can only be selected in the user preferences if a suitable
device is available. This uses a dynamic enum and the items are not
always defined in RNA, so they need to be extracted manually using
`n_()`.
Also rephrase one message slightly to respect the style guide
("Don't" -> "Do not").
In addition, fix my mistake where an import was mixed up
(`pgettext_tip` was imported as `n_`).
Pull Request: https://projects.blender.org/blender/blender/pulls/141244
At the moment there are two main usability issues that make it hard to
recommend to enable HIP RT by default:
- Dramatically increased memory usage during BVH construction on
high poly meshes compared to BVH2 (#136174)
- This issue can be fixed by using the "balanced" HIP RT BVH, but
it requires a HIP RT update that won't make it into 4.5 (!136622)
- Many Blender and GPU driver crashes when modifying objects in the
viewport. #140763, #140738, #139013, #138043
Pull Request: https://projects.blender.org/blender/blender/pulls/140794
When the list of extensions is constructed for `vkCreateDevice` it
uses a function that retrieves all extensions just to iterate to
check a specific extension is supported. However there is already a
list cached that is that is the subset of the desired extensions that
are supported by the device.
This cleanup will use that list instead of requiring all supported
extensions.
Pull Request: https://projects.blender.org/blender/blender/pulls/141074
This PR fixes a validation error about the swapchain semaphores. When
swapchain maintenance 1 is supported the semaphores can be reused, but
requires a fence. We didn't implement the fence. This PR doesn't reuse
the semaphores as introducing the fence leads to more changes.
Pull Request: https://projects.blender.org/blender/blender/pulls/141066
The theme cursor size was ignored when setting custom cursors
such as the knife, only the DPI from GHOST was taken into account.
This meant cursors such as the knife would sometimes display too small.
Now when the theme-size is larger, a larger cursor will be used.
Currently the theme size is read from XCURSOR_SIZE environment variable
however it may be read from the system preferences in the future.
Also fix the software cursor sizes which incorrectly used the UI scale
preference which is ignored by cursor sizes.
The distributed memory access toggle in Cycles preferences would show up
when a user has two GPUs that can access each other's memory, but only one
of them is supported by Cycles.
For example the AMD RX 5700XT and AMD Vega 64 can access each other's
memory, but only the 5700XT is supported by Cycles.
Pull Request: https://projects.blender.org/blender/blender/pulls/140521
The performance of the sorted_paths_array kernel on B570 is problematic.
Relying on local sorting+partitioning instead gives a 25% overall rendering
speedup and no regression in shade_surface when rendering Agent 327 Barbershop scene.
On Arc A770, it still gives a 2% speedup when rendering Barbershop.
Pull Request: https://projects.blender.org/blender/blender/pulls/140308
Device::const_copy_to is sometimes called when the Embree BVH has been freed
and not replaced yet. Previously this was a simpler pointer copy, now there is
a function call. Make sure it's just a function copy.
Thanks to Nikita Sirgienko for figuring this out.
Pull Request: https://projects.blender.org/blender/blender/pulls/140457
This reverts commit 23c762e388 in the
blender-v4.5-release branch to work around HIP compiler issues. It will
remain in the main branch.
Ref blender/blender#139836
This reverts commit 64dc9cc98c in the
blender-v4.5-release branch to work around HIP compiler issues. It will
remain in the main branch.
Ref blender/blender#139836
This reverts commit a6015e1411 in the
blender-v4.5-release branch to work around HIP compiler issues. It will
remain in the main branch.
Ref blender/blender#139836
This reverts commit 5abf42012d in the
blender-v4.5-release branch to work around HIP compiler issues. It will
remain in the main branch.
Ref blender/blender#139836
This reverts commit 0e7a696819 in the
blender-v4.5-release branch to work around HIP compiler issues. It will
remain in the main branch.
Ref blender/blender#139836
e934792169 introduced a workaround for
NVIDIA windows where NVIDIA drivers fail to allocate swapchain images
of minimized windows. The fix was to clamp it to 1,1. With this clamping
AMD driver seems to tell Blender that the created swapchain is suboptimal,
and needs to be recreated. This results in over and over creation of
swapchains as they are all considered sub-optimal.
This PR limits the clamping to NVIDIA drivers only.
Pull Request: https://projects.blender.org/blender/blender/pulls/140112
Descriptor sets/pools are known to be troublesome as it doesn't match
how GPUs work, or how application want to work, adding more complexity
than needed. This results is quite an overhead allocating and
deallocating descriptor sets.
This PR will use descriptor buffers when they are available. Most platforms
support descriptor buffers. When not available descriptor pools/sets
will be used.
Although this is a feature I would like to land it in 4.5 due to the API changes.
This makes it easier to fix issues when 4.5 is released.
The feature can easily be disabled by setting the feature to false if it has
to many problems.
Pull Request: https://projects.blender.org/blender/blender/pulls/138266
This requires a minimum driver version of 535, however most devices
were already requiring 570 due to the CUDA toolkit version.
The update is required to be able to use an API function for correct
stack size calculation.
Code for older API versions has been removed.
Fix#138185: OSL custom camera errors with OptiX
Pull Request: https://projects.blender.org/blender/blender/pulls/139801
This is required to make ray differentials work correctly for OSL custom
cameras.
But it also lets us simplify the implementation, and makes the OSL
functionality more complete, such as implementing all noise types.
Pull Request: https://projects.blender.org/blender/blender/pulls/138161
Keep around the dummy BVH for lights, even if it serves no purpose for now.
Previously I assumed it was not needed, but there is some device specific
code that assumes it exists, and not much point trying to refactor that now
when in the future we actually want to create a BVH for lights.
Pull Request: https://projects.blender.org/blender/blender/pulls/139798
With these changes, we can now mark devices which are expected to work as
performant as possible, and devices which were not optimized for some reason.
For example, because the device was released after the Blender release,
making it impossible for developers to optimize for devices in already
released unchangeable code. This is primarily relevant for the LTS versions,
which are supported for two years and require proper communication about
optimization status for the new devices released during this time.
This is implemented for oneAPI devices. Other device types currently are
marked as optimized for compatibility with old behavior, but may implement
the same in the future.
Pull Request: https://projects.blender.org/blender/blender/pulls/139751
calloc is generally faster than zeroing separately after a regular
allocation. Our allocator API exposed an allocation call with "calloc"
in the name that didn't actually use "calloc" because it had an
alignment argument (there is no standardized calloc-with-alignment
provided by the OS). However, we can still use calloc internally if
the alignment fits within the default. That just aligns the function
better with performance expectations.
Pull Request: https://projects.blender.org/blender/blender/pulls/139749
`xkb_compose_state_get_utf8` may return multiple characters, while
this isn't supported, prevent a buffer overflow & report a warning.
Co-authored-by: Phoenix Katsch <phoenixkatsch@gmail.com>
Ref: !114612
e.g. stands for "exempli gratia" in Latin which means "for example".
The best way to make sure it makes sense when writing is to just expand
it to "for example". In these cases where the text was "for e.g.", that
leaves us with "for for example" which makes no sense. This commit fixes
all 110 cases, mostly just just replacing the words with "for example",
but also restructuring the text a bit more in a few cases, mostly by
moving "e.g." to the beginning of a list in parentheses.
Pull Request: https://projects.blender.org/blender/blender/pulls/139596
Several small speedups for Voronoi node (no behavior change). This
affects Cycles and CPU execution of Voronoi node e.g. in Compositor.
- F1 mode: when evaluating distance for Voronoi cells, use a faster
distance estimation, and only do final distance calculation on the
resulting closest cell. This is only really relevant for the default
Euclidian distance, where this saves a square root per evaluated cell
(in 3D Voronoi case saves 26 square roots; in 4D case saves 80 square
roots).
- N-Sphere Radius mode: speedup by doing squared distance calculations.
We only need to find the closest one, so again doing the square root
per cell is not needed here.
Something like 5%-10% speedup for F1 3D Voronoi; more performance details
in the PR.
Pull Request: https://projects.blender.org/blender/blender/pulls/139490
This started with investigating a render issue that appears to be caused by
GCC 15. From what I can tell, it was caused by
`*viewplane = (*viewplane) * bcam->zoom;`.
I'm not entirely sure what the root cause is (potentially pointer aliasing?),
but the restructured code works fine now.
Pull Request: https://projects.blender.org/blender/blender/pulls/139416