After spending way too much time looking into the image handle code
because I assumed the issue is some interaction between OSL code using
a texture and the image system that's in SVM mode (which never happened
before the custom camera) with printf debugging because OptiX doesn't
work in debug builds, it turns out the issue was something else entirely:
C++ iterator bullshit.
Specifically, when we remove an entry from services->textures, this
invalidates the iterator, so the code restarts from the beginning.
However, the for-loop still increments the iterator *before* checking
the termination criterion.
If we remove the only element of the map, we:
- Set it = map.begin(), which equals map.end() since it's empty now
- Increment it at the end of the loop iteration
- Compare it == map.end(), which is wrong now since we're past the end
No idea how this didn't blow up sooner, none of this seems camera-specific??
Anyways, the fix is simple - only increment if we didn't restart.
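As a minimal sketch of the fixed loop shape (`std::map`, `TextureMap` and
`remove_unused` are stand-ins chosen for illustration, not the actual
Cycles code), the iterator is only advanced when nothing was removed:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the texture map; not the real Cycles type.
using TextureMap = std::map<std::string, int>;

// Removes every entry with a negative value, restarting from the beginning
// after each removal. Returns the number of removed entries.
inline int remove_unused(TextureMap &textures)
{
  const std::size_t initial_size = textures.size();
  int removed = 0;
  // No increment in the for-statement: it happens only in the else branch.
  for (auto it = textures.begin(); it != textures.end();) {
    if (it->second < 0) {
      textures.erase(it);
      // Restart. Do not increment here, begin() may already equal end().
      it = textures.begin();
      removed++;
    }
    else {
      ++it;
    }
  }
  assert(textures.size() + std::size_t(removed) == initial_size);
  return removed;
}
```

With a single removable entry, the map is empty after the restart and the
loop condition is checked against `end()` without an increment in between.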
Pull Request: https://projects.blender.org/blender/blender/pulls/141580
When launching Blender via blender-launcher.exe, the window briefly
displays incorrectly on startup when using the Vulkan backend. This is
caused by not properly handling the GHOST_kWindowStateFullScreen case.
Previously, even if the window state was set to fullscreen, nCmdShow
would default to SW_SHOWNORMAL or SW_SHOWNOACTIVATE. With this fix,
nCmdShow is explicitly set to SW_SHOWMAXIMIZED when the window is in
fullscreen state, preventing the flicker.
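The selection logic can be sketched roughly as follows. The `WindowState`
enum, the `activate` parameter and `choose_show_command` are hypothetical
names for illustration; the constants mirror the documented Win32 values
so the sketch builds without `<windows.h>`. Mapping the maximized state to
SW_SHOWMAXIMIZED is an assumption about the existing behavior.

```cpp
// Hypothetical stand-in for the GHOST window state enum.
enum class WindowState { Normal, Maximized, FullScreen };

// Local mirrors of the Win32 ShowWindow() command values.
constexpr int kShowNormal = 1;     // SW_SHOWNORMAL
constexpr int kShowMaximized = 3;  // SW_SHOWMAXIMIZED
constexpr int kShowNoActivate = 4; // SW_SHOWNOACTIVATE

// Picks the nCmdShow value for a new window. Fullscreen is handled
// explicitly instead of falling through to the normal/no-activate default.
constexpr int choose_show_command(WindowState state, bool activate)
{
  return (state == WindowState::FullScreen || state == WindowState::Maximized) ?
             kShowMaximized :
             (activate ? kShowNormal : kShowNoActivate);
}
```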
Pull Request: https://projects.blender.org/blender/blender/pulls/141518
Detected an incorrect structure type. A property struct was used to
store feature data. This could lead to incorrect values for enabling
descriptorBufferPushDescriptor, which isn't used.
Adding a Flow to a Mantaflow domain could sometimes cause a crash. This
was because the grids are allocated lazily, and as such, getting them
through `PyObject_GetAttrString` will fail when they are not yet
available. As the resulting Python errors were not cleared, Python could
be left in a bad state, leading to crashes. This is avoided by clearing
these errors before returning from `callPythonFunction` when such an
error is raised.
Ref !141364
This commit brings multi-monitor window positioning support to the macOS
GHOST backend. This fixes a plethora of issues with macOS window
creation and positioning, such as:
* Windows not being properly restored when loading a file with Load UI
* Users' default startup windows not being properly restored on multiple
screens
* Temporary windows (Settings, Render, Playblast, etc.) wrongly
appearing in unexpected places / other screens
* Duplicating an area into a new window (AKA popping out an editor) not
working on non-primary screens.
* etc.
Internally, this makes all macOS window coordinates relative to the
user's primary monitor, instead of being local to the currently focused
one. I have tested this to properly work using all sorts of multiple
screen arrangements, and can also confirm that restoring windows from
screens that do not exist anymore / are now out of bounds (due to being
unplugged or re-arranged) also works properly, in which case they get
snapped back to the closest available screen similarly to other backends.
This fixes issue #126410 and implements behavior described in TODO task #69819.
Pull Request: https://projects.blender.org/blender/blender/pulls/141159
TLAS wasn't being refreshed when empty.
This PR removes a spurious early-exit during BVH build that was preventing
the TLAS from being recreated when it was empty.
Pull Request: https://projects.blender.org/blender/blender/pulls/141215
GPU devices can only be selected in the user preferences if a suitable
device is available. This uses a dynamic enum and the items are not
always defined in RNA, so they need to be extracted manually using
`n_()`.
Also rephrase one message slightly to respect the style guide
("Don't" -> "Do not").
In addition, fix my mistake where an import was mixed up
(`pgettext_tip` was imported as `n_`).
Pull Request: https://projects.blender.org/blender/blender/pulls/141244
At the moment there are two main usability issues that make it hard to
recommend enabling HIP RT by default:
- Dramatically increased memory usage during BVH construction on
high poly meshes compared to BVH2 (#136174)
- This issue can be fixed by using the "balanced" HIP RT BVH, but
it requires a HIP RT update that won't make it into 4.5 (!136622)
- Many Blender and GPU driver crashes when modifying objects in the
viewport. #140763, #140738, #139013, #138043
Pull Request: https://projects.blender.org/blender/blender/pulls/140794
When the list of extensions is constructed for `vkCreateDevice`, it
uses a function that retrieves all extensions just to check whether
a specific extension is supported. However, there is already a
cached list that is the subset of the desired extensions that
are supported by the device.
This cleanup uses that list instead of querying all supported
extensions.
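A rough sketch of the idea, assuming the cached list is stored as strings
(`cached_enabled` and `to_create_info_names` are hypothetical names, not
the actual Blender code): the cached subset is forwarded directly as the
name array for device creation, with no second enumeration of all
extensions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Converts the cached "desired and supported" extension list into the
// array of C strings that a device create-info structure expects.
// The returned pointers stay valid only while `cached_enabled` is alive
// and unmodified.
inline std::vector<const char *> to_create_info_names(
    const std::vector<std::string> &cached_enabled)
{
  std::vector<const char *> names;
  names.reserve(cached_enabled.size());
  for (const std::string &name : cached_enabled) {
    names.push_back(name.c_str());
  }
  return names;
}
```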
Pull Request: https://projects.blender.org/blender/blender/pulls/141074
This PR fixes a validation error about the swapchain semaphores. When
swapchain maintenance 1 is supported, the semaphores can be reused, but
this requires a fence. We didn't implement the fence. This PR doesn't reuse
the semaphores as introducing the fence leads to more changes.
Pull Request: https://projects.blender.org/blender/blender/pulls/141066
The theme cursor size was ignored when setting custom cursors
such as the knife; only the DPI from GHOST was taken into account.
This meant cursors such as the knife would sometimes display too small.
Now when the theme-size is larger, a larger cursor will be used.
Currently the theme size is read from the XCURSOR_SIZE environment
variable; however, it may be read from the system preferences in the future.
Also fix the software cursor sizes, which incorrectly used the UI scale
preference even though cursor sizes ignore that preference.
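A minimal sketch of how the size could be chosen, assuming the larger of
the two values wins. All function names and the sanity limit of 1024 are
hypothetical and not taken from the GHOST code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Parses a cursor size string, returning `fallback` when the value is
// missing, not a plain positive integer, or implausibly large.
inline int parse_cursor_size(const char *value, int fallback)
{
  if (value == nullptr) {
    return fallback;
  }
  char *end = nullptr;
  const long size = std::strtol(value, &end, 10);
  if (end == value || *end != '\0' || size <= 0 || size > 1024) {
    return fallback;
  }
  return int(size);
}

// Reads the theme size from the XCURSOR_SIZE environment variable.
inline int theme_cursor_size(int fallback)
{
  return parse_cursor_size(std::getenv("XCURSOR_SIZE"), fallback);
}

// Uses the theme size when it is larger than the DPI-derived size.
inline int custom_cursor_size(int size_from_dpi, int theme_size)
{
  return std::max(size_from_dpi, theme_size);
}
```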
The distributed memory access toggle in Cycles preferences would show up
when a user has two GPUs that can access each other's memory, but only one
of them is supported by Cycles.
For example the AMD RX 5700XT and AMD Vega 64 can access each other's
memory, but only the 5700XT is supported by Cycles.
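The visibility check can be sketched as below. This is a simplification
that treats peer access as a per-device flag rather than a pairwise
relation, and `GpuInfo` and `show_distributed_memory_toggle` are
hypothetical names rather than the Cycles preferences code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified device description.
struct GpuInfo {
  bool supported_by_cycles;
  bool has_peer_memory;
};

// The toggle is only useful when at least two devices that Cycles can
// actually render with are able to access each other's memory.
inline bool show_distributed_memory_toggle(const std::vector<GpuInfo> &gpus)
{
  int usable_peers = 0;
  for (const GpuInfo &gpu : gpus) {
    if (gpu.supported_by_cycles && gpu.has_peer_memory) {
      usable_peers++;
    }
  }
  return usable_peers >= 2;
}
```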
Pull Request: https://projects.blender.org/blender/blender/pulls/140521
The performance of the sorted_paths_array kernel on B570 is problematic.
Relying on local sorting+partitioning instead gives a 25% overall rendering
speedup and no regression in shade_surface when rendering the Agent 327 Barbershop scene.
On Arc A770, it still gives a 2% speedup when rendering Barbershop.
Pull Request: https://projects.blender.org/blender/blender/pulls/140308
Device::const_copy_to is sometimes called when the Embree BVH has been freed
and not replaced yet. Previously this was a simpler pointer copy, now there is
a function call. Make sure it's just a pointer copy.
Thanks to Nikita Sirgienko for figuring this out.
Pull Request: https://projects.blender.org/blender/blender/pulls/140457
This reverts commit 23c762e388 in the
blender-v4.5-release branch to work around HIP compiler issues. It will
remain in the main branch.
Ref blender/blender#139836
This reverts commit 64dc9cc98c in the
blender-v4.5-release branch to work around HIP compiler issues. It will
remain in the main branch.
Ref blender/blender#139836
This reverts commit a6015e1411 in the
blender-v4.5-release branch to work around HIP compiler issues. It will
remain in the main branch.
Ref blender/blender#139836
This reverts commit 5abf42012d in the
blender-v4.5-release branch to work around HIP compiler issues. It will
remain in the main branch.
Ref blender/blender#139836
This reverts commit 0e7a696819 in the
blender-v4.5-release branch to work around HIP compiler issues. It will
remain in the main branch.
Ref blender/blender#139836
e934792169 introduced a workaround for
NVIDIA windows where NVIDIA drivers fail to allocate swapchain images
of minimized windows. The fix was to clamp the size to 1,1. With this
clamping, the AMD driver seems to tell Blender that the created swapchain
is suboptimal and needs to be recreated. This results in swapchains being
recreated over and over, as they are all considered suboptimal.
This PR limits the clamping to NVIDIA drivers only.
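A hedged sketch of the vendor-specific clamp follows. `Extent2D` and
`swapchain_extent` are illustrative names, and detecting NVIDIA through
the PCI vendor ID is an assumption; the actual code may identify the
driver differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Minimal stand-in for a 2D extent, to keep the sketch free of Vulkan headers.
struct Extent2D {
  uint32_t width;
  uint32_t height;
};

constexpr uint32_t kVendorNvidia = 0x10DE; // PCI vendor ID of NVIDIA.

// Clamps the extent to at least 1x1 on NVIDIA only. Other vendors get the
// requested extent unchanged, so their swapchain is not reported as
// suboptimal because of an artificial size.
inline Extent2D swapchain_extent(Extent2D requested, uint32_t vendor_id)
{
  if (vendor_id == kVendorNvidia) {
    requested.width = std::max<uint32_t>(requested.width, 1);
    requested.height = std::max<uint32_t>(requested.height, 1);
  }
  return requested;
}
```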
Pull Request: https://projects.blender.org/blender/blender/pulls/140112
Descriptor sets/pools are known to be troublesome as they don't match
how GPUs work, or how applications want to work, adding more complexity
than needed. This results in quite an overhead when allocating and
deallocating descriptor sets.
This PR will use descriptor buffers when they are available. Most platforms
support descriptor buffers. When not available descriptor pools/sets
will be used.
Although this is a feature, I would like to land it in 4.5 due to the API changes.
This makes it easier to fix issues when 4.5 is released.
The feature can easily be disabled by setting the feature to false if it has
too many problems.
Pull Request: https://projects.blender.org/blender/blender/pulls/138266
This requires a minimum driver version of 535, however most devices
were already requiring 570 due to the CUDA toolkit version.
The update is required to be able to use an API function for correct
stack size calculation.
Code for older API versions has been removed.
Fix #138185: OSL custom camera errors with OptiX
Pull Request: https://projects.blender.org/blender/blender/pulls/139801
This is required to make ray differentials work correctly for OSL custom
cameras.
But it also lets us simplify the implementation, and makes the OSL
functionality more complete, such as implementing all noise types.
Pull Request: https://projects.blender.org/blender/blender/pulls/138161
Keep around the dummy BVH for lights, even if it serves no purpose for now.
Previously I assumed it was not needed, but there is some device specific
code that assumes it exists, and not much point trying to refactor that now
when in the future we actually want to create a BVH for lights.
Pull Request: https://projects.blender.org/blender/blender/pulls/139798
With these changes, we can now mark devices which are expected to perform
as well as possible, and devices which were not optimized for some reason.
For example, because the device was released after the Blender release,
making it impossible for developers to optimize for devices in already
released unchangeable code. This is primarily relevant for the LTS versions,
which are supported for two years and require proper communication about
optimization status for the new devices released during this time.
This is implemented for oneAPI devices. Other device types currently are
marked as optimized for compatibility with old behavior, but may implement
the same in the future.
Pull Request: https://projects.blender.org/blender/blender/pulls/139751
calloc is generally faster than zeroing separately after a regular
allocation. Our allocator API exposed an allocation call with "calloc"
in the name that didn't actually use "calloc" because it had an
alignment argument (there is no standardized calloc-with-alignment
provided by the OS). However, we can still use calloc internally if
the alignment fits within the default. That just aligns the function
better with performance expectations.
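The approach can be sketched like this, assuming C++17 aligned
`operator new` as the fallback for larger alignments. The function names
are hypothetical and this is not the actual allocator implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Returns zero-initialized memory. Uses calloc when the requested alignment
// is already guaranteed by the default allocator, otherwise falls back to an
// aligned allocation followed by an explicit memset.
inline void *zeroed_alloc_aligned(std::size_t size, std::size_t alignment)
{
  // Alignment must be a power of two.
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  if (alignment <= alignof(std::max_align_t)) {
    return std::calloc(1, size);
  }
  void *ptr = ::operator new(size, std::align_val_t(alignment), std::nothrow);
  if (ptr != nullptr) {
    std::memset(ptr, 0, size);
  }
  return ptr;
}

// Frees memory from zeroed_alloc_aligned; must receive the same alignment
// so the matching deallocation function is used.
inline void zeroed_free_aligned(void *ptr, std::size_t alignment)
{
  if (alignment <= alignof(std::max_align_t)) {
    std::free(ptr);
  }
  else {
    ::operator delete(ptr, std::align_val_t(alignment));
  }
}
```

The caller has to pass the same alignment to both functions, which is the
price of mixing two allocation paths in this simplified form.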
Pull Request: https://projects.blender.org/blender/blender/pulls/139749
`xkb_compose_state_get_utf8` may return multiple characters. Since this
isn't supported, prevent a buffer overflow and report a warning.
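A minimal sketch of the guard, relying on the snprintf-like contract of
the xkbcommon call (it returns the number of bytes required, excluding
the NUL terminator). `compose_get_utf8_stub` is a stand-in so the example
builds without xkbcommon, and `compose_into` is a hypothetical helper;
checking the byte length is an assumption about how the real fix detects
unsupported results.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Stand-in with the same return contract as the real call: writes at most
// `size` bytes and returns the length the full string needs.
inline int compose_get_utf8_stub(const char *composed, char *buffer, std::size_t size)
{
  return std::snprintf(buffer, size, "%s", composed);
}

// Copies the composed text into `out` only when it fits including the
// terminator. Otherwise clears `out`, reports a warning and returns false.
inline bool compose_into(const char *composed, char *out, std::size_t out_size)
{
  // First query the required length without writing anything.
  const int required = compose_get_utf8_stub(composed, nullptr, 0);
  if (required < 0 || std::size_t(required) >= out_size) {
    std::fprintf(stderr, "Compose result needs %d bytes, ignoring\n", required);
    if (out_size > 0) {
      out[0] = '\0';
    }
    return false;
  }
  compose_get_utf8_stub(composed, out, out_size);
  return true;
}
```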
Co-authored-by: Phoenix Katsch <phoenixkatsch@gmail.com>
Ref: !114612