Commit Graph

14886 Commits

Author SHA1 Message Date
Jeroen Bakker
3da222cb9a Vulkan/OpenXR: Direct3D Bridge
Some OpenXR platforms do not support OpenGL or Vulkan. To support these
platforms we use a bridge. Blender still renders in OpenGL/Vulkan, but
will copy the render result into a D3D11 swapchain.

OpenGL doesn this by importing the D3D11 swapchain into the OpenGL
context and perfor OpenGL calls to update the swapchain. However for
vulkan that could lead to construct 3 context for OpenXR

- Blender GPU Context
- OpenXR D3D Context
- New context that imports the Blender render result and the OpenXR
  Swapchain image and copies them.

Due to Direct3D limitations importing into a vulkan context has known
issues (driver + extensions). Secondly we are not sure if we are running
on the same device as the OpenXR swapchain. The solution provided with
this PR is to only support CPU data transfers.

**SteamVR using d3d bridge**

SteamVR normally would use the Vulkan binding. But by changing the binding
priority in code you can make it select the D3D bridge.

<img width="1518" alt="Screenshot 2025-04-10 114534.png" src="attachments/f856bb2b-9ad5-4bb2-9cfd-a1412da9edd1">

It has been tested and validated to work using Mixed reality portal as well.

Pull Request: https://projects.blender.org/blender/blender/pulls/137264
2025-04-10 16:15:27 +02:00
Brecht Van Lommel
7aaa43b557 Revert "Fix: Build failures when using path with spaces on macOS"
This reverts commit be63ebd961.

This doesn't work well with MSBuild, semicolons get escaped even in
verbatim mode.
2025-04-10 13:04:34 +02:00
Jeroen Bakker
969e70fed8 Fix: Vulkan: Unable to start on WoA
Current Windows on ARM GPUs don't support
extenral memory. External memory is required
for OpenXR. So most likely OpenXR will not work
on these devices.

Most (read all) OpenXR platforms that support
vulkan also require external memory. So might
just be that those platforms won't work at all
on these devices.

In any case when not supported, the GHOST
OpenXR platform will use CPU for data transfer.
2025-04-10 10:15:23 +02:00
Jeroen Bakker
ae8665b8b2 Cleanup: Missing pragma once 2025-04-10 09:36:19 +02:00
Jeroen Bakker
6f6df389a1 Cleanup: OpenXR: Move Direct3D to its own compile unit
_No response_

Pull Request: https://projects.blender.org/blender/blender/pulls/137255
2025-04-10 09:32:48 +02:00
Jeroen Bakker
07a9306bb4 Cleanup: Vulkan/OpenXR: Pass Blender context via constructor
Similar to Direct3D.

Pull Request: https://projects.blender.org/blender/blender/pulls/137254
2025-04-10 08:51:20 +02:00
Jeroen Bakker
a564a27c1f Cleanup: OpenXR: Introduce a Direct3D base class.
For the Vulkan/OpenXR code will be shared with the OpenGL-Direct3D
bridge. This cleanup separates the OpenGL specifics in its own class.

Pull Request: https://projects.blender.org/blender/blender/pulls/137252
2025-04-10 08:50:52 +02:00
Campbell Barton
b2dbfa7d77 Cleanup: spelling in comments, use doxygen comments 2025-04-10 13:02:29 +10:00
Sergey Sharybin
30b962b3d8 Cycles: Optimize 3d and 4d noise
The goal is to reduce the affect of the fmod() used in the noise code,
which was initially reported in the comment:

    https://projects.blender.org/blender/blender/pulls/119884#issuecomment-1258902

Basic idea is to benefit from SIMD vectorization on CPU.

Tested on Linux i9-11900K and macOS on M2 Ultra, in both cases performance
after this change is very close to what it could be with the fmod() commented
out (the call itself, `p = p + precision_correction`).

On macOS the penalty of fmod() was about 10%, on Linux it was closer to 30%
when built with GCC-13. With Linux builds from the buildbot it is more like 18%.

The optimization is only done for 3d and 4d noise. It might be possible to
gain some performance improvement for 1d and 2d cases, but the approach would
need to be different: we'd need to optimize scalar version fmodf(). Maybe
tricks with integer cast will be faster (since we are a bit optimistic in the
kernel and do not guarantee exact behavior in extreme cases such as NaN inputs).

Pull Request: https://projects.blender.org/blender/blender/pulls/137109
2025-04-09 13:40:10 +02:00
Campbell Barton
bf03a2684b Fix: error with CMake checking the wrong WEBP variable 2025-04-09 10:21:37 +10:00
Sergey Sharybin
5b0ed683a0 Cycles: Make select() and mask() for vectorized float work on CPU and GPU
Pull Request: https://projects.blender.org/blender/blender/pulls/137148
2025-04-08 17:04:18 +02:00
Jeroen Bakker
7ecacbc3e6 Vulkan/OpenXR: Support VK_KHR_external_memory_win32
This PR add support to use a win32 handle to perform share render
result with the OpenXR vulkan instance. This is only possible when
the GPU matches. Otherwise a CPU roundtrip will be performed.

Pull Request: https://projects.blender.org/blender/blender/pulls/137093
2025-04-08 15:21:55 +02:00
Brecht Van Lommel
6db2c6b864 Revert: Part of "Fix: Build failures when using path with spaces"
Commit be63ebd961

This is causing issues with CUDA kernel compilation in some setups, even
though the builbot is ok. Since this isn't yet working for oneAPI anyway,
revert all the changes to Cycles kernel compilation for now.
2025-04-08 14:55:03 +02:00
Jeroen Bakker
22ae59d28d Vulkan: Include Win32 extensions definitions
Includes win32 specific extensions definitions when including
`vk_common.hh`. Inside `gpu_context.cc` vulkan needs to be
included before opengl, otherwise windows 10 builders will
report a warning.

```
[6421/7520] Building CXX object source\blender\gpu\CMakeFiles\bf_gpu.dir\intern\gpu_context.cc.obj
C:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared\minwindef.h(130): warning C4005: 'APIENTRY': macro redefinition
C:\Users\blender\git\blender-vexp\blender.git\lib\windows_x64\epoxy\include\epoxy/gl.h(59): note: see previous definition of 'APIENTRY'
```

Pull Request: https://projects.blender.org/blender/blender/pulls/137134
2025-04-08 14:10:01 +02:00
Jeroen Bakker
70995180ed Fix: Vulkan/OpenXR: Incorrect swapchain image layout
In the OpenXR/Vulkan specs it is mentioned that the swapchain layout
should be set to optimal color attachment. This wasn't the case and
could lead to validation errors.

Also fixes a memory size validation error. Not concerning as it was
abount already backed memory.

Pull Request: https://projects.blender.org/blender/blender/pulls/137143
2025-04-08 13:35:23 +02:00
Campbell Barton
0b369b6613 Cleanup: remove GHOST_TaskbarX11
This only worked for Ubuntu's discontinued Unity desktop.

Even though `libunity.so` can be used outside of Unity,
it's no longer actively developed.

Ref: !137128
2025-04-08 10:56:29 +00:00
Campbell Barton
3a51d140d8 Cleanup: remove WITH_X11_XF86VMODE since mode setting is no longer used
Setting the display mode was last used for the Game Engine
remove the CMake option WITH_X11_XF86VMODE and the associated code.

Ref: !137126
2025-04-08 10:56:27 +00:00
Jeroen Bakker
566c3d0846 Fix: Vulkan/OpenXR: Crash when vulkan physical device not found
When using SteamVR without starting SteamVR, the physical device cannot
be found. This isn't reported to the user nicely. This change will
tell the user why OpenXR couldn't be started.

Note that the physical device can be created when SteamVR has already
launched. OpenGL will start SteamVR automatically, but seems Vulkan
isn't able to do that.
2025-04-08 11:30:29 +02:00
Campbell Barton
f459d97dfd Cleanup: remove unused full-screen support from GHOST
Remove full-screen support from GHOST API's.
Note that this only had back-end implements for X11 and WIN32.

This was last used for the Game Engine to run games full-screen,
removing as it's unused and it doesn't seem likely to be used in the
future.

This doesn't impact making Blender full-screen from the window menu
which uses a window decoration setting.

Ref: !137050
2025-04-08 05:18:57 +00:00
Brecht Van Lommel
be63ebd961 Fix: Build failures when using path with spaces on macOS
Use VERBATIM to ensure spaces inside command line arguments don't get
escaped automatically.

On Linux and Windows the oneAPI kernel compilation still has problems.
There is an apparent bug with single quote escaping in add_custom_command
which means it's not easy to use VERBATIM.
2025-04-07 16:29:14 +02:00
Campbell Barton
0ff9a162c1 Fix: memory leak on Wayland with IME
For practical purposes this only leaks on exit,
unless seats are change at run-time.
2025-04-07 11:24:02 +00:00
Alaska
975d61daf3 Cycles: Disable MNEE on RDNA4 GPUs
At the moment MNEE locks up Cycles, or has rendering artifacts on
RDNA4 GPUs on WIndows.

This commit disables MNEE on that configuration until a fix
is avaliable.

Pull Request: https://projects.blender.org/blender/blender/pulls/136980
2025-04-05 14:06:40 +02:00
Campbell Barton
f48b4e3abf Cleanup: wrap long lines for CMake 2025-04-05 20:30:37 +11:00
Campbell Barton
e7c9ea3b27 Cleanup: correct typo in variable name & comments 2025-04-05 08:49:20 +00:00
Jesse Yurkovich
40cbc09d56 Cleanup: correct typos
_No response_

Pull Request: https://projects.blender.org/blender/blender/pulls/137009
2025-04-04 22:58:10 +02:00
Jesse Yurkovich
f60c528c48 Cleanup: Fix one-definition-rule violations for various structs
This fixes most "One Definition Rule" violations inside blender proper
resulting from duplicate structures of the same name. The fixes were
made similar to that of !135491. See also #120444 for how this has come
up in the past.

These were found by using the following compile options:
-flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing

Note: There are still various ODR issues remaining that require
more / different fixes than what was done here.

Pull Request: https://projects.blender.org/blender/blender/pulls/136371
2025-04-04 21:05:16 +02:00
Germano Cavalcante
3ab65cff04 Windows: show popup after crash
Implements a crash dialog for Windows.

The crash popup provides the following actions:
- Restart: reopen Blender from the last saved or auto-saved time
- Report a Bug: forward to Blender bug tracker
- View Crash Log: open the .txt file with the crash log
- Close: Closes without any further action

Pull Request: https://projects.blender.org/blender/blender/pulls/129974
2025-04-04 18:38:53 +02:00
Miguel Pozo
2d75688259 GHOST: Add GHOST_GetActiveGPUContext()
Track the (per-thread) active `GHOST_Context`.
This is required to restore the active context after creating a new one
if the current context is unknown.
(Required for creating GPU contexts in the GPU module without depending
on the WM module)

Pull Request: https://projects.blender.org/blender/blender/pulls/136992
2025-04-04 18:14:13 +02:00
Jeroen Bakker
a46643af0f Vulkan/OpenXR: Add support for VK_KHR_external_memory_fd
Current implementation uses a CPU roundtrip to transfer render result
to the Xr Swapchain. This PR adds support for sharing the render result
on Linux systems by using file descriptors.

To extend this solution to win32 or dx handles can be done by extending
the data transfer modes, register the correct extensions. When not
using the same GPU between Blender and OpenXR the CPU roundtrip
will still be used.

Solution has been validated with monado simulator and seems to be as
fast as OpenGL.

Performance can be improved by using GPU based synchronization.
Current API is limited as we cannot chain the different renders and
swapchains.

Pull Request: https://projects.blender.org/blender/blender/pulls/136933
2025-04-04 16:01:06 +02:00
Weizhen Huang
0dc46773e0 Fix: Cycles Render Scheduler not respecting sample limit per update
Delay the clamping until later

Pull Request: https://projects.blender.org/blender/blender/pulls/136984
2025-04-04 14:17:08 +02:00
Jeroen Bakker
d6687de291 Cleanup: Code-style 2025-04-04 11:56:02 +02:00
Campbell Barton
4139d4a8f0 Cleanup: spelling in comments (make check_spelling_*) 2025-04-04 12:48:04 +11:00
Hans Goudey
d4b23d38c9 Cleanup: Formatting 2025-04-03 11:44:25 -04:00
Michael Jones
326d5bca03 Cycles: Support Decomposed MetalRT motion interpolation
Currently MetalRT interpolates transformation matrix on per-element basis
which leads to issues like #135659.

This change adds implementation of for decomposed (Scale/Rotate/Translate)
motion interpolation, matching behavior of BVH2 and other HW-RT.

This requires macOS 15 and Xcode 16 in order to use this interpolation.
On older platforms and compilers old interpolation is used.

Currently there is no changes on the user (by default) and it is only
available via CYCLES_METALRT_PCMI environment variable. This is because
there are some issues with complex motion paths that need to be looked
into. Having code available makes it easier to do further debugging.

Ref #135659

Authored by Emma Liu

Pull Request: https://projects.blender.org/blender/blender/pulls/136253
2025-04-03 16:24:04 +02:00
Xavier Hallade
3dd6104c87 Cycles: oneAPI: Fix building for non-Intel SYCL targets without ocloc
The current logic disables WITH_CYCLES_ONEAPI_BINARIES when ocloc is not
found, which is fine, but prevented building for other non-Intel SYCL
targets without (unnecessary) ocloc.

The fix here is to remove spir64_gen target when
WITH_CYCLES_ONEAPI_BINARIES is disabled, instead of forcing only spir64.
2025-04-03 15:13:57 +02:00
Jeroen Bakker
d271b69c67 Vulkan: Optimize swapchain image selection
Improve the number of images that are requested in the swapchain. For
mailbox presentation mode we select tripple buffering. For V-Sync
presentation mode we select double buffering.

In future we might want to make the presentation mode and swapchain
images dynamic based on the actual action that the user is performing.
When doing action that benefit from low latency (paint cursor) we should
select mailbox, otherwise we can use fifo to reduce CPU/GPU usage and
safe some energy usage along the way. See #136923 for design task.

Pull Request: https://projects.blender.org/blender/blender/pulls/136922
2025-04-03 10:30:09 +02:00
Campbell Barton
58e971138b Cleanup: quiet warnings from clangd
Also remove spaces around field id, following our own convention.
2025-04-03 13:51:52 +11:00
Brecht Van Lommel
1ea89c82d4 Build: Remove OpenMP
It's better for performance to use a single thread pool for all areas of
Blender, and this gets us closer to that.

Bullet, Quadriflow, Mantaflow and Ceres still contain OpenMP code, but it
was already disabled.

On macOS, our OpenMP libraries are no longer compatible with the latest
Xcode 16.3. By removing OpenMP we no longer have to solve that problem.

OpenMP was disabled for bpy module builds on Windows ARM64, which also no
longer needs to be solved.

Pull Request: https://projects.blender.org/blender/blender/pulls/136865
2025-04-02 16:50:50 +02:00
Brecht Van Lommel
da9a9093ec Refactor: Eigen: Switch from OpenMP to TBB
Only the parallel sparse matrix code was updated. This is used by e.g.
LSCM and ABF unwrap, and performance seems about the same or better.

Parallel GEMM (dense matrix-matrix multiplication) is used by libmv,
for example in libmv_keyframe_selection_test for a 54 x 54 matrix.
However it appears to harm performance, removing parallelization makes
that test run 5x faster on a Apple M3 Max.

There has been no new Eigen release since 2021, however there is active
development in master and it includes support for a C++ thread pool for
GEMM. So we could upgrade, but the algorithm remains the same and
looking at the implementation it just does not seem designed for modern
many core CPUs. Unless the matrix is much larger, there's too much thread
synchronization overhead. So it does not seem useful to enable that
thread pool for us.

Pull Request: https://projects.blender.org/blender/blender/pulls/136865
2025-04-02 16:50:50 +02:00
Brecht Van Lommel
98b3b36411 Refactor: Build: Add bf::dependencies::eigen target
To make adding a dependeny on TBB easier.

Additional changes:
* Using LIB for libmv tests, as it now brings in includes
* Removing Eigen header listing in iTaSC

Pull Request: https://projects.blender.org/blender/blender/pulls/136865
2025-04-02 16:50:46 +02:00
Sergey Sharybin
ccb493b805 Libmv: Abstract parallel range in camera intrinsics
The goal is to be able to switch away from OpenMP by default.

In order to achieve this a new parallel_for() function is added,
which follows the same declaration as the one from TBB. For now
the simplest formulation is used where range is provided by start
and last indices (the last one is excluded from the range).

The side effect is that Libmv now expects the threading limits
to be imposed by the default thread area (or have an explicit
thread area with the limit in the caller). In a way this is
similar to other external libraries.

Pull Request: https://projects.blender.org/blender/blender/pulls/136834
2025-04-02 10:25:49 +02:00
Campbell Barton
90fd070c28 Cleanup: spelling in comments (make check_spelling_*) 2025-04-02 03:02:01 +00:00
Campbell Barton
f89cf19ba6 Cleanup: indentation for CMake files, strip trailing space 2025-04-02 03:01:59 +00:00
Harley Acheson
8325e7f5e4 UI: Use busyButClickableCursor for MacOS Wait Cursor
MacOS is currently using one of our custom "hourglass" cursors during
busy times. This PR changes it to an OS-supplied cursor, a pointer
arrow with an animated blue spinner, meant for this purpose. Although
undocumented it has been used by many applications for many years.
Including Firefox for 11 years now.

Pull Request: https://projects.blender.org/blender/blender/pulls/136735
2025-04-02 01:30:13 +02:00
Sergey Sharybin
36559fd89f Fix #136811: HIP-RT performance regression in 4.5
Reduce the register pressure and branching in the switch() by using
subclass and cast from void* to the base class.

This ensures intersection functions are not inlined multiple times,
bringing performance back.

Alternative could be to avoid functions (they are quite large) but
that only partially resolves the performance regression.

Pull Request: https://projects.blender.org/blender/blender/pulls/136823
2025-04-01 17:59:44 +02:00
Jeroen Bakker
439a4e4687 Vulkan: Partial revert of swap chain refactor.
On Windows presenting mode mailbox would reduce lag. We might want to be
more selective on which mode to use when.
2025-04-01 16:33:31 +02:00
Jeroen Bakker
aed9f22233 Refactor: Vulkan: swapchain
This PR refactors the way how swapchains are used.

Allow scaling of the swapchain content to the actual resolution of the swapchain.
can reduce artefacts when resizing windows when supported.

When frame rate is to fast the previous implementation could use a semaphore
that were still in use, leading to unwanted stuttering on certain platforms. Waiting
when the rendering has finished (GHOST_Frame.submission_fence), before the
next image is acquired from the swap chain.

Mailbox has been disabled as it can calculate more frames then actually been
presented, leading to a lag and increased  power usage on others.

Pull Request: https://projects.blender.org/blender/blender/pulls/136603
2025-04-01 16:01:22 +02:00
Xavier Hallade
17e0d88c05 Cycles: oneAPI: Avoid returning 0 from get_max_num_threads_per_multiprocessor
Instead of relying on the Intel extensions that may not be implemented,
we can use max_work_group_size until there is a better alternative.
Thanks to Codeplay for this proposal.

Co-authored-by: Georgi Mirazchiyski <georgi.mirazchiyski@codeplay.com>
2025-04-01 11:10:08 +02:00
Campbell Barton
8de112efc5 Cleanup: remove unused variables in mataflow 2025-04-01 13:11:17 +11:00
Campbell Barton
00deec7c66 Cleanup: spelling in comments 2025-04-01 12:37:36 +11:00