Commit Graph

1462 Commits

Author SHA1 Message Date
Sebastian Herholz
5abf42012d Cycles: Guiding cleaning up and refactoring the guiding code
In detail:
- Direct accesses of state attributes are replaced with the INTEGRATOR_STATE and INTEGRATOR_STATE_WRITE macros.
- Unified the checks for the __PATH_GUIDING define to use #  if defined (__PATH_GUIDING__).
- Even if __PATH_GUIDING__ is defined, we now check if the feature is enabled using if ((kernel_data.kernel_features & KERNEL_FEATURE_PATH_GUIDING)) {. This is important for later GPU ports.
- The kernel usage of the guiding field, surface, and volume sampling distributions is wrapped behind macros for each specific device (atm only CPU). This will make it easier for a GPU port later.
2025-05-22 13:46:30 +02:00
Alaska
ce0ec6e708 Cycles: Disable MNEE in the HIP backend
MNEE on HIP has rendering artifacts on RDNA1 (#134978), RDNA2 (#139068)
and RDNA4 (#136980), and can lock up the GPU under specific situations with
RDNA3 (#138607).

There are certain configurations that work (E.g. RDNA4 seems to work on
Linux), but the number of configurations that work keep dropping as further
developments are made in other areas. So it was decided it's just better to
disable MNEE entirely on HIP.

This commit disables MNEE on HIP, and does a small cleanup to remove the
unused functions as a result of this change.

Fix #139068: MNEE renders with artifacts on RDNA2
Fix #138607: MNEE render test stalls on RDNA3

Pull Request: https://projects.blender.org/blender/blender/pulls/139069
2025-05-19 12:42:20 +02:00
Nikita Sirgienko
54766b6a54 Cycles: Introducing the code for adoption of Embree 4.4
Embree 4.4 introduces an improvement in the Embree GPU
implementation by dropping shared memory usage in favor
of direct controllable memory transfers. This should allow
addressing several problems spotted in Blender regarding
multithreading and memory corruption when BVH and rendering
happen at the same time. However, to implement such
improvements, the API has changed for several functions, and
this commit adopts Blender code to these changes, making Blender
buildable and functional with all existing Embree 4.X
versions, before and after 4.4.

No functional changes in Blender behavior are expected if
using Embree versions below 4.4.

Pull Request: https://projects.blender.org/blender/blender/pulls/139061
2025-05-19 11:25:50 +02:00
Campbell Barton
9eb9493ecd Fix: build error WITH_CYCLES_OSL & GCC 15.1
Ref !138238
2025-05-02 14:22:27 +10:00
Brecht Van Lommel
2c99edbffa Cycles: Bump Embree minimum version to 4.0.0
The build is already failing with Embree 3, as noticed in #137556.
And Embree 4 was released 2 years ago.

Pull Request: https://projects.blender.org/blender/blender/pulls/138221
2025-04-30 19:50:14 +02:00
Brecht Van Lommel
5046fe168f Cleanup: Compiler warning 2025-04-30 19:06:36 +02:00
Lukas Stockner
b4c8d709e8 Cleanup: Cycles: Deduplicate OptiX module creation
Pull Request: https://projects.blender.org/blender/blender/pulls/138091
2025-04-28 14:04:15 +02:00
Lukas Stockner
0dc4754da4 Cycles: Move OptiX OSL Camera kernel into its own PTX module
On the one hand, this improves initialization time since we don't need to
load/compile the full OSL module with all the shading logic if we're only
using a custom camera with SVM shading.

On the other hand, it also fixes a bug I noticed while preparing test scenes:
The AO and Bevel nodes don't work when using custom cameras with SVM on OptiX.

The issue there is that those two are handled by the SHADE_SURFACE_RAYTRACE
kernel, but since that one has intersection logic, we use the OptiX-specific
kernel even if OSL shading is disabled.
However, with the previous unified OSL module, this would mean loading
SHADE_SURFACE_RAYTRACE from kernel_osl.cu, which has `#undef __SVM__` and
therefore doesn't handle them correctly.

With this change, we'll use the kernels from kernel_shader_raytrace.cu in that
case, which do support SVM nodes just fine.

Disk usage of the new kernel_optix_osl_camera.ptx.zst file is 30KB, so this
also doesn't blow up the kernel disk size (and kernel_optix_osl.ptx.zst is
probably smaller by that amount now).

Since it seems that we can mix modules just fine, I'm suspecting that we could
split the modules properly (intersection, SVM shading with raytracing,
OSL shading, OSL camera), instead of the current approach where modules
essentially correspond to feature set tiers and each includes the previous
one's kernels as well - but that's a separate refactor.

Pull Request: https://projects.blender.org/blender/blender/pulls/138021
2025-04-28 12:49:35 +02:00
Brecht Van Lommel
ecd54ba4e4 Cycles: Metal graphics interop
This is trivial with unified memory, and avoids one memory copy.

Pull Request: https://projects.blender.org/blender/blender/pulls/137363
2025-04-28 11:38:56 +02:00
Brecht Van Lommel
b174e5f0d1 Cycles: Vulkan CUDA graphics interop
* Using CUDA external memory
* Checks that device UUID matches Vulkan

Pull Request: https://projects.blender.org/blender/blender/pulls/137363
2025-04-28 11:38:56 +02:00
Brecht Van Lommel
4d7bd22beb Refactor: Cycles: Graphics interop changes
* Add GraphicsInteropDevice to check if interop is possible with device
* Rename GraphcisInterop to GraphicsInteropBuffer
* Include display device type and memory size in GraphicsInteropBuffer
* Unnest graphics interop class to make forward declarations possible

Pull Request: https://projects.blender.org/blender/blender/pulls/137363
2025-04-28 11:38:56 +02:00
Lukas Stockner
bf412ed9dd Cycles: Support for custom OSL cameras
This allows users to implement arbitrary camera models using OSL by writing
shaders that take an image position as input and compute ray origin and
direction.

The obvious applications for this are e.g. panorama modes, lens distortion
models and realistic lens simulation, but the possibilities are endless.

Currently, this is only supported on devices with OSL support, so CPU and
OptiX. However, it is independent from the shading model used, so custom
cameras can be used without getting the performance hit of OSL shading.

A few samples are provided as Text Editor templates.

One notable current limitation (in addition to the limited device support)
is that inverse mapping is not supported, so Window texture coordinates and
the Vector pass will not work with custom cameras.

Pull Request: https://projects.blender.org/blender/blender/pulls/129495
2025-04-25 19:27:30 +02:00
Alaska
0a7a12f873 Cycles: Print additional warnings about unsupported oneAPI driver versions to terminal
This commit adds some extra prints to terminal related to oneAPI driver
information in the situation that the driver version is considered
incompatible with the current version of Cycles.

Pull Request: https://projects.blender.org/blender/blender/pulls/137272
2025-04-15 09:03:45 +02:00
Brecht Van Lommel
c8f9fdc0c8 Fix: Cycles CUDA errors after recent changes for scene update
Broken by 86b67a20d6. Delay upload of shader data to GPU until
after kernels have been loaded.

Pull Request: https://projects.blender.org/blender/blender/pulls/137349
2025-04-11 19:14:14 +02:00
Alaska
975d61daf3 Cycles: Disable MNEE on RDNA4 GPUs
At the moment MNEE locks up Cycles, or has rendering artifacts on
RDNA4 GPUs on WIndows.

This commit disables MNEE on that configuration until a fix
is avaliable.

Pull Request: https://projects.blender.org/blender/blender/pulls/136980
2025-04-05 14:06:40 +02:00
Hans Goudey
d4b23d38c9 Cleanup: Formatting 2025-04-03 11:44:25 -04:00
Michael Jones
326d5bca03 Cycles: Support Decomposed MetalRT motion interpolation
Currently MetalRT interpolates transformation matrix on per-element basis
which leads to issues like #135659.

This change adds implementation of for decomposed (Scale/Rotate/Translate)
motion interpolation, matching behavior of BVH2 and other HW-RT.

This requires macOS 15 and Xcode 16 in order to use this interpolation.
On older platforms and compilers old interpolation is used.

Currently there is no changes on the user (by default) and it is only
available via CYCLES_METALRT_PCMI environment variable. This is because
there are some issues with complex motion paths that need to be looked
into. Having code available makes it easier to do further debugging.

Ref #135659

Authored by Emma Liu

Pull Request: https://projects.blender.org/blender/blender/pulls/136253
2025-04-03 16:24:04 +02:00
Xavier Hallade
17e0d88c05 Cycles: oneAPI: Avoid returning 0 from get_max_num_threads_per_multiprocessor
Instead of relying on the Intel extensions that may not be implemented,
we can use max_work_group_size until there is a better alternative.
Thanks to Codeplay for this proposal.

Co-authored-by: Georgi Mirazchiyski <georgi.mirazchiyski@codeplay.com>
2025-04-01 11:10:08 +02:00
Xavier Hallade
795a76029a Cycles: oneAPI: Restrict use of experimental copy optimization to L0
This API is not properly implemented in other SYCL backends at the
moment and we don't want it to fail at runtime, so we conservatively
enable it only for Level-Zero.
2025-03-31 16:14:36 +02:00
Xavier Hallade
7a257359f8 Cycles: oneAPI: Use max_compute_units in get_num_multiprocessors
Instead of returning 0 in case the Intel extension for getting the count
of Execution Units isn't available, we now use
sycl::info::device::max_compute_units.

We keep using the Intel extension in priority since it logically goes
with sycl::ext::intel::info::device::gpu_hw_threads_per_eu used in
get_max_num_threads_per_multiprocessor(), for which there is no
sycl::info::device::max_threads_per_compute_unit replacement yet.
2025-03-26 23:15:49 +01:00
Sergey Sharybin
42cbc52b07 Fix: Warning in Cycles motion blur kernel features expression
This fixes the following warning with MSVC:
device_impl.cpp(287): warning C4805: '|=': unsafe mix of type 'bool' and type 'ccl::uint' in operation

The similar fix is applied to Metal code as well.

There is no short-circuiting boolean operator ||=, so expand the expression.

Pull Request: https://projects.blender.org/blender/blender/pulls/136561
2025-03-26 17:20:33 +01:00
salipour
ae710101f5 Fix #136138, #136449: Cycles HIP RDNA2 white and blue render artifacts
There is a known precision bug in the current HIP compiler version                                                                                                                                                    (RDNA2 family/Windows) that has already been fixed and will be available in
a future HIP SDK release. Enabling more precise math prevents the artifacts.

This may cause a 5-10% performance drop in some scenes.

Fix #136138: Microfacet BSDF
Fix #136449: Hair BSDF

Pull Request: https://projects.blender.org/blender/blender/pulls/136341
2025-03-25 18:21:16 +01:00
Brecht Van Lommel
f506564a47 Cleanup: Unused argument compiler warning 2025-03-23 21:01:25 +01:00
Michael Jones
c23c4ae6ba Cycles: Fix issue affecting Metal kernel profiling (normally disabled)
This issue only affects profiling mode (`CYCLES_METAL_PROFILING=1`). There's a modest limit to the number of concurrent counter sampling buffers per device, so instead of creating one per device queue, we create one per device that can be reused by successive device queues.

Authored by Emma Liu.

Pull Request: https://projects.blender.org/blender/blender/pulls/136248
2025-03-21 12:47:15 +01:00
Michael Jones
9dca0ba856 Cycles: Maximise MTLCompiler concurrency when GUI isn't active
This PR will result in much faster Metal kernel (re)compilation for command line rendering.

Pull Request: https://projects.blender.org/blender/blender/pulls/136247
2025-03-20 14:07:14 +01:00
Michael Jones
584f19a5af Cycles: Apple Silicon tidy: Remove non-UMA codepaths (v2)
This PR removes a bunch of dead code following #123551 (removal of AMD and Intel GPU support). It is safe to assume that UMA will be available, so a lot of codepaths that dealt with copying between CPU and GPU are now just clutter.

Pull Request: https://projects.blender.org/blender/blender/pulls/136146
2025-03-19 12:53:01 +01:00
Brecht Van Lommel
ab3204e251 Revert "Cycles: Apple Silicon tidy: Remove non-UMA codepaths"
This reverts commit 1a93dfe4fc.

This is hitting asserts in the tests, revert until it's fixed.

Ref #136117
2025-03-18 20:37:23 +01:00
Michael Jones
1a93dfe4fc Cycles: Apple Silicon tidy: Remove non-UMA codepaths
This PR removes a bunch of dead code following #123551 (removal of AMD and Intel GPU support). It is safe to assume that UMA will be available, so a lot of codepaths that dealt with copying between CPU and GPU are now just clutter.

Pull Request: https://projects.blender.org/blender/blender/pulls/136117
2025-03-18 19:09:25 +01:00
Sergey Sharybin
b8bd8ba36d Merge branch 'blender-v4.4-release' 2025-03-14 14:52:02 +01:00
Sergey Sharybin
1d4a211d6c Fix: Incorrect check of device pointers in HIP-RT code
The code was checking the same device pointer instead of
checking that both allocations are successful.

Pull Request: https://projects.blender.org/blender/blender/pulls/135977
2025-03-14 14:51:49 +01:00
Brecht Van Lommel
71a4f1ab96 Merge branch 'blender-v4.4-release' 2025-03-12 11:40:08 +01:00
Brecht Van Lommel
73ea95a56a Fix #135644: HIP-RT crash with host memory fallback
Avoid manipulating the host pointer in device memory, this fails when host
mapped memory gets used and the pointers gets re-allocated.

Pull Request: https://projects.blender.org/blender/blender/pulls/135724
2025-03-12 11:38:07 +01:00
Brecht Van Lommel
0ff2635131 Fix #135644: Cycles HIP-RT crash when running out of memory
Tightehn up checks for failed allocations, early out on errors.

Pull Request: https://projects.blender.org/blender/blender/pulls/135724
2025-03-12 11:37:59 +01:00
Lukas Stockner
dbe275895e Cleanup: Cycles: Deduplicate OptiX OSL code
Not a big difference for now, but will be nicer for #129495.

Pull Request: https://projects.blender.org/blender/blender/pulls/135049
2025-03-10 13:30:58 +01:00
Brecht Van Lommel
5a9d4fd613 Cleanup: Use default member initializers 2025-03-06 22:34:22 +01:00
Brecht Van Lommel
ab394c8e8d Refactor: Use std::bitset to avoid overflow in device queue logging 2025-03-06 22:34:22 +01:00
Sybren A. Stüvel
15758ab854 Merge remote-tracking branch 'origin/blender-v4.4-release' 2025-03-06 14:13:03 +01:00
Sergey Sharybin
7397e6da29 Fix: Cycles HIP-RT crash
The crash has been introduced by the refactor of lights to be
objects in #134846.

We can make such cases easier to catch at compile time in the
future, but for now applying the minimal patch which solves the
problem without going deeper into refactor.

Pull Request: https://projects.blender.org/blender/blender/pulls/135570
2025-03-06 11:58:13 +01:00
Sergey Sharybin
f89728a5e4 Fix: HIP-RT creates copy of vector<Object *> during build
Is harmless from functional perspective, but uses more resources and
potentially slower than it should be. Although, probably something
hard to measure in practice, but still better not follow this anti-
pattern.

Pull Request: https://projects.blender.org/blender/blender/pulls/135529
2025-03-06 11:57:51 +01:00
Sergey Sharybin
3f6fca4297 Cycles: Enable HIP-RT logging when debug log is on
These logs do not appear to be that noisy and do help nailing down
issues in HIP-RT.

Pull Request: https://projects.blender.org/blender/blender/pulls/135530
2025-03-06 11:57:34 +01:00
Campbell Barton
5b856ba447 Merge branch 'blender-v4.4-release' 2025-03-06 10:35:59 +11:00
Campbell Barton
b85fc32cae Cleanup: spelling & repeated words in comments
Address warnings from check_spelling.py
2025-03-06 10:33:21 +11:00
Brecht Van Lommel
28f7e2ae91 Merge branch 'blender-v4.4-release' 2025-03-04 17:53:08 +01:00
Brecht Van Lommel
a3baf60df4 Fix: Cycles device info uninitialized variable
It's unclear if this caused an actual bug, detected by ASAN.
2025-03-04 17:46:04 +01:00
Sergey Sharybin
ad30bdd470 Merge branch 'blender-v4.4-release' 2025-02-28 19:02:39 +01:00
Sahar A. Kashi
99a487a07c Fix: Cycles HIP-RT curve motion blur and motion pass
Various fixes in the HIP-RT BVH building related on making sure
curves motion blur is supported and is working correctly, as well
as properly handle motion pass configuration when path tracing is
to ignore motion blur (and instead write vector pass).

This PR contains #134797 with fixes needed to fully finish it:
moving commits from that PR here made it easier to ensure all
moving parts are tested without mental overhead.

Fixes #134510

Co-authored-by: Sahar A. Kashi  <sahar.alipourkashi@amd.com>
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Co-authored-by: Brecht Van Lommel <brecht@blender.org>

Pull Request: https://projects.blender.org/blender/blender/pulls/135125
2025-02-28 19:02:03 +01:00
Sean Stirling
5372346978 Cycles: oneAPI: Use linear USM memory for 1D images
Rewrite the ONEAPI Blender texture allocation code to make use of
1D images backed by linear USM memory. This increases parity
with the CUDA implementation and sets the ground work for enabling
host USM allocations in Blender. By enabling this functionality,
previously failing benchmarks are now passing.

Together with the previous commit, no functional changes are expected.
2025-02-28 17:52:41 +01:00
Nikita Sirgienko
dcbc7c1623 Cycles: oneAPI: Remove some texture code from the squished bindless texture commit
This code will be reintroduced back shortly, but under proper credentials.

No functional changes are expected along with the next commit.
2025-02-28 17:51:35 +01:00
Brecht Van Lommel
3cf21ceaf4 Merge branch 'blender-v4.4-release' 2025-02-28 13:28:20 +01:00
Brecht Van Lommel
8f00d8b0c8 Fix: Cycles hardware RT is only supported if all multi devices have it
Pull Request: https://projects.blender.org/blender/blender/pulls/135179
2025-02-28 13:21:33 +01:00