108 Commits

Author SHA1 Message Date
Nikita Sirgienko
38adb8f1a4 Cycles: oneAPI: Fix duplicated GPU device entries on some setups
In some hardware configurations, DPC++ or the Intel drivers may
wrongly report all devices twice. This is already being worked on
internally, and fixes will be available in the future. For now,
Blender needs its own workaround so that end users are not
impacted.

Pull Request: https://projects.blender.org/blender/blender/pulls/147731
2025-10-10 17:25:29 +02:00
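A workaround of the kind described in 38adb8f1a4 amounts to filtering repeated entries out of the enumerated device list. The following is only a minimal sketch of that idea, not the code from the pull request: `DeviceEntry`, its fields, and `remove_duplicate_devices` are hypothetical names, and it assumes that a device name plus PCI address is enough to identify a physical device.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified device record; not the Cycles data structure.
struct DeviceEntry {
  std::string name;
  std::string pci_address;  // assumed to be unique per physical device
};

// Keep the first entry for each (name, pci_address) pair and drop repeats,
// preserving the original enumeration order.
std::vector<DeviceEntry> remove_duplicate_devices(const std::vector<DeviceEntry> &devices)
{
  std::set<std::pair<std::string, std::string>> seen;
  std::vector<DeviceEntry> unique_devices;
  for (const DeviceEntry &device : devices) {
    // emplace() reports whether the key was newly inserted.
    if (seen.emplace(device.name, device.pci_address).second) {
      unique_devices.push_back(device);
    }
  }
  return unique_devices;
}
```

Keeping the first occurrence preserves the order in which the runtime reported the devices, so any index shown to the user stays stable for the entries that remain.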
Nikita Sirgienko
b133019f9f Cycles: oneAPI: use ocloc 101.8132 on Windows
This new version of the graphics compiler improves performance
for the majority of supported Intel devices and adds support
for upcoming Intel hardware. Such an upgrade also requires
an increase in the minimal supported driver version on Windows,
which is why these changes are combined together with
the ocloc upgrade.

The previously set minimum version, 101.6557, was increased to 101.8132.

Pull Request: https://projects.blender.org/blender/blender/pulls/147460
2025-10-08 13:36:08 +02:00
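The minimum-driver requirement from b133019f9f can be pictured as a two-field version comparison. This is a sketch under assumptions: reducing the driver version to the two numbers quoted in the commit message is a simplification, and `DriverVersion`, `kMinWindowsDriver`, and `is_driver_supported` are illustrative names rather than Blender identifiers.

```cpp
#include <utility>

// A driver version reduced to the two fields named in the commit message,
// e.g. 101.8132 -> {101, 8132}. This reduction is an assumption of the sketch.
using DriverVersion = std::pair<int, int>;

// Minimum Windows driver version stated in the commit message.
const DriverVersion kMinWindowsDriver(101, 8132);

// std::pair compares lexicographically: first field, then second.
bool is_driver_supported(const DriverVersion &installed)
{
  return installed >= kMinWindowsDriver;
}
```

With this comparison the previous minimum, 101.6557, is rejected, while 101.8132 and anything newer is accepted.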
Christoph Neuhauser
72f098248d Cycles: Add Vulkan/oneAPI graphics interop
This PR adds Vulkan/oneAPI graphics interop to Cycles. Just like for
CUDA and HIP interop, persistent memory mapping is used, as there could
potentially be some overhead of continuously mapping/unmapping buffers.

Pull Request: https://projects.blender.org/blender/blender/pulls/144442
2025-10-06 18:16:56 +02:00
Nikita Sirgienko
49414a72f6 Cycles: oneAPI: Add new arch codes for upcoming Intel hardware
Pull Request: https://projects.blender.org/blender/blender/pulls/147221
2025-10-04 22:34:54 +02:00
Thomas Dinges
66224d69b0 Deps: Library changes for Blender 5.0
This commit includes the changes to the build system, updated hashes to the actual new libraries as well as a required test update.

* DPC++ 6.2.0 RC
* freetype 2.13.3
* HIP 6.4.5010
* IGC 2.16.0
* ISPC 1.28.0
* libharu 2.4.5
* libpng 1.6.50
* libvpx 1.15.2
* libxml2 2.14.5
* LLVM 20.1.8
* Manifold 3.2.1
* MaterialX 1.39.3
* OpenColorIO 2.4.2
* openexr 3.3.5
* OpenImageIO 3.0.9.1
* openjpeg 2.5.3
* OpenShadingLanguage 1.14.7.0
* openssl 3.5.2
* Python 3.11.13
* Rubber Band 4.0.0
* ShaderC 2025.3
* sqlite 3.50.4
* USD 25.08
* Wayland 1.24.0

Ref #138940

Co-authored-by: Ray Molenkamp <github@lazydodo.com>
Co-authored-by: Jesse Yurkovich <jesse.y@gmail.com>
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com>
Co-authored-by: Sybren A. Stüvel <sybren@blender.org>
Co-authored-by: Kace <lakacey03@gmail.com>
Co-authored-by: Sebastian Parborg <sebastian@blender.org>
Co-authored-by: Anthony Roberts <anthony.roberts@linaro.org>
Co-authored-by: Jonas Holzman <jonas@holzman.fr>

Pull Request: https://projects.blender.org/blender/blender/pulls/144479
2025-10-02 18:34:11 +02:00
Weizhen Huang
2b0a1cae06 Cycles: Add an option to use ray marching for volume rendering
Null Scattering currently has performance and noise issues, and it will
take time to address them. For now add the previous Ray Marching back as
an option.

Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/146317
2025-09-26 12:14:45 +02:00
Nikita Sirgienko
5efeb06613 Fix #145449: Workaround wrongly generated Intel Linux driver version
Several driver versions report a semantically wrong version string,
which causes Blender to reject the Intel device for the oneAPI
backend. The upstream fix is taking a long time to reach distros
and end users, so Blender now detects this wrong version string
and parses it properly, allowing these devices to be used. The
wrong version string is the only issue here; the driver
functionality itself is fine.

Pull Request: https://projects.blender.org/blender/blender/pulls/145658
2025-09-03 19:26:05 +02:00
Brecht Van Lommel
2615cecf10 Refactor: Cycles: Align log levels with CLOG
WORK -> DEBUG
DEBUG, STATS -> TRACE

Pull Request: https://projects.blender.org/blender/blender/pulls/144490
2025-08-18 20:22:44 +02:00
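The renaming in 2615cecf10 is a many-to-one mapping from the old levels to the new ones. A minimal sketch of that mapping follows; the enum types `OldLevel` and `NewLevel` and the function `remap_log_level` are hypothetical and exist only to show the mapping, not how Cycles or CLOG declare their levels.

```cpp
// Hypothetical enums for illustration; the real Cycles and CLOG names may differ.
enum class OldLevel { Work, Debug, Stats };
enum class NewLevel { Debug, Trace };

// WORK -> DEBUG; DEBUG and STATS -> TRACE, as listed in the commit message.
NewLevel remap_log_level(const OldLevel level)
{
  switch (level) {
    case OldLevel::Work:
      return NewLevel::Debug;
    case OldLevel::Debug:
    case OldLevel::Stats:
      return NewLevel::Trace;
  }
  return NewLevel::Trace;  // not reached for valid enum values
}
```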
Nikita Sirgienko
21cba7024c Cycles: oneAPI: Disable L0 copy optimization for several dGPUs
It was discovered that when several different Intel dGPUs are
present in the system, the experimental L0 copy optimization does
not work correctly in the Intel driver, causing crashes in the
driver and in Blender. To restore functionality on these platforms,
a workaround disables this extension when such a configuration is
detected. Once the problem is fully fixed in all Intel drivers,
the workaround can be removed from the Blender source code to
recover the performance lost on configurations with several dGPUs.

Pull Request: https://projects.blender.org/blender/blender/pulls/144262
2025-08-14 12:14:51 +02:00
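The decision described in 21cba7024c can be sketched as a check over the detected GPUs. This is an illustration only: `GpuInfo` and `use_l0_copy_optimization` are hypothetical names, and the sketch assumes that "several dGPUs" simply means more than one discrete GPU, which may be broader or narrower than the condition the actual workaround tests.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical summary of a detected GPU; not a Cycles or SYCL type.
struct GpuInfo {
  bool is_discrete;
};

// Allow the experimental copy optimization only when at most one discrete
// GPU is present (assumed reading of "several dGPUs").
bool use_l0_copy_optimization(const std::vector<GpuInfo> &gpus)
{
  const auto discrete_count = std::count_if(
      gpus.begin(), gpus.end(), [](const GpuInfo &gpu) { return gpu.is_discrete; });
  return discrete_count <= 1;
}
```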
Weizhen Huang
b2b2d9a4f3 Cycles: Render volume by ray marching through octrees
One octree per volume per shader, based on the density. This is in
preparation for null scattering.
2025-08-13 10:28:50 +02:00
Campbell Barton
cccc2c77c5 Cleanup: consistent for C-style comment blocks 2025-08-08 07:37:33 +10:00
Stefan Werner
c81e1d95c1 Cycles: Fixed typo in my last commit 2025-07-29 10:53:13 +02:00
Stefan Werner
e7312b1ad5 Cycles: Explicitly setting SYCL device for Embree
This fixes issues when using Embree on multiple GPUs.
A previous workaround used separate contexts; this one now
lets us keep a single context for all GPUs.

Pull Request: https://projects.blender.org/blender/blender/pulls/143089
2025-07-29 10:40:28 +02:00
Hans Goudey
c3181490f3 Cleanup: Formatting 2025-07-14 10:22:46 -04:00
Nikita Sirgienko
609f8ddbef Cycles: oneAPI: Fix DPC++ level issues for multi GPU execution
These changes introduce modifications to the SYCL queue creation
in OneapiDevice::create_queue. In case several DPC++ devices are
detected by Blender and exposed through it, we are now creating
a new SYCL context for each device, which allows us to prevent
execution failures due to some known issues in the DPC++ runtime
regarding multi GPU support. As this has a small performance
impact of a few percent, it is only applied to multi GPU
configurations, while the behavior for a single GPU
configuration remains the same.

Pull Request: https://projects.blender.org/blender/blender/pulls/141834
2025-07-14 14:33:42 +02:00
Brecht Van Lommel
73fe848e07 Fix: Cycles log levels conflict with macros on some platforms
In particular DEBUG, but prefix all of them to be sure.

Pull Request: https://projects.blender.org/blender/blender/pulls/141749
2025-07-10 19:44:14 +02:00
Xavier Hallade
94e9203713 Fix previous 4.5 merge 2025-07-10 17:47:03 +02:00
Xavier Hallade
48f89ff1c3 Merge branch 'blender-v4.5-release' 2025-07-10 17:43:30 +02:00
Xavier Hallade
05f27f594e Fix #141661: Crash when selecting oneAPI in preferences with legacy drivers
On systems with multiple Intel GPUs with a mix of recent and old
unsupported drivers (such as 101.3302), the Level-Zero stack may have
trouble initializing, leading to a crash while enumerating devices.

Luckily this condition actually leads to an exception we can catch,
as implemented here in this commit.

Pull Request: https://projects.blender.org/blender/blender/pulls/141674
2025-07-10 17:36:00 +02:00
Brecht Van Lommel
b6c4233b28 Refactor: Cycles: Remove now unused 3D image texture support
Pull Request: https://projects.blender.org/blender/blender/pulls/132908
2025-07-09 21:04:38 +02:00
Brecht Van Lommel
7978799e6f Cycles: Always render volume as NanoVDB
All GPU backends now support NanoVDB, using our own kernel side code
that is easily portable. This simplifies kernel and device code.

Volume bounds are now built from the NanoVDB grid instead of OpenVDB,
to avoid having to keep around the OpenVDB grid after loading.

While this reduces memory usage, it does have a performance impact,
particularly for the Cubic filter. That will be addressed by
another commit.

Pull Request: https://projects.blender.org/blender/blender/pulls/132908
2025-07-09 21:04:38 +02:00
Brecht Van Lommel
fb4e3c8167 Refactor: Cycles: Remove distinction between severity and verbosity
Only use LOG() and LOG_IS_ON() macros, no more VLOG_.

Pull Request: https://projects.blender.org/blender/blender/pulls/140244
2025-07-09 20:59:24 +02:00
Xavier Hallade
7691e6520b Fix #141171: oneAPI: Rendering artifacts in barbershop scene
max_shaders was not updated when Embree was disabled.

Pull Request: https://projects.blender.org/blender/blender/pulls/141175
2025-06-30 16:39:53 +02:00
Brecht Van Lommel
e84fad92ea Fix #139986: Cycles crash on some scene updates, after Embree upgrade
Device::const_copy_to is sometimes called when the Embree BVH has been freed
and not replaced yet. Previously this was a simpler pointer copy, now there is
a function call. Make sure it's just a function copy.

Thanks to Nikita Sirgienko for figuring this out.

Pull Request: https://projects.blender.org/blender/blender/pulls/140457
2025-06-16 17:59:57 +02:00
Brecht Van Lommel
f7ffcfe652 Cleanup: Cycles: Use default initializers in oneAPI device
Ref #140457
2025-06-16 17:59:50 +02:00
Brecht Van Lommel
a4cfd14f0a Fix #137966: oneAPI crash when max texture size is exceeded
Until the texture cache addresses this properly, show a useful error rather
than crashing.

Pull Request: https://projects.blender.org/blender/blender/pulls/139892
2025-06-05 20:37:25 +02:00
Nikita Sirgienko
69091c5028 Cycles: Show device optimizations status in preferences for oneAPI
With these changes, we can now mark devices that are expected to perform as
well as possible, and devices that were not optimized for some reason.

For example, a device may have been released after the Blender release,
making it impossible for developers to optimize for it in already
released, unchangeable code. This is primarily relevant for the LTS versions,
which are supported for two years and require proper communication about
optimization status for the new devices released during this time.

This is implemented for oneAPI devices. Other device types currently are
marked as optimized for compatibility with old behavior, but may implement
the same in the future.

Pull Request: https://projects.blender.org/blender/blender/pulls/139751
2025-06-03 20:07:52 +02:00
Nikita Sirgienko
54766b6a54 Cycles: Introducing the code for adoption of Embree 4.4
Embree 4.4 introduces an improvement in the Embree GPU
implementation by dropping shared memory usage in favor
of direct controllable memory transfers. This should allow
addressing several problems spotted in Blender regarding
multithreading and memory corruption when BVH and rendering
happen at the same time. However, to implement such
improvements, the API has changed for several functions, and
this commit adapts Blender code to these changes, making Blender
buildable and functional with all existing Embree 4.X
versions, before and after 4.4.

No functional changes in Blender behavior are expected if
using Embree versions below 4.4.

Pull Request: https://projects.blender.org/blender/blender/pulls/139061
2025-05-19 11:25:50 +02:00
Brecht Van Lommel
4d7bd22beb Refactor: Cycles: Graphics interop changes
* Add GraphicsInteropDevice to check if interop is possible with device
* Rename GraphicsInterop to GraphicsInteropBuffer
* Include display device type and memory size in GraphicsInteropBuffer
* Unnest graphics interop class to make forward declarations possible

Pull Request: https://projects.blender.org/blender/blender/pulls/137363
2025-04-28 11:38:56 +02:00
Alaska
0a7a12f873 Cycles: Print additional warnings about unsupported oneAPI driver versions to terminal
This commit adds some extra prints to terminal related to oneAPI driver
information in the situation that the driver version is considered
incompatible with the current version of Cycles.

Pull Request: https://projects.blender.org/blender/blender/pulls/137272
2025-04-15 09:03:45 +02:00
Xavier Hallade
17e0d88c05 Cycles: oneAPI: Avoid returning 0 from get_max_num_threads_per_multiprocessor
Instead of relying on the Intel extensions that may not be implemented,
we can use max_work_group_size until there is a better alternative.
Thanks to Codeplay for this proposal.

Co-authored-by: Georgi Mirazchiyski <georgi.mirazchiyski@codeplay.com>
2025-04-01 11:10:08 +02:00
Xavier Hallade
795a76029a Cycles: oneAPI: Restrict use of experimental copy optimization to L0
This API is not properly implemented in other SYCL backends at the
moment and we don't want it to fail at runtime, so we conservatively
enable it only for Level-Zero.
2025-03-31 16:14:36 +02:00
Xavier Hallade
7a257359f8 Cycles: oneAPI: Use max_compute_units in get_num_multiprocessors
Instead of returning 0 in case the Intel extension for getting the count
of Execution Units isn't available, we now use
sycl::info::device::max_compute_units.

We still prefer the Intel extension when it is available, since it logically goes
with sycl::ext::intel::info::device::gpu_hw_threads_per_eu used in
get_max_num_threads_per_multiprocessor(), for which there is no
sycl::info::device::max_threads_per_compute_unit replacement yet.
2025-03-26 23:15:49 +01:00
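The fallback order described in 7a257359f8 can be summarized in a few lines. This sketch deliberately avoids SYCL so it stays self-contained: the two counts are passed in as plain integers, and the function signature and parameter names are assumptions made for illustration, not the signature used in Cycles.

```cpp
// Prefer the execution unit count from the Intel extension when the device
// reports it; otherwise fall back to the standard max_compute_units value
// instead of returning 0.
int get_num_multiprocessors(const bool has_eu_count_extension,
                            const int eu_count,
                            const int max_compute_units)
{
  return has_eu_count_extension ? eu_count : max_compute_units;
}
```

The caller is expected to query both values from the device and pass them in, which keeps the selection logic trivially testable.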
Sean Stirling
5372346978 Cycles: oneAPI: Use linear USM memory for 1D images
Rewrite the ONEAPI Blender texture allocation code to make use of
1D images backed by linear USM memory. This increases parity
with the CUDA implementation and lays the groundwork for enabling
host USM allocations in Blender. By enabling this functionality,
previously failing benchmarks are now passing.

Together with the previous commit, no functional changes are expected.
2025-02-28 17:52:41 +01:00
Nikita Sirgienko
dcbc7c1623 Cycles: oneAPI: Remove some texture code from the squashed bindless texture commit
This code will be reintroduced shortly, but under proper credentials.

No functional changes are expected along with the next commit.
2025-02-28 17:51:35 +01:00
Brecht Van Lommel
c87a269021 Fix #133953: Cycles oneAPI texture randomly renders black
* Do oneAPI copy optimization as part of host memory alloc and free, so
  it is properly released before host memory is freed.
* Synchronize after loading texture info, like CUDA and HIP.

https://projects.blender.org/blender/blender/pulls/134412
2025-02-13 19:58:56 +01:00
Brecht Van Lommel
f99f958c47 Refactor: Cycles: Add host_alloc/free to device API
This may be used by a device to allocate host memory in a way that
is more efficient for copying the host memory to the device.

Also rename and group device memory allocation functions for clarity.

Pull Request: https://projects.blender.org/blender/blender/pulls/134412
2025-02-13 19:58:56 +01:00
Campbell Barton
c83c62439e Cleanup: correct typo 2025-02-13 11:14:50 +11:00
Nikita Sirgienko
2bab4ae370 Cycles: oneAPI: Optimize texture access by using GPU HW sampler
The current usage of software-based texture operations in
the oneAPI implementation puts additional register pressure on
the GPU compiler during register allocation. And it also creates
code that requires maintenance. This commit is intended to address
this situation by utilizing a recently productized SYCL bindless
texture API to enable HW-based texture operations using
Intel GPUs' hardware sampler.

This currently translates to 1-11% rendering speedups (scene-specific)
on my Arc A770 and Arc B580. At the moment, there are small
performance regressions with NanoVDB texture operations on Arc B580
and small performance regressions in shade surface MNEE and Raytrace
kernels on Arc A770, but they look recoverable and will be handled
in the future.

Pull Request: https://projects.blender.org/blender/blender/pulls/133457
2025-02-12 21:47:34 +01:00
Nikita Sirgienko
bee534eea5 Build: Upgrade Intel Graphics Compiler to 2.1.14 on Linux
This corresponds to the latest rolling 2448.13 release:
https://dgpu-docs.intel.com/releases/packages.html?release=Rolling+2448.13&os=Ubuntu+24.04

Graphics compiler upgrades require increasing the minimum required
driver (compute-runtime) version to the corresponding one to guarantee
compatibility, which is XX.XX.31740.15 in this release, so we bump this
requirement accordingly.

Co-authored-by: Xavier Hallade <me@ph0b.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/134051
2025-02-05 15:00:04 +01:00
Xavier Hallade
e7589f8973 Fix: Cycles: Missing texture transfers in oneAPI backend
Since 2cfe2e0bfe, textures were not being
allocated nor transferred to device.

This fix improves the situation reported in
https://projects.blender.org/blender/blender/issues/133953 but is not
enough to make all unit tests pass.
2025-02-03 20:20:21 +01:00
Brecht Van Lommel
e8ebcb3ee3 Fix: Cycles: Check if memory is host mapped without access to device_mem_map
This avoids concurrency issues.

Pull Request: https://projects.blender.org/blender/blender/pulls/132912
2025-01-29 14:12:23 +01:00
Brecht Van Lommel
cd3d3b2646 Refactor: Cycles: Delay load_texture_info() to enqueue
Doing it immediately after moving textures to the host is less efficient, and
interacts in confusing ways.

Pull Request: https://projects.blender.org/blender/blender/pulls/132912
2025-01-29 14:12:06 +01:00
Brecht Van Lommel
fec593ec3b Fix: Cycles: Avoid unnecessary move to host with multi-device
If one of the devices already used host mapped memory but another did not,
it would previously realloc both.

Thanks to Jorn Visser for investigating and finding this problem.

Pull Request: https://projects.blender.org/blender/blender/pulls/132912
2025-01-29 14:12:02 +01:00
Brecht Van Lommel
2cfe2e0bfe Fix: Cycles: Re-copy memory from host to device without realloc
Should be a bit more efficient, and it fixes host memory fallback bugs,
where host memory was incorrectly freed during re-copy. For the case
where memory should get reallocated on the host, a new mem_move_to_host
was added.

Thanks to Jorn Visser for investigating and finding this problem.

Pull Request: https://projects.blender.org/blender/blender/pulls/132912
2025-01-29 14:11:50 +01:00
Xavier Hallade
ce463bd6b1 Cycles: oneAPI: optimize device<->host copies
There is a large overhead when doing copies between a device and non-USM host memory.
Using the prepare/release API avoids it, as presented in the optimization guide:
https://www.intel.com/content/www/us/en/docs/oneapi/optimization-guide-gpu/2025-0/optimizing-data-transfers.html

This currently translates to a 4-5% overall rendering speedup on my Arc B580 in most scenes.

Pull Request: https://projects.blender.org/blender/blender/pulls/132859
2025-01-09 21:00:12 +01:00
Stefan Werner
a79d95099f Cycles: Fix OneAPI crash after unique_ptr refactor
Memory was freed too early, probably a typo.
2025-01-07 09:37:47 +01:00
Brecht Van Lommel
9971648783 Refactor: Cycles: Replace new/delete by unique_ptr, in simple cases
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:30 +01:00
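The kind of change made in 9971648783 follows the usual pattern for replacing manual `new`/`delete` with `std::unique_ptr`. The snippet below is a generic illustration of that pattern, not code taken from Cycles; `Buffer` and `make_buffer` are hypothetical names.

```cpp
#include <memory>

// Hypothetical resource type used only for this illustration.
struct Buffer {
  explicit Buffer(const int size) : size(size) {}
  int size;
};

// Before: Buffer *buffer = new Buffer(size); ... delete buffer;
// After: ownership is expressed in the return type and the memory is
// released automatically when the unique_ptr goes out of scope.
std::unique_ptr<Buffer> make_buffer(const int size)
{
  return std::make_unique<Buffer>(size);
}
```

One thing to watch with this refactor is the point at which the owner is reset or destroyed, since releasing it earlier than the old `delete` call changes the object's lifetime.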
Brecht Van Lommel
57ff24cb99 Refactor: Cycles: Add const keyword to more function parameters
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:24 +01:00
Brecht Van Lommel
dd51c8660b Refactor: Cycles: Add const keyword where possible, using clang-tidy
Check was misc-const-correctness, combined with readability-isolate-declaration
as suggested by the docs.

Temporarily clang-format "QualifierAlignment: Left" was used to get consistency
with the prevailing order of keywords.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:20 +01:00