Check was misc-const-correctness, combined with readability-isolate-declaration
as suggested by the docs.
Temporarily clang-format "QualifierAlignment: Left" was used to get consistency
with the prevailing order of keywords.
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
The kernel zeroing memory since we've added host memory fallback didn't
expect large inputs, so with these scenes, it was running into
"Provided range is out of integer limits. Pass
`-fno-sycl-id-queries-fit-in-int' to disable range check" error.
This kernel was used instead of memset to avoid some issues with the
free_memory queries not always being updated.
As we can't reproduce these with recent drivers, we now use memset,
which fixes rendering with BVH2.
This enables scenes with all textures not fitting in GPU
memory to finally render. For scenes that are fitting,
no functional change or performance change is expected.
Pull Request: https://projects.blender.org/blender/blender/pulls/122385
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.
If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
OpenImageDenoise V2 comes with GPU support for various backends. This adds a new class, OIDNDenoiserGPU, in order to add this functionality into the existing Cycles post processing pipeline without having to change it much. OptiX and OIDN CPU denoising remain as they are. Rendering on a supported Intel GPU will automatically select the GPU denoiser.
Device support is initially limited to the oneAPI devices that are supported by Cycles, but can be extended.
Ref #115045
Co-authored-by: Stefan Werner <stefan.werner@intel.com>
Co-authored-by: Ray Molenkamp <github@lazydodo.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/108314
<algorithm> header include is missing from some sycl headers, this will
be fixed upstream with https://github.com/intel/llvm/pull/10424,
meanwhile, we work around it by including it directly.
When running oneAPI with AoT binaries, on hardware that's not compatible with
these, recompilation could have been missing from the kernels loading phase and
happen during execution instead.
These changes fixes it, any kernel compilation will now happen during the
kernels loading phase.
Updated Embree 4 library with GPU support is required for it to be
compiled - compatiblity with Embree 3 and Embree 4 without GPU support
is maintained.
Enabling hardware raytracing is an opt-in user setting for now.
Pull Request: https://projects.blender.org/blender/blender/pulls/106266
This functionality is related only to debugging of SYCL implementation
via single-threaded CPU execution and is disabled by default.
Host device has been deprecated in SYCL 2020 spec and we removed it
in 305b92e05f.
Since this is still very useful for debugging, we're restoring a
similar functionality here through SYCL 2020 Host Task.
Test kernel will now test functionalities related to kernel execution
with USM memory allocations instead of with SYCL buffers and accessors
as these aren't currently used in the backend.
JIT compilation of oneAPI kernels now happens during load stage
and proper message gets shown in the GUI during compilation.
Also, this implementation skips kernels that aren't needed for
the used scene, reducing overall (re)compilation time.
This is a minimal set of changes, allowing a lot of cleanup that can
happen afterward as it allows sycl method and objects to be used outside
of kernel.cpp.
Reviewed By: brecht, sergey
Differential Revision: https://developer.blender.org/D15397
Windows drivers 101.3430 fix an important GUI-related crash and it's
best to prevent users from running into it.
Linux drivers weren't affected but still had relevant gpu binary
compatibility fixes, so it makes sense to keep the min-supported version
aligned across OSes.
sycl/L0 runtime reports compute-runtime version since Intel graphics
driver 101.3268 on Windows, when querying driver version from sycl.
Prior to this driver, it was 0. Now we can bump minimum requirement to
this one and filter-out devices returning 0.
Maniphest Tasks: T100648
The number of Execution Units and resident "threads" (simd width * threads
per EUs) are now exposed and used to select the number of states using
a simplified heuristic.
The current specific CentOS7 workaround we have for AoT, which is to
disable __FAST_MATH__ by using -fhonor-nans, now also fixes the
compilation issue for JIT as well since at least driver 23570.
with a very high min-driver version requirement, placeholder until JIT
CentOS runtime compilation issue gets fixed in a defined version.
min-driver version check can be worked around by setting
CYCLES_ONEAPI_ALL_DEVICES environment variable.
We used it only to access device id for explicitly allowing Arc GPUs.
It made the backend require ze_loader.dll which could be problematic if
we end up using direct linking.
I've replaced filtering based on PCI device id by using other HW properties
instead (EUs, threads per EU), that are now available through Level-Zero.
Initially oneAPI implementation have waited after each memory
operation, even if there was no need for this. Now, the implementation
will wait only if it is really necessary - it have improved
performance noticeble for some scenes and a bit for the rest of them.
This patch adds a new Cycles device with similar functionality to the
existing GPU devices. Kernel compilation and runtime interaction happen
via oneAPI DPC++ compiler and SYCL API.
This implementation is primarly focusing on Intel® Arc™ GPUs and other
future Intel GPUs. The first supported drivers are 101.1660 on Windows
and 22.10.22597 on Linux.
The necessary tools for compilation are:
- A SYCL compiler such as oneAPI DPC++ compiler or
https://github.com/intel/llvm
- Intel® oneAPI Level Zero which is used for low level device queries:
https://github.com/oneapi-src/level-zero
- To optionally generate prebuilt graphics binaries: Intel® Graphics
Compiler All are included in Linux precompiled libraries on svn:
https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for
Windows precompiled binaries but for the graphics compiler, available
as "Intel® Graphics Offline Compiler for OpenCL™ Code" from
https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html,
for which path can be set as OCLOC_INSTALL_DIR.
Being based on the open SYCL standard, this implementation could also be
extended to run on other compatible non-Intel hardware in the future.
Reviewed By: sergey, brecht
Differential Revision: https://developer.blender.org/D15254
Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com>
Co-authored-by: Stefan Werner <stefan.werner@intel.com>