40 Commits

Author SHA1 Message Date
Xavier Hallade
1489c5a57b Merge branch 'blender-v3.6-release' 2023-06-23 13:12:58 +02:00
Xavier Hallade
6437c0c948 Cycles: oneAPI: avoid crashes from old drivers
During recent testing, the oldest drivers, 101.4032 (Windows) and
<25812 (Linux), led to crashes during JIT compilation, so we bump the
requirement to the newer 101.4313 and 25812.14 drivers, which incorporate
the required fixes.

Pull Request: https://projects.blender.org/blender/blender/pulls/109281
2023-06-23 13:12:21 +02:00
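
Note: the minimum-driver requirement described in the commit above can be illustrated with a small version comparison. This is only a sketch, assuming driver versions arrive as dotted numeric strings such as "101.4313" or "25812.14"; the helper names `parse_version` and `version_at_least` are hypothetical and are not taken from the Cycles sources.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a dotted version string ("101.4313") into
// numeric components. Characters other than digits and dots are ignored.
static std::vector<unsigned long> parse_version(const std::string &text)
{
  std::vector<unsigned long> parts;
  unsigned long current = 0;
  for (char c : text) {
    if (c == '.') {
      parts.push_back(current);
      current = 0;
    }
    else if (c >= '0' && c <= '9') {
      current = current * 10 + static_cast<unsigned long>(c - '0');
    }
  }
  parts.push_back(current);
  return parts;
}

// Hypothetical helper: true when `found` is the same as or newer than
// `minimum`, comparing component by component (missing components count as 0).
bool version_at_least(const std::string &found, const std::string &minimum)
{
  const std::vector<unsigned long> a = parse_version(found);
  const std::vector<unsigned long> b = parse_version(minimum);
  const std::size_t count = a.size() > b.size() ? a.size() : b.size();
  for (std::size_t i = 0; i < count; i++) {
    const unsigned long x = i < a.size() ? a[i] : 0;
    const unsigned long y = i < b.size() ? b[i] : 0;
    if (x != y) {
      return x > y;
    }
  }
  return true;
}
```

With these helpers, `version_at_least("101.4032", "101.4313")` is false, so the older Windows driver would be rejected under the new requirement.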
Campbell Barton
c12994612b License headers: use SPDX-FileCopyrightText in intern/cycles 2023-06-14 16:53:23 +10:00
Xavier Hallade
23de320878 Cycles: fix multi-device rendering with oneAPI and Hardware Raytracing
Only Embree CPU BVH was built in the multi-device case. However, one
Embree GPU BVH is needed per GPU, so we now reuse the same logic as in
the other backends.

Pull Request: https://projects.blender.org/blender/blender/pulls/107992
2023-05-22 15:26:58 +02:00
Campbell Barton
bf36a61e62 Cleanup: spelling in comments & some corrections 2023-05-20 21:17:09 +10:00
Nikita Sirgienko
bafd82c9c1 Cycles: oneAPI: use local memory for faster shader sorting
Co-authored-by: Stefan Werner <stefan.werner@intel.com>

Pull Request: https://projects.blender.org/blender/blender/pulls/107994
2023-05-17 11:07:57 +02:00
Xavier Hallade
5c57b9aa79 Cleanup: avoid warning on unused argument in cycles_device 2023-05-15 07:35:03 +02:00
Xavier Hallade
5ec2495550 Cycles: oneAPI: enable Hardware Raytracing for Raytrace/MNEE kernels
We do so if Embree 4.1+ is present.
2023-05-12 14:17:50 +02:00
Nikita Sirgienko
1dcc8e6ffa Fix #107356: Cycles: improve oneAPI error handling 2023-05-03 12:06:08 +02:00
Campbell Barton
6859bb6e67 Cleanup: format (with BraceWrapping::AfterControlStatement "MultiLine") 2023-05-02 09:37:49 +10:00
Nikita Sirgienko
0d9fa73b42 Cycles: oneAPI: Fix motion blur rendering for Embree GPU execution
CPU non-unified shared memory was used for shared geometry buffers.
For the Embree GPU case, we now create new geometry buffers on GPU instead.
2023-04-20 21:20:33 +02:00
Nikita Sirgienko
7ce10ebbbf Cycles: oneAPI: Remove excess quotes in a capabilities output 2023-04-20 11:09:16 +02:00
Campbell Barton
eb2867de90 Cleanup: spelling in comments 2023-04-19 08:02:41 +10:00
Xavier Hallade
4382a0b350 Cleanup: avoid warnings from gcc in oneAPI device compilation
When building using GCC and with Embree without GPU support, there were
a few unused variables and a non-defined macro.
2023-04-18 22:40:40 +02:00
Xavier Hallade
70892e82ac Cycles: oneAPI: use specialization constant to compile with/without Embree on GPU 2023-04-18 22:09:42 +02:00
Xavier Hallade
9821a2d397 Cycles: pass kernel features to get_bvh_layout_mask
This allows selectively disabling Hardware Raytracing in the oneAPI
backend, depending on the features used.
2023-04-18 22:09:42 +02:00
Nikita Sirgienko
3f8c995109 Cycles: add hardware raytracing support to oneAPI device
An updated Embree 4 library with GPU support is required for it to be
compiled; compatibility with Embree 3 and with Embree 4 without GPU
support is maintained.
Enabling hardware raytracing is an opt-in user setting for now.

Pull Request: https://projects.blender.org/blender/blender/pulls/106266
2023-04-18 22:09:42 +02:00
Xavier Hallade
887022257d Cycles: update DPCPP to 2022-12 release
We also backport a patch to program_manager to it, as 61e51015a5
helps avoid unnecessary recompilation when enumerating available
kernels.
2023-04-18 22:09:41 +02:00
Campbell Barton
266d8de687 Cleanup: spelling in comments 2023-02-03 12:41:01 +11:00
Xavier Hallade
8afcecdf1f Cycles: update Intel Graphics compiler to 101.4032 on Windows
A noticeable (>5%) performance regression in the oneAPI backend came with
a501a2dbff. Updating to the latest graphics compiler from driver 101.4032
fixes it.

I've tested it with current min-supported drivers and it runs well but
since compatibility of graphics compiler with older drivers isn't
guaranteed, I'm also bumping the min-supported driver versions.

If end users consider the latest drivers too new to switch to (the version
isn't released as stable on Linux as of today, but should be before the
Blender 3.5 release), the CYCLES_ONEAPI_ALL_DEVICES=1 environment variable
can be used.

Intel Graphics Compiler on Linux will be updated in a later commit
so we can then close D16984.

Reviewed By: sergey, LazyDodo
2023-01-23 19:36:34 +01:00
Nikita Sirgienko
858fffc2df Cycles: oneAPI: add support for SYCL host task
This functionality is related only to debugging of the SYCL implementation
via single-threaded CPU execution and is disabled by default.
The host device has been deprecated in the SYCL 2020 spec and we removed it
in 305b92e05f.
Since this is still very useful for debugging, we're restoring similar
functionality here through the SYCL 2020 Host Task.
2023-01-03 20:47:24 +01:00
Nikita Sirgienko
f07b09da27 Cycles: Improve oneAPI backend support for non-Intel platforms 2022-11-25 17:46:59 +01:00
Nikita Sirgienko
412642865d Cleanup: Resolve a warning for the ambiguity on the parenthesis in oneAPI code
No functional changes.
2022-11-24 18:05:02 +01:00
Xavier Hallade
454dd3f7f0 Cycles: fix up logic in oneAPI devices filtering
CYCLES_ONEAPI_ALL_DEVICES environment variable wasn't working as
intended after 305b92e05f.
2022-10-27 23:09:14 +02:00
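
Note: the commit above concerns an environment variable that bypasses device filtering. The sketch below shows one way such an override could be wired up. It assumes the variable counts as enabled whenever it is set, which is an assumption of this example and not confirmed by the log; `all_devices_requested` and `device_is_allowed` are hypothetical names.

```cpp
#include <cstdlib>

// Assumption: the override is active whenever the variable is present in the
// environment, regardless of its value.
bool all_devices_requested()
{
  return std::getenv("CYCLES_ONEAPI_ALL_DEVICES") != nullptr;
}

// Hypothetical filter: a device is listed if it passes the usual checks
// (for example supported hardware and driver version) or if the user asked
// for all devices to be exposed.
bool device_is_allowed(bool meets_requirements, bool allow_all)
{
  return allow_all || meets_requirements;
}
```

A caller would evaluate `all_devices_requested()` once and pass the result to `device_is_allowed` for each enumerated device.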
Michael Jones
8dd7b5b26b Cycles: Metal integrator state size tuning
This patch tunes the integrator state sizing for Metal (`num_concurrent_states` and `num_concurrent_busy_states`).

On all GPU architectures, we adjust the busy:total states ratio to be 1:4, which gives better rendering performance than the previous 1:16 ratio (independent of total state count). This gives a small performance uplift (e.g. 2-3% on M1 Ultra).

Additionally for M2 architectures, we double the overall state size if there is available headroom. Inclusive of the first change, we can expect uplift of close to 10% in future, as this results in larger dispatch sizes and minimises work submission overheads. In order to make an accurate determination of available headroom, we defer the calculation of `num_concurrent_states` and `num_concurrent_busy_states` until the time of integrator state allocation (i.e. after all of the scene data has been allocated). We also refactor `alloc_integrator_soa` to calculate an *exact* single-state-size in a first pass, right before allocating the integrator SoA buffers in a second pass.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16313
2022-10-24 17:14:33 +01:00
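
Note: the sizing rule described in the commit above (a 1:4 busy:total ratio, plus doubling on M2 when headroom allows) can be sketched as follows. How headroom is actually measured is not stated in the log, so this example simply assumes it is a byte budget compared against the exact per-state size; `StateCounts` and `tune_states` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical result type for the two tuned values.
struct StateCounts {
  std::uint64_t num_concurrent_states;
  std::uint64_t num_concurrent_busy_states;
};

// Sketch of the tuning rule. `headroom_bytes` is an assumed input: the memory
// still available after all scene data has been allocated.
StateCounts tune_states(std::uint64_t base_states,
                        bool is_m2,
                        std::uint64_t single_state_size_bytes,
                        std::uint64_t headroom_bytes)
{
  std::uint64_t total = base_states;
  // Double the state count on M2 only if the doubled buffer still fits.
  if (is_m2 && single_state_size_bytes != 0 &&
      total * 2 <= headroom_bytes / single_state_size_bytes)
  {
    total *= 2;
  }
  // Busy states are a quarter of the total (1:4 instead of the old 1:16).
  return {total, total / 4};
}
```

Deferring the call until after scene allocation, as the commit describes, is what makes the headroom value meaningful.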
Xavier Hallade
0cfac5b043 Cycles: oneAPI: migrate from deprecated APIs, require libSYCL 6.0+
sycl::info::device::ext_intel_* descriptors are deprecated,
replaced with sycl::ext::intel::info::device:: that are available from
6.0+, for which we now check version in CMake.
2022-10-21 15:36:49 +02:00
Xavier Hallade
305b92e05f Cycles: oneAPI: remove use of SYCL host device
Host device is deprecated in SYCL 2020 spec, cpu device or standard C++
should be used instead.
2022-10-21 15:36:48 +02:00
Xavier Hallade
2943997d2a Cycles: oneAPI: include sycl/sycl.hpp instead of CL/sycl.hpp
Since SYCL 2020 API, sycl/sycl.hpp is the way.
2022-10-19 16:42:10 +02:00
Xavier Hallade
d816bae7bf Cycles: oneAPI: fix check_usm for debug builds 2022-10-19 16:42:10 +02:00
Werner, Stefan
c32a455605 Cleanup: Fixed some warnings
Some unused parameters were left after changing the oneAPI device code
to be a directly linked shared library.
2022-10-13 09:45:53 +02:00
Nikita Sirgienko
82a5790d2a Cycles: oneAPI: Trigger compilation of used kernels only
JIT compilation of oneAPI kernels now happens during load stage
and proper message gets shown in the GUI during compilation.
Also, this implementation skips kernels that aren't needed for
the used scene, reducing overall (re)compilation time.
2022-10-10 16:38:11 +02:00
Xavier Hallade
7eeeaec6da Cycles: use direct linking for oneAPI backend
This is a minimal set of changes, allowing a lot of cleanup that can
happen afterward, as it allows SYCL methods and objects to be used
outside of kernel.cpp.

Reviewed By: brecht, sergey

Differential Revision: https://developer.blender.org/D15397
2022-10-07 09:50:05 +02:00
Nikita Sirgienko
2ead05d738 Cycles: Add optional per-kernel performance statistics
When verbose level 4 is enabled, Blender prints kernel performance
data for Cycles on GPU backends (except Metal that doesn't use
debug_enqueue_* methods) for groups of kernels.
These changes introduce a new CYCLES_DEBUG_PER_KERNEL_PERFORMANCE
environment variable to allow getting timings for each kernel
separately and not grouped with others. This is done by adding
explicit synchronization after each kernel execution.

Differential Revision: https://developer.blender.org/D15971
2022-09-27 22:15:00 +02:00
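
Note: the difference between grouped and per-kernel timings described in the commit above comes down to where synchronization happens. The sketch below models kernels and the queue synchronization as plain callables; it is not the Cycles queue API, and `per_kernel_timing_requested` and `run_and_time` are hypothetical names. Treating the variable as enabled whenever it is set is also an assumption.

```cpp
#include <chrono>
#include <cstdlib>
#include <functional>
#include <vector>

// Assumption: the mode is enabled whenever the variable is set.
bool per_kernel_timing_requested()
{
  return std::getenv("CYCLES_DEBUG_PER_KERNEL_PERFORMANCE") != nullptr;
}

// Enqueue every kernel and return elapsed times in seconds. With `per_kernel`
// set, synchronize after each kernel and report one timing per kernel;
// otherwise synchronize once at the end and report a single grouped timing.
std::vector<double> run_and_time(const std::vector<std::function<void()>> &kernels,
                                 const std::function<void()> &synchronize,
                                 bool per_kernel)
{
  using clock = std::chrono::steady_clock;
  std::vector<double> seconds;
  clock::time_point start = clock::now();
  for (const std::function<void()> &enqueue : kernels) {
    enqueue();
    if (per_kernel) {
      synchronize();
      const clock::time_point end = clock::now();
      seconds.push_back(std::chrono::duration<double>(end - start).count());
      start = end;
    }
  }
  if (!per_kernel) {
    synchronize();
    seconds.push_back(std::chrono::duration<double>(clock::now() - start).count());
  }
  return seconds;
}
```

The extra synchronization serializes execution, which is why a mode like this is better suited to debugging than to normal rendering.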
Sergey Sharybin
3c2c296130 Fix compilation error on Windows after recent change 2022-09-13 11:52:11 +02:00
Sergey Sharybin
602cca671e Cycles: Include reason the oneAPI library could not be loaded
Additionally, stick to purely stating the error. Such messages
are aimed at developers, and it is rather implied that oneAPI
rendering will be disabled.
2022-09-13 10:52:18 +02:00
Nikita Sirgienko
8b11ed392c Cycles: Fix crashes in oneAPI backend for scenes not fitting in dGPU memory
Differential Revision: https://developer.blender.org/D15889
2022-09-06 15:38:15 +02:00
Xavier Hallade
d706d0460c Cycles oneAPI: simplify num_concurrent_states selection
The number of Execution Units and resident "threads" (SIMD width * threads
per EU) are now exposed and used to select the number of states using
a simplified heuristic.
2022-07-27 09:45:33 +02:00
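
Note: the log does not spell out the heuristic itself, so the following is only an illustration of how the exposed quantities could combine. The `states_per_thread` multiplier is purely an assumption of this sketch, as are the function names.

```cpp
#include <cstdint>

// Resident "threads" across the device: SIMD width * threads per EU,
// multiplied by the number of Execution Units.
std::uint64_t resident_threads(std::uint64_t execution_units,
                               std::uint64_t threads_per_eu,
                               std::uint64_t simd_width)
{
  return execution_units * threads_per_eu * simd_width;
}

// Hypothetical heuristic: scale the resident thread count by a fixed
// multiplier to pick the number of concurrent states.
std::uint64_t pick_num_concurrent_states(std::uint64_t execution_units,
                                         std::uint64_t threads_per_eu,
                                         std::uint64_t simd_width,
                                         std::uint64_t states_per_thread)
{
  return resident_threads(execution_units, threads_per_eu, simd_width) * states_per_thread;
}
```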
Xavier Hallade
41c10ac84a Cycles: fix support for multiple Intel GPUs
Identical Intel GPUs ended up with the same id.
Added PCI BDF to the id to make it unique.
2022-07-01 11:20:00 +02:00
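
Note: the fix above makes ids unique by appending the PCI BDF (bus, device, function) address. A minimal sketch of that idea follows; the exact id layout used by Cycles is not given in the log, so the "name_domain:bus:device.function" format and the `make_device_id` helper are assumptions.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: build an id from the device name plus its PCI address,
// formatted in the common "domain:bus:device.function" hexadecimal notation.
std::string make_device_id(const std::string &name,
                           unsigned domain,
                           unsigned bus,
                           unsigned device,
                           unsigned function)
{
  char bdf[40];
  std::snprintf(bdf, sizeof(bdf), "%04x:%02x:%02x.%x", domain, bus, device, function);
  return name + "_" + bdf;
}
```

Two identical GPUs sit at different PCI addresses, so their ids differ even though the device name is the same.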
Campbell Barton
b6c28002ac Cleanup: spelling in comments 2022-06-30 12:14:22 +10:00
Xavier Hallade
a02992f131 Cycles: Add support for rendering on Intel GPUs using oneAPI
This patch adds a new Cycles device with similar functionality to the
existing GPU devices.  Kernel compilation and runtime interaction happen
via oneAPI DPC++ compiler and SYCL API.

This implementation is primarily focused on Intel® Arc™ GPUs and other
future Intel GPUs.  The first supported drivers are 101.1660 on Windows
and 22.10.22597 on Linux.

The necessary tools for compilation are:
- A SYCL compiler such as oneAPI DPC++ compiler or
  https://github.com/intel/llvm
- Intel® oneAPI Level Zero which is used for low level device queries:
  https://github.com/oneapi-src/level-zero
- To optionally generate prebuilt graphics binaries: Intel® Graphics
  Compiler

All are included in Linux precompiled libraries on svn:
https://svn.blender.org/svnroot/bf-blender/trunk/lib
The same goes for Windows precompiled binaries, except for the graphics
compiler, which is available as "Intel® Graphics Offline Compiler for
OpenCL™ Code" from
https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html,
for which the path can be set as OCLOC_INSTALL_DIR.

Being based on the open SYCL standard, this implementation could also be
extended to run on other compatible non-Intel hardware in the future.

Reviewed By: sergey, brecht

Differential Revision: https://developer.blender.org/D15254

Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com>
Co-authored-by: Stefan Werner <stefan.werner@intel.com>
2022-06-29 12:58:04 +02:00