Commit Graph

84 Commits

Author SHA1 Message Date
Xavier Hallade
90a10dcd50 Cycles: Adjust inlining attributes for oneAPI device
Now ccl_device sets inlining and ccl_device_inline forces inlining.
This matches more closely with what is currently done for cuda and metal
backends.
I've measured from 1% to 6% overall performance improvement in rendering
benchmark scenes on Arc B580, as well as a small decrease in compile
time.
2025-03-03 18:20:02 +01:00
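As a rough illustration of the distinction described in the commit above, the sketch below separates an inlining hint from forced inlining. The macro names `my_device` and `my_device_inline` are hypothetical stand-ins, not the actual Cycles definitions, which live in the oneAPI compatibility header and may use different attributes.

```cpp
#include <cassert>

// Hypothetical stand-ins for ccl_device / ccl_device_inline (assumption:
// the real macros for the oneAPI device may expand differently).
#if defined(_MSC_VER)
#  define my_device inline
#  define my_device_inline __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define my_device inline
#  define my_device_inline inline __attribute__((always_inline))
#else
#  define my_device inline
#  define my_device_inline inline
#endif

// "Sets inlining": the compiler is free to decide whether to inline.
my_device float scale(const float x)
{
  return 2.0f * x;
}

// "Forces inlining": the call is always expanded at the call site.
my_device_inline float offset(const float x)
{
  return x + 1.0f;
}

// Small caller combining both kinds of function.
my_device float shade(const float x)
{
  return offset(scale(x));
}
```

With these definitions, `assert(shade(1.5f) == 4.0f)` holds; only the inlining behavior differs between the two qualifiers, not the result.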
Lukas Stockner
8cb5e05c48 Cleanup: Cycles: Deduplicate kernel attribute code using templating
The attribute handling code in the kernel is currently highly duplicated since
it needs to handle five different data types and we couldn't use templates
back then.
We can now, so might as well make use of it and get rid of ~1000 lines.

There are also some small fixes for the GPU OSL code:
- Wrong derivative for .w component when converting float2/float3->float4
- Different conversion for float2->float (CPU averages, GPU used to take .x)
- Removed useless code for converting to float2, not used by OSL

Pull Request: https://projects.blender.org/blender/blender/pulls/134694
2025-02-20 19:28:45 +01:00
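The deduplication idea in the commit above can be sketched as one function template that replaces a per-type family of readers. This is a minimal sketch under assumptions: `Float2`, `Float3`, `to_float` and `read_attribute_as_float` are hypothetical names, not the kernel's actual types or functions. The `Float2` overload averages its components, matching the CPU behavior mentioned in the list of fixes.

```cpp
#include <cassert>

// Hypothetical vector types standing in for the kernel's float2/float3.
struct Float2 {
  float x, y;
};
struct Float3 {
  float x, y, z;
};

// Per-type conversion to a scalar. Vector types are averaged rather than
// truncated to .x, following the CPU behavior described in the commit.
inline float to_float(const float v)
{
  return v;
}
inline float to_float(const Float2 &v)
{
  return (v.x + v.y) * 0.5f;
}
inline float to_float(const Float3 &v)
{
  return (v.x + v.y + v.z) * (1.0f / 3.0f);
}

// One template instead of a separate reader per data type.
template<typename T> float read_attribute_as_float(const T *data, const int index)
{
  assert(data != nullptr && index >= 0);
  return to_float(data[index]);
}
```

For example, reading element 0 of a `Float2` array holding `{1.0f, 3.0f}` yields `2.0f`, while reading from a plain `float` array returns the stored value unchanged.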
Nikita Sirgienko
2bab4ae370 Cycles: oneAPI: Optimize texture access by using GPU HW sampler
The current usage of software-based texture operations in
the oneAPI implementation puts additional register pressure on
the GPU compiler during register allocation. And it also creates
code that requires maintenance. This commit is intended to address
this situation by utilizing a recently productized SYCL bindless
texture API to enable HW-based texture operations using
Intel GPUs' hardware sampler.

This currently translates to 1-11% rendering speedups (scene-specific)
on my Arc A770 and Arc B580. At the moment, there are small
performance regressions with NanoVDB texture operations on Arc B580
and small performance regressions in shade surface MNEE and Raytrace
kernels on Arc A770, but they look recoverable and will be handled
in the future.

Pull Request: https://projects.blender.org/blender/blender/pulls/133457
2025-02-12 21:47:34 +01:00
Nikita Sirgienko
a0b7ad436b Cleanup: Cycles: oneAPI: Switch to non-experimental work item API
There is now a non-experimental API for this_work_item functionality, so
let's use it for better code quality and also to avoid the deprecation
warning during compilation.

No functional or performance changes are expected.

Pull Request: https://projects.blender.org/blender/blender/pulls/133472
2025-02-12 21:46:22 +01:00
Brecht Van Lommel
9971648783 Refactor: Cycles: Replace new/delete by unique_ptr, in simple cases
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:30 +01:00
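For readers unfamiliar with the pattern named in the commit title above, here is a minimal sketch of replacing a manually managed pointer with `std::unique_ptr`. The `Node` and `Graph` types are hypothetical and are not taken from Cycles.

```cpp
#include <cassert>
#include <memory>

// Hypothetical types for illustration only.
struct Node {
  int value;
  explicit Node(const int v) : value(v) {}
};

struct Graph {
  // Before: `Node *root;` with `new Node(v)` in the constructor and
  // `delete root;` in a hand-written destructor.
  std::unique_ptr<Node> root;

  explicit Graph(const int v) : root(std::make_unique<Node>(v)) {}
  // No destructor needed: root is released automatically.
};
```

After constructing `Graph g(7)`, `g.root->value` is `7`, and the node is freed when `g` goes out of scope.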
Brecht Van Lommel
57ff24cb99 Refactor: Cycles: Add const keyword to more function parameters
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:24 +01:00
Brecht Van Lommel
dd51c8660b Refactor: Cycles: Add const keyword where possible, using clang-tidy
Check was misc-const-correctness, combined with readability-isolate-declaration
as suggested by the docs.

Temporarily clang-format "QualifierAlignment: Left" was used to get consistency
with the prevailing order of keywords.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:20 +01:00
Brecht Van Lommel
3a57b97eba Cleanup: Cycles: Remove unneeded oneAPI double emulation for NanoVDB
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:59 +01:00
Brecht Van Lommel
d0c2e68e5f Refactor: Cycles: Automated clang-tidy fixups in Cycles
* Use .empty() and .data()
* Use nullptr instead of 0
* No else after return
* Simple class member initialization
* Add override for virtual methods
* Include C++ instead of C headers
* Remove some unused includes
* Use default constructors
* Always use braces
* Consistent names in definition and declaration
* Change typedef to using

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:55 +01:00
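Several of the fixups listed in the commit above can be seen together in a small example. This is only a sketch with hypothetical types (`Shape`, `Mesh`, `IndexList`); the comments note what each line might have looked like before the automated cleanup.

```cpp
#include <cassert>
#include <cstddef>  // C++ header instead of <stddef.h>
#include <vector>

using IndexList = std::vector<int>;  // was: typedef std::vector<int> IndexList;

struct Shape {
  virtual ~Shape() = default;
  virtual int first_index() const = 0;
};

struct Mesh : Shape {
  IndexList indices;
  const int *cached = nullptr;  // was: cached(0) in a constructor initializer

  Mesh() = default;  // was: Mesh() {}

  int first_index() const override  // was: virtual, without override
  {
    if (indices.empty()) {  // was: indices.size() == 0, without braces
      return -1;
    }
    // No else after return; .data() instead of &indices[0].
    return indices.data()[0];
  }
};
```

A default-constructed `Mesh` returns `-1` from `first_index()`; after `indices.push_back(4)` it returns `4`.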
Brecht Van Lommel
5c46063607 Refactor: Cycles: Make kernel headers work by themselves
Shuffle around some code and add more includes so that individual
header files compile without errors.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:50 +01:00
Weizhen Huang
e2d7681fe6 Cleanup: Cycles: remove unused ccl_loop_no_unroll
It was added in 6121c28501 to ensure compilation
on OpenCL; the definition is now empty on all platforms.

Pull Request: https://projects.blender.org/blender/blender/pulls/131100
2024-11-28 16:37:01 +01:00
Nikita Sirgienko
2aa9203f2f Cycles: Reintroduce noinline keyword for oneAPI device
In 891d71a4d4 this keyword was
dropped due to a performance regression after
fdc2962beb, but the code currently
does not experience this performance degradation, and in fact
there is a minor performance improvement on Lunar Lake GPUs,
along with an expected improvement in compile time.
However, this change brings a minor performance regression to
the shade_surface kernel on Intel Arc and Meteor Lake GPUs, which
will be solved later by disabling this keyword for
these platforms only.

Pull Request: https://projects.blender.org/blender/blender/pulls/130299
2024-11-15 12:09:37 +01:00
Xavier Hallade
b614953971 Cycles: oneAPI: fix Linux compilation with fno-honor-nans
Previously, when compiling on Rocky Linux 8 with fno-honor-nans, compile
time was more than 5x longer than expected, and there was an unresolved
symbol to __sqrtf_finite in GPU binaries.
After defining sqrtf in compat.h, both issues are effectively gone. This
was certainly due to problematic interactions with the build system's math
library headers.
So we can remove the current workaround of defining fhonor-nans, and now
have the same set of flags on both Windows and Linux.
2024-10-04 17:50:24 +02:00
Nikita Sirgienko
94c9898f41 Fix #124811: Cycles: oneAPI: no hair strands in viewport with Embree
The oneAPI kernel preloading logic allowed unneeded kernels to be
compiled without features, which were then missing when these kernels
were needed later.

Pull Request: https://projects.blender.org/blender/blender/pulls/127114
2024-09-04 11:08:00 +02:00
Xavier Hallade
1a0dbbd242 Fix: Cannot render Victor and Spring with embree disabled on Intel GPUs
The kernel used to zero memory since we added host memory fallback didn't
expect large inputs, so with these scenes, it was running into
"Provided range is out of integer limits. Pass
`-fno-sycl-id-queries-fit-in-int' to disable range check" error.

This kernel was used instead of memset to avoid some issues with the
free_memory queries not always being updated.
As we can't reproduce these with recent drivers, we now use memset,
which fixes rendering with BVH2.
2024-09-02 18:35:51 +02:00
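The failure mode in the commit above comes from an element count that no longer fits in a signed `int`. The sketch below is a host-side analogue under assumptions, not the oneAPI code: `range_fits_in_int` and `zero_buffer` are hypothetical helpers, and plain `std::memset` stands in for the device memset the commit switches to.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstring>

// True if a launch range of n work items can be indexed with a signed int,
// which is the limit behind the "out of integer limits" error.
inline bool range_fits_in_int(const std::size_t n)
{
  return n <= static_cast<std::size_t>(INT_MAX);
}

// Zeroing through memset takes a byte count of type size_t, so it is not
// subject to the int-sized range check of a custom zeroing kernel.
inline void zero_buffer(void *ptr, const std::size_t bytes)
{
  assert(ptr != nullptr || bytes == 0);
  if (bytes != 0) {
    std::memset(ptr, 0, bytes);
  }
}
```

A range of `static_cast<std::size_t>(INT_MAX) + 1` items fails the check, which is roughly the situation large scenes ran into.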
Nikita Sirgienko
759bb6c768 Cycles: oneAPI: Enable host memory migration
This finally allows scenes whose textures do not all fit in GPU
memory to render. For scenes that do fit,
no functional change or performance change is expected.

Pull Request: https://projects.blender.org/blender/blender/pulls/122385
2024-05-28 19:04:19 +02:00
Xavier Hallade
891d71a4d4 Cycles: Drop noinline keyword for oneAPI device
fdc2962beb indirectly introduced a change
in inlining (light_tree_pdf started getting inlined) that led to a 5-10%
drop in performance for most scenes.
Dropping the noinline keyword for oneAPI device recovers it.
However, it brings another performance regression to MNEE and Raytrace
kernels, which we'll look into separately.
2024-04-02 18:29:35 +02:00
Brecht Van Lommel
d377ef2543 Clang Format: bump to version 17
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.

If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
2024-01-03 13:38:14 +01:00
Brecht Van Lommel
6cdb43195e Refactor: replace NanoVDB kernel side implementation by own code
The NanoVDB headers are not compatible with Metal due to missing address
space qualifiers. We currently have a big patch for NanoVDB header
files, which is difficult to update for OpenVDB 11. Instead extract a
few hundred lines of code from NanoVDB to do just what we need.

Pull Request: https://projects.blender.org/blender/blender/pulls/115992
2023-12-10 19:37:36 +01:00
Brecht Van Lommel
8ba474dc4f Refactor: replace NanoVDB SampleFromVoxels by own code
This makes the GPU tricubic implementation more efficient. The dense
grid code implemented this in terms of trilinear lookups that are
hardware accelerated, but for NanoVDB this just causes unnecessary voxel
reads. Instead match the CPU code.

Pull Request: https://projects.blender.org/blender/blender/pulls/115992
2023-12-10 19:37:36 +01:00
Stefan Werner
02b5e27f89 Cycles: Add Intel GPU support for OpenImageDenoise
OpenImageDenoise V2 comes with GPU support for various backends. This adds a new class, OIDNDenoiserGPU, in order to add this functionality into the existing Cycles post processing pipeline without having to change it much. OptiX and OIDN CPU denoising remain as they are. Rendering on a supported Intel GPU will automatically select the GPU denoiser.

Device support is initially limited to the oneAPI devices that are supported by Cycles, but can be extended.

Ref #115045

Co-authored-by: Stefan Werner <stefan.werner@intel.com>
Co-authored-by: Ray Molenkamp <github@lazydodo.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/108314
2023-11-20 11:12:41 +01:00
Xavier Hallade
d26a2b09bc Cycles: oneAPI: use hardware cos
Speckles and missing lights were experienced in scenes with Nishita Sky
Texture and a Sun Size smaller than 1.5°, such as in Lone Monk and Attic
scenes.
We previously worked around these by using a more precise
software implementation of cosine.
After recent changes in Cycles, it turns out this workaround isn't
currently needed.
2023-10-06 13:10:27 +02:00
Campbell Barton
2721b937fb Cleanup: use braces in headers
2023-09-24 14:52:38 +10:00
Xavier Hallade
01931e213f Cycles: oneAPI: only export necessary symbols
The API for the kernels library is defined, there is no need to
export more than that. This change only affects Linux since hidden
visibility is the default on Windows.
2023-09-08 15:44:39 +02:00
Sergey Sharybin
71b4a97cbc Refactor: De-duplicate Metal RT self intersection checks
Use the common BVH utilities header for this.

Added a special type qualifier ccl_ray_data which is defined to ccl_private
for all platforms but Metal. On Metal it is defined to ray_data.

The tricky part is that the BVH utilities are wrapped into the Metal context
class. In some of the BVH functions the context has been already constructed,
but it wasn't done in all the callbacks.

From a quick render test of the Junkshop benchmark scene, there is no render
time difference.

No functional changes are expected.

Pull Request: https://projects.blender.org/blender/blender/pulls/111967
2023-09-05 17:21:49 +02:00
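The qualifier described in the commit above can be sketched with the preprocessor. This is an assumption-laden illustration: the guard macro `__KERNEL_METAL__`, the empty definition of `ccl_private`, and the `RayPayload` and `is_self_intersection` names are chosen for the example and may not match the actual headers.

```cpp
#include <cassert>

// Assumption for this sketch: outside of Metal, ccl_private expands to nothing.
#define ccl_private

// ccl_ray_data maps to ray_data on Metal and to ccl_private everywhere else.
#ifdef __KERNEL_METAL__
#  define ccl_ray_data ray_data
#else
#  define ccl_ray_data ccl_private
#endif

// Hypothetical payload; the real one carries more state.
struct RayPayload {
  int self_prim;
};

// A shared utility can now take the payload with a single signature on all
// platforms, instead of a Metal-specific copy.
inline bool is_self_intersection(const ccl_ray_data RayPayload &payload, const int prim)
{
  return payload.self_prim == prim;
}
```

On non-Metal builds the qualifier disappears entirely, so the function is an ordinary reference-taking helper.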
Xavier Hallade
40a39c2976 Cycles: oneAPI: cleanup: drop __spirv_ocl_cos workaround
As __FAST_MATH__ isn't defined anymore since
09df1f4caf, sycl::cos uses the precise
implementation, no need to call __spirv_ocl_cos anymore.
2023-08-31 13:10:29 +02:00
Nikita Sirgienko
abab47a805 Cycles: oneAPI: Refactoring of local size choice logic
2023-08-22 19:04:16 +02:00
Xavier Hallade
aefc9835f8 Cycles: oneAPI: fix kernel host-side compilation with MSVC 17.7
The <algorithm> header include is missing from some SYCL headers. This will
be fixed upstream with https://github.com/intel/llvm/pull/10424;
meanwhile, we work around it by including it directly.
2023-07-25 12:01:09 +02:00
Ray Molenkamp
235c564aa0 Cycles: re-Fixed oneAPI build on Windows
fixes one uint missed in a0846a60c9
2023-07-06 14:47:35 -06:00
Stefan Werner
a0846a60c9 Cycles: Fixed oneAPI build on Windows
Turns out uint wasn't defined this early in our kernels on Windows.
Using unsigned int instead should fix this.
2023-07-06 21:50:03 +02:00
Werner, Stefan
7befc40386 Cycles: Use sycl::bitcast in oneAPI backend
Using sycl::bitcast instead of union hack
2023-07-06 15:06:33 +02:00
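To show why a bit cast is preferable to the union hack mentioned in the commit above, here is a standard C++ analogue. It uses `std::memcpy` rather than the SYCL call, so it is a sketch of the technique and not the backend's code; `float_as_uint` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a float as a 32-bit unsigned integer.
// Reading the inactive member of a union is undefined behavior in C++;
// copying the bytes is well defined and compiles to the same code.
inline std::uint32_t float_as_uint(const float f)
{
  static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof(u));
  return u;
}
```

Assuming IEEE 754 single precision, `float_as_uint(1.0f)` is `0x3f800000`.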
Nikita Sirgienko
d801ffddff Cycles: oneAPI: Fix execution error with cryptomatte kernel
2023-06-29 14:51:49 +02:00
Campbell Barton
c12994612b License headers: use SPDX-FileCopyrightText in intern/cycles
2023-06-14 16:53:23 +10:00
Sergey Sharybin
ba3f26fac5 Cycles: light and shadow linking
With light linking, lights can be set to affect only specific objects in the
scene. Shadow linking additionally gives control over which objects act as
shadow blockers for a light.

Usage:
https://wiki.blender.org/wiki/Reference/Release_Notes/4.0/Cycles

Implementation:
https://wiki.blender.org/wiki/Source/Render/Cycles/LightLinking

Ref #104972
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
2023-05-24 14:11:47 +02:00
Campbell Barton
bf36a61e62 Cleanup: spelling in comments & some corrections
2023-05-20 21:17:09 +10:00
Nikita Sirgienko
bafd82c9c1 Cycles: oneAPI: use local memory for faster shader sorting
Co-authored-by: Stefan Werner <stefan.werner@intel.com>

Pull Request: https://projects.blender.org/blender/blender/pulls/107994
2023-05-17 11:07:57 +02:00
Nikita Sirgienko
b8173278b0 Cycles: oneAPI: set correct work group sizes for kernels that have a predefined one
2023-05-17 00:02:12 +02:00
Nikita Sirgienko
a17d07ee87 Cycles: oneAPI: Fix prevented execution with sycl runtime > 20230323
NanoVDB headers have unused code using "double" type, which is not supported on Arc GPUs.
Recent DPC++ changes enforced runtime verifications:
7663dc201d
which prevent execution when such a type is present, even if unused.
This solution avoids compiling double at all, similar to how it is done for Metal.
2023-05-17 00:00:52 +02:00
Xavier Hallade
5ec2495550 Cycles: oneAPI: enable Hardware Raytracing for Raytrace/MNEE kernels
We do so if Embree 4.1+ is present.
2023-05-12 14:17:50 +02:00
Campbell Barton
6859bb6e67 Cleanup: format (with BraceWrapping::AfterControlStatement "MultiLine")
2023-05-02 09:37:49 +10:00
Nikita Sirgienko
7e92fb92ec Cycles: oneAPI: Fix kernels preloading in case of incompatible AoT binaries
When running oneAPI with AoT binaries, on hardware that's not compatible with
these, recompilation could have been missing from the kernels loading phase and
happen during execution instead.

These changes fix it: any kernel compilation will now happen during the
kernels loading phase.
2023-04-20 21:20:33 +02:00
Campbell Barton
eb2867de90 Cleanup: spelling in comments
2023-04-19 08:02:41 +10:00
Xavier Hallade
70892e82ac Cycles: oneAPI: use specialization constant to compile with/without Embree on GPU
2023-04-18 22:09:42 +02:00
Nikita Sirgienko
3f8c995109 Cycles: add hardware raytracing support to oneAPI device
An updated Embree 4 library with GPU support is required for it to be
compiled; compatibility with Embree 3 and with Embree 4 without GPU support
is maintained.
Enabling hardware raytracing is an opt-in user setting for now.

Pull Request: https://projects.blender.org/blender/blender/pulls/106266
2023-04-18 22:09:42 +02:00
Michael Jones
5f61eca7af Cycles: Exploit non-uniform threadgroup sizes on Metal
This patch replaces `dispatchThreadgroups` with `dispatchThreads` which takes care of non-uniform threadgroup bounds. This allows us to remove the bounds guards in the integrator kernel entry points.

Pull Request: https://projects.blender.org/blender/blender/pulls/106217
2023-03-29 21:46:11 +02:00
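The arithmetic behind the bounds guards mentioned in the commit above can be sketched on the host. When only whole threadgroups are dispatched, the launch is padded up to a multiple of the group size and the extra threads must be rejected inside the kernel; dispatching an exact thread count removes that need. The helper names here are hypothetical.

```cpp
#include <cassert>

// Number of whole threadgroups needed to cover total_threads.
inline int threadgroups_needed(const int total_threads, const int group_size)
{
  assert(total_threads >= 0 && group_size > 0);
  return (total_threads + group_size - 1) / group_size;
}

// Threads launched beyond the requested count when dispatching whole groups.
inline int excess_threads(const int total_threads, const int group_size)
{
  return threadgroups_needed(total_threads, group_size) * group_size - total_threads;
}

// The guard a kernel entry point needs when the dispatch is padded.
inline bool thread_in_bounds(const int thread_index, const int total_threads)
{
  return thread_index < total_threads;
}
```

For 1000 work items and a group size of 256, four groups are launched and 24 threads fall outside the valid range; those are the threads the guard used to filter out.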
Brecht Van Lommel
9eee008691 Fix Cycles oneAPI build error due to conflicting CONSTANT define
2023-03-06 00:13:21 +01:00
Brecht Van Lommel
773a36d2f8 Fix Cycles OneAPI build error after recent changes
2023-02-06 15:36:49 +01:00
Campbell Barton
27b4916b1a Cleanup: spelling in comments
Also minor changes in comments:
- Reference BLENDER_HISTORY_FILE instead of the literal file-name
  (simplifies looking up usage).
- Use usernames in tags, as noted in code-style.
2023-01-31 14:22:23 +11:00
Xavier Hallade
1c90f8209d Cycles: fix rendering with Nishita Sky Texture on Intel Arc GPUs
Speckles and missing lights were experienced in scenes with Nishita Sky
Texture and a Sun Size smaller than 1.5°, such as in Lone Monk and Attic
scenes.
Increasing the precision of cosf fixes it.
2023-01-24 09:58:22 +01:00
Nikita Sirgienko
858fffc2df Cycles: oneAPI: add support for SYCL host task
This functionality is related only to debugging of SYCL implementation
via single-threaded CPU execution and is disabled by default.
Host device has been deprecated in SYCL 2020 spec and we removed it
in 305b92e05f.
Since this is still very useful for debugging, we're restoring a
similar functionality here through SYCL 2020 Host Task.
2023-01-03 20:47:24 +01:00