test2

Author	SHA1	Message	Date
Stefan Werner	083aad8a45	Cycles: Specialization constants for Embree/SYCL Making heavier use of specialization constants in SYCL for Embree. This reduces the code size of the intersection kernels and brings a performance improvement of up to 9% in some scenes on Intel GPUs. Co-authored-by: Stefan Werner <stefan.werner@intel.com> Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com> Pull Request: https://projects.blender.org/blender/blender/pulls/141559	2025-10-02 16:44:24 +02:00
Weizhen Huang	2b0a1cae06	Cycles: Add an option to use ray marching for volume rendering Null Scattering currently has performance and noise issues, and it will take time to address them. For now add the previous Ray Marching back as an option. Co-authored-by: Brecht Van Lommel <brecht@blender.org> Pull Request: https://projects.blender.org/blender/blender/pulls/146317	2025-09-26 12:14:45 +02:00
Patrick Mours	b4bb075285	Cycles: Flip image vertically before passing to OptiX denoiser to improve result quality Experiments have shown that the OptiX denoiser performs best when operating on images that have their origin at the top-left corner, while Blender renders with the origin at the bottom-left corner. Simply flipping the image vertically before and after denoising is a relatively trivial operation, so this patch introduces this as an additional preprocessing and postprocessing step for denoising when the OptiX denoiser is used. Additionally, this patch also removes an unused helper function, now that OptiX 8.0 is the minimum. Pull Request: https://projects.blender.org/blender/blender/pulls/145358	2025-09-04 16:04:23 +02:00
Nikita Sirgienko	a984114d5e	Cleanup: oneAPI: Fix warnings about unused variables No performance or functional changes are expected	2025-09-03 11:01:20 +02:00
Weizhen Huang	a4f8e0bfa2	Cycles: Use RGBE for denoised guiding buffers to reduce memory usage Co-authored-by: Brecht Van Lommel <brecht@blender.org>	2025-08-13 10:28:50 +02:00
Weizhen Huang	5cb6014efd	Cycles: Volume Scattering Probability Guiding Guide the probability to scatter in or transmit through the volume. Only applied for primary rays. Co-authored-by: Brecht Van Lommel <brecht@blender.org>	2025-08-13 10:28:50 +02:00
Weizhen Huang	8c36f9ce49	Cycles: Compute volume transmittance using telescoping	2025-08-13 10:28:50 +02:00
Weizhen Huang	b2b2d9a4f3	Cycles: Render volume by ray marching through octrees One octree per volume per shader based on the density. In preparation for the null scattering	2025-08-13 10:28:50 +02:00
Patrick Mours	6487395fa5	Cycles: Add linear curve shape Add new "Linear 3D Curves" option in the Curves panel in the render properties. This renders curves as linear segments rather than smooth curves, for faster render time at the cost of accuracy. On NVIDIA Blackwell GPUs, this can give a 6x speedup compared to smooth curves, due to hardware acceleration. On NVIDIA Ada there is still a 3x speedup, and CPU and other GPU backends will also render this faster. A difference with smooth curves is that these have end caps, as this was simpler to implement and they are usually helpful anyway. In the future this functionality will also be used to properly support the CURVE_TYPE_POLY on the new curves object. Pull Request: https://projects.blender.org/blender/blender/pulls/139735	2025-07-29 17:05:01 +02:00
Nikita Sirgienko	9875836519	Cycles: oneAPI: Compile only needed device binaries in multi-GPU case Before this modification, the "oneapi_load_kernels" function loaded kernels and compiled them, if needed, for all devices in the associated GPU context. This makes sense when rendering on a single GPU or on multiple identical GPUs, but when Blender users render with several different GPUs, the previous implementation would compile all kernels for all devices once per device, unnecessarily doing the same work multiple times. Because of this, I am changing the implementation so that, for each used device, compilation now happens only for that device, ensuring that no unnecessary work is done. No render performance changes are expected.	2025-07-19 14:15:36 +02:00
Nikita Sirgienko	69091c5028	Cycles: Show device optimizations status in preferences for oneAPI With these changes, we can now mark devices which are expected to work as performant as possible, and devices which were not optimized for some reason. For example, because the device was released after the Blender release, making it impossible for developers to optimize for devices in already released unchangeable code. This is primarily relevant for the LTS versions, which are supported for two years and require proper communication about optimization status for the new devices released during this time. This is implemented for oneAPI devices. Other device types currently are marked as optimized for compatibility with old behavior, but may implement the same in the future. Pull Request: https://projects.blender.org/blender/blender/pulls/139751	2025-06-03 20:07:52 +02:00
Xavier Hallade	90a10dcd50	Cycles: Adjust inlining attributes for oneAPI device Now ccl_device sets inlining and ccl_device_inline forces inlining. This matches more closely with what is currently done for cuda and metal backends. I've measured from 1% to 6% overall performance improvement in rendering benchmark scenes on Arc B580, as well as a small decrease in compile time.	2025-03-03 18:20:02 +01:00
Lukas Stockner	8cb5e05c48	Cleanup: Cycles: Deduplicate kernel attribute code using templating The attribute handling code in the kernel is currently highly duplicated since it needs to handle five different data types and we couldn't use templates back then. We can now, so might as well make use of it and get rid of ~1000 lines. There are also some small fixes for the GPU OSL code: - Wrong derivative for .w component when converting float2/float3->float4 - Different conversion for float2->float (CPU averages, GPU used to take .x) - Removed useless code for converting to float2, not used by OSL Pull Request: https://projects.blender.org/blender/blender/pulls/134694	2025-02-20 19:28:45 +01:00
Nikita Sirgienko	2bab4ae370	Cycles: oneAPI: Optimize texture access by using GPU HW sampler The current usage of software-based texture operations in the oneAPI implementation puts additional register pressure on the GPU compiler during register allocation. And it also creates code that requires maintenance. This commit is intended to address this situation by utilizing a recently productized SYCL bindless texture API to enable HW-based texture operations using Intel GPUs' hardware sampler. This currently translates to 1-11% rendering speedups (scene-specific) on my Arc A770 and Arc B580. At the moment, there are small performance regressions with NanoVDB texture operations on Arc B580 and small performance regressions in shade surface MNEE and Raytrace kernels on Arc A770, but they look recoverable and will be handled in the future. Pull Request: https://projects.blender.org/blender/blender/pulls/133457	2025-02-12 21:47:34 +01:00
Nikita Sirgienko	a0b7ad436b	Cleanup: Cycles: oneAPI: Switch to non-experimental work item API There is now a non-experimental API for this_work_item functionality, so let's use it for better code quality and also to avoid the deprecation warning during compilation. No functional or performance changes are expected. Pull Request: https://projects.blender.org/blender/blender/pulls/133472	2025-02-12 21:46:22 +01:00
Brecht Van Lommel	9971648783	Refactor: Cycles: Replace new/delete by unique_ptr, in simple cases Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:30 +01:00
Brecht Van Lommel	57ff24cb99	Refactor: Cycles: Add const keyword to more function parameters Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:24 +01:00
Brecht Van Lommel	dd51c8660b	Refactor: Cycles: Add const keyword where possible, using clang-tidy Check was misc-const-correctness, combined with readability-isolate-declaration as suggested by the docs. Temporarily clang-format "QualifierAlignment: Left" was used to get consistency with the prevailing order of keywords. Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:20 +01:00
Brecht Van Lommel	3a57b97eba	Cleanup: Cycles: Remove unneeded oneAPI double emulation for NanoVDB Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:59 +01:00
Brecht Van Lommel	d0c2e68e5f	Refactor: Cycles: Automated clang-tidy fixups in Cycles * Use .empty() and .data() * Use nullptr instead of 0 * No else after return * Simple class member initialization * Add override for virtual methods * Include C++ instead of C headers * Remove some unused includes * Use default constructors * Always use braces * Consistent names in definition and declaration * Change typedef to using Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:55 +01:00
Brecht Van Lommel	5c46063607	Refactor: Cycles: Make kernel headers work by themselves Shuffle around some code and add more includes so that individual header files compile without errors. Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:50 +01:00
Weizhen Huang	e2d7681fe6	Cleanup: Cycles: remove unused `ccl_loop_no_unroll` Was added in `6121c28501` to ensure compiling on OpenCL, now the definition is empty on all platforms Pull Request: https://projects.blender.org/blender/blender/pulls/131100	2024-11-28 16:37:01 +01:00
Nikita Sirgienko	2aa9203f2f	Cycles: Reintroduce noinline keyword for oneAPI device In `891d71a4d4` this keyword was dropped due to performance regression after `fdc2962beb`, but currently code does not experience this performance degradation, and in fact there is minor performance improvement on Lunar Lake GPUs, along with an expected improvement in compile time. However, this change brings a minor performance regression to shade_surface kernel on Intel Arc and Meteor Lake GPUs, which will be solved later by disabling this keyword for these platforms only. Pull Request: https://projects.blender.org/blender/blender/pulls/130299	2024-11-15 12:09:37 +01:00
Xavier Hallade	b614953971	Cycles: oneAPI: fix Linux compilation with fno-honor-nans Previously, when compiling on Rocky Linux 8 with fno-honor-nans, compile time was more than 5x longer than expected, and there was an unresolved symbol to __sqrtf_finite in GPU binaries. Once sqrtf is defined in compat.h, both issues are effectively gone; this was certainly due to problematic interactions with the build system's math library headers. So we can remove the current workaround of defining fhonor-nans, and now have the same set of flags on both Windows and Linux.	2024-10-04 17:50:24 +02:00
Nikita Sirgienko	94c9898f41	Fix #124811 : Cycles: oneAPI: no hair strands in viewport with Embree The oneAPI kernel preloading logic allowed unneeded kernels to be compiled without features, which were then missing when these kernels were needed later. Pull Request: https://projects.blender.org/blender/blender/pulls/127114	2024-09-04 11:08:00 +02:00
Xavier Hallade	1a0dbbd242	Fix: Cannot render Victor and Spring with embree disabled on Intel GPUs The kernel zeroing memory since we've added host memory fallback didn't expect large inputs, so with these scenes, it was running into "Provided range is out of integer limits. Pass `-fno-sycl-id-queries-fit-in-int' to disable range check" error. This kernel was used instead of memset to avoid some issues with the free_memory queries not always being updated. As we can't reproduce these with recent drivers, we now use memset, which fixes rendering with BVH2.	2024-09-02 18:35:51 +02:00
Nikita Sirgienko	759bb6c768	Cycles: oneAPI: Enable host memory migration This enables scenes with all textures not fitting in GPU memory to finally render. For scenes that are fitting, no functional change or performance change is expected. Pull Request: https://projects.blender.org/blender/blender/pulls/122385	2024-05-28 19:04:19 +02:00
Xavier Hallade	891d71a4d4	Cycles: Drop noinline keyword for oneAPI device `fdc2962beb` indirectly introduced a change in inlining (light_tree_pdf started getting inlined) that led to a 5-10% drop in performance for most scenes. Dropping the noinline keyword for oneAPI device recovers it. It however brings another performance regression to MNEE and Raytrace kernels, that we'll look into separately.	2024-04-02 18:29:35 +02:00
Brecht Van Lommel	d377ef2543	Clang Format: bump to version 17 Along with the 4.1 libraries upgrade, we are bumping the clang-format version from 8-12 to 17. This affects quite a few files. If not already the case, you may consider pointing your IDE to the clang-format binary bundled with the Blender precompiled libraries.	2024-01-03 13:38:14 +01:00
Brecht Van Lommel	6cdb43195e	Refactor: replace NanoVDB kernel side implementation by own code The NanoVDB headers are not compatible with Metal due to missing address space qualifiers. We currently have a big patch for NanoVDB header files, which is difficult to update for OpenVDB 11. Instead extract a few hundred lines of code from NanoVDB to do just what we need. Pull Request: https://projects.blender.org/blender/blender/pulls/115992	2023-12-10 19:37:36 +01:00
Brecht Van Lommel	8ba474dc4f	Refactor: replace NanoVDB SampleFromVoxels by own code This makes the GPU tricubic implementation more efficient. The dense grid code implemented this in terms of trilinear lookups that are hardware accelerated, but for NanoVDB this just causes unnecessary voxel reads. Instead match the CPU code. Pull Request: https://projects.blender.org/blender/blender/pulls/115992	2023-12-10 19:37:36 +01:00
Stefan Werner	02b5e27f89	Cycles: Add Intel GPU support for OpenImageDenoise OpenImageDenoise V2 comes with GPU support for various backends. This adds a new class, OIDNDenoiserGPU, in order to add this functionality into the existing Cycles post processing pipeline without having to change it much. OptiX and OIDN CPU denoising remain as they are. Rendering on a supported Intel GPU will automatically select the GPU denoiser. Device support is initially limited to the oneAPI devices that are supported by Cycles, but can be extended. Ref #115045 Co-authored-by: Stefan Werner <stefan.werner@intel.com> Co-authored-by: Ray Molenkamp <github@lazydodo.com> Pull Request: https://projects.blender.org/blender/blender/pulls/108314	2023-11-20 11:12:41 +01:00
Xavier Hallade	d26a2b09bc	Cycles: oneAPI: use hardware cos Speckles and missing lights were experienced in scenes with Nishita Sky Texture and a Sun Size smaller than 1.5°, such as in Lone Monk and Attic scenes. We previously worked around these by using a more precise software implementation of cosine. After recent changes in Cycles, it turns out this workaround isn't currently needed.	2023-10-06 13:10:27 +02:00
Campbell Barton	2721b937fb	Cleanup: use braces in headers	2023-09-24 14:52:38 +10:00
Xavier Hallade	01931e213f	Cycles: oneAPI: only export necessary symbols The API for the kernels library is defined, there is no need to export more than that. This change only affects Linux since hidden visibility is the default on Windows.	2023-09-08 15:44:39 +02:00
Sergey Sharybin	71b4a97cbc	Refactor: De-duplicate Metal RT self intersection checks Use the common BVH utilities header for this. Added a special type qualifier ccl_ray_data which is defined to ccl_private for all platforms but Metal. On Metal it is defined to ray_data. The tricky part is that the BVH utilities are wrapped into the Metal context class. In some of the BVH functions the context has been already constructed, but it wasn't done in all the callbacks. From a quick render test of the Junkshop benchmark scene, there is no render time difference. No functional changes are expected. Pull Request: https://projects.blender.org/blender/blender/pulls/111967	2023-09-05 17:21:49 +02:00
Xavier Hallade	40a39c2976	Cycles: oneAPI: cleanup: drop __spirv_ocl_cos workaround As __FAST_MATH__ isn't defined anymore since `09df1f4caf`, sycl::cos uses the precise implementation, no need to call __spirv_ocl_cos anymore.	2023-08-31 13:10:29 +02:00
Nikita Sirgienko	abab47a805	Cycles: oneAPI: Refactoring of local size choice logic	2023-08-22 19:04:16 +02:00
Xavier Hallade	aefc9835f8	Cycles: oneAPI: fix kernel host-side compilation with MSVC 17.7 <algorithm> header include is missing from some sycl headers, this will be fixed upstream with https://github.com/intel/llvm/pull/10424, meanwhile, we work around it by including it directly.	2023-07-25 12:01:09 +02:00
Ray Molenkamp	235c564aa0	Cycles: re-Fixed oneAPI build on Windows fixes one uint missed in `a0846a60c9`	2023-07-06 14:47:35 -06:00
Stefan Werner	a0846a60c9	Cycles: Fixed oneAPI build on Windows Turns out uint wasn't defined this early in our kernels on Windows. Using unsigned int instead should fix this.	2023-07-06 21:50:03 +02:00
Werner, Stefan	7befc40386	Cycles: Use sycl::bitcast in oneAPI backend Using sycl::bitcast instead of union hack	2023-07-06 15:06:33 +02:00
Nikita Sirgienko	d801ffddff	Cycles: oneAPI: Fix execution error with cryptomatte kernel	2023-06-29 14:51:49 +02:00
Campbell Barton	c12994612b	License headers: use SPDX-FileCopyrightText in intern/cycles	2023-06-14 16:53:23 +10:00
Sergey Sharybin	ba3f26fac5	Cycles: light and shadow linking With light linking, lights can be set to affect only specific objects in the scene. Shadow linking additionally gives control over which objects act as shadow blockers for a light. Usage: https://wiki.blender.org/wiki/Reference/Release_Notes/4.0/Cycles Implementation: https://wiki.blender.org/wiki/Source/Render/Cycles/LightLinking Ref #104972 Co-authored-by: Brecht Van Lommel <brecht@blender.org>	2023-05-24 14:11:47 +02:00
Campbell Barton	bf36a61e62	Cleanup: spelling in comments & some corrections	2023-05-20 21:17:09 +10:00
Nikita Sirgienko	bafd82c9c1	Cycles: oneAPI: use local memory for faster shader sorting Co-authored-by: Stefan Werner <stefan.werner@intel.com> Pull Request: https://projects.blender.org/blender/blender/pulls/107994	2023-05-17 11:07:57 +02:00
Nikita Sirgienko	b8173278b0	Cycles: oneAPI: set correct work group sizes for kernels that have a predefined one	2023-05-17 00:02:12 +02:00
Nikita Sirgienko	a17d07ee87	Cycles: oneAPI: Fix prevented execution with sycl runtime > 20230323 NanoVDB headers have unused code using the "double" type, which is not supported on Arc GPUs. Recent DPC++ changes enforced runtime verifications: `7663dc201d` which prevents execution when such a type is present, even if unused. This is a solution to avoid compiling double at all, similar to how it is done for Metal.	2023-05-17 00:00:52 +02:00
Xavier Hallade	5ec2495550	Cycles: oneAPI: enable Hardware Raytracing for Raytrace/MNEE kernels We do so if Embree 4.1+ is present.	2023-05-12 14:17:50 +02:00

95 Commits