test2

Author	SHA1	Message	Date
Xavier Hallade	90a10dcd50	Cycles: Adjust inlining attributes for oneAPI device Now ccl_device sets inlining and ccl_device_inline forces inlining. This matches more closely with what is currently done for cuda and metal backends. I've measured from 1% to 6% overall performance improvement in rendering benchmark scenes on Arc B580, as well as a small decrease in compile time.	2025-03-03 18:20:02 +01:00
Lukas Stockner	8cb5e05c48	Cleanup: Cycles: Deduplicate kernel attribute code using templating The attribute handling code in the kernel is currently highly duplicated since it needs to handle five different data types and we couldn't use templates back then. We can now, so might as well make use of it and get rid of ~1000 lines. There are also some small fixes for the GPU OSL code: - Wrong derivative for .w component when converting float2/float3->float4 - Different conversion for float2->float (CPU averages, GPU used to take .x) - Removed useless code for converting to float2, not used by OSL Pull Request: https://projects.blender.org/blender/blender/pulls/134694	2025-02-20 19:28:45 +01:00
Nikita Sirgienko	2bab4ae370	Cycles: oneAPI: Optimize texture access by using GPU HW sampler The current usage of software-based texture operations in the oneAPI implementation puts additional register pressure on the GPU compiler during register allocation. And it also creates code that requires maintenance. This commit is intended to address this situation by utilizing a recently productized SYCL bindless texture API to enable HW-based texture operations using Intel GPUs' hardware sampler. This currently translates to 1-11% rendering speedups (scene-specific) on my Arc A770 and Arc B580. At the moment, there are small performance regressions with NanoVDB texture operations on Arc B580 and small performance regressions in shade surface MNEE and Raytrace kernels on Arc A770, but they look recoverable and will be handled in the future. Pull Request: https://projects.blender.org/blender/blender/pulls/133457	2025-02-12 21:47:34 +01:00
Nikita Sirgienko	a0b7ad436b	Cleanup: Cycles: oneAPI: Switch to non-experimental work item API There is now a non-experimental API for this_work_item functionality, so let's use it for better code quality and also to avoid the deprecation warning during compilation. No functional or performance changes are expected. Pull Request: https://projects.blender.org/blender/blender/pulls/133472	2025-02-12 21:46:22 +01:00
Brecht Van Lommel	57ff24cb99	Refactor: Cycles: Add const keyword to more function parameters Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:24 +01:00
Weizhen Huang	e2d7681fe6	Cleanup: Cycles: remove unused `ccl_loop_no_unroll` Was added in `6121c28501` to ensure compiling on OpenCL, now the definition is empty on all platforms Pull Request: https://projects.blender.org/blender/blender/pulls/131100	2024-11-28 16:37:01 +01:00
Nikita Sirgienko	2aa9203f2f	Cycles: Reintroduce noinline keyword for oneAPI device In `891d71a4d4` this keyword was dropped due to performance regression after `fdc2962beb`, but currently code does not experience this performance degradation, and in fact there is minor performance improvement on Lunar Lake GPUs, along with an expected improvement in compile time. However, this change brings a minor performance regression to shade_surface kernel on Intel Arc and Meteor Lake GPUs, which will be solved later by disabling this keyword for these platforms only. Pull Request: https://projects.blender.org/blender/blender/pulls/130299	2024-11-15 12:09:37 +01:00
Xavier Hallade	b614953971	Cycles: oneAPI: fix Linux compilation with fno-honor-nans Previously, when compiling on Rocky Linux 8 with fno-honor-nans, compile time was more than 5x longer than expected, and there was an unresolved symbol to __sqrtf_finite in GPU binaries. After defining sqrtf in compat.h, both issues are effectively gone; this was certainly due to problematic interactions with the build system's math library headers. So we can remove the current workaround of defining fhonor-nans, and now have the same set of flags on both Windows and Linux.	2024-10-04 17:50:24 +02:00
Xavier Hallade	891d71a4d4	Cycles: Drop noinline keyword for oneAPI device `fdc2962beb` indirectly introduced a change in inlining (light_tree_pdf started getting inlined) that led to a 5-10% drop in performance for most scenes. Dropping the noinline keyword for oneAPI device recovers it. It however brings another performance regression to MNEE and Raytrace kernels, that we'll look into separately.	2024-04-02 18:29:35 +02:00
Brecht Van Lommel	6cdb43195e	Refactor: replace NanoVDB kernel side implementation by own code The NanoVDB headers are not compatible with Metal due to missing address space qualifiers. We currently have a big patch for NanoVDB header files, which is difficult to update for OpenVDB 11. Instead extract a few hundred lines of code from NanoVDB to do just what we need. Pull Request: https://projects.blender.org/blender/blender/pulls/115992	2023-12-10 19:37:36 +01:00
Xavier Hallade	d26a2b09bc	Cycles: oneAPI: use hardware cos Speckles and missing lights were experienced in scenes with Nishita Sky Texture and a Sun Size smaller than 1.5°, such as in Lone Monk and Attic scenes. We previously worked around these by using a more precise software implementation of cosine. After recent changes in Cycles, it turns out this workaround isn't currently needed.	2023-10-06 13:10:27 +02:00
Sergey Sharybin	71b4a97cbc	Refactor: De-duplicate Metal RT self intersection checks Use the common BVH utilities header for this. Added a special type qualifier ccl_ray_data which is defined to ccl_private for all platforms but Metal. On Metal it is defined to ray_data. The tricky part is that the BVH utilities are wrapped into the Metal context class. In some of the BVH functions the context has been already constructed, but it wasn't done in all the callbacks. From quick render tests of the Junkshop benchmark scene there is no render time difference. No functional changes are expected. Pull Request: https://projects.blender.org/blender/blender/pulls/111967	2023-09-05 17:21:49 +02:00
Xavier Hallade	40a39c2976	Cycles: oneAPI: cleanup: drop __spirv_ocl_cos workaround As __FAST_MATH__ isn't defined anymore since `09df1f4caf`, sycl::cos uses the precise implementation, no need to call __spirv_ocl_cos anymore.	2023-08-31 13:10:29 +02:00
Ray Molenkamp	235c564aa0	Cycles: re-Fixed oneAPI build on Windows fixes one uint missed in `a0846a60c9`	2023-07-06 14:47:35 -06:00
Stefan Werner	a0846a60c9	Cycles: Fixed oneAPI build on Windows Turns out uint wasn't defined this early in our kernels on Windows. Using unsigned int instead should fix this.	2023-07-06 21:50:03 +02:00
Werner, Stefan	7befc40386	Cycles: Use sycl::bitcast in oneAPI backend Using sycl::bitcast instead of union hack	2023-07-06 15:06:33 +02:00
Campbell Barton	c12994612b	License headers: use SPDX-FileCopyrightText in intern/cycles	2023-06-14 16:53:23 +10:00
Nikita Sirgienko	bafd82c9c1	Cycles: oneAPI: use local memory for faster shader sorting Co-authored-by: Stefan Werner <stefan.werner@intel.com> Pull Request: https://projects.blender.org/blender/blender/pulls/107994	2023-05-17 11:07:57 +02:00
Nikita Sirgienko	3f8c995109	Cycles: add hardware raytracing support to oneAPI device Updated Embree 4 library with GPU support is required for it to be compiled - compatibility with Embree 3 and Embree 4 without GPU support is maintained. Enabling hardware raytracing is an opt-in user setting for now. Pull Request: https://projects.blender.org/blender/blender/pulls/106266	2023-04-18 22:09:42 +02:00
Michael Jones	5f61eca7af	Cycles: Exploit non-uniform threadgroup sizes on Metal This patch replaces `dispatchThreadgroups` with `dispatchThreads` which takes care of non-uniform threadgroup bounds. This allows us to remove the bounds guards in the integrator kernel entry points. Pull Request: https://projects.blender.org/blender/blender/pulls/106217	2023-03-29 21:46:11 +02:00
Brecht Van Lommel	9eee008691	Fix Cycles oneAPI build error due to conflicting CONSTANT define	2023-03-06 00:13:21 +01:00
Campbell Barton	27b4916b1a	Cleanup: spelling in comments Also minor changes in comments: - Reference BLENDER_HISTORY_FILE instead of the literal file-name (simplifies looking up usage). - Use usernames in tags, as noted in code-style.	2023-01-31 14:22:23 +11:00
Xavier Hallade	1c90f8209d	Cycles: fix rendering with Nishita Sky Texture on Intel Arc GPUs Speckles and missing lights were experienced in scenes with Nishita Sky Texture and a Sun Size smaller than 1.5°, such as in Lone Monk and Attic scenes. Increasing the precision of cosf fixes it.	2023-01-24 09:58:22 +01:00
Nikita Sirgienko	858fffc2df	Cycles: oneAPI: add support for SYCL host task This functionality is related only to debugging of SYCL implementation via single-threaded CPU execution and is disabled by default. Host device has been deprecated in SYCL 2020 spec and we removed it in `305b92e05f`. Since this is still very useful for debugging, we're restoring a similar functionality here through SYCL 2020 Host Task.	2023-01-03 20:47:24 +01:00
Patrick Mours	e6b38deb9d	Cycles: Add basic support for using OSL with OptiX This patch generalizes the OSL support in Cycles to include GPU device types and adds an implementation for that in the OptiX device. There are some caveats still, including simplified texturing due to lack of OIIO on the GPU and a few missing OSL intrinsics. Note that this is incomplete and missing an update to the OSL library before being enabled! The implementation is already committed now to simplify further development. Maniphest Tasks: T101222 Differential Revision: https://developer.blender.org/D15902	2022-11-09 15:30:21 +01:00
Xavier Hallade	305b92e05f	Cycles: oneAPI: remove use of SYCL host device Host device is deprecated in SYCL 2020 spec, cpu device or standard C++ should be used instead.	2022-10-21 15:36:48 +02:00
Werner, Stefan	0c824837ab	Cycles: Cleanup in oneAPI math includes and definitions Now explicitly including math.h first before #defining functions. This avoids undefined behavior and improves compatibility with different SYCL compilers and backends.	2022-09-22 11:33:57 +02:00
Bastien Montagne	19a7a013ce	Merge branch 'blender-v3.3-release'	2022-08-01 14:37:16 +02:00
Nikita Sirgienko	76169472d3	Cycles: Resolve recent performance regression in oneAPI implementation for Intel® Arc™ GPUs Recently, performance with oneAPI has regressed due to some recent changes in Blender itself. This commit resolves this and also improves compilation time for the oneAPI backend's first execution (or Blender compilation time in case of AoT). The regression appeared after `5152c7c152` and is not related to the changes themselves, but to the increase in kernel complexity introduced with them. The changes in this commit mark some Blender functions as noinline for the oneAPI backend, which helps the GPU compiler deal with this complexity without any negative side effects on performance.	2022-08-01 12:45:34 +02:00
Brecht Van Lommel	79ab76e156	Cleanup: simplifications and consistency for vector types * OneAPI: remove separate float3 definition * OneAPI: disable operator[] to match other GPUs * OneAPI: make int3 compact to match other GPUs * Use #pragma once * Add __KERNEL_NATIVE_VECTOR_TYPES__ to simplify checks * Remove unused vector3	2022-07-28 21:27:13 +02:00
Campbell Barton	b6c28002ac	Cleanup: spelling in comments	2022-06-30 12:14:22 +10:00
Xavier Hallade	a02992f131	Cycles: Add support for rendering on Intel GPUs using oneAPI This patch adds a new Cycles device with similar functionality to the existing GPU devices. Kernel compilation and runtime interaction happen via oneAPI DPC++ compiler and SYCL API. This implementation is primarily focusing on Intel® Arc™ GPUs and other future Intel GPUs. The first supported drivers are 101.1660 on Windows and 22.10.22597 on Linux. The necessary tools for compilation are: - A SYCL compiler such as oneAPI DPC++ compiler or https://github.com/intel/llvm - Intel® oneAPI Level Zero which is used for low level device queries: https://github.com/oneapi-src/level-zero - To optionally generate prebuilt graphics binaries: Intel® Graphics Compiler All are included in Linux precompiled libraries on svn: https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for Windows precompiled binaries but for the graphics compiler, available as "Intel® Graphics Offline Compiler for OpenCL™ Code" from https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html, for which path can be set as OCLOC_INSTALL_DIR. Being based on the open SYCL standard, this implementation could also be extended to run on other compatible non-Intel hardware in the future. Reviewed By: sergey, brecht Differential Revision: https://developer.blender.org/D15254 Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com> Co-authored-by: Stefan Werner <stefan.werner@intel.com>	2022-06-29 12:58:04 +02:00

32 Commits