griefith/test

Author	SHA1	Message	Date
Brecht Van Lommel	57ff24cb99	Refactor: Cycles: Add const keyword to more function parameters Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:24 +01:00
Brecht Van Lommel	dd51c8660b	Refactor: Cycles: Add const keyword where possible, using clang-tidy Check was misc-const-correctness, combined with readability-isolate-declaration as suggested by the docs. Temporarily clang-format "QualifierAlignment: Left" was used to get consistency with the prevailing order of keywords. Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:20 +01:00
Brecht Van Lommel	3c2a6fbb9c	Refactor: Cycles: Use nullptr instead of NULL Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:43 +01:00
Nikita Sirgienko	fb21f3fb56	Cleanup: Cycles: oneAPI: Fix deprecation warnings about get_pointer()	2024-10-01 22:26:15 +02:00
Michael Jones	99f5433445	Cycles: Dormant fixes for adaptive feature compilation This PR fixes the (currently unused) scene-based selective feature compilation macros. These feature based macros haven't been used for a few years, and enabling them currently results in compilation errors. The only functional change in this PR is in geom/primitive.h where undef-ing `__HAIR__` had exposed an inconsistency in how pointcloud attributes were being fetched. Using the more general `primitive_surface_attribute_float4` (instead of `curve_attribute_float4`) fixed a compilation error that occurred when rendering pointcloud unit test scenes with adaptive compilation enabled. Pull Request: https://projects.blender.org/blender/blender/pulls/121216	2024-04-30 12:56:22 +02:00
Stefan Werner	31d55e87f9	Cycles: Metal support for OpenImageDenoise This is supported on Apple Silicon GPUs and macOS 13.0+. Co-authored-by: Stefan Werner <stefan.werner@intel.com> Co-authored-by: Attila Afra <attila.t.afra@intel.com> Pull Request: https://projects.blender.org/blender/blender/pulls/116124	2024-02-06 21:13:23 +01:00
Brecht Van Lommel	d015e98ee6	Fix Cycles ASAN error with boolean kernel arguments	2023-12-12 13:27:36 +01:00
Campbell Barton	c12994612b	License headers: use SPDX-FileCopyrightText in intern/cycles	2023-06-14 16:53:23 +10:00
Sergey Sharybin	ba3f26fac5	Cycles: light and shadow linking With light linking, lights can be set to affect only specific objects in the scene. Shadow linking additionally gives control over which objects act as shadow blockers for a light. Usage: https://wiki.blender.org/wiki/Reference/Release_Notes/4.0/Cycles Implementation: https://wiki.blender.org/wiki/Source/Render/Cycles/LightLinking Ref #104972 Co-authored-by: Brecht Van Lommel <brecht@blender.org>	2023-05-24 14:11:47 +02:00
Campbell Barton	bf36a61e62	Cleanup: spelling in comments & some corrections	2023-05-20 21:17:09 +10:00
Nikita Sirgienko	bafd82c9c1	Cycles: oneAPI: use local memory for faster shader sorting Co-authored-by: Stefan Werner <stefan.werner@intel.com> Pull Request: https://projects.blender.org/blender/blender/pulls/107994	2023-05-17 11:07:57 +02:00
Sahar A. Kashi	557a245dd5	Cycles: add HIP RT device, for AMD hardware ray tracing on Windows HIP RT enables AMD hardware ray tracing on RDNA2 and above, and falls back to a shader implementation for older graphics cards. It offers an average 25% sample rendering rate improvement in Cycles benchmarks, on a W6800 card. The ray tracing feature functions are accessed through HIP RT SDK, available on GPUOpen. HIP RT traversal functionality is pre-compiled in bitcode format and shipped with the SDK. This is not yet enabled as there are issues to be resolved, but landing the code now makes testing and further changes easier. Known limitations: * Not working yet with current public AMD drivers. * Visual artifact in motion blur. * One of the buffers allocated for traversal has a static size. Allocating it dynamically would reduce memory usage. * This is for Windows only currently, no Linux support. Co-authored-by: Brecht Van Lommel <brecht@blender.org> Ref #105538	2023-04-25 20:19:43 +02:00
Xavier Hallade	70892e82ac	Cycles: oneAPI: use specialization constant to compile with/without Embree on GPU	2023-04-18 22:09:42 +02:00
Nikita Sirgienko	3f8c995109	Cycles: add hardware raytracing support to oneAPI device Updated Embree 4 library with GPU support is required for it to be compiled - compatibility with Embree 3 and Embree 4 without GPU support is maintained. Enabling hardware raytracing is an opt-in user setting for now. Pull Request: https://projects.blender.org/blender/blender/pulls/106266	2023-04-18 22:09:42 +02:00
Michael Jones	5f61eca7af	Cycles: Exploit non-uniform threadgroup sizes on Metal This patch replaces `dispatchThreadgroups` with `dispatchThreads` which takes care of non-uniform threadgroup bounds. This allows us to remove the bounds guards in the integrator kernel entry points. Pull Request: https://projects.blender.org/blender/blender/pulls/106217	2023-03-29 21:46:11 +02:00
Campbell Barton	91346755ce	Cleanup: use '#' prefix for issues instead of 'T' Match the convention from Gitea instead of Phabricator's T for tasks.	2023-02-12 14:56:05 +11:00
Brecht Van Lommel	773a36d2f8	Fix Cycles OneAPI build error after recent changes	2023-02-06 15:36:49 +01:00
Michael Jones	654e1e901b	Cycles: Use local atomics for faster shader sorting (enabled on Metal) This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high". Reviewed By: brecht Differential Revision: https://developer.blender.org/D16909	2023-02-06 11:18:26 +00:00
Brecht Van Lommel	222b64fcdc	Fix Cycles CUDA crash when building kernels without optimizations (for debug) In this case the blocksize may not be the one we requested, which was assumed to be the case. Instead get the effective block size from the compiler as was already done for Metal and OneAPI.	2022-11-30 21:46:17 +01:00
Brecht Van Lommel	cf57624764	Cleanup: refactoring of kernel film function names and organization	2022-09-02 17:13:28 +02:00
Brecht Van Lommel	bb376da6df	Fix Cycles MetalRT error after recent specialization changes	2022-07-15 18:28:13 +02:00
Xavier Hallade	a02992f131	Cycles: Add support for rendering on Intel GPUs using oneAPI This patch adds a new Cycles device with similar functionality to the existing GPU devices. Kernel compilation and runtime interaction happen via oneAPI DPC++ compiler and SYCL API. This implementation is primarily focusing on Intel® Arc™ GPUs and other future Intel GPUs. The first supported drivers are 101.1660 on Windows and 22.10.22597 on Linux. The necessary tools for compilation are: - A SYCL compiler such as oneAPI DPC++ compiler or https://github.com/intel/llvm - Intel® oneAPI Level Zero which is used for low level device queries: https://github.com/oneapi-src/level-zero - To optionally generate prebuilt graphics binaries: Intel® Graphics Compiler All are included in Linux precompiled libraries on svn: https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for Windows precompiled binaries but for the graphics compiler, available as "Intel® Graphics Offline Compiler for OpenCL™ Code" from https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html, for which path can be set as OCLOC_INSTALL_DIR. Being based on the open SYCL standard, this implementation could also be extended to run on other compatible non-Intel hardware in the future. Reviewed By: sergey, brecht Differential Revision: https://developer.blender.org/D15254 Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com> Co-authored-by: Stefan Werner <stefan.werner@intel.com>	2022-06-29 12:58:04 +02:00
Michael Jones	4412e14708	Cycles: Useful Metal backend debug & profiling functionality This patch adds some useful debugging & profiling env vars to the Metal backend: - `CYCLES_METAL_PROFILING`: output a per-kernel timing report at the end of the render - `CYCLES_METAL_DEBUG`: enable per-dispatch tracing (very verbose) - `CYCLES_DEBUG_METAL_CAPTURE_KERNEL`: enable programmatic .gputrace capture for a specified kernel index Here's an example of the timing report with `CYCLES_METAL_PROFILING` enabled:
```
---------------------------------------------------------------------------------------------------
Kernel name                               Total threads  Dispatches   Avg. T/D    Time    Time%
---------------------------------------------------------------------------------------------------
integrator_init_from_camera                 657,407,232         161  4,083,274   0.24s    0.51%
integrator_intersect_closest              1,629,288,440         681  2,392,494  15.18s   32.12%
integrator_intersect_shadow                 751,652,291         470  1,599,260   5.80s   12.28%
integrator_shade_background                 304,612,074         263  1,158,220   1.16s    2.45%
integrator_shade_surface                  1,159,764,041         676  1,715,627  20.57s   43.52%
integrator_shade_shadow                     598,885,847         418  1,432,741   1.27s    2.69%
integrator_queued_paths_array             2,969,650,130         805  3,689,006   0.35s    0.74%
integrator_queued_shadow_paths_array        593,936,619         379  1,567,115   0.14s    0.29%
integrator_terminated_paths_array            22,205,417         155    143,260   0.05s    0.10%
integrator_sorted_paths_array             2,517,140,043         676  3,723,579   1.65s    3.50%
integrator_compact_paths_array              648,912,748         155  4,186,533   0.03s    0.07%
integrator_compact_states                    20,872,687         155    134,662   0.14s    0.29%
integrator_terminated_shadow_paths_array    374,100,675         438    854,111   0.16s    0.33%
integrator_compact_shadow_paths_array       503,768,657         438  1,150,156   0.05s    0.10%
integrator_compact_shadow_states             37,664,941         202    186,460   0.23s    0.50%
integrator_reset                             25,165,824           6  4,194,304   0.06s    0.12%
film_convert_combined_half_rgba               3,110,400           6    518,400   0.00s    0.01%
prefix_sum                                          676         676          1   0.19s    0.40%
---------------------------------------------------------------------------------------------------
                                                              6,760             47.27s  100.00%
---------------------------------------------------------------------------------------------------
```
Reviewed By: brecht Differential Revision: https://developer.blender.org/D15044	2022-06-07 11:08:39 +01:00
Brecht Van Lommel	f2cd7e08fe	Fix Cycles MNEE not working for Metal Move MNEE to own kernel, separate from shader ray-tracing. This does introduce the limitation that a shader can't use both MNEE and AO/bevel, but that seems like the better trade-off for now. We can experiment with bigger kernel organization changes later. Differential Revision: https://developer.blender.org/D15070	2022-05-31 17:24:43 +02:00
Sergey Sharybin	9bb4bf5748	Fix missing 64bit casts when calculating Cycles render buffer offset Found those missing casts while looking into a crash report made in the Blender Chat. Was unable to reproduce the crash, but the casts should totally be there to avoid integer overflow.	2022-05-23 15:59:52 +02:00
Stefan Werner	9c6dff70c8	Cycles: Introduce postfix for kernel body definition Increases flexibility of code-generation for kernel entry points. Currently no functional changes, preparing for integration with oneAPI.	2022-04-01 19:44:02 +02:00
Brecht Van Lommel	a9a05d5597	Merge branch 'blender-v3.1-release'	2022-02-15 01:05:47 +01:00
Brecht Van Lommel	facd9d8268	Cleanup: clang-format	2022-02-15 01:05:25 +01:00
Brecht Van Lommel	35c261dfcf	Merge branch 'blender-v3.1-release'	2022-02-11 23:58:41 +01:00
Michael Jones	27d3140b13	Cycles: Fix Metal kernel compilation for AMD GPUs Workaround for a compilation issue preventing kernels compiling for AMD GPUs: Avoid problematic use of templates on Metal by making `gpu_parallel_active_index_array` a wrapper macro, and moving `blocksize` to be a macro parameter. Reviewed By: brecht Differential Revision: https://developer.blender.org/D14081	2022-02-11 22:52:48 +00:00
Brecht Van Lommel	9cfc7967dd	Cycles: use SPDX license headers * Replace license text in headers with SPDX identifiers. * Remove specific license info from outdated readme.txt, instead leave details to the source files. * Add list of SPDX license identifiers used, and corresponding license texts. * Update copyright dates while we're at it. Ref D14069, T95597	2022-02-11 17:47:34 +01:00
Brecht Van Lommel	0cf2fafd81	Fix T94050, T94570, T94527: Cycles Bevel and AO nodes not working with Metal Workaround what may be a compiler bug, solution found by Michael Jones.	2022-01-13 10:40:41 +01:00
Michael Jones	efe3d60a2c	Cycles: Fix Metal build This patch fixes a couple of new Metal kernel compilation errors: 1) a kernel parameter count overflow, and 2) missing address space qualifiers. Reviewed By: brecht Differential Revision: https://developer.blender.org/D13763	2022-01-07 16:19:31 +00:00
Patrick Mours	8393ccd076	Cycles: Add OptiX temporal denoising support Enables the `bpy.ops.cycles.denoise_animation()` operator again and modifies it to support temporal denoising with OptiX. This requires renders that were done with both the "Vector" and "Denoising Data" passes. Differential Revision: https://developer.blender.org/D11442	2022-01-05 15:58:36 +01:00
Michael Jones	9558fa5196	Cycles: Metal host-side code This patch adds the Metal host-side code: - Add all core host-side Metal backend files (device_impl, queue, etc) - Add MetalRT BVH setup files - Integrate with Cycles device enumeration code - Revive `path_source_replace_includes` in util/path (required for MSL compilation) This patch also includes a couple of small kernel-side fixes: - Add an implementation of `lgammaf` for Metal [Nemes, Gergő (2010), "New asymptotic expansion for the Gamma function", Archiv der Mathematik](https://users.renyi.hu/~gergonemes/) - include "work_stealing.h" inside the Metal context class because it accesses state now Ref T92212 Reviewed By: brecht Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D13423	2021-12-07 15:52:21 +00:00
Michael Jones	f613c4c095	Cycles: MetalRT support (kernel side) This patch adds MetalRT support to Cycles kernel code. It is mostly additive in nature or confined to Metal-specific code, however there are a few areas where this interacts with other code: - MetalRT closely follows the Optix implementation, and in some cases (notably handling of transforms) it makes sense to extend Optix special-casing to MetalRT. For these generalisations we now have `__KERNEL_GPU_RAYTRACING__` instead of `__KERNEL_OPTIX__`. - MetalRT doesn't support primitive offsetting (as with `primitiveIndexOffset` in Optix), so we define and populate a new kernel texture, `__object_prim_offset`, containing per-object primitive / curve-segment offsets. This is referenced and applied in MetalRT intersection handlers. - Two new BVH layout enum values have been added: `BVH_LAYOUT_METAL` and `BVH_LAYOUT_MULTI_METAL_EMBREE` (for XPU mode). Some host-side enum case handling has been updated where it is trivial to do so. Ref T92212 Reviewed By: brecht Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D13353	2021-11-29 15:20:26 +00:00
Michael Jones	eb7827e797	Cycles: Fix film convert address space mismatch on Metal This patch fixes an address space mismatch in the film convert kernels on Metal. The `film_get_pass_pixel_...` functions take a `ccl_private` result pointer, but the film convert kernels pass a `ccl_global` memory pointer. Specialising the pass-fetch functions with templates results in compilation errors on Visual Studio, so instead this patch just adds an intermediate local on Metal. Reviewed By: brecht Differential Revision: https://developer.blender.org/D13350	2021-11-26 13:58:48 +00:00
Michael Jones	d1f944c186	Cycles: declare constants at program scope on Metal MSL requires that constant address space literals be declared at program scope. This patch moves the `blackbody_table_r/g/b` and `cie_colour_match` constants into separate files so they can be declared at the appropriate scope. Ref T92212 Differential Revision: https://developer.blender.org/D13241	2021-11-18 14:38:05 +01:00
Michael Jones	d19e35873f	Cycles: several small fixes and additions for MSL This patch contains many small leftover fixes and additions that are required for Metal-enablement: - Address space fixes and a few other small compile fixes - Addition of missing functionality to the Metal adapter headers - Addition of various scattered `__KERNEL_METAL__` blocks (e.g. for atomic support & maths functions) Ref T92212 Differential Revision: https://developer.blender.org/D13263	2021-11-18 14:38:02 +01:00
Sergey Sharybin	ce395c84a3	Merge branch 'blender-v3.0-release'	2021-11-11 15:29:35 +01:00
Sergey Sharybin	d26d3cfe19	Fix T92868: Cycles catcher with transparency crashes The issue was caused by splitting happening twice. Fixed by checking for split flag which is assigned to both states during split. The tricky part was to write catcher data at the moment of split: the transparency and shadow catcher sample count is to be accumulated at that point. Now it is happening in the `intersect_closest` kernel. The downside is that render buffer is to be passed to the kernel, but the benefit is that extra split bounce check is not needed now. Had to move the passes write to shadow catcher header, since include of `film/passes.h` causes all the fun of requirement to have BSDF data structures available. Differential Revision: https://developer.blender.org/D13177	2021-11-11 15:21:35 +01:00
Brecht Van Lommel	3fa86f4b28	Merge branch 'blender-v3.0-release'	2021-11-10 20:19:09 +01:00
Brecht Van Lommel	6b0008129e	Fix T92972: Cycles HIP wrong render display after a recent refactor It's unclear why this fails. Maybe the size of half4 is not the expected 8 bytes and adjacent pixels are overwritten. Or there is some bug in the HIP compiler writing a struct into global memory, which we probably don't do elsewhere in the kernel. Thanks to Thomas, William and Jeroen for helping investigate this.	2021-11-10 20:03:07 +01:00
Patrick Mours	f565620435	Fix T92985: CUDA errors with Cycles film convert kernels rB3a4c8f406a3a3bf0627477c6183a594fa707a6e2 changed the macros that create the film convert kernel entry points, but in the process accidentally changed the parameter definition to one of those (which caused CUDA launch and misaligned address errors) and changed the implementation as well. This restores the correct implementation from before. In addition, the `ccl_gpu_kernel_threads` macro did not work as intended and caused the generated launch bounds to end up with an incorrect input for the second parameter (it was set to "thread_num_registers", rather than the result of the block number calculation). I'm not entirely sure why, as the macro definition looked sound to me. Decided to simply go with two separate macros instead, to simplify and solve this. Also changed how state is captured with the `ccl_gpu_kernel_lambda` macro slightly, to avoid a compiler warning (expression has no effect) that otherwise occurred. Maniphest Tasks: T92985 Differential Revision: https://developer.blender.org/D13175	2021-11-10 15:49:50 +01:00
Michael Jones	3a4c8f406a	Cycles: Adapt shared kernel/device/gpu layer for MSL This patch adapts the shared kernel entrypoints so that they can be compiled as MSL (Metal Shading Language). Where possible, the adaptations avoid changes in common code. In MSL, kernel function inputs are explicitly bound to resources. In the case of argument buffers, we declare a struct containing the kernel arguments, accessible via device pointer. This differs from CUDA and HIP where kernel function arguments are declared as traditional C-style function parameters. This patch adapts the entrypoints declared in kernel.h so that they can be translated via a new `ccl_gpu_kernel_signature` macro into the required parameter struct + kernel entrypoint pairing for MSL. MSL buffer attribution must be applied to function parameters or non-static class data members. To allow universal access to the integrator state, kernel data, and texture fetch adapters, we wrap all of the shared kernel code in a `MetalKernelContext` class. This is achieved by bracketing the appropriate kernel headers with "context_begin.h" and "context_end.h" on Metal. When calling deeper into the kernel code, we must reference the context class (e.g. `context.integrator_init_from_camera`). This extra prefixing is performed by a set of defines in "context_end.h". These will require explicit maintenance if entrypoints change. We invite discussion on more maintainable ways to enforce correctness. Lambda expressions are not supported on MSL, so a new `ccl_gpu_kernel_lambda` macro generates an inline function object and optionally capturing any required state. This yields the same behaviour. This approach is applied to all parallel_... implementations which are templated by operation. The lambda expressions in the film_convert... kernels don't adapt cleanly to use function objects. 
However, these entrypoints can be macro-generated more concisely to avoid lambda expressions entirely, instead relying on constant folding to handle the pixel/channel conversions. A separate implementation of `gpu_parallel_active_index_array` is provided for Metal to workaround some subtle differences in SIMD width, and also to encapsulate some required thread parameters which must be declared as explicit entrypoint function parameters. Ref T92212 Reviewed By: brecht Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D13109	2021-11-09 21:43:10 +00:00
Patrick Mours	440a3475b8	Cycles: Improve OptiX denoising with dark images and fix crash when denoiser is destroyed Adds a pass before denoising that calculates the intensity of the image, which can be passed into the OptiX denoiser for more optimal results for very dark or very bright images. In addition this also fixes a crash that sometimes occurred on exit. The OptiX denoiser object has to be destroyed before the OptiX device context object (since it references that). But in C++ the destructor function of a class is called before its fields are destructed, so "~OptiXDevice" was always called before "OptiXDevice::~Denoiser" and therefore "optixDeviceContextDestroy" was called before "optixDenoiserDestroy", hence the crash. Differential Revision: https://developer.blender.org/D13160	2021-11-09 14:49:00 +01:00
Brecht Van Lommel	fd25e883e2	Cycles: remove prefix from source code file names Remove prefix of filenames that is the same as the folder name. This used to help when #includes were using individual files, but now they are always relative to the cycles root directory and so the prefixes are redundant. For patches and branches, git merge and rebase should be able to detect the renames and move over code to the right file.	2021-10-26 15:37:04 +02:00
Brecht Van Lommel	d7d40745fa	Cycles: changes to source code folders structure * Split render/ into scene/ and session/. The scene/ folder now contains the scene and its nodes. The session/ folder contains the render session and associated data structures like drivers and render buffers. * Move top level kernel headers into new folders kernel/camera/, kernel/film/, kernel/light/, kernel/sample/, kernel/util/ * Move integrator related kernel headers into kernel/integrator/ * Move OSL shaders from kernel/shaders/ to kernel/osl/shaders/ For patches and branches, git merge and rebase should be able to detect the renames and move over code to the right file.	2021-10-26 15:36:39 +02:00
Brecht Van Lommel	282516e53e	Cleanup: refactor float/half conversions for clarity	2021-10-22 13:03:03 +02:00
Brecht Van Lommel	df00463764	Cycles: add shadow path compaction for GPU rendering Similar to main path compaction that happens before adding work tiles, this compacts shadow paths before launching kernels that may add shadow paths. Only do it when more than 50% of space is wasted. It's not a clear win in all scenes, some are up to 1.5% slower. Likely caused by different order of scheduling kernels having an unpredictable performance impact. Still feels like compaction is just the right thing to avoid cases where a few shadow paths can hold up a lot of main paths. Differential Revision: https://developer.blender.org/D12944	2021-10-21 15:38:03 +02:00
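
Commit 440a3475b8 above attributes an exit crash to C++ destruction order: the body of a class's destructor runs before the destructors of its fields. The sketch below reproduces that ordering in isolation. It is not the Cycles code; `Context`, `Denoiser`, `BuggyDevice`, `FixedDevice` and the log strings are hypothetical stand-ins for the OptiX device context and denoiser handles.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a device context handle (not the OptiX API).
struct Context {
  std::vector<std::string> &log;
  bool alive = true;
  explicit Context(std::vector<std::string> &log_) : log(log_) {}
  void destroy()
  {
    alive = false;
    log.push_back("context destroyed");
  }
};

// Hypothetical stand-in for a denoiser that references the context.
struct Denoiser {
  Context *context = nullptr;
  bool valid = false;
  void create(Context *ctx)
  {
    context = ctx;
    valid = true;
  }
  void destroy()
  {
    if (!valid) {
      return;
    }
    // Record whether the context was still usable at this point.
    context->log.push_back(context->alive ? "denoiser destroyed (context alive)" :
                                            "denoiser destroyed (context dead)");
    valid = false;
  }
  ~Denoiser() { destroy(); }
};

// Destructor body runs first, member destructors afterwards:
// the context is torn down before ~Denoiser gets a chance to run.
struct BuggyDevice {
  Context context;
  Denoiser denoiser;
  explicit BuggyDevice(std::vector<std::string> &log) : context(log)
  {
    denoiser.create(&context);
  }
  ~BuggyDevice() { context.destroy(); }
};

// One possible fix: release the dependent object explicitly in the body,
// before destroying the thing it references.
struct FixedDevice {
  Context context;
  Denoiser denoiser;
  explicit FixedDevice(std::vector<std::string> &log) : context(log)
  {
    denoiser.create(&context);
  }
  ~FixedDevice()
  {
    denoiser.destroy();
    context.destroy();
  }
};

// Helpers that construct and destroy a device, returning the event order.
std::vector<std::string> run_buggy()
{
  std::vector<std::string> log;
  {
    BuggyDevice device(log);
  }
  return log;
}

std::vector<std::string> run_fixed()
{
  std::vector<std::string> log;
  {
    FixedDevice device(log);
  }
  return log;
}
```

Calling `run_buggy()` records the context teardown ahead of the denoiser teardown, which is the ordering the commit message describes; `run_fixed()` records the reverse. Whether the actual fix used an explicit release or another mechanism is not stated in the log, so the second class is only one way to get the safe order.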

58 Commits