Commit Graph

75 Commits

Author SHA1 Message Date
Campbell Barton
5b9740c913 Cleanup: use braces for sources in intern/
Omitted intern/itasc as some of these sources are from KDL:
https://www.orocos.org/kdl.html
2023-09-17 09:05:40 +10:00
Campbell Barton
c12994612b License headers: use SPDX-FileCopyrightText in intern/cycles 2023-06-14 16:53:23 +10:00
Campbell Barton
6859bb6e67 Cleanup: format (with BraceWrapping::AfterControlStatement "MultiLine") 2023-05-02 09:37:49 +10:00
Xavier Hallade
9821a2d397 Cycles: pass kernel features to get_bvh_layout_mask
This allows to selectively disable Hardware Raytracing in oneAPI
backend, depending on features used.
2023-04-18 22:09:42 +02:00
Brecht Van Lommel
cc6d8cd573 Fix #105442: Cycles CUDA and HIP host memory fallback not working
Transforming the host pointer should not be done in an assert, it only works
in debug builds then. Caused by 6dcfb6d.
2023-03-17 21:52:29 +01:00
Nikita Sirgienko
6dcfb6df9c Cycles: Abstract host memory fallback for GPU devices
Host memory fallback in CUDA and HIP devices is almost identical.
We remove duplicated code and create a shared generic version that
other devices (oneAPI) will be able to use.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17173
2023-02-06 22:19:32 +01:00
Hallam Roberts
a501a2dbff Images: add mirror extension type
This adds a new mirror image extension type for shaders and
geometry nodes (next to the existing repeat, extend and clip
options).

See D16432 for a more detailed explanation of `wrap_mirror`.

This also adds a new sampler flag `GPU_SAMPLER_MIRROR_REPEAT`.
It acts as a modifier to `GPU_SAMPLER_REPEAT`, so any `REPEAT`
flag must be set for the `MIRROR` flag to have an effect.

Differential Revision: https://developer.blender.org/D16432
2022-12-14 19:27:29 +01:00
Brecht Van Lommel
009f7de619 Cleanup: use better matching integer types for graphics interop handle
Ref D16042
2022-12-01 15:55:48 +01:00
Chris Blackbourn
4b57bc4e5d Cleanup: format 2022-11-09 08:30:18 +13:00
Gon Solo
c306ccb67f Fix Cycles error with runtime compilation when there is no path to OptiX SDK
If no OPTIX_ROOT is set, nvcc fails to compile because there is a stray "-I"
in the arguments. Detect if the include path is empty and act accordingly.

Differential Revision: https://developer.blender.org/D16308
2022-11-08 19:40:57 +01:00
Michael Jones
8dd7b5b26b Cycles: Metal integrator state size tuning
This patch tunes the integrator state sizing for Metal (`num_concurrent_states` and `num_concurrent_busy_states`).

On all GPUs architecture, we adjust the busy:total states ratio to be 1:4 which gives better rendering performance than the previous 1:16 ratio (independent of total state count). This gives a small performance uplift (e.g. 2-3% on M1 Ultra).

Additionally for M2 architectures, we double the overall state size if there is available headroom. Inclusive of the first change, we can expect uplift of close to 10% in future, as this results in larger dispatch sizes and minimises work submission overheads. In order to make an accurate determination of available headroom, we defer the calculation of `num_concurrent_states` and `num_concurrent_busy_states` until the time of integrator state allocation (i.e. after all of the scene data has been allocated). We also refactor `alloc_integrator_soa` to calculate an *exact* single-state-size in a first pass, right before allocating the integrator SoA buffers in a second pass.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16313
2022-10-24 17:14:33 +01:00
Nikita Sirgienko
2ead05d738 Cycles: Add optional per-kernel performance statistics
When verbose level 4 is enabled, Blender prints kernel performance
data for Cycles on GPU backends (except Metal that doesn't use
debug_enqueue_* methods) for groups of kernels.
These changes introduce a new CYCLES_DEBUG_PER_KERNEL_PERFORMANCE
environment variable to allow getting timings for each kernels
separately and not grouped with others. This is done by adding
explicit synchronization after each kernel execution.

Differential Revision: https://developer.blender.org/D15971
2022-09-27 22:15:00 +02:00
Patrick Mours
79787bf8e1 Cycles: Improve denoiser update performance when rendering with multiple GPUs
This patch causes the render buffers to be copied to the denoiser
device only once before denoising and output/display is then fed
from that single buffer on the denoiser device. That way usually all
but one copy (from all the render devices to the denoiser device)
can be eliminated, provided that the denoiser device is also the
display device (in which case interop is used to update the display).
As such this patch also adds some logic that tries to ensure the
chosen denoiser device is the same as the display device.

Differential Revision: https://developer.blender.org/D15657
2022-08-12 16:00:54 +02:00
Brecht Van Lommel
ff1883307f Cleanup: renaming and consistency for kernel data
* Rename "texture" to "data array". This has not used textures for a long time,
  there are just global memory arrays now. (On old CUDA GPUs there was a cache
  for textures but not global memory, so we used to put all data in textures.)
* For CUDA and HIP, put globals in KernelParams struct like other devices.
* Drop __ prefix for data array names, no possibility for naming conflict now that
  these are in a struct.
2022-06-20 12:30:48 +02:00
Brecht Van Lommel
2c1bffa286 Cleanup: add verbose logging category names instead of numbers
And use them more consistently than before.
2022-06-17 14:08:14 +02:00
Brecht Van Lommel
610619c203 Merge branch 'blender-v3.2-release' 2022-05-31 17:35:16 +02:00
Brecht Van Lommel
f2cd7e08fe Fix Cycles MNEE not working for Metal
Move MNEE to own kernel, separate from shader ray-tracing. This does introduce
the limitation that a shader can't use both MNEE and AO/bevel, but that seems
like the better trade-off for now.

We can experiment with bigger kernel organization changes later.

Differential Revision: https://developer.blender.org/D15070
2022-05-31 17:24:43 +02:00
Patrick Mours
a8c81ffa83 Cycles: Add half precision float support for volumes with NanoVDB
This patch makes it possible to change the precision with which to
store volume data in the NanoVDB data structure (as float, half, or
using variable bit quantization) via the previously unused precision
field in the volume data block.
It makes it possible to further reduce memory usage during
rendering, at a slight cost to the visual detail of a volume.

Differential Revision: https://developer.blender.org/D10023
2022-05-23 19:08:01 +02:00
Sergey Sharybin
eccc9d8eba Cleanup: Remove unused function in Cycles queue
Noticed while looking into oneAPI patch.

Seems to be unused, without clear indication why/when it might be
needed. Removing the function simplifies adding the new backend.

Differential Revision: https://developer.blender.org/D14652
2022-04-19 10:32:07 +02:00
Brecht Van Lommel
9cfc7967dd Cycles: use SPDX license headers
* Replace license text in headers with SPDX identifiers.
* Remove specific license info from outdated readme.txt, instead leave details
  to the source files.
* Add list of SPDX license identifiers used, and corresponding license texts.
* Update copyright dates while we're at it.

Ref D14069, T95597
2022-02-11 17:47:34 +01:00
Brecht Van Lommel
ae28d90578 Fix T93350: Cycles renders shows black during rendering huge resolutions
The root of the issue is caused by Cycles ignoring OpenGL limitation on
the maximum resolution of textures: Cycles was allocating texture of the
final render resolution. It was exceeding limitation on certain GPUs and
driver.

The idea is simple: use multiple textures for the display, each of which
will fit into OpenGL limitations.

There is some code which allows the display driver to know when to start
the new tile. Also added some code to allow force graphics interop to be
re-created. The latter one ended up not used in the final version of the
patch, but it might be helpful for other drivers implementation.

The tile size is limited to 8K now as it is the safest size for textures
on many GPUs and OpenGL drivers.

This is an updated fix with a workaround for freezing with the NVIDIA
driver on Linux.

Differential Revision: https://developer.blender.org/D13385
2022-01-07 17:20:04 +01:00
Brecht Van Lommel
204ae33d75 Revert "Fix T93350: Cycles renders shows black during rendering huge resolutions"
This reverts commit 5e37f70307.

It is leading to freezing of the entire desktop for a few seconds when stopping
3D viewport rendering on my Linux / NVIDIA system.
2021-12-07 20:49:34 +01:00
Sergey Sharybin
5e37f70307 Fix T93350: Cycles renders shows black during rendering huge resolutions
The root of the issue is caused by Cycles ignoring OpenGL limitation on
the maximum resolution of textures: Cycles was allocating texture of the
final render resolution. It was exceeding limitation on certain GPUs and
driver.

The idea is simple: use multiple textures for the display, each of which
will fit into OpenGL limitations.

There is some code which allows the display driver to know when to start
the new tile. Also added some code to allow force graphics interop to be
re-created. The latter one ended up not used in the final version of the
patch, but it might be helpful for other drivers implementation.

The tile size is limited to 8K now as it is the safest size for textures
on many GPUs and OpenGL drivers.

Differential Revision: https://developer.blender.org/D13385
2021-12-07 19:01:42 +01:00
Patrick Mours
e14f8c2dd7 Cycles: Reintroduce device-only memory handling that got lost in Cycles X merge
Somehow only a part of rBf4f8b6dde32b0438e0b97a6d8ebeb89802987127 ended up in
Cycles X, causing the issue that commit fixed, "OPTIX_ERROR_INVALID_VALUE" when the
system is out of memory, to show up again.
This adds the missing changes to fix that problem.

Maniphest Tasks: T93620

Differential Revision: https://developer.blender.org/D13488
2021-12-07 18:50:10 +01:00
Campbell Barton
ac447ba1a3 Cleanup: clang-format, trailing space 2021-11-30 10:15:17 +11:00
Michael Jones
98a5c924fc Cycles: Metal readiness: Specify DeviceQueue::enqueue arg types
This patch adds new arg-type parameters to `DeviceQueue::enqueue` and its overrides. This is in preparation for the Metal backend which needs this information for correct argument encoding.

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13357
2021-11-29 14:56:06 +00:00
Sergey Sharybin
1706bf7780 Merge branch 'blender-v3.0-release' 2021-11-22 17:32:23 +01:00
Sergey Sharybin
336ca6796a Fix T90308: Cycles crash copying memory from device to host
Happens when device runs out of memory and Cycles is moving some
textures to the host memory.

The delayed memory free for OptiX BVH was moving data from one
device_memory to another, leaving the original device memory in
an invalid state. This was ruining the allocation map in the CUDA
device which is using pointer to the device_memory.

This change makes it so the memory pointer is stolen from BVH
into the delayed memory free list.

Additionally, forbid copying and moving instances of device_memory
and added sanity checks in the device implementation.

Differential Revision: https://developer.blender.org/D13316
2021-11-22 17:26:59 +01:00
Sebastian Herholz
d9bc8f189c Cycles: add build option to enable a debugging feature for MIS
This patch adds a CMake option "WITH_CYCLES_DEBUG" which builds cycles with
a feature that allows debugging/selecting the direct-light sampling strategy.
The same option may later be used to add other debugging features that could
affect performance in release builds.

The three options are:
* Forward path tracing (e.g., via BSDF or phase function)
* Next-event estimation
* Multiple importance sampling combination of the previous two methods

Such a feature is useful for debugging light different sampling, evaluation,
and pdf methods (e.g., for light sources and BSDFs).

Differential Revision: https://developer.blender.org/D13152
2021-11-17 18:03:56 +01:00
Thomas Dinges
83a4d51997 Cleanup: Remove unused show_samples() device code in Cycles. 2021-11-17 11:16:48 +01:00
Thomas Dinges
25e7365d0d Cleanup CUDA / HIP comments
Remove outdated CUDA comments for bindless textures and cleanup some HIP comments that still mentioned CUDA.

Differential Revision: https://developer.blender.org/D13189
2021-11-11 16:37:29 +01:00
Clément Foucault
3f0991266f Merge branch 'blender-v3.0-release' 2021-11-01 12:15:09 +01:00
Thomas Dinges
5327413b37 Cleanup: Remove Cycles device checks for half float.
All supported devices support half float now, so we can remove the check.

Differential Revision: https://developer.blender.org/D13021
2021-11-01 10:18:30 +01:00
Brecht Van Lommel
806521f703 Fix T92671: confusing Cycles debug logs about CPU architecture
Instead of printing debug flags listing various CPU and GPU settings that
may or may not be used, print when we are using them. This include CPU
kernel types, OptiX debugging and CUDA and HIP adaptive compilation. BVH
type was already printed.
2021-11-01 08:36:50 +01:00
Brecht Van Lommel
fd25e883e2 Cycles: remove prefix from source code file names
Remove prefix of filenames that is the same as the folder name. This used
to help when #includes were using individual files, but now they are always
relative to the cycles root directory and so the prefixes are redundant.

For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.
2021-10-26 15:37:04 +02:00
Brecht Van Lommel
d7d40745fa Cycles: changes to source code folders structure
* Split render/ into scene/ and session/. The scene/ folder now contains the
  scene and its nodes. The session/ folder contains the render session and
  associated data structures like drivers and render buffers.
* Move top level kernel headers into new folders kernel/camera/, kernel/film/,
  kernel/light/, kernel/sample/, kernel/util/
* Move integrator related kernel headers into kernel/integrator/
* Move OSL shaders from kernel/shaders/ to kernel/osl/shaders/

For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.
2021-10-26 15:36:39 +02:00
Brecht Van Lommel
df00463764 Cycles: add shadow path compaction for GPU rendering
Similar to main path compaction that happens before adding work tiles, this
compacts shadow paths before launching kernels that may add shadow paths.

Only do it when more than 50% of space is wasted.

It's not a clear win in all scenes, some are up to 1.5% slower. Likely caused
by different order of scheduling kernels having an unpredictable performance
impact. Still feels like compaction is just the right thing to avoid cases
where a few shadow paths can hold up a lot of main paths.

Differential Revision: https://developer.blender.org/D12944
2021-10-21 15:38:03 +02:00
Brecht Van Lommel
39810b3f51 Cleanup: make HIP and CUDA code more consistent
Ref D12834
2021-10-21 13:08:10 +02:00
Brecht Van Lommel
001f548227 Cycles: reduce kernel reserved local memory when not using shader raytracing
Ref T87836
2021-10-20 17:50:31 +02:00
Brecht Van Lommel
a754e35198 Cycles: refactor API for GPU display
* Split GPUDisplay into two classes. PathTraceDisplay to implement the Cycles side,
  and DisplayDriver to implement the host application side. The DisplayDriver is now
  a fully abstract base class, embedded in the PathTraceDisplay.
* Move copy_pixels_to_texture implementation out of the host side into the Cycles side,
  since it can be implemented in terms of the texture buffer mapping.
* Move definition of DeviceGraphicsInteropDestination into display driver header, so
  that we do not need to expose private device headers in the public API.
* Add more detailed comments about how the DisplayDriver should be implemented.

The "driver" terminology might not be obvious, but is also used in other renderers.

Differential Revision: https://developer.blender.org/D12626
2021-09-30 20:48:08 +02:00
Brecht Van Lommel
a6b53ef994 Cycles: print name of kernels on errors in CUDA queue, for debugging 2021-09-27 15:24:12 +02:00
Brecht Van Lommel
ab8f24811d Cleanup: remove unused device code and includes 2021-09-24 16:34:14 +02:00
Brecht Van Lommel
d7f803f522 Fix T91641: crash rendering with 16k environment map in Cycles
Protect against integer overflow.
2021-09-23 17:48:16 +02:00
Campbell Barton
4d66cbd140 Cleanup: spelling in comments 2021-09-22 14:54:01 +10:00
Brecht Van Lommel
0803119725 Cycles: merge of cycles-x branch, a major update to the renderer
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.

Some features have also been removed or changed, breaking backwards compatibility.
Including the removal of the OpenCL backend, for which alternatives are under
development.

Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://wiki.blender.org/wiki/Source/Render/Cycles

Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)

For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.

Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T80267, T77185, T69800
2021-09-21 14:55:54 +02:00
Brecht Van Lommel
073bf8bf52 Cycles: remove WITH_CYCLES_DEBUG, add WITH_CYCLES_DEBUG_NAN
WITH_CYCLES_DEBUG was used for rendering BVH debugging passes. But since we
mainly use Embree an OptiX now, this information is no longer important.

WITH_CYCLES_DEBUG_NAN will enable additional checks for NaNs and invalid values
in the kernel, for Cycles developers. Previously these asserts where enabled in
all debug builds, but this is too likely to crash Blender in scenes that render
fine regardless of the NaNs. So this is behind a CMake option now.

Fixes T90240
2021-07-28 19:27:57 +02:00
Brecht Van Lommel
cf74cd9367 Cycles: upgrade CUDA to 11.4
This fixes a performance regression on Ampere cards, on specific scenes like
classroom. For cycles-x there is little difference, but this is still helpful
for LTS releases, and we need to upgrade at some point anyway.
2021-07-26 19:46:51 +02:00
Patrick Mours
f4f8b6dde3 Cycles: Change device-only memory to actually only allocate on the device
This patch changes the `MEM_DEVICE_ONLY` type to only allocate on the device and fail if
that is not possible anymore because out-of-memory (since OptiX acceleration structures may
not be allocated in host memory). It also fixes high peak memory usage during OptiX
acceleration structure building.

Reviewed By: brecht

Maniphest Tasks: T85985

Differential Revision: https://developer.blender.org/D10535
2021-03-11 14:12:35 +01:00
Campbell Barton
17e1e2bfd8 Cleanup: correct spelling in comments 2021-02-05 16:23:34 +11:00
James Horsley
4fbeb3e6be Fix T85089: Crash when rendering scene that does not fit into GPU memory with CUDA/OptiX
The "cuda_mem_map_mutex" was potentially being locked recursively during the call to
"CUDADevice::move_textures_to_host", which crashed. This moves around the locking and
unlocking of "cuda_mem_map_mutex", so that it doesn't call a function that locks it while
still holding the lock.

Reviewed By: pmoursnv

Maniphest Tasks: T85089, T84734

Differential Revision: https://developer.blender.org/D10219
2021-01-27 15:27:57 +01:00