Commit Graph

917 Commits

Author SHA1 Message Date
Brecht Van Lommel
949dbb08d2 Cleanup: remove useless WITH_CYCLES_DEVICE_MULTI 2021-10-26 15:37:59 +02:00
Brecht Van Lommel
fd25e883e2 Cycles: remove prefix from source code file names
Remove prefix of filenames that is the same as the folder name. This used
to help when #includes were using individual files, but now they are always
relative to the cycles root directory and so the prefixes are redundant.

For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.
2021-10-26 15:37:04 +02:00
Brecht Van Lommel
d7d40745fa Cycles: changes to source code folders structure
* Split render/ into scene/ and session/. The scene/ folder now contains the
  scene and its nodes. The session/ folder contains the render session and
  associated data structures like drivers and render buffers.
* Move top level kernel headers into new folders kernel/camera/, kernel/film/,
  kernel/light/, kernel/sample/, kernel/util/
* Move integrator related kernel headers into kernel/integrator/
* Move OSL shaders from kernel/shaders/ to kernel/osl/shaders/

For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.
2021-10-26 15:36:39 +02:00
Sayak Biswas
d092933abb Cycles: various fixes for HIP and compilation of HIP binaries
* Additional structs added to the hipew loader for device props
* Adds hipRTC functions to the loader for future usage
* Enables CPU+GPU usage for HIP
* Cleanup to the adaptive kernel compilation process
* Fix for kernel compilation failures with HIP with latest master

Ref T92393, D12958
2021-10-22 12:15:29 +02:00
Brecht Van Lommel
be558d2d97 Fix T92363: OptiX fails with ambient occlusion node, after recent changes
This triggered a compiler bug where it does not handle the sub.s16 PTX
instruction. Instead refactor the code so we don't need to do uint16_t
subtraction at all.

Also update OptiX device to remove the AO pass direct callable.

Thanks Patrick Mours for figuring this out.
2021-10-21 21:25:34 +02:00
Brecht Van Lommel
df00463764 Cycles: add shadow path compaction for GPU rendering
Similar to main path compaction that happens before adding work tiles, this
compacts shadow paths before launching kernels that may add shadow paths.

Only do it when more than 50% of space is wasted.

It's not a clear win in all scenes, some are up to 1.5% slower. Likely caused
by different order of scheduling kernels having an unpredictable performance
impact. Still feels like compaction is just the right thing to avoid cases
where a few shadow paths can hold up a lot of main paths.

Differential Revision: https://developer.blender.org/D12944
2021-10-21 15:38:03 +02:00
Brecht Van Lommel
39810b3f51 Cleanup: make HIP and CUDA code more consistent
Ref D12834
2021-10-21 13:08:10 +02:00
William Leeson
f0df0e9e07 Fix: Add cast to atof for CYCLES_CONCURRENT_STATES_FACTOR env variable parsing.
The conversion from double to float was causing a build failure.

Differential Revision: https://developer.blender.org/D12946
2021-10-20 21:01:39 +02:00
Brecht Van Lommel
7d111f4ac2 Cleanup: remove unused code 2021-10-20 18:15:21 +02:00
Brecht Van Lommel
001f548227 Cycles: reduce kernel reserved local memory when not using shader raytracing
Ref T87836
2021-10-20 17:50:31 +02:00
Sayak Biswas
ba4e227def HIP device code cleanup and fix for high VRAM usage
This patch cleans up code for HIP device and makes it more consistent with the CUDA code.
It also fixes the issue with high VRAM usage on AMD cards using HIP allowing better performance and usage on cards like 6600XT.
Added a check in intern/cycles/kernel/bvh/bvh_util.h to prevent compiler error with hipcc

Reviewed By: brecht, leesonw

Maniphest Tasks: T92124

Differential Revision: https://developer.blender.org/D12834
2021-10-20 14:04:28 +02:00
Brecht Van Lommel
fd77a28031 Cycles: bake transparent shadows for hair
These transparent shadows can be expansive to evaluate. Especially on the
GPU they can lead to poor occupancy when only some pixels require many kernel
launches to trace and evaluate many layers of transparency.

Baked transparency allows tracing a single ray in many cases by accumulating
the throughput directly in the intersection program without recording hits
or evaluating shaders. Transparency is baked at curve vertices and
interpolated, for most shaders this will look practically the same as actual
shader evaluation.

Fixes T91428, performance regression with spring demo file due to transparent
hair, and makes it render significantly faster than Blender 2.93.

Differential Revision: https://developer.blender.org/D12880
2021-10-19 15:11:09 +02:00
Brecht Van Lommel
1df3b51988 Cycles: replace integrator state argument macros
* Rename struct KernelGlobals to struct KernelGlobalsCPU
* Add KernelGlobals, IntegratorState and ConstIntegratorState typedefs
  that every device can define in its own way.
* Remove INTEGRATOR_STATE_ARGS and INTEGRATOR_STATE_PASS macros and
  replace with these new typedefs.
* Add explicit state argument to INTEGRATOR_STATE and similar macros

In preparation for decoupling main and shadow paths.

Differential Revision: https://developer.blender.org/D12888
2021-10-18 19:02:10 +02:00
Brecht Van Lommel
2ba7c3aa65 Cleanup: refactor to make number of channels for shader evaluation variable 2021-10-15 15:42:44 +02:00
Brecht Van Lommel
04857cc8ef Cycles: fully decouple triangle and curve primitive storage from BVH2
Previously the storage here was optimized to avoid indirections in BVH2
traversal. This helps improve performance a bit, but makes performance
and memory usage of Embree and OptiX BVHs a bit worse also. It also adds
code complexity in other parts of the code.

Now decouple triangle and curve primitive storage from BVH2.
* Reduced peak memory usage on all devices
* Bit better performance for OptiX and Embree
* Bit worse performance for CUDA
* Simplified code:
** Intersection.prim/object now matches ShaderData.prim/object
** No more offset manipulation for mesh displacement before a BVH is built
** Remove primitive packing code and flags for Embree and OptiX
** Curve segments are now stored in a KernelCurve struct
* Also happens to fix a bug in baking with incorrect prim/object

Fixes T91968, T91770, T91902

Differential Revision: https://developer.blender.org/D12766
2021-10-06 17:52:04 +02:00
Sergey Sharybin
6e268a749f Fix adaptive sampling artifacts on tile boundaries
Implement an overscan support for tiles, so that adaptive sampling can
rely on the pixels neighbourhood.

Differential Revision: https://developer.blender.org/D12599
2021-10-05 16:19:14 +02:00
Campbell Barton
74f45ed9c5 Cleanup: spelling in comments 2021-10-03 12:13:29 +11:00
Brecht Van Lommel
a754e35198 Cycles: refactor API for GPU display
* Split GPUDisplay into two classes. PathTraceDisplay to implement the Cycles side,
  and DisplayDriver to implement the host application side. The DisplayDriver is now
  a fully abstract base class, embedded in the PathTraceDisplay.
* Move copy_pixels_to_texture implementation out of the host side into the Cycles side,
  since it can be implemented in terms of the texture buffer mapping.
* Move definition of DeviceGraphicsInteropDestination into display driver header, so
  that we do not need to expose private device headers in the public API.
* Add more detailed comments about how the DisplayDriver should be implemented.

The "driver" terminology might not be obvious, but is also used in other renderers.

Differential Revision: https://developer.blender.org/D12626
2021-09-30 20:48:08 +02:00
Campbell Barton
6dceaafe5a Cleanup: trailing space, newlines at EOF 2021-09-29 07:30:34 +10:00
Brecht Van Lommel
86ec9d79ec Fix build without Cycles HIP device 2021-09-28 20:00:55 +02:00
Brian Savery
044a77352f Cycles: add HIP device support for AMD GPUs
NOTE: this feature is not ready for user testing, and not yet enabled in daily
builds. It is being merged now for easier collaboration on development.

HIP is a heterogenous compute interface allowing C++ code to be executed on
GPUs similar to CUDA. It is intended to bring back AMD GPU rendering support
on Windows and Linux.

https://github.com/ROCm-Developer-Tools/HIP.

As of the time of writing, it should compile and run on Linux with existing
HIP compilers and driver runtimes. Publicly available compilers and drivers
for Windows will come later.

See task T91571 for more details on the current status and work remaining
to be done.

Credits:

Sayak Biswas (AMD)
Arya Rafii (AMD)
Brian Savery (AMD)

Differential Revision: https://developer.blender.org/D12578
2021-09-28 19:18:55 +02:00
Patrick Mours
2189dfd6e2 Cycles: Rework OptiX visibility flags handling
Before the visibility test against the visibility flags was performed in an any-hit program in OptiX
(called `__anyhit__kernel_optix_visibility_test`), which was using the `__prim_visibility` array.
This is not entirely correct however, since `__prim_visibility` is filled with the merged visibility
flags of all objects that reference that primitive, so if one object uses different visibility flags
than another object, but they both are instances of the same geometry, they would appear the same
way. The reason that the any-hit program was used rather than the OptiX instance visibility mask is
that the latter is currently limited to 8 bits only, which is not sufficient to contain all Cycles
visibility flags (12 bits).

To mostly fix the problem with multiple instances and different visibility flags, I changed things to
use the OptiX instance visibility mask for a subset of the Cycles visibility flags (`PATH_RAY_CAMERA`
to `PATH_RAY_VOLUME_SCATTER`, which fit into 8 bits) and only fall back to the visibility test any-hit
program if that isn't enough (e.g. the ray visibility mask exceeds 8 bits or when using the built-in
curves from OptiX, since the any-hit program is then also used to skip the curve endcaps).

This may also improve performance in some cases, since by default OptiX can now perform the normal
scene intersection trace calls entirely on RT cores without having to jump back to the SM on every
hit to execute the any-hit program.

Fixes T89801

Differential Revision: https://developer.blender.org/D12604
2021-09-27 17:12:43 +02:00
Brecht Van Lommel
a6b53ef994 Cycles: print name of kernels on errors in CUDA queue, for debugging 2021-09-27 15:24:12 +02:00
Brecht Van Lommel
ab8f24811d Cleanup: remove unused device code and includes 2021-09-24 16:34:14 +02:00
Brecht Van Lommel
d7f803f522 Fix T91641: crash rendering with 16k environment map in Cycles
Protect against integer overflow.
2021-09-23 17:48:16 +02:00
Campbell Barton
4d66cbd140 Cleanup: spelling in comments 2021-09-22 14:54:01 +10:00
Brecht Van Lommel
0803119725 Cycles: merge of cycles-x branch, a major update to the renderer
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.

Some features have also been removed or changed, breaking backwards compatibility.
Including the removal of the OpenCL backend, for which alternatives are under
development.

Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://wiki.blender.org/wiki/Source/Render/Cycles

Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)

For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.

Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T80267, T77185, T69800
2021-09-21 14:55:54 +02:00
Campbell Barton
93eb460dd0 Cleanup: clang-format (re-run after v12 version bump) 2021-07-30 16:19:19 +10:00
Brecht Van Lommel
073bf8bf52 Cycles: remove WITH_CYCLES_DEBUG, add WITH_CYCLES_DEBUG_NAN
WITH_CYCLES_DEBUG was used for rendering BVH debugging passes. But since we
mainly use Embree an OptiX now, this information is no longer important.

WITH_CYCLES_DEBUG_NAN will enable additional checks for NaNs and invalid values
in the kernel, for Cycles developers. Previously these asserts where enabled in
all debug builds, but this is too likely to crash Blender in scenes that render
fine regardless of the NaNs. So this is behind a CMake option now.

Fixes T90240
2021-07-28 19:27:57 +02:00
Brecht Van Lommel
cf74cd9367 Cycles: upgrade CUDA to 11.4
This fixes a performance regression on Ampere cards, on specific scenes like
classroom. For cycles-x there is little difference, but this is still helpful
for LTS releases, and we need to upgrade at some point anyway.
2021-07-26 19:46:51 +02:00
Campbell Barton
f1e4903854 Cleanup: full sentences in comments, improve comment formatting 2021-06-26 21:50:48 +10:00
Campbell Barton
4b9ff3cd42 Cleanup: comment blocks, trailing space in comments 2021-06-24 15:59:34 +10:00
Kévin Dietrich
cd39e3dec1 OptiX: select BVH build options from Scene params
Currently, the OptiX BVH build options are selected based on whether
we are in background mode (final renders) or not (viewport renders).
In background mode, the BVH is built for fast path tracing and low
memory footprint, while in viewport, it is built for fast updates.

However, on platforms without OpenGL support, the background flag is
always set to true and prevents using fast BVH builds in the viewport.

Now, the BVH options derive from the Scene BVH settings:
* if BVH is static, a fast to trace BVH is built
* if BVH is dynamic, a fast to update BVH is built

Reviewed By: #cycles, brecht

Differential Revision: https://developer.blender.org/D11154
2021-06-22 07:38:28 +02:00
Patrick Mours
b046bc536b Fix T88096: Baking with OptiX and displacement fails
Using displacement runs the shader eval kernel, but since OptiX modules are not loaded when
baking is active, those were not available and therefore failed to launch. This fixes that by falling
back to the CUDA kernels.
2021-05-25 16:56:16 +02:00
Sergey Sharybin
ff51c2e89a Cleanup: Use named unused arguments in Cycles Device 2021-05-21 11:19:33 +02:00
Sybren A. Stüvel
0745afeddb Merge remote-tracking branch 'origin/blender-v2.93-release' 2021-05-20 13:00:07 +02:00
Brecht Van Lommel
0456223cde Fix T87793: Cycles OptiX crash hiding objects in viewport render 2021-05-19 18:30:43 +02:00
Brecht Van Lommel
3e472d87a8 Cycles OpenCL: disable AO preview kernels
These seem to be causing some stability issues, and really are just not that
useful in practice. Compiling them is slow already, so it does not improve
the user experience much to show an AO preview if it's not nearly instant.
2021-05-19 18:30:43 +02:00
Brecht Van Lommel
542b8da831 Merge branch 'blender-v2.93-release' 2021-05-17 20:18:39 +02:00
Brecht Van Lommel
91a5dbbd17 Fix OpenCL group size performance issue on Intel GPUs
Contributed by Intel. On some scenes like classroom with particular integrated
GPUs this speeds up rendering 1.97x. With other benchmarks and GPUs it's
between 0.99-1.14x.
2021-05-17 19:40:57 +02:00
Patrick Mours
94960250b5 Cycles: Fix build with OptiX 7.3 SDK 2021-04-26 14:55:39 +02:00
Patrick Mours
7cbd66d42f Cycles: Initialize all OptiX structs to zero before use
This is done to ensure building with newer OptiX SDK releases that add new struct fields gives
deterministic results (no uninitialized fields and therefore random data is passed to OptiX).
2021-04-13 13:56:15 +02:00
Patrick Mours
f1fe42d912 Cycles: Do not allocate tile buffers on all devices when peer memory is active and denoising is not
Separate tile buffers on all devices only need to exist when denoising is active (so any overlap
being rendered simultaneously does not write to the same memory region).
When denoising is not active they can be distributed like all other memory when peer
memory support is available.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D10858
2021-03-30 14:04:56 +02:00
Brecht Van Lommel
91c44fe885 Cycles: disable NanoVDB for AMD OpenCL
It is causing issue with AMD OpenCL drivers, due to a potential driver bug.

Ref T84461
2021-03-30 00:00:17 +02:00
Brecht Van Lommel
8f93386e62 Fix (apparently harmless) Cycles asan warnings 2021-03-15 20:46:57 +01:00
Patrick Mours
f4f8b6dde3 Cycles: Change device-only memory to actually only allocate on the device
This patch changes the `MEM_DEVICE_ONLY` type to only allocate on the device and fail if
that is not possible anymore because out-of-memory (since OptiX acceleration structures may
not be allocated in host memory). It also fixes high peak memory usage during OptiX
acceleration structure building.

Reviewed By: brecht

Maniphest Tasks: T85985

Differential Revision: https://developer.blender.org/D10535
2021-03-11 14:12:35 +01:00
Campbell Barton
17e1e2bfd8 Cleanup: correct spelling in comments 2021-02-05 16:23:34 +11:00
Patrick Mours
b2e00e8f8e Merge branch 'blender-v2.92-release' 2021-01-29 13:35:21 +01:00
Patrick Mours
9f89166b52 Fix T85148: OptiX viewport denoising regression
Commit 6e74a8b69f changed the denoiser input passes default to
include the normal pass. This does not always produce optimal images though, hence why the
default was previously set to only include the color and albedo passes. This restores that behavior, so
that viewport denoising with OptiX produces the same results as before.
2021-01-29 13:35:00 +01:00
Patrick Mours
9b80291412 Merge branch 'blender-v2.92-release' 2021-01-27 15:29:39 +01:00