Commit Graph

20 Commits

Author SHA1 Message Date
Michael Jones
a0f269f682 Cycles: Kernel address space changes for MSL
This is the first of a sequence of changes to support compiling Cycles kernels as MSL (Metal Shading Language) in preparation for a Metal GPU device implementation.

MSL requires that all pointer types be declared with explicit address space attributes (device, thread, etc...). There is already precedent for this with Cycles' address space macros (ccl_global, ccl_private, etc...), therefore the first step of MSL-enablement is to apply these consistently. Line-for-line this represents the largest change required to enable MSL. Applying this change first will simplify future patches as well as offering the emergent benefit of enhanced descriptiveness.

The vast majority of deltas in this patch fall into one of two cases:

- Ensuring ccl_private is specified for thread-local pointer types
- Ensuring ccl_global is specified for device-wide pointer types

Additionally, the ccl_addr_space qualifier can be removed. Prior to Cycles X, ccl_addr_space was used as a context-dependent address space qualifier, but now it is either redundant (e.g. in struct typedefs), or can be replaced by ccl_global in the case of pointer types. Associated function variants (e.g. lcg_step_float_addrspace) are also redundant.

In cases where address space qualifiers are chained with "const", this patch places the address space qualifier first. The rationale for this is that the choice of address space is likely to have the greater impact on runtime performance and overall architecture.

The final part of this patch is the addition of a metal/compat.h header. This is partially complete and will be extended in future patches, paving the way for the full Metal implementation.

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D12864
2021-10-14 16:14:43 +01:00
Sergey Sharybin
aa46459543 Fix shadow catcher behind transparent object on GPU
The assumption about absent shadow path was wrong.

The rest of the changes are to ensure shadow paths are finished prior
to the split, so that they write to the proper passes.

The issue was caught by running regression tests on OptiX.

Differential Revision: https://developer.blender.org/D12857
2021-10-14 09:39:38 +02:00
Campbell Barton
c1c6c11ca6 Cleanup: spelling in comments 2021-10-12 17:55:02 +11:00
Brecht Van Lommel
a94343a8af Cycles: improve SSS Fresnel and retro-reflection in Principled BSDF
For details see the "Extending the Disney BRDF to a BSDF with Integrated
Subsurface Scattering" paper.

We split the diffuse BSDF into a lambertian and retro-reflection component.
The retro-reflection component is always handled as a BSDF, while the
lambertian component can be replaced by a BSSRDF.

For the BSSRDF case, we compute Fresnel separately at the entry and exit
points, which may have different normals. As the scattering radius decreases
this converges to the BSDF case.

A downside is that this increases noise for subsurface scattering in the
Principled BSDF, due to some samples going to the retro-reflection component.
However the previous logic (also in 2.93) was simple wrong, using a
non-sensical view direction vector at the exit point. We use an importance
sampling weight estimate for the retro-reflection to try to better balance
samples between the BSDF and BSSRDF.

Differential Revision: https://developer.blender.org/D12801
2021-10-11 18:22:54 +02:00
Brecht Van Lommel
73a05ff9e8 Cycles: restore Christensen-Burley SSS
There is not enough time before the release to improve Random Walk to handle
all cases this was used for, so restore it for now.

Since there is no more path splitting in cycles-x, this can increase noise in
non-flat areas for the sample number of samples, though fewer rays will be traced
also. This is fundamentally a trade-off we made in the new design and why Random
Walk is a better fit. However the importance resampling we do now does help to
reduce noise.

Differential Revision: https://developer.blender.org/D12800
2021-10-11 18:22:54 +02:00
Brecht Van Lommel
736be7cf58 Fix T91997: Cycles glass + SSS not rendering correctly 2021-10-08 16:11:02 +02:00
Sergey Sharybin
f01c4f27f9 Fix Cycles speed regression after dynamic volume stack change
Only copy required part of volume stack instead of entire stack.

Solves time regression introduced by D12759 and avoids need in
implementing volume stack calculation to exactly match what the
path tracing will do (as well as potentially makes scenes with
a lot of volumes ans a tiny bit of deeply nested ones render
faster).

Still need to look into memory aspect of the regression, but
that is for separate patch.

Ref T92014

Maniphest Tasks: T92014

Differential Revision: https://developer.blender.org/D12790
2021-10-08 15:44:03 +02:00
Campbell Barton
de07bf2b13 Cleanup: spelling 2021-10-08 13:23:19 +11:00
Brecht Van Lommel
04857cc8ef Cycles: fully decouple triangle and curve primitive storage from BVH2
Previously the storage here was optimized to avoid indirections in BVH2
traversal. This helps improve performance a bit, but makes performance
and memory usage of Embree and OptiX BVHs a bit worse also. It also adds
code complexity in other parts of the code.

Now decouple triangle and curve primitive storage from BVH2.
* Reduced peak memory usage on all devices
* Bit better performance for OptiX and Embree
* Bit worse performance for CUDA
* Simplified code:
** Intersection.prim/object now matches ShaderData.prim/object
** No more offset manipulation for mesh displacement before a BVH is built
** Remove primitive packing code and flags for Embree and OptiX
** Curve segments are now stored in a KernelCurve struct
* Also happens to fix a bug in baking with incorrect prim/object

Fixes T91968, T91770, T91902

Differential Revision: https://developer.blender.org/D12766
2021-10-06 17:52:04 +02:00
Sergey Sharybin
0194e54fd3 Fix compilation error with MSVC
MSVC does not support variable size array definition.
Use maximum possible stack, similar to the GPU case.

Not expected to have user-measurable difference.
2021-10-06 16:51:07 +02:00
Sergey Sharybin
c6275da852 Fix T91922: Cycles artifacts with high volume nested level
Make volume stack allocated conditionally, potentially based on the
actual nested level of objects in the scene.

Currently the nested level is estimated by number of volume objects.
This is a non-expensive check which is probably enough in practice
to get almost perfect memory usage and performance.

The conditional allocation is a bit tricky.

For the CPU we declare and define maximum possible volume stack,
because there are only that many integrator states on the CPU.

On the GPU we declare outer SoA to have all volume stack elements,
but only allocate actually needed ones. The actually used volume
stack size is passed as a pre-processor, which seems to be easiest
and fastest for the GPU state copy.

There seems to be no speed regression in the demo files on RTX6000.

Note that scenes with high nested level of volume will now be slower
but correct.

Differential Revision: https://developer.blender.org/D12759
2021-10-06 15:46:32 +02:00
Sergey Sharybin
f806bd8261 Fix T91861: Black environment behind shadow catcher
Always sample background pass behind shadow catcher (if the pass
exists, of course), regardless of whether shadow catcher will be
used as approximate or accurate.

Allows to combine accurate shadows into an environment map.

Differential Revision: https://developer.blender.org/D12747
2021-10-04 15:07:32 +02:00
Brecht Van Lommel
fc4886a314 Fix T91894: Cycles baking normal maps of transformed objects not working 2021-10-04 13:58:37 +02:00
Brecht Van Lommel
4d4113adc2 Cycles: record large number of transparent shadow intersections on CPU
So we can do fewer intersection calls, only on the GPU do we need to save
memory and do this in small steps.

Ref T87836
2021-09-29 16:37:32 +02:00
Brecht Van Lommel
ed541de29d Fix T91626: Cycles sss behind fully transparent object renders differently 2021-09-23 18:32:30 +02:00
Brecht Van Lommel
6279efbb78 Fix Cycles compiler warning on GCC 11
For shadow rays there are no closures, leave out the closure merging code
there to avoid warnings about accessing closure memory that does not exist.
2021-09-23 17:48:16 +02:00
Brecht Van Lommel
204b01a254 Fix T91590: Cycles specular baking not using smooth normal 2021-09-22 16:08:45 +02:00
Brecht Van Lommel
53e7c64be7 Fix T91597: Cycles volume scatter visibility on lights not working 2021-09-22 16:08:45 +02:00
Campbell Barton
4d66cbd140 Cleanup: spelling in comments 2021-09-22 14:54:01 +10:00
Brecht Van Lommel
0803119725 Cycles: merge of cycles-x branch, a major update to the renderer
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.

Some features have also been removed or changed, breaking backwards compatibility.
Including the removal of the OpenCL backend, for which alternatives are under
development.

Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://wiki.blender.org/wiki/Source/Render/Cycles

Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)

For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.

Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T80267, T77185, T69800
2021-09-21 14:55:54 +02:00