Commit Graph

130 Commits

Author SHA1 Message Date
Patrick Mours
a45c36efae Cycles: Make OSL implementation independent from SVM
Cleans up the file structure to be more similar to that of the SVM
and also makes it possible to build kernels with OSL support, but
without having to include SVM support.

This patch was split from D15902.

Differential Revision: https://developer.blender.org/D15949
2022-09-13 10:59:28 +02:00
Brecht Van Lommel
4ac69c26db Fix Cycles wrong MIS logic in shade_light kernel after recent changes
Though end result was still correct. Thanks to Alaska for spotting this.
2022-09-08 15:23:21 +02:00
Brecht Van Lommel
aa174f632e Cleanup: split surface/displacement/volume shader eval into separate files 2022-09-02 17:13:28 +02:00
Brecht Van Lommel
b865339833 Cleanup: remove some unnecessary kernel feature defines
That are either unused or aren't useful for testing anymore without a
megakernel.
2022-09-02 17:13:28 +02:00
Brecht Van Lommel
cf57624764 Cleanup: refactoring of kernel film function names and organization 2022-09-02 17:13:28 +02:00
Brecht Van Lommel
06d2dc6be2 Cleanup: minor cleanups for sample pattern code 2022-09-01 14:57:39 +02:00
Brecht Van Lommel
60119daef5 Cycles: remove old Sobol pattern, simplify sampling dimensions
The multi-dimensional Sobol pattern required us to carefully use as low
dimensions as possible, as quality goes down in higher dimensions. Now that we
have two sampling patterns that are at least as good, there is no need to keep
it around and the implementation can be simplified.

Differential Revision: https://developer.blender.org/D15788
2022-09-01 14:57:39 +02:00
Nathan Vegdahl
50df9caef0 Cycles: improve Progressive Multi-Jittered sampling
Fix two issues in the previous implementation:
* Only power-of-two prefixes were progressively stratified, not suffixes.
  This resulted in unnecessarily increased noise when using non-power-of-two
  sample counts.
* In order to try to get away with just a single sample pattern, the code
  used a combination of sample index shuffling and Cranley-Patterson rotation.
  Index shuffling is normally fine, but due to the sample patterns themselves
  not being quite right (as described above) this actually resulted in
  additional increased noise. Cranley-Patterson, on the other hand, always
  increases noise with randomized (t,s) nets like PMJ02, and should be avoided
  with these kinds of sequences.

Addressed with the following changes:
* Replace the sample pattern generation code with a much simpler algorithm
  recently published in the paper "Stochastic Generation of (t, s) Sample
  Sequences". This new implementation is easier to verify, produces fully
  progressively stratified PMJ02, and is *far* faster than the previous code,
  being O(N) in the number of samples generated.
* It keeps the sample index shuffling, which works correctly now due to the
  improved sample patterns. But it now uses a newer high-quality hash instead
  of the original Laine-Karras hash.
* The scrambling distance feature cannot (to my knowledge) be implemented with
  any decorrelation strategy other than Cranley-Patterson, so Cranley-Patterson
  is still used when that feature is enabled. But it is now disabled otherwise,
  since it increases noise.
* In place of Cranley-Patterson, multiple independent patterns are generated
  and randomly chosen for different pixels and dimensions as described in the
  original PMJ paper. In this patch, the pattern selection is done via
  hash-based shuffling to ensure there are no repeats within a single pixel
  until all patterns have been used.

The combination of these fixes brings the quality of Cycles' PMJ sampler in
line with the previously submitted Sobol-Burley sampler in D15679. They are
essentially indistinguishable in terms of quality/noise, which is expected
since they are both randomized (0,2) sequences.

Differential Revision: https://developer.blender.org/D15746
2022-09-01 14:57:39 +02:00
Brecht Van Lommel
3a605b23d0 Fix T100708: Cycles bake of diffuse/glossy color not outputting alpha 2022-08-31 20:51:50 +02:00
Nathan Vegdahl
a06c9b5ca8 Cycles: add Sobol-Burley sampling pattern
Based on the paper "Practical Hash-based Owen Scrambling" by Brent Burley,
2020, Journal of Computer Graphics Techniques.

It is distinct from the existing Sobol sampler in two important ways:
* It is Owen scrambled, which gives it a much better convergence rate in many
  situations.
* It uses padding for higher dimensions, rather than using higher Sobol
  dimensions directly. In practice this is advantagous because high-dimensional
  Sobol sequences have holes in their sampling patterns that don't resolve
  until an unreasonable number of samples are taken. (See Burley's paper for
  details.)

The pattern reduces noise in some benchmark scenes, however it is also slower,
particularly on the CPU. So for now Progressive Multi-Jittered sampling remains
the default.

Differential Revision: https://developer.blender.org/D15679
2022-08-19 16:27:22 +02:00
Brecht Van Lommel
e949d6da5b Cycles: simplify handling of ray differentials
* Store compact ray differentials in ShaderData and compute full differentials
  on demand. This reduces register pressure on the GPU.
* Remove BSDF differential code that was effectively doing nothing as the
  differential orientation was discarded when making it compact.

This gives a 1-5% speedup with RTX A6000 + OptiX in our benchmarks, with the
bigger speedups in simpler scenes.

Renders appear to be identical except for the Both displacement option that
does both displacement and bump.

Differential Revision: https://developer.blender.org/D15677
2022-08-15 13:48:02 +02:00
Campbell Barton
996cb4008d Cleanup: repeated words in comments 2022-08-12 12:38:54 +10:00
Campbell Barton
77f41da5f1 Cleanup: spelling 2022-08-10 16:23:11 +10:00
Brecht Van Lommel
752fb5dd08 Merge branch 'blender-v3.3-release' 2022-08-09 19:19:54 +02:00
Brecht Van Lommel
79f1cc601c Cycles: improve ray tracing precision near triangle edges
Detect cases where a ray-intersection would miss the current triangle, which if
the intersection is strictly watertight, implies that a neighboring triangle would
incorrectly be hit instead.

When that is detected, apply a ray-offset. The idea being that we only want to
introduce potential error from ray offsets if we really need to.

This work for BVH2 and Embree, as we are able to match the ray-interesction
bit-for-bit, though doing so for Embree requires ugly hacks. Tiny differences
like fused-multiply-add or dot product intrinstics in matrix inversion and ray
intersection needed to be matched exactly, so this is fragile.

Unfortunately we're not able to do the same for OptiX or MetalRT, since those
implementations are unknown (and possibly impossible to match as hardware
instructions). Still artifacts are much reduced, though not eliminated.

Ref T97259

Differential Revision: https://developer.blender.org/D15559
2022-08-09 18:42:01 +02:00
Andrii Symkin
d832d993c5 Cycles: add new Spectrum and PackedSpectrum types
These replace float3 and packed_float3 in various places in the kernel where a
spectral color representation will be used in the future. That representation
will require more than 3 channels and conversion to from/RGB. The kernel code
was refactored to remove the assumption that Spectrum and RGB colors are the
same thing.

There are no functional changes, Spectrum is still a float3 and the conversion
functions are no-ops.

Differential Revision: https://developer.blender.org/D15535
2022-08-09 16:49:34 +02:00
Brecht Van Lommel
43a124bc1c Fix T99179: holdout does not affect transparency without transparent background
This was by design, but maybe not so useful in practice. It's always possible to
set alpha to 1 in compositing if needed.
2022-08-05 16:32:13 +02:00
Brecht Van Lommel
91d365f6df Fix T100205: Cycles wrong volume shading with two materials in object
Assume that all faces using the smae material form a closed mesh, so that
joining meshes gives the same result as separate meshes.

It does mean that using different materials on different sides of one
closed mesh do not work, but the meaning of that is poorly defined anyway
if there is a volume interior.
2022-08-04 19:02:25 +02:00
Brecht Van Lommel
38af5b0501 Cycles: switch Cycles triangle barycentric convention to match Embree/OptiX
Simplifies intersection code a little and slightly improves precision regarding
self intersection.

The parametric texture coordinate in shader nodes is still the same as before
for compatibility.
2022-07-27 21:03:33 +02:00
Brecht Van Lommel
f26aa186b2 Cleanup: remove __KERNEL_CPU__
This was tested in some places to check if code was being compiled for the
CPU, however this is only defined in the kernel. Checking __KERNEL_GPU__
always works.
2022-07-25 17:43:35 +02:00
Brecht Van Lommel
7a74d91e32 Cleanup: move device BVH code to kernel/device/*/bvh.h
Having the OptiX/MetalRT/Embree/MetalRT implementations all in one file with
many #ifdefs became too confusing. Instead split it up per device, and also
move it together with device specific hit/filter/intersect functions and
associated data types.
2022-07-25 16:34:22 +02:00
Brecht Van Lommel
881ef0548a Fix wrong Cycles SSS intersection distance after ray distance changes
No need anymore to have a difference between CPU/GPU, all distances
remain in world space.
2022-07-25 15:19:29 +02:00
Brecht Van Lommel
5152c7c152 Cycles: refactor rays to have start and end distance, fix precision issues
For transparency, volume and light intersection rays, adjust these distances
rather than the ray start position. This way we increment the start distance
by the smallest possible float increment to avoid self intersections, and be
sure it works as the distance compared to be will be exactly the same as
before, due to the ray start position and direction remaining the same.

Fix T98764, T96537, hair ray tracing precision issues.

Differential Revision: https://developer.blender.org/D15455
2022-07-15 18:46:24 +02:00
Brecht Van Lommel
523bbf7065 Cycles: generalize shader sorting / locality heuristic to all GPU devices
This was added for Metal, but also gives good results with CUDA and OptiX.
Also enable it for future Apple GPUs instead of only M1 and M2, since this has
been shown to help across multiple GPUs so the better bet seems to enable
rather than disable it.

Also moves some of the logic outside of the Metal device code, and always
enables the code in the kernel since other devices don't do dynamic compile.

Time per sample with OptiX + RTX A6000:
                                         new                  old
barbershop_interior                      0.0730s              0.0727s
bmw27                                    0.0047s              0.0053s
classroom                                0.0428s              0.0464s
fishy_cat                                0.0102s              0.0108s
junkshop                                 0.0366s              0.0395s
koro                                     0.0567s              0.0578s
monster                                  0.0206s              0.0223s
pabellon                                 0.0158s              0.0174s
sponza                                   0.0088s              0.0100s
spring                                   0.1267s              0.1280s
victor                                   0.0524s              0.0531s
wdas_cloud                               0.0817s              0.0816s

Ref D15331, T87836
2022-07-15 13:42:47 +02:00
Brecht Van Lommel
b8ffd43bd2 Cleanup: make format 2022-07-15 13:40:04 +02:00
Olivier Maury
1b5db02a02 Fix Cycles MNEE wrong results with area light spread
When the solve is successful, the light sample needs to be updated since the
effective shading point is now on the last refractive interface. Spread was
not taken into account, creating false caustics.

Differential Revision: https://developer.blender.org/D15449
2022-07-14 16:36:38 +02:00
Brecht Van Lommel
28c3739a9b Cleanup: replace state flow macros in the kernel with functions 2022-07-14 16:36:38 +02:00
Michael Jones
4b1d315017 Cycles: Improve cache usage on Apple GPUs by chunking active indices
This patch partitions the active indices into chunks prior to sorting by material in order to tradeoff some material coherence for better locality. On Apple Silicon GPUs (particularly higher end M1-family GPUs), we observe overall render time speedups of up to 15%. The partitioning is implemented by repeating the range of `shader_sort_key` for each partition, and encoding a "locator" key which distributes the indices into sorted chunks.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D15331
2022-07-14 14:26:18 +01:00
Andrii Symkin
c2a2f3553a Cycles: unify math functions names
This patch unifies the names of math functions for different data types and uses
overloading instead. The goal is to make it possible to swap out all the float3
variables containing RGB data with something else, with as few as possible
changes to the code. It's a requirement for future spectral rendering patches.

Differential Revision: https://developer.blender.org/D15276
2022-06-23 15:02:53 +02:00
Brecht Van Lommel
ff1883307f Cleanup: renaming and consistency for kernel data
* Rename "texture" to "data array". This has not used textures for a long time,
  there are just global memory arrays now. (On old CUDA GPUs there was a cache
  for textures but not global memory, so we used to put all data in textures.)
* For CUDA and HIP, put globals in KernelParams struct like other devices.
* Drop __ prefix for data array names, no possibility for naming conflict now that
  these are in a struct.
2022-06-20 12:30:48 +02:00
Brecht Van Lommel
074010ad6d Fix T98570: Cycles AOVs not working in background shader
Must include the AOV writing feature in background shader evaluation.

Differential Revision: https://developer.blender.org/D15114
2022-06-03 15:34:31 +02:00
Brecht Van Lommel
f2cd7e08fe Fix Cycles MNEE not working for Metal
Move MNEE to own kernel, separate from shader ray-tracing. This does introduce
the limitation that a shader can't use both MNEE and AO/bevel, but that seems
like the better trade-off for now.

We can experiment with bigger kernel organization changes later.

Differential Revision: https://developer.blender.org/D15070
2022-05-31 17:24:43 +02:00
Sergey Sharybin
9bb4bf5748 Fix missing 64bit casts when calculating Cycles render buffer offset
Found those missing casts while looking into a crash report made in
the Blender Chat. Was unable to reproduce the crash, but the casts
should totally be there to avoid integer overflow.
2022-05-23 15:59:52 +02:00
Campbell Barton
ffbeb34f5f Cleanup: format 2022-05-18 12:17:42 +10:00
Olivier Maury
b48adbc9d7 Fix T97921: Cycles MNEE issue with light path nodes
Ensure the correct total/diffuse/transmission depth is set when evaluating
shaders for MNEE, consistent with regular light shader evaluation.

Differential Revision: https://developer.blender.org/D14902
2022-05-17 21:07:49 +02:00
Olivier Maury
48754bc146 Fix T97867: Cycles MNEE blocky artefacts for rough refractive interfaces
Made tangent frame consistent across the surface regardless of the sample,
which was not the case with the previous algorithm. Previously, a tangent
frame would stay consistent for the same sample throughout the walk, but not
from sample to sample for the same triangle. This actually resulted in code
simplification.

Also includes additional fixes:

* Fixed an important bug that manifested itself with multiple lights in the
  scene, where caustics had abnormally low amplitude: The final light pdf did
  not include the light distribution pdf.
* Removed unnecessary orthonormal basis generation function, using cycles'
  native one instead.
* Increased solver max iteration back to 64: It turns out we sometimes need
  these extra iterations in cases where projection back to the surface takes
  many steps. The effective solver iteration count, the most expensive part,
  is actually much less than the raw iteration count.

Differential Revision: https://developer.blender.org/D14931
2022-05-16 15:11:54 +02:00
Olivier Maury
dc55e095e6 Fix T97056: Cycles MNEE not working with glass and pure refraction BSDFs
Differential Revision: https://developer.blender.org/D14901
2022-05-10 18:51:02 +02:00
Brecht Van Lommel
1b566b70c1 Fix T95308, T93913: Cycles mist pass wrong with SSS shader
It was wrongly writing passes twice, for both the surface entry and exit points.
We can skip code for filtering closures, emission and holdout also, as these do
nothing with only a subsurface diffuse closure present.
2022-05-05 22:01:45 +02:00
Brecht Van Lommel
e4931ab86d Cleanup: clang format 2022-05-05 21:57:08 +02:00
Brecht Van Lommel
75a051a6ab Fix T93246: Cycles wrong volume shading after transparent surface
The Russian roulette probability was not taken into account for volumes in all
cases. It should not be left out from the SD_HAS_ONLY_VOLUME case.
2022-05-05 20:51:01 +02:00
Brecht Van Lommel
c722993ef1 Fix T94467: Cycles baking of normal pass slower than before cycles-x 2022-04-28 19:55:12 +02:00
Christophe Hery
110eb23005 Fix T97056: Cycles MNEE caustics treated as direct instead of indirect light
This fixes wrong render passs and bounce limits.

Differential Revision: https://developer.blender.org/D14737
2022-04-28 18:00:11 +02:00
Campbell Barton
bba757ef81 Cleanup: various minor changes
- Add missing doxy-section for Apply Parent Inverse Operator
- Use identity for None comparison in Python.
- Remove newline from operator doc-strings.
- Use '*' prefix multi-line C comment blocks.
- Separate filenames from doc-strings.
- Remove break after return.
2022-04-24 13:41:03 +10:00
Olivier Maury
58be9708bf Cycles: removed UV requirement for MNEE, along with fixes and cleanups
Remove need for shadow caustic caster geometry to have a UV layout. UVs were
useful to maintain a consistent tangent frame across the surface while
performing the walk. A consistent tangent frame is necessary for rough
surfaces where a normal offset encodes the sampled h, which should point
towards the same direction across the mesh.

In order to get a continuous surface parametrization without UVs, the
technique described in this paper was implemented:

"The Natural-Constraint Representation of the Path Space for Efficient
 Light Transport Simulation" (Supplementary Material), SIGGRAPH 2014.

In addition to implementing this feature:
* Shadow caustic casters without smooth normals are now ignored (triggered
  some refactoring and cleaning).
* Hit point calculation was refactored using existing utils functions,
  simplifying the code.
* The max number of solver iterations was reduced to 32, a solution is
  usually found by then.
* Added generalized geometry term clamping (transfer matrix calculation can
  sometimes get unstable).
* Add stop condition to Newton solver for more consistent CPU and GPU result.
* Add support for multi scatter GGX refraction.

Fixes T96990, T96991

Ref T94120

Differential Revision: https://developer.blender.org/D14623
2022-04-22 18:31:15 +02:00
Campbell Barton
2547c3c70c Cleanup: spelling in comments 2022-04-22 10:11:48 +10:00
Kévin Dietrich
2890c11cd7 Cycles: add support for volume motion blur
This adds support for rendering motion blur for volumes, using their
velocity field. This works for fluid simulations and imported VDB
volumes. For the latter, the name of the velocity field can be set per
volume object, with automatic detection of velocity fields that are
split into 3 scalar grids.

A new parameter is also added to scale velocity for more artistic control.

Like for Alembic and USD caches, a parameter to set the unit of time in
which the velocity vectors are expressed is also added. For Blender gas
simulations, the velocity unit should always be in seconds, so this is
only exposed for volume objects which may come from external OpenVDB
files.

These parameters are available under the `Render` panels for the fluid
domain and the volume object data properties respectively.

Credits: kernel advection code from Tangent Animation's Blackbird based
on earlier work by Geraldine Chua

Differential Revision: https://developer.blender.org/D14629
2022-04-19 17:07:53 +02:00
Patrick Mours
e513687288 Cycles: Fix a few type casting warnings
Stumbled over the `integrate_surface_volume_only_bounce` kernel
function not returning the right type. The others too showed up as
warnings when building Cycles as a standalone which didn't have
those warnings disabled.

Differential Revision: https://developer.blender.org/D14558
2022-04-05 18:09:21 +02:00
Campbell Barton
85a0115c44 Cleanup: spelling in comments 2022-04-04 12:35:33 +10:00
Campbell Barton
27fea7a3a5 Cleanup: clang-format
Add ccl_gpu_kernel_postfix as a statement macro to prevent the following
declarations from being indented.
2022-04-04 12:35:33 +10:00
Lukas Stockner
ad35453cd1 Cycles: Add support for light groups
Light groups are a type of pass that only contains lighting from a subset of light sources.
They are created in the View layer, and light sources (lamps, objects with emissive materials
and/or the environment) can be assigned to a group.

Currently, each light group ends up generating its own version of the Combined pass.
In the future, additional types of passes (e.g. shadowcatcher) might be getting their own
per-lightgroup versions.

The lightgroup creation and assignment is not Cycles-specific, so Eevee or external render
engines could make use of it in the future.

Note that Lightgroups are identified by their name - therefore, the name of the Lightgroup
in the View Layer and the name that's set in an object's settings must match for it to be
included.
Currently, changing a Lightgroup's name does not update objects - this is planned for the
future, along with other features such as denoising for light groups and viewing them in
preview renders.

Original patch by Alex Fuller (@mistaed), with some polishing by Lukas Stockner (@lukasstockner97).

Differential Revision: https://developer.blender.org/D12871
2022-04-02 06:14:27 +02:00