Commit Graph

1116 Commits

Author SHA1 Message Date
Brecht Van Lommel
d20be55c1a Cleanup: in Cycles force inline transform_inverse_impl
We expect this to always happen.

Ref T100891
2022-10-03 17:58:34 +02:00
Campbell Barton
ea2c41c730 Cleanup: spelling in comments
Also replace "dm" for evaluated mesh in some comments.
2022-10-03 11:03:46 +11:00
Sebastian Herhoz
75a6d3abf7 Cycles: add Path Guiding on CPU through Intel OpenPGL
This adds path guiding features into Cycles by integrating Intel's Open Path
Guiding Library. It can be enabled in the Sampling > Path Guiding panel in the
render properties.

This feature helps reduce noise in scenes where finding a path to light is
difficult for regular path tracing.

The current implementation supports guiding directional sampling decisions on
surfaces, when the material contains a least one diffuse component, and in
volumes with isotropic and anisotropic Henyey-Greenstein phase functions.

On surfaces, the guided sampling decision is proportional to the product of
the incident radiance and the normal-oriented cosine lobe and in volumes it
is proportional to the product of the incident radiance and the phase function.

The incident radiance field of a scene is learned and updated during rendering
after each per-frame rendering iteration/progression.

At the moment, path guiding is only supported by the CPU backend. Support for
GPU backends will be added in future versions of OpenPGL.

Ref T92571

Differential Revision: https://developer.blender.org/D15286
2022-09-27 15:56:32 +02:00
Brecht Van Lommel
fd1bc90679 Cycles: sync changes from standalone repository
* Windows build fixes
* Workaround for Hydra + OpenColorIO link issue
* Bump version
2022-09-18 17:34:23 +02:00
Lukas Stockner
6951e8890a Mikktspace: Optimized port to C++
This commit is a big overhaul to the Mikktspace module, which is used
to compute tangents. I'm not calling it a rewrite since it's the
result of a lot of iterations on the original code, but pretty much
everything is reworked somehow.

Overall goal was to a) make it faster and b) make it maintainable.

Notable changes:
- Since the callbacks for requesting geometry data were a big
  bottleneck before, I've ported it to C++ and made it header-only,
  templating on the data source. That way, the compiler generates code
  specific to the caller, which allows it to inline the data source and
  specialize for some cases (e.g. subd vs. non-subd in Cycles).
- The one input parameter, an optional angle threshold, was not used
  anywhere. Turns out that removing it allows for considerable
  algorithmic simplification, removing a lot of the complexity in the
  later stages. Therefore, I've just removed the option in the new code.
- The code computes several outputs, but only one (the tangent itself)
  is ever used in Blender. Therefore, I've removed the others to
  simplify the code. They could easily be brought back if needed, none
  of the algorithmic simplifications are conflicting with them.
- The original code had fallback paths for many steps in case temporary
  memory allocation fails, but that never actually gets used anyways
  since malloc() doesn't really ever return NULL in practise, so I
  removed them.
- In general, I've restructured A LOT of the code to make the
  algorithms clearer and make use of some C++ features (vectors,
  std::array, booleans, classes), though there's still some of cleanup
  that could be done.
- Parallelized duplicate detection, neighbor detection, triangle
  tangent computation, degenerate triangle handling and tangent space
  accumulation.
- Replaced several algorithms with faster equivalents: Duplicate
  detection uses a (concurrent) hash set now, neighbor detection uses
  Radixsort and splits vertices by index pairs etc.

As for results, the exact speedup depends on the scene of course, but
let's consider the file from T97378:
- Blender 3.1 (before D14675): 6.07sec
- Blender 3.2 (with D14675): 4.62sec
- rBf0a36599007d (last nightly build): 4.42sec
- With this commit: 0.90sec

This speedup will mostly be noticed at the start of Cycles renders and,
even more importantly, in Eevee when doing something that changes the
geometry (e.g. animating) on a model using normal maps.

Differential Revision: https://developer.blender.org/D15589
2022-09-07 00:35:44 +02:00
Nathan Vegdahl
50df9caef0 Cycles: improve Progressive Multi-Jittered sampling
Fix two issues in the previous implementation:
* Only power-of-two prefixes were progressively stratified, not suffixes.
  This resulted in unnecessarily increased noise when using non-power-of-two
  sample counts.
* In order to try to get away with just a single sample pattern, the code
  used a combination of sample index shuffling and Cranley-Patterson rotation.
  Index shuffling is normally fine, but due to the sample patterns themselves
  not being quite right (as described above) this actually resulted in
  additional increased noise. Cranley-Patterson, on the other hand, always
  increases noise with randomized (t,s) nets like PMJ02, and should be avoided
  with these kinds of sequences.

Addressed with the following changes:
* Replace the sample pattern generation code with a much simpler algorithm
  recently published in the paper "Stochastic Generation of (t, s) Sample
  Sequences". This new implementation is easier to verify, produces fully
  progressively stratified PMJ02, and is *far* faster than the previous code,
  being O(N) in the number of samples generated.
* It keeps the sample index shuffling, which works correctly now due to the
  improved sample patterns. But it now uses a newer high-quality hash instead
  of the original Laine-Karras hash.
* The scrambling distance feature cannot (to my knowledge) be implemented with
  any decorrelation strategy other than Cranley-Patterson, so Cranley-Patterson
  is still used when that feature is enabled. But it is now disabled otherwise,
  since it increases noise.
* In place of Cranley-Patterson, multiple independent patterns are generated
  and randomly chosen for different pixels and dimensions as described in the
  original PMJ paper. In this patch, the pattern selection is done via
  hash-based shuffling to ensure there are no repeats within a single pixel
  until all patterns have been used.

The combination of these fixes brings the quality of Cycles' PMJ sampler in
line with the previously submitted Sobol-Burley sampler in D15679. They are
essentially indistinguishable in terms of quality/noise, which is expected
since they are both randomized (0,2) sequences.

Differential Revision: https://developer.blender.org/D15746
2022-09-01 14:57:39 +02:00
Campbell Barton
a3e1a9e2aa Cleanup: spelling in comments, format 2022-08-26 12:47:21 +10:00
Brecht Van Lommel
6b9209ddfa Merge branch 'blender-v3.3-release' 2022-08-19 21:02:02 +02:00
Brecht Van Lommel
4b62970dd3 Cleanup: replace CHECK_TYPE macro with static_assert
To avoid conflicts with BLI headers and simplify code.
2022-08-19 20:36:02 +02:00
Nathan Vegdahl
a06c9b5ca8 Cycles: add Sobol-Burley sampling pattern
Based on the paper "Practical Hash-based Owen Scrambling" by Brent Burley,
2020, Journal of Computer Graphics Techniques.

It is distinct from the existing Sobol sampler in two important ways:
* It is Owen scrambled, which gives it a much better convergence rate in many
  situations.
* It uses padding for higher dimensions, rather than using higher Sobol
  dimensions directly. In practice this is advantagous because high-dimensional
  Sobol sequences have holes in their sampling patterns that don't resolve
  until an unreasonable number of samples are taken. (See Burley's paper for
  details.)

The pattern reduces noise in some benchmark scenes, however it is also slower,
particularly on the CPU. So for now Progressive Multi-Jittered sampling remains
the default.

Differential Revision: https://developer.blender.org/D15679
2022-08-19 16:27:22 +02:00
Sebastian Parborg
8ffc11dbcb Cleanup OpenGL linking and related code after libepoxy merge
This cleans up the OpenGL build flags and linking.
It additionally also removes some dead code.

One of these dead code paths is WITH_X11_ALPHA which actually never was
active even with the build flag on. The call to use this was never
called because the default initializer for GHOST was set to have it off
per default. Nothing called this function with a boolean value to enable it.

These cleanups are needed to support true headless OpenGL rendering.
Without these cleanups libepoxy will fail to load the correct OpenGL
Libraries as we have already linked them to the blender binary.

Reviewed By: Brecht, Campbell, Jeroen

Differential Revision: http://developer.blender.org/D15554
2022-08-15 16:47:20 +02:00
Christian Rauch
a296b8f694 GPU: replace GLEW with libepoxy
With libepoxy we can choose between EGL and GLX at runtime, as well as
dynamically open EGL and GLX libraries without linking to them.

This will make it possible to build with Wayland, EGL, GLVND support while
still running on systems that only have X11, GLX and libGL. It also paves
the way for headless rendering through EGL.

libepoxy is a new library dependency, and is included in the precompiled
libraries. GLEW is no longer a dependency, and WITH_SYSTEM_GLEW was removed.

Includes contributions by Brecht Van Lommel, Ray Molenkamp, Campbell Barton
and Sergey Sharybin.

Ref T76428

Differential Revision: https://developer.blender.org/D15291
2022-08-15 16:10:29 +02:00
Brecht Van Lommel
c9d821294f Cycles: take into account time limit for progress bar
This change allows the Cycles progress report system to take into conderation
the time limit property. This allows for more accuracte progress reports for
high sample count renders with short time limits.

Contributed by Alaska.

Differential Revision: https://developer.blender.org/D15599
2022-08-11 19:37:18 +02:00
Brecht Van Lommel
5cbfdaccd0 Cleanup: minor changes to DebugFlags
Use C++11, remove unused running_inside_blender and move viewport_static_bvh
to BlenderSync.
2022-08-11 17:03:10 +02:00
Brecht Van Lommel
752fb5dd08 Merge branch 'blender-v3.3-release' 2022-08-09 19:19:54 +02:00
Brecht Van Lommel
79f1cc601c Cycles: improve ray tracing precision near triangle edges
Detect cases where a ray-intersection would miss the current triangle, which if
the intersection is strictly watertight, implies that a neighboring triangle would
incorrectly be hit instead.

When that is detected, apply a ray-offset. The idea being that we only want to
introduce potential error from ray offsets if we really need to.

This work for BVH2 and Embree, as we are able to match the ray-interesction
bit-for-bit, though doing so for Embree requires ugly hacks. Tiny differences
like fused-multiply-add or dot product intrinstics in matrix inversion and ray
intersection needed to be matched exactly, so this is fragile.

Unfortunately we're not able to do the same for OptiX or MetalRT, since those
implementations are unknown (and possibly impossible to match as hardware
instructions). Still artifacts are much reduced, though not eliminated.

Ref T97259

Differential Revision: https://developer.blender.org/D15559
2022-08-09 18:42:01 +02:00
Brecht Van Lommel
230f9ade64 Cycles: make transform inverse match Embree exactly
Helps improve ray-tracing precision. This is a bit complicated as it requires
different implementation depending on the CPU architecture.
2022-08-09 16:59:05 +02:00
Brecht Van Lommel
286e535071 Cleanup: simplify CPU instruction checking
The performance of this will be slightly more important for upcoming changes.
Also removed an unused function and changed includes so these system.h can
be included in more places.
2022-08-09 16:59:05 +02:00
Andrii Symkin
d832d993c5 Cycles: add new Spectrum and PackedSpectrum types
These replace float3 and packed_float3 in various places in the kernel where a
spectral color representation will be used in the future. That representation
will require more than 3 channels and conversion to from/RGB. The kernel code
was refactored to remove the assumption that Spectrum and RGB colors are the
same thing.

There are no functional changes, Spectrum is still a float3 and the conversion
functions are no-ops.

Differential Revision: https://developer.blender.org/D15535
2022-08-09 16:49:34 +02:00
Brecht Van Lommel
1988665c3c Cleanup: make vector types make/print functions consistent between CPU and GPU
Now all the same ones are available on CPU and GPU, which was previously not
possible due to lack of operator overloadng in OpenCL. Print functions are
no-ops on some GPUs.

Ref D15535
2022-08-09 16:07:23 +02:00
Sergey Sharybin
cefd6140f3 Fix T100119: Cycles light object's parametric vector distorted
Caused by 38af5b0501.

Adjust barycentric coordinates used for intersection result in the
ray-to-rectangle intersection check.

Differential Revision: https://developer.blender.org/D15592
2022-08-09 15:56:03 +02:00
Sergey Sharybin
25a0124bc8 Fix T100119: Light object's parametric vector distorted in blender 3.4
Caused by 38af5b0501.

Adjust barycentric coordinates used for intersection result in the
ray-to-rectangle intersection check.

Differential Revision: https://developer.blender.org/D15592
2022-08-02 14:17:10 +02:00
Bastien Montagne
d3879e9aaa Merge branch 'blender-v3.3-release' 2022-07-29 15:17:40 +02:00
Tianhao Chai
b862cf0b9f Fix Cycles build error with CUDA on arm64
Checking arm64 assembly support before CUDA/Metal would cause NVCC to
generate inline arm64 assembly.

Differential Revision: https://developer.blender.org/D15569
2022-07-29 14:57:09 +02:00
Brecht Van Lommel
79ab76e156 Cleanup: simplifications and consistency for vector types
* OneAPI: remove separate float3 definition
* OneAPI: disable operator[] to match other GPUs
* OneAPI: make int3 compact to match other GPUs
* Use #pragma once
* Add __KERNEL_NATIVE_VECTOR_TYPES__ to simplify checks
* Remove unused vector3
2022-07-28 21:27:13 +02:00
Brecht Van Lommel
38af5b0501 Cycles: switch Cycles triangle barycentric convention to match Embree/OptiX
Simplifies intersection code a little and slightly improves precision regarding
self intersection.

The parametric texture coordinate in shader nodes is still the same as before
for compatibility.
2022-07-27 21:03:33 +02:00
Brecht Van Lommel
4cf6524731 Fix Cycles Metal build errors after recent changes
float8 is a reserved type in Metal, but is not implemented. So rename to
float8_t for now.

Also move back intersection handlers to kernel.metal, they can't be in the
class that encapsulates the other Metal kernel functions.
2022-07-26 00:17:37 +02:00
Brecht Van Lommel
f26aa186b2 Cleanup: remove __KERNEL_CPU__
This was tested in some places to check if code was being compiled for the
CPU, however this is only defined in the kernel. Checking __KERNEL_GPU__
always works.
2022-07-25 17:43:35 +02:00
Andrii Symkin
793d203139 Cycles: add math functions for float8
This patch adds required math functions for float8 to make it possible
using float8 instead of float3 for color data.

Differential Revision: https://developer.blender.org/D15525
2022-07-25 17:36:58 +02:00
Brecht Van Lommel
023eb2ea7c Cycles: more closely match some math and intersection operations in Embree
This helps with debugging, and gives a slightly closer match between CPU
and CUDA/HIP/Metal renders when it comes to ray tracing precision.
2022-07-25 13:27:40 +02:00
Brecht Van Lommel
5152c7c152 Cycles: refactor rays to have start and end distance, fix precision issues
For transparency, volume and light intersection rays, adjust these distances
rather than the ray start position. This way we increment the start distance
by the smallest possible float increment to avoid self intersections, and be
sure it works as the distance compared to be will be exactly the same as
before, due to the ray start position and direction remaining the same.

Fix T98764, T96537, hair ray tracing precision issues.

Differential Revision: https://developer.blender.org/D15455
2022-07-15 18:46:24 +02:00
Michael Jones
da4ef05e4d Cycles: Apple Silicon optimization to specialize intersection kernels
The Metal backend now compiles and caches a second set of kernels which are
optimized for scene contents, enabled for Apple Silicon.

The implementation supports doing this both for intersection and shading
kernels. However this is currently only enabled for intersection kernels that
are quick to compile, and already give a good speedup. Enabling this for
shading kernels would be faster still, however this also causes a long wait
times and would need a good user interface to control this.

M1 Max samples per minute (macOS 13.0):

                    PSO_GENERIC  PSO_SPECIALIZED_INTERSECT  PSO_SPECIALIZED_SHADE

barbershop_interior       83.4	            89.5                   93.7
bmw27                   1486.1	          1671.0                 1825.8
classroom                175.2	           196.8                  206.3
fishy_cat                674.2	           704.3                  719.3
junkshop                 205.4	           212.0                  257.7
koro                     310.1	           336.1                  342.8
monster                  376.7	           418.6                  424.1
pabellon                 273.5	           325.4                  339.8
sponza                   830.6	           929.6                 1142.4
victor                    86.7              96.4                   96.3
wdas_cloud               111.8	           112.7                  183.1

Code contributed by Jason Fielder, Morteza Mostajabodaveh and Michael Jones

Differential Revision: https://developer.blender.org/D14645
2022-07-15 13:40:04 +02:00
Andrii Symkin
f00d9e80ae Cycles: add more math functions for float4
Add more math functions for float4 to make them on par with float3 ones. It
makes it possible to change the types of float3 variables to float4 without
additional work.

Differential Revision: https://developer.blender.org/D15318
2022-06-30 16:25:21 +02:00
Xavier Hallade
a02992f131 Cycles: Add support for rendering on Intel GPUs using oneAPI
This patch adds a new Cycles device with similar functionality to the
existing GPU devices.  Kernel compilation and runtime interaction happen
via oneAPI DPC++ compiler and SYCL API.

This implementation is primarly focusing on Intel® Arc™ GPUs and other
future Intel GPUs.  The first supported drivers are 101.1660 on Windows
and 22.10.22597 on Linux.

The necessary tools for compilation are:
- A SYCL compiler such as oneAPI DPC++ compiler or
  https://github.com/intel/llvm
- Intel® oneAPI Level Zero which is used for low level device queries:
  https://github.com/oneapi-src/level-zero
- To optionally generate prebuilt graphics binaries: Intel® Graphics
  Compiler All are included in Linux precompiled libraries on svn:
  https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for
  Windows precompiled binaries but for the graphics compiler, available
  as "Intel® Graphics Offline Compiler for OpenCL™ Code" from
  https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html,
  for which path can be set as OCLOC_INSTALL_DIR.

Being based on the open SYCL standard, this implementation could also be
extended to run on other compatible non-Intel hardware in the future.

Reviewed By: sergey, brecht

Differential Revision: https://developer.blender.org/D15254

Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com>
Co-authored-by: Stefan Werner <stefan.werner@intel.com>
2022-06-29 12:58:04 +02:00
Sayak Biswas
abfa09752f Cycles: enable Vega GPU/APU support
Enables Vega and Vega II GPUs as well as Vega APU, using changes in HIP code
to support 64-bit waves and a new HIP SDK version.

Tested with Radeon WX9100, Radeon VII GPUs and Ryzen 7 PRO 5850U with Radeon
Graphics APU.

Ref T96740, T91571

Differential Revision: https://developer.blender.org/D15242
2022-06-28 18:35:43 +02:00
Andrii Symkin
c2a2f3553a Cycles: unify math functions names
This patch unifies the names of math functions for different data types and uses
overloading instead. The goal is to make it possible to swap out all the float3
variables containing RGB data with something else, with as few as possible
changes to the code. It's a requirement for future spectral rendering patches.

Differential Revision: https://developer.blender.org/D15276
2022-06-23 15:02:53 +02:00
Brecht Van Lommel
2c1bffa286 Cleanup: add verbose logging category names instead of numbers
And use them more consistently than before.
2022-06-17 14:08:14 +02:00
Brecht Van Lommel
33f5e8f239 Cycles: load 8 bit image textures as half float for some color spaces
For non-raw, non-sRGB color spaces, always use half float even if that uses
more memory. Otherwise the precision loss from conversion to scene linear or
sRGB (as natively understood by the texture sampling) can be too much.

This also required a change to do alpha association ourselves instead of OIIO,
because in OIIO alpha multiplication happens before conversion to half float
and that gives too much precision loss.

Ref T68926
2022-06-02 18:04:38 +02:00
Brecht Van Lommel
712b0496c1 Merge branch 'blender-v3.2-release' 2022-05-27 20:22:12 +02:00
Brecht Van Lommel
967f96ee2e Fix T98270: Cycles shows black with color values > 65K in GPU render
After recent changes to Nishita sky to clamp negative colors, the pixels ended
up a bit brighter which lead to them exceeding the half float max value. The
CUDA float to half function seems to need clamping.
2022-05-27 20:14:15 +02:00
Patrick Mours
a8c81ffa83 Cycles: Add half precision float support for volumes with NanoVDB
This patch makes it possible to change the precision with which to
store volume data in the NanoVDB data structure (as float, half, or
using variable bit quantization) via the previously unused precision
field in the volume data block.
It makes it possible to further reduce memory usage during
rendering, at a slight cost to the visual detail of a volume.

Differential Revision: https://developer.blender.org/D10023
2022-05-23 19:08:01 +02:00
Campbell Barton
eb837ba17e Cleanup: format 2022-05-05 17:33:43 +10:00
Hallam Roberts
82df48227b Nodes: Add general Combine/Separate Color nodes
Inspired by D12936 and D12929, this patch adds general purpose
"Combine Color" and "Separate Color" nodes to Geometry, Compositor,
Shader and Texture nodes.
- Within Geometry Nodes, it replaces the existing "Combine RGB" and
  "Separate RGB" nodes.
- Within Compositor Nodes, it replaces the existing
  "Combine RGBA/HSVA/YCbCrA/YUVA" and "Separate RGBA/HSVA/YCbCrA/YUVA"
  nodes.
- Within Texture Nodes, it replaces the existing "Combine RGBA" and
  "Separate RGBA" nodes.
- Within Shader Nodes, it replaces the existing "Combine RGB/HSV" and
  "Separate RGB/HSV" nodes.

Python addons have not been updated to the new nodes yet.

**New shader code**
In node_color.h, color.h and gpu_shader_material_color_util.glsl,
missing methods hsl_to_rgb and rgb_to_hsl are added by directly
converting existing C code. They always produce the same result.

**Old code**
As requested by T96219, old nodes still exist but are not displayed in
the add menu. This means Python scripts can still create them as usual.
Otherwise, versioning replaces the old nodes with the new nodes when
opening .blend files.

Differential Revision: https://developer.blender.org/D14034
2022-05-04 18:44:03 +02:00
Brecht Van Lommel
3558f565f1 Fix T97498, T97651: crash in Cycles with TBB 2021 after recent changes 2022-04-28 00:24:13 +02:00
Brecht Van Lommel
029b0df81a Fix Cycles blackbody shader not taking into account OpenColorIO config
Keep the existing Rec.709 fit and convert to other colorspace if needed, it
seems accurate enough in practice, and keeps the same performance for the
default case.
2022-04-18 19:14:34 +02:00
Brecht Van Lommel
41b3feea85 Fix Cycles build error with latest TBB after recent changes
From changes in 869a46df29, ref D14454
2022-04-18 18:49:35 +02:00
Campbell Barton
4b5195a9d7 Cleanup: clang-format 2022-04-13 13:45:42 +10:00
Michael Jones
869a46df29 Cycles fp consistency for Apple Silicon CPUs
Propagate the fp settings from the main thread to all the worker threads (the fp settings includes the FZ settings among other things) - this guarantees consistency in execution of floating point math regardless if its executed in tbb thread arena or on main thread

Add FZ mode to arm64/aarch64 in parallel to the way its been done on intel processors, currently compiling for arm target does not set this mode at all, hence potentially runs slower and with possible results mismatch with intel x86.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D14454
2022-04-12 19:43:47 +01:00
Brecht Van Lommel
0de0950ad5 Cycles: various Linux build fixes related to Hydra render delegate
* Add missing GLEW and hgiGL libraries for Hydra
* Fix wrong case sensitive include
* Fix link errors by adding external libs to static Hydra lib
* Work around weird Hydra link error with MAX_SAMPLES
* Use Embree by default for Hydra
* Sync external libs code with standalone
* Update version number to match Blender
* Remove unneeded CLEW/GLEW from test executable

None of this should affect Cycles in Blender.

Ref T96731
2022-04-07 19:52:53 +02:00
Olivier Maury
1fb0247497 Cycles: approximate shadow caustics using manifold next event estimation
This adds support for selective rendering of caustics in shadows of refractive
objects. Example uses are rendering of underwater caustics and eye caustics.

This is based on "Manifold Next Event Estimation", a method developed for
production rendering. The idea is to selectively enable shadow caustics on a
few objects in the scene where they have a big visual impact, without impacting
render performance for the rest of the scene.

The Shadow Caustic option must be manually enabled on light, caustic receiver
and caster objects. For such light paths, the Filter Glossy option will be
ignored and replaced by sharp caustics.

Currently this method has a various limitations:

* Only caustics in shadows of refractive objects work, which means no caustics
  from reflection or caustics that outside shadows. Only up to 4 refractive
  caustic bounces are supported.
* Caustic caster objects should have smooth normals.
* Not currently support for Metal GPU rendering.

In the future this method may be extended for more general caustics.

TECHNICAL DETAILS

This code adds manifold next event estimation through refractive surface(s) as a
new sampling technique for direct lighting, i.e. finding the point on the
refractive surface(s) along the path to a light sample, which satisfies Fermat's
principle for a given microfacet normal and the path's end points. This
technique involves walking on the "specular manifold" using a pseudo newton
solver. Such a manifold is defined by the specular constraint matrix from the
manifold exploration framework [2]. For each refractive interface, this
constraint is defined by enforcing that the generalized half-vector projection
onto the interface local tangent plane is null. The newton solver guides the
walk by linearizing the manifold locally before reprojecting the linear solution
onto the refractive surface. See paper [1] for more details about the technique
itself and [3] for the half-vector light transport formulation, from which it is
derived.

[1] Manifold Next Event Estimation
Johannes Hanika, Marc Droske, and Luca Fascione. 2015.
Comput. Graph. Forum 34, 4 (July 2015), 87–97.
https://jo.dreggn.org/home/2015_mnee.pdf

[2] Manifold exploration: a Markov Chain Monte Carlo technique for rendering
scenes with difficult specular transport Wenzel Jakob and Steve Marschner.
2012. ACM Trans. Graph. 31, 4, Article 58 (July 2012), 13 pages.
https://www.cs.cornell.edu/projects/manifolds-sg12/

[3] The Natural-Constraint Representation of the Path Space for Efficient
Light Transport Simulation. Anton S. Kaplanyan, Johannes Hanika, and Carsten
Dachsbacher. 2014. ACM Trans. Graph. 33, 4, Article 102 (July 2014), 13 pages.
https://cg.ivd.kit.edu/english/HSLT.php

The code for this samping technique was inserted at the light sampling stage
(direct lighting). If the walk is successful, it turns off path regularization
using a specialized flag in the path state (PATH_MNEE_SUCCESS). This flag tells
the integrator not to blur the brdf roughness further down the path (in a child
ray created from BSDF sampling). In addition, using a cascading mechanism of
flag values, we cull connections to caustic lights for this and children rays,
which should be resolved through MNEE.

This mechanism also cancels the MIS bsdf counter part at the casutic receiver
depth, in essence leaving MNEE as the only sampling technique from receivers
through refractive casters to caustic lights. This choice might not be optimal
when the light gets large wrt to the receiver, though this is usually not when
you want to use MNEE.

This connection culling strategy removes a fair amount of fireflies, at the cost
of introducing a slight bias. Because of the selective nature of the culling
mechanism, reflective caustics still benefit from the native path
regularization, which further removes fireflies on other surfaces (bouncing
light off casters).

Differential Revision: https://developer.blender.org/D13533
2022-04-01 17:45:39 +02:00