the object in volume stack should be used instead of `isect.object`.
NOTE: this solution does not work for overlapping volumes. But since
light linking of overlapping volumes did not work before, it should be
fine to implement this partial solution. We read the bottom of the stack
instead of the top to avoid looping through the entire stack.
Pull Request: https://projects.blender.org/blender/blender/pulls/124341
This PR adds a tag to prevent `kernel_data.integrator.seed` being baked into Metal pipelines as a specialisation constant when full kernel specialisation is enabled. This stops new pipelines from being continually compiled when animation is playing in live viewport mode.
Pull Request: https://projects.blender.org/blender/blender/pulls/124349
The function `direction_to_<some projection model>` computes the inverse of `<some projection model>_to_direction`.
Some of these functions had a bug where they mirror the x-axis, and some of them could be simplified.
I added round-trip tests for all of them.
This MR might change the behavior of the renderer when using equiangular_cubemap_face_to_direction:
I normalized the result vector. I looked at the usages and I think it's normalized later anyways, but someone else should probably verify that this doesn't cause issues.
Pull Request: https://projects.blender.org/blender/blender/pulls/123932
Setting this option to a value above zero replaces the lambertian Diffuse term
with the modified energy-preserving Oren-Nayar BSDF, which matches the OpenPBR
behavior.
Pull Request: https://projects.blender.org/blender/blender/pulls/123616
The function direction_to_fisheye_lens_polynomial computes the inverse of
fisheye_lens_polynomial_to_direction.
Previously the function worked almost correctly if all parameters except k_0
and k_1 were zero (in that case it was correct except for flipping the x-axis).
I replaced the fixed-point iteration (?) by Newton's method and implemented a
test to make sure it works correctly with a wider range of parameter sets.
Pull Request: https://projects.blender.org/blender/blender/pulls/123737
The function direction_to_fisheye_lens_polynomial computes the inverse of
fisheye_lens_polynomial_to_direction.
Previously the function worked almost correctly if all parameters except k_0
and k_1 were zero (in that case it was correct except for flipping the x-axis).
I replaced the fixed-point iteration (?) by Newton's method and implemented a
test to make sure it works correctly with a wider range of parameter sets.
Pull Request: https://projects.blender.org/blender/blender/pulls/123737
This is because with the addition of new features to Cycles, these GPUs
experienced significant performance regressions and bugs, all stemming
from bugs in the Metal GPU driver/compiler. The only reasonable way to
work around these issues was to disable parts of Cycles code on
these GPUs to avoid the driver/compiler bugs.
This resulted in increased development time maintaining these platforms
while being unable to deliver feature parity with other
GPU backends.
It has been decided that this development time is better spent
maintaining platforms that are still actively maintained by
hardware/software vendors, and so AMD and Intel GPU support will be
removed from the Metal backend for Cycles.
Pull Request: https://projects.blender.org/blender/blender/pulls/123551
when computing coefficients in volume, the volume density of the object
at the top of the stack is used, which leads to wrong result if
overlapping volumes have different scales.
This commit fixes the problem by pre-multiplying the volume density per
object when evaluating the shader.
Pull Request: https://projects.blender.org/blender/blender/pulls/123733
The original paper only considers the minimal distance of the cluster to
the ray, not the interval length, resulting in low weight for distant
lights that have large influence over a long distance.
This commit modifies the measure by considering `theta_b - theta_a` for
local lights and the ray length `t` for distant lights.
Pull Request: https://projects.blender.org/blender/blender/pulls/123537
Precompiled Cycles kernels make up a considerable fraction of the total size of
Blender builds nowadays. As we add more features and support for more
architectures, this will only continue to increase.
However, since these kernels tend to be quite compressible, we can save a lot
of storage by storing them in compressed form and decompressing the required
kernel(s) during loading.
By using Zstandard compression with a high level, we can get decent compression
ratios (~5x for the current kernels) while keeping decompression time low
(about 30ms in the worse case in my tests). And since we already require zstd
for Blender, this doesn't introduce a new dependency.
While the main improvement is to the size of the extracted Blender installation
(which is reduced by ~400-500MB currently), this also shrinks the download on
Windows, since .zip's deflate compression is less effective. It doesn't help on
Linux since we're already using .tar.xz there, but the smaller installed size
is still a good thing.
See #123522 for initial discussion.
Pull Request: https://projects.blender.org/blender/blender/pulls/123557
The Oren Nayer diffuse BSDF had a energy compensation term added in a
recent commit[1]. This energy compensation term used the colour input
in it's computation. The colour input was clamped in SVM, but not OSL,
resulting in differences between the two backends. This commit resolves
this issue by clamping the colour in the OSL script to match SVM.
[1] 5e40b9bb5c
Pull Request: https://projects.blender.org/blender/blender/pulls/123527
This patch implements a new Gabor noise node based on [1] but with the
improvements from [2] and the phasor formulation from [3].
We compare with the most popular existing implementation, that of OSL,
from the user's point of view:
- This implementation produces C1 continuous noise as opposed to the
non continuous OSL implementation, so it can be used for bump
mapping and is generally smother. This is achieved by windowing the
Gabor kernel using a Hann window.
- The Bandwidth input of OSL was hard-coded to 1 and was replaced with
a frequency input, which OSL hard codes to 2, since frequency is
more natural to control. This is even more true now that that Gabor
kernel is windowed as opposed to truncated, which means increasing
the bandwidth will just turn the Gaussian component of the Gabor
into a Hann window. While decreasing the bandwidth will eliminate
the harmonic from the Gabor kernel, which is the point of Gabor
noise.
- OSL had three discrete modes of operation for orienting the kernel.
Anisotropic, Isotropic, and a hybrid mode. While this implementation
provides a continuous Anisotropy parameter which users are already
familiar with from the Glossy BSDF node.
- This implementation provides not just the Gabor noise value, but
also its phase and intensity components. The Gabor noise value is
basically sin(phase) * intensity, but the phase is arguably more
useful since it does not suffer from the low contrast issues that
Gabor suffers from. While the intensity is useful to hide the
singularities in the phase.
- This implementation converges faster that OSL's relative to the
impulse count, so we fix the impulses count to 8 for simplicitly.
- This implementation does not implement anisotropic filtering.
Future improvements to the node includes implementing surface noise and
filtering. As well as extending the spectral control of the noise,
either by providing specialized kernels as was done in #110802, or by
providing some more procedural control over the frequencies of the
Gabor.
References:
[1]: Lagae, Ares, et al. "Procedural noise using sparse Gabor
convolution." ACM Transactions on Graphics (TOG) 28.3 (2009): 1-10.
[2]: Tavernier, Vincent, et al. "Making gabor noise fast and
normalized." Eurographics 2019-40th Annual Conference of the European
Association for Computer Graphics. 2019.
[3]: Tricard, Thibault, et al. "Procedural phasor noise." ACM
Transactions on Graphics (TOG) 38.4 (2019): 1-13.
Pull Request: https://projects.blender.org/blender/blender/pulls/121820
This multiscattering term comes from the OpenPBR specification and nicely
preserves energy while correctly modeling increased saturation at high
roughness.
Preparation for adding a diffuse roughness option to the Principled BSDF.
To me, the difference in output and computation seems small enough to
not need an enum for the old behavior.
Note that this also switches sampling to cosine-weighted, in my tests this
gives lower noise. I also checked doing MIS between cosine and uniform,
using the A term as a weight for how often to use cosine (since that term
is Lambertian diffuse), but always using cosine was better.
A nice consequence of that is that you don't get a huge noise jump when
going from 0.0 to 0.01 roughness.
Pull Request: https://projects.blender.org/blender/blender/pulls/123345
* Always define root directories in LIBDIR even when not needed,
to silence some warnings.
* Only show warnings about not finding libs when oneAPI is enabled.
* Prefix message for context.
Light linking was never working correctly in volume segment with light
tree, because `sd->object` was not assigned, thus
`light_link_receiver_nee(kg, sd)` always returned `OBJECT_NONE`, causing
the light tree sample to fail. This problem was revealed by fdc2962beb
since now the same light is used for volume segment and volume.
Also ensure we don't sample position on the light if sampling from
volume segment is failed, by setting `emitter_id` to `EMITTER_NONE` in
such cases.
Pull Request: https://projects.blender.org/blender/blender/pulls/122999
Currently, during baking each pixel stores a seed input that comes from the
Blender side. This is only needed for vertex color baking, however -
for regular image baking, we can just as well hash the pixel coordinates.
Therefore, we can save some memory (4 byte per pixel) by splitting the seed
info out into a separate pass and only storing it when needed.
Pull Request: https://projects.blender.org/blender/blender/pulls/122806
This patch implements blue-noise dithered sampling as described by Nathan Vegdahl (https://psychopath.io/post/2022_07_24_owen_scrambling_based_dithered_blue_noise_sampling), which in turn is based on "Screen-Space Blue-Noise Diffusion of Monte Carlo Sampling Error via Hierarchical Ordering of Pixels"(https://repository.kaust.edu.sa/items/1269ae24-2596-400b-a839-e54486033a93).
The basic idea is simple: Instead of generating independent sequences for each pixel by scrambling them, we use a single sequence for the entire image, with each pixel getting one chunk of the samples. The ordering across pixels is determined by hierarchical scrambling of the pixel's position along a space-filling curve, which ends up being pretty much the same operation as already used for the underlying sequence.
This results in a more high-frequency noise distribution, which appears smoother despite not being less noisy overall.
The main limitation at the moment is that the improvement is only clear if the full sample amount is used per pixel, so interactive preview rendering and adaptive sampling will not receive the benefit. One exception to this is that when using the new "Automatic" setting, the first sample in interactive rendering will also be blue-noise-distributed.
The sampling mode option is now exposed in the UI, with the three options being Blue Noise (the new mode), Classic (the previous Tabulated Sobol method) and the new default, Automatic (blue noise, with the additional property of ensuring the first sample is also blue-noise-distributed in interactive rendering). When debug mode is enabled, additional options appear, such as Sobol-Burley.
Note that the scrambling distance option is not compatible with the blue-noise pattern.
Pull Request: https://projects.blender.org/blender/blender/pulls/118479
On a M3 MacBook Pro, this change increases the benchmark score by 8% (with classroom seeing a path-tracing speedup of 15%).
The integrator state is currently store using struct-of-arrays, with one array per field. Such fine grained separation can result in poor GPU cache utilisation in cases where multiple fields of the same parent struct are accessed together. This PR changes the layout of the `ray`, `isect`, `subsurface`, and `shadow_ray` structs so that the data is interleaved (per parent struct) instead of separate. To try and keep this change localised, I encapsulated the layout change by extending the integrator state access macros, however maybe we want to do this more explicitly? (e.g. by updating every bit of code that accesses these parts of the state). Feedback welcome.
Pull Request: https://projects.blender.org/blender/blender/pulls/122015
This fixes#69535 and #98930.
We use a equi-solid-angle sampling algorithm for rectangular area lights,
but it is not particularly robust for small area lights (either small
in general and/or small because it's being viewed from grazing angles).
The actual sampling part is fine since it just gets clamped into the
valid area anyways, and the difference isn't notable for small lights.
However, we also need to compute the solid angle to get the sampling PDF,
and that computation is quite sensitive to numerical issues for small
values.
Therefore, this commit adds a fallback path for small values, which instead
uses the classic equi-area sampling PDF term times the area-to-solid-angle
Jacobian term. This approximation assumes that all points on the light have
the same distance and angle to the sampling point, which is of course not
strictly the case, but it's close enough for small area lights and better
than failing altogether.
Pull Request: https://projects.blender.org/blender/blender/pulls/122323
Reformulates some terms in the equi-solid-angle rectangle sampling code to
handle small area lamps better, and allows for some rounding error in the
check whether the sampled position is inside the area light.
Pull Request: https://projects.blender.org/blender/blender/pulls/122323
In the original paper, the falloff inside `bcone.theta_e` is assumed to
be `pi/2`, which is too large for spot light and resulted in an
overestimation near the cone boundary.
To address this issue, attenuate the energy of a spot light using the
minimal possible angle formed by the light axis and the shading point
when traversing the light tree.
Ref: #122362
Pull Request: https://projects.blender.org/blender/blender/pulls/122667
Since the previous fix to properly support volumes and transparent objects
it became very easy to make it so the intersection loop takes all 1024
iterations to find intersections.
This change makes it so the number of intersection is limited by the max
number of volume/transparent bounces.
This should minimize possible performance impact of the previous fix.
Pull Request: https://projects.blender.org/blender/blender/pulls/122448
We can't do the optimization to shorten the ray when we might still need
to go through transparent surfaces or volumes to reach the light.
This issue was not light tree specific, however in the test file it was
more noticable because the light tree poorly handles some areas. This in
in turn causes MIS weights for forward path tracing to become higher,
which is where the error was.
Pull Request: https://projects.blender.org/blender/blender/pulls/122404
This enables scenes with all textures not fitting in GPU
memory to finally render. For scenes that are fitting,
no functional change or performance change is expected.
Pull Request: https://projects.blender.org/blender/blender/pulls/122385