when ray exceeds `max_bounce`, we do not allocate any closure at
intersection. However, Ray Portal BSDF still added `SD_BSDF` flag,
resulting in undefined behavior in
`integrate_surface_bsdf_bssrdf_bounce()`.
This part of code was similar to Transparent BSDF, however, Transparent
closure was still allocated in this case.
To fix the undefined behavior, add `SD_BSDF` flag only when the Ray
Portal closure was allocated.
This PR contains optimisations and a general tidy-up of the MetalRT backend.
- Currently `scene_intersect` is used for both normal and (opaque) shadow rays, however the usage patterns are different enough to warrant specialisation. Shadow intersection tests (flagged with `PATH_RAY_SHADOW_OPAQUE`) only need a bool result, but need a larger "self" payload in order to exclude hits against target lights. By specialising we can minimise the payload size in each case (which is helps performance) and avoid some dynamic branching. This PR introduces a new `scene_intersect_shadow` function which is specialised in Metal, and currently redirects to `scene_intersect` in the other backends.
- Currently `scene_intersect_local` is implemented for worst-case payload requirements as demanded by `subsurface_disk` (where `max_hits` is 4). The random_walk case only demands 1 hit result which we can retrieve directly from the intersector object (rather than stashing it in the payload). By specialising, we significantly reduce the payload size for random_walk queries, which has a big impact on performance. Additionally, we only need to use a custom intersection function for the first ray test in a random walk (for self-primitive filtering), so this PR forces faster `opaque` intersection testing for all but the first random walk test.
- Currently `scene_intersect_volume` has a lot of redundant code to handle non-triangle primitives despite volumes only being enclosed by trimeshes. This PR removes this code.
Additionally, this PR tidies up the convoluted intersection function linking code, removes some redundant intersection handlers, and uses more consistent naming of intersection functions.
On a M3 MacBook Pro, these changes give 2-3% performance increase on typical scenes with opaque trimesh materials (e.g. barbershop, classroom junkshop), but can give over 15% performance increase for certain scenes using random walk SSS (e.g. monster).
Pull Request: https://projects.blender.org/blender/blender/pulls/121397
This is an implementation of thin film iridescence in the Principled BSDF based on "A Practical Extension to Microfacet Theory for the Modeling of Varying Iridescence".
There are still several open topics that are left for future work:
- Currently, the thin film only affects dielectric Fresnel, not metallic. Properly specifying thin films on metals requires a proper conductive Fresnel term with complex IOR inputs, any attempt of trying to hack it into the F82 model we currently use for the Principled BSDF is fundamentally flawed. In the future, we'll add a node for proper conductive Fresnel, including thin films.
- The F0/F90 control is not very elegantly implemented right now. It fundamentally works, but enabling thin film while using a Specular Tint causes a jump in appearance since the models integrate it differently. Then again, thin film interference is a physical effect, so of course a non-physical tweak doesn't play nicely with it.
- The white point handling is currently quite crude. In short: The code computes XYZ values of the reflectance spectrum, but we'd need the XYZ values of the product of the reflectance spectrum and the neutral illuminant of the working color space. Currently, this is addressed by just dividing by the XYZ values of the illuminant, but it would be better to do a proper chromatic adaptation transform or to use the proper reference curves for the working space instead of the XYZ curves from the paper.
Pull Request: https://projects.blender.org/blender/blender/pulls/118477
This PR fixes the (currently unused) scene-based selective feature compilation macros. These feature based macros haven't been used for a few years, and enabling them currently results in compilation errors.
The only functional change in this PR is in geom/primitive.h where undef-ing `__HAIR__` had exposed an inconsistency in how pointcloud attributes were being fetched. Using the more general `primitive_surface_attribute_float4` (instead of `curve_attribute_float4`) fixed a compilation error that occurred when rendering pointcloud unit test scenes with adaptive compilation enabled.
Pull Request: https://projects.blender.org/blender/blender/pulls/121216
Transport rays that enter to another location in the scene, with
specified ray position and normal. This may be used to render portals
for visual effects, and other production rendering tricks.
This acts much like a Transparent BSDF. Render passes are passed
through, and this is affected by light path max transparent bounces.
Pull Request: https://projects.blender.org/blender/blender/pulls/114386
Clamp some of the inputs of the Glossy BSDF, Glass BSDF, Sheen BSDF,
and Subsurface Scattering nodes to improve consistency between render
engines and to avoid unexpected results.
* Clamp roughness to 0..1
* Clamp subsurface radius to 0..inf
* Clamp colors to 0..inf
Pull Request: https://projects.blender.org/blender/blender/pulls/120390
This replaces the fixed Tangent input in BsdfNode::compile
with a custom input.
This is done because very few nodes actually use the tangent input
and it would be better to have this slot available for other inputs
on different nodes in the future.
Pull Request: https://projects.blender.org/blender/blender/pulls/119042
by ensuring `KernelLight.lightgroup` is properly assigned in
`device_update_light()`. The value is later retrieved via
`lamp_lightgroup(kg, lamp)`.
`light_manager->device_update()` is called after
`background->device_update()`, so the background light group should
already been assigned when `device_update_lights()` is called.
Pull Request: https://projects.blender.org/blender/blender/pulls/120714
use available `film_pass_pixel_render_buffer()` to access the pointer
to the render buffer.
For shadow state, a similar function `film_pass_pixel_render_buffer_shadow()`
is created, because `shadow_path` instead of `path` is needed.
it is difficult to keep in mind when MIS weight is needed, better to
handle this logic in the lower-level functions.
This reduces code duplication in many places.
This is a regression since fdc2962beb
The size of state can not be different between CPU and GPU.
This change replaces compile-time condition with a kernel feature
check, which solves the render regression on AMD Metal. It also
minimizes the state size on other GPUs when Light Tree is disabled.
Pull Request: https://projects.blender.org/blender/blender/pulls/120476
This is a regression in 4.1, caused by 36e603c430.
For unbiased MIS weight in light tree, we should use sd->N for
mis_origin_n, since sc->N is not available in NEE.
The change also makes it so we do not sample lights below sd->N even
when bump map correction is disabled. This diverges from the original
idea of giving full control to artists, but ensures the internal math
is happy.
Pull Request: https://projects.blender.org/blender/blender/pulls/120216
fdc2962beb indirectly introduced a change
in inlining (light_tree_pdf started getting inlined) that led to a 5-10%
drop in performance for most scenes.
Dropping the noinline keyword for oneAPI device recovers it.
It however brings another performance regression to MNEE and Raytrace
kernels, that we'll look into separately.
The Perlin noise algorithms suffer from precision issues when a coordinate
is greater than about 250000.
To fix this the Perlin noise texture is repeated every 100000 on each axis.
This causes discontinuities every 100000, however at such scales this
usually shouldn't be noticeable.
Pull Request: https://projects.blender.org/blender/blender/pulls/119884
it is not clear from which point the `cos_theta_u` should be computed in
volume segment, so the original implementation was mixing the closest point
and the point where the minimal angle is formed.
Use the closest point on segment as a conservative measure.
Pull Request: https://projects.blender.org/blender/blender/pulls/119965
in the test scene `all_light_types_in_volume.blend`, `theta - theta_o -
theta_u` is slightly above the threshold. Even if we do a strict check
with `acos` on the failing cases, it will go to the other branch and
deliver a result which is also 1.0f. Better to relax the threshold.
During the some of the shading for volumetrics, Cycles would try to
write to a variable that does not exist if the device has the
light tree disabled.
At the moment this only impacts AMD GPUs with the Metal backend.
Pull Request: https://projects.blender.org/blender/blender/pulls/119906
The same random number was used when sampling from the volume segment
and from the direct scattering position, causing correlation issues with
light tree.
To solve this problem, we ensure the same light is picked for
volume segment/direct scattering, equiangular/distance sampling by
sampling the light tree only once in volume segment. From the direct
scattering position in volume, we sample a position on the picked light
as usual. If sampling from the light tree fails, we continue with
indirect scattering.
For unbiased MIS weight for forward sampling, we retrieve the `P`, `D`
and `t` used in volume segment for traversing the light tree.
The main changes are:
1. `light_tree_sample()` and `light_distribution_sample()` now only pick
lights. Sampling a position on light is done separately via
`light_sample()`.
2. `light_tree_sample()` is now only called only once from volume
segment. For direct lighting we call `light_sample()`.
3. `light_tree_pdf()` now has a template `<in_volume_segment>`.
4. A new field `emitter_id` is added to struct `LightSample`, which just
stores the picked emitter index.
5. Additional field `previous_dt = ray->tmax - ray->tmin` is added to
`state->ray`, because we need this quantity for computing the pdf.
6. Distant/Background lights are also picked by light tree in volume
segment now, because we have no way to pick them afterwards. The direct
sample event for these lights will be handled by
`VOLUME_SAMPLE_DISTANCE`.
7. Original paper suggests to use the maximal importance, this results
in very poor sampling probability for distant and point lights therefore
excessive noise. We have a minimal importance for surface to balance, we
could do the same for volume but I do not want to spend much time on
this now. Just doing `min_importance = 0.0f` seems to do the job
okayish. This way we still won't sample the light with zero
`max_importance`.
The current solution might perform worse with distance sampling, because
the light tree measure is biased towards equiangular sampling. However,
it is difficult to perform MIS between equiangular and distance sampling
if different lights are picked for each method. This is something we can
look into in the future if proved to be a serious regression.
Pull Request: https://projects.blender.org/blender/blender/pulls/119389
When using light linking with the light tree, the root index of a
mesh light subtree can be 0. The current code assumed this wasn't
possible, and as such it caused rendering issues, specifically the
incorrect computation of the PDF of certain mesh lights during
forward path tracing.
So we adjust the code to allow mesh light subtree root node
indices of 0.
This was worked on by Alaska, Sergey, and Weizhen
Pull Request: https://projects.blender.org/blender/blender/pulls/119770
By restricting the sample range along the ray to the valid segment.
Supports
**Mesh Light**
- [x] restrict the ray segment to the side with MIS
**Area Light**
- [x] when the spread is zero, find the intersection of the ray and the bounding box/cylinder of the rectangle/ellipse area light beam
- [x] when the spread is non-zero, find the intersection of the ray and the minimal enclosing cone of the area light beam
*note the result is also unbiased when we just consider the cone from the sampled point in volume segment. Far away from the light source it's less noisy than the current solution, but near the light source it's much noisier. We have to restrict the sample region on the area light to the part that lits the ray then, I haven't tried yet to see if it would be less noisy.*
**Point Light**
- [x] the complete ray segment should be valid.
**Spot Light**
- [x] intersect the ray with the spot light cone
- [x] support non-zero radius
Pull Request: https://projects.blender.org/blender/blender/pulls/119438
The original bug report was that the Glossy Toon BSDF behaves incorrectly
when mixed with other closures.
The underlying issue here was that the eval function didn't check whether
the reflection angle is inside the valid cone and always returned its PDF,
which is very high compared to e.g. the diffuse closure's PDF for small
sizes (since the cone is supposed to be quite tight) and therefore breaks
MIS mixing.
However, while looking into this, I found a number of other issues, and so
this commit also contains several other changes to the Toon BSDFs:
- The angle that was used to compute the intensity wasn't the actual angle
between the vectors. From what I can see, the formula that was used goes
back all the way to the initial commit 12 years ago, so this probably was
something that happened to work with one particular cone sampling method.
Now, however, it caused weird asymmetric highlights, so replace it with
the actual angle (which we already compute anyways).
- Setting size to zero caused the BSDF to go black, so clamp to 1e-5.
- The code was overall a bit repetitive, so I've cleaned it up a bit.
The code to compute differentials mixed up the camera-space locations
of the raster coordinate and the camera itself, which caused the dP
differential to be set even when the ray origin is always the same.
This commit fixes that, reorganizes the code so that the Px/Py are
no longer used for both values to avoid future confusion, and skips
some unnecessary calculations stereo rendering isn't being used.
For spherical spot light, when the shading point is close to the light
source, we switch to sampling the light spread instead of the visible
cone from the shading point. This has the benefit of less noise when the
spread is small.
However, the light spread sampling was not considering non-uniform
object scaling, where the actual spread might be different.
This patch switches sampling method only when the smallest enclosing
spread cone is smaller than the visible cone from the shading point.
An alternative method would be to compute the actual solid angle of the
scaled cone, and sample from the scaled cone. However, that involves
ray transformation and modifying the sampling pdf and angle. Since
non-uniform scaling is rather a niche case, it's probably not worth the
computation effort.
Pull Request: https://projects.blender.org/blender/blender/pulls/119661
which resulted in bias when self intersection is excluded in forward scattering.
Below is a comparison using principled BSDF with emission. NEE and MIS were much brighter.
Pull Request: https://projects.blender.org/blender/blender/pulls/119440
Ref: #118534
turns out `in_volume_segment` does need to be checked. If the ray origin
lies on the wrong side of the mesh light, part of the ray could still be
lit by the other side, so the sample should not be considered invalid.
Pull Request: https://projects.blender.org/blender/blender/pulls/119529
Global built-ins appear to not work on AMD cards.
Also add a tweak to avoid a performance regression, similar
to what was done before. Disable adaptive subdivision kernel
code if not used.
Pull Request: https://projects.blender.org/blender/blender/pulls/119175