`world_use_portal` is not needed anymore, now that we always add world
as object (b20b4218d5).
We now check if background light is enabled only in
`test_enabled_lights()`, depending on the sample settings.
Pull Request: https://projects.blender.org/blender/blender/pulls/144710
It works with the beta we are using to build Blender 4.5, but the official
release is a bit different. This fix was tested to work with OSL 1.14.7.
Thanks to Paul Zander for finding the OSL commit that led to this.
Pull Request: https://projects.blender.org/blender/blender/pulls/144715
This gives a ~1.2x speed-up on CPU and a ~1.5x speed-up on GPU (tested
with Metal on an M2 Ultra).
Individual samples are noisier, but equal-time renders are mostly
better.
Note that volume emission renders differently than before.
Pull Request: https://projects.blender.org/blender/blender/pulls/144451
It was discovered that when several different Intel dGPUs are present in
the system, the experimental L0 copy optimization does not work
correctly in the Intel driver, causing crashes in both the driver and
the Blender application. To avoid this situation and restore
functionality on these platforms, a workaround was added that disables
this extension when such a configuration is detected. In the future,
once this problem is fully fixed in all Intel drivers, the workaround
can be removed from the Blender source code to restore the performance
that was lost on multi-dGPU configurations because of it.
Pull Request: https://projects.blender.org/blender/blender/pulls/144262
Guide the probability of scattering in, or transmitting through, the
volume. Only applied to primary rays.
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
The distance sampling is mostly based on weighted delta tracking from
[Monte Carlo Methods for Volumetric Light Transport Simulation](http://iliyan.com/publications/VolumeSTAR/VolumeSTAR_EG2018.pdf).
The recursive Monte Carlo estimation of the Radiative Transfer Equation is
\[\langle L \rangle=\frac{\bar T(x\rightarrow y)}{\bar p(x\rightarrow y)}(L_e+\sigma_s L_s + \sigma_n L),\]
where \(\bar T(x\rightarrow y) = e^{-\bar\sigma\Vert x-y\Vert}\) is the
majorant transmittance between points \(x\) and \(y\), and
\(\bar p(x\rightarrow y) = \bar\sigma e^{-\bar\sigma\Vert x-y\Vert}\) is the
probability of sampling point \(y\) from point \(x\) following an
exponential distribution. Note that the ratio simplifies to
\(\bar T/\bar p = 1/\bar\sigma\).
At each recursive step, we randomly pick one of the two events
proportional to their weights:
* If \(\xi < \frac{\sigma_s}{\sigma_s+\vert\sigma_n\vert}\), we sample a
scatter event and evaluate \(L_s\).
* Otherwise, no real collision happens and we continue the recursive
process.
The emission \(L_e\) is evaluated at each step.
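A minimal single-channel sketch of this estimator, assuming a toy
homogeneous medium and illustrative helper names (the actual Cycles
kernel is spectral and structured differently):

```cpp
#include <cmath>
#include <random>

static constexpr int VOLUME_MAX_STEPS = 1024; /* hard-coded safety threshold */

struct Volume {
  float sigma_bar = 1.0f;                     /* majorant density */
  float sigma_t(float) const { return 0.8f; } /* toy extinction */
  float sigma_s(float) const { return 0.5f; } /* toy scattering */
  float L_e(float) const { return 0.1f; }     /* toy emission */
};

static float rng() /* uniform sample in [0, 1) */
{
  static std::mt19937 gen(0);
  static std::uniform_real_distribution<float> dist(0.0f, 1.0f);
  return dist(gen);
}

/* Stand-in for the recursive in-scattering estimate (phase + light sampling). */
static float sample_Ls(const Volume &, float) { return 1.0f; }

static float estimate_L(const Volume &v, float t_max)
{
  float t = 0.0f, weight = 1.0f, L = 0.0f;
  for (int i = 0; i < VOLUME_MAX_STEPS; i++) {
    /* Tentative collision from p(t) = sigma_bar * exp(-sigma_bar * t);
     * the ratio T_bar / p_bar then cancels to 1 / sigma_bar. */
    t += -std::log(1.0f - rng()) / v.sigma_bar;
    if (t >= t_max) {
      break; /* escaped the volume without a real collision */
    }
    const float sigma_s = v.sigma_s(t);
    const float sigma_n = v.sigma_bar - v.sigma_t(t); /* null density, can be < 0 */
    const float w_sum = sigma_s + std::fabs(sigma_n);

    /* Emission is accumulated at every step. */
    L += weight * v.L_e(t) / v.sigma_bar;

    if (rng() < sigma_s / w_sum) {
      /* Scatter event: evaluate L_s, weighted by sigma_s / (sigma_bar * P). */
      L += weight * (w_sum / v.sigma_bar) * sample_Ls(v, t);
      break;
    }
    /* Null collision: update the weight by sigma_n / (sigma_bar * P) and
     * continue; a negative sigma_n flips the sign (noisier, but unbiased). */
    weight *= std::copysign(w_sum, sigma_n) / v.sigma_bar;
  }
  return L;
}
```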
This also removes some unused volume settings from the UI:
* "Max Steps" is removed, because the step size is automatically specified
by the volume octree. There is a hard-coded threshold `VOLUME_MAX_STEPS`
to prevent numerical issues.
* "Homogeneous" is automatically detected during density evaluation
An option "Unbiased" is added to the UI. When enabled, densities above
the majorant are clamped.
Due to numerical issues this was creating many incorrect self-overlaps.
It was necessary for skipping empty regions, but is not anymore with the
volume octree approach.
Since we sample the same light for distance sampling and equiangular
sampling, the sample is invalid anyway, so just avoid sampling direct
light for distance sampling too.
This fits better with the way normal and displacement maps are typically
combined. Previously there was a mixing of displaced normal and undisplaced
tangent, which was broken behavior.
Additionally, the undisplaced_N and undisplaced_tangent attributes must now
always be used to get undisplaced coordinates. The regular N and tangent
attributes now always include displacement.
Ref #142022
Pull Request: https://projects.blender.org/blender/blender/pulls/143109
Modify shader update so we simplify the graphs first to determine the
kernel features, then load the kernels, and only then update data on the
device. This avoids errors due to mismatched kernels and shaders.
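A rough sketch of the reordered flow, with hypothetical stand-ins for
the Cycles types:

```cpp
#include <cstdint>
#include <vector>

/* Hypothetical stand-ins; the real logic lives in Cycles' scene/session
 * update code under different names. */
struct ShaderGraph {
  void simplify() { /* constant folding, dead-node removal, ... */ }
};

struct Shader {
  ShaderGraph graph;
  uint32_t kernel_features() const { return 0; /* e.g. BSDF/volume flags */ }
};

struct Device {
  void load_kernels(uint32_t /*features*/) {}
  void update_shader_data(const std::vector<Shader> & /*shaders*/) {}
};

void update_shaders(std::vector<Shader> &shaders, Device &device)
{
  uint32_t features = 0;
  for (Shader &shader : shaders) {
    shader.graph.simplify();              /* 1. simplify graphs first... */
    features |= shader.kernel_features(); /* ...to determine kernel features */
  }
  device.load_kernels(features);          /* 2. then load matching kernels */
  device.update_shader_data(shaders);     /* 3. only then update device data */
}
```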
Pull Request: https://projects.blender.org/blender/blender/pulls/144238
The Principled BSDF has a ton of inputs, and the previous SVM code just always
allocated stack space for all of them. This resulted in a lot of additional
NODE_VALUE_x SVM nodes, which slow down execution.
However, this is not really needed for two reasons:
- First, many inputs are only used conditionally. For example, if the
subsurface weight is zero, none of the other subsurface inputs are used.
- Many of the inputs have a "usual" value that they will have in most
materials, so if they happen to have that value we can just indicate that
by not allocating space for them.
This is a bit similar to the standard "pack the fixed value and provide
a stack offset if there's a link" pattern, except that the fixed value
is a constant in the code and we allocate a NODE_VALUE_x if a different
fixed value is used.
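Roughly, the kernel-side read under the two schemes looks like this
(hypothetical helpers; not the actual Cycles code):

```cpp
#include <cstdint>

constexpr uint32_t SVM_STACK_INVALID = 255; /* sentinel: not on the stack */

/* Standard pattern: the fixed value is packed into the node's data words,
 * so it always costs space in the SVM byte code. */
float read_packed(const float *stack, uint32_t offset, float packed_fixed_value)
{
  return (offset == SVM_STACK_INVALID) ? packed_fixed_value : stack[offset];
}

/* "Usual value" pattern: the default lives as a literal in the kernel code.
 * The compiler only emits a NODE_VALUE_x (and a stack slot) when the input
 * is linked or its fixed value differs from the usual one. */
float read_usual_ior(const float *stack, uint32_t offset)
{
  return (offset == SVM_STACK_INVALID) ? 1.45f /* usual IOR */ : stack[offset];
}
```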
Therefore, this PR re-implements the parameter packing in a more efficient way:
- If we can determine that a component is disabled, all conditional inputs are
disconnected (to avoid generating upstream nodes).
- If we can determine that a component is disabled, we skip allocating all
conditional inputs on the stack.
- The inputs for which a reasonable "usual" value exists are changed to
respect that, and to only be allocated if they differ.
- param1 and param2 (which are fixed-value-packed as on all BSDF nodes) are
used to store IOR and roughness, which have a decent chance to be fixed
values.
- The parameter packing is more aggressive about using uchar4, which makes it
possible to get rid of two SVM nodes while still storing the same inputs
(see the sketch after this list).
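For illustration, a uchar4-style packing of four 8-bit stack offsets into
one 32-bit data word might look like this (hypothetical helpers, not the
Cycles API):

```cpp
#include <cstdint>

/* Four 8-bit stack offsets share one 32-bit word of SVM node data. */
inline uint32_t pack_uchar4(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
{
  return uint32_t(x) | (uint32_t(y) << 8) | (uint32_t(z) << 16) | (uint32_t(w) << 24);
}

inline void unpack_uchar4(uint32_t v, uint8_t &x, uint8_t &y, uint8_t &z, uint8_t &w)
{
  x = v & 0xff;
  y = (v >> 8) & 0xff;
  z = (v >> 16) & 0xff;
  w = (v >> 24) & 0xff;
}
```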
The result is a considerable speedup in scenes that make heavy use of the
Principled BSDF:
| Scene | CPU speedup | OptiX speedup |
| --- | --- | --- |
| attic | 5% | 9% |
| bistro | 5% | 8% |
| junkshop | 5% | 10% |
| monster | 3% | 4% |
| spring | 1% | 6% |
Pull Request: https://projects.blender.org/blender/blender/pulls/143910
Ever since the OptiX 8 update in Blender 4.5, the minimum GPU driver
requirement to use OptiX has increased to 535 or newer.
This commit updates the minimum GPU driver requirement listed in the UI
to reflect this.
Pull Request: https://projects.blender.org/blender/blender/pulls/143917
This PR adds a new `fresnel_conductor_polarized` function, which calculates reflectance and phase shift (if requested) for both parallel and perpendicular polarized light. This is needed for applying thin film iridescence to conductors (see !141131).
For consistency, this PR also makes `fresnel_conductor` call `fresnel_conductor_polarized` instead of using a fast approximation of the Fresnel equations that is inaccurate at lower n and k values. This will change the output of some Metallic BSDF renders using Physical Conductor and prevent discrepancies when enabling thin film iridescence.
I didn't do any rigorous performance testing, but from timing the functions outside of Blender, `fresnel_conductor_polarized` is significantly slower than the approximation, between 1.5x and 3x depending on the compiler. This makes sense because it has three square roots and the approximation has none. In some informal tests with metallic_multiggx_physical.blend modified to have more spheres, the new renders took around 1-2% longer on both CPU and GPU.
There are some avoidable inefficiencies in this approach of just calling `fresnel_conductor_polarized`:
- one of the three square roots could be saved since `fresnel_conductor` never needs the phase shift and there are simplifications possible when only calculating the reflectance
- there are several unnecessary multiplications by 1.0 since `fresnel_conductor` uses relative IOR and `fresnel_conductor_polarized` doesn't, though those could get optimized out if inlined
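For reference, a minimal sketch of the polarized conductor Fresnel
computation, using `std::complex` as a stand-in for the kernel's manual
complex math (the actual Cycles signature and sign conventions may
differ):

```cpp
#include <cmath>
#include <complex>

struct FresnelPolarized {
  float R_s, R_p;     /* reflectance, perpendicular / parallel */
  float phi_s, phi_p; /* phase shift, perpendicular / parallel */
};

FresnelPolarized fresnel_conductor_polarized(float cos_theta_i, float n, float k)
{
  const std::complex<float> eta(n, -k); /* complex IOR, n - i*k convention */
  const float sin2_theta_i = 1.0f - cos_theta_i * cos_theta_i;

  /* Complex Snell's law: cos(theta_t) = sqrt(1 - sin^2(theta_i) / eta^2). */
  const std::complex<float> cos_theta_t =
      std::sqrt(1.0f - sin2_theta_i / (eta * eta));

  /* Amplitude reflection coefficients for s- and p-polarized light. */
  const std::complex<float> r_s =
      (cos_theta_i - eta * cos_theta_t) / (cos_theta_i + eta * cos_theta_t);
  const std::complex<float> r_p =
      (eta * cos_theta_i - cos_theta_t) / (eta * cos_theta_i + cos_theta_t);

  /* Reflectance is |r|^2; the phase shift is the complex argument. */
  return {std::norm(r_s), std::norm(r_p), std::arg(r_s), std::arg(r_p)};
}
```

`fresnel_conductor` can then obtain the unpolarized reflectance as the
average \((R_s + R_p)/2\).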
Pull Request: https://projects.blender.org/blender/blender/pulls/143903
This replaces `stack_assign` with `stack_assign_if_linked`, which should save a few SVM nodes for constant parameters.
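For illustration, the change in a node's `compile()` function looks
roughly like this (`ExampleNode` and its socket are made up;
`stack_assign`, `stack_assign_if_linked`, and `add_node` follow Cycles'
SVM compiler conventions):

```cpp
void ExampleNode::compile(SVMCompiler &compiler)
{
  ShaderInput *scale_in = input("Scale");

  /* Before: stack_assign() always reserves a slot, so a constant "Scale"
   * still costs an extra NODE_VALUE_F node to fill that slot. */
  /* uint scale_offset = compiler.stack_assign(scale_in); */

  /* After: unlinked sockets get SVM_STACK_INVALID and no slot; the constant
   * is packed into the node itself and read directly by the kernel. */
  uint scale_offset = compiler.stack_assign_if_linked(scale_in);
  compiler.add_node(NODE_EXAMPLE, scale_offset, __float_as_int(scale));
}
```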
Running benchmarks (all scenes in the benchmark repo, 3 runs, median value for each) shows a 1.0% improvement on CPU and 1.5% on OptiX. Not huge, but fairly consistent (all between -0.2% and 3.0%).
Pull Request: https://projects.blender.org/blender/blender/pulls/143404