The current usage of software-based texture operations in
the oneAPI implementation puts additional register pressure on
the GPU compiler during register allocation. And it also creates
code that requires maintenance. This commit is intended to address
this situation by utilizing a recently productized SYCL bindless
texture API to enable HW-based texture operations using
Intel GPUs' hardware sampler.
This currently translates to 1-11% rendering speedups (scene-specific)
on my Arc A770 and Arc B580. At the moment, there are small
performance regressions with NanoVDB texture operations on Arc B580
and small performance regressions in shade surface MNEE and Raytrace
kernels on Arc A770, but they look recoverable and will be handled
in the future.
Pull Request: https://projects.blender.org/blender/blender/pulls/133457
There is now a non-experimental API for this_work_item functionality, so
let's use it for better code quality and also to avoid the deprecation
warning during compilation.
No functional or performance changes are expected.
Pull Request: https://projects.blender.org/blender/blender/pulls/133472
This is not a complete solution because there may be indirect changes
to the compiler other than the binary that require a rebuild, but this
should catch the simple cases at least.
To align better with the pixel and reduce the samples needed.
The paper was using gamma because the jacobian |d_gamma/d_h| approaches
infinity at the boundaries, but it seems that clamping at 0.999 is
enough for numerical stability.
In practice I did not notice a change in the noise level, but it
simplifies the range computation and renders faster due to reduced
sample amount.
Co-authored-by: Olivier Maury <omaury@meta.com>
Ref: !129616
Pull Request: https://projects.blender.org/blender/blender/pulls/134130
Enables building of a Cubin for GPUs based on Blackwell architecture
if CUDA toolkit version 12.8 or higher is installed.
Only added sm_120 to the default set, since it is the one relevant for
consumer GPUs (RTX 5090 etc.) that are generally used with Blender.
Pull Request: https://projects.blender.org/blender/blender/pulls/134170
This commit adds the `light:random` attribute to OSL, allowing the
object info node to now match between SVM and OSL when using the
random output on a light.
Pull Request: https://projects.blender.org/blender/blender/pulls/134095
The derivatives of the normal were simply not computed.
The offsetted normals are computed by perturbating the barycentric
coordinates. At triangle boundaries, the normals are extrapolated,
so discontinuities might be visible.
Currently only supported on triangles.
Pull Request: https://projects.blender.org/blender/blender/pulls/133769
Use sub-pixel differentials for bump mapping helps with reducing
artifacts when objects are moving or when textures have high frequency
details.
Currently we scale it by 0.1 because it seems to work good in practice,
we can adjust the value in the future if it turns out to be impractical.
Ref: #122892
Pull Request: https://projects.blender.org/blender/blender/pulls/133991
- Rename dx/dy -> dfdx/dfdy to match the actual computed quantity
- Add template functions to compute dfdx/dfdy on triangles for sharing
among different data types
- Add documentation to some functions
- Some code shuffling that makes it easier to scale dfdx/dfdy in the
future
- Some other trivial changes
The issue here is that originally, the step count for the geometry's
motion and the object transform's motion were tied together, so a
single variable is used to store that step count.
However, when using the velocity attribute, it's possible for the step
counts to differ, which will lead to an incorrect interpolated object
transform in the kernel.
Pull Request: https://projects.blender.org/blender/blender/pulls/133788
That version has a bunch of API changes, so by dropping support for older
versions we can remove old compatibility code.
Also, that version is required for OptiX support, so building a fully-featured
Cycles wasn't possible with older OSL anyways.
Pull Request: https://projects.blender.org/blender/blender/pulls/133746
OSL has OSL::ShaderGlobals, which contains all the state for OSL shader
execution. The renderer fills it out and hands a pointer to OSL, and any
callbacks (e.g. for querying attributes) get the pointer back.
In order to store renderer-specific data in it, there's a few opaque pointers
in the struct, but using those has led to a mess of reinterpret_cast<> and
pointer indirection in order to carry all the data around.
However, there is a way to do this nicer: Good old C-style struct inheritance.
In short: Unless you're doing pointer arithmetic, you can just add additional
data at the end of a base struct, and the rest of the code won't care.
In this case, this means that we can have our own ShaderGlobals struct and
add more Cycles-specific data at the end. Additionally, we can replace the
generic opaque void pointers with ones of the correct type, which saves us
from needing to cast them back.
Since we have a copy of ShaderGlobals for GPU OSL anyways, it's just a matter
of refactoring the code a bit to make use of that.
The advantages of this are:
- Avoids casts all over the place, just needs one cast to turn our
ShaderGlobals into the "base" type that OSL expects and one to turn the
pointer that OSL gives us on callbacks back into our "derived" type.
- Removes indirection, all the relevant stuff (sd, kg, state) is now
directly in the ShaderGlobals
- Removes some OSL-specific state from ShaderData, which helps to keep
memory usage down
Pull Request: https://projects.blender.org/blender/blender/pulls/133689
The oren_nayar_diffuse_bsdf closure in OSL had two issues:
- It broke when used with roughness of zero
- It only used the provided albedo for energy compensation, so it still
required the user to multiply with the albedo
Therefore, this handles the zero roughness corner case and includes
the albedo in the closure weight.
This makes no difference when using the closure through the Diffuse
or Principled BSDF nodes, only for custom OSL shaders.
Pull Request: https://projects.blender.org/blender/blender/pulls/133597
In a previous commit (1), adjustments to light tree traversal were made
to try and skip distant lights when deciding which light to sample
while within a world volume.
The skip wasn't implemented properly, and as a result distant lights
were still included in the light tree traversal in that sitaution.
And due to the way the skip was implemented, there were some
unintialized variables used in the processing of the distant lights
importance which caused artifacts on some platforms.
This commit fixes this issue by reverting the skip of distant lights
in that situation.
(1) blender/blender@6fbc958e89
Pull Request: https://projects.blender.org/blender/blender/pulls/132344
It is possible that the valid range computed from `theta` and the range
visible to the current pixel do not overlap. In this case the hair
section has no contribution.
Pull Request: https://projects.blender.org/blender/blender/pulls/133337
In Cycles, the subsurface scattering shader will switch to a
diffuse shader under a few different conditions to improve performance
and reduce noise.
This commit removes the "switch back to diffuse if the scattering
radius is less than a quarter of a pixel" optimization because in some
scenes, this can result in noticable lines as the shader transitions
between subsurface scattering and diffuse.
Pull Request: https://projects.blender.org/blender/blender/pulls/133245
Happens after f79cae2c59. Seems that Metal pre-processing could not
figure out that the inclusion is guarded behind `#ifdef`, so the
included file is expanded here and never again afterwards.
The pre-processing should be fixed, but for now just always include the file.
Pull Request: https://projects.blender.org/blender/blender/pulls/133336
The `ZDepth` output of the camera data node was different between SVM
and OSL.
SVM would output the `ZDepth`, with negative distances for points
behind the camera. While OSL would output the absolute of the distance,
which resulted in points behind the camera becoming positive.
Align OSL to SVM and allow outputting the negative distance as it
allows users to differentiate between what's in front or behind the
camera.
Pull Request: https://projects.blender.org/blender/blender/pulls/132837
In a recent refactor (1), the subsurface weight was set to be
constant, when it can be modified in the case that `__SUBSURFACE__`
is false, such as when using the adaptive compilation feature.
This commit fixes this issue by rearranging the code so the subsurface
weight is never overwritten.
(1) blender/blender@dd51c8660b
Pull Request: https://projects.blender.org/blender/blender/pulls/132620