9373 Commits

Author SHA1 Message Date
Campbell Barton
cccc2c77c5 Cleanup: consistent for C-style comment blocks 2025-08-08 07:37:33 +10:00
Brecht Van Lommel
6dc9cd366a Fix: Cycles volume assert on Windows due to wrong comparator
Introduced in 13ab5067ce.

Pull Request: https://projects.blender.org/blender/blender/pulls/144167
2025-08-07 21:23:28 +02:00
Campbell Barton
77d6960d24 Cleanup: quiet GCC warning for pointer subtraction
Ref !144032
2025-08-06 20:31:14 +00:00
Campbell Barton
e8501d2f54 Cleanup: grammar corrections, minor improvements to wording
Also back-tick quote some code references in comments
to differentiate them from English text.
2025-08-06 00:20:39 +00:00
Michael Jones
50363918c7 Cycles: Stop Metal API validation asserts
Dynamic enqueue arguments weren't padded out to struct alignment causing API validation to assert.

Pull Request: https://projects.blender.org/blender/blender/pulls/143991
2025-08-05 14:45:14 +02:00
Damien Picard
5998795aa6 UI: Replace contractions with long-form text
Avoid using contractions for can't, aren't, doesn't, and shouldn't.
Following the writing style guide in the Human Interface Guidelines.

Pull Request: https://projects.blender.org/blender/blender/pulls/143852
2025-08-05 11:16:22 +02:00
Lukas Stockner
793040ad1c Cycles: Improve parameter packing for the Principled BSDF
The Principled BSDF has a ton of inputs, and the previous SVM code just always
allocated stack space for all of them. This results in a ton of additional
NODE_VALUE_x SVM nodes, which slow down execution.

However, this is not really needed for two reasons:
- First, many inputs are only used conditionally. For example, if the
  subsurface weight is zero, none of the other subsurface inputs are used.
- Many of the inputs have a "usual" value that they will have in most
  materials, so if they happen to have that value we can just indicate that
  by not allocating space for them.
  This is a bit similar to the standard "pack the fixed value and provide
  a stack offset if there's a link" pattern, except that the fixed value
  is a constant in the code and we allocate a NODE_VALUE_x if a different
  fixed value is used.

Therefore, this PR re-implements the parameter packing in a more efficient way:
- If we can determine that a component is disabled, all conditional inputs are
  disconnected (to avoid generating upstream nodes).
- If we can determine that a component is disabled, we skip allocating all
  conditional inputs on the stack.
- The inputs for which a reasonable "usual" value exists are changed to
  respect that, and to only be allocated if they differ.
- param1 and param2 (which are fixed-value-packed as on all BSDF nodes) are
  used to store IOR and roughness, which have a decent chance to be fixed
  values.
- The parameter packing is more aggressive about using uchar4, which allows
  getting rid of two SVM nodes while still storing the same inputs.
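
The "usual value" idea described in this commit can be illustrated with a toy model. This is only a sketch and not the Cycles SVM compiler: the `Input` class, the `USUAL_VALUES` table, the `NOT_ALLOCATED` sentinel, and `pack_inputs` are hypothetical names, and the listed values are illustrative rather than taken from the source.

```python
from dataclasses import dataclass

# Hypothetical sentinel meaning "no stack slot, use the built-in usual value".
NOT_ALLOCATED = 255

# Illustrative "usual" values (assumed for this sketch, not Cycles defaults).
USUAL_VALUES = {"specular_tint": 1.0, "anisotropic": 0.0, "sheen_weight": 0.0}


@dataclass
class Input:
    name: str
    value: float
    linked: bool = False  # True if another node feeds this input


def pack_inputs(inputs):
    """Assign stack offsets only to inputs that need one.

    Returns (offsets, value_nodes): offsets maps each input name to a stack
    offset or NOT_ALLOCATED, and value_nodes lists the unlinked constants
    that would need an extra value node to be written to the stack.
    """
    offsets = {}
    value_nodes = []
    next_offset = 0
    for inp in inputs:
        usual = USUAL_VALUES.get(inp.name)
        if not inp.linked and usual is not None and inp.value == usual:
            # Unlinked and equal to the usual value: no slot, no value node.
            offsets[inp.name] = NOT_ALLOCATED
            continue
        offsets[inp.name] = next_offset
        next_offset += 1
        if not inp.linked:
            # A constant that differs from the usual value costs a value node.
            value_nodes.append((inp.name, inp.value))
    return offsets, value_nodes
```

In this toy model, a material that leaves most inputs at their usual values produces few or no extra value nodes, which is the effect the commit attributes its speedup to.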

The result is a considerable speedup in scenes that make heavy use of the
Principled BSDF:

| Scene | CPU speedup | OptiX speedup |
| --- | --- | --- |
| attic | 5% | 9% |
| bistro | 5% | 8% |
| junkshop | 5% | 10% |
| monster | 3% | 4% |
| spring | 1% | 6% |

Pull Request: https://projects.blender.org/blender/blender/pulls/143910
2025-08-04 18:34:58 +02:00
Alaska
83472b19fe Fix: Cycles: Show correct minimum OptiX GPU driver in preferences
Ever since the OptiX 8 update in Blender 4.5, the minimum GPU driver
requirement to use OptiX has increased to 535 or newer.

This commit updates the minimum GPU driver requirement listed in the UI
to reflect this.

Pull Request: https://projects.blender.org/blender/blender/pulls/143917
2025-08-04 15:48:43 +02:00
Amogh Shivaram
ff4d840cf8 Cycles: Add polarized Fresnel function for conductors
This PR adds a new `fresnel_conductor_polarized` function, which calculates reflectance and phase shift (if requested) for both parallel and perpendicular polarized light. This is needed for applying thin film iridescence to conductors (see !141131).

For consistency, this PR also makes `fresnel_conductor` call `fresnel_conductor_polarized` instead of using a fast approximation of the Fresnel equations that is inaccurate at lower n and k values. This will change the output of some Metallic BSDF renders using Physical Conductor and prevent discrepancies when enabling thin film iridescence.

I didn't do any rigorous performance testing, but from timing the functions outside of Blender, `fresnel_conductor_polarized` is significantly slower than the approximation, between 1.5-3x depending on the compiler. This makes sense because it has three square roots and the approximation has none. In some informal tests with metallic_multiggx_physical.blend modified to have more spheres, the new renders took around 1-2% longer on both CPU and GPU.

There are some avoidable inefficiencies in this approach of just calling `fresnel_conductor_polarized`:

- one of the three square roots could be saved since `fresnel_conductor` never needs the phase shift and there are simplifications possible when only calculating the reflectance
- there are several unnecessary multiplications by 1.0 since `fresnel_conductor` uses relative IOR and `fresnel_conductor_polarized` doesn't, though those could get optimized out if inlined
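
For reference, the exact (non-approximated) Fresnel reflectance of a conductor for both polarizations can be sketched as follows. This is the standard textbook formulation written from scratch, not the code added by this PR: the function name is hypothetical, the incident medium is assumed to have an IOR of 1, and the phase shift is omitted.

```python
import math


def fresnel_conductor_polarized_sketch(cos_theta, n, k):
    """Return (R_parallel, R_perpendicular) for a conductor seen from air.

    cos_theta: cosine of the incident angle, in [0, 1]
    n, k: real and imaginary parts of the conductor's index of refraction
    """
    cos2 = cos_theta * cos_theta
    sin2 = 1.0 - cos2
    t0 = n * n - k * k - sin2
    # a^2 + b^2, where a + ib is the complex term sqrt((n + ik)^2 - sin^2).
    a2_plus_b2 = math.sqrt(t0 * t0 + 4.0 * n * n * k * k)
    a = math.sqrt(max(0.5 * (a2_plus_b2 + t0), 0.0))

    # Perpendicular (s) polarization.
    t1 = a2_plus_b2 + cos2
    t2 = 2.0 * a * cos_theta
    r_perp = (t1 - t2) / (t1 + t2)

    # Parallel (p) polarization, expressed relative to the perpendicular one.
    t3 = cos2 * a2_plus_b2 + sin2 * sin2
    t4 = t2 * sin2
    r_par = r_perp * (t3 - t4) / (t3 + t4)
    return r_par, r_perp
```

At normal incidence both values reduce to `((n - 1)^2 + k^2) / ((n + 1)^2 + k^2)`, which is a convenient sanity check for any implementation.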

Pull Request: https://projects.blender.org/blender/blender/pulls/143903
2025-08-04 15:36:36 +02:00
Lukas Stockner
e266692688 Fix #143907: Cycles: Crash when custom camera shader is not found 2025-08-04 15:34:16 +02:00
Lukas Stockner
3107d1f962 Cycles: Improve parameter packing for BSDFs and emission
This replaces `stack_assign` with `stack_assign_if_linked`, which should save a few SVM nodes for constant parameters.

Running benchmarks (all scenes in the benchmark repo, 3 runs, median value for each) shows 1.0% improvement on CPU and 1.5% on OptiX. Not huge, but fairly consistent (all between -0.2% and 3.0%).

Pull Request: https://projects.blender.org/blender/blender/pulls/143404
2025-08-04 15:19:40 +02:00
Campbell Barton
a3bf386d43 Cleanup: use full sentences in text editor code-comments
Also minor improvements, clarifications.
2025-08-02 13:33:05 +10:00
Weizhen Huang
1667d69d3b Cleanup: Cycles: use constexpr in kernel
instead of lambda and macro guard. Should be possible after ce0ae95ed3

Pull Request: https://projects.blender.org/blender/blender/pulls/143723
2025-08-01 14:06:13 +02:00
Campbell Barton
2c27d2be54 Cleanup: grammar corrections, minor improvements to wording 2025-08-01 21:41:24 +10:00
Habib Gahbiche
53380ed8b2 Cleanup: redundant check for shader type
Pull Request: https://projects.blender.org/blender/blender/pulls/143784
2025-08-01 13:38:54 +02:00
Hugh Delaney
930a942dd0 Refactor: Cycles: Move block sizes into common header
This change puts all the block size macros in the same common header, so
they can be included in host side code without needing to also include
the kernels that are defined in the device headers that contained these
values.

This change also removes a magic number used to enqueue a kernel, which
happened to agree with the GPU_PARALLEL_SORT_BLOCK_SIZE macro.

Pull Request: https://projects.blender.org/blender/blender/pulls/143646
2025-08-01 13:26:02 +02:00
Brecht Van Lommel
cedd7edd29 Cleanup: Cycles: Use gtest header instead of Blender test header
To avoid conflicts with glog logging macros, and because there is just no
need to have this dependency.

Pull Request: https://projects.blender.org/blender/blender/pulls/143719
2025-07-31 19:46:54 +02:00
Weizhen Huang
f8eae6b58a Fix: Cycles: Division by zero in Oren-Nayar shader
`Eavg` can still be 1 for very small roughness, causing division by zero
when computing `Ems`.
A roughness of 1e-5 gives an `Eavg` of 0.999998, which seems reasonable.

Pull Request: https://projects.blender.org/blender/blender/pulls/143637
2025-07-30 16:57:00 +02:00
Patrick Mours
6487395fa5 Cycles: Add linear curve shape
Add new "Linear 3D Curves" option in the Curves panel in the render
properties. This renders curves as linear segments rather than smooth
curves, for faster render time at the cost of accuracy.

On NVIDIA Blackwell GPUs, this can give a 6x speedup compared to smooth
curves, due to hardware acceleration. On NVIDIA Ada there is still
a 3x speedup, and CPU and other GPU backends will also render this
faster.

A difference with smooth curves is that these have end caps, as this
was simpler to implement and they are usually helpful anyway.

In the future this functionality will also be used to properly support
the CURVE_TYPE_POLY on the new curves object.

Pull Request: https://projects.blender.org/blender/blender/pulls/139735
2025-07-29 17:05:01 +02:00
Stefan Werner
c81e1d95c1 Cycles: Fixed typo in my last commit 2025-07-29 10:53:13 +02:00
Weizhen Huang
a7042ca30c Fix: warning template-id-cdtor on gcc 2025-07-29 10:41:17 +02:00
Stefan Werner
e7312b1ad5 Cycles: Explicitly setting SYCL device for Embree
This fixes issues when using Embree on multiple GPUs.
A previous workaround used separate contexts, this one now
lets us keep a single context for all GPUs.

Pull Request: https://projects.blender.org/blender/blender/pulls/143089
2025-07-29 10:40:28 +02:00
Brecht Van Lommel
f03ac5ec4b Fix #142876: Cycles crash with OSL and interactive updates
Update use_shading, use_camera and the shading system pointers in the same
location, so that when the render is interrupted they are in a consistent state.

The added null pointer checks are not strictly needed, but are there in case
the state goes out of sync for another reason.

Pull Request: https://projects.blender.org/blender/blender/pulls/143467
2025-07-28 18:43:57 +02:00
Weizhen Huang
ea45c776fd Cycles: introduce dual types
to replace some uses of dfdx/dfdy/differentials.
No functional change expected.

Pull Request: https://projects.blender.org/blender/blender/pulls/143178
2025-07-28 17:34:24 +02:00
Weizhen Huang
345d23bff8 Cleanup: Cycles: add more float3 util functions
and vectorize `wrap` and `safe_fmod`.
2025-07-28 17:34:21 +02:00
Weizhen Huang
48777385c2 Cleanup: Cycles: simplify computation of dPdx and dPdy
`sd->dPdu`, `sd->dPdv`, `sd->du` and `sd->dv` are computed from
`sd->dP` by constructing a local frame, so both results are the same, subject to some numerical differences.

This avoids constructing the local frame again, so might be faster.
2025-07-28 17:34:21 +02:00
Weizhen Huang
f9a65ebbea Cleanup: Cycles: Deduplicate SVM bump functions 2025-07-28 17:34:21 +02:00
Habib Gahbiche
445eceb02a Nodes: Remove "Use Nodes" in Shader Editor for World
Part of https://projects.blender.org/blender/blender/pulls/141278

Blend files compatibility:
If a World exists and "Use Nodes" is disabled, we add new nodes to the
existing node tree (or create one if it doesn't exist) that emulate the
behavior of a world without a node tree. This ensures backward and
forward compatibility.

Python API compatibility:
- `world.use_nodes` was removed from Python API => **Breaking change**
- `world.color` is still being used by Workbench, so it stays there,
although it has no effect anymore when using Cycles or EEVEE.

Python API changes:
Creating a World using `bpy.data.worlds.new()` now creates a World with
 an empty (embedded) node tree. This was necessary to enable Python
scripts to add nodes without having to create a node tree (which is
currently not possible, because World node trees are embedded).

Pull Request: https://projects.blender.org/blender/blender/pulls/142342
2025-07-28 14:06:08 +02:00
Brecht Van Lommel
fbfa4e3805 Fix #143128: Cycles crash or artifacts with multi device rendering and denoising
Clear graphics interop when mapping the buffer directly, as this might free the
underlying buffer or handle.

Fix #141736: Artifacts on Vulkan GPU + GPU
Fix #143128: Crash on Metal CPU + GPU

Pull Request: https://projects.blender.org/blender/blender/pulls/143243
2025-07-25 21:43:01 +02:00
Sergey Sharybin
dcae48d1d3 Cycles: Add Portal Depth light pass information
It allows implementing tricks based on knowing whether the path
ever came through a portal or not, and even something more advanced
based on the number of portals.

The main current objective is stroke shading: the stroke shader
uses the Ray Portal BSDF to move the ray to the center of the stroke and
point it in the direction of the surface it is generated for. This
gives the stroke a single color which matches the shading of the original
object. For this use case to work, the ray bounced from the original
surface should ignore the strokes, which is now possible by using the
Portal Depth input and mixing with the Transparent BSDF. It also
helps to make shading look better when there are multiple stroke
layers.

A solution of using portal depth is chosen over a single flag due
to various factors:
- The last time we looked into it, it was a bit tricky to implement
  as a flag due to us running out of bits.
- It feels like a more flexible solution, even though it is a bit
  hard to come up with a 100% compelling setup for it.
- It needs to be slightly different from the current "Is Foo"
  flags, and be more "Is Portal Descendant" or something.

An extra uint16 is added to the state to count the portal depth,
but it is only allocated for scenes that use Ray Portal BSDF.

Portal BSDF still increments Transparent bounce, as it is required
to have some "limiting" factor so that the ray is not moved to
different places in the scene indefinitely.

Ref #125213

Pull Request: https://projects.blender.org/blender/blender/pulls/143107
2025-07-25 18:09:38 +02:00
Michael Jones
6f1c63597d Cycles: Disable lossless MTLTexture compression & render up to 2% faster
Disallow lossless texture compression in MetalDevice. Path-tracing texture access patterns are very random, and cache reuse gains are typically too low to offset the decompression overheads. This change doesn't increase memory usage for any of the benchmark scenes (https://projects.blender.org/blender/blender-benchmarks/src/branch/main/cycles) as most textures are high entropy and don't compress well using lossless methods.

Pull Request: https://projects.blender.org/blender/blender/pulls/143074
2025-07-25 17:29:27 +02:00
Weizhen Huang
3a1fbe17b9 Fix: OSL: Attribute "generated" not available for World and Point Cloud
When "generated" is required via Attribute Node, it is not available for
World and Point Cloud.

Make OSL match the SVM behavior to use the object coordinates
(see `svm/attribute.h`).

Pull Request: https://projects.blender.org/blender/blender/pulls/143198
2025-07-25 15:39:06 +02:00
Brecht Van Lommel
47f9b7a98e Fix #142022: Cycles undisplaced normal not available
Previously with adaptive subdivision this happened to work with the N
attribute, but that was not meant to be undisplaced. This adds a new
undisplaced_N attribute specifically for this purpose.

For backwards compatibility in Blender 4.5, this also keeps N undisplaced.
But that will be changed in 5.0.

Pull Request: https://projects.blender.org/blender/blender/pulls/142090
2025-07-24 18:16:25 +02:00
Brecht Van Lommel
264669fc03 Fix #142953: Cycles renders world lightgroup wrong
Pull Request: https://projects.blender.org/blender/blender/pulls/143003
2025-07-24 15:24:03 +02:00
Michael Jones
f3485cc925 Cycles: MetalRT: Only use extended limits if needed (revisited)
Currently MetalRT renders always use extended limits, which is needed to correctly render scenes where the max primitive count can exceed 2^28 or the instance count can exceed 2^24. This patch adopts Metal best practices of only enabling this flag if it is needed.

This PR is similar to #133364, but there are some notable differences:

1) The old PR made an overly optimistic assumption that all the relevant visibility bits could be squeezed into 8 bits. This new PR adopts the same approach that OptiX takes of using 8 bits as a primary HW filter, and checking the full 32-bit mask inside the SW intersection handler.

~~2) I moved the scene scanning check from Scene into MetalDevice. This avoids platform specific details leaking into platform agnostic areas.~~

~~3) In live viewport mode, we always use extended limits in case we tip over the threshold.~~

_EDIT:_
2) The limits are scanned in `Scene::update_kernel_features`, and given to the device by a new `set_bvh_limits` method which returns true if the BVH and kernels need to be reloaded.
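
The two-stage visibility test from point 1 (a coarse 8-bit hardware filter followed by an exact 32-bit software check) can be sketched as below. How Cycles derives the 8-bit mask is not stated in this message, so the OR-folding in `fold_to_8_bits` is an assumption chosen because it is conservative, and all function names are hypothetical.

```python
def fold_to_8_bits(mask32):
    """Conservatively fold a 32-bit visibility mask into 8 bits.

    Bit i of the input sets bit (i % 8) of the output, so whenever two
    32-bit masks share a bit, their folded masks share a bit as well.
    """
    mask32 &= 0xFFFFFFFF
    return (mask32 | (mask32 >> 8) | (mask32 >> 16) | (mask32 >> 24)) & 0xFF


def hardware_filter(ray_mask32, object_mask32):
    """Coarse stage: may give false positives, never false negatives."""
    return (fold_to_8_bits(ray_mask32) & fold_to_8_bits(object_mask32)) != 0


def software_filter(ray_mask32, object_mask32):
    """Exact stage: the full 32-bit test done in the intersection handler."""
    return (ray_mask32 & object_mask32) != 0


def is_visible(ray_mask32, object_mask32):
    """A hit counts only if it passes both the coarse and the exact stage."""
    return hardware_filter(ray_mask32, object_mask32) and software_filter(
        ray_mask32, object_mask32
    )
```

The coarse stage only needs to avoid rejecting hits that the exact stage would accept. Any false positives it lets through are discarded by the 32-bit check.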

Pull Request: https://projects.blender.org/blender/blender/pulls/142401
2025-07-24 13:27:20 +02:00
Weizhen Huang
bb689687a7 Fix #140892: Cycles: equiangular sampling numerical issue
When `delta` is significantly larger than the difference between `tmin`
and `tmax`, the precision of `theta_a` and `theta_b` reduces, resulting
in banding artefacts.

For equiangular sampling, we use uniform sampling to fix this case.

For light tree, we use the equality
atan(a) + atan(b) = atan2(a + b, 1 - a*b).
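
The equality holds for all finite `a` and `b`, since both sides are the argument of the complex product `(1 + ia)(1 + ib)`. A small sketch that checks it numerically is shown below. The helper names are hypothetical, and the difference form is simply the same identity with `-a` substituted for `a`, shown here because it avoids subtracting two nearly equal angles.

```python
import math


def atan_sum(a, b):
    """atan(a) + atan(b) computed with a single atan2 call."""
    return math.atan2(a + b, 1.0 - a * b)


def atan_difference(a, b):
    """atan(b) - atan(a), obtained by substituting -a into atan_sum."""
    return math.atan2(b - a, 1.0 + a * b)


# The identity agrees with the direct sum across all sign combinations.
for a, b in [(0.5, 0.25), (2.0, 3.0), (-4.0, 1.5), (-2.0, -3.0)]:
    assert abs(atan_sum(a, b) - (math.atan(a) + math.atan(b))) < 1e-12

# With two large, nearly equal arguments the direct subtraction loses
# precision to cancellation, while the atan2 form keeps the small angle.
a, b = 1.0e8, 1.0e8 + 1.0
small_angle = atan_difference(a, b)  # close to 1 / (1 + a * b)
```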

Pull Request: https://projects.blender.org/blender/blender/pulls/142845
2025-07-23 12:56:11 +02:00
Brecht Van Lommel
678ccd4a61 Cycles: Change default material to match Blender and EEVEE
Use Principled BSDF instead of Diffuse BSDF. This is a breaking change but
likely does not affect many scenes significantly. It adds some specularity
when an object does not have a material assigned.

Fix #142538

Pull Request: https://projects.blender.org/blender/blender/pulls/142703
2025-07-22 15:54:30 +02:00
Brecht Van Lommel
474debc348 Refactor: Cycles: Make attribute map a bit smaller when not using OSL 2025-07-22 15:54:29 +02:00
Brecht Van Lommel
f861ad60e1 Fix #142603: Cycles slow updates in camera view with adaptive subdivision
Postpone update of adaptive subdivision until navigation is done, to keep
things a bit more interactive.

Pull Request: https://projects.blender.org/blender/blender/pulls/142730
2025-07-22 15:49:27 +02:00
Clément Foucault
32d64d35bb Refactor: GPU: Texture: Replace eGPUTextureFormat by TextureFormat
This offers better semantic and safety of the API.

Part of #130632

Pull Request: https://projects.blender.org/blender/blender/pulls/142818
2025-07-22 14:58:54 +02:00
Clément Foucault
f0254c2dcf Refactor: GPU: Remove unnecessary C wrappers for textures
This is the first step into merging `DRW_gpu_wrapper.hh` into
the GPU module.

This is very similar to #119825.

Pull Request: https://projects.blender.org/blender/blender/pulls/142732
2025-07-22 09:48:10 +02:00
Thomas Dinges
ce0ae95ed3 Cycles: Bump minimum supported CUDA architecture to sm_50
Pull Request: https://projects.blender.org/blender/blender/pulls/142212
2025-07-21 19:49:21 +02:00
Weizhen Huang
9404db8c7c Fix #141388: Cycles: CPU/GPU difference in pow function with 0 base
If the base is 0 and the exponent is non-zero, return 0 for both CPU and GPU.
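
The rule can be sketched as a small guard in front of the regular power function. This is only an illustration: `pow_zero_safe` is a hypothetical name, not the function changed by this commit, and the behavior for a zero exponent is left to the underlying `pow` because the message does not specify it.

```python
import math


def pow_zero_safe(base, exponent):
    """Power function that returns 0 for a zero base and non-zero exponent."""
    if base == 0.0 and exponent != 0.0:
        # Covers negative exponents too, where a plain pow would divide by
        # zero (Python's math.pow raises ValueError in that case).
        return 0.0
    return math.pow(base, exponent)
```

Handling the case explicitly means the result no longer depends on how a particular platform's `pow` treats a zero base.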

Pull Request: https://projects.blender.org/blender/blender/pulls/142678
2025-07-21 14:45:30 +02:00
Weizhen Huang
5a27edcf79 Fix #141136: Cycles Hair black artifacts with direct coloring
fixed by clamping negative input colors

Pull Request: https://projects.blender.org/blender/blender/pulls/142667
2025-07-21 12:15:04 +02:00
Nikita Sirgienko
9875836519 Cycles: oneAPI: Compile only needed device binaries in multi-GPU case
Before this change, the "oneapi_load_kernels" function loaded kernels
and compiled them, if needed, for all devices in the associated GPU
context. This makes sense when rendering with a single GPU, or with
multiple identical GPUs, but in cases where Blender users render with
several different GPUs, the previous implementation would compile all
kernels for all devices once per device, unnecessarily doing the same
work multiple times. Because of this, I am changing the implementation
so that, for each used device, compilation now happens only for that
device, ensuring that no unnecessary work is done.

No render performance changes are expected.
2025-07-19 14:15:36 +02:00
Sebastian Herholz
20e0fed7da Cycles: Fixing wrong PDF evaluation when BSDF closures are excluded by the light source
Pull Request: https://projects.blender.org/blender/blender/pulls/142323
2025-07-18 17:13:54 +02:00
Brecht Van Lommel
87ff645f5d Fix: Cycles preferences Python error after recent changes
Assigning properties in PropertyGroup has become more strict, define it in
__slots__ since it's a runtime property.
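
As a plain-Python analogy only (Blender's `PropertyGroup` enforcement is implemented differently, and the class and attribute names here are hypothetical), `__slots__` declares which instance attributes may be assigned, and anything undeclared is rejected:

```python
class StrictSettings:
    """Plain-Python stand-in for a class with strict attribute assignment."""

    # Only names listed here may be assigned on instances.
    __slots__ = ("runtime_cache",)

    def __init__(self):
        self.runtime_cache = None


settings = StrictSettings()
settings.runtime_cache = {"devices": []}  # allowed: declared in __slots__

try:
    settings.undeclared = 1  # not declared, so the assignment fails
except AttributeError:
    rejected = True
else:
    rejected = False
```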

Pull Request: https://projects.blender.org/blender/blender/pulls/142357
2025-07-18 16:07:53 +02:00
Michael Jones
8077384e3a Cycles: Improve Metal kernel specialisation
This improves the existing scene specialisation mechanism by replacing "kernel_data.kernel_features" with a function constant. It doesn't cause any additional compilation requests, but allows the backend compiler to eliminate more dead code. An additional compiler hint is provided for dead-stripping "volume_stack_enter_exit" which results in slightly faster rendering of non-volumetric scenes.

Pull Request: https://projects.blender.org/blender/blender/pulls/142235
2025-07-18 11:18:43 +02:00
Campbell Barton
9d41b04aec Cleanup: quiet warnings, typo 2025-07-18 12:03:53 +10:00
Brecht Van Lommel
df6d6c0932 Refactor: Cycles: Use logging system for GPU error print
Pull Request: https://projects.blender.org/blender/blender/pulls/142257
2025-07-17 21:14:30 +02:00