Commit Graph

3840 Commits

Author SHA1 Message Date
Brecht Van Lommel
689633d802 Refactor: Cycles: Avoid unsafe memcpy and memcmp
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:15 +01:00
Brecht Van Lommel
da5251f06c Cleanup: Cycles: Remove unused math_matrix.h
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:13 +01:00
Brecht Van Lommel
0a0696261d Cleanup: Cycles: clang-tidy warnings about missing switch default case
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:11 +01:00
Brecht Van Lommel
d9150484a2 Cleanup: Cycles: Remove some unnecessary #if 0 and #if 1
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:09 +01:00
Brecht Van Lommel
71b8ecdd84 Cleanup: Cycles: Remove workaround for slow expf in glibc < 2.16
We're on 2.28 now, and were already on 2.17 for many years before that.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:03 +01:00
Brecht Van Lommel
3a57b97eba Cleanup: Cycles: Remove unneeded oneAPI double emulation for NanoVDB
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:59 +01:00
Brecht Van Lommel
d0c2e68e5f Refactor: Cycles: Automated clang-tidy fixups in Cycles
* Use .empty() and .data()
* Use nullptr instead of 0
* No else after return
* Simple class member initialization
* Add override for virtual methods
* Include C++ instead of C headers
* Remove some unused includes
* Use default constructors
* Always use braces
* Consistent names in definition and declaration
* Change typedef to using

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:55 +01:00
Brecht Van Lommel
4951356ebc Refactor: Cycles: Stop using entire OIIO namespace
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:52 +01:00
Brecht Van Lommel
5c46063607 Refactor: Cycles: Make kernel headers work by themselves
Shuffle around some code and add more includes so that individual
header files compile without errors.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:50 +01:00
Brecht Van Lommel
7db0bc2e64 Refactor: Cycles: Make math and type headers work by themselves
Remove separate impl.h headers, shuffle around some code and add more
includes so that individual header files compile without errors.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:47 +01:00
Brecht Van Lommel
f53e13411b Refactor: Cycles: Use #pragma once
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:45 +01:00
Brecht Van Lommel
3c2a6fbb9c Refactor: Cycles: Use nullptr instead of NULL
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:43 +01:00
Campbell Barton
33e38c605f Cleanup: correct indentation for CMake files, strip trailing space 2025-01-03 13:23:38 +11:00
Brecht Van Lommel
0ae67b798b Fix: OptiX kernel build does not respect CUDA_HOST_COMPILER option 2025-01-02 17:03:58 +01:00
Alaska
6d7b0c56c9 Fix: Texture Coordinate Object output referencing a object is incorrect in the world shader with OSL
The Texture Coordinate node has a "Object" output that can be
derived from the object being sampled, or a reference object.

With Cycles OSL, the "Object" output of the Texture Coordinate
would not use the reference object if one was active, and the
node was used on a world shader.

This commit fixes this issue.

Pull Request: https://projects.blender.org/blender/blender/pulls/132515
2025-01-02 03:41:38 +01:00
Alaska
0bfb6e41f2 Fix: Texture Coordinate normals are not normalized in Cycles OSL
The output normals of the Texture Coordinate node when using the OSL
backend were not normalized, leading to incorrect values in
some situations.

This commit fixes this issue by normalizing the normals in this situation.

Pull Request: https://projects.blender.org/blender/blender/pulls/132514
2025-01-02 03:40:55 +01:00
Sergey Sharybin
36aa819396 Fix: Uninitialized last intersection type in Cycles
It is unknown to cause any actual problems, but it makes it harder
to read certain debug logs (like the ones from valgrind).

Pull Request: https://projects.blender.org/blender/blender/pulls/132450
2024-12-31 10:26:21 +01:00
Sergey Sharybin
ba4c79feee Fix: Huang hair sampling does not advance LCG
The reason for this probably was the const nature of the shader data.
However, this is something counter-intuitive, as it potentially leads
to multiple BSDFs re-using the same LCG state.

Pull Request: https://projects.blender.org/blender/blender/pulls/132456
2024-12-31 10:25:19 +01:00
Brecht Van Lommel
4453ca25b4 Fix: Cycles table precompute app build failure 2024-12-31 00:50:44 +01:00
Thomas Dinges
1be75e86aa Cleanup: replace floatX_to_floatY() with make_floatY()
Now that function overloads are usable on all GPUs, replace the former explicit functions.

Pull Request: https://projects.blender.org/blender/blender/pulls/132067
2024-12-19 09:41:55 +01:00
Alaska
8e6a981487 Fix #131927: Cycles: Reduce uncertain light tree traversal in scenes with one distant light
When a scene contains distant lights and local lights, the first step
of the light tree traversal is to compute the importance of
distant lights vs local lights and pick one based on a random number.

In the specific case of when there is only one distant light,
the line of code that had been changed in this commit
effectively reduced to:
`min_importance = fast_cosf(x) < cosf(x) ? 0.0 : compute_min_importance`

And depending on the hardware, compiler, and the specific value being
tested, different configurations could take different code paths.

This commit fixes this issue by turning the comparison into
`fast_cosf(x) < fast_cosf(x)`.

---

Why does `cos_theta_plus_theta_u < cosf(bcone.theta_e - bcone.theta_o)`
reduce to `fast_cos(x) < cos(x)` in this specific case?

- `cos_theta_plus_theta_u` is computed as
`cos_theta * cos_theta_u - sin_theta * sin_theta_u`
- `cos_theta` is always 1.0 in the case of a single distant light.
- `cos_theta_u` is computed earlier as `fast_cosf(theta_e)` in
`distant_light_tree_parameters()`
- `sin_theta` is zero, and so that side of the equation doesn't matter.

This reduces `cos_theta_plus_theta_u` to `fast_cosf(theta_e)`.

`cosf(bcone.theta_e - bcone.theta_o)` reduces to `cosf(bcone.theta_e)`
because for the case of a single distant light `theta_o` is always 0.

Pull Request: https://projects.blender.org/blender/blender/pulls/131932
2024-12-17 10:51:43 +01:00
Thomas Dinges
22e16ca096 Cycles: add make_float4(float3 a, float b) type
This resolves a todo from the code. Part of the Quality Project.

Pull Request: https://projects.blender.org/blender/blender/pulls/131915
2024-12-17 09:11:08 +01:00
Aras Pranckevicius
35d7477371 Cycles: fix accuracy issues in fast_sin/fast_cos/fast_sincos
Most of these originate from OIIO of about 10 years ago. Integrate
the upstream fix from OIIO:
https://github.com/AcademySoftwareFoundation/OpenImageIO/commit/88feb65fc992

Cover them with unit tests. Before the fix, fast_sinf(1.57085085f)
was returning 0.0 instead of 1.0 as expected.

Revert previous hair workaround (a16879a5f0)

Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/131957
2024-12-16 10:05:47 +01:00
Weizhen Huang
c99b7e66b2 Cycles: support Mie Scattering with particle size smaller than 5um
Previous implemenation of 5 < d < 50 was taken from the main paper,
fitting for smaller sizes are found in the supplemental. They are less
forward-scattering.

Pull Request: https://projects.blender.org/blender/blender/pulls/130234
2024-12-13 15:50:54 +01:00
Weizhen Huang
16132f8c79 Fix #117667: Remove volume density weight cutoff
`CLOSURE_WEIGHT_CUTOFF` avoids allocating a closure when its weight is
too small. It makes sense for surface closures, but for volume closures
the contribution also depends on the object size/ray length, such a
cutoff seems random and is causing problem in atmospheric scatterings.

Therefore remove the cutoff for volume, just make sure the weight is
positive.

Pull Request: https://projects.blender.org/blender/blender/pulls/131696
2024-12-13 10:28:49 +01:00
Weizhen Huang
27fc091be8 Fix #131723: Cycles volume not sampling channels with zero extinction
The original paper uses the single scattering albedo `sigma_s/sigma_t`
to pick a channel for sampling the scattering distance. However, this
only considers the situation where there is scattering inside the volume.
If some channel has an extinction coefficient of zero, the light passes
through without attenuation for that channel. We assign such channel
with a weight of 1 instead of 0 to make sure it can be sampled.

Pull Request: https://projects.blender.org/blender/blender/pulls/131741
2024-12-13 10:27:53 +01:00
Weizhen Huang
a16879a5f0 Fix #131240: Cycles: Negative integration range in Huang Hair
The cause is numerical issues with `fast_sinf()`. While fixing
`fast_sinf()` would ultimately fix the problem, it involves more
complications in other code paths, and it is safer to clamp the
integration range anyway.

Pull Request: https://projects.blender.org/blender/blender/pulls/131689
2024-12-10 21:56:04 +01:00
Weizhen Huang
bb3b8d78c2 Refactor: Cycles: split volume integration into smaller functions
Pull Request: https://projects.blender.org/blender/blender/pulls/131414
2024-12-06 16:23:04 +01:00
Weizhen Huang
59ad6d2b9c Refactor: Cycles: Extract code block to check homogeneous volume into a function 2024-12-06 16:23:00 +01:00
Weizhen Huang
13fb28581b Refactor: Cycles: Share function between volume scattering and shadowing 2024-12-06 16:23:00 +01:00
Weizhen Huang
910c2e2ba6 Refactor: Cycles: Add helper struct for stepping through the volume 2024-12-06 16:23:00 +01:00
Weizhen Huang
b16cfd2a87 Cleanup: Cycles: remove unused parameter absorption_only
`!vstate.absorption_only` is always false
2024-12-06 16:23:00 +01:00
Michael Jones
8fe2e37dd0 Fix #130641: MetalRT: Motion Blur (render errors)
This PR fixes #130641. The bug was caused by a missing self-object constraint when performing SSS on motion blur scenes. scene_intersect_local tests were erroneously hitting other objects, and out of range primitive IDs were causing spurious downstream behavior.

Pull Request: https://projects.blender.org/blender/blender/pulls/131156
2024-12-03 20:24:36 +01:00
Weizhen Huang
e2d7681fe6 Cleanup: Cycles: remove unused ccl_loop_no_unroll
Was added in 6121c28501 to ensure compiling
on OpenCL, now the definition is empty on all platforms

Pull Request: https://projects.blender.org/blender/blender/pulls/131100
2024-11-28 16:37:01 +01:00
Weizhen Huang
aa09169e0a Cleanup: Cycles: remove unused parameter skip_phase in volume
This logic is copied from surface shader, so that the sampled closure
does not need to be evaluated twice when summing all the closures, but
it is not used in volume.
2024-11-28 15:56:28 +01:00
Lukas Stockner
0de1cea5c5 Cycles: Use fused OptiX OSL programs
Based on #123377 by @brecht, but Gitea doesn't like the rebase these
so here's a new PR.

The purpose here is to switch to fused OptiX programs for OSL execution
on CUDA. On the one hand, this makes the code easier since, but there's
also another advantage - how memory allocation is managed.

OSL shaders need memory to store intermediate values, but how much is
needed depends on the complexity of the shader. With the split program
approach, Cycles had to provide that memory, so we had to allocate a
certain amount (2 KiB, to be precise) statically and show an error if
the shader would need more. If the shader used less (which is the case
for the vast majority), the memory was just wasted.

By switching to fused kernels, OSL knows the required amount during JIT
codegen, so it can allocate only what's required, which avoids this
waste. One still needs to set a maximum, and in theory, OSL would also
support spilling over into a Cycles-provided alternative memory region.
However, we currently don't implement that - instead, we default to the
same 2048 limit as before and let advanced users override it via the
CYCLES_OSL_GROUPDATA_ALLOC environment variable if really needed.

Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/130149
2024-11-26 23:58:32 +01:00
Thomas Dinges
5ddf8a6495 Merge branch 'blender-v4.3-release' 2024-11-18 19:14:10 +01:00
Lukas Stockner
1a40efbded Fix #130389: Cycles: Numerical issues in GGX_D with associative math flag
Turns out that with `-fassociative-math`, GCC turns
`(1.0f - cos_NH2) + alpha2 * cos_NH2` into
`cos_NH2 * (alpha2 - 1.0f) + 1.0f`.

Not sure why since the operation count is the same, but if alpha2 is very
small, `alpha2 - 1.0f` will be exactly -1.0f, which then causes issues.

Luckily, having one_minus_cos_NH2 as its own variable appears to be enough to
make GCC keep the original formulation.
Just to be safe, I've also used one_minus_cos_NH2 in the other branch to
hopefully reduce the chance of it being folded in again. Also turns a
division into a reciprocal, which is in theory slightly faster.

Pull Request: https://projects.blender.org/blender/blender/pulls/130469
2024-11-18 19:12:22 +01:00
Nikita Sirgienko
2aa9203f2f Cycles: Reintroduce noinline keyword for oneAPI device
In 891d71a4d4 this keyword was
dropped due to performance regression after
fdc2962beb, but currently code
does not experience this performance degradation, and in fact
there is minor performance improvement on Lunar Lake GPUs,
along with an expected improvement in compile time.
However, this change brings a minor performance regression to
shade_surface kernel on Intel Arc and Meteor Lake GPUs, which
will be solved later by disabling this keyword for
these platforms only.

Pull Request: https://projects.blender.org/blender/blender/pulls/130299
2024-11-15 12:09:37 +01:00
weizhen
9488375049 Fix: Cycles: Missing inclusion in CMakeLists 2024-11-12 15:02:59 +01:00
Weizhen Huang
93a34b1077 Refactor: Cycles: add helper struct Interval
To improve readability

Pull Request: https://projects.blender.org/blender/blender/pulls/130156
2024-11-12 12:06:09 +01:00
Weizhen Huang
675e8173fa Refactor: Cycles: separate volume stack and single entry evaluation
So that volume shader evaluation does not rely on the volume stack.
In the future this is useful for baking the density when building the
volume Octree.

Pull Request: https://projects.blender.org/blender/blender/pulls/130157
2024-11-12 12:05:37 +01:00
Weizhen Huang
90ed91dfdb Cleanup: Cycles: Add Kernel prefix to light tree bounding shapes
BoundingBox -> KernelBoundingBox
BoundingCone -> KernelBoundingCone

Pull Request: https://projects.blender.org/blender/blender/pulls/130141
2024-11-11 17:13:55 +01:00
Weizhen Huang
ec3128ee37 Cleanup: Cycles: Remove unused functions
Wasn't used even when they were added
2024-11-11 16:42:50 +01:00
Weizhen Huang
e9593a6619 Cleanup: Cycles: update light tree paper link
The original one was expired
2024-11-11 15:46:52 +01:00
Sergey Sharybin
40ba18c4cd Merge branch 'blender-v4.3-release' 2024-11-08 17:22:02 +01:00
Sergey Sharybin
e5de274faf Fix: Cycles HIP-RT compilation happens in parallel with CUDA
This was an oversight in #129945: the cycles_kernel_hiprt was not
handled in the original code, hence it was missing in the change.

Pull Request: https://projects.blender.org/blender/blender/pulls/130037
2024-11-08 17:21:20 +01:00
Sergey Sharybin
9abf9b15f4 Merge branch 'blender-v4.3-release' 2024-11-08 11:06:11 +01:00
Sergey Sharybin
f58522fc10 Cycles: Tweak scheduling of GPU kernel compilation
This change makes it so only kernels of the same vendor are compiled in
parallel. For example for the release builds it will be:

1. All CUDA kernels
2. All OptiX kernels
3. All HIP kernels
4. All OneAPI kernels

This potentially leads to a lower CPU utilization, but it makes it much
easier to manage memory usage and tweak per-vendor concurrency.

The goal of this change is to solve occasional out-of-memory during the
GPU kernels compilation step on the CI/CD farm.

This change also includes tweaks to the prallel jobs for HIP-RT and
oneAPI. The tweak is based on measuring apparent memory usage peak on
Linux when doing single-thread compilation, and giving some safe margin
from the available memory on the buildbot.

Pull Request: https://projects.blender.org/blender/blender/pulls/129945
2024-11-08 11:05:38 +01:00
Patrick Mours
d0dd587b60 Fix #108372: GPU implementation of OSL matrix intrinsic functions
All the OSL matrix functions had been implemented using the
`Transform` utility of Cycles, but that's built around a 4x3 matrix,
when the OSL matrix functions are working with 4x4 matrices.
This resulted in them not producing results consistent with the
CPU implementation.

This fixes that by making use of the `ProjectionTransform` utility
of Cycles instead, because it's built around a 4x4 matrix. Since
matrix inversion is required, I had to make a few more utility
functions available on the GPU (except Metal, due to use of
references/pointers without specification) that were previously
CPU-only.

Co-authored-by: Brecht Van Lommel <brecht@blender.org>

Pull Request: https://projects.blender.org/blender/blender/pulls/110102
2024-11-04 17:59:29 +01:00