40 Commits

Author SHA1 Message Date
Amogh Shivaram
2bd06093c7 Cycles: Thin film iridescence for metals
Applies thin film iridescence to metals in Metallic BSDF and Principled BSDF.

To get the complex IOR values for each spectral band from F82 Tint colors,
the code uses the parametrization from "Artist Friendly Metallic Fresnel",
where the g parameter is set to F82. This IOR is used to find the phase shift,
but reflectance is still calculated with the F82 Tint formula after adjusting
F0 for the film's IOR.

Co-authored-by: Lukas Stockner <lukas@lukasstockner.de>
Co-authored-by: Weizhen Huang <weizhen@blender.org>
Co-authored-by: RobertMoerland <rmoerlandrj@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/141131
2025-09-29 02:58:20 +02:00
Weizhen Huang
8d6b935466 Fix #144711: Cycles: rescale throughput when density is above the majorant
The one-sample Monte Carlo estimator of the radiative transfer equation
is
       <L> = T(t) / p(t) * (L_e + σ_s * L_s + σ_n * L),
Which means we can also use another p(t) than majorant * exp(-majorant * t)
for sampling the distance. Thus, we use the baked σ_max for distance
sampling, but adjust the majorant when we encounter a density that is
larger than σ_max.

Note that this is not really unbiased because such scaling is not always
applied, but seems to work well in practice when the majorant is
reasonable.

Pull Request: https://projects.blender.org/blender/blender/pulls/146589
2025-09-23 11:17:19 +02:00
Weizhen Huang
df496eb894 Cycles: use one-tap stochastic interpolation for volume
It has ~1.2x speed-up on CPU and ~1.5x speed-up on GPU (tested on Metal
M2 Ultra).

Individual samples are noisier, but equal time renders are mostly
better.

Note that volume emission renders differently than before.

Pull Request: https://projects.blender.org/blender/blender/pulls/144451
2025-08-14 15:22:44 +02:00
Weizhen Huang
b2b2d9a4f3 Cycles: Render volume by ray marching through octrees
One octree per volume per shader based on the density. In preparation
for the null scattering
2025-08-13 10:28:50 +02:00
Amogh Shivaram
ff4d840cf8 Cycles: Add polarized Fresnel function for conductors
This PR adds a new `fresnel_conductor_polarized` function, which calculates reflectance and phase shift (if requested) for both parallel and perpendicular polarized light. This is needed for applying thin film iridescence to conductors (see !141131).

For consistency, this PR also makes `fresnel_conductor` call `fresnel_conductor_polarized` instead of using a fast approximation of the Fresnel equations that is inaccurate at lower n and k values. This will change the output of some Metallic BSDF renders using Physical Conductor and prevent discrepancies when enabling thin film iridescence.

I didn't do any rigorous performance testing, but from timing the functions outside of Blender, `fresnel_conductor_polarized` is significantly slower than the approximation, between 1.5-3x depending on the compiler. This makes sense because it has three square roots and the approximation has none. In some informal tests with metallic_multiggx_physical.blend modified to have more spheres, the new renders took around 1-2% longer on both CPU and GPU.

There are some avoidable inefficiencies in this approach of just calling `fresnel_conductor_polarized`:

- one of the three square roots could be saved since `fresnel_conductor` never needs the phase shift and there are simplifications possible when only calculating the reflectance
- there are several unnecessary multiplications by 1.0 since `fresnel_conductor` uses relative IOR and `fresnel_conductor_polarized` doesn't, though those could get optimized out if inlined

Pull Request: https://projects.blender.org/blender/blender/pulls/143903
2025-08-04 15:36:36 +02:00
Weizhen Huang
ea45c776fd Cycles: introduce dual types
to replace some uses of dfdx/dfdy/differentials.
No functional change expected.

Pull Request: https://projects.blender.org/blender/blender/pulls/143178
2025-07-28 17:34:24 +02:00
Weizhen Huang
345d23bff8 Cleanup: Cycles: add more float3 util functions
and vectorize `wrap` and `safe_fmod`.
2025-07-28 17:34:21 +02:00
Lukas Stockner
eaa5f63ba2 Cycles: Replace thin-film basis function approximation with accurate LUTs
Previously, we used precomputed Gaussian fits to the XYZ CMFs, performed
the spectral integration in that space, and then converted the result
to the RGB working space.

That worked because we're only supporting dielectric base layers for
the thin film code, so the inputs to the spectral integration
(reflectivity and phase) are both constant w.r.t. wavelength.

However, this will no longer work for conductive base layers.
We could handle reflectivity by converting to XYZ, but that won't work
for phase since its effect on the output is nonlinear.

Therefore, it's time to do this properly by performing the spectral
integration directly in the RGB primaries. To do this, we need to:
- Compute the RGB CMFs from the XYZ CMFs and XYZ-to-RGB matrix
- Resample the RGB CMFs to be parametrized by frequency instead of wavelength
- Compute the FFT of the CMFs
- Store it as a LUT to be used by the kernel code

However, there's two optimizations we can make:
- Both the resampling and the FFT are linear operations, as is the
  XYZ-to-RGB conversion. Therefore, we can resample and Fourier-transform
  the XYZ CMFs once, store the result in a precomputed table, and then just
  multiply the entries by the XYZ-to-RGB matrix at runtime.
  - I've included the Python script used to compute the table under
    `intern/cycles/doc/precompute`.
- The reference implementation by the paper authors [1] simply stores the
  real and imaginary parts in the LUT, and then computes
  `cos(shift)*real + sin(shift)*imag`. However, the real and imaginary parts
  are oscillating, so the LUT with linear interpolation is not particularly
  good at representing them. Instead, we can convert the table to
  Magnitude/Phase representation, which is much smoother, and do
  `mag * cos(phase - shift)` in the kernel.
  - Phase needs to be unwrapped to handle the interpolation decently,
    but that's easy.
  - This requires an extra trig operation in the kernel in the dielectric case,
    but for the conductive case we'll actually save three.

Rendered output is mostly the same, just slightly different because we're
no longer using the Gaussian approximation.

[1] "A Practical Extension to Microfacet Theory for the Modeling of
    Varying Iridescence" by Laurent Belcour and Pascal Barla,
    https://belcour.github.io/blog/research/publication/2017/05/01/brdf-thin-film.html

Pull Request: https://projects.blender.org/blender/blender/pulls/140944
2025-07-09 22:10:28 +02:00
Brecht Van Lommel
0e7a696819 Cleanup: Unused arguments in Cycles kernel
And add back the compiler flag that hid them.

Pull Request: https://projects.blender.org/blender/blender/pulls/139497
2025-05-27 21:30:45 +02:00
Weizhen Huang
1f01a1aee9 Cleanup: remove unnecessary defined(__KERNEL_METAL__)
The top level guard is already `#ifndef __KERNEL_METAL__`, additional
guard is not only unnecessary but also confusing.
2025-05-05 18:35:24 +02:00
Sergey Sharybin
30b962b3d8 Cycles: Optimize 3d and 4d noise
The goal is to reduce the affect of the fmod() used in the noise code,
which was initially reported in the comment:

    https://projects.blender.org/blender/blender/pulls/119884#issuecomment-1258902

Basic idea is to benefit from SIMD vectorization on CPU.

Tested on Linux i9-11900K and macOS on M2 Ultra, in both cases performance
after this change is very close to what it could be with the fmod() commented
out (the call itself, `p = p + precision_correction`).

On macOS the penalty of fmod() was about 10%, on Linux it was closer to 30%
when built with GCC-13. With Linux builds from the buildbot it is more like 18%.

The optimization is only done for 3d and 4d noise. It might be possible to
gain some performance improvement for 1d and 2d cases, but the approach would
need to be different: we'd need to optimize scalar version fmodf(). Maybe
tricks with integer cast will be faster (since we are a bit optimistic in the
kernel and do not guarantee exact behavior in extreme cases such as NaN inputs).

Pull Request: https://projects.blender.org/blender/blender/pulls/137109
2025-04-09 13:40:10 +02:00
Sergey Sharybin
5b0ed683a0 Cycles: Make select() and mask() for vectorized float work on CPU and GPU
Pull Request: https://projects.blender.org/blender/blender/pulls/137148
2025-04-08 17:04:18 +02:00
Lukas Stockner
8cb5e05c48 Cleanup: Cycles: Deduplicate kernel attribute code using templating
The attribute handling code in the kernel is currently highly duplicated since
it needs to handle five different data types and we couldn't use templates
back then.
We can now, so might as well make use of it and get rid of ~1000 lines.

There are also some small fixes for the GPU OSL code:
- Wrong derivative for .w component when converting float2/float3->float4
- Different conversion for float2->float (CPU averages, GPU used to take .x)
- Removed useless code for converting to float2, not used by OSL

Pull Request: https://projects.blender.org/blender/blender/pulls/134694
2025-02-20 19:28:45 +01:00
Brecht Van Lommel
f80f97ca0d Refactor: Cycles: Rename rcp to reciprocal
To avoid symbol conflicts with upcoming HIP changes. Also remove
unused implementations for float4 and float8.

Pull Request: https://projects.blender.org/blender/blender/pulls/134045
2025-02-04 18:59:24 +01:00
Brecht Van Lommel
612cb61199 Cleanup: Cycles: Use simpler make_float3 for single value 2025-01-13 10:07:37 +01:00
Brecht Van Lommel
57ff24cb99 Refactor: Cycles: Add const keyword to more function parameters
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:24 +01:00
Brecht Van Lommel
dd51c8660b Refactor: Cycles: Add const keyword where possible, using clang-tidy
Check was misc-const-correctness, combined with readability-isolate-declaration
as suggested by the docs.

Temporarily clang-format "QualifierAlignment: Left" was used to get consistency
with the prevailing order of keywords.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:20 +01:00
Brecht Van Lommel
7db0bc2e64 Refactor: Cycles: Make math and type headers work by themselves
Remove separate impl.h headers, shuffle around some code and add more
includes so that individual header files compile without errors.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:47 +01:00
Brecht Van Lommel
f53e13411b Refactor: Cycles: Use #pragma once
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:45 +01:00
Alaska
ba5d76e7e2 Fix: Shader: Align vector math node reflect mode with OSL
Align Cycles SVM and EEVEE's rendering of the vector math node
in reflect mode with OSL when the normal vector is 0,0,0.

This is done by using safe_normalize rather than normalize on the
normal vector. Which also fixes a NaN in the reflect mode in this
specific configuration.

Pull Request: https://projects.blender.org/blender/blender/pulls/125688
2024-08-02 11:20:57 +02:00
Xavier Hallade
a7171e0391 Cycles: oneAPI: Cleanup: Change packed_float3 to float3
float3 is already packed in oneAPI.
No functional change is expected.
2024-06-04 19:42:52 +02:00
Lukas Stockner
17f2cdd104 Cycles: Add thin film iridescence to Principled BSDF
This is an implementation of thin film iridescence in the Principled BSDF based on "A Practical Extension to Microfacet Theory for the Modeling of Varying Iridescence".

There are still several open topics that are left for future work:
- Currently, the thin film only affects dielectric Fresnel, not metallic. Properly specifying thin films on metals requires a proper conductive Fresnel term with complex IOR inputs, any attempt of trying to hack it into the F82 model we currently use for the Principled BSDF is fundamentally flawed. In the future, we'll add a node for proper conductive Fresnel, including thin films.
- The F0/F90 control is not very elegantly implemented right now. It fundamentally works, but enabling thin film while using a Specular Tint causes a jump in appearance since the models integrate it differently. Then again, thin film interference is a physical effect, so of course a non-physical tweak doesn't play nicely with it.
- The white point handling is currently quite crude. In short: The code computes XYZ values of the reflectance spectrum, but we'd need the XYZ values of the product of the reflectance spectrum and the neutral illuminant of the working color space. Currently, this is addressed by just dividing by the XYZ values of the illuminant, but it would be better to do a proper chromatic adaptation transform or to use the proper reference curves for the working space instead of the XYZ curves from the paper.

Pull Request: https://projects.blender.org/blender/blender/pulls/118477
2024-05-02 14:28:44 +02:00
Hoshinova
c78c6b0bdf Fix #119797: Noise Texture Precision Issues
The Perlin noise algorithms suffer from precision issues when a coordinate
is greater than about 250000.

To fix this the Perlin noise texture is repeated every 100000 on each axis.
This causes discontinuities every 100000, however at such scales this
usually shouldn't be noticeable.

Pull Request: https://projects.blender.org/blender/blender/pulls/119884
2024-03-29 16:12:23 +01:00
Anthony Roberts
3d5fa7698f Cycles: Add Windows ARM64 support
Ref #119126

Pull Request: https://projects.blender.org/blender/blender/pulls/117036
2024-03-06 16:14:34 +01:00
Thomas Dinges
30a22b92ca Cycles: Rename SSE4.1 kernel to SSE4.2
This commit updates all defines, compiler flags and cleans up some code for unused CPU capabilities.

There should be no functional change, unless it's run on a CPU that supports sse41 but not sse42. It will fallback to the SSE2 kernel in this case.

In preparation for the new SSE4.2 minimum in Blender 4.2.

Pull Request: https://projects.blender.org/blender/blender/pulls/118043
2024-02-09 17:25:58 +01:00
Alaska
9b3699db67 Fix: Cycles invalid normals in various situations
Fix issues related to NaN normals in some situations by trying
to detect when these cases might occur and just reverting back
to default normals.

As a side effect of these changes, OSL now behaves correctly
when given a non-normalized normal.

Pull Request: https://projects.blender.org/blender/blender/pulls/114960
2024-01-02 16:24:04 +01:00
Brian Savery (AMD)
e90ceefd18 Change packed_float3 to internal float3 for HIP
Remove a workaround for no built in vector type in HIP fixed in ROCM 5.1:
    https://github.com/ROCm-Developer-Tools/HIP/issues/706

Shouldn't have any functional change.  Mainly cleaning up the code.

Co-authored-by: arskell <arskell221@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/111122
2023-08-15 15:40:29 +02:00
Campbell Barton
c12994612b License headers: use SPDX-FileCopyrightText in intern/cycles 2023-06-14 16:53:23 +10:00
Hoshinova
144ad4d20b Nodes: add Fractal Voronoi Noise
Fractal noise is the idea of evaluating the same noise function multiple times with
different input parameters on each layer and then mixing the results. The individual
layers are usually called octaves.
The number of layers is controlled with a "Detail" slider.
The "Lacunarity" input controls a factor by which each successive layer gets scaled.

The existing Noise node already supports fractal noise. Now the Voronoi Noise node
supports it as well. The node also has a new "Normalize" property that ensures that
the output values stay in a [0.0, 1.0] range. That is except for the F2 feature where
in rare cases the output may be outside that range even with "Normalize" turned on.

How the individual octaves are mixed depends on the feature and output socket:
- F1/Smooth F1/F2:
  - Distance/Color output:
    The individual Distance/Color octaves are first multiplied by a factor of
    `Roughness ^ (#layers - 1.0)` then added together to create the final output.
  - Position output:
    Each Position octave gets linearly interpolated with the combined output of the
    previous octaves. The Roughness input serves as an interpolation factor with
    0.0 resutling in only using the combined output of the previous octaves and
    1.0 resulting in only using the current highest octave.
- Distance to Edge:
  - Distance output:
    The Distance octaves are mixed exactly like the Position octaves for F1/Smooth F1/F2.

It should be noted that Voronoi Noise is a relatively slow noise function, especially
at higher dimensions. Increasing the "Detail" makes it even slower. Therefore, when
optimizing a scene one should consider trying to use simpler noise functions instead
of Voronoi if the final result is close enough.

Pull Request: https://projects.blender.org/blender/blender/pulls/106827
2023-06-13 09:18:12 +02:00
William Leeson
6c03339e48 Cycles: reduce mesh memory usage by unflattening
To improve mesh upload speeds and reduce the size of the scene data which allows larger scenes to be rendered.

The meshes in Cycles are currently stored as flattened meshes, where each triangle is stored as a set of 3 vertices. Unflattening writes out the vertices in a list according to the index buffer. This uses a lot of memory and for current hardware does not provide a noticeable benefit. This change unflattens the mesh by directly using the meshes vertex and index buffers directly and skips the unflattening. This change allows for larger scenes and also a reduction in the sizes of the meshes. Further it results in a decrease the amount of time it takes to upload the data to a GPU. This is especially important for when multiple GPUs are used in a single machine.

Pull Request #105173
2023-02-27 10:39:19 +01:00
Brecht Van Lommel
e1b3d91127 Refactor: replace Cycles sse/avx types by vectorized float4/int4/float8/int8
The distinction existed for legacy reasons, to easily port of Embree
intersection code without affecting the main vector types. However we are now
using SIMD for these types as well, so no good reason to keep the distinction.

Also more consistently pass these vector types by value in inline functions.
Previously it was partially changed for functions used by Metal to avoid having
to add address space qualifiers, simple to do it everywhere.

Also removes function declarations for vector math headers, serves no real
purpose.

Differential Revision: https://developer.blender.org/D16146
2022-11-08 12:28:40 +01:00
Lukas Stockner
0c50f9c4aa Fix T98672: Noise texture shows incorrect behaviour for large scales
This was a floating point precision issue - or, to be more precise,
an issue with how Cycles split floats into the integer and fractional
parts for Perlin noise.
For coordinates below -2^24, the integer could be wrong, leading to
the fractional part being outside of 0-1 range, which breaks all sorts
of other things. 2^24 sounds like a lot, but due to how the detail
octaves work, it's not that hard to reach when combined with a large
scale.

Since this code is originally based on OSL, I checked if they changed
it in the meantime, and sure enough, there's a fix for it:
https://github.com/OpenImageIO/oiio/commit/5c9dc68391e9

So, this basically just ports over that change to Cycles.

The original code mentions being faster, but as pointed out in the
linked commit, the performance impact is actually irrelevant.

I also checked in a simple scene with eight Noise textures at
detail 15 (with >90% of render time being spent on the noise), and
the render time went from 13.06sec to 13.05sec. So, yeah, no issue.
2022-10-16 02:34:10 +02:00
Brecht Van Lommel
023eb2ea7c Cycles: more closely match some math and intersection operations in Embree
This helps with debugging, and gives a slightly closer match between CPU
and CUDA/HIP/Metal renders when it comes to ray tracing precision.
2022-07-25 13:27:40 +02:00
Andrii Symkin
c2a2f3553a Cycles: unify math functions names
This patch unifies the names of math functions for different data types and uses
overloading instead. The goal is to make it possible to swap out all the float3
variables containing RGB data with something else, with as few as possible
changes to the code. It's a requirement for future spectral rendering patches.

Differential Revision: https://developer.blender.org/D15276
2022-06-23 15:02:53 +02:00
Brecht Van Lommel
9cfc7967dd Cycles: use SPDX license headers
* Replace license text in headers with SPDX identifiers.
* Remove specific license info from outdated readme.txt, instead leave details
  to the source files.
* Add list of SPDX license identifiers used, and corresponding license texts.
* Update copyright dates while we're at it.

Ref D14069, T95597
2022-02-11 17:47:34 +01:00
Michael Jones
f613c4c095 Cycles: MetalRT support (kernel side)
This patch adds MetalRT support to Cycles kernel code. It is mostly additive in nature or confined to Metal-specific code, however there are a few areas where this interacts with other code:

- MetalRT closely follows the Optix implementation, and in some cases (notably handling of transforms) it makes sense to extend Optix special-casing to MetalRT. For these generalisations we now have `__KERNEL_GPU_RAYTRACING__` instead of `__KERNEL_OPTIX__`.
- MetalRT doesn't support primitive offsetting (as with `primitiveIndexOffset` in Optix), so we define and populate a new kernel texture, `__object_prim_offset`, containing per-object primitive / curve-segment offsets. This is referenced and applied in MetalRT intersection handlers.
- Two new BVH layout enum values have been added: `BVH_LAYOUT_METAL` and `BVH_LAYOUT_MULTI_METAL_EMBREE` for XPU mode). Some host-side enum case handling has been updated where it is trivial to do so.

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13353
2021-11-29 15:20:26 +00:00
Michael Jones
d19e35873f Cycles: several small fixes and additions for MSL
This patch contains many small leftover fixes and additions that are
required for Metal-enablement:

- Address space fixes and a few other small compile fixes
- Addition of missing functionality to the Metal adapter headers
- Addition of various scattered `__KERNEL_METAL__` blocks (e.g. for
  atomic support & maths functions)

Ref T92212

Differential Revision: https://developer.blender.org/D13263
2021-11-18 14:38:02 +01:00
Brecht Van Lommel
9937d5379c Cycles: add packed_float3 type for storage
Introduce a packed_float3 type for smaller storage that is exactly 3
floats, instead of 4. For computation float3 is still used since it can
use SIMD instructions.

Ref T92212

Differential Revision: https://developer.blender.org/D13243
2021-11-17 17:29:41 +01:00
William Leeson
7b1c5712f8 Cycles: Replace saturate with saturatef
saturate is depricated in favour of __saturatef this replaces saturate
with __saturatef on CUDA by createing a saturatef function which replaces
all instances of saturate and are hooked up to the correct function on all
platforms.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D13010
2021-10-27 14:05:46 +02:00
Brecht Van Lommel
fd25e883e2 Cycles: remove prefix from source code file names
Remove prefix of filenames that is the same as the folder name. This used
to help when #includes were using individual files, but now they are always
relative to the cycles root directory and so the prefixes are redundant.

For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.
2021-10-26 15:37:04 +02:00