In volume segment, the minimal angle formed by the emitter bounding cone
axis and the vector pointing from the cluster centroid to any point on
the ray is computed via `dot(bcone.axis, point_to_centroid)`, see Fig.8.
in paper.
For distant light this angle is 0, but due to numerical issues this is
not always true. Therefore explicitly assign `-bcone.axis` to
`point_to_centroid` in this case.
Pull Request: https://projects.blender.org/blender/blender/pulls/129489
Looks like some recent changes in the driver broke an assumption
the OptiX denoiser code in Cycles made about being able to set it up
with a different input size than later used to invoke it, which caused
broken output on older GPU architectures. This commit fixes that by
ensuring the input image size passed to `optixDenoiserSetup` matches
that passed to `optixDenoiserInvoke`, even when no tiling is used
(which is the common case).
Pull Request: https://projects.blender.org/blender/blender/pulls/129398
This PR fixes a latent issue arising from invalid use of `accept_any_intersection(true)` when performing SSS ray-stepping with MetalRT. The comment incorrectly states that "we can optimize and accept the first hit", but to guarantee correct behaviour in future we need to request the closest hit.
Draine phase function sampling internally use Henyey-Greenstein and
Rayleigh sampling for degenerated cases, but the sampling pattern was
different between Draine and Rayleigh. The commit effectively replace
`rand` with `1 - rand` in Rayleigh sampling.
Pull Request: https://projects.blender.org/blender/blender/pulls/129261
When `g == 0`, the Draine phase function from
https://doi.org/10.1051/0004-6361/202142437 simplifies to
\[\Phi(\theta)=\frac{3}{4\pi(3+\alpha)}(1+\alpha\cos^2\theta).\]
Similar as Rayleigh sampling in https://doi.org/10.1364/JOSAA.28.002436,
The solution to the CDF of the marginal density function is
\[\cos^3\theta+a\cos\theta+b=0,\]
with
\[a=\frac{3}{\alpha},\quad b=\frac{3+\alpha}{\alpha}(2\xi_1-1),\]
which has only one real root since \(\alpha > 0\),
resulting in the sample technique
\[\cos\theta=u-\frac{1}{\alpha u}.\]
Pull Request: https://projects.blender.org/blender/blender/pulls/129259
Running in a headless weston session was asserting, then crashing,
causing WITH_UI_TESTS to fail.
Resolve by accounting for a wayland sessions without a seat.
Fix the failing rendering of volumes on Windows with HIP SDK 6.1
by reducing the optimization level.
There should be no functional or performance difference for the average
user as the Blender foundation currently does not use HIP SDK 6.1
on Windows. This change is primarily to fix issues for community members
building Blender locally.
Pull Request: https://projects.blender.org/blender/blender/pulls/128836
Previously, in case of a failure during BVH transfer, when running out
of memory for example, we could get an error such as "BVH failed to
migrate to the GPU due to Embree library error (no error)", because
embree error status was actually reset before being queried.
This commit fixes its propagation.
Pull Request: https://projects.blender.org/blender/blender/pulls/129022
Multiple threads can access the same device queue from different
threads. This could happen when doing a cycles preview render, baking
eevee volume probes or generating material previews.
This PR adds a mutex around access to the device queues.
Detected when researching #128608
Pull Request: https://projects.blender.org/blender/blender/pulls/128974
This probably should always have been the value used, really.
Now, instead of reporting `Qualcomm Technologies Inc`, it reports the more informative `Snapdragon(R) X Elite - X1E78100 - Qualcomm(R) Oryon(TM) CPU` on a Thinkpad T14s Gen6 device.
Pull Request: https://projects.blender.org/blender/blender/pulls/128808
Dropping files could crash ~10% of the time on some systems,
although I wasn't able to reproduce the error.
The ownership of GWL_Seat::data_offer_dnd wasn't handled correctly,
where the value could be handled by both wl_data_device_listener::leave
& drop callbacks.
Resolve by ensuring the data-offer is handled by the drop callback.
getpwuid for accessing home wasn't used when looking up the path
for older Blender versions. There is no reason for the code-paths
to differ. Use a shared utility function to access home.
This was already done in GHOST, but not BKE_appdir_folder_home.
Also null check the return value from getpwuid() as it's not
guaranteed to be non-null.
Fixes a issue where the Principled BSDF would render incorrectly if
`__SUBSURFACE__` is off. Which is common when using adaptive kernel
compilation (a unsupported Cycles feature).
Pull Request: https://projects.blender.org/blender/blender/pulls/128003
Turns out it is possible to have code to pick up wrong class
when defining a friend:
```
intern\cycles\device/memory.h(255): warning C4099: 'GPUDevice': type name first seen using 'struct' now seen using 'class'
source\blender\gpu\GPU_platform.hh(69): note: see declaration of 'GPUDevice'
```
Now made it so the classes have forward declaration in the CCL
namespace, avoiding possible conflict with the classes with the
same name in the global namespace.
Pull Request: https://projects.blender.org/blender/blender/pulls/128485
This adds feature parity with Cycles regarding light and shadow liking.
Technically, this extends the GBuffer header to 32 bits, and uses
the top bits to store the object's light set membership index.
The same index is also added to `ObjectInfo` in place of padding bytes.
For shadow linking, the shadow blocker sets bitmask is stored per
tilemap. It is then used during the GPU culling phase to cull objects
that do not belong to the shadow's sets.
Co-authored-by: Clément Foucault <foucault.clem@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/127514
Previously, Cycles only supported the Henyey-Greenstein phase function for volume scattering.
While HG is flexible and works for a wide range of effects, sometimes a more physically accurate
phase function may be needed for realism.
Therefore, this adds three new phase functions to the code:
Rayleigh: For particles with a size below the wavelength of light, mostly athmospheric scattering.
Fournier-Forand: For realistic underwater scattering.
Draine: Fairly specific on its own (mostly for interstellar dust), but useful for the next entry.
Mie: Approximates Mie scattering in water droplets using a mix of Draine and HG phase functions.
These phase functions can be combined using Mix nodes as usual.
Co-authored-by: Lukas Stockner <lukas@lukasstockner.de>
Pull Request: https://projects.blender.org/blender/blender/pulls/123532
This PR introduces support for the extension `VK_KHR_fragment_shader_barycentric`,
and includes a few miscellaneous improvements related to it.
1. Add support for `VK_KHR_fragment_shader_barycentric`, if the physical device
supports it. Otherwise, gpu_BaryCoord is generated through an injected geom
shader, like it was previously.
2. Simplify the logic of checking has_geometry_stage in vert shader.
3. Fix a potential issue of location mismatch in an injected geom shader.
Related to #127687Resolves#126228
Pull Request: https://projects.blender.org/blender/blender/pulls/127995
This enables most of the GPU compiler's optimizations while -ffast-math
isn't set at DPC++ level.
It brings an overall 1% speedup and currently doesn't change the unit
tests pass rate.
This enables three additional math optimizations:
-ffp-contract=fast (enables FMA generation)
-freciprocal-math (enables x/y -> x*(1/y))
-fassociative-math (enables e.g. a*b + c*b -> (a+c)*b)
These are used on Windows and HIP anyways, so our code can't expect exact IEEE
semantics in any case.
The only difference between the new set and -ffast-math is that we don't use
-ffinite-math-only since this causes issues with the BVH (see ce1f2e271d) and
breaks e.g. isnan.
This causes a ~1.5% speedup in my very quick test, but might be higher for some
more math-intensive cases.
Pull Request: https://projects.blender.org/blender/blender/pulls/128342
The only difference between Windows+Clang and the others is a prefix, so use
some CMake logic to just prepend that to all flags instead of duplicating them.
Pull Request: https://projects.blender.org/blender/blender/pulls/128342
Since the introduction of vulkan loader in vulkan (not Blender) the
molten vk extension always leads to a failed registration. This
extension is only available when using the loader. Blender doesn't use
the vulkan loader so we should remove it.
Pull Request: https://projects.blender.org/blender/blender/pulls/128354
Maintenance 4 is an optional extension, however it was added to required
extension when the extension is availble. This would also be checked
when adding it as an optional extension.
Detected when reviewing !127995
Pull Request: https://projects.blender.org/blender/blender/pulls/128353
Remove the indirection previously used for the topology refiner
to separate C and C++ code. Instead retrieve the base level in
calling code and call opensubdiv API functions directly. This
avoids copying arrays of mesh indices and should reduce
function call overhead since index retrieval can now be inlined.
It also lets us remove a lot of boilerplate shim code.
The downside is increased need for WITH_OPENSUBDIV defines
in various parts of blenkernel, but I think that is required to avoid
the previous indirection and have the kernel deal with OpenSubdiv
more directly.
Pull Request: https://projects.blender.org/blender/blender/pulls/120825
Happens for renders from command line, when kernel specialization
thread is still working after the allocators on the Blender side
have been deinitialized.
Add an explicit deinitializaiton, which ensures all Cycles worker
and cache threads are finished before the allocators are deinitialized.
This should solve occasional crashes when running regression tests
for Metal or Metal-RT.
Pull Request: https://projects.blender.org/blender/blender/pulls/128239
This improve the API in multiple aspects:
* No need for an additional `lookup` call to get the current attribute. This
would internally iterate over all attributes again. This leads to O(n^2)
behavior. Note that there are still other reasons for O(n^2) behavior when
processing attributes (where n is the number of attributes).
* Remove the need to return a value from the iteration code to indicate that the
iteration should continue. This is now the default behavior. The iteration can
still be stopped by calling `iter.stop()`.
* Easier access to `is_builtin` property.
* Iterator callback only has a single parameter instead of two (of which one is
sometimes unused).
Pull Request: https://projects.blender.org/blender/blender/pulls/128128
The issue was that cycles did not take ownership of the entire `VolumeGridData`.
The `VolumeTreeAccessToken` that was used already only makes sure that the
grid is not unloaded when it was read from disk for example. It does not extend
the lifetime of the `VolumeGridData` as a whole.
Previous fix (b17734598d) tried to fix this, but it seems that
depending on CPU and video width, ffmpeg might need at least 64
byte alignment, even if CPU we're running on is only AVX2 (32 byte
alignment). That is because some other parts of ffmpeg code
statically pick 64 or 32 byte alignment internally, depending on whether
AVX512 support is even compiled in (even if CPU might not have it).
I have checked whether this does not negatively affect a platform where
SIMD alignment is always 16 (Mac), and it does not, everything works as
expected.
Pull Request: https://projects.blender.org/blender/blender/pulls/128107