sycl/L0 runtime reports compute-runtime version since Intel graphics
driver 101.3268 on Windows, when querying driver version from sycl.
Prior to this driver, it was 0. Now we can bump minimum requirement to
this one and filter-out devices returning 0.
Maniphest Tasks: T100648
This patch is a response to T92588 and is implemented
as a Function/Shader node.
This node has support for Float, Vector and Color data types.
For Vector it supports uniform and non-uniform mixing.
For Color it now has the option to remove factor clamping.
It replaces the Mix RGB for Shader and Geometry node trees.
As discussed in T96219, this patch converts existing nodes
in .blend files. The old node is still available in the
Python API but hidden from the menus.
Reviewed By: HooglyBoogly, JacquesLucke, simonthommes, brecht
Maniphest Tasks: T92588
Differential Revision: https://developer.blender.org/D13749
The calculation was revised to address two issues:
* Discontinuities occurring when detail was a non-integer greater than 2.
* Levels of detail in the interval [0,1) repeating the levels of detail in
the interval [1,2).
This fixes Cycles, Eevee and geometry nodes.
Differential Revision: https://developer.blender.org/D15785
Fix typo in blender_release.cmake, and ensure that "make release" still works
when ocloc is not available. While a fatal error is useful for debugging, the
current convention is to disable features, especially in cases like this where
there is no simple way to make the feature work.
Differential Revision: https://developer.blender.org/D15774
Based on the paper "Practical Hash-based Owen Scrambling" by Brent Burley,
2020, Journal of Computer Graphics Techniques.
It is distinct from the existing Sobol sampler in two important ways:
* It is Owen scrambled, which gives it a much better convergence rate in many
situations.
* It uses padding for higher dimensions, rather than using higher Sobol
dimensions directly. In practice this is advantagous because high-dimensional
Sobol sequences have holes in their sampling patterns that don't resolve
until an unreasonable number of samples are taken. (See Burley's paper for
details.)
The pattern reduces noise in some benchmark scenes, however it is also slower,
particularly on the CPU. So for now Progressive Multi-Jittered sampling remains
the default.
Differential Revision: https://developer.blender.org/D15679
This gave a 1.1x speedup, however also leads to very long compile times
that make it seems like Blender has stopped working.
This can be brought back in the future behind an option that users can
explicitly enabled.
Fix T100102
Ref D14923, D14763, T92212
* Store compact ray differentials in ShaderData and compute full differentials
on demand. This reduces register pressure on the GPU.
* Remove BSDF differential code that was effectively doing nothing as the
differential orientation was discarded when making it compact.
This gives a 1-5% speedup with RTX A6000 + OptiX in our benchmarks, with the
bigger speedups in simpler scenes.
Renders appear to be identical except for the Both displacement option that
does both displacement and bump.
Differential Revision: https://developer.blender.org/D15677
Detect cases where a ray-intersection would miss the current triangle, which if
the intersection is strictly watertight, implies that a neighboring triangle would
incorrectly be hit instead.
When that is detected, apply a ray-offset. The idea being that we only want to
introduce potential error from ray offsets if we really need to.
This work for BVH2 and Embree, as we are able to match the ray-interesction
bit-for-bit, though doing so for Embree requires ugly hacks. Tiny differences
like fused-multiply-add or dot product intrinstics in matrix inversion and ray
intersection needed to be matched exactly, so this is fragile.
Unfortunately we're not able to do the same for OptiX or MetalRT, since those
implementations are unknown (and possibly impossible to match as hardware
instructions). Still artifacts are much reduced, though not eliminated.
Ref T97259
Differential Revision: https://developer.blender.org/D15559
These replace float3 and packed_float3 in various places in the kernel where a
spectral color representation will be used in the future. That representation
will require more than 3 channels and conversion to from/RGB. The kernel code
was refactored to remove the assumption that Spectrum and RGB colors are the
same thing.
There are no functional changes, Spectrum is still a float3 and the conversion
functions are no-ops.
Differential Revision: https://developer.blender.org/D15535
Now all the same ones are available on CPU and GPU, which was previously not
possible due to lack of operator overloadng in OpenCL. Print functions are
no-ops on some GPUs.
Ref D15535
zebin format is critical for the compatibility of AoT graphics binaries
across driver versions. It was previously disabled on Linux due to
runtime issues that are now fixed in
https://github.com/intel/compute-runtime/releases/tag/22.31.23852.
The minimum supported driver version isn't bumped to this one yet as
current codebase with current IGC compiler does actually run fine on
earlier drivers and is not running into these issues anymore.
It should consistently use the Cycles pirmitive ID for self intersection detection,
not the one from the OptiX or Embree acceleration structure.
Differential Revision: https://developer.blender.org/D15632
Assume that all faces using the smae material form a closed mesh, so that
joining meshes gives the same result as separate meshes.
It does mean that using different materials on different sides of one
closed mesh do not work, but the meaning of that is poorly defined anyway
if there is a volume interior.
Recently, performance with oneAPI have regressed due some recent
changes in Blender itself. This commit's changes is resolving this
and also improve compilation time for oneAPI backend first
execution (or Blender compilation time in case of AoT).
Regression have appeared after 5152c7c152 and not related to the
changes itself, but increase of kernels complexity introduced with
it. Changes in this commit is marking some Blender functions as
noinlined for oneAPI backend, which helps GPU compiler to deal with
this complexity without any negative side-effects on performance.
* OneAPI: remove separate float3 definition
* OneAPI: disable operator[] to match other GPUs
* OneAPI: make int3 compact to match other GPUs
* Use #pragma once
* Add __KERNEL_NATIVE_VECTOR_TYPES__ to simplify checks
* Remove unused vector3
Simplifies intersection code a little and slightly improves precision regarding
self intersection.
The parametric texture coordinate in shader nodes is still the same as before
for compatibility.
The number of Execution Units and resident "threads" (simd width * threads
per EUs) are now exposed and used to select the number of states using
a simplified heuristic.
float8 is a reserved type in Metal, but is not implemented. So rename to
float8_t for now.
Also move back intersection handlers to kernel.metal, they can't be in the
class that encapsulates the other Metal kernel functions.
This was tested in some places to check if code was being compiled for the
CPU, however this is only defined in the kernel. Checking __KERNEL_GPU__
always works.