Test kernel will now test functionalities related to kernel execution
with USM memory allocations instead of with SYCL buffers and accessors
as these aren't currently used in the backend.
This change removes CMake code for automatic calculation of the number
of offline device compiler instances, to hand over control to developers
instead as it incurs a rather large memory usage with around 8GB per
instance at peak.
Use SYCL_OFFLINE_COMPILER_PARALLEL_JOBS CMake variable to configure it.
This patch enables MNEE on macOS >= 13. There was an inefficiency in the calculation of spill requirements, fixed as of macOS 13. This patch also adds a temporary inlining workaround for a Metal compiler bug which causes `mnee_compute_constraint_derivatives` to behave incorrectly.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D16235
JIT compilation of oneAPI kernels now happens during load stage
and proper message gets shown in the GUI during compilation.
Also, this implementation skips kernels that aren't needed for
the used scene, reducing overall (re)compilation time.
It fixes SYCL runtime issues in Debug builds that were due to mixing
Release and Debug MSVC runtimes.
This commit also removes specific handling of dpcpp compiler executable
to simplify the CMake implementation. Using it like clang++ works and
clang++ executable is also available from Intel oneAPI DPC++ compiler in
case it doesn't.
This is a minimal set of changes, allowing a lot of cleanup that can
happen afterward as it allows sycl method and objects to be used outside
of kernel.cpp.
Reviewed By: brecht, sergey
Differential Revision: https://developer.blender.org/D15397
Changing volume parameters during rendering could cause a crash
when guiding was enabled. It was due to an unintialized state paramter
at the beginning of the path tracing process.
In addition guiding is disabled when dealing with almost delta volumes
(i.e., g close to 1.0 or -1.0).
This change speeds up the compilation at the cost of higher memory usage.
CMake implementation checks the amount of available memory to spawn a
reasonable number of parallel compiler jobs.
Previously it would bake viewed from above the surface. The new option can be
useful when the baked result is meant to be viewed from a fixed viewpoint or
with limited camera motion.
Some effort is made to give a continuous reflection on parts of the surface
invisible to the camera, but this is necessarily only a rough approximation.
Differential Revision: https://developer.blender.org/D15921
This patch optimises the Metal inlining policy. It gives a small speedup (2-3% on M1 Max) with no notable compilation slowdown vs what is already in master. Previously noted compilation slowdowns (as reported in T100102) were caused by forcing inlining for `ccl_device`, but we get better rendering perf by relying on compiler heuristics in these cases.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D16081
This adds path guiding features into Cycles by integrating Intel's Open Path
Guiding Library. It can be enabled in the Sampling > Path Guiding panel in the
render properties.
This feature helps reduce noise in scenes where finding a path to light is
difficult for regular path tracing.
The current implementation supports guiding directional sampling decisions on
surfaces, when the material contains a least one diffuse component, and in
volumes with isotropic and anisotropic Henyey-Greenstein phase functions.
On surfaces, the guided sampling decision is proportional to the product of
the incident radiance and the normal-oriented cosine lobe and in volumes it
is proportional to the product of the incident radiance and the phase function.
The incident radiance field of a scene is learned and updated during rendering
after each per-frame rendering iteration/progression.
At the moment, path guiding is only supported by the CPU backend. Support for
GPU backends will be added in future versions of OpenPGL.
Ref T92571
Differential Revision: https://developer.blender.org/D15286
Simplifies code overall to do it inside the eval function, most of the BSDFs
already compute the dot product.
The refactoring in bsdf_principled_hair_eval() was needed to avoid a HIP
compiler bug. Cause is unclear, just changing the implementation enough
is meant to sidestep it.
Ref T92571, D15286
* Return roughness and IOR for BSDF sampling
* Add functions to query IOR and label for given BSDF
* Default IOR to 1.0 instead of 0.0 for BSDFs that don't use it
* Ensure pdf >= 0.0 in case of numerical precision issues
Ref T92571, D15286
Windows drivers 101.3430 fix an important GUI-related crash and it's
best to prevent users from running into it.
Linux drivers weren't affected but still had relevant gpu binary
compatibility fixes, so it makes sense to keep the min-supported version
aligned across OSes.
This is already the case for most CMake usage.
Although some find modules are an exception to this, as they were
originally maintained externally they use some different conventions.
Also corrected bad indentation in: intern/cycles/CMakeLists.txt
Now explicitly including math.h first before #defining funcitons.
This avoids undefined behavior and improves compatibility with
different SYCL compilers and backends.
Cleans up the file structure to be more similar to that of the SVM
and also makes it possible to build kernels with OSL support, but
without having to include SVM support.
This patch was split from D15902.
Differential Revision: https://developer.blender.org/D15949
This has the advantage of being able to use information about the
existing OSL closures in various places without code duplication. In
addition, the setup code for all closures was moved to standalone
functions to avoid usage of virtual function calls in preparation for GPU
support.
This patch was split from D15902.
Differential Revision: https://developer.blender.org/D15917
The SVM attribute map is always generated and uses a simple
linear search to lookup by an opaque ID, so can reuse that for OSL
as well and simply use the attribute name hash as ID instead of
generating a unique value separately. This works for both object
and geometry attributes since the SVM attribute map already
stores both. Simplifies code somewhat and reduces memory
usage slightly.
This patch was split from D15902.
Differential Revision: https://developer.blender.org/D15918
The recent revert of Apple silicon inlining changes to avoid long compile times
worked on macOS 12, but in macOS 13 Beta it results in render errors. This may
be a compiler bug and perhaps get fixed in time, but try to be on the safe side
and ensure Blender 3.3.0 works regardless.
This brings part of the inlining back, which brings improved performance but
also longer compiler times again. Compile time is around 2min now, where the
previous full inlining was about 5-7min.
Patch by Michael Jones.
Differential Revision: https://developer.blender.org/D15897
This uses the same sample classification approach as used for PMJ,
because it turns out to also work equally well with Sobol-Burley.
This also implements a fallback (random classification) that should
work "okay" for other samplers, though there are no other samplers
at the moment.
Differential Revision: https://developer.blender.org/D15845
The multi-dimensional Sobol pattern required us to carefully use as low
dimensions as possible, as quality goes down in higher dimensions. Now that we
have two sampling patterns that are at least as good, there is no need to keep
it around and the implementation can be simplified.
Differential Revision: https://developer.blender.org/D15788
Fix two issues in the previous implementation:
* Only power-of-two prefixes were progressively stratified, not suffixes.
This resulted in unnecessarily increased noise when using non-power-of-two
sample counts.
* In order to try to get away with just a single sample pattern, the code
used a combination of sample index shuffling and Cranley-Patterson rotation.
Index shuffling is normally fine, but due to the sample patterns themselves
not being quite right (as described above) this actually resulted in
additional increased noise. Cranley-Patterson, on the other hand, always
increases noise with randomized (t,s) nets like PMJ02, and should be avoided
with these kinds of sequences.
Addressed with the following changes:
* Replace the sample pattern generation code with a much simpler algorithm
recently published in the paper "Stochastic Generation of (t, s) Sample
Sequences". This new implementation is easier to verify, produces fully
progressively stratified PMJ02, and is *far* faster than the previous code,
being O(N) in the number of samples generated.
* It keeps the sample index shuffling, which works correctly now due to the
improved sample patterns. But it now uses a newer high-quality hash instead
of the original Laine-Karras hash.
* The scrambling distance feature cannot (to my knowledge) be implemented with
any decorrelation strategy other than Cranley-Patterson, so Cranley-Patterson
is still used when that feature is enabled. But it is now disabled otherwise,
since it increases noise.
* In place of Cranley-Patterson, multiple independent patterns are generated
and randomly chosen for different pixels and dimensions as described in the
original PMJ paper. In this patch, the pattern selection is done via
hash-based shuffling to ensure there are no repeats within a single pixel
until all patterns have been used.
The combination of these fixes brings the quality of Cycles' PMJ sampler in
line with the previously submitted Sobol-Burley sampler in D15679. They are
essentially indistinguishable in terms of quality/noise, which is expected
since they are both randomized (0,2) sequences.
Differential Revision: https://developer.blender.org/D15746
sycl/L0 runtime reports compute-runtime version since Intel graphics
driver 101.3268 on Windows, when querying driver version from sycl.
Prior to this driver, it was 0. Now we can bump minimum requirement to
this one and filter-out devices returning 0.
Maniphest Tasks: T100648
This patch is a response to T92588 and is implemented
as a Function/Shader node.
This node has support for Float, Vector and Color data types.
For Vector it supports uniform and non-uniform mixing.
For Color it now has the option to remove factor clamping.
It replaces the Mix RGB for Shader and Geometry node trees.
As discussed in T96219, this patch converts existing nodes
in .blend files. The old node is still available in the
Python API but hidden from the menus.
Reviewed By: HooglyBoogly, JacquesLucke, simonthommes, brecht
Maniphest Tasks: T92588
Differential Revision: https://developer.blender.org/D13749