This is preparation for #129495.
Currently, all of OSL is managed by the OSLShaderManager. This makes it so that the general OSL setup is handled by a general OSLManager, and both the OSLShaderManager and (in the future) the Camera can use it to manage their scripts.
Pull Request: https://projects.blender.org/blender/blender/pulls/135050
Check was misc-const-correctness, combined with readability-isolate-declaration
as suggested by the docs.
Temporarily clang-format "QualifierAlignment: Left" was used to get consistency
with the prevailing order of keywords.
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
* Use .empty() and .data()
* Use nullptr instead of 0
* No else after return
* Simple class member initialization
* Add override for virtual methods
* Include C++ instead of C headers
* Remove some unused includes
* Use default constructors
* Always use braces
* Consistent names in definition and declaration
* Change typedef to using
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
Cycles supports a feature known as "Adaptive Compile" which will
compile the GPU kernel at runtime with only the features neccesary
for the current scene.
This is primarily used for debugging purposes and is not advised for
general use, because it's not well tested/maintained and leads to
frequent kernel recompilation which can take a long time and interupt
your workflow.
This commits exposes the option to turn this feature on
for the HIP and Metal backends in the Cycles debug UI panel.
Pull Request: https://projects.blender.org/blender/blender/pulls/132459
Happens for renders from command line, when kernel specialization
thread is still working after the allocators on the Blender side
have been deinitialized.
Add an explicit deinitializaiton, which ensures all Cycles worker
and cache threads are finished before the allocators are deinitialized.
This should solve occasional crashes when running regression tests
for Metal or Metal-RT.
Pull Request: https://projects.blender.org/blender/blender/pulls/128239
Previously, GPU denoisers were ignoring settings about render
configuration and were using any available GPU. With these changes,
GPU denoisers will use the device selected in Blender Cycles
settings.
This allows any GPU denoiser to be used with CPU rendering.
Pull Request: https://projects.blender.org/blender/blender/pulls/118841
Currently the multi-input sockets are not exposed to the custom nodes
Python API. This makes some features cumbersome to implement if one
wants a node to process an arbitrary number of inputs.
One workaround is to make inputs duplicate themselves when a link is
created, but a proper multi-input would be easier to use for both
add-on developers and users.
This commit exposes a new `use_multi_input` boolean parameter when
creating a new node socket. This makes it possible to declare a
multi-input, while still leaving the existing `is_multi_input`
property read-only so that existing nodes cannot be made unstable.
The parameter is optional so existing scripts stay compatible. It also
raises an error when used on output sockets, since it makes no sense
for those to be multi-input.
The Custom Node Tree Python template was updated to reflect this
change by making one of the inputs of the custom node multi-input.
Pull Request: https://projects.blender.org/blender/blender/pulls/114474
This commit updates all defines, compiler flags and cleans up some code for unused CPU capabilities.
There should be no functional change, unless it's run on a CPU that supports sse41 but not sse42. It will fallback to the SSE2 kernel in this case.
In preparation for the new SSE4.2 minimum in Blender 4.2.
Pull Request: https://projects.blender.org/blender/blender/pulls/118043
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.
If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
This patch updates the experimental MetalRT code path to use new [curve primitives](https://developer.apple.com/videos/play/wwdc2023/10128/) which were recently added in macOS 14. This replaces the previous custom box intersection implementation, allowing the driver to better optimise curve acceleration structures for the GPU. On existing hardware, this can speed up MetalRT renders by up to 40% for scenes that use hair / curve primitives extensively.
The MetalRT option will only be available on macOS >= 14, and requires Xcode >= 15 to build (otherwise the option will be compiled out).
Authored by Marco Giordano, Michael Jones, and Jason Fielder
---
Before / after render times (M1 Max MacBook Pro, macOS 14 beta, MetalRT enabled):
```
Custom box intersection MetalRT curve primitives Speedup
fishy_cat 111.5 80.5 1.39
koro 114.4 86.7 1.32
sinosauropteryx 291.8 279.2 1.05
spring 142.3 142.2 1.00
victor 442.7 347.7 1.27
```
---
Pull Request: https://projects.blender.org/blender/blender/pulls/111795
There are a couple of functions that create rna pointers. For example
`RNA_main_pointer_create` and `RNA_pointer_create`. Currently, those
take an output parameter `r_ptr` as last argument. This patch changes
it so that the functions actually return a` PointerRNA` instead of using
the output parameters.
This has a few benefits:
* Output parameters should only be used when there is an actual benefit.
Otherwise, one should default to returning the value.
* It's simpler to use the API in the large majority of cases (note that this
patch reduces the number of lines of code).
* It allows the `PointerRNA` to be const on the call-site, if that is desired.
No performance regression has been measured in production files.
If one of these functions happened to be called in a hot loop where
there is a regression, the solution should be to use an inline function
there which allows the compiler to optimize it even better.
Pull Request: https://projects.blender.org/blender/blender/pulls/111976
HIP RT enables AMD hardware ray tracing on RDNA2 and above, and falls back to a
to shader implementation for older graphics cards. It offers an average 25%
sample rendering rate improvement in Cycles benchmarks, on a W6800 card.
The ray tracing feature functions are accessed through HIP RT SDK, available on
GPUOpen. HIP RT traversal functionality is pre-compiled in bitcode format and
shipped with the SDK.
This is not yet enabled as there are issues to be resolved, but landing the
code now makes testing and further changes easier.
Known limitations:
* Not working yet with current public AMD drivers.
* Visual artifact in motion blur.
* One of the buffers allocated for traversal has a static size. Allocating it
dynamically would reduce memory usage.
* This is for Windows only currently, no Linux support.
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Ref #105538
Updated Embree 4 library with GPU support is required for it to be
compiled - compatiblity with Embree 3 and Embree 4 without GPU support
is maintained.
Enabling hardware raytracing is an opt-in user setting for now.
Pull Request: https://projects.blender.org/blender/blender/pulls/106266
While keeping SSE2, SSE4.1 and AVX2. This does not affect hardware support, it
only slightly reduces performance for some older CPUs.
To reduce maintenance cost and improve compile times.
Differential Revision: https://developer.blender.org/D16978
This patch adds a new "Kernel Optimization Level" dropdown menu to control Metal kernel specialisation. Currently this defaults to "full" optimisation, on the assumption that the changes proposed in D16371 will address usability concerns around app responsiveness and shader cache housekeeping.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D16514
It is not really used from any of the sources, including the
standalone app. Since we are moving to a more backend-independent
drawing it makes sense to remove header which was specific to
how Blender integrates Cycles into viewport.
There is probably some cleanup in CMake files is possible, but
there is some inter-dependency with USD.
Differential Revision: https://developer.blender.org/D16681
This change fixes issues with viewport rendering when Metal
GPU backend is used for drawing. This is not a default build
configuration and requires the following tweaks:
- Enable WITH_METAL_BACKEND CMake option (set it to on)
- Use `--gpu-backend metal` command line arguments
It also helps using the `--factory-startup` command line
argument to ensure Eevee is not used (it is not ported and
will crash).
The root of the problem was in the use of glViewport().
It is replaced with the GPU_viewport_size_get_i() which
is supposed to be portable equivalent form the GPU module.
Without this change the viewport size is detected to be 0
which backfired in few places.
The rest of the changes were to make the code more robust
in the extreme conditions instead of asserting or crashing.
Simplified and streamlined GPU resources creation in the
display driver. It was a bit convoluted mix of creation of
the GPU resources and resizing them to the proper size. It
even seemed to be done in the reverse order. Now it is as
simple as "just ensure GPU resources are there for the
given texture or buffer size".
Also avoid division by zero in the tile manager.
Differential Revision: https://developer.blender.org/D16679
When either initializing with a non-constant value, or using the standard
[[ string widget = "null" ]] metadata. This can be used for inputs like
normals and texture coordinates, where you don't want to default to a
constant value.
In previous OSL versions the input value was automatically ignore when it
was left unchanged for such inputs. However that's no longer the case in
the latest version, breaking existing nodes. There is no good entirely
backwards compatible fix, but I believe the new behavior is better and will
keep most existing cases working.
Fix T102450: OSL node with normal input not working
This adds path guiding features into Cycles by integrating Intel's Open Path
Guiding Library. It can be enabled in the Sampling > Path Guiding panel in the
render properties.
This feature helps reduce noise in scenes where finding a path to light is
difficult for regular path tracing.
The current implementation supports guiding directional sampling decisions on
surfaces, when the material contains a least one diffuse component, and in
volumes with isotropic and anisotropic Henyey-Greenstein phase functions.
On surfaces, the guided sampling decision is proportional to the product of
the incident radiance and the normal-oriented cosine lobe and in volumes it
is proportional to the product of the incident radiance and the phase function.
The incident radiance field of a scene is learned and updated during rendering
after each per-frame rendering iteration/progression.
At the moment, path guiding is only supported by the CPU backend. Support for
GPU backends will be added in future versions of OpenPGL.
Ref T92571
Differential Revision: https://developer.blender.org/D15286
This patch adds a new Cycles device with similar functionality to the
existing GPU devices. Kernel compilation and runtime interaction happen
via oneAPI DPC++ compiler and SYCL API.
This implementation is primarly focusing on Intel® Arc™ GPUs and other
future Intel GPUs. The first supported drivers are 101.1660 on Windows
and 22.10.22597 on Linux.
The necessary tools for compilation are:
- A SYCL compiler such as oneAPI DPC++ compiler or
https://github.com/intel/llvm
- Intel® oneAPI Level Zero which is used for low level device queries:
https://github.com/oneapi-src/level-zero
- To optionally generate prebuilt graphics binaries: Intel® Graphics
Compiler All are included in Linux precompiled libraries on svn:
https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for
Windows precompiled binaries but for the graphics compiler, available
as "Intel® Graphics Offline Compiler for OpenCL™ Code" from
https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html,
for which path can be set as OCLOC_INSTALL_DIR.
Being based on the open SYCL standard, this implementation could also be
extended to run on other compatible non-Intel hardware in the future.
Reviewed By: sergey, brecht
Differential Revision: https://developer.blender.org/D15254
Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com>
Co-authored-by: Stefan Werner <stefan.werner@intel.com>
* Replace license text in headers with SPDX identifiers.
* Remove specific license info from outdated readme.txt, instead leave details
to the source files.
* Add list of SPDX license identifiers used, and corresponding license texts.
* Update copyright dates while we're at it.
Ref D14069, T95597
Consider temporary directory to be variant part of session configuration
which gets communicated to the tile manager on render reset.
This allows to be able to render with one temp directory, change the
directory, render again and have proper render result even with enabled
persistent data.
For the ease of access to the temp directory expose it via the render
engine API (engine.temp_directory).
Differential Revision: https://developer.blender.org/D13790
Enables the `bpy.ops.cycles.denoise_animation()` operator again and modifies it to support
temporal denoising with OptiX. This requires renders that were done with both the "Vector"
and "Denoising Data" passes.
Differential Revision: https://developer.blender.org/D11442
This patch adds the Metal host-side code:
- Add all core host-side Metal backend files (device_impl, queue, etc)
- Add MetalRT BVH setup files
- Integrate with Cycles device enumeration code
- Revive `path_source_replace_includes` in util/path (required for MSL compilation)
This patch also includes a couple of small kernel-side fixes:
- Add an implementation of `lgammaf` for Metal [Nemes, Gergő (2010), "New asymptotic expansion for the Gamma function", Archiv der Mathematik](https://users.renyi.hu/~gergonemes/)
- include "work_stealing.h" inside the Metal context class because it accesses state now
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D13423
Instead of printing debug flags listing various CPU and GPU settings that
may or may not be used, print when we are using them. This include CPU
kernel types, OptiX debugging and CUDA and HIP adaptive compilation. BVH
type was already printed.
Remove prefix of filenames that is the same as the folder name. This used
to help when #includes were using individual files, but now they are always
relative to the cycles root directory and so the prefixes are redundant.
For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.