These have bugs in with the latest HIP-RT and HIP SDK, so just disable them
as we do not expect a fix in time, and rolling back would re-introduce other
bugs. As RDNA1 does not have hardware raytracing, it is also less important
to use HIP-RT.
Note that only RDNA2+ is officially supported by HIP, so these GPUs working
at all is somewhat lucky.
Fix#134979Fix#134978Fix#134975
Pull Request: https://projects.blender.org/blender/blender/pulls/135179
After the recent HIP SDK 6.3 update on Windows, the minimum GPU driver
required to use HIP in Cycles has increased.
This commit increases the required driver version listed in the UI and
adds a check to avoid showing HIP devices if they're below a certain
driver version number as they don't work properly.
Pull Request: https://projects.blender.org/blender/blender/pulls/134965
The current usage of software-based texture operations in
the oneAPI implementation puts additional register pressure on
the GPU compiler during register allocation. And it also creates
code that requires maintenance. This commit is intended to address
this situation by utilizing a recently productized SYCL bindless
texture API to enable HW-based texture operations using
Intel GPUs' hardware sampler.
This currently translates to 1-11% rendering speedups (scene-specific)
on my Arc A770 and Arc B580. At the moment, there are small
performance regressions with NanoVDB texture operations on Arc B580
and small performance regressions in shade surface MNEE and Raytrace
kernels on Arc A770, but they look recoverable and will be handled
in the future.
Pull Request: https://projects.blender.org/blender/blender/pulls/133457
Cycles has a sample offset feature allowing users to render X samples
in a single frame on one device, then the remaining Y samples later or
on a different device and combine them back together at the end.
However in most situations the result from using this method was
different, and usually lower quality than rendering all the samples in
one go.
This was because Cycles tunes it's random number sequence for the
number of samples being rendered. And the random number sequence was
being tuned for the wrong number of samples in the case that a user
was using the sample offset.
This commit fixes this issue by adding a "sample subset" feature.
The user specifies the total sample count being rendered across all
devices in the existing `Max Samples` parameter, then specifies per
device which subset of samples will be rendered (E.g. Render samples
0-1024 out of a 0-2048 range).
This commit also contains some additional clean up work
inside Cycles related to the area being changed.
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/132961
This is leading to "'super' object has no attribute '__del__'" errors
in some situations. As explained in #132476 this is only for future
proofing, so don't do it yet.
This reverts commit f301952b6a.
According to the Python API release notes, this is required now along
with super().__init__() which was already done.
Also fixes mistake in example in API docs.
Pull Request: https://projects.blender.org/blender/blender/pulls/132476
Cycles supports a feature known as "Adaptive Compile" which will
compile the GPU kernel at runtime with only the features neccesary
for the current scene.
This is primarily used for debugging purposes and is not advised for
general use, because it's not well tested/maintained and leads to
frequent kernel recompilation which can take a long time and interupt
your workflow.
This commits exposes the option to turn this feature on
for the HIP and Metal backends in the Cycles debug UI panel.
Pull Request: https://projects.blender.org/blender/blender/pulls/132459
This commit exposes the "Quality" option of the Open Image Denoiser
to the user for the denoise node in the compositor.
There are a few quality modes:
- High - Highest quality, but takes the longest to process.
- Balanced - Slightly lower quality, but usually halves
the processing time compared to High.
- Fast - Further reduce the quality, for a small increase in
speed over Balanced.
Along with that there is a `Follow Scene` option which will use the
quality set in the scene settings.
This allows users that have multiple denoise nodes
(E.g. For multi-pass denoising), to quickly switch all nodes between
different quality modes.
Performance (denoising time):
High: 13 seconds
Balanced: 6 seconds
Fast: 5 seconds
Test setup:
CPU: AMD Ryzen 9 5950X
Denoising a 3840x2160 render
---
Follow ups:
Ideally the "Denoise Nodes" UI panel in the render properties panel
would be hidden if the compositor setup does not contain any
denoise nodes.
However implementing this efficiently can be difficult and so it was
decided this task was outside the scope of this commit.
Pull Request: https://projects.blender.org/blender/blender/pulls/130252
This commit introduces proper handling of ROCm 5 and ROCm 6 runtimes on
Linux, based on the version of the ROCm compiler used at build time.
Previously, HIPEW (the HIP equivalent of Cuda Wrangler) defaulted to
loading the ROCm 5 runtime. If ROCm 5 was unavailable, it would attempt
to load ROCm 6. However, ROCm 6 introduces changes in certain
structures and functions that are not backward compatible, leading to
potential issues when kernels compiled with the ROCm 6 compiler are
executed on the ROCm 5 runtime.
### Summary of Changes:
**Separation of Structures and Functions:**
Structures and functions are now separated into hipew5 and hipew6 to
accommodate the differences between ROCm versions.
**Build-Time Version Detection:**
The ROCm version is determined during build time, and the corresponding
hipew5 or hipew6 is included accordingly.
**Runtime Default to ROCm 6:**
By default, HIPEW now loads the ROCm 6 runtime and
includes hipew6 (Linux only).
**JIT Compilation Behavior:**
Since ROCm 6 is the default version, JIT compilation is supported only
when the ROCm 6 compiler is detected at runtime.
**HIP-RT Update:**
HIP-RT has been updated to load the ROCm 6 runtime by default.
These changes ensure compatibility and stability when switching
between ROCm versions, avoiding issues caused by runtime
and compiler mismatches.
Co-authored-by: Alaska <alaskayou01@gmail.com>
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/130153
The `poll()` function for `CYCLES_PT_context_material` was using the
legacy `GPENCIL` identifier as opposed to `GREASEPENCIL`. This caused
duplicated material templates to show in the material tab.
Pull Request: https://projects.blender.org/blender/blender/pulls/130962
This commit removes support for Vega GPUs from the AMD HIP backend of
Cycles. This is being done as:
- AMD no longer provides official support for Vega GPUs in their
ROCm software.
- Vega GPUs have rendering artifacts on all supported platforms,
and as a result of the reduction of support from AMD, are unlikely
to be fixed. Rendering artifacts include.
- The incorrect shading of volumes (Windows and Linux)
- Missing intersections on many meshes with HIPRT
- Crashing rendering subsurface scattering materials (Linux)
- And more.
Pull Request: https://projects.blender.org/blender/blender/pulls/129523
For now, PointerRNA is made non-trivial by giving explicit default
values to its members.
Besides of BPY python binding code, the change is relatively trivial.
The main change (besides the creation/deletion part) is the replacement
of `memset` by zero-initialized assignment (using `{}`).
makesrna required changes are quite small too.
The big piece of this PR is the refactor of the BPY RNA code.
It essentially brings back allocation and deletion of the BPy_StructRNA,
BPy_Pointer etc. python objects into 'cannonical process', using `__new__`,
and `__init__` callbacks (and there matching CAPI functions).
Existing code was doing very low-level manipulations to create these
data, which is not really easy to understand, and AFAICT incompatible
with handling C++ data that needs to be constructed and destructed.
Unfortunately, similar change in destruction code (using `__del__` and
matching `tp_finalize` CAPI callback) is not possible, because of technical
low-level implementation details in CPython (see [1] for details).
`std::optional` pointer management is used to encapsulate PointerRNA
data. This allows to keep control on _when_ actual RNA creation is done,
and to have a safe destruction in `tp_dealloc` callbacks.
Note that a critical change in Blender's Python API will be that classes
inherinting from `bpy_struct` etc. will now have to properly call the
base class `__new__` and/or `__init__`if they define them.
Implements #122431.
[1] https://discuss.python.org/t/cpython-usage-of-tp-finalize-in-c-defined-static-types-with-no-custom-tp-dealloc/64100
This adds feature parity with Cycles regarding light and shadow liking.
Technically, this extends the GBuffer header to 32 bits, and uses
the top bits to store the object's light set membership index.
The same index is also added to `ObjectInfo` in place of padding bytes.
For shadow linking, the shadow blocker sets bitmask is stored per
tilemap. It is then used during the GPU culling phase to cull objects
that do not belong to the shadow's sets.
Co-authored-by: Clément Foucault <foucault.clem@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/127514
This change switches Cycles to an opensource HIP-RT library which
implements hardware ray-tracing. This library is now used on
both Windows and Linux. While there should be no noticeable changes
on Windows, on Linux this adds support for hardware ray-tracing on
AMD GPUs.
The majority of the change is typical platform code to add new
library to the dependency builder, and a change in the way how
ahead-of-time (AoT) kernels are compiled. There are changes in
Cycles itself, but they are rather straightforward: some APIs
changed in the opensource version of the library.
There are a couple of extra files which are needed for this to
work: hiprt02003_6.1_amd.hipfb and oro_compiled_kernels.hipfb.
There are some assumptions in the HIP-RT library about how they
are available. Currently they follow the same rule as AoT
kernels for oneAPI:
- On Windows they are next to blender.exe
- On Linux they are in the lib/ folder
Performance comparison on Ubuntu 22.04.5:
```
GPU: AMD Radeon PRO W7800
Driver: amdgpu-install_6.1.60103-1_all.deb
main hip-rt
attic 0.1414s 0.0932s
barbershop_interior 0.1563s 0.1258s
bistro 0.2134s 0.1597s
bmw27 0.0119s 0.0099s
classroom 0.1006s 0.0803s
fishy_cat 0.0248s 0.0178s
junkshop 0.0916s 0.0713s
koro 0.0589s 0.0720s
monster 0.0435s 0.0385s
pabellon 0.0543s 0.0391s
sponza 0.0223s 0.0180s
spring 0.1026s 1.5145s
victor 0.1901s 0.1239s
wdas_cloud 0.1153s 0.1125s
```
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Co-authored-by: Ray Molenkamp <github@lazydodo.com>
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/121050
IGC 1.0.17384, ocloc 24.31.30508, which:
- add support for Battlemage and Lunar Lake GPUs
- recover from recent performance regression on Linux
- allow to drop older work-around
(9d5164d472) and need for a patched
version on Windows
- ocloc now needs "dg2,mtl" naming for fat binaries.
opencl-clang patches don't get applied anymore by igc build scripts
when llvm is not a git repository, hence I could also drop we can drop
current patch disabling patching.
I've only slightly pushed min-driver-version updates after carefull
testing, instead of jumping to the same version as ocloc as we use to.
Pull Request: https://projects.blender.org/blender/blender/pulls/127251
This new version of the graphics compiler solves a performance
regression on Arc, adds support for Battlemage and Lunar Lake GPUs, and
allows to drop older patch to build fat binaries with broad
compatibility.
This latter change requires using -device dg2,mtl naming instead of
passing architecture ids.
Pull Request: https://projects.blender.org/blender/blender/pulls/127371
Add a `use_gpu()` function to the UI code for Cycles.
This is done to clean up some of the other code (`use_{backend}()`)
and to help isolate `use_multi_device` and `show_device_active` from
their context making them more robust to changes made in other parts
of the UI code.
Pull Request: https://projects.blender.org/blender/blender/pulls/124134
Refactor the call to `_cycles.available_devices` into it's own function
and update `self.device` at the same time to avoid mis-matches between
`_cycles.available_devices` and `self.device`.
Pull Request: https://projects.blender.org/blender/blender/pulls/124079
This resolves two issues:
1. On macOS the GPU Compute device would be disabled by default unless
the user opens user preferences. This is unexpected behaviour ever
since 09ba1486f8
2. Fixes incorrect automatic denoiser display settings and errors in
terminal related to the denoising UI on macOS if the user hasn't opened
user preferences.
Pull Request: https://projects.blender.org/blender/blender/pulls/123911
Fixes an issue where Blender would crash if the OptiX denoiser was
selected, but an unsupported GPU device (E.g. Intel GPU) was
selected in preferences.
This crash would occur because Cycles uses the device in preferences
to setup the denoiser, and there was no check stopping an unsupported
GPU from being used to try and setup and run the denoiser.
Pull Request: https://projects.blender.org/blender/blender/pulls/124001
This is because with the addition of new features to Cycles, these GPUs
experienced significant performance regressions and bugs, all stemming
from bugs in the Metal GPU driver/compiler. The only reasonable way to
work around these issues was to disable parts of Cycles code on
these GPUs to avoid the driver/compiler bugs.
This resulted in increased development time maintaining these platforms
while being unable to deliver feature parity with other
GPU backends.
It has been decided that this development time is better spent
maintaining platforms that are still actively maintained by
hardware/software vendors, and so AMD and Intel GPU support will be
removed from the Metal backend for Cycles.
Pull Request: https://projects.blender.org/blender/blender/pulls/123551
Extract
- Cycles denoiser enum.
- Extensions user preferences UI.
- Node operator poll message from new node function.
Improve
- Split "(Enabled|Disabled) on startup, overriding the preference."
into two messages.
Disambiguate
- "Add" when describing the action of adding something should use the
Operator context.
- "Dimensions", in noise textures.
- "Transform" as a noun, the matrix transform type of Geometry Nodes,
as opposed to the verb to move things in space.
- "Parent" as a noun or verb (the parent of an object, to parent an
object to another).
Some issues reported by Satoshi Yamasaki, deathblood, and Gabriel Gazzán.
Pull Request: https://projects.blender.org/blender/blender/pulls/122969
This patch implements blue-noise dithered sampling as described by Nathan Vegdahl (https://psychopath.io/post/2022_07_24_owen_scrambling_based_dithered_blue_noise_sampling), which in turn is based on "Screen-Space Blue-Noise Diffusion of Monte Carlo Sampling Error via Hierarchical Ordering of Pixels"(https://repository.kaust.edu.sa/items/1269ae24-2596-400b-a839-e54486033a93).
The basic idea is simple: Instead of generating independent sequences for each pixel by scrambling them, we use a single sequence for the entire image, with each pixel getting one chunk of the samples. The ordering across pixels is determined by hierarchical scrambling of the pixel's position along a space-filling curve, which ends up being pretty much the same operation as already used for the underlying sequence.
This results in a more high-frequency noise distribution, which appears smoother despite not being less noisy overall.
The main limitation at the moment is that the improvement is only clear if the full sample amount is used per pixel, so interactive preview rendering and adaptive sampling will not receive the benefit. One exception to this is that when using the new "Automatic" setting, the first sample in interactive rendering will also be blue-noise-distributed.
The sampling mode option is now exposed in the UI, with the three options being Blue Noise (the new mode), Classic (the previous Tabulated Sobol method) and the new default, Automatic (blue noise, with the additional property of ensuring the first sample is also blue-noise-distributed in interactive rendering). When debug mode is enabled, additional options appear, such as Sobol-Burley.
Note that the scrambling distance option is not compatible with the blue-noise pattern.
Pull Request: https://projects.blender.org/blender/blender/pulls/118479
Previously, GPU denoisers were ignoring settings about render
configuration and were using any available GPU. With these changes,
GPU denoisers will use the device selected in Blender Cycles
settings.
This allows any GPU denoiser to be used with CPU rendering.
Pull Request: https://projects.blender.org/blender/blender/pulls/118841
Happens when opening a file saved file with preview paused.
This fix covers the typical use-case when the property is modified
from the space it comes from.
Pull Request: https://projects.blender.org/blender/blender/pulls/122502
This enables scenes with all textures not fitting in GPU
memory to finally render. For scenes that are fitting,
no functional change or performance change is expected.
Pull Request: https://projects.blender.org/blender/blender/pulls/122385
This unify Cycles and EEVEE setting.
We always copy the Cycles setting in versionning
except if the first scene is using EEVEE as renderer.
Note that this currently breaks importers
addons who will try to `cycles.cast_shadow`property
on the light.
Pull Request: https://projects.blender.org/blender/blender/pulls/121804