This new version of the graphics compiler improves performance
for the majority of supported Intel devices and adds support
for upcoming Intel hardware. Such an upgrade also requires
an increase in the minimal supported driver version on Windows,
which is why these changes are combined together with
the ocloc upgrade.
Previously set minimal version 101.6557 was increased to 101.8132.
Pull Request: https://projects.blender.org/blender/blender/pulls/147460
These changes are already in place de facto, as the check for
the minimal driver version in the Blender code was already
updated. This commit simply adds the UI update, which I
initially forgot to perform.
Pull Request: https://projects.blender.org/blender/blender/pulls/147459
In very old OpenEXR version there was a limit on the channel names, which meant
the pass names needed to be short like "DiffDir". Change them to be longer like
"Diffuse Direct".
* This breaks forward compatibility. Old Blender version will lose links when
reading compositing node setups with such passes, but #146571 will fix it
for 4.5 LTS.
* Add-ons, scripts and compositing setups in other applications that rely on these
names will also break.
* The find_by_type function for render passes has also been removed, as this was
already deprecated and replaced by find_by_name.
* We assume spaces in the name are ok, since we have passes with them already
and have not seen reports about compatibility issues.
Pull Request: https://projects.blender.org/blender/blender/pulls/142731
* Add adaptive subdivision properties natively on the subdivision surface
modifier, so that other engines may reuse them in the future. This also
resolve issues where they would not get copied properly.
* Remove "Feature Set" option in the render properties, this was the last
experimental one.
* Add space choice between "Pixel" and "Object". The latter is new and can
be used for object space dicing that works with instances. Instead of
a pixel size an object space edge length is specified.
* Add object space subdivision test.
Ref #53901
Pull Request: https://projects.blender.org/blender/blender/pulls/146723
This implements a basic render time pass,
using HW-based counters to minimize render time impact.
x86-64 uses the TSC instruction for timing, while ARM64 uses the cntvct_el0
register. In theory TSC is not always super reliable (e.g. old CPUs had it tied
to their current clock rate), but for somewhat recent CPU models it should
be fine. If neither is available, it falls back to `std::chrono::steady_clock`,
which should still be very fast.
The output is in milliseconds of CPU-time per pixel.
Pull Request: https://projects.blender.org/blender/blender/pulls/125933
Always enforce tiling of some size, up to the 8k tile size.
Rendering very big images without tiles have a lot of challenges.
While solving those challenges is not impossible, it does not seem to
be a practical time investment.
The internals of the way how Cycles work, including Cycles Standalone
is not affected by this change.
A possible downside is that path guiding might not work exactly how one
would expect it to due to lack of information sharing across multiple
tiles. This is something that never worked nicely, and camera animation
and border render has the same issues, so it is not considered a stopper
for this change.
Fixes#145900
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/146031
Guide the probability to scatter in or transmit through the volume.
Only applied for primary rays.
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
The distance sampling is mostly based on weighted delta tracking from
[Monte Carlo Methods for Volumetric Light Transport Simulation]
(http://iliyan.com/publications/VolumeSTAR/VolumeSTAR_EG2018.pdf).
The recursive Monte Carlo estimation of the Radiative Transfer Equation is
\[\langle L \rangle=\frac{\bar T(x\rightarrow y)}{\bar p(x\rightarrow
y)}(L_e+\sigma_s L_s + \sigma_n L).\]
where \(\bar T(x\rightarrow y) = e^{-\bar\sigma\Vert x-y\Vert}\) is the
majorant transmittance between points \(x\) and \(y\), \(p(x\rightarrow
y) = \bar\sigma e^{-\bar\sigma\Vert x-y\Vert}\) is the probability of
sampling point \(y\) from point \(x\) following exponential
distribution.
At each recursive step, we randomly pick one of the two events
proportional to their weights:
* If \(\xi < \frac{\sigma_s}{\sigma_s+\vert\sigma_n\vert}\), we sample
scatter event and evaluate \(L_s\).
* Otherwise, no real collision happens and we continue the recursive
process.
The emission \(L_e\) is evaluated at each step.
This also removes some unused volume settings from the UI:
* "Max Steps" is removed, because the step size is automatically specified
by the volume octree. There is a hard-coded threshold `VOLUME_MAX_STEPS`
to prevent numerical issues.
* "Homogeneous" is automatically detected during density evaluation
An option "Unbiased" is added to the UI. When enabled, densities above
the majorant are clamped.
Ever since the OptiX 8 update in Blender 4.5, the minimum GPU driver
requirements to use OptiX has increased to 535 or newer.
This commit update the minimum GPU driver requirement listed in the UI
to reflect this.
Pull Request: https://projects.blender.org/blender/blender/pulls/143917
Add new "Linear 3D Curves" option in the Curves panel in the render
properties. This renders curves as linear segments rather than smooth
curves, for faster render time at the cost of accuracy.
On NVIDIA Blackwell GPUs, this can give a 6x speedup compared to smooth
curves, due to hardware acceleration. On NVIDIA Ada there is still
a 3x speedup, and CPU and other GPU backends will also render this
faster.
A difference with smooth curves is that these have end caps, as this
was simpler to implement and they are usually helpful anyway.
In the future this functionality will also be used to properly support
the CURVE_TYPE_POLY on the new curves object.
Pull Request: https://projects.blender.org/blender/blender/pulls/139735
GPU devices can only be selected in the user preferences if a suitable
device is available. This uses a dynamic enum and the items are not
always defined in RNA, so they need to be extracted manually using
`n_()`.
Also rephrase one message slightly to respect the style guide
("Don't" -> "Do not").
In addition, fix my mistake where an import was mixed up
(`pgettext_tip` was imported as `n_`).
Pull Request: https://projects.blender.org/blender/blender/pulls/141244
At the moment there are two main usability issues that make it hard to
recommend to enable HIP RT by default:
- Dramatically increased memory usage during BVH construction on
high poly meshes compared to BVH2 (#136174)
- This issue can be fixed by using the "balanced" HIP RT BVH, but
it requires a HIP RT update that won't make it into 4.5 (!136622)
- Many Blender and GPU driver crashes when modifying objects in the
viewport. #140763, #140738, #139013, #138043
Pull Request: https://projects.blender.org/blender/blender/pulls/140794
In blender-v4.5 some problematic commits were reverted, but for 5.0 we will
keep the changes and wait for the HIP SDK to be upgraded and hopefully fix
these issues.
Ref blender/blender#139836
With these changes, we can now mark devices which are expected to work as
performant as possible, and devices which were not optimized for some reason.
For example, because the device was released after the Blender release,
making it impossible for developers to optimize for devices in already
released unchangeable code. This is primarily relevant for the LTS versions,
which are supported for two years and require proper communication about
optimization status for the new devices released during this time.
This is implemented for oneAPI devices. Other device types currently are
marked as optimized for compatibility with old behavior, but may implement
the same in the future.
Pull Request: https://projects.blender.org/blender/blender/pulls/139751
- "Parameters for custom (OSL-based) Cameras" -> "cameras": lower case
in tooltips.
- "Connect two nodes ... (automatically determined": missing
parenthesis.
- "Join curve... control points are detected(if disabled...": add
missing space.
- "Add Selected to Active Objects Collection" -> "Active Object's":
typo.
- "Duplicate the acive shape key" -> "active": typo.
- "Copy selected points ": remove trailing space.
- "Move cursor" -> "Cursor": title case for operator.
- "Paste text to clipboard" -> "from clipboard": typo.
- "An empty Action considered as both a 'layered' and a 'layered'
Action." -> "is considered as both a 'legacy' and a 'layered'
Action": likely copy-paste error.
- "Target's Z axis will constraint..." -> "will constrain": typo.
- "The layer groups is expanded in the UI" -> "layer group": typo.
- Deprecation warnings: add missing parentheses.
- "... on low poly geometry.Offset rays...": add missing space after
period.
- "... relative to the files directory" -> "... to the file's
directory": typo.
- "The unit multiplier for pixels per meter" -> "The base unit": this
property description was copy and pasted.
- "... beyond the faces UVs..." -> "the faces' UVs: typo.
- "Is tracking data contains ..." -> "Whether the tracking data
contains": grammar.
- "Selected text" -> "Text": title case for prop.
- "The user has been shown the "Online Access" prompt and make a
choice" -> "made a choice": grammar.
- "Glare ": remove trailing space.
- "Don't collapse a curves" -> "Do not collapse curves": grammar.
Some issues reported by Tamar Mebonia.
Pull Request: https://projects.blender.org/blender/blender/pulls/139118
The previous value was incorrectly assuming this is 2^exposure, but it
is a simple multiplier.
Specifying large film exposure values is useful as it affects the raw data and
thus results in the same color values available in EXR, Viewer, and
Compositor, whereas using Color Management exposure is not reflected in EXR
exports, resulting in unusably low values. Using film exposure remedies that.
Pull Request: https://projects.blender.org/blender/blender/pulls/138850
(when done from python)
When changing those setting via the **UI**, we somehow get the
dependency graph tagged for changes.
If this happens, the update ripples through the following (and we are
all good):
- `DEG_editors_update` (here the depsgraph update is detected via
`DEG_id_type_any_updated`)
- `deg_editors_scene_update`
- `ED_render_scene_update`
- `ED_render_view3d_update`
- `engine_view_update` which finally calls the renderengine `sync_func`
When doing this from python though, the DEG tag is missing, now added
(similar to 4f15c24705)
We could as well just do a broader update though and use
`update_render_engine`
Since it seems `CyclesVisibilitySettings` are only used for the World
nowadays, this PR also corrects the property descriptions accordingly
(might make sense to split this into a separate commit).
Pull Request: https://projects.blender.org/blender/blender/pulls/138093
This allows users to implement arbitrary camera models using OSL by writing
shaders that take an image position as input and compute ray origin and
direction.
The obvious applications for this are e.g. panorama modes, lens distortion
models and realistic lens simulation, but the possibilities are endless.
Currently, this is only supported on devices with OSL support, so CPU and
OptiX. However, it is independent from the shading model used, so custom
cameras can be used without getting the performance hit of OSL shading.
A few samples are provided as Text Editor templates.
One notable current limitation (in addition to the limited device support)
is that inverse mapping is not supported, so Window texture coordinates and
the Vector pass will not work with custom cameras.
Pull Request: https://projects.blender.org/blender/blender/pulls/129495
These have bugs in with the latest HIP-RT and HIP SDK, so just disable them
as we do not expect a fix in time, and rolling back would re-introduce other
bugs. As RDNA1 does not have hardware raytracing, it is also less important
to use HIP-RT.
Note that only RDNA2+ is officially supported by HIP, so these GPUs working
at all is somewhat lucky.
Fix#134979Fix#134978Fix#134975
Pull Request: https://projects.blender.org/blender/blender/pulls/135179
After the recent HIP SDK 6.3 update on Windows, the minimum GPU driver
required to use HIP in Cycles has increased.
This commit increases the required driver version listed in the UI and
adds a check to avoid showing HIP devices if they're below a certain
driver version number as they don't work properly.
Pull Request: https://projects.blender.org/blender/blender/pulls/134965
The current usage of software-based texture operations in
the oneAPI implementation puts additional register pressure on
the GPU compiler during register allocation. And it also creates
code that requires maintenance. This commit is intended to address
this situation by utilizing a recently productized SYCL bindless
texture API to enable HW-based texture operations using
Intel GPUs' hardware sampler.
This currently translates to 1-11% rendering speedups (scene-specific)
on my Arc A770 and Arc B580. At the moment, there are small
performance regressions with NanoVDB texture operations on Arc B580
and small performance regressions in shade surface MNEE and Raytrace
kernels on Arc A770, but they look recoverable and will be handled
in the future.
Pull Request: https://projects.blender.org/blender/blender/pulls/133457
Cycles has a sample offset feature allowing users to render X samples
in a single frame on one device, then the remaining Y samples later or
on a different device and combine them back together at the end.
However in most situations the result from using this method was
different, and usually lower quality than rendering all the samples in
one go.
This was because Cycles tunes it's random number sequence for the
number of samples being rendered. And the random number sequence was
being tuned for the wrong number of samples in the case that a user
was using the sample offset.
This commit fixes this issue by adding a "sample subset" feature.
The user specifies the total sample count being rendered across all
devices in the existing `Max Samples` parameter, then specifies per
device which subset of samples will be rendered (E.g. Render samples
0-1024 out of a 0-2048 range).
This commit also contains some additional clean up work
inside Cycles related to the area being changed.
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/132961
Cycles supports a feature known as "Adaptive Compile" which will
compile the GPU kernel at runtime with only the features neccesary
for the current scene.
This is primarily used for debugging purposes and is not advised for
general use, because it's not well tested/maintained and leads to
frequent kernel recompilation which can take a long time and interupt
your workflow.
This commits exposes the option to turn this feature on
for the HIP and Metal backends in the Cycles debug UI panel.
Pull Request: https://projects.blender.org/blender/blender/pulls/132459
This commit introduces proper handling of ROCm 5 and ROCm 6 runtimes on
Linux, based on the version of the ROCm compiler used at build time.
Previously, HIPEW (the HIP equivalent of Cuda Wrangler) defaulted to
loading the ROCm 5 runtime. If ROCm 5 was unavailable, it would attempt
to load ROCm 6. However, ROCm 6 introduces changes in certain
structures and functions that are not backward compatible, leading to
potential issues when kernels compiled with the ROCm 6 compiler are
executed on the ROCm 5 runtime.
### Summary of Changes:
**Separation of Structures and Functions:**
Structures and functions are now separated into hipew5 and hipew6 to
accommodate the differences between ROCm versions.
**Build-Time Version Detection:**
The ROCm version is determined during build time, and the corresponding
hipew5 or hipew6 is included accordingly.
**Runtime Default to ROCm 6:**
By default, HIPEW now loads the ROCm 6 runtime and
includes hipew6 (Linux only).
**JIT Compilation Behavior:**
Since ROCm 6 is the default version, JIT compilation is supported only
when the ROCm 6 compiler is detected at runtime.
**HIP-RT Update:**
HIP-RT has been updated to load the ROCm 6 runtime by default.
These changes ensure compatibility and stability when switching
between ROCm versions, avoiding issues caused by runtime
and compiler mismatches.
Co-authored-by: Alaska <alaskayou01@gmail.com>
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/130153
This commit removes support for Vega GPUs from the AMD HIP backend of
Cycles. This is being done as:
- AMD no longer provides official support for Vega GPUs in their
ROCm software.
- Vega GPUs have rendering artifacts on all supported platforms,
and as a result of the reduction of support from AMD, are unlikely
to be fixed. Rendering artifacts include.
- The incorrect shading of volumes (Windows and Linux)
- Missing intersections on many meshes with HIPRT
- Crashing rendering subsurface scattering materials (Linux)
- And more.
Pull Request: https://projects.blender.org/blender/blender/pulls/129523
This change switches Cycles to an opensource HIP-RT library which
implements hardware ray-tracing. This library is now used on
both Windows and Linux. While there should be no noticeable changes
on Windows, on Linux this adds support for hardware ray-tracing on
AMD GPUs.
The majority of the change is typical platform code to add new
library to the dependency builder, and a change in the way how
ahead-of-time (AoT) kernels are compiled. There are changes in
Cycles itself, but they are rather straightforward: some APIs
changed in the opensource version of the library.
There are a couple of extra files which are needed for this to
work: hiprt02003_6.1_amd.hipfb and oro_compiled_kernels.hipfb.
There are some assumptions in the HIP-RT library about how they
are available. Currently they follow the same rule as AoT
kernels for oneAPI:
- On Windows they are next to blender.exe
- On Linux they are in the lib/ folder
Performance comparison on Ubuntu 22.04.5:
```
GPU: AMD Radeon PRO W7800
Driver: amdgpu-install_6.1.60103-1_all.deb
main hip-rt
attic 0.1414s 0.0932s
barbershop_interior 0.1563s 0.1258s
bistro 0.2134s 0.1597s
bmw27 0.0119s 0.0099s
classroom 0.1006s 0.0803s
fishy_cat 0.0248s 0.0178s
junkshop 0.0916s 0.0713s
koro 0.0589s 0.0720s
monster 0.0435s 0.0385s
pabellon 0.0543s 0.0391s
sponza 0.0223s 0.0180s
spring 0.1026s 1.5145s
victor 0.1901s 0.1239s
wdas_cloud 0.1153s 0.1125s
```
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Co-authored-by: Ray Molenkamp <github@lazydodo.com>
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/121050
IGC 1.0.17384, ocloc 24.31.30508, which:
- add support for Battlemage and Lunar Lake GPUs
- recover from recent performance regression on Linux
- allow to drop older work-around
(9d5164d472) and need for a patched
version on Windows
- ocloc now needs "dg2,mtl" naming for fat binaries.
opencl-clang patches don't get applied anymore by igc build scripts
when llvm is not a git repository, hence I could also drop we can drop
current patch disabling patching.
I've only slightly pushed min-driver-version updates after carefull
testing, instead of jumping to the same version as ocloc as we use to.
Pull Request: https://projects.blender.org/blender/blender/pulls/127251
This new version of the graphics compiler solves a performance
regression on Arc, adds support for Battlemage and Lunar Lake GPUs, and
allows to drop older patch to build fat binaries with broad
compatibility.
This latter change requires using -device dg2,mtl naming instead of
passing architecture ids.
Pull Request: https://projects.blender.org/blender/blender/pulls/127371
Refactor the call to `_cycles.available_devices` into it's own function
and update `self.device` at the same time to avoid mis-matches between
`_cycles.available_devices` and `self.device`.
Pull Request: https://projects.blender.org/blender/blender/pulls/124079
This resolves two issues:
1. On macOS the GPU Compute device would be disabled by default unless
the user opens user preferences. This is unexpected behaviour ever
since 09ba1486f8
2. Fixes incorrect automatic denoiser display settings and errors in
terminal related to the denoising UI on macOS if the user hasn't opened
user preferences.
Pull Request: https://projects.blender.org/blender/blender/pulls/123911