This pull request removes ROCm 5 code path and adds ROCm 7 runtime to
library search list.
ROCm 5 runtime is no longer shipped with AMD drivers, and ROCm 5 compiler
is no longer compatible with newer driver versions.
It also adds ROCm 7 runtime to the list of runtime libraries to look for.
Starting later this year, ROCm 7 runtime will be bundled with the driver
installer, and all future runtime fixes and improvements will target ROCm 7.
Once ROCm 7 runtime is rolled out, ROCm 6 compiler will continue to work
with it for about a year as a transitional measure. Beyond that, compatibility
is not guaranteed.
Pull Request: https://projects.blender.org/blender/blender/pulls/145279
All GPU backends now support NanoVDB, using our own kernel side code
that is easily portable. This simplifies kernel and device code.
Volume bounds are now built from the NanoVDB grid instead of OpenVDB,
to avoid having to keep around the OpenVDB grid after loading.
While this reduces memory usage, it does have a performance impact,
particularly for the Cubic filter. That will be addressed by
another commit.
Pull Request: https://projects.blender.org/blender/blender/pulls/132908
* Add GraphicsInteropDevice to check if interop is possible with device
* Rename GraphcisInterop to GraphicsInteropBuffer
* Include display device type and memory size in GraphicsInteropBuffer
* Unnest graphics interop class to make forward declarations possible
Pull Request: https://projects.blender.org/blender/blender/pulls/137363
There is a known precision bug in the current HIP compiler version (RDNA2 family/Windows) that has already been fixed and will be available in
a future HIP SDK release. Enabling more precise math prevents the artifacts.
This may cause a 5-10% performance drop in some scenes.
Fix#136138: Microfacet BSDF
Fix#136449: Hair BSDF
Pull Request: https://projects.blender.org/blender/blender/pulls/136341
This may be used for device to do host memory allocation in a way that
is more efficient for copy the host memory to the device.
Also rename and group device memory allocation functions for clarity.
Pull Request: https://projects.blender.org/blender/blender/pulls/134412
If one of the devices already used host happed memory but another not,
it would previously realloc both.
Thanks to Jorn Visser for investigating and finding this problem.
Pull Request: https://projects.blender.org/blender/blender/pulls/132912
Should be a bit more efficient, and it fixes host memory fallback bugs,
where host memory was incorrectly freed during re-copy. For the case
where memory should get reallocated on the host, a new mem_move_to_host
was added.
Thanks to Jorn Visser for investigating and finding this problem.
Pull Request: https://projects.blender.org/blender/blender/pulls/132912
* Use .empty() and .data()
* Use nullptr instead of 0
* No else after return
* Simple class member initialization
* Add override for virtual methods
* Include C++ instead of C headers
* Remove some unused includes
* Use default constructors
* Always use braces
* Consistent names in definition and declaration
* Change typedef to using
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
This commit introduces proper handling of ROCm 5 and ROCm 6 runtimes on
Linux, based on the version of the ROCm compiler used at build time.
Previously, HIPEW (the HIP equivalent of Cuda Wrangler) defaulted to
loading the ROCm 5 runtime. If ROCm 5 was unavailable, it would attempt
to load ROCm 6. However, ROCm 6 introduces changes in certain
structures and functions that are not backward compatible, leading to
potential issues when kernels compiled with the ROCm 6 compiler are
executed on the ROCm 5 runtime.
### Summary of Changes:
**Separation of Structures and Functions:**
Structures and functions are now separated into hipew5 and hipew6 to
accommodate the differences between ROCm versions.
**Build-Time Version Detection:**
The ROCm version is determined during build time, and the corresponding
hipew5 or hipew6 is included accordingly.
**Runtime Default to ROCm 6:**
By default, HIPEW now loads the ROCm 6 runtime and
includes hipew6 (Linux only).
**JIT Compilation Behavior:**
Since ROCm 6 is the default version, JIT compilation is supported only
when the ROCm 6 compiler is detected at runtime.
**HIP-RT Update:**
HIP-RT has been updated to load the ROCm 6 runtime by default.
These changes ensure compatibility and stability when switching
between ROCm versions, avoiding issues caused by runtime
and compiler mismatches.
Co-authored-by: Alaska <alaskayou01@gmail.com>
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/130153
On Linux, Cycles HIP has a JIT compilation feature.
This feature is used when Cycles can not find a precompiled kernel
for your GPU. Which is most common when using hardware that wasn't
out at the time that a version of Blender was released.
There were various issues with this JIT compilation system, this commit
aims to solve them. The changes include:
- Enable `WITH_NANOVDB` when Blender is built with NanoVDB.
- This fixes a issue where VDB objects would not render.
- Enable some extra debug options for developers when desired
(This is so we match the CUDA implementation of the same feature).
- Reduce the optimizaiton level from -O3 to the default.
- This is to avoid any extra issues that may occur as a result
of an increase optimization level that isn't tested with
precompiled kernels.
- Reduce the optimization level even further to -O1 for Vega.
- This was done on precompiled kernels to work around some issues,
so I decided to apply it to JIT kernels as well.
- Note: Although Vega is not officially supported, this may help
people that unofficially use Vega.
- Added some previously missing compiler arguments and fixed errors that
were introduced when enabling these compiler arguments.
- Fixed a issue where JIT compilation would fail if Blener was
installed in a path that had a space in it.
Pull Request: https://projects.blender.org/blender/blender/pulls/131853
Precompiled Cycles kernels make up a considerable fraction of the total size of
Blender builds nowadays. As we add more features and support for more
architectures, this will only continue to increase.
However, since these kernels tend to be quite compressible, we can save a lot
of storage by storing them in compressed form and decompressing the required
kernel(s) during loading.
By using Zstandard compression with a high level, we can get decent compression
ratios (~5x for the current kernels) while keeping decompression time low
(about 30ms in the worse case in my tests). And since we already require zstd
for Blender, this doesn't introduce a new dependency.
While the main improvement is to the size of the extracted Blender installation
(which is reduced by ~400-500MB currently), this also shrinks the download on
Windows, since .zip's deflate compression is less effective. It doesn't help on
Linux since we're already using .tar.xz there, but the smaller installed size
is still a good thing.
See #123522 for initial discussion.
Pull Request: https://projects.blender.org/blender/blender/pulls/123557
Since #118841 there are more cases where Cycles would check for the
graphics interop support. This could lead to a crash when graphics
interop functions are called without having active graphics context.
This change makes it so there is no graphics interop calls when doing
headless render. In order to achieve this the device creation is now
aware of the headless mode.
Pull Request: https://projects.blender.org/blender/blender/pulls/122844
Previously the CUDA context was always destroyed and the module along
with it. Now that this no longer happens, the missing module free became
a memory leak.
Also fix the same issue for HIP, though this is destroying the context
so it's not a problem yet.
Fix part of #119035
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
HIP fails to allocate textures, typically when they are too large.
This commit lets the user know what might be causing the issue
rather than providing a confusing internal error message.
Pull Request: https://projects.blender.org/blender/blender/pulls/118239
ROCm 6 brings some changes to the HIP API. This pull request is meant to be
backward and forward compatible.
That is Blender could be compiled with either ROCM 6 or 5 and run on either.
The main change is the hipMemoryType enum, which we check based on the
runtime version to use the correct enum values.
Without this, HIP will not work on Windows with upcoming 23.40 driver.
Pull Request: https://projects.blender.org/blender/blender/pulls/116713
NDEBUG is part of the C standard and disables asserts. Only this will
now be used to decide if asserts are enabled.
DEBUG was a Blender specific define, that has now been removed.
_DEBUG is a Visual Studio define for builds in Debug configuration.
Blender defines this for all platforms. This is still used in a few
places in the draw code, and in external libraries Bullet and Mantaflow.
Pull Request: https://projects.blender.org/blender/blender/pulls/115774
Host memory fallback in CUDA and HIP devices is almost identical.
We remove duplicated code and create a shared generic version that
other devices (oneAPI) will be able to use.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D17173
This adds a new mirror image extension type for shaders and
geometry nodes (next to the existing repeat, extend and clip
options).
See D16432 for a more detailed explanation of `wrap_mirror`.
This also adds a new sampler flag `GPU_SAMPLER_MIRROR_REPEAT`.
It acts as a modifier to `GPU_SAMPLER_REPEAT`, so any `REPEAT`
flag must be set for the `MIRROR` flag to have an effect.
Differential Revision: https://developer.blender.org/D16432
With libepoxy we can choose between EGL and GLX at runtime, as well as
dynamically open EGL and GLX libraries without linking to them.
This will make it possible to build with Wayland, EGL, GLVND support while
still running on systems that only have X11, GLX and libGL. It also paves
the way for headless rendering through EGL.
libepoxy is a new library dependency, and is included in the precompiled
libraries. GLEW is no longer a dependency, and WITH_SYSTEM_GLEW was removed.
Includes contributions by Brecht Van Lommel, Ray Molenkamp, Campbell Barton
and Sergey Sharybin.
Ref T76428
Differential Revision: https://developer.blender.org/D15291
* Rename "texture" to "data array". This has not used textures for a long time,
there are just global memory arrays now. (On old CUDA GPUs there was a cache
for textures but not global memory, so we used to put all data in textures.)
* For CUDA and HIP, put globals in KernelParams struct like other devices.
* Drop __ prefix for data array names, no possibility for naming conflict now that
these are in a struct.
Move MNEE to own kernel, separate from shader ray-tracing. This does introduce
the limitation that a shader can't use both MNEE and AO/bevel, but that seems
like the better trade-off for now.
We can experiment with bigger kernel organization changes later.
Differential Revision: https://developer.blender.org/D15070
This patch makes it possible to change the precision with which to
store volume data in the NanoVDB data structure (as float, half, or
using variable bit quantization) via the previously unused precision
field in the volume data block.
It makes it possible to further reduce memory usage during
rendering, at a slight cost to the visual detail of a volume.
Differential Revision: https://developer.blender.org/D10023
* Replace license text in headers with SPDX identifiers.
* Remove specific license info from outdated readme.txt, instead leave details
to the source files.
* Add list of SPDX license identifiers used, and corresponding license texts.
* Update copyright dates while we're at it.
Ref D14069, T95597
This patch adds new arg-type parameters to `DeviceQueue::enqueue` and its overrides. This is in preparation for the Metal backend which needs this information for correct argument encoding.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D13357
Use the correct device function (hipDeviceGet) for multi GPU setups, instead
of hipGetDevice which just returns the default device.
Differential Revision: https://developer.blender.org/D13323