This change brings the following improvements on the user level
- Support of GPUs with gfx12 architecture
- New HIP-RT library which in addition to the gfx12 support brings
various bug-fixes.
The known limitation of gfx12 is that OpenImageDenoiser does not yet
support this GPU architecture. This means that while Cycles will use the
full advantage of the gfx12 (including hardware accelerated ray-tracing),
denoising will only be possible on CPU, or secondary gfx11 or below GPU.
This is something that requires a change in OIDN and it is to late to do
it for Blender 4.4, but it is something to look forward for Blender 4.5.
The gfx12 changes for the pre-compiled kernels is rather trivial,
so it comes together (in the same PR) as the bigger HIP-RT change.
On the development side this change brings the following improvements:
- One step compile and link (much simpler CMake rules)
- Embedding BVH binaries in hiprt dll (which makes it easier to package
and load, without relying on special path configuration)
Co-authored-by: Sahar Kashi <sahar.kashi@amd.com>
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/133129
In contrast to libamdhip64, for which blender searches in multiple directories,
libhiprt64 only had limited support for non-standard installation directories.
With this commit, the library lookup for HIPRT is handled in a similar way in
the original HIP/ROCm module. This also fixes issues for downstream Linux
distributions where HIPRT support was enabled during build time but was not
available at runtime.
Co-authored-by: Torsten Keßler <t.kessler@posteo.de>
Pull Request: https://projects.blender.org/blender/blender/pulls/131046
This commit introduces proper handling of ROCm 5 and ROCm 6 runtimes on
Linux, based on the version of the ROCm compiler used at build time.
Previously, HIPEW (the HIP equivalent of Cuda Wrangler) defaulted to
loading the ROCm 5 runtime. If ROCm 5 was unavailable, it would attempt
to load ROCm 6. However, ROCm 6 introduces changes in certain
structures and functions that are not backward compatible, leading to
potential issues when kernels compiled with the ROCm 6 compiler are
executed on the ROCm 5 runtime.
### Summary of Changes:
**Separation of Structures and Functions:**
Structures and functions are now separated into hipew5 and hipew6 to
accommodate the differences between ROCm versions.
**Build-Time Version Detection:**
The ROCm version is determined during build time, and the corresponding
hipew5 or hipew6 is included accordingly.
**Runtime Default to ROCm 6:**
By default, HIPEW now loads the ROCm 6 runtime and
includes hipew6 (Linux only).
**JIT Compilation Behavior:**
Since ROCm 6 is the default version, JIT compilation is supported only
when the ROCm 6 compiler is detected at runtime.
**HIP-RT Update:**
HIP-RT has been updated to load the ROCm 6 runtime by default.
These changes ensure compatibility and stability when switching
between ROCm versions, avoiding issues caused by runtime
and compiler mismatches.
Co-authored-by: Alaska <alaskayou01@gmail.com>
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/130153
This is being added straight to 4.2, prior to the `make license` command
which will use this to generate a more complete license file.
Licenses information and ambiguities worked with Dalai Felinto.
Part of !129018.
This change switches Cycles to an opensource HIP-RT library which
implements hardware ray-tracing. This library is now used on
both Windows and Linux. While there should be no noticeable changes
on Windows, on Linux this adds support for hardware ray-tracing on
AMD GPUs.
The majority of the change is typical platform code to add new
library to the dependency builder, and a change in the way how
ahead-of-time (AoT) kernels are compiled. There are changes in
Cycles itself, but they are rather straightforward: some APIs
changed in the opensource version of the library.
There are a couple of extra files which are needed for this to
work: hiprt02003_6.1_amd.hipfb and oro_compiled_kernels.hipfb.
There are some assumptions in the HIP-RT library about how they
are available. Currently they follow the same rule as AoT
kernels for oneAPI:
- On Windows they are next to blender.exe
- On Linux they are in the lib/ folder
Performance comparison on Ubuntu 22.04.5:
```
GPU: AMD Radeon PRO W7800
Driver: amdgpu-install_6.1.60103-1_all.deb
main hip-rt
attic 0.1414s 0.0932s
barbershop_interior 0.1563s 0.1258s
bistro 0.2134s 0.1597s
bmw27 0.0119s 0.0099s
classroom 0.1006s 0.0803s
fishy_cat 0.0248s 0.0178s
junkshop 0.0916s 0.0713s
koro 0.0589s 0.0720s
monster 0.0435s 0.0385s
pabellon 0.0543s 0.0391s
sponza 0.0223s 0.0180s
spring 0.1026s 1.5145s
victor 0.1901s 0.1239s
wdas_cloud 0.1153s 0.1125s
```
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Co-authored-by: Ray Molenkamp <github@lazydodo.com>
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/121050
* Only works on machines with a Qualcomm Snapdragon 8cx Gen3 or above.
Older generation devices are not and will not be supported due to
some driver issues
* Requires VS2022 for building.
* Uses new MSVC preprocessor for sse2neon compatibility.
* SIMD is not enabled, waiting on conversion of blenlib to C++.
Ref #119126
Pull Request: https://projects.blender.org/blender/blender/pulls/117036
ROCm 6 brings some changes to the HIP API. This pull request is meant to be
backward and forward compatible.
That is Blender could be compiled with either ROCM 6 or 5 and run on either.
The main change is the hipMemoryType enum, which we check based on the
runtime version to use the correct enum values.
Without this, HIP will not work on Windows with upcoming 23.40 driver.
Pull Request: https://projects.blender.org/blender/blender/pulls/116713
Try loading ROCm 5.x libraries specifically, as the .so without version
is only part of the development package.
Thanks to Lee Ringham investigating and proposing this solution.
* Add HIP-RT API functions and library loading
* Add more HIP API types and functions
* Find HIP linker executable in CMake module
* New CMake module to find HIP-RT SDK
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Ref #105538
The goal is to solve confusion of the "All rights reserved" for licensing
code under an open-source license.
The phrase "All rights reserved" comes from a historical convention that
required this phrase for the copyright protection to apply. This convention
is no longer relevant.
However, even though the phrase has no meaning in establishing the copyright
it has not lost meaning in terms of licensing.
This change makes it so code under the Blender Foundation copyright does
not use "all rights reserved". This is also how the GPL license itself
states how to apply it to the source code:
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software ...
This change does not change copyright notice in cases when the copyright
is dual (BF and an author), or just an author of the code. It also does
mot change copyright which is inherited from NaN Holding BV as it needs
some further investigation about what is the proper way to handle it.
Previously it would use a hardcoded location where the AMD driver installs it,
but Linux distributions may use other locations. Now look for both cases.
Use a shorter/simpler license convention, stops the header taking so
much space.
Follow the SPDX license specification: https://spdx.org/licenses
- C/C++/objc/objc++
- Python
- Shell Scripts
- CMake, GNUmakefile
While most of the source tree has been included
- `./extern/` was left out.
- `./intern/cycles` & `./intern/atomic` are also excluded because they
use different header conventions.
doc/license/SPDX-license-identifiers.txt has been added to list SPDX all
used identifiers.
See P2788 for the script that automated these edits.
Reviewed By: brecht, mont29, sergey
Ref D14069
This patch implements the vector types (i.e:`float2`) by making heavy
usage of templating. All vector functions are now outside of the vector
classes (inside the `blender::math` namespace) and are not vector size
dependent for the most part.
In the ongoing effort to make shaders less GL centric, we are aiming
to share more code between GLSL and C++ to avoid code duplication.
####Motivations:
- We are aiming to share UBO and SSBO structures between GLSL and C++.
This means we will use many of the existing vector types and others
we currently don't have (uintX, intX). All these variations were
asking for many more code duplication.
- Deduplicate existing code which is duplicated for each vector size.
- We also want to share small functions. Which means that vector
functions should be static and not in the class namespace.
- Reduce friction to use these types in new projects due to their
incompleteness.
- The current state of the `BLI_(float|double|mpq)(2|3|4).hh` is a
bit of a let down. Most clases are incomplete, out of sync with each
others with different codestyles, and some functions that should be
static are not (i.e: `float3::reflect()`).
####Upsides:
- Still support `.x, .y, .z, .w` for readability.
- Compact, readable and easilly extendable.
- All of the vector functions are available for all the vectors types
and can be restricted to certain types. Also template specialization
let us define exception for special class (like mpq).
- With optimization ON, the compiler unroll the loops and performance
is the same.
####Downsides:
- Might impact debugability. Though I would arge that the bugs are
rarelly caused by the vector class itself (since the operations are
quite trivial) but by the type conversions.
- Might impact compile time. I did not saw a significant impact since
the usage is not really widespread.
- Functions needs to be rewritten to support arbitrary vector length.
For instance, one can't call `len_squared_v3v3` in
`math::length_squared()` and call it a day.
- Type cast does not work with the template version of the `math::`
vector functions. Meaning you need to manually cast `float *` and
`(float *)[3]` to `float3` for the function calls.
i.e: `math::distance_squared(float3(nearest.co), positions[i]);`
- Some parts might loose in readability:
`float3::dot(v1.normalized(), v2.normalized())`
becoming
`math::dot(math::normalize(v1), math::normalize(v2))`
But I propose, when appropriate, to use
`using namespace blender::math;` on function local or file scope to
increase readability.
`dot(normalize(v1), normalize(v2))`
####Consideration:
- Include back `.length()` method. It is quite handy and is more C++
oriented.
- I considered the GLM library as a candidate for replacement. It felt
like too much for what we need and would be difficult to extend / modify
to our needs.
- I used Macros to reduce code in operators declaration and potential
copy paste bugs. This could reduce debugability and could be reverted.
- This touches `delaunay_2d.cc` and the intersection code. I would like
to know @howardt opinion on the matter.
- The `noexcept` on the copy constructor of `mpq(2|3)` is being removed.
But according to @JacquesLucke it is not a real problem for now.
I would like to give a huge thanks to @JacquesLucke who helped during this
and pushed me to reduce the duplication further.
Reviewed By: brecht, sergey, JacquesLucke
Differential Revision: https://developer.blender.org/D13791
This patch implements the vector types (i.e:float2) by making heavy
usage of templating. All vector functions are now outside of the vector
classes (inside the blender::math namespace) and are not vector size
dependent for the most part.
In the ongoing effort to make shaders less GL centric, we are aiming
to share more code between GLSL and C++ to avoid code duplication.
Motivations:
- We are aiming to share UBO and SSBO structures between GLSL and C++.
This means we will use many of the existing vector types and others we
currently don't have (uintX, intX). All these variations were asking
for many more code duplication.
- Deduplicate existing code which is duplicated for each vector size.
- We also want to share small functions. Which means that vector functions
should be static and not in the class namespace.
- Reduce friction to use these types in new projects due to their
incompleteness.
- The current state of the BLI_(float|double|mpq)(2|3|4).hh is a bit of a
let down. Most clases are incomplete, out of sync with each others with
different codestyles, and some functions that should be static are not
(i.e: float3::reflect()).
Upsides:
- Still support .x, .y, .z, .w for readability.
- Compact, readable and easilly extendable.
- All of the vector functions are available for all the vectors types and
can be restricted to certain types. Also template specialization let us
define exception for special class (like mpq).
- With optimization ON, the compiler unroll the loops and performance is
the same.
Downsides:
- Might impact debugability. Though I would arge that the bugs are rarelly
caused by the vector class itself (since the operations are quite trivial)
but by the type conversions.
- Might impact compile time. I did not saw a significant impact since the
usage is not really widespread.
- Functions needs to be rewritten to support arbitrary vector length. For
instance, one can't call len_squared_v3v3 in math::length_squared() and
call it a day.
- Type cast does not work with the template version of the math:: vector
functions. Meaning you need to manually cast float * and (float *)[3] to
float3 for the function calls.
i.e: math::distance_squared(float3(nearest.co), positions[i]);
- Some parts might loose in readability:
float3::dot(v1.normalized(), v2.normalized())
becoming
math::dot(math::normalize(v1), math::normalize(v2))
But I propose, when appropriate, to use
using namespace blender::math; on function local or file scope to
increase readability. dot(normalize(v1), normalize(v2))
Consideration:
- Include back .length() method. It is quite handy and is more C++
oriented.
- I considered the GLM library as a candidate for replacement.
It felt like too much for what we need and would be difficult to
extend / modify to our needs.
- I used Macros to reduce code in operators declaration and potential
copy paste bugs. This could reduce debugability and could be reverted.
- This touches delaunay_2d.cc and the intersection code. I would like to
know @Howard Trickey (howardt) opinion on the matter.
- The noexcept on the copy constructor of mpq(2|3) is being removed.
But according to @Jacques Lucke (JacquesLucke) it is not a real problem
for now.
I would like to give a huge thanks to @Jacques Lucke (JacquesLucke) who
helped during this and pushed me to reduce the duplication further.
Reviewed By: brecht, sergey, JacquesLucke
Differential Revision: http://developer.blender.org/D13791
21.Q4 is required, older version should not show devices in the preferences.
This adds a check for the file version of amdhip64.dll file during hipew
initialization.
Differential Revision: https://developer.blender.org/D13324
Use the correct device function (hipDeviceGet) for multi GPU setups, instead
of hipGetDevice which just returns the default device.
Differential Revision: https://developer.blender.org/D13323
This fixes the the app crash happening when trying to render smoke as a dense
3D texture. The changes are related to matching up hipew with the actual HIP
headers.
Differential Revision: https://developer.blender.org/D13296
* Additional structs added to the hipew loader for device props
* Adds hipRTC functions to the loader for future usage
* Enables CPU+GPU usage for HIP
* Cleanup to the adaptive kernel compilation process
* Fix for kernel compilation failures with HIP with latest master
Ref T92393, D12958
This patch cleans up code for HIP device and makes it more consistent with the CUDA code.
It also fixes the issue with high VRAM usage on AMD cards using HIP allowing better performance and usage on cards like 6600XT.
Added a check in intern/cycles/kernel/bvh/bvh_util.h to prevent compiler error with hipcc
Reviewed By: brecht, leesonw
Maniphest Tasks: T92124
Differential Revision: https://developer.blender.org/D12834
NOTE: this feature is not ready for user testing, and not yet enabled in daily
builds. It is being merged now for easier collaboration on development.
HIP is a heterogenous compute interface allowing C++ code to be executed on
GPUs similar to CUDA. It is intended to bring back AMD GPU rendering support
on Windows and Linux.
https://github.com/ROCm-Developer-Tools/HIP.
As of the time of writing, it should compile and run on Linux with existing
HIP compilers and driver runtimes. Publicly available compilers and drivers
for Windows will come later.
See task T91571 for more details on the current status and work remaining
to be done.
Credits:
Sayak Biswas (AMD)
Arya Rafii (AMD)
Brian Savery (AMD)
Differential Revision: https://developer.blender.org/D12578