On an iMac (Retina 5K, 27-inch, Late 2015) it crashed when rendering using Cycles. This was due to the fact that it incorrectly detected that the machine supported ray tracing. This uses the device.supportsRaytracing flag to fill in the use_hardware_raytracing flag for the device.
This patch removes a workaround for an issue that is now understood to be undefined behaviour (and fixed by #108176). It also adds two useful debug flags that we would like to be available in Blender 3.6.
Pull Request: https://projects.blender.org/blender/blender/pulls/108322
This patch fixes an undefined behaviour where we were trying to use linked functions with binary archives. This isn't supported yet. At best this will fail silently, but this is not guaranteed in future. To fix this we simply disable binary archives if any linked functions are involved. The impact of this is that the `SHADE_SURFACE_RAYTRACE` and `SHADE_SURFACE_MNEE` kernels will fall back to the file system cache when MetalRT is enabled. The file system cache will occasionally be purged due to factors beyond Blender's control.
Pull Request: https://projects.blender.org/blender/blender/pulls/108176
Only Embree CPU BVH was built in the multi-device case. However, one
Embree GPU BVH is needed per GPU, so we now reuse the same logic as in
the other backends.
Pull Request: https://projects.blender.org/blender/blender/pulls/107992
In some cases comments at the end of control statements were wrapped
onto new lines which made it read as if they applied to the next line
instead of the (now) previous line.
Relocate comments to the previous line or in some cases the end of the
line (before the brace) to avoid confusion.
Note that in quite a few cases these blocks didn't read well
even before MultiLine was used as comments after the brace caused
wrapping across multiple lines in a way that didn't follow
formatting used everywhere else.
* User simpler API names that accept both PTX and OptiX-IR
* New argument for optixProgramGroupGetStackSize, leave to default
* Remove OptixPipelineLinkOptions::debugLevel that does nothing
Pull Request: https://projects.blender.org/blender/blender/pulls/107450
HIP RT enables AMD hardware ray tracing on RDNA2 and above, and falls back to a
to shader implementation for older graphics cards. It offers an average 25%
sample rendering rate improvement in Cycles benchmarks, on a W6800 card.
The ray tracing feature functions are accessed through HIP RT SDK, available on
GPUOpen. HIP RT traversal functionality is pre-compiled in bitcode format and
shipped with the SDK.
This is not yet enabled as there are issues to be resolved, but landing the
code now makes testing and further changes easier.
Known limitations:
* Not working yet with current public AMD drivers.
* Visual artifact in motion blur.
* One of the buffers allocated for traversal has a static size. Allocating it
dynamically would reduce memory usage.
* This is for Windows only currently, no Linux support.
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Ref #105538
Using the new HIP SDK 5.5 that includes a fix for the compiler bug.
This also enables the light tree.
For Linux the binaries are still disabled. ROCm 5.5 is planned to
include the same fix but not released yet. When that happens we
should be able to enable Linux as well.
Ref #104786Fix#104085
Pull Request: https://projects.blender.org/blender/blender/pulls/107098
Updated Embree 4 library with GPU support is required for it to be
compiled - compatiblity with Embree 3 and Embree 4 without GPU support
is maintained.
Enabling hardware raytracing is an opt-in user setting for now.
Pull Request: https://projects.blender.org/blender/blender/pulls/106266
This patch replaces `dispatchThreadgroups` with `dispatchThreads` which takes care of non-uniform threadgroup bounds. This allows us to remove the bounds guards in the integrator kernel entry points.
Pull Request: https://projects.blender.org/blender/blender/pulls/106217
For example
```
OIIOOutputDriver::~OIIOOutputDriver()
{
}
```
becomes
```
OIIOOutputDriver::~OIIOOutputDriver() {}
```
Saves quite some vertical space, which is especially handy for
constructors.
Pull Request: https://projects.blender.org/blender/blender/pulls/105594
Workaround for a crash when `addComputePipelineFunctionsWithDescriptor` is called *after* `newComputePipelineStateWithDescriptor` with linked functions (i.e. with MetalRT enabled). Ideally we would like to call `newComputePipelineStateWithDescriptor` (async) first so we can bail out if needed, but we can stop the crash by flipping the order when there are linked functions. However when addComputePipelineFunctionsWithDescriptor is called first it will block while it builds the pipeline, offering no way of bailing out.
Note that this only has an impact when the "MetalRT (Experimental)" option is checked.
Pull Request: https://projects.blender.org/blender/blender/pulls/105629
The traversable handle of a BLAS may be zero when the relevant geometry
is empty (no triangles/curves/points/...), as no BLAS is built in such cases.
It is not correct to attach a zero handle to a TLAS, so filter out such instances.
When running unit tests or other fast completing renders, forced crashes can occur if there are any slow, outstanding PSO compilation requests (due to the `std::terminate` fall-back case in `~ShaderCache`).
This patch eliminates the need for this shutdown hack by using of the async version of `newComputePipelineStateWithDescriptor` when creating a PSO for the first time. In doing so, we are able to explicitly respond to app shutdown instead of waiting for the pipeline to finish compiling (..and then timing out and force-crashing). We still use the blocking version of `newComputePipelineStateWithDescriptor` when loading from an archive, as this can handle loading from a corrupted archive gracefully. Finally, we move `addComputePipelineFunctionsWithDescriptor` to *after* the PSO is built (as this will trigger a full blocking compile if the PSO has not yet been built, which would bring back the original issue).
Pull Request: https://projects.blender.org/blender/blender/pulls/105506
This patch fixes hanging unit tests when MetalRT is enabled. It simplifies and fixes the kernel selection logic by baking the MetalRT-specific options into `kernels_md5` rather than expanding out and testing MetalRT bit flags explicitly.
Pull Request #105270
To improve mesh upload speeds and reduce the size of the scene data which allows larger scenes to be rendered.
The meshes in Cycles are currently stored as flattened meshes, where each triangle is stored as a set of 3 vertices. Unflattening writes out the vertices in a list according to the index buffer. This uses a lot of memory and for current hardware does not provide a noticeable benefit. This change unflattens the mesh by directly using the meshes vertex and index buffers directly and skips the unflattening. This change allows for larger scenes and also a reduction in the sizes of the meshes. Further it results in a decrease the amount of time it takes to upload the data to a GPU. This is especially important for when multiple GPUs are used in a single machine.
Pull Request #105173