This is not currently working, with an internal compiler error. However
we are currently using BVH2 instead of Metal RT. So this has no effect for
users, it's being committed to avoid the code getting outdated.
Ref T92573, T92212
Differential Revision: https://developer.blender.org/D13632
This patch fixes a correctness issue discovered in the `int4 select(...)` function on Apple Silicon machines, which causes bad bvh2 builds. Although the generated bvh2s give correct renders, the resulting runtime performance is terrible. This fix allows us to switch over to bvh2 on Apple Silicon giving a significant performance uplift for many of the standard benchmarking assets. It also fixes some unit test failures stemming from the use of MetalRT, and trivially enables the new pointcloud primitive.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D13877
This patch fixes crash T94736 on Metal in which the launch_params were not being updated to reflect destruction of MetalMem objects.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D13875
The root of the issue is caused by Cycles ignoring OpenGL limitation on
the maximum resolution of textures: Cycles was allocating texture of the
final render resolution. It was exceeding limitation on certain GPUs and
driver.
The idea is simple: use multiple textures for the display, each of which
will fit into OpenGL limitations.
There is some code which allows the display driver to know when to start
the new tile. Also added some code to allow force graphics interop to be
re-created. The latter one ended up not used in the final version of the
patch, but it might be helpful for other drivers implementation.
The tile size is limited to 8K now as it is the safest size for textures
on many GPUs and OpenGL drivers.
This is an updated fix with a workaround for freezing with the NVIDIA
driver on Linux.
Differential Revision: https://developer.blender.org/D13385
Query TBB for the maximum allowed concurrency, which is free from a bug
in own concurrency detection code. One thing to keep in mind is that now
Cycles is limited by the number of threads in the TBB areana from which
Session is created. This isn't a problem for Blender since we do not limit
arena on Blender side. Could be something to watch out for in other Cycles
integrations.
Differential Revision: https://developer.blender.org/D13658
Enables the `bpy.ops.cycles.denoise_animation()` operator again and modifies it to support
temporal denoising with OptiX. This requires renders that were done with both the "Vector"
and "Denoising Data" passes.
Differential Revision: https://developer.blender.org/D11442
This add support for rendering of the point cloud object in Blender, as a native
geometry type in Cycles that is more memory and time efficient than instancing
sphere meshes. This can be useful for rendering sand, water splashes, particles,
motion graphics, etc.
Points are currently always rendered as spheres, with backface culling. More
shapes are likely to be added later, but this is the most important one and can
be customized with shaders.
For CPU rendering the Embree primitive is used, for GPU there is our own
intersection code. Motion blur is suppored. Volumes inside points are not
currently supported.
Implemented with help from:
* Kévin Dietrich: Alembic procedural integration
* Patrick Mourse: OptiX integration
* Josh Whelchel: update for cycles-x changes
Ref T92573
Differential Revision: https://developer.blender.org/D9887
This fixes crash T94022 when selecting live viewport render with both GPU & CPU devices selected. It is caused by incorrect `KernelBVHLayout` assignment. Similar to `BVH_LAYOUT_MULTI_OPTIX` for Optix, this patch adds a `BVH_LAYOUT_MULTI_METAL` to correctly redirect to the correct Metal BVH layout type.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D13561
This patch suppresses OS version warnings and hides currently unsupported Metal GPUs when enumerating devices.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D13506
This reverts commit 5e37f70307.
It is leading to freezing of the entire desktop for a few seconds when stopping
3D viewport rendering on my Linux / NVIDIA system.
The root of the issue is caused by Cycles ignoring OpenGL limitation on
the maximum resolution of textures: Cycles was allocating texture of the
final render resolution. It was exceeding limitation on certain GPUs and
driver.
The idea is simple: use multiple textures for the display, each of which
will fit into OpenGL limitations.
There is some code which allows the display driver to know when to start
the new tile. Also added some code to allow force graphics interop to be
re-created. The latter one ended up not used in the final version of the
patch, but it might be helpful for other drivers implementation.
The tile size is limited to 8K now as it is the safest size for textures
on many GPUs and OpenGL drivers.
Differential Revision: https://developer.blender.org/D13385
Somehow only a part of rBf4f8b6dde32b0438e0b97a6d8ebeb89802987127 ended up in
Cycles X, causing the issue that commit fixed, "OPTIX_ERROR_INVALID_VALUE" when the
system is out of memory, to show up again.
This adds the missing changes to fix that problem.
Maniphest Tasks: T93620
Differential Revision: https://developer.blender.org/D13488
This patch adds the Metal host-side code:
- Add all core host-side Metal backend files (device_impl, queue, etc)
- Add MetalRT BVH setup files
- Integrate with Cycles device enumeration code
- Revive `path_source_replace_includes` in util/path (required for MSL compilation)
This patch also includes a couple of small kernel-side fixes:
- Add an implementation of `lgammaf` for Metal [Nemes, Gergő (2010), "New asymptotic expansion for the Gamma function", Archiv der Mathematik](https://users.renyi.hu/~gergonemes/)
- include "work_stealing.h" inside the Metal context class because it accesses state now
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D13423
The OptiX denoiser does have an upper limit as to how many pixels it can denoise at once, so
this changes the OptiX denoising process to use tiles for high resolution images.
The OptiX SDK does have an utility function for this purpose, so changes are minor, adjusting
the configured tile size and including enough overlap.
Maniphest Tasks: T92308
Differential Revision: https://developer.blender.org/D13436
This patch adds MetalRT support to Cycles kernel code. It is mostly additive in nature or confined to Metal-specific code, however there are a few areas where this interacts with other code:
- MetalRT closely follows the Optix implementation, and in some cases (notably handling of transforms) it makes sense to extend Optix special-casing to MetalRT. For these generalisations we now have `__KERNEL_GPU_RAYTRACING__` instead of `__KERNEL_OPTIX__`.
- MetalRT doesn't support primitive offsetting (as with `primitiveIndexOffset` in Optix), so we define and populate a new kernel texture, `__object_prim_offset`, containing per-object primitive / curve-segment offsets. This is referenced and applied in MetalRT intersection handlers.
- Two new BVH layout enum values have been added: `BVH_LAYOUT_METAL` and `BVH_LAYOUT_MULTI_METAL_EMBREE` for XPU mode). Some host-side enum case handling has been updated where it is trivial to do so.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D13353
This patch adds new arg-type parameters to `DeviceQueue::enqueue` and its overrides. This is in preparation for the Metal backend which needs this information for correct argument encoding.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D13357
Some enum names were changed/removed in OptiX 7.4, so some changes are necessary to
make things compile still.
In addition, OptiX 7.4 also adds built-in support for catmull-rom curves, so it is no longer
necessary to convert the catmull-rom data to cubic bsplines first, and has endcaps disabled
by default now, so can remove the special handling via any-hit programs that filtered them
out before.
Differential Revision: https://developer.blender.org/D13351
21.Q4 is required, older version should not show devices in the preferences.
This adds a check for the file version of amdhip64.dll file during hipew
initialization.
Differential Revision: https://developer.blender.org/D13324
Use the correct device function (hipDeviceGet) for multi GPU setups, instead
of hipGetDevice which just returns the default device.
Differential Revision: https://developer.blender.org/D13323
Happens when device runs out of memory and Cycles is moving some
textures to the host memory.
The delayed memory free for OptiX BVH was moving data from one
device_memory to another, leaving the original device memory in
an invalid state. This was ruining the allocation map in the CUDA
device which is using pointer to the device_memory.
This change makes it so the memory pointer is stolen from BVH
into the delayed memory free list.
Additionally, forbid copying and moving instances of device_memory
and added sanity checks in the device implementation.
Differential Revision: https://developer.blender.org/D13316
This fixes the the app crash happening when trying to render smoke as a dense
3D texture. The changes are related to matching up hipew with the actual HIP
headers.
Differential Revision: https://developer.blender.org/D13296
This patch adds a CMake option "WITH_CYCLES_DEBUG" which builds cycles with
a feature that allows debugging/selecting the direct-light sampling strategy.
The same option may later be used to add other debugging features that could
affect performance in release builds.
The three options are:
* Forward path tracing (e.g., via BSDF or phase function)
* Next-event estimation
* Multiple importance sampling combination of the previous two methods
Such a feature is useful for debugging light different sampling, evaluation,
and pdf methods (e.g., for light sources and BSDFs).
Differential Revision: https://developer.blender.org/D13152
Introduce a packed_float3 type for smaller storage that is exactly 3
floats, instead of 4. For computation float3 is still used since it can
use SIMD instructions.
Ref T92212
Differential Revision: https://developer.blender.org/D13243
Partially reverts commit rB440a3475b8f5410e5c41bfbed5ce82771b41356f because
"optixDenoiserComputeIntensity" does not currently support input images that are not packed (the
"pixelStrideInBytes" field is not zero). As a result the intensity calculation would take into account
data from other passes in the image, some of which was scaled by the number of samples still and
therefore produce widely incorrect results that then caused artifacts in the denoised image.
Maniphest Tasks: T93029
Remove outdated CUDA comments for bindless textures and cleanup some HIP comments that still mentioned CUDA.
Differential Revision: https://developer.blender.org/D13189
The issue was caused by splitting happening twice.
Fixed by checking for split flag which is assigned to the both states
during split.
The tricky part was to write catcher data at the moment of split: the
transparency and shadow catcher sample count is to be accumulated at
that point. Now it is happening in the `intersect_closest` kernel.
The downside is that render buffer is to be passed to the kernel, but
the benefit is that extra split bounce check is not needed now.
Had to move the passes write to shadow catcher header, since include
of `film/passes.h` causes all the fun of requirement to have BSDF
data structures available.
Differential Revision: https://developer.blender.org/D13177
This is due to a driver bug, so disable it for now until it gets resolved
in a future driver release.
Ref T92972
Differential Revision: https://developer.blender.org/D13167