Commit Graph

202 Commits

Author SHA1 Message Date
Brecht Van Lommel
ecd54ba4e4 Cycles: Metal graphics interop
This is trivial with unified memory, and avoids one memory copy.

Pull Request: https://projects.blender.org/blender/blender/pulls/137363
2025-04-28 11:38:56 +02:00
Brecht Van Lommel
4d7bd22beb Refactor: Cycles: Graphics interop changes
* Add GraphicsInteropDevice to check if interop is possible with device
* Rename GraphcisInterop to GraphicsInteropBuffer
* Include display device type and memory size in GraphicsInteropBuffer
* Unnest graphics interop class to make forward declarations possible

Pull Request: https://projects.blender.org/blender/blender/pulls/137363
2025-04-28 11:38:56 +02:00
Hans Goudey
d4b23d38c9 Cleanup: Formatting 2025-04-03 11:44:25 -04:00
Michael Jones
326d5bca03 Cycles: Support Decomposed MetalRT motion interpolation
Currently MetalRT interpolates transformation matrix on per-element basis
which leads to issues like #135659.

This change adds implementation of for decomposed (Scale/Rotate/Translate)
motion interpolation, matching behavior of BVH2 and other HW-RT.

This requires macOS 15 and Xcode 16 in order to use this interpolation.
On older platforms and compilers old interpolation is used.

Currently there is no changes on the user (by default) and it is only
available via CYCLES_METALRT_PCMI environment variable. This is because
there are some issues with complex motion paths that need to be looked
into. Having code available makes it easier to do further debugging.

Ref #135659

Authored by Emma Liu

Pull Request: https://projects.blender.org/blender/blender/pulls/136253
2025-04-03 16:24:04 +02:00
Sergey Sharybin
42cbc52b07 Fix: Warning in Cycles motion blur kernel features expression
This fixes the following warning with MSVC:
device_impl.cpp(287): warning C4805: '|=': unsafe mix of type 'bool' and type 'ccl::uint' in operation

The similar fix is applied to Metal code as well.

There is no short-circuiting boolean operator ||=, so expand the expression.

Pull Request: https://projects.blender.org/blender/blender/pulls/136561
2025-03-26 17:20:33 +01:00
Brecht Van Lommel
f506564a47 Cleanup: Unused argument compiler warning 2025-03-23 21:01:25 +01:00
Michael Jones
c23c4ae6ba Cycles: Fix issue affecting Metal kernel profiling (normally disabled)
This issue only affects profiling mode (`CYCLES_METAL_PROFILING=1`). There's a modest limit to the number of concurrent counter sampling buffers per device, so instead of creating one per device queue, we create one per device that can be reused by successive device queues.

Authored by Emma Liu.

Pull Request: https://projects.blender.org/blender/blender/pulls/136248
2025-03-21 12:47:15 +01:00
Michael Jones
9dca0ba856 Cycles: Maximise MTLCompiler concurrency when GUI isn't active
This PR will result in much faster Metal kernel (re)compilation for command line rendering.

Pull Request: https://projects.blender.org/blender/blender/pulls/136247
2025-03-20 14:07:14 +01:00
Michael Jones
584f19a5af Cycles: Apple Silicon tidy: Remove non-UMA codepaths (v2)
This PR removes a bunch of dead code following #123551 (removal of AMD and Intel GPU support). It is safe to assume that UMA will be available, so a lot of codepaths that dealt with copying between CPU and GPU are now just clutter.

Pull Request: https://projects.blender.org/blender/blender/pulls/136146
2025-03-19 12:53:01 +01:00
Brecht Van Lommel
ab3204e251 Revert "Cycles: Apple Silicon tidy: Remove non-UMA codepaths"
This reverts commit 1a93dfe4fc.

This is hitting asserts in the tests, revert until it's fixed.

Ref #136117
2025-03-18 20:37:23 +01:00
Michael Jones
1a93dfe4fc Cycles: Apple Silicon tidy: Remove non-UMA codepaths
This PR removes a bunch of dead code following #123551 (removal of AMD and Intel GPU support). It is safe to assume that UMA will be available, so a lot of codepaths that dealt with copying between CPU and GPU are now just clutter.

Pull Request: https://projects.blender.org/blender/blender/pulls/136117
2025-03-18 19:09:25 +01:00
Brecht Van Lommel
b487bcd2bd Merge branch 'blender-v4.4-release' 2025-02-13 19:59:25 +01:00
Brecht Van Lommel
f99f958c47 Refactor: Cycles: Add host_alloc/free to device API
This may be used for device to do host memory allocation in a way that
is more efficient for copy the host memory to the device.

Also rename and group device memory allocation functions for clarity.

Pull Request: https://projects.blender.org/blender/blender/pulls/134412
2025-02-13 19:58:56 +01:00
Sean Kim
a2b75c87c8 Merge branch 'blender-v4.4-release' 2025-02-11 14:10:10 -08:00
Sergey Sharybin
a535a1a027 Fix #132782: MetalRT: Missing Geometry in Cycles preview on MacOS 15.2
The issue also happens on macOS 15.3.

This is a Metal driver bug, a fix is coming in macOS 15.4. Until then
disable refitting the viewport. There is no perceptible benefit from
refitting, so while it might be less that ideal it allows to side step
the problem and still benefit from the HWRT.

Pull Request: https://projects.blender.org/blender/blender/pulls/134399
2025-02-11 21:42:38 +01:00
Brecht Van Lommel
2c34786474 Merge branch 'blender-v4.4-release' 2025-02-11 20:43:17 +01:00
Brecht Van Lommel
9ad19396f5 Fix: Cycles Metal invalid storage mode check
Pull Request: https://projects.blender.org/blender/blender/pulls/134337
2025-02-11 20:42:01 +01:00
Brecht Van Lommel
21a90f26b6 Cleanup: Fix C++20 deprecation warnings in Cycles
Pull Request: https://projects.blender.org/blender/blender/pulls/134338
2025-02-11 16:42:03 +01:00
Brecht Van Lommel
e8ebcb3ee3 Fix: Cycles: Check if memory is host mapped without access to device_mem_map
This avoids concurrency issues.

Pull Request: https://projects.blender.org/blender/blender/pulls/132912
2025-01-29 14:12:23 +01:00
Brecht Van Lommel
8b7fce492e Refactor: Cycles: Change API so host and device memory are freed together
With host mapped memory these can be shared, and we can't get back the
original host pointer unless we make a copy which is inefficient.

Also add asserts to verify this doesn't happen.

Pull Request: https://projects.blender.org/blender/blender/pulls/132912
2025-01-29 14:12:19 +01:00
Brecht Van Lommel
2cfe2e0bfe Fix: Cycles: Re-copy memory from host to device without realloc
Should be a bit more efficient, and it fixes host memory fallback bugs,
where host memory was incorrectly freed during re-copy. For the case
where memory should get reallocated on the host, a new mem_move_to_host
was added.

Thanks to Jorn Visser for investigating and finding this problem.

Pull Request: https://projects.blender.org/blender/blender/pulls/132912
2025-01-29 14:11:50 +01:00
Brecht Van Lommel
bd0cca5d6d Fix: Cycles Metal RT assert with persistent data render
Pull Request: https://projects.blender.org/blender/blender/pulls/133490
2025-01-23 15:29:41 +01:00
Michael Jones
fd06944d15 Fix #131458: Cycles Metal workaround for binary archives crash
There is a macOS bug that causes `[binaryArchive serializeToURL]` to crash sometimes. The fix is coming in macOS 15.4.

Pull Request: https://projects.blender.org/blender/blender/pulls/132688
2025-01-06 14:12:22 +01:00
Brecht Van Lommel
9971648783 Refactor: Cycles: Replace new/delete by unique_ptr, in simple cases
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:30 +01:00
Brecht Van Lommel
57ff24cb99 Refactor: Cycles: Add const keyword to more function parameters
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:24 +01:00
Brecht Van Lommel
dd51c8660b Refactor: Cycles: Add const keyword where possible, using clang-tidy
Check was misc-const-correctness, combined with readability-isolate-declaration
as suggested by the docs.

Temporarily clang-format "QualifierAlignment: Left" was used to get consistency
with the prevailing order of keywords.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:20 +01:00
Brecht Van Lommel
689633d802 Refactor: Cycles: Avoid unsafe memcpy and memcmp
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:15 +01:00
Brecht Van Lommel
d9150484a2 Cleanup: Cycles: Remove some unnecessary #if 0 and #if 1
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:09 +01:00
Brecht Van Lommel
d0c2e68e5f Refactor: Cycles: Automated clang-tidy fixups in Cycles
* Use .empty() and .data()
* Use nullptr instead of 0
* No else after return
* Simple class member initialization
* Add override for virtual methods
* Include C++ instead of C headers
* Remove some unused includes
* Use default constructors
* Always use braces
* Consistent names in definition and declaration
* Change typedef to using

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:55 +01:00
Brecht Van Lommel
3c2a6fbb9c Refactor: Cycles: Use nullptr instead of NULL
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:43 +01:00
Sergey Sharybin
aec4ba39b9 Merge branch 'blender-v4.3-release' 2024-11-04 17:54:52 +01:00
Michael Jones
d1368883ed Cycles: MetalRT: Fix logic bug when deciding if HW RT should be used
Don't try to use MetalRT by default unless the device explicitly reports that RT is supported. We shouldn't just rely on an assumption that it's supported for M3 and beyond, ad infinitum.

Pull Request: https://projects.blender.org/blender/blender/pulls/129688
2024-11-04 17:54:12 +01:00
Clément Foucault
47f7aaa2cc Merge branch 'blender-v4.3-release' 2024-11-01 12:16:38 +01:00
Jason Fielder
7fbc9e9428 Fix: Metal: Memory leaks identified by Instruments and Xcode memory graph.
Running Xcode memory graphs and the Instruments tools revealed
memory leaks caused, in the main, by over-retained objects.
This removes the unnecessary 'retains' and adds some asserts
to guard against over-retaining in the future.

There are a few memory leaks remaining involving PyUnicode_DecodeUTF8
but I am unable to identify the cause of these at this time.

Authored by Apple: James McCarthy

Pull Request: https://projects.blender.org/blender/blender/pulls/129117
2024-11-01 11:56:51 +01:00
Weizhen Huang
34b95fe3f6 Cleanup: Cycles: use existing utility functions for geometry types
Pull Request: https://projects.blender.org/blender/blender/pulls/129552
2024-10-30 16:45:56 +01:00
Sergey Sharybin
95f361ac31 Fix: Cycles occasional crash after Metal render
Happens for renders from command line, when kernel specialization
thread is still working after the allocators on the Blender side
have been deinitialized.

Add an explicit deinitializaiton, which ensures all Cycles worker
and cache threads are finished before the allocators are deinitialized.

This should solve occasional crashes when running regression tests
for Metal or Metal-RT.

Pull Request: https://projects.blender.org/blender/blender/pulls/128239
2024-09-27 14:39:49 +02:00
Sergey Sharybin
b96a7b7204 Fix #127622: 4.1 splash screen won't render with MetalRT
The commit which made the issue to be more easily discoverable is 4651f8a08f.

The fix is similar to #127114.

Pull Request: https://projects.blender.org/blender/blender/pulls/128173
2024-09-26 13:39:22 +02:00
Sergey Sharybin
ce6454d02f Fix #126005: x64 Blender on Apple Silicon doesn't render properly in Cycles GPU
The GPU packed state is a static check from the Cycles core perspective,
and it is disabled for non-Apple Silicon GPUs. However, the Metal kernel
always used packed integrator.

This change makes it so the Host and Device side checks for the Host CPU
are aligned, and that Device-side packed state check does not differ from
the Host side.

Pull Request: https://projects.blender.org/blender/blender/pulls/126082
2024-08-08 16:01:23 +02:00
Weizhen Huang
4a4270d73c Merge branch 'blender-v4.2-release' 2024-07-08 16:19:41 +02:00
Michael Jones
5a29be3c75 Cycles: Fix #116243, #122022 - MetalRT live viewport stability issues
This PR fixes live viewport stability issues on Mac when MetalRT is enabled.

There were two sources of instability:

1) `MTLAccelerationStructure` instances were not being correctly retained meaning that use-after-free crashes could occur following a geometry sync.
2) `MTLIntersectionFunctionTable` objects could be unsafely shared between multiple `MetalDeviceQueue` instances (in this case, `setBuffer` being the unsafe mutation)

The solution to 2 involves creating a new `MetalDispatchPipeline` type which is strictly used by only 1 `MetalDeviceQueue` instance.

Pull Request: https://projects.blender.org/blender/blender/pulls/124055
2024-07-08 16:18:34 +02:00
Michael Jones
3c38bff667 Cycles: Fix MetalRT motion blur setup buffer overrun
This PR fixes a buffer overrun crash in the MetalRT backend. When non-traceable objects are in the scene, 'num_motion_transforms' is undercounting and the downstream buffer writes (i.e.`motion_transforms[motion_transform_index++]`) are overrunning.

Pull Request: https://projects.blender.org/blender/blender/pulls/124351
2024-07-08 16:17:19 +02:00
Alaska
c8340cf754 Cycles: Remove AMD and Intel GPU support from Metal backend
This is because with the addition of new features to Cycles, these GPUs
experienced significant performance regressions and bugs, all stemming
from bugs in the Metal GPU driver/compiler. The only reasonable way to
work around these issues was to disable parts of Cycles code on
these GPUs to avoid the driver/compiler bugs.

This resulted in increased development time maintaining these platforms
while being unable to deliver feature parity with other
GPU backends.

It has been decided that this development time is better spent
maintaining platforms that are still actively maintained by
hardware/software vendors, and so AMD and Intel GPU support will be
removed from the Metal backend for Cycles.

Pull Request: https://projects.blender.org/blender/blender/pulls/123551
2024-06-26 17:16:20 +02:00
Alaska
3232458152 Fix #123763: Cycles Metal renders with MNEE stuck on some Macs
On some Macs, MNEE would be disabled in Cycles to work around a bug.
However this just led to these devices skipping over MNEE related
parts of the rendering pipeline and not properly progressing through
the render.

This commit fixes this issue by properly disabling MNEE on these devices.

Pull Request: https://projects.blender.org/blender/blender/pulls/123765
2024-06-26 14:15:01 +02:00
Sergey Sharybin
b803d7fabb Fix: Command line Cycles render crash on multi-CUDA device
Since #118841 there are more cases where Cycles would check for the
graphics interop support. This could lead to a crash when graphics
interop functions are called without having active graphics context.

This change makes it so there is no graphics interop calls when doing
headless render. In order to achieve this the device creation is now
aware of the headless mode.

Pull Request: https://projects.blender.org/blender/blender/pulls/122844
2024-06-07 17:53:44 +02:00
Michael Jones
5508b41a40 Cycles: MetalRT optimisations (scene_intersect_shadow + random_walk)
This PR contains optimisations and a general tidy-up of the MetalRT backend.

- Currently `scene_intersect` is used for both normal and (opaque) shadow rays, however the usage patterns are different enough to warrant specialisation. Shadow intersection tests (flagged with `PATH_RAY_SHADOW_OPAQUE`) only need a bool result, but need a larger "self" payload in order to exclude hits against target lights. By specialising we can minimise the payload size in each case (which is helps performance) and avoid some dynamic branching. This PR introduces a new `scene_intersect_shadow` function which is specialised in Metal, and currently redirects to `scene_intersect` in the other backends.

- Currently `scene_intersect_local` is implemented for worst-case payload requirements as demanded by `subsurface_disk` (where `max_hits` is 4). The random_walk case only demands 1 hit result which we can retrieve directly from the intersector object (rather than stashing it in the payload). By specialising, we significantly reduce the payload size for random_walk queries, which has a big impact on performance. Additionally, we only need to use a custom intersection function for the first ray test in a random walk (for self-primitive filtering), so this PR forces faster `opaque` intersection testing for all but the first random walk test.

- Currently `scene_intersect_volume` has a lot of redundant code to handle non-triangle primitives despite volumes only being enclosed by trimeshes. This PR removes this code.

Additionally, this PR tidies up the convoluted intersection function linking code, removes some redundant intersection handlers, and uses more consistent naming of intersection functions.

On a M3 MacBook Pro, these changes give 2-3% performance increase on typical scenes with opaque trimesh materials (e.g. barbershop, classroom junkshop), but can give over 15% performance increase for certain scenes using random walk SSS (e.g. monster).

Pull Request: https://projects.blender.org/blender/blender/pulls/121397
2024-05-10 16:38:02 +02:00
Attila Áfra
26c93c8359 Cycles: Enable OIDN 2.3 lazy device module loading
This enables the new lazy module loading behavior introduced in OIDN 2.3,
without breaking compatibility with older versions of OIDN (using separate
code paths).

Also, the detection of OIDN support for devices is now much cleaner, and
devices do not need to be matched by PCI address or device name anymore.

Pull Request: https://projects.blender.org/blender/blender/pulls/121362
2024-05-07 14:07:39 +02:00
Michael Jones
9b833fdeba Cycles: Use more accurate GPU counter timestamps for profiling in Metal
This PR replaces the existing CPU wall-clock based profiling mechanism with more precise GPU counter based timestamps. As before, it is enabled by setting the env var `CYCLES_METAL_PROFILING=1`. Original implementation by Morteza Mostajabodaveh.

Pull Request: https://projects.blender.org/blender/blender/pulls/121208
2024-04-29 15:25:32 +02:00
Alaska
1dede89eee Refactor: Allow get_apple_gpu_architecture to report non Apple GPUs
get_apple_gpu_architecture will now report if the GPU being checked
is not an Apple GPU.

At the moment this has no functional changes. But it reduces the
chances of mistakes in the future where a developer tries to enable
a feature on newer Apple GPUs using get_apple_gpu_architecture,
and accidentally enables it on unsupported AMD and Intel GPUs.

Pull Request: https://projects.blender.org/blender/blender/pulls/120448
2024-04-15 15:04:23 +02:00
Alaska
eff4fe24cf Cycles: Properly default to Metal-RT off unless GPU is a M3 or newer
Ever since commit [1], `use_metalrt_by_default` will be True
if the GPU being used is not a M1 or M2 based system.
The intention of this was to enable MetalRT by default for
M3 and newer devices that have hardware for ray traversal.

However the side effect of this change was that all AMD GPUs would
have `use_metalrt_by_default` set to True. Which appears to be the
main culprit causing crashes on older AMD GPUs in #120126.
Since these GPUs don't support MetalRT.

This commit fixes this issue by only setting
`use_metalrt_by_default` to True if the GPU is not M1 or M2 based,
and the GPU is Apple Silicon based. Which equates to M3 or newer.
Which is the original intent of this code.

This resolves the issue where AMD GPUs were being told to use MetalRT
by default, when they shouldn't be.

[1] 322a2f7b12

Pull Request: https://projects.blender.org/blender/blender/pulls/120299
2024-04-09 16:19:24 +02:00
Sergey Sharybin
bffcb000e8 Fix: Cycles crash on Metal GPU with ASAN builds
Running a very simple files when Blender is built with the
WITH_COMPILER_ASAN=ON and WITH_CYCLES_KERNEL_ASAN=ON CMake options
leads to ASAN reporting an unknown-crash at line where the worker
pool is being filled in.

It is not entirely clear if it is a real issue in the code, since
placing debug prints with `this` address report proper addresses,
however there is no harm on capturing `this` pointer by value and
it does solve the ASAN reporting issues.

It is possible to reproduce the ASAN crash with the following steps:
- Start with --factory-startup
- Enable Metal device in User Preferences
- Switch render device to GPU Compute
- Switch viewport more to Rendered

Pull Request: https://projects.blender.org/blender/blender/pulls/119867
2024-03-25 11:36:15 +01:00