Commit Graph

68 Commits

Author SHA1 Message Date
Michael Jones
fd06944d15 Fix #131458: Cycles Metal workaround for binary archives crash
There is a macOS bug that causes `[binaryArchive serializeToURL]` to crash sometimes. The fix is coming in macOS 15.4.

Pull Request: https://projects.blender.org/blender/blender/pulls/132688
2025-01-06 14:12:22 +01:00
Brecht Van Lommel
9971648783 Refactor: Cycles: Replace new/delete by unique_ptr, in simple cases
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:30 +01:00
Brecht Van Lommel
dd51c8660b Refactor: Cycles: Add const keyword where possible, using clang-tidy
Check was misc-const-correctness, combined with readability-isolate-declaration
as suggested by the docs.

Temporarily clang-format "QualifierAlignment: Left" was used to get consistency
with the prevailing order of keywords.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:20 +01:00
Brecht Van Lommel
689633d802 Refactor: Cycles: Avoid unsafe memcpy and memcmp
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:15 +01:00
Brecht Van Lommel
d0c2e68e5f Refactor: Cycles: Automated clang-tidy fixups in Cycles
* Use .empty() and .data()
* Use nullptr instead of 0
* No else after return
* Simple class member initialization
* Add override for virtual methods
* Include C++ instead of C headers
* Remove some unused includes
* Use default constructors
* Always use braces
* Consistent names in definition and declaration
* Change typedef to using

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:55 +01:00
Brecht Van Lommel
3c2a6fbb9c Refactor: Cycles: Use nullptr instead of NULL
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:43 +01:00
Sergey Sharybin
95f361ac31 Fix: Cycles occasional crash after Metal render
Happens for renders from command line, when kernel specialization
thread is still working after the allocators on the Blender side
have been deinitialized.

Add an explicit deinitializaiton, which ensures all Cycles worker
and cache threads are finished before the allocators are deinitialized.

This should solve occasional crashes when running regression tests
for Metal or Metal-RT.

Pull Request: https://projects.blender.org/blender/blender/pulls/128239
2024-09-27 14:39:49 +02:00
Weizhen Huang
4a4270d73c Merge branch 'blender-v4.2-release' 2024-07-08 16:19:41 +02:00
Michael Jones
5a29be3c75 Cycles: Fix #116243, #122022 - MetalRT live viewport stability issues
This PR fixes live viewport stability issues on Mac when MetalRT is enabled.

There were two sources of instability:

1) `MTLAccelerationStructure` instances were not being correctly retained meaning that use-after-free crashes could occur following a geometry sync.
2) `MTLIntersectionFunctionTable` objects could be unsafely shared between multiple `MetalDeviceQueue` instances (in this case, `setBuffer` being the unsafe mutation)

The solution to 2 involves creating a new `MetalDispatchPipeline` type which is strictly used by only 1 `MetalDeviceQueue` instance.

Pull Request: https://projects.blender.org/blender/blender/pulls/124055
2024-07-08 16:18:34 +02:00
Alaska
c8340cf754 Cycles: Remove AMD and Intel GPU support from Metal backend
This is because with the addition of new features to Cycles, these GPUs
experienced significant performance regressions and bugs, all stemming
from bugs in the Metal GPU driver/compiler. The only reasonable way to
work around these issues was to disable parts of Cycles code on
these GPUs to avoid the driver/compiler bugs.

This resulted in increased development time maintaining these platforms
while being unable to deliver feature parity with other
GPU backends.

It has been decided that this development time is better spent
maintaining platforms that are still actively maintained by
hardware/software vendors, and so AMD and Intel GPU support will be
removed from the Metal backend for Cycles.

Pull Request: https://projects.blender.org/blender/blender/pulls/123551
2024-06-26 17:16:20 +02:00
Michael Jones
5508b41a40 Cycles: MetalRT optimisations (scene_intersect_shadow + random_walk)
This PR contains optimisations and a general tidy-up of the MetalRT backend.

- Currently `scene_intersect` is used for both normal and (opaque) shadow rays, however the usage patterns are different enough to warrant specialisation. Shadow intersection tests (flagged with `PATH_RAY_SHADOW_OPAQUE`) only need a bool result, but need a larger "self" payload in order to exclude hits against target lights. By specialising we can minimise the payload size in each case (which is helps performance) and avoid some dynamic branching. This PR introduces a new `scene_intersect_shadow` function which is specialised in Metal, and currently redirects to `scene_intersect` in the other backends.

- Currently `scene_intersect_local` is implemented for worst-case payload requirements as demanded by `subsurface_disk` (where `max_hits` is 4). The random_walk case only demands 1 hit result which we can retrieve directly from the intersector object (rather than stashing it in the payload). By specialising, we significantly reduce the payload size for random_walk queries, which has a big impact on performance. Additionally, we only need to use a custom intersection function for the first ray test in a random walk (for self-primitive filtering), so this PR forces faster `opaque` intersection testing for all but the first random walk test.

- Currently `scene_intersect_volume` has a lot of redundant code to handle non-triangle primitives despite volumes only being enclosed by trimeshes. This PR removes this code.

Additionally, this PR tidies up the convoluted intersection function linking code, removes some redundant intersection handlers, and uses more consistent naming of intersection functions.

On a M3 MacBook Pro, these changes give 2-3% performance increase on typical scenes with opaque trimesh materials (e.g. barbershop, classroom junkshop), but can give over 15% performance increase for certain scenes using random walk SSS (e.g. monster).

Pull Request: https://projects.blender.org/blender/blender/pulls/121397
2024-05-10 16:38:02 +02:00
Sergey Sharybin
bffcb000e8 Fix: Cycles crash on Metal GPU with ASAN builds
Running a very simple files when Blender is built with the
WITH_COMPILER_ASAN=ON and WITH_CYCLES_KERNEL_ASAN=ON CMake options
leads to ASAN reporting an unknown-crash at line where the worker
pool is being filled in.

It is not entirely clear if it is a real issue in the code, since
placing debug prints with `this` address report proper addresses,
however there is no harm on capturing `this` pointer by value and
it does solve the ASAN reporting issues.

It is possible to reproduce the ASAN crash with the following steps:
- Start with --factory-startup
- Enable Metal device in User Preferences
- Switch render device to GPU Compute
- Switch viewport more to Rendered

Pull Request: https://projects.blender.org/blender/blender/pulls/119867
2024-03-25 11:36:15 +01:00
Campbell Barton
e33f5e36ac Cleanup: spacing around C-style comment blocks 2024-03-09 23:40:57 +11:00
Raul Fernandez
324ff4ddef macOS: Remove unnecessary checks now that minimum version is macOS 11.2
MacOS minimum version is now 11.2 we no longer need to check for lower API versions.

Pull Request: https://projects.blender.org/blender/blender/pulls/118388
2024-02-16 19:03:23 +01:00
Brecht Van Lommel
d377ef2543 Clang Format: bump to version 17
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.

If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
2024-01-03 13:38:14 +01:00
Michael Jones
f2bb4c617f Cycles: Apple M3 tuning including hardware raytracing
This PR adds tunings for the [newly announced](https://www.youtube.com/watch?v=ctkW3V0Mh-k) M3 family of chips. In particular, MetalRT will be enabled as the automatic default for intersection testing on M3 and beyond to take advantage of hardware raytracing. This will result in significant path-tracing speedups, as well as faster BVH builds.

Pull Request: https://projects.blender.org/blender/blender/pulls/114296
2023-10-31 11:14:16 +01:00
Michael Jones
1c1c6ac457 Cycles: Fix last failing unit test (T39823) on MetalRT
This PR fixes T39823, the sole failing unit test when running with MetalRT.  It does so by implementing and binding a missing intersection handler (`__anyhit__cycles_metalrt_volume_test_tri`) which is required for `scene_intersect_volume` (as used by `integrator_volume_stack_update_for_subsurface`) to work as intended. This scene exposed the error as it uses subsurface scattering on a sphere which is intersected by volume.

Pull Request: https://projects.blender.org/blender/blender/pulls/112876
2023-09-25 22:41:27 +02:00
Michael Jones
6c98cb73ac Cycles: Use new MetalRT curve primitives for 3D curves and ribbons
This patch updates the experimental MetalRT code path to use new [curve primitives](https://developer.apple.com/videos/play/wwdc2023/10128/) which were recently added in macOS 14. This replaces the previous custom box intersection implementation, allowing the driver to better optimise curve acceleration structures for the GPU. On existing hardware, this can speed up MetalRT renders by up to 40% for scenes that use hair / curve primitives extensively.

The MetalRT option will only be available on macOS >= 14, and requires Xcode >= 15 to build (otherwise the option will be compiled out).

Authored by Marco Giordano, Michael Jones, and Jason Fielder

---
Before / after render times (M1 Max MacBook Pro, macOS 14 beta, MetalRT enabled):
```
                  Custom box intersection      MetalRT curve primitives       Speedup
fishy_cat           111.5                         80.5                         1.39
koro                114.4                         86.7                         1.32
sinosauropteryx     291.8                        279.2                         1.05
spring              142.3                        142.2                         1.00
victor              442.7                        347.7                         1.27
```

---

Pull Request: https://projects.blender.org/blender/blender/pulls/111795
2023-09-13 16:02:49 +02:00
Sergey Sharybin
bad41885db Cleanup: Mark unused function arguments as such
A lot of such cases got discovered since recent change to CLang's
compiler flags for C++.

Pull Request: https://projects.blender.org/blender/blender/pulls/109732
2023-07-05 12:02:06 +02:00
Ray Molenkamp
fac69131ab Cleanup: make format 2023-07-04 08:26:24 -06:00
Michael Jones
24ebf489d6 Cycles: Make use of maximum concurrent compilations on Metal
This patch queries the MTLDevice `maximumConcurrentCompilationTaskCount` property (macOS >= 13.3) to spawn more compilation threads if available.

Pull Request: https://projects.blender.org/blender/blender/pulls/109689
2023-07-04 15:01:48 +02:00
Brecht Van Lommel
f4da74ed29 Revert "Cycles: Make use of maximum concurrent compilations on Metal"
This reverts commit 63d3fc2dcb, because it
causes a build error on the buildbot.

Ref #109655
2023-07-03 20:30:22 +02:00
Michael Jones
63d3fc2dcb Cycles: Make use of maximum concurrent compilations on Metal
This patch queries the MTLDevice `maximumConcurrentCompilationTaskCount` property (macOS >= 13.3) to spawn more compilation threads if available.

Pull Request: https://projects.blender.org/blender/blender/pulls/109655
2023-07-03 18:09:13 +02:00
Campbell Barton
c12994612b License headers: use SPDX-FileCopyrightText in intern/cycles 2023-06-14 16:53:23 +10:00
Sergey Sharybin
5a44922937 Merge branch 'blender-v3.6-release' 2023-05-30 11:47:35 +02:00
Michael Jones
bdf5649f36 Cycles: Remove redundant MetalRT workaround & add two useful DebugFlags
This patch removes a workaround for an issue that is now understood to be undefined behaviour (and fixed by #108176). It also adds two useful debug flags that we would like to be available in Blender 3.6.

Pull Request: https://projects.blender.org/blender/blender/pulls/108322
2023-05-30 11:12:05 +02:00
Sergey Sharybin
ba3f26fac5 Cycles: light and shadow linking
With light linking, lights can be set to affect only specific objects in the
scene. Shadow linking additionally gives control over which objects acts a
shadow blockers for a light.

Usage:
https://wiki.blender.org/wiki/Reference/Release_Notes/4.0/Cycles

Implementation:
https://wiki.blender.org/wiki/Source/Render/Cycles/LightLinking

Ref #104972
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
2023-05-24 14:11:47 +02:00
Michael Jones
4a26f51a55 Cycles: MetalRT: Metal binary archives don't support linked functions
This patch fixes an undefined behaviour where we were trying to use linked functions with binary archives. This isn't supported yet. At best this will fail silently, but this is not guaranteed in future. To fix this we simply disable binary archives if any linked functions are involved. The impact of this is that the `SHADE_SURFACE_RAYTRACE` and `SHADE_SURFACE_MNEE` kernels will fall back to the file system cache when MetalRT is enabled. The file system cache will occasionally be purged due to factors beyond Blender's control.

Pull Request: https://projects.blender.org/blender/blender/pulls/108176
2023-05-23 13:42:25 +02:00
Michael Jones
25fe9d74c9 Cycles: Fix hang when MSL->AIR compilation fails
Follow up to [#100066](https://projects.blender.org/blender/blender/pulls/105122/files). Scenes that do pre-render work on the GPU (e.g. the bake unit tests) will still hang if there are MSL compile failures. This patch adds a loop break out in case of an error.

Pull Request: https://projects.blender.org/blender/blender/pulls/107657
2023-05-05 18:52:54 +02:00
Campbell Barton
6859bb6e67 Cleanup: format (with BraceWrapping::AfterControlStatement "MultiLine") 2023-05-02 09:37:49 +10:00
Michael Jones
70edef1311 Cycles: Fix Metal use-after-free bug
`entryPoint` was being used unsafely following its release.

Pull Request: https://projects.blender.org/blender/blender/pulls/106572
2023-04-05 21:50:14 +02:00
Michael Jones
089e8a1887 Cycles: Fix Metal API validation error (use uint instead of ushort)
This PR fixes an error that is given when Metal API validation is enabled. The compute grid can exceed 65536 threads so `ushort` is not sufficient for `metal_grid_id [[threadgroup_position_in_grid]]`.

This PR also fixes OS version warnings ([Cycles Metal: Unguarded access to newer macOS features #105630](https://projects.blender.org/blender/blender/issues/105630))

Pull Request: https://projects.blender.org/blender/blender/pulls/105763
2023-03-14 22:05:55 +01:00
Michael Jones
a60626ab0b Cycles: Workaround for MetalRT crash when building pipelines
Workaround for a crash when `addComputePipelineFunctionsWithDescriptor` is called *after* `newComputePipelineStateWithDescriptor` with linked functions (i.e. with MetalRT enabled). Ideally we would like to call `newComputePipelineStateWithDescriptor` (async) first so we can bail out if needed, but we can stop the crash by flipping the order when there are linked functions. However when addComputePipelineFunctionsWithDescriptor is called first it will block while it builds the pipeline, offering no way of bailing out.

Note that this only has an impact when the "MetalRT (Experimental)" option is checked.

Pull Request: https://projects.blender.org/blender/blender/pulls/105629
2023-03-10 12:36:58 +01:00
Michael Jones
8f1136e018 Cycles: Use async Metal PSO compilation to avoid std::terminate on exit
When running unit tests or other fast completing renders, forced crashes can occur if there are any slow, outstanding PSO compilation requests (due to the `std::terminate` fall-back case in `~ShaderCache`).

This patch eliminates the need for this shutdown hack by using of the async version of `newComputePipelineStateWithDescriptor` when creating a PSO for the first time. In doing so, we are able to explicitly respond to app shutdown instead of waiting for the pipeline to finish compiling (..and then timing out and force-crashing). We still use the blocking version of `newComputePipelineStateWithDescriptor` when loading from an archive, as this can handle loading from a corrupted archive gracefully. Finally, we move `addComputePipelineFunctionsWithDescriptor` to *after* the PSO is built (as this will trigger a full blocking compile if the PSO has not yet been built, which would bring back the original issue).

Pull Request: https://projects.blender.org/blender/blender/pulls/105506
2023-03-07 17:08:30 +01:00
Michael Jones
7842347ec8 Cycles: Fix hanging unit tests when MetalRT is enabled
This patch fixes hanging unit tests when MetalRT is enabled. It simplifies and fixes the kernel selection logic by baking the MetalRT-specific options into `kernels_md5` rather than expanding out and testing MetalRT bit flags explicitly.

Pull Request #105270
2023-02-28 11:42:08 +01:00
Campbell Barton
9cee0eb7fa Cleanup: format 2023-02-28 15:44:49 +11:00
Michael Jones
626c233dd2 Fix #104087: Cycles crashes (Metal / AMD)
This is a workaround for [issue #104087](https://projects.blender.org/blender/blender/issues/104087). We encounter crashes when using shader binary archives on AMD, so this disables them while we investigate a proper fix. Kernels will still be cached automatically by the OS file system cache. This cache may occasionally be purged due to external factors, in which case kernels will get compiled again.

Pull Request #105186
2023-02-24 17:52:35 +01:00
Michael Jones
482fb791ce Fix #105100: Metal using wrong kernels in multi-pass renders
This fixes issue [#105100](https://projects.blender.org/blender/blender/issues/105100) where multi-pass renders can be incorrect due to kernels using stale specialisation constants (e.g. when rendering Pokedstudio).

This patch adds a new group of md5 hashes (`global_defines_md5`) to track whether the injected block of #defines is stale and regenerate the source string as appropriate. It also renames the existing group of md5 hashes from `source_md5` to `kernels_md5` to clarify that these refer to a specific kernel set rather than just the source (which might build an arbitrarily large number of kernel sets).

Pull Request #105103
2023-02-23 11:07:28 +01:00
Michael Jones
2d994de77c Cycles: MetalRT optimisation for subsurface intersection queries
This patch optimises subsurface intersection queries on MetalRT. Currently intersect_local traverses from the scene root, retrospectively discarding all non-local hits. Using a lookup of bottom level acceleration structures, we can explicitly query only the relevant instance. On M1 Max, with MetalRT selected, this can give a render speedup of 15-20% for scenes like Monster which make heavy use of subsurface scattering.

Patch authored by Marco Giordano.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17153
2023-02-06 19:12:29 +00:00
Michael Jones
654e1e901b Cycles: Use local atomics for faster shader sorting (enabled on Metal)
This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high".

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16909
2023-02-06 11:18:26 +00:00
Brecht Van Lommel
fe552bf236 Cleanup: make format 2023-01-19 22:48:05 +01:00
Michael Jones
e270a198a5 Cycles: Markup to disable specialisation of kernel data fields (Metal)
This patch adds markup to specify that certain kernel data constants should not be specialised. Currently it is used for `tabulated_sobol_sequence_size` and `sobol_index_mask` which change frequently based on the aa sample count, trash the shader cache, and have little bearing on performance.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16968
2023-01-19 17:57:42 +00:00
Michael Jones
08b3426df9 Cycles: Occupancy tuning for new higher end M2 machines
This patch adds occupancy tuning for the newly announced high-end M2 machines, giving 10-15% render speedup over a pre-tuned build.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17037
2023-01-19 17:56:40 +00:00
Brecht Van Lommel
87f7b630b5 Cleanup: make format 2023-01-05 19:43:19 +01:00
Michael Jones
a7cc6e015c Cycles: Additional Metal kernel specialisation exposed through UI
This patch adds a new "Kernel Optimization Level" dropdown menu to control Metal kernel specialisation. Currently this defaults to "full" optimisation, on the assumption that the changes proposed in D16371 will address usability concerns around app responsiveness and shader cache housekeeping.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16514
2023-01-04 23:36:52 +00:00
Chris Blackbourn
496d736adc Cleanup: format 2023-01-05 11:21:51 +13:00
Michael Jones
77c3e67d3d Cycles: Improved render start/stop responsiveness on Metal
All kernel specialisation is now performed in the background regardless of kernel type, meaning that the first render will be visible a few seconds sooner. The only exception is during benchmark warm up, in which case we wait for all kernels to be cached. When stopping a render, we call a new `cancel()` method on the device which causes any outstanding compilation work to be cancelled, and we destroy the device in a detached thread so that any stale queued compilations can be safely purged without blocking the UI for longer than necessary.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16371
2023-01-04 16:00:53 +00:00
Michael Jones
b0e2e45496 Cycles: Enable MetalRT pointclouds & other fixes
Code authored by Marco Giordano.

This fixes pointcloud rendering on MetalRT and some other subtle MetalRT bugs:
- Incorrect kernel hashing
- Missing specialisation constants
- Incorrect visibility filtering
- Missing null pointer check

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16499
2022-11-14 16:39:18 +00:00
Michael Jones
2c596319a4 Cycles: Cache only up to 5 kernels of each type on Metal
This patch adapts D14754 for the Metal backend. Kernels of the same type are already organised into subdirectories which simplifies type matching.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16469
2022-11-11 18:10:29 +00:00
Michael Jones
74140d41b1 Cycles: Apple GPU threadgroup tuning
This patch tunes maximum threads-per-threadgroup and threads-per-block for faster renders on Apple GPUs. Appropriate tuning is selected based on the GPU architecture (M1 or M2). We see a benchmark uplift of around 5-10% on M1 family chips. Similar uplift is expected on M2 with upcoming OS changes. (Ref T101931)

Reviewed By: brecht

Maniphest Tasks: T101931

Differential Revision: https://developer.blender.org/D16299
2022-11-07 10:00:46 +00:00