Commit Graph

1153 Commits

Author SHA1 Message Date
Campbell Barton
eb2867de90 Cleanup: spelling in comments 2023-04-19 08:02:41 +10:00
Xavier Hallade
4382a0b350 Cleanup: avoid warnings from gcc in oneAPI device compilation
When building using GCC and with Embree without GPU support, there were
a few unused variables and a non-defined macro.
2023-04-18 22:40:40 +02:00
Xavier Hallade
70892e82ac Cycles: oneAPI: use specialization constant to compile with/without Embree on GPU 2023-04-18 22:09:42 +02:00
Xavier Hallade
9821a2d397 Cycles: pass kernel features to get_bvh_layout_mask
This allows to selectively disable Hardware Raytracing in oneAPI
backend, depending on features used.
2023-04-18 22:09:42 +02:00
Nikita Sirgienko
3f8c995109 Cycles: add hardware raytracing support to oneAPI device
Updated Embree 4 library with GPU support is required for it to be
compiled - compatiblity with Embree 3 and Embree 4 without GPU support
is maintained.
Enabling hardware raytracing is an opt-in user setting for now.

Pull Request: https://projects.blender.org/blender/blender/pulls/106266
2023-04-18 22:09:42 +02:00
Xavier Hallade
887022257d Cycles: update DPCPP to 2022-12 release
We also backport a patch to program_manager to it as
61e51015a5
helps avoid unnecessary recompilation when enumerating available
kernels.
2023-04-18 22:09:41 +02:00
Campbell Barton
ccea39b538 Cleanup: spelling in comments 2023-04-12 11:24:10 +10:00
Michael Jones
70edef1311 Cycles: Fix Metal use-after-free bug
`entryPoint` was being used unsafely following its release.

Pull Request: https://projects.blender.org/blender/blender/pulls/106572
2023-04-05 21:50:14 +02:00
Xavier Hallade
9e9baa9085 Cycles: Upgrade to new Embree 4 while staying compatible with Embree 3
For more information about Embree 3->4 API changes:
https://github.com/embree/embree/blob/master/doc/src/api.md#upgrading-from-embree-3-to-embree-4

This is not yet enabling HW RT on Arc GPUs using Embree, which is worked on in https://projects.blender.org/blender/blender/pulls/106266

Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com>
Co-authored-by: Stefan Werner <stefan.werner@intel.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/105974
2023-04-05 11:03:06 +02:00
Brecht Van Lommel
4cb670e68f Fix #105148: Cycles Metal memory leak on AMD GPU
After NanoVDB support from 02c2970982, this line should not have been
removed.
2023-04-03 18:18:01 +02:00
Michael Jones
5f61eca7af Cycles: Exploit non-uniform threadgroup sizes on Metal
This patch replaces `dispatchThreadgroups` with `dispatchThreads` which takes care of non-uniform threadgroup bounds. This allows us to remove the bounds guards in the integrator kernel entry points.

Pull Request: https://projects.blender.org/blender/blender/pulls/106217
2023-03-29 21:46:11 +02:00
Sergey Sharybin
d32d787f5f Clang-Format: Allow empty functions to be single-line
For example

```
OIIOOutputDriver::~OIIOOutputDriver()
{
}
```

becomes

```
OIIOOutputDriver::~OIIOOutputDriver() {}
```

Saves quite some vertical space, which is especially handy for
constructors.

Pull Request: https://projects.blender.org/blender/blender/pulls/105594
2023-03-29 16:50:54 +02:00
Campbell Barton
bb2dc141f2 Cleanup: spelling in comments 2023-03-27 12:08:14 +11:00
Brecht Van Lommel
74de2e23a5 Merge branch 'blender-v3.5-release' 2023-03-17 21:53:51 +01:00
Brecht Van Lommel
cc6d8cd573 Fix #105442: Cycles CUDA and HIP host memory fallback not working
Transforming the host pointer should not be done in an assert, it only works
in debug builds then. Caused by 6dcfb6d.
2023-03-17 21:52:29 +01:00
Chris Blackbourn
59a083e948 Cleanup: format 2023-03-16 09:34:38 +13:00
Julian Eisel
30e517c3ca Merge branch 'blender-v3.5-release' 2023-03-15 13:07:26 +01:00
Michael Jones
089e8a1887 Cycles: Fix Metal API validation error (use uint instead of ushort)
This PR fixes an error that is given when Metal API validation is enabled. The compute grid can exceed 65536 threads so `ushort` is not sufficient for `metal_grid_id [[threadgroup_position_in_grid]]`.

This PR also fixes OS version warnings ([Cycles Metal: Unguarded access to newer macOS features #105630](https://projects.blender.org/blender/blender/issues/105630))

Pull Request: https://projects.blender.org/blender/blender/pulls/105763
2023-03-14 22:05:55 +01:00
Pratik Borhade
577fd9add5 Merge branch 'blender-v3.5-release' 2023-03-10 21:11:09 +05:30
Michael Jones
a60626ab0b Cycles: Workaround for MetalRT crash when building pipelines
Workaround for a crash when `addComputePipelineFunctionsWithDescriptor` is called *after* `newComputePipelineStateWithDescriptor` with linked functions (i.e. with MetalRT enabled). Ideally we would like to call `newComputePipelineStateWithDescriptor` (async) first so we can bail out if needed, but we can stop the crash by flipping the order when there are linked functions. However when addComputePipelineFunctionsWithDescriptor is called first it will block while it builds the pipeline, offering no way of bailing out.

Note that this only has an impact when the "MetalRT (Experimental)" option is checked.

Pull Request: https://projects.blender.org/blender/blender/pulls/105629
2023-03-10 12:36:58 +01:00
Patrick Mours
7edb3ab5e0 Merge branch 'blender-v3.5-release' 2023-03-09 13:16:15 +01:00
Patrick Mours
dcfc9629c2 Fix OptiX TLAS being built with invalid traversables when a geometry is empty
The traversable handle of a BLAS may be zero when the relevant geometry
is empty (no triangles/curves/points/...), as no BLAS is built in such cases.
It is not correct to attach a zero handle to a TLAS, so filter out such instances.
2023-03-09 13:15:08 +01:00
Campbell Barton
b3625e6bfd Cleanup: comment blocks 2023-03-09 10:39:49 +11:00
Sebastian Parborg
023524765a Merge branch 'blender-v3.5-release' 2023-03-07 17:35:05 +01:00
Michael Jones
8f1136e018 Cycles: Use async Metal PSO compilation to avoid std::terminate on exit
When running unit tests or other fast completing renders, forced crashes can occur if there are any slow, outstanding PSO compilation requests (due to the `std::terminate` fall-back case in `~ShaderCache`).

This patch eliminates the need for this shutdown hack by using of the async version of `newComputePipelineStateWithDescriptor` when creating a PSO for the first time. In doing so, we are able to explicitly respond to app shutdown instead of waiting for the pipeline to finish compiling (..and then timing out and force-crashing). We still use the blocking version of `newComputePipelineStateWithDescriptor` when loading from an archive, as this can handle loading from a corrupted archive gracefully. Finally, we move `addComputePipelineFunctionsWithDescriptor` to *after* the PSO is built (as this will trigger a full blocking compile if the PSO has not yet been built, which would bring back the original issue).

Pull Request: https://projects.blender.org/blender/blender/pulls/105506
2023-03-07 17:08:30 +01:00
Hans Goudey
8fbc80be8f Merge branch 'blender-v3.5-release' 2023-02-28 11:36:20 -05:00
Michael Jones
7842347ec8 Cycles: Fix hanging unit tests when MetalRT is enabled
This patch fixes hanging unit tests when MetalRT is enabled. It simplifies and fixes the kernel selection logic by baking the MetalRT-specific options into `kernels_md5` rather than expanding out and testing MetalRT bit flags explicitly.

Pull Request #105270
2023-02-28 11:42:08 +01:00
Campbell Barton
9cee0eb7fa Cleanup: format 2023-02-28 15:44:49 +11:00
William Leeson
6c03339e48 Cycles: reduce mesh memory usage by unflattening
To improve mesh upload speeds and reduce the size of the scene data which allows larger scenes to be rendered.

The meshes in Cycles are currently stored as flattened meshes, where each triangle is stored as a set of 3 vertices. Unflattening writes out the vertices in a list according to the index buffer. This uses a lot of memory and for current hardware does not provide a noticeable benefit. This change unflattens the mesh by directly using the meshes vertex and index buffers directly and skips the unflattening. This change allows for larger scenes and also a reduction in the sizes of the meshes. Further it results in a decrease the amount of time it takes to upload the data to a GPU. This is especially important for when multiple GPUs are used in a single machine.

Pull Request #105173
2023-02-27 10:39:19 +01:00
Chris Blackbourn
86ceb6722f Cleanup: format 2023-02-26 11:55:22 +13:00
Pratik Borhade
e3538546f2 Merge branch 'blender-v3.5-release' 2023-02-24 22:36:18 +05:30
Michael Jones
82ff277528 Fix #100066: Cycles hangs when MSL->AIR compilation fails
This fixes [#100066](https://projects.blender.org/blender/blender/issues/100066) by failing hard when front-end MSL->AIR compilation errors are encountered.

Pull Request #105122
2023-02-24 17:55:27 +01:00
Michael Jones
626c233dd2 Fix #104087: Cycles crashes (Metal / AMD)
This is a workaround for [issue #104087](https://projects.blender.org/blender/blender/issues/104087). We encounter crashes when using shader binary archives on AMD, so this disables them while we investigate a proper fix. Kernels will still be cached automatically by the OS file system cache. This cache may occasionally be purged due to external factors, in which case kernels will get compiled again.

Pull Request #105186
2023-02-24 17:52:35 +01:00
Sybren A. Stüvel
c8ed48d5ca Merge remote-tracking branch 'origin/blender-v3.5-release' 2023-02-23 11:28:02 +01:00
Michael Jones
482fb791ce Fix #105100: Metal using wrong kernels in multi-pass renders
This fixes issue [#105100](https://projects.blender.org/blender/blender/issues/105100) where multi-pass renders can be incorrect due to kernels using stale specialisation constants (e.g. when rendering Pokedstudio).

This patch adds a new group of md5 hashes (`global_defines_md5`) to track whether the injected block of #defines is stale and regenerate the source string as appropriate. It also renames the existing group of md5 hashes from `source_md5` to `kernels_md5` to clarify that these refer to a specific kernel set rather than just the source (which might build an arbitrarily large number of kernel sets).

Pull Request #105103
2023-02-23 11:07:28 +01:00
Brecht Van Lommel
02c2970983 Cycles: add NanoVDB support for Metal on Apple Silicon
Contributed by Yulia Kuznetcova at Apple.

NanoVDB is patched to give add address spaces required by Metal. We hope that
in the future Metal will support the generic address space.

For AMD and Intel this is currently not available since it causes a performance
regression also on scenes without volumes.

Pull Request #104837
2023-02-21 15:03:52 +01:00
Brecht Van Lommel
6583acb880 Fix Cycles MetalRT access of macOS 11 features when unavailable
After recent changes in 2d994de.

Pull Request #104976
2023-02-21 12:03:21 +01:00
Brecht Van Lommel
6a0b1eae8c Fix #104097: re-enable Cycles AMD Vega support
The internal compiler error appears to be gone. Unclear why it appeared in the
first place and why it's gone now. Just random kernel code changes causing it.

Pull Request #104719
2023-02-13 22:53:08 +01:00
Campbell Barton
91346755ce Cleanup: use '#' prefix for issues instead of 'T'
Match the convention from Gitea instead of Phabricator's T for tasks.
2023-02-12 14:56:05 +11:00
Michael Jones (Apple)
01480229b1 Cycles: Fix MetalRT checkbox not hooked up to device on AMD
(Follow on from D17043)
On AMD Navi2 devices the MetalRT checkbox was not hooked up properly and had no effect. This patch fixes it.

Co-authored-by: Michael Jones <michael_p_jones@apple.com>
Pull Request #104520
2023-02-10 10:55:39 +01:00
Lucas Tadeu
a1282ab015 Fix Cycles debug build error after host falback changes
Introduced in dcfb6df9ce6.

Co-authored-by: Lucas Tadeu Teixeira <lucas@lucastadeu.com>

Pull Request #104454
2023-02-08 19:27:40 +01:00
Campbell Barton
a99022e22d Cleanup: spelling in comments 2023-02-07 14:17:01 +11:00
Nikita Sirgienko
6dcfb6df9c Cycles: Abstract host memory fallback for GPU devices
Host memory fallback in CUDA and HIP devices is almost identical.
We remove duplicated code and create a shared generic version that
other devices (oneAPI) will be able to use.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17173
2023-02-06 22:19:32 +01:00
Michael Jones
2d994de77c Cycles: MetalRT optimisation for subsurface intersection queries
This patch optimises subsurface intersection queries on MetalRT. Currently intersect_local traverses from the scene root, retrospectively discarding all non-local hits. Using a lookup of bottom level acceleration structures, we can explicitly query only the relevant instance. On M1 Max, with MetalRT selected, this can give a render speedup of 15-20% for scenes like Monster which make heavy use of subsurface scattering.

Patch authored by Marco Giordano.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17153
2023-02-06 19:12:29 +00:00
Patrick Mours
f2538c7173 Fix T104335: MNEE + OptiX OSL results in illegal address error
The OptiX pipeline created for OSL was missing sufficient continuation
stack to handle the MNEE ray generation program.
2023-02-06 15:06:52 +01:00
Michael Jones
654e1e901b Cycles: Use local atomics for faster shader sorting (enabled on Metal)
This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high".

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16909
2023-02-06 11:18:26 +00:00
Michael Jones
be0912a402 Cycles: Prevent use of both AMD and Intel Metal devices at same time
This patch removes the option to select both AMD and Intel GPUs on system that have both. Currently both devices will be selected by default which results in crashes and other poorly understood behaviour. This patch adds precedence for using any discrete AMD GPU over an integrated Intel one. This can be overridden with CYCLES_METAL_FORCE_INTEL.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17166
2023-02-06 11:13:33 +00:00
Michael Jones
0a3df611e7 Fix T103393: Cycles: Undefine __LIGHT_TREE__ on Metal/AMD to fix perf
This patch fixes T103393 by undefining `__LIGHT_TREE__` on Metal/AMD as it has an unexpected & major impact on performance even when light trees are not in use.

Patch authored by Prakash Kamliya.

Reviewed By: brecht

Maniphest Tasks: T103393

Differential Revision: https://developer.blender.org/D17167
2023-02-06 11:12:34 +00:00
Campbell Barton
266d8de687 Cleanup: spelling in comments 2023-02-03 12:41:01 +11:00
Xavier Hallade
8afcecdf1f Cycles: update Intel Graphics compiler to 101.4032 on Windows
A noticeable (>5%) performance regression in oneAPI backend came with
a501a2dbff. Updating to latest graphics
compiler from driver 101.4032 fixes it.

I've tested it with current min-supported drivers and it runs well but
since compatibility of graphics compiler with older drivers isn't
guaranteed, I'm also bumping the min-supported driver versions.

If end-users consider latest drivers too fresh to switch to (version
isn't released as stable on Linux as of today but should be before
Blender 3.5 release), CYCLES_ONEAPI_ALL_DEVICES=1 env variable can be
used.

Intel Graphics Compiler on Linux will be updated in a later commit
so we can then close D16984.

Reviewed By: sergey, LazyDodo
2023-01-23 19:36:34 +01:00