Commit Graph

138 Commits

Author SHA1 Message Date
Jason Fielder
3d9d67594c GPU: Add GPU frame capture support.
Adds two modes of GPU frame capture support for
enhanced debugging. GPU frame capture begin/end
allow instantaneous frame capture of all GPU commands
within the capture boundary.

GPU frame capture scopes allow several user-defined capture
regions which can wrap key parts of code. These scopes are
exposed to connected GPU tools allowing the user to manually
trigger a capture of a known scope at the desired time.

This is currently integrated with the Metal backend for
support with Xcode.

Related to #105591

Pull Request: https://projects.blender.org/blender/blender/pulls/105717
2023-03-16 08:54:05 +01:00
Jeroen Bakker
397f6250a5 Merge branch 'blender-v3.5-release' 2023-03-16 08:28:21 +01:00
Jason Fielder
7bdd82eca0 Metal: Resolve Race Condition in Memory Manager
Fix race condition if several competing threads are inserting Metal
buffers into the MTLSafeFreeList simultaneously while a new list
chunk is being created.

Also raise the limit for an MTLSafeFreeListChunk size to optimize
for interactivity when releasing lots of memory simultaneously.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/105254
2023-03-16 08:25:15 +01:00
Jason Fielder
94855119da Fix: Metal validation error when shader has no uniforms
Metal buffer binding validation would trigger an error
when a given shader had an empty PushConstantBlock.
This patch removes the default uniform code gen if
no uniforms are present, to avoid any possible issues
with buffers being bound to a shader where the destination
data block is size zero.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/105796
2023-03-16 08:10:38 +01:00
Jason Fielder
ac6a70e5f8 Fix #104012: Selection crash with AMD on Metal
Crash when selecting objects on AMD platforms running
Metal. This was caused by shader compilation warnings
being treated as errors in macOS 10.15. Wrapping
compilation failure with success check resolves error.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/105739
2023-03-16 08:03:15 +01:00
Jason Fielder
b96f8ac9fe Fix #105606: Metal texture upload regression
immDrawPixels performs significantly slower in Metal
than OpenGL. This was caused by two main factors. Firstly,
the additional overhead of tiled texture update, where all
memory needed to be kept in flight for each update, but
caused update to take a slow path. Avoiding tile update
with Metal is more efficient for both memory pressure
and GPU pipelining.

Secondly, on AMD platforms, the staging buffer used
for temporary texture data was page-faulting when
several texture updates would occur within one frame.
This is due to limitations of allocating one large contiguous
memory chunk. Using the Metal buffer pool for staging
data is more efficient.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/105794
2023-03-16 07:59:22 +01:00
Brecht Van Lommel
7b66168bcb Merge branch 'blender-v3.5-release' into main 2023-03-14 18:19:53 +01:00
Jason Fielder
96c6349cbf Fix #103605: Metal barycentric coordinate compilation failure
Fix support for Wireframe and parametric nodes by resolving
compilation failures surrounding barycentric coordinates.
A final missing part of the Metal implementation for barycentric
coordinates was missing.

Feedback also addressed to move barycentric calculation out
of code-gen and into surface_lib.

Authored by Apple: Michael Parkin-White

This also resolves #103606.
Ref #96261

Pull Request: https://projects.blender.org/blender/blender/pulls/105740
2023-03-14 08:23:02 +01:00
Campbell Barton
b3625e6bfd Cleanup: comment blocks 2023-03-09 10:39:49 +11:00
Sebastian Parborg
023524765a Merge branch 'blender-v3.5-release' 2023-03-07 17:35:05 +01:00
Jason Fielder
d31083583c Fix 105449: Resolve selection in Metal backend
MTLFramebuffer's viewport was not correctly updated when
updating attachments. Behaviour modified to be consistent
with OpenGL.

Authored by Apple: Michael Parkin-White

Ref #96261

Pull Request: https://projects.blender.org/blender/blender/pulls/105529
2023-03-07 16:00:23 +01:00
Jeroen Bakker
4e32864786 Cleanup: Remove compilation warning.
In MTLTexture it was checked that this was valid. What in that
case should always be true.
2023-03-06 08:40:30 +01:00
Clément Foucault
fcedc97d11 GPU: Add GPU_BARRIER_BUFFER_UPDATE barrier type
This barrier types is needed for correct readback of buffers GPU memory
to CPU memory.
2023-03-04 07:44:34 +01:00
Clément Foucault
5bb3dc84c1 Merge branch 'blender-v3.5-release' 2023-03-03 12:31:25 +01:00
Jason Fielder
d3cbfc96e0 Metal: Ensure explicit UBO bind indices
Previously, UBO bind locations were linearly incremented and
relied on  the correct uniform location being queried. This fix
is a future requirement for EEVEE next, however, pulling forward
due to Issue #105280 highlighting a possible flaw with expected
uniform locations.

Authored by Apple: Michael Parkin-White

Ref #96261
Pull Request #105311
2023-03-03 12:03:43 +01:00
Jason Fielder
7e5cb36e0c Metal: Fix erroneous outvar replacements.
Resolves issue with node_holdout perfomring an outvar
replacement on a function, causing shader compilation
failure.

Ref #96261
Pull Request #105396
2023-03-03 11:01:36 +01:00
Jacques Lucke
c79c35b4d3 Merge branch 'blender-v3.5-release' 2023-02-27 14:02:56 +01:00
Jason Fielder
f738843362 Metal: Fix possible uniform lookup issue.
Similar to recent issues with gl_shader_interface,
ShaderInput lists need to be sorted to ensure correct
and efficient uniform lookup by name.

Authored by Apple: Michael Parkin-White

Ref #96261
Pull Request #105239
2023-02-27 12:33:28 +01:00
Campbell Barton
efb86b75ee Cleanup: comment block formatting 2023-02-27 21:51:57 +11:00
Clément Foucault
8b543bfa3a Merge branch 'blender-v3.5-release'
# Conflicts:
#	source/blender/gpu/metal/mtl_immediate.mm
2023-02-26 17:44:31 +01:00
Jason Fielder
9f42888552 Fix #104016: Resolve Metal LineLoop emulation.
Metal LineLoop emulation path does not correctly apply
when using SSBO vertex fetch mode alongside 3D line
rendering.

Patch moves line emulation above SSBO
vertex fetch setup to ensure the correct emulation
parameters are passed to the shader.

Authored by Apple: Michael Parkin-White

Ref #96261
Pull Request #105142
2023-02-26 15:21:54 +01:00
Jason Fielder
fb63e484b9 Fix #103398: Fix Icon sampler initialization in Metal backend.
Resolves issue with nearest filtering on UI Icons. Note that as
Metal does not support LOD bias as a parameter on a sampler
object, the original code has been modified to perform LOD
biasing at the shader level.

As GPU_SAMPLER_ICON is not  widely used, it is more
efficient to apply directly to the  affected shaders, rather
than workaround passing in the sampler LOD bias as a
separate value e.g. uniform or push constant.

Original PR feedback addressed to also refactor ICON
shaders to use consistent style for single and multi
Icon rendering.

Authored by Apple: Michael Parkin-White

Ref #96261
Pull Request #105145
2023-02-26 13:23:40 +01:00
Clément Foucault
52ded6ab56 GPUTexture: Make sure all available texture format are supported
This remove default casses from the `switch` statements to catch where
the missing cases are.

Uncomment unimplemented cases for the sake of completeness. Improving the
overall API.

This make the format conversion lists exhaustive and documented.

This replace `validate_data_format_mtl` by the common version as they
don't differ at all now.
2023-02-25 21:47:00 +01:00
Clément Foucault
9fb1f32f06 Cleanup: GPUTexture: Remove _ex suffix from texture creation
It isn't relevant anymore now that usage flags are mandatory.

Pull Request #105197
2023-02-25 11:39:54 +01:00
Clément Foucault
3d6578b33e GPUTexture: Add texture usage flag to all texture creation
All usages are currently placeholder GPU_TEXTURE_USAGE_GENERAL.
2023-02-25 11:39:53 +01:00
Clément Foucault
e01b140fb2 GPUTexture: Remove data_format from 3D texture creation function
For every other texture types this is expected to be implicitly
`GPU_DATA_FLOAT`. There is only one case where this is not the case.

I believe this was previously needed because the data type was
conditionning the texture creation. This is not the case anymore.
2023-02-25 11:39:53 +01:00
Sergey Sharybin
8abef86217 Merge branch 'blender-v3.5-release' 2023-02-20 15:37:02 +01:00
Sergey Sharybin
c7d7175270 Cleanup: Remove check for this pointer not being nullptr
The check was triggering the 'this' pointer cannot be null in
well-defined C++ code

We do not check for this pointer in any other areas. If it is
needed due to possible opaque pointer cast to the check prior
to the cast.

Pull Request #104974
2023-02-20 15:35:24 +01:00
Julian Eisel
c437a8aea8 Revert release branch only commit after merge
This is a revert of a revert, because the initial revert is only
supposed to be in the release branch.

This reverts commit 3eed00dc54.
2023-02-20 11:51:16 +01:00
Julian Eisel
3eed00dc54 Revert "GPencil: Include UV information in simplify->sample modifier."
This reverts commit 19222627c6.

Something went wrong here, seems like this commit merged the main branch
into the release branch, which should never be done.
2023-02-20 11:20:07 +01:00
YimingWu
19222627c6 GPencil: Include UV information in simplify->sample modifier.
Simplify modifier sample mode didn't transfer UV parameters, now fixed.

Pull Request #104942
2023-02-19 11:45:22 +01:00
Campbell Barton
8d35b28f2a Cleanup: spelling in comments 2023-02-15 13:11:14 +11:00
Jason Fielder
7b9d1cb51f Eevee: GPU Material node graph optimization.
Certain material node graphs can be very expensive to run. This feature aims to produce secondary GPUPass shaders within a GPUMaterial which provide optimal runtime performance. Such optimizations include baking constant data into the shader source directly, allowing the compiler to propogate constants and perform aggressive optimization upfront.

As optimizations can result in reduction of shader editor and animation interactivity, optimized pass generation and compilation is deferred until all outstanding compilations have completed. Optimization is also delayed util a material has remained unmodified for a set period of time, to reduce excessive compilation. The original variant of the material shader is kept to maintain interactivity.

Also adding a new concept to gpu::Shader allowing assignment of a parent shader from which a shader can pull PSO descriptors and any required metadata for asynchronous shader cache warming. This enables fully asynchronous shader optimization, without runtime hitching, while also reducing runtime hitching for standard materials, by using PSO descriptors from default materials, ahead of rendering.

Further shader graph optimizations are likely also possible with this architecture. Certain scenes, such as Wanderer benefit significantly. Viewport performance for this scene is 2-3x faster on Apple-silicon based GPUs.

Authored by Apple: Michael Parkin-White

Ref T96261
Pull Request #104536
2023-02-14 21:51:03 +01:00
Clément Foucault
dd171f7743 Cleanup: GPUShader: Rename GPU_shader_uniform_vector
Rename to `GPU_shader_uniform_float/int_ex` to make more sense as a
general purpose function.
2023-02-13 11:22:38 +01:00
Clément Foucault
158f87203e Cleanup: GPUShader: Reorganize GPU_shader.h to separate depecated API
This avoid confusion to what to use nowadays.
Also improves documentation.
2023-02-13 11:22:38 +01:00
Jeroen Bakker
f828ecf4ba GPU: Use same read back API as SSBOs
The GPU module has 2 different styles when reading back data from
GPU buffers. The SSBOs used a memcpy to copy the data to a
pre-allocated buffer. IndexBuf/VertBuf gave back a driver/platform
controlled pointer to the memory.

Readback is done for test cases returning mapped pointers is not safe.
For this reason we settled on using the same approach as the SSBO.
Copy the data to a caller pre-allocated buffer.

Reason why this API is currently changed is that the Vulkan API is more
strict on mapping/unmapping buffers that can lead to potential issues
down the road.

Pull Request #104571
2023-02-13 08:34:19 +01:00
Campbell Barton
91346755ce Cleanup: use '#' prefix for issues instead of 'T'
Match the convention from Gitea instead of Phabricator's T for tasks.
2023-02-12 14:56:05 +11:00
Jason Fielder
0e6da74e98 Fix #104282: Resolve Depth read for D24_S8 types in Metal.
Fixes incorrect spotlight gizmo orientation when moving.

Authored by Apple: Michael Parkin-White

Related to #96261

Pull Request #104537
2023-02-10 20:40:07 +01:00
Jason Fielder
f152159101 Metal: Guard advanced command buffer debugging behind OS version flag.
Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17181
2023-02-07 00:51:16 +01:00
Campbell Barton
96ea0dd458 Cleanup: spelling in comments 2023-02-02 14:00:32 +11:00
Jason Fielder
a504058dee Metal: Optimize GLSL to MSL translation. Improve cached compilation
Reduces the GLSL to MSL translation stage of shader compilation from 120 ms to 5 ms for complex EEVEE materials. This manifests in faster overall compilations, and faster cache hits for secondary compilations, as the MSL variant is needed as a key.

Startup time is also improved for both first-run and second-run. Note that this change does not affect shader compilation times within the Metal API.

Also disables shader output to disk

Authored by Apple: Michael Parkin-White

Ref T96261
Depends on D16990

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17033
2023-01-31 11:06:31 +01:00
Jason Fielder
f3bd5458a3 Metal: Optimise shader texture cache usage and branch reduction via point sampling.
Replace texelFetch calls with a texture point-sample rather than a textureRead call. This increases texture cache utilisation when mixing between sampled calls and reads. Bounds checking can also be removed from these functions, reducing instruction count and branch divergence, as the sampler routine handles range clamping.

Authored by Apple: Michael Parkin-White
Ref T96261

Depends on D16923

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17021
2023-01-31 10:56:25 +01:00
Campbell Barton
79c82fc1c5 Cleanup: trailing space 2023-01-31 15:49:04 +11:00
Campbell Barton
27b4916b1a Cleanup: spelling in comments
Also minor changes in comments:
- Reference BLENDER_HISTORY_FILE instead of the literal file-name
  (simplifies looking up usage).
- Use usernames in tags, as noted in code-style.
2023-01-31 14:22:23 +11:00
Jason Fielder
aca9c131fc Metal: Fix issue with premature safe buffer list flush and optimize memory manager overhead.
Resolve an issue where released buffers were returned to the reusable memory pool before GPU work associated with these buffers had been encoded. Usually release of memory pools is dependent on successful completion of GPU work via command buffer callbacks. However, if the pool refresh operation occurs between encoding of work and submission, buffer ref-count is prematurely decremented.

Patch also ensures safe buffer free lists are only flushed once a set number of buffers have been used. This reduces overhead of small and frequent flushes, without raising the memory ceiling significantly.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17118
2023-01-30 14:54:39 +01:00
Jeroen Bakker
411345757c Cleanup: Remove unused variable.
Introduced in recent commit.
2023-01-30 12:04:44 +01:00
Jason Fielder
6dde185dc4 Metal: Fix edge-case with point primitive restart index removal where all indices are restarts.
Metal backend does not support primtiive restart for point primtiives. Hence strip_restart_indices removes restart indices by swapping them to the end of the index buffer and reducing the length.
An edge-case existed where all indices within the index buffer were restarts and no valid swap-index would be found, resulting in a buffer underflow.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17088
2023-01-30 11:26:38 +01:00
Jason Fielder
57552f52b2 Metal: Realtime compositor enablement with addition of GPU Compute.
This patch adds support for compilation and execution of GLSL compute shaders. This, along with a few systematic changes and fixes, enable realtime compositor functionality with the Metal backend on macOS. A number of GLSL source modifications have been made to add the required level of type explicitness, allowing all compilations to succeed.

GLSL Compute shader compilation follows a similar path to Vertex/Fragment translation, with added support for shader atomics, shared memory blocks and barriers.

Texture flags have also been updated to ensure correct read/write specification for textures used within the compositor pipeline. GPU command submission changes have also been made in the high level path, when Metal is used, to address command buffer time-outs caused by certain expensive compute shaders.

Authored by Apple: Michael Parkin-White

Ref T96261
Ref T99210

Reviewed By: fclem

Maniphest Tasks: T99210, T96261

Differential Revision: https://developer.blender.org/D16990
2023-01-30 11:06:56 +01:00
Jason Fielder
84c25fdcaa Metal: Improve command buffer handling and workload scheduling.
Improve handling for cases where maximum in-flight command buffer count is exceeded. This can occur during light-baking operations. Ensures the application handles this gracefully and also improves workload pipelining by situationally stalling until GPU work has completed, if too much work is queued up.

This may have a tangible benefit for T103742 by ensuring Blender does not queue up too much GPU work.

Authored by Apple: Michael Parkin-White

Ref T96261
Ref T103742
Depends on D17018

Reviewed By: fclem

Maniphest Tasks: T103742, T96261

Differential Revision: https://developer.blender.org/D17019
2023-01-23 17:47:21 +01:00
Jason Fielder
1c672f3d1d Fix T103433: Ensure Metal memory allocator is safe for multi-threaded allocation. Resolves crash when baking indirect lighting.
Also applies correct texture usage flag to light bake texture.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem, jbakker

Maniphest Tasks: T96261, T103433

Differential Revision: https://developer.blender.org/D17018
2023-01-23 17:45:39 +01:00