56 Commits

Author SHA1 Message Date
Clément Foucault
783472671e Cleanup: GPU: Add macro for default constructor compatibility on MSL 2025-03-03 12:50:45 +01:00
Clément Foucault
2c20c200bf Cleanup: GPU: Remove warning about is_zero redundant declaration 2025-03-03 12:50:45 +01:00
Clément Foucault
86b70143d5 Cleanup: GPU: Remove unused Transform Feedback implementation
Most of the cleanup is inside the metal backend.

Pull Request: https://projects.blender.org/blender/blender/pulls/134349
2025-02-10 17:30:42 +01:00
Campbell Barton
4cd827870d Cleanup: quiet check_spelling_* targets
Also correct outdated references to `ghash`.
2025-02-02 13:58:34 +11:00
Clément Foucault
651ae0e47c Metal: Add OOB coordinate rejection to image atomic functions
These should have been guarded but were not, causing
out-of-bounds buffer access errors on Apple devices.
2025-01-31 16:17:58 +01:00
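The guard described in 651ae0e47c can be pictured with a minimal C++ sketch. `AtomicImage2D` and `image_atomic_add` are hypothetical names, not the Metal backend's code, and returning 0 for a rejected coordinate is an assumption made for illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 2D image whose texels support atomic access.
struct AtomicImage2D {
  int width;
  int height;
  std::vector<std::atomic<std::uint32_t>> texels;

  AtomicImage2D(int w, int h)
      : width(w), height(h), texels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h))
  {
    for (auto &texel : texels) {
      texel.store(0u);
    }
  }
};

// Atomic add that rejects out-of-bounds coordinates before touching memory.
// Returns the previous texel value, or 0 when the coordinate is rejected
// (the return value for the rejected case is an assumption).
std::uint32_t image_atomic_add(AtomicImage2D &img, int x, int y, std::uint32_t value)
{
  if (x < 0 || y < 0 || x >= img.width || y >= img.height) {
    return 0u;
  }
  const std::size_t index = static_cast<std::size_t>(y) * static_cast<std::size_t>(img.width) +
                            static_cast<std::size_t>(x);
  return img.texels[index].fetch_add(value);
}
```

Without the check, the coordinate (4, 0) in a 4x2 image would silently alias texel (0, 1), and larger coordinates would land outside the buffer.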
Clément Foucault
067f6767d4 Fix #129571: Metal: Broken texture atomic workaround
The refactor 9c0321ae9b
had the wrong mental model of the backing texture
layout for the atomic workaround.

For 3D textures, the layout flattens the 3D texture
and reinterprets each texel's linear location as a 2D
linear location. This breaks the 3D texture Z slices
into non-contiguous regions in 2D.

Comments have been added to avoid future confusion.

Pull Request: https://projects.blender.org/blender/blender/pulls/133830
2025-01-31 16:10:59 +01:00
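The layout described in 067f6767d4 can be sketched as a coordinate remap. This is an illustrative reading of the commit message, not the backend's code: `texel_3d_to_2d` and the `backing_width` parameter are hypothetical, and the real backing texture dimensions are not stated in the message.

```cpp
#include <cstdint>

struct Int2 {
  int x, y;
};
struct Int3 {
  int x, y, z;
};

// Map a 3D texel coordinate to a 2D backing texture by going through the
// linear (flattened) texel index. `backing_width` is a hypothetical parameter.
Int2 texel_3d_to_2d(Int3 co, Int3 size, int backing_width)
{
  const std::int64_t linear = co.x + std::int64_t(size.x) * (co.y + std::int64_t(size.y) * co.z);
  return {int(linear % backing_width), int(linear / backing_width)};
}
```

Because the backing width generally differs from the 3D texture width, a row of one Z slice can wrap across two rows of the 2D texture, which is why the slices do not form contiguous rectangles.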
Clément Foucault
994c43413a Metal: Remove SSBO Vertex Fetch
This API was used as a workaround for the lack of
geometry shaders. It has been rendered redundant
since the introduction of #125782.
2024-12-05 22:58:52 +01:00
Clément Foucault
62826931b0 GPU: Move more linting and processing of GLSL to compile time
The goal is to reduce the startup time cost of
all this parsing and string replacement.

All comments are now stripped at compile time.
This comment check added noticeable slowdown at
startup in debug builds and during preprocessing.

Put all metadata between start and end tokens.
Use very simple parsing using `StringRef` and
hash all identifiers.

Move all the complexity to the preprocessor, which
massages the metadata into a well-defined input
for the runtime parser.

All identifiers are compile time hashed so that no string
comparison is made at runtime.

Speed up the source loading:
- from 10ms to 1.6ms (6.25x speedup) in release
- from 194ms to 6ms (32.3x speedup) in debug

Follow up #129009

Pull Request: https://projects.blender.org/blender/blender/pulls/128927
2024-10-15 19:47:30 +02:00
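The compile-time hashing idea in 62826931b0 can be illustrated with a small C++17 sketch. The hash function (FNV-1a), the `Builtin` enum, and the identifier names are assumptions chosen for the example; the commit message does not say which hash or identifiers the preprocessor uses.

```cpp
#include <cstdint>
#include <string_view>

// FNV-1a, usable in constant expressions (the choice of hash is an assumption).
constexpr std::uint64_t hash_identifier(std::string_view str)
{
  std::uint64_t hash = 14695981039346656037ull;
  for (char c : str) {
    hash ^= static_cast<unsigned char>(c);
    hash *= 1099511628211ull;
  }
  return hash;
}

// Hypothetical set of identifiers the runtime parser cares about.
enum class Builtin { Unknown, FragCoord, VertexID };

// The case labels are hashed by the compiler, so the runtime side only
// compares integers and never compares strings.
Builtin parse_builtin(std::uint64_t hashed)
{
  switch (hashed) {
    case hash_identifier("gl_FragCoord"):
      return Builtin::FragCoord;
    case hash_identifier("gl_VertexID"):
      return Builtin::VertexID;
    default:
      return Builtin::Unknown;
  }
}
```

In this arrangement the preprocessor would emit the hashed value into the metadata, and the runtime parser would pass it straight to a function like `parse_builtin`.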
Clément Foucault
9c0321ae9b Metal: Simplify MSL translation
Move most of the string preprocessing used for MSL
compatibility to `glsl_preprocess`.

Enforce some changes like matrix constructor and
array constructor to the GLSL codebase. This is
for C++ compatibility.

Additionally reduce the amount of code duplication
inside the compatibility code.

Pull Request: https://projects.blender.org/blender/blender/pulls/128634
2024-10-07 12:54:10 +02:00
Clément Foucault
dcd80dbe15 GPU: GLSL C++ stubs
Allows compiling GLSL code using a C++ compiler. The end result is that
IDE features such as autocompletion and error detection can work with
the GLSL codebase.

Rel #127983

Pull Request: https://projects.blender.org/blender/blender/pulls/128598
2024-10-04 17:44:24 +02:00
Jason Fielder
57f7d6380c Fix #126542 Fix UV Edge overlays in Metal
Takes into account any offset that must be added to the vertex index
(usually supplied as baseVertex or startVertex in the Metal draw call)
in the code that emulates the SSBO vertex fetch.

Authored by Apple: James McCarthy

Pull Request: https://projects.blender.org/blender/blender/pulls/127864
2024-09-25 15:22:30 +02:00
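The fix in 57f7d6380c amounts to adding the draw call's vertex offset when the emulated fetch resolves a vertex. A minimal C++ sketch, with hypothetical names (`VertexUV`, `fetch_vertex`) that do not come from the Blender sources:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct VertexUV {
  float u, v;
};

// Emulated vertex fetch: read the index, then apply the base vertex offset
// that the draw call would otherwise apply implicitly.
VertexUV fetch_vertex(const std::vector<VertexUV> &vertices,
                      const std::vector<std::uint32_t> &indices,
                      std::size_t index_id,
                      std::int32_t base_vertex)
{
  const std::int64_t vertex_id = std::int64_t(indices[index_id]) + base_vertex;
  return vertices[static_cast<std::size_t>(vertex_id)];
}
```

Dropping the `base_vertex` term would make every batch with a non-zero offset read the wrong vertices.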
Aras Pranckevicius
7fdfa47f23 VSE: Rounded corners for timeline strips
VSE timeline strips now have rounded corners. Strip corner rounding radius is
4, 6 or 8px depending on strip height (if strip is too narrow to fit
rounding, then rounding is turned off).

This is achieved with a dedicated GPU shader for drawing most of the VSE
strip widget, so that it can do proper rounded corner masking.

More details and images in the PR.

Pull Request: https://projects.blender.org/blender/blender/pulls/122576
2024-06-04 20:05:35 +02:00
Jason Fielder
f20ad70c08 EEVEE Next: Add imageStore/LoadFast ops to Renderpasses
Add fast image writing and reading variants for render passes.
These variants do not perform range checking on values
and should only be used in cases where the written texel is
guaranteed to be in range. This eliminates additional
branching and simplifies shader logic.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/121116
2024-05-24 12:41:47 +02:00
Clément Foucault
bc1c35e201 EEVEE-Next: Change lights object matrix for more readability and compactness
This adds a new `Transform` type similar to cycles that reduces
the amount of data passed for a typical affine 3D transform.

This then applies this type to the light data and cleans up
all usage of the former `object_mat`. This also changes the axes
macros into utility accessor functions.

Pull Request: https://projects.blender.org/blender/blender/pulls/121089
2024-04-26 12:54:08 +02:00
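To see why a dedicated transform type saves data (bc1c35e201), here is a minimal C++ sketch assuming a three-row layout like the one Cycles uses. The struct layout and the `transform_point` helper are assumptions; the commit message does not describe the actual layout of the new `Transform` type.

```cpp
struct float3 {
  float x, y, z;
};
struct float4 {
  float x, y, z, w;
};

// Assumed compact affine transform: the three meaningful rows of a 4x4
// matrix. The fourth row of an affine matrix is always (0, 0, 0, 1), so it
// does not need to be stored.
struct Transform {
  float4 x, y, z;
};

// Full matrix, for size comparison only.
struct float4x4 {
  float4 rows[4];
};

static_assert(sizeof(Transform) == 12 * sizeof(float), "12 floats");
static_assert(sizeof(float4x4) == 16 * sizeof(float), "16 floats");

// Transform a point: each row is a dot product plus the translation in `w`.
inline float3 transform_point(const Transform &t, float3 p)
{
  return {t.x.x * p.x + t.x.y * p.y + t.x.z * p.z + t.x.w,
          t.y.x * p.x + t.y.y * p.y + t.y.z * p.z + t.y.w,
          t.z.x * p.x + t.z.y * p.y + t.z.z * p.z + t.z.w};
}
```

Under this assumption each transform carries 12 floats instead of 16, a 25% reduction.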
Clément Foucault
2a600b4a83 EEVEE-Next: Shadow: Limit view per shadow map projection
This limits the number of tilemaps per LOD that can be fed to avoid the
easy to hit "Too many shadow updates" (#119757).

This allows for a maximum of 64 tilemaps to be updated at once at their lowest
requested LOD (so ~10.6667 point lights if every face of the punctual
shadow map is needed, but likely more in practice).

Unfortunately this is still quite low and will surely be hit quite soon
with directional shadow added to it. One idea to work around this would
be to time slice the update of some lights, but this opens a whole can
of worms that I'm not ready to open for now so I created #119890 for
future reference.

Some notes: most lights seem to request around 3 LODs. It might help
to allow requesting at least 2 LODs if we are rendering, since volumes
might want a lower LOD available.

I added a very simplistic heuristic that also lowers the max tilemaps
when transforming, animation playback or navigating the 3D view to
improve the responsiveness of the engine. Note that this doesn't
only lowers the resolution to the minimum requested one. So it should
be good enough in most cases.

Pull Request: https://projects.blender.org/blender/blender/pulls/119889
2024-03-26 20:33:31 +01:00
Hans Goudey
8b514bccd1 Cleanup: Move remaining GPU headers to C++
Pull Request: https://projects.blender.org/blender/blender/pulls/119807
2024-03-23 01:24:18 +01:00
Jason Fielder
6b56ed3cd3 Metal: Resolve artifact in EEVEE Next Film Cryptomatte
Cryptomatte passes would generate a feathered outline
in Metal due to a missing texture fence in chained
read->modify->write->read->... patterns.

Added an imageFence function to explicitly state that
imageStore writes should be visible to future imageLoad calls.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/119163
2024-03-14 17:48:30 +01:00
Clément Foucault
4205718dce GPU: Cleanup type aliases
This defines all aliases for supported types,
documents which ones to use in C++ shared code,
and moves relevant defines to their backend file.

Rename `bool1` to `bool32_t` and clean up
its usage as mentioned in #118961.

Rel. #118961

Pull Request: https://projects.blender.org/blender/blender/pulls/119098
2024-03-08 19:09:10 +01:00
Clément Foucault
749a3880de GL: Remove cube map array workaround 2024-01-31 18:12:59 +01:00
Jason Fielder
190567f941 EEVEE Next: Optimize HiZ with fast image load store routines
Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/116953
2024-01-24 09:36:25 +01:00
Jason Fielder
d721dcd767 Metal: Resolve texture atomic compilation issue
Resolves small issue with native texture
atomic support after addition of fallback path.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/116657
2023-12-31 01:07:47 +01:00
Campbell Barton
77204bed17 Cleanup: spelling in comments 2023-12-12 12:58:56 +11:00
Jason Fielder
9313750f0a Metal: Add fallback path for texture atomics V2
This patch adds an alternative path for devices/OSs
which do not support native texture atomics in Metal.
Support is encapsulated within the backend, ensuring
any allocated texture with the USAGE_ATOMIC flag is
allocated with a backing buffer, upon which atomic
operations happen.

The shader generation is also changed for the atomic
case, which instructs the backend to insert additional
buffer bind-points for the buffer resource. As Metal
also only supports buffer-backed textures for
textureBuffers or 2D textures, TextureArrays and
3D textures are emulated within a 2D texture, with
sample locations being indirected.

All usage of atomic textures MUST now utilise the
correct atomic texture types in the high level shader
and GPUShaderCreateInfo declarations.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/115956
2023-12-11 23:00:20 +01:00
Clément Foucault
0684b68eb4 EEVEE-Next: Make Ambient Occlusion Pass use Horizon Scan
This adds a new way of computing occlusion using a visibility bitmask. To
make it more algorithm agnostic, we name it horizon scan.
This cleans up and simplifies the code compared to the Horizon based solution.
There is no more trickery for fading the influence of distant samples, which
makes the result match Cycles more closely.

This introduces a new thickness option. Keeping it relatively low
makes it possible to avoid over-occlusion caused by geometry in front.
Making it too low will cause under-occlusion.

Related #112979

Pull Request: https://projects.blender.org/blender/blender/pulls/114150
2023-11-02 19:22:01 +01:00
Campbell Barton
49218f531a Cleanup: format 2023-10-20 14:20:45 +11:00
Clément Foucault
f79b86553a EEVEE-Next: Add mesh volume bounds estimation
This adds correct object bounds estimation.

This works by creating an occupancy texture where one
bit represents one froxel. A geometry pre-pass fills this
occupancy texture and doesn't do any shading. Each bit
set to 0 will not be considered occupied by the object
volume and will discard the material compute shader for
this froxel.

There are 2 methods of computing the occupancy map:
- Atomic XOR: For each fragment we compute the number of
  froxel **centers** in front of it. We then convert that
  into an occupancy bitmask that we apply to the occupancy
  texture using `imageAtomicXor`. This is straightforward
  and works well for any manifold geometry.
- Hit List: For each fragment we write the fragment depth
  in a list (contained in one array texture). This list
  is then processed by a fullscreen pass (see
  `eevee_occupancy_convert_frag.glsl`) that sorts and
  converts all the hits to the occupancy bits. This
  emulates Cycles behavior by considering only back-face
  hits as exit events and front-face hits as entry events.
  The result is stored to the occupancy texture using a
  bit-wise `OR` operation to compose it with other non-hit
  list objects. This also decouples the hit-list evaluation
  complexity from the material evaluation shader.

## Limitations
### Fast
- Non-manifold geometry objects are rendered incorrectly.
- Non-manifold geometry objects will affect other objects
  in front of them.
### Accurate
- Limited to 16 hits per layer for now.
- Non-manifold geometry objects will affect other objects
  in front of them.

Pull Request: https://projects.blender.org/blender/blender/pulls/113731
2023-10-19 19:22:14 +02:00
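The Atomic XOR method from f79b86553a can be sketched on a single 32-bit occupancy word. This is a simplified model under stated assumptions: one word covers one column of 32 froxels, bit `i` stands for froxel `i` along depth, and `std::atomic::fetch_xor` stands in for `imageAtomicXor`. The helper names are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Bitmask with the lowest `count` bits set, i.e. all froxels whose center
// lies in front of the fragment. `count` is clamped to [0, 32].
constexpr std::uint32_t froxels_in_front_mask(int count)
{
  return (count <= 0)  ? 0u :
         (count >= 32) ? 0xFFFFFFFFu :
                         ((std::uint32_t(1) << count) - 1u);
}

// Apply one fragment to the occupancy word of its froxel column.
inline void occupancy_xor(std::atomic<std::uint32_t> &occupancy, int froxel_centers_in_front)
{
  occupancy.fetch_xor(froxels_in_front_mask(froxel_centers_in_front));
}
```

For a closed (manifold) surface, a front-face fragment with 2 froxel centers in front and a back-face fragment with 5 produce `0b00011 ^ 0b11111 = 0b11100`, so only the froxels between the two surfaces remain set. An open surface leaves an unpaired mask behind, which is consistent with the non-manifold limitation listed for the fast method.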
Harley Acheson
f30280a1d1 Cleanup: Make format
Formatting changes resulting from Make Format.
2023-10-01 08:50:28 -07:00
Clément Foucault
ad50ded7b5 Metal: Fix texture atomic wrapper
GLSL imageAtomic operations operate on single components.
2023-09-30 21:37:44 +02:00
Campbell Barton
077832e063 Cleanup: spelling in comments 2023-09-26 19:50:48 +10:00
Jason Fielder
ee03bb38cb Metal: Add support for atomic image operations
Texture Atomics have been added in Metal 3.1
and enable the original implementations of
shadow update and irradiance cache baking.

However, a fallback solution will be
required for versions under macOS 14.0 utilising
buffer-backed textures instead.

This patch also includes a stub implementation if
building/running on older macOS versions which
provides locally-synchronized texture access in
place of atomics. This enables some effects to be
partially tested, and ensures non-guarded use
of imageAtomic functions does not result
in compilation failure.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/112866
2023-09-25 21:56:46 +02:00
Jason Fielder
109bc2d416 GPU: Add imageStoreFast for increased write performance
imageStoreFast provides a variant of imageStore which does
not perform any bounds checking, reducing shader divergence,
register pressure and increasing performance through fewer
instructions.

However, this should only be used for cases where the writing
coordinate is guaranteed to fall within the texture.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/111750
2023-09-04 15:15:55 +02:00
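The trade-off behind `imageStoreFast` (109bc2d416) can be modeled in C++. The `Image2D` type and both functions are hypothetical stand-ins; silently dropping an out-of-range write in the checked variant is an assumption about what the bounds check does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel image.
struct Image2D {
  int width;
  int height;
  std::vector<float> texels;

  Image2D(int w, int h)
      : width(w), height(h), texels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), 0.0f)
  {
  }
};

// Checked store: the extra branch drops writes that fall outside the image.
void image_store(Image2D &img, int x, int y, float value)
{
  if (x < 0 || y < 0 || x >= img.width || y >= img.height) {
    return;
  }
  img.texels[std::size_t(y) * std::size_t(img.width) + std::size_t(x)] = value;
}

// Fast store: no branch in release builds. The caller must guarantee the
// coordinate is in range; the assert only documents that contract.
void image_store_fast(Image2D &img, int x, int y, float value)
{
  assert(x >= 0 && y >= 0 && x < img.width && y < img.height);
  img.texels[std::size_t(y) * std::size_t(img.width) + std::size_t(x)] = value;
}
```

The fast variant fits cases such as a full-screen pass whose dispatch size matches the image exactly, where the check can never fail.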
Campbell Barton
3082037743 Cleanup: spelling in comments 2023-09-03 16:15:01 +10:00
Clément Foucault
05816e6f7a Fix GPU: MSL sources not considered by make format 2023-09-01 10:40:21 +02:00
Campbell Barton
3d607be572 Cleanup: spelling in comments 2023-08-30 10:57:12 +10:00
Campbell Barton
eec449ffe8 Cleanup: correct spelling, comments
Hyphenate words in GLSL code-comments.
2023-08-29 15:55:09 +10:00
Campbell Barton
3de8900ed6 Cleanup: spelling in comments 2023-08-25 09:40:42 +10:00
Campbell Barton
04bf0f3eb6 License headers: add SPDX copyright entries for '*.msl' files 2023-08-19 17:41:14 +10:00
Clément Foucault
c7dce76619 Metal: Various fixes
Authored by Apple: Michael Parkin-White
2023-08-13 23:42:06 +02:00
Jason Fielder
72987941e7 Metal: EEVEE-Next: Fix light and shadow OOB issues
Addressing a number of small issues and OOB reads/writes occurring in
EEVEE Next shadows + lighting passes. Improving correctness for unit
tests. Shadows are not yet working overall, but this unblocks progress
towards unit tests.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/108768
2023-06-08 16:47:45 +02:00
Clément Foucault
0e7b81dd32 Metal: Fix MSL compilation warning 2023-05-25 09:24:53 +02:00
Clément Foucault
c796cbebef Metal: Add atomicExchange and mat3x4 support 2023-05-05 16:18:24 +02:00
Jason Fielder
88ace032a6 Metal: Storage buffer and explicit bind location support
Adds support for Storage buffers, including changes to the resource
binding model to ensure explicit resource bind locations are supported
for all resource types.

Storage buffer support also includes required changes for shader
source generation and SSBO wrapper support for other resource
types such as GPUVertBuf, GPUIndexBuf and GPUUniformBuf.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/107175
2023-05-03 11:46:30 +02:00
Jason Fielder
fdf920bf5d Metal: Add textureGrad support
Fixes compilation errors in viewport compositor.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/106805
2023-04-21 07:45:30 +02:00
Jason Fielder
dda4c0721c EEVEE-Next: Resolve compilation errors in Metal
Shader source requires explicit conversions and shader address
space qualifiers in certain places in order to compile for Metal.

We also require constructors for a number of default struct types.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/106219
2023-04-20 08:03:31 +02:00
Jason Fielder
d3409f2159 Fix: Uncached Metal Materials not Being Released
Optimized node graphs do not get cached and were
not correctly freed once their reference count reached
zero, due to being excluded from the GPUPass garbage
collection.

Also suppress Metal shader warnings, which are prevalent
during material optimization.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/105795
2023-03-16 08:19:32 +01:00
Jason Fielder
f3bd5458a3 Metal: Optimise shader texture cache usage and branch reduction via point sampling.
Replace texelFetch calls with a texture point-sample rather than a textureRead call. This increases texture cache utilisation when mixing between sampled calls and reads. Bounds checking can also be removed from these functions, reducing instruction count and branch divergence, as the sampler routine handles range clamping.

Authored by Apple: Michael Parkin-White
Ref T96261

Depends on D16923

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17021
2023-01-31 10:56:25 +01:00
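The difference between the two fetch styles in f3bd5458a3 can be modeled in software. This is only an analogy: on the GPU the clamping is done by the sampler hardware rather than by shader instructions. The `Texture2D` type and function names are hypothetical, and returning 0 for an out-of-range read is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Texture2D {
  int width;
  int height;
  std::vector<float> texels;
};

// Read-style fetch: the shader itself must branch on the coordinate range.
float texel_read_checked(const Texture2D &tex, int x, int y)
{
  if (x < 0 || y < 0 || x >= tex.width || y >= tex.height) {
    return 0.0f;
  }
  return tex.texels[std::size_t(y) * std::size_t(tex.width) + std::size_t(x)];
}

// Point-sample-style fetch: clamp-to-edge addressing replaces the branch.
float texel_point_sample(const Texture2D &tex, int x, int y)
{
  x = std::clamp(x, 0, tex.width - 1);
  y = std::clamp(y, 0, tex.height - 1);
  return tex.texels[std::size_t(y) * std::size_t(tex.width) + std::size_t(x)];
}
```

Note that the two variants differ for out-of-range coordinates: the clamped version returns the nearest edge texel instead of a default value.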
Jeroen Bakker
6b8fa899ca Metal: Fix compilation of GLSL used in test cases.
Added imageStore for 1d textures.
2023-01-31 08:42:33 +01:00
Jason Fielder
57552f52b2 Metal: Realtime compositor enablement with addition of GPU Compute.
This patch adds support for compilation and execution of GLSL compute shaders. This, along with a few systematic changes and fixes, enables realtime compositor functionality with the Metal backend on macOS. A number of GLSL source modifications have been made to add the required level of type explicitness, allowing all compilations to succeed.

GLSL Compute shader compilation follows a similar path to Vertex/Fragment translation, with added support for shader atomics, shared memory blocks and barriers.

Texture flags have also been updated to ensure correct read/write specification for textures used within the compositor pipeline. GPU command submission changes have also been made in the high level path, when Metal is used, to address command buffer time-outs caused by certain expensive compute shaders.

Authored by Apple: Michael Parkin-White

Ref T96261
Ref T99210

Reviewed By: fclem

Maniphest Tasks: T99210, T96261

Differential Revision: https://developer.blender.org/D16990
2023-01-30 11:06:56 +01:00
Jason Fielder
0ba5954bb2 Fix T103635: Fix failing EEVEE and OCIO shader compilations in Metal.
Affecting render output preview when tone mapping is used, and EEVEE scenes such as Mr Elephant rendering in pink due to missing shaders.

Authored by Apple: Michael Parkin-White

Ref T103635
Ref T96261

Reviewed By: fclem

Maniphest Tasks: T103635, T96261

Differential Revision: https://developer.blender.org/D16923
2023-01-23 17:40:10 +01:00
Jeroen Bakker
d7598c8081 Metal: Fix crash when compiling compositor shaders.
Although the viewport compositor isn't supported yet on Apple devices, the
shaderlibs are compiled. The compositor shaders use a mat3 constructor
with a single vec3 and 6 floats. This constructor wasn't defined in
Metal, so the compilation failed.

This patch adds the overload to mat3.
2023-01-05 08:30:41 +01:00
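The missing constructor from d7598c8081 can be pictured with a small C++ sketch. The `vec3` and `mat3` structs below only mimic the GLSL types and are not the definitions used by the Metal backend; the column-by-column fill order follows GLSL's column-major matrix convention.

```cpp
struct vec3 {
  float x, y, z;
};

// Column-major 3x3 matrix, mirroring GLSL's mat3.
struct mat3 {
  vec3 columns[3];

  // Overload taking the first column as a vec3 and the remaining two
  // columns as six scalars.
  mat3(vec3 c0, float m10, float m11, float m12, float m20, float m21, float m22)
      : columns{c0, {m10, m11, m12}, {m20, m21, m22}}
  {
  }
};
```

GLSL accepts any mix of vectors and scalars that adds up to nine components, whereas a C++-like language such as MSL needs each combination declared explicitly.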