This was caused by the removal of some compatibility code
in the metal backend.
This patch reintroduce this compatibility code in a cleaner
way than before.
This should be followed up by a documentation commit that
explains what is supported and what is not.
Pull Request: https://projects.blender.org/blender/blender/pulls/139185
This commit finishes removing the uses of the integer to float
vertex buffer fetch mode. Previous commits noted below already started
that process. The last usage was geometry attributes. Now integers are
converted to floats as part of the existing upload process.
The change makes the Vulkan vertex buffer type conversion unused, so
it's removed. That's nice because Vulkan vertex buffers go from 1040 to
568 bytes in size and have significantly less overhead on creation.
Related:
- 153abc372e
- 1e1ac2bb9b
- 617858e453
Pull Request: https://projects.blender.org/blender/blender/pulls/138873
Allows basic usage of templated functions.
There is no support for templated struct.
Benefit:
- More readable than macros in shader sources.
- Compatible with C++ tools.
- More sharing possible with host C++ code.
Requirements/Limitations:
- No default arguments to template parameters.
- Must use explicit instantiation for all variant needed.
- Explicit instantiation needs to **not** use argument deduction.
- Calls to template needs to have all template argument explicit
or all implicit.
- Template overload is not supported (redefining the same template
with different template argument or function argument types).
Currently implemented as Macros inside the build-time pre-pocessor,
but that could change to copy-paste to allow better error reporting.
However, the Macros keep the shader code reduced in the final binary
and allow different file to declare different instantiation.
The implementation is done by declaring overloads for each explicit
instantiation.
If a template has arguments not present in function
arguments, then all arguments **values** are appended to the
function name. The explicit template callsite is then modified to use
`TEMPLATE_GLUE` which will call the correct function. This is
why template argument deduction is not supported in this case.
Rel #137446
Pull Request: https://projects.blender.org/blender/blender/pulls/137441
This avoid having to guards functions that are
only available in fragment shader stage.
Calling the function inside another stage is still
invalid and will yield a compile error on Metal.
The vulkan and opengl glsl patch need to be modified
per stage to allow the fragment specific function
to be defined.
This is not yet widely used, but a good example is
the change in `film_display_depth_amend`.
Rel #137261
Pull Request: https://projects.blender.org/blender/blender/pulls/138280
They are actually already some literals with the `f` suffix
that are in our shader codebase and we never had problem in
the past 5 years (or even 8 years).
So I think it is safe to do and improves convergence of codestyles.
Pull Request: https://projects.blender.org/blender/blender/pulls/137352
The refactor 9c0321ae9b
had the wrong mental model of the backing texture
layout for the atomic workaround.
For 3D textures, the layout is breaking the 3D texture
and reinterpreting the linear location as its 2D
linear location. This breaks the 3D texture Z slices
into non contiguous regions in 2D.
Comments have been added to avoid future confusion.
Pull Request: https://projects.blender.org/blender/blender/pulls/133830
The goal is to reduce the startup time cost of
all of these parsing and string replacement.
All comments are now stripped at compile time.
This comment check added noticeable slowdown at
startup in debug builds and during preprocessing.
Put all metadatas between start and end token.
Use very simple parsing using `StringRef` and
hash all identifiers.
Move all the complexity to the preprocessor that
massagess the metadata into a well expected input
to the runtime parser.
All identifiers are compile time hashed so that no string
comparison is made at runtime.
Speed up the source loading:
- from 10ms to 1.6ms (6.25x speedup) in release
- from 194ms to 6ms (32.3x speedup) in debug
Follow up #129009
Pull Request: https://projects.blender.org/blender/blender/pulls/128927
Move most of the string preprocessing used for MSL
compatibility to `glsl_preprocess`.
Enforce some changes like matrix constructor and
array constructor to the GLSL codebase. This is
for C++ compatibility.
Additionally reduce the amount of code duplication
inside the compatibility code.
Pull Request: https://projects.blender.org/blender/blender/pulls/128634
Takes into account any offset that must be added to the vertex index
(usually supplied as baseVertex or startVertex in the Metal draw call)
in the code that emulates the SSBO vertex fetch.
Authored by Apple: James McCarthy
Pull Request: https://projects.blender.org/blender/blender/pulls/127864
VSE timeline strips now have rounded corners. Strip corner rounding radius is
4, 6 or 8px depending on strip height (if strip is too narrow to fit
rounding, then rounding is turned off).
This is achieved with a dedicated GPU shader for drawing most of VSE
strip widget, that it could do proper rounded corner masking.
More details and images in the PR.
Pull Request: https://projects.blender.org/blender/blender/pulls/122576
Add fast image writing and reading variants for render passes.
These variants do not perform range checking on values
and should only be used in cases where the written texel is
guaranteed to be in range. This eliminates additional
branching and simplifies shader logic.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/121116
This adds a new `Transform` type similar to cycles that reduces
the amount of data passed for a typical affine 3D transform.
This then applies this type to the light data and cleanup
all usage of the former `object_mat`. This also changes the axes
macros into utility accessor functions.
Pull Request: https://projects.blender.org/blender/blender/pulls/121089
This limits the number of tilemaps per LOD that can be fed to avoid the
easy to hit "Too many shadow updates" (#119757).
This allows for a max 64 tilemaps to be updated at once at their lowest
requested LOD (so ~10.6667 point lights if every faces of the punctual
shadow map is needed, but likely more in practice).
Unfortunately this is still quite low and will surely be hit quite soon
with directional shadow added to it. One idea to workaround this would
be to time slice the update of some lights, but this opens a whole can
of worms that I'm not ready to open for now so I created #119890 for
future reference.
Some notes, most lights seems to request around 3 LODs. It might help
to allow requesting at least 2 LODs if we are rendering since volumes
might want lower LOD available for volumes.
I added a very simplistic heuristic that also lowers the max tilemaps
when transforming, animation playback or navigating the 3D view to
improve the responsiveness of the engine. Note that this doesn't
only lowers the resolution to the minimum requested one. So it should
be good enough in most cases.
Pull Request: https://projects.blender.org/blender/blender/pulls/119889
Cryptomatte passes would generate a feathered outline
in Metal due to missing texture fence in chained
read->modify->write->read->... patterns.
Added imageFence function to explicitly state that
imageStore's should be visible to future imageLoad's.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/119163
This define all aliases for supported types,
document which one to use in C++ shared code,
move relevant defines to their backend file.
Rename `bool1` to `bool32_t` and cleanup
its usage as mentioned in #118961.
Rel. #118961
Pull Request: https://projects.blender.org/blender/blender/pulls/119098
This patch adds an alternative path for devices/OSs
which do not support native texture atomics in Metal.
Support is encapsulated within the backend, ensuring
any allocated texture with the USAGE_ATOMIC flag is
allocated with a backing buffer, upon which atomic
operations happen.
The shader generation is also changed for the atomic
case, which instructs the backend to insert additional
buffer bind-points for the buffer resource. As Metal
also only supports buffer-backed textures for
textureBuffers or 2D textures, TextureArrays and
3D textures are emulated within a 2D texture, with
sample locations being indirected.
All usage of atomic textures MUST now utilise the
correct atomic texture types in the high level shader
and GPUShaderCreateInfo declarations.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/115956
This adds a new way of computing occlusion using visibility bitmask. To
make it more algorithm agnostic, we name it horizon scan.
This cleans-up / simplify the code compared to the Horizon based solution.
There is no more trickery for fading influence of distant samples which
makes the result match cycles closer.
This introduces a new thickness option. Maintaining it relatively low
makes it possible to avoid over occlusion because of in front geometry.
Making it too low will cause under occlusion.
Related #112979
Pull Request: https://projects.blender.org/blender/blender/pulls/114150
This adds correct object bounds estimation.
This works by creating an occupancy texture where one
bit represents one froxel. A geometry pre-pass fill this
occupancy texture and doesn't do any shading. Each bit
set to 0 will not be considered occupied by the object
volume and will discard the material compute shader for
this froxel.
There is 2 method of computing the occupancy map:
- Atomic XOR: For each fragment we compute the amount of
froxels **center** in-front of it. We then convert that
into occupancy bitmask that we apply to the occupancy
texture using `imageAtomicXor`. This is straight forward
and works well for any manifold geometry.
- Hit List: For each fragment we write the fragment depth
in a list (contained in one array texture). This list
is then processed by a fullscreen pass (see
`eevee_occupancy_convert_frag.glsl`) that sorts and
converts all the hits to the occupancy bits. This
emulate Cycles behavior by considering only back-face
hits as exit events and front-face hits as entry events.
The result stores it to the occupancy texture using
bit-wise `OR` operation to compose it with other non-hit
list objects. This also decouple the hit-list evaluation
complexity from the material evaluation shader.
## Limitations
### Fast
- Non-manifolds geometry objects are rendered incorrectly.
- Non-manifolds geometry objects will affect other objects
in front of them.
### Accurate
- Limited to 16 hits per layer for now.
- Non-manifolds geometry objects will affect other objects
in front of them.
Pull Request: https://projects.blender.org/blender/blender/pulls/113731
Texture Atomics have been added in Metal 3.1
and enable the original implementations of
shadow update and irradiance cache baking.
However, a fallback solution will be
required for versions under macOS 14.0 utilising
buffer-backed textures instead.
This patch also includes a stub implementation if
building/running on older macOS versions which
provides locally-synchronized texture access in
place of atomics. This enables some effects to be
partially tested, and ensures non-guarded use
of imageAtomic functions does not result
in compilation failure.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/112866
imageStoreFast provides a variant of imageStore which does
not perform any bounds checking, reducing shader divergence,
register pressure and increasing performance through fewer
instructions.
However, this should only be used for cases where the writing
coordinate is guaranteed to fall within the texture.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/111750
Addressing a number of small issues and OOB reads/writes occuring in
EEVEE Next shadows + lighting passes. Improving correctness for unit
tests. Shadows are not yet working overall, but this unblocks progress
towards unit tests.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/108768
Adds support for Storage buffers, including changes to the resource
binding model to ensure explicit resource bind locations are supported
for all resource types.
Storage buffer support also includes required changes for shader
source generation and SSBO wrapper support for other resource
types such as GPUVertBuf, GPUIndexBuf and GPUUniformBuf.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/107175