With the shift to GPU-driven rendering pipeline,
the SSBO vertex fetch paradigm used to
implement workbench shadows on Metal
instead of utilising the geometry shader
path no longer worked correctly.
This is because the draw submission
required vertex amplification up-front,
based on the expected output geometry
amount for a given input geometry.
This patch aims to resolve this
issue through addition of API to
enable the features within the
GPU driven pipeline.
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/113498
Changes to ensure all supported texture tests are passing with the
Metal backend and add additional tests to cover texture_3d and
texture 1d test cases.
Authored by Apple: Michael Parkin-White
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/113889
With the shift to GPU-driven rendering pipeline,
the SSBO vertex fetch paradigm used to
implement workbench shadows on Metal
instead of utilising the geometry shader
path no longer worked correctly.
This is because the draw submission
required vertex amplification up-front,
based on the expected output geometry
amount for a given input geometry.
This WIP patch aims to resolve this
issue through addition of API to
enable the features within the
GPU driven pipeline.
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/113498
This patch improves the heuristic used to
determine maximum frames-in-flight at
different input latency levels.
Previous values adjusted to better encapsulate
workloads in the 10-20 FPS range which can
manifest excessive latency. Changes improve
total performance throughput and improve
the interactive user experience.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/113020
Optimization of EEVEE Next's Virtual Shadow Maps for TBDRs.
The core of these optimizations lie in eliminating use of
atomic shadow atlas writes and instead utilise tile memory to
perform depth accumulation as a secondary pass once all
geometry updates for a given shadow view have been updated.
This also allows use of fast on-tile depth testing/sorting, reducing
overdraw and redundant fragment operations, while also allowing
for tile indirection calculations to be offloaded into the vertex
shader to increase fragment storage efficiency and throughput.
Authored by Apple: Michael Parkin-White
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Co-authored-by: Clément Foucault <foucault.clem@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/111283
This adds basic emulation of the subpass input feature
of vulkan and to a lower extend Raster Order Group on Metal.
This help test paths that might use this feature in the future
(like shadow rendering) on all platform and or simplify higher
level code for supporting older hardware.
This add clear description to the load/store ops and to the
new `GPUAttachementState`.
The OpenGL backend will correctly mask un-writable
attachments and will bind as texture readable attachments.
Even if possible by the vulkan standard, the GPU API prohibit
the read and write to the same attachment inside the same
subpass.
In the GL backend, this is implemented using `glTextureBarrier`
and `texelFetch` as it is described in the ARB_texture_barrier
extension.
https://registry.khronos.org/OpenGL/extensions/ARB/ARB_texture_barrier.txt
Pull Request: https://projects.blender.org/blender/blender/pulls/112051
Flickering caused by in-flight SSBO data
overwrites has been resolved by ensuring
data updates go into a new buffer while
existing data is in flight.
GPU_finish has also been removed from
SSBO read due to its frequent mid-frame use
limiting performance.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/113019
Texture Atomics have been added in Metal 3.1
and enable the original implementations of
shadow update and irradiance cache baking.
However, a fallback solution will be
required for versions under macOS 14.0 utilising
buffer-backed textures instead.
This patch also includes a stub implementation if
building/running on older macOS versions which
provides locally-synchronized texture access in
place of atomics. This enables some effects to be
partially tested, and ensures non-guarded use
of imageAtomic functions does not result
in compilation failure.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/112866
Blender 4.0 requires OpenGL 4.3 which always support SSBO's.
Platforms that don't support enough SSBO bind points will be marked
as unsupported.
Users who start Blender on those platforms will be informed via a
dialog. This PR also updates the `--debug-gpu-force-workarounds`
to match our minimum requirements. Note that some bugs are still
there that should be solved in other PRs:
* Workbench only renders the object using a unit matrix this is because
there is a bug in the workaround for shader_draw_parameters
* Navigating with middle mouse button is not working. Unsure what the
cause is, but might be a missing feature check in the OpenGL backend.
Related to #112224
Pull Request: https://projects.blender.org/blender/blender/pulls/112572
The NDEBUG is a toggle define, and in debug builds it is not defined.
This change solves the warning
mtl_context.mm:2282:5: warning: 'NDEBUG' is not defined, evaluates to 0 [-Wundef]
Pull Request: https://projects.blender.org/blender/blender/pulls/112504
Enables performance optimizations and new rendering
features through allowing colour attachments to be
tagged with a rasterization order group. This means that
all accesses to the same pixel location are guaranteed
to happen in submission order.
This lays the ground work for tile-based architecture
optimizations, enabling additional features for
deferred rendering which allow subsequent passes
which operate on pixel data to occur in order, while
memory remains on-tile.
This patch allows fragment outputs to be tagged
with a raster order group and also adds a new
FragmentTileIn parameter to a shader, which allows
speciication of incoming parameters from a previous
draw.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/111748
Enhance custom framebuffer binding state to allow
specification of clear color as part of the loadstore
state at framebuffer bind time.
This ensures all parameters controlling attachment
loading and storage behaviour can be explicitly
specified when binding a framebuffer.
This change enables optimizations which leverage
explicit framebuffer load store state to also specify
a clear color without prematurely triggering a
clear which may occur independently to
render work when using GPU_framebuffer_clear(..).
Authored by Apple: Michael Parkin-White.
Pull Request: https://projects.blender.org/blender/blender/pulls/111810
Removes old unused bind state parameters from
prior to the addition of explicit resource location support.
Patch also ensures compute state is reset between each
encoder and adds a debug option to dispatch each
compute command within its own encoder for debugging
synchronization issues.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/111753
Memoryless textures are only used as intermediate attachments
during rasterization, but do not have any backing storage. This is
particularly useful if a virutal framebuffer is needed, or, there is
a situation where a depth buffer is only needed within the pass
itself and the results are discarded once the pass completes.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/111749
The hash tables and vector blenlib headers were pulling many more
headers than they actually need, including the C base math header,
our C string API header, and the StringRef header. All of this
potentially slows down compilation and polutes autocomplete
with unrelated information.
Also remove the `ListBase` constructor for `Vector`. It wasn't used
much, and making it easy to use `ListBase` isn't worth it for the
same reasons mentioned above.
It turns out a lot of files depended on indirect includes of
`BLI_string.h` and `BLI_listbase.h`, so those are fixed here.
Pull Request: https://projects.blender.org/blender/blender/pulls/111801
When GLSL sources were first included in Blender they were treated as
data (like blend files) and had no license header.
Since then GLSL has been used for more sophisticated features
(EEVEE & real-time compositing)
where it makes sense to include licensing information.
Add SPDX copyright headers to *.glsl files, matching headers used for
C/C++, also include GLSL files in the license checking script.
As leading C-comments are now stripped,
added binary size of comments is no longer a concern.
Ref !111247
Listing the "Blender Foundation" as copyright holder implied the Blender
Foundation holds copyright to files which may include work from many
developers.
While keeping copyright on headers makes sense for isolated libraries,
Blender's own code may be refactored or moved between files in a way
that makes the per file copyright holders less meaningful.
Copyright references to the "Blender Foundation" have been replaced with
"Blender Authors", with the exception of `./extern/` since these this
contains libraries which are more isolated, any changed to license
headers there can be handled on a case-by-case basis.
Some directories in `./intern/` have also been excluded:
- `./intern/cycles/` it's own `AUTHORS` file is planned.
- `./intern/opensubdiv/`.
An "AUTHORS" file has been added, using the chromium projects authors
file as a template.
Design task: #110784
Ref !110783.
Previously the shader associated with a batch was
assigned separately. This meant on occasion, the
incorrect shader was used with the Metal backend.
Updating the shader fetch to use the context bound
shader instead of the batch shader member as per
OpenGL/Vulkan backends.
Also include state change check to be consistent
with OpenGL.
Authored by Apple: Michael Parkin-White
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/111112
Add a High Dynamic Range option in the Color Management > Display panel.
This enables display of extended color ranges above 1.0 for the 3D
viewport, image editor and render previews.
This requires a monitor that can display HDR colors, and a view
transform designed for HDR output. The Standard view transform works,
but Filmic does not as it was designed to bring values into the 0..1
range for SDR displays.
This patch is limited to allowing the display to visualize extended
colors, but does not include future looking work to better integrate HDR
into the full workflow.
It is implemented by rendering to high bit-depth texture formats for
the user interface, and uncapping the color range in color management.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/105662
This add the possibility to define different
viewports inside a single framebuffer and
let the vertex shader decide which viewport
to render to.
This only contain the GL and VK implementation.
The Vulkan implementation works but still
has a validation error related to shader features
and extension. The test passes nonetheless.
Pull Request: https://projects.blender.org/blender/blender/pulls/110923
Adds support for generating curve primtiives avoiding the
use of primtiive restarts. This maixmises geometry performance
when using Metal.
Also ensure that the existing index buffer optimization path is
skipped for indirect draw calls where counts are not known at
submission time.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/109972
PixelBuffer required addition of buffer re-allocation during mapping
to avoid host updates over-writing previous in-flight data as required
by the GPU. Changes allow multiple copies of existing pixel data to
be in-flight simultaneously.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/110305
Certain feature requirements unsupported by older OS builds
caused failures when running Intel GPUs on older OS's.
This patch increases the minimum required OS version
to one which covers devices supporting all required features.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/109921