While investigating Blender compilation time for windows-arm64, we
identified two compilation units that were taking a long time to compile
(~1h each). This affects windows-x64 builds as well.
Pull Request: https://projects.blender.org/blender/blender/pulls/117534
Allows specification of per-shader threadgroup memory tuning
to optimise performance through increase of GPU occupancy.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/115238
Adds API to allow usage of specialization constants in shaders.
Specialization constants are dynamic runtime constants which can
be compiled into a shader pipeline state object (PSO) to improve
runtime performance by reducing shader complexity through
shader compiler constant-folding.
This API allows specialization constant values to be specified
along with a default value if no constant value has been declared.
Each GPU backend is then responsible for caching PSO permutations
against the current specialization configuration.
This patch adds support for specialization constants in the
Metal backend and provides a generalised high-level solution
which can be adopted by other graphics APIs supporting
this feature.
Authored by Apple: Michael Parkin-White
Authored by Blender: Clément Foucault (files in gpu/test folder)
Pull Request: https://projects.blender.org/blender/blender/pulls/115193
This patch adds an alternative path for devices/OSs
which do not support native texture atomics in Metal.
Support is encapsulated within the backend, ensuring
any allocated texture with the USAGE_ATOMIC flag is
allocated with a backing buffer, upon which atomic
operations happen.
The shader generation is also changed for the atomic
case, which instructs the backend to insert additional
buffer bind-points for the buffer resource. As Metal
also only supports buffer-backed textures for
textureBuffers or 2D textures, TextureArrays and
3D textures are emulated within a 2D texture, with
sample locations being indirected.
All usage of atomic textures MUST now utilise the
correct atomic texture types in the high level shader
and GPUShaderCreateInfo declarations.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/115956
This PR introduced some filters to improve the workflow when using
shader_builder. Shader builder is used to validate shader compilation
during buildtime and can be enabled using `WITH_GPU_BUILDTIME_SHADER_BUILDER`.
During backend development shader builder is also handy as you can
pin-point it to the shader/backend you're focusing on. Without filters
you would insert temporary code to break on a specific shader.
* `--gpu-backend` can be used to only check a specific backend.
possible values are `vulkan`, `metal` or `opengl`. When argument
isn't passed, all backends will be validated.
* `--gpu-shader-filter` can be used to only check a subset or indivisual
shader. The filter is a name starts with filter. Use
`--gpu-shader-filter eevee` to validate all eevee shaders
Pull Request: https://projects.blender.org/blender/blender/pulls/115888
NDEBUG is part of the C standard and disables asserts. Only this will
now be used to decide if asserts are enabled.
DEBUG was a Blender specific define, that has now been removed.
_DEBUG is a Visual Studio define for builds in Debug configuration.
Blender defines this for all platforms. This is still used in a few
places in the draw code, and in external libraries Bullet and Mantaflow.
Pull Request: https://projects.blender.org/blender/blender/pulls/115774
Shaders that require transform feedback should not be validated on
backends that don't support transform feedback.
In Vulkan transform feedback is implemented as an extension and
supported by half of the platforms. It isn't decided yet if we want to
support transform feedback as it is currently used as a fallback for
hair compute shader. In vulkan compute is available on all platforms.
During validation the shader printed a not implemented message.
This change hides that message.
Pull Request: https://projects.blender.org/blender/blender/pulls/113655
`eevee_shadow_page_tile_store` shader uses `VIEWPORT_INDEX` and `LAYER`.
Both use an optional extension in OpenGL and Vulkan. When the extension
isn't available a geometry shader is injected to emulate the
extension. The generated geometry shader requires `instance_name` to be
set. This wasn't the case for `eevee_shadow_page_tile_store` shader.
This PR also adds a detection for incompatible shader infos.
- Shaders that use `VIEWPORT_INDEX` or `LAYER` cannot have a geometry stage.
This check is done in debug and release builds.
- Shaders that use a fallback shader should have instance names set in
the stage interfaces. This check is only done in debug builds.
Pull Request: https://projects.blender.org/blender/blender/pulls/113649
Blender 4.0 requires OpenGL 4.3 which always support SSBO's.
Platforms that don't support enough SSBO bind points will be marked
as unsupported.
Users who start Blender on those platforms will be informed via a
dialog. This PR also updates the `--debug-gpu-force-workarounds`
to match our minimum requirements. Note that some bugs are still
there that should be solved in other PRs:
* Workbench only renders the object using a unit matrix this is because
there is a bug in the workaround for shader_draw_parameters
* Navigating with middle mouse button is not working. Unsure what the
cause is, but might be a missing feature check in the OpenGL backend.
Related to #112224
Pull Request: https://projects.blender.org/blender/blender/pulls/112572
5cf7089e43 added the `BuiltinBits::LAYER` to shaders with a geometry
stage. This causes compilation errors when
`GLContext::layered_rendering_support` is false (otherwise the flag
does nothing).
This PR moves the `LAYER` flags to the `no_geom` shader versions and
adds a check to `ShaderCreateInfo::finalize()` to ensure the `LAYER` flag
is not used in shaders with a geometry stage.
Pull Request: https://projects.blender.org/blender/blender/pulls/112245
Enables performance optimizations and new rendering
features through allowing colour attachments to be
tagged with a rasterization order group. This means that
all accesses to the same pixel location are guaranteed
to happen in submission order.
This lays the ground work for tile-based architecture
optimizations, enabling additional features for
deferred rendering which allow subsequent passes
which operate on pixel data to occur in order, while
memory remains on-tile.
This patch allows fragment outputs to be tagged
with a raster order group and also adds a new
FragmentTileIn parameter to a shader, which allows
speciication of incoming parameters from a previous
draw.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/111748
Some shaders using gl_Layer/gpu_Layer were missing
the correct usage bit in the shader create info. Shaders
also need to inherit feature usage bits from included
additional_info.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/111751
Now that all shaders have been converted to be Vulkan Compatible it is
safe to test for incompatibility when running in OpenGL or MTL.
This detection is only done when blender is compiled in debug mode.
shaderc generates an error when a struct containing an int type
isn't qualified as flat. We work around this issue by changing the
interpolation mode to flat during code generation.
Pull Request: https://projects.blender.org/blender/blender/pulls/111211
A difference was detected between stage interfaces between OpenGL and Vulkan
that are not compatible with our current API.
**OpenGL**
In OpenGL an stage interface struct can have different interpolation qualifiers
per attribute.
```glsl
struct MyStageInterface {
smooth vec4 color;
flat int face_flag;
};
layout(..) MyStageInterface interp;
```
**Vulkan**
In vulkan the interpolation qualifier isn't supported on attribute
level and needs to be added to the struct.
```glsl
struct MyStageInterface {
vec4 color;
};
struct MyStageInterface_flat {
int face_flag;
};
layout(..) smooth MyStageInterface interp;
layout(..) flat MyStageInterface_flat interp_flat;
```
This patch reports shaders that are incompatible with Vulkan so they can be
patched. Report is only done in debug mode and when using the vulkan backend.
After all shaders are patched an error will be raised so developers will
known immediately when incompatibility are created.
Making the shaders compatible and adding the error will be done in future
patches.
**Python**
Via Python gpu module (gpu.types.GPUShaderCreateInfo) it isn't possible
to construct an incompatible shader as instance names cannot be set
via the API. So this isn't a breaking change.
Pull Request: https://projects.blender.org/blender/blender/pulls/111138
Listing the "Blender Foundation" as copyright holder implied the Blender
Foundation holds copyright to files which may include work from many
developers.
While keeping copyright on headers makes sense for isolated libraries,
Blender's own code may be refactored or moved between files in a way
that makes the per file copyright holders less meaningful.
Copyright references to the "Blender Foundation" have been replaced with
"Blender Authors", with the exception of `./extern/` since these this
contains libraries which are more isolated, any changed to license
headers there can be handled on a case-by-case basis.
Some directories in `./intern/` have also been excluded:
- `./intern/cycles/` it's own `AUTHORS` file is planned.
- `./intern/opensubdiv/`.
An "AUTHORS" file has been added, using the chromium projects authors
file as a template.
Design task: #110784
Ref !110783.
A lot of files were missing copyright field in the header and
the Blender Foundation contributed to them in a sense of bug
fixing and general maintenance.
This change makes it explicit that those files are at least
partially copyrighted by the Blender Foundation.
Note that this does not make it so the Blender Foundation is
the only holder of the copyright in those files, and developers
who do not have a signed contract with the foundation still
hold the copyright as well.
Another aspect of this change is using SPDX format for the
header. We already used it for the license specification,
and now we state it for the copyright as well, following the
FAQ:
https://reuse.software/faq/
The goal is to solve confusion of the "All rights reserved" for licensing
code under an open-source license.
The phrase "All rights reserved" comes from a historical convention that
required this phrase for the copyright protection to apply. This convention
is no longer relevant.
However, even though the phrase has no meaning in establishing the copyright
it has not lost meaning in terms of licensing.
This change makes it so code under the Blender Foundation copyright does
not use "all rights reserved". This is also how the GPL license itself
states how to apply it to the source code:
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software ...
This change does not change copyright notice in cases when the copyright
is dual (BF and an author), or just an author of the code. It also does
mot change copyright which is inherited from NaN Holding BV as it needs
some further investigation about what is the proper way to handle it.
This pull request adds a new tipe of resource handles (thin handles).
These are intended for cases where a resource buffer with more than one
entry for each object is needed (for example, one entry per material
slot).
While it's already possible to have multiple regular handles for the
same object, they have a non-trivial overhead in terms of uploaded
data (matrix, bounds, object info) and computation (visibility
culling).
Thin handles store an indirection buffer pointing to their "parent"
regular handle, therefore multiple thin handles can share the same
per-object data and visibility culling computation.
Thin handles can only be used in their own Pass type (PassMainThin),
so passes that don't need them don't have to pay the overhead.
This pull request also includes the update of the Workbench Next
pre-pass to use PassMainThin, which is the main reason for the
implementation of this feature.
The main change from the previous PR is that the thin handles are now
stored directly in the main resource_id_buf, to avoid wasting an extra
bind slot.
Pull Request #105261
GPencil 3D stroke rendering uses a geometry shader.
This is unsupported by the Metal backend, so implement
fix for this failing compilation by shifting geometry shader
logic into the Vertex shader for Metal backend.
Authored by Apple: Michael Parkin-White
Ref #96261
Pull Request #105143
This reverts commit 19222627c6.
Something went wrong here, seems like this commit merged the main branch
into the release branch, which should never be done.
This patch adds support for compilation and execution of GLSL compute shaders. This, along with a few systematic changes and fixes, enable realtime compositor functionality with the Metal backend on macOS. A number of GLSL source modifications have been made to add the required level of type explicitness, allowing all compilations to succeed.
GLSL Compute shader compilation follows a similar path to Vertex/Fragment translation, with added support for shader atomics, shared memory blocks and barriers.
Texture flags have also been updated to ensure correct read/write specification for textures used within the compositor pipeline. GPU command submission changes have also been made in the high level path, when Metal is used, to address command buffer time-outs caused by certain expensive compute shaders.
Authored by Apple: Michael Parkin-White
Ref T96261
Ref T99210
Reviewed By: fclem
Maniphest Tasks: T99210, T96261
Differential Revision: https://developer.blender.org/D16990
- Support for non-contiguous shader resource bindings for all cases required by create-info
- Implement missing geometry shader alternative path for edit curve handle.
- Add support for non-float dummy textures to address all cases where default bindings may be required.
Authored by Apple: Michael Parkin-White
Ref T96261
Depends on D16721
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D16777
Implementing non-geometry-shader path for rendering stencil shadows,
used by the workbench engine.
Patch also contains a few small modifications to Create-info to ensure
usage of gl_FragDepth is explicitly specified.
This is required for testing of the patch.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D16436
Implemented geometry shader alternative for rendering of UV edges in Metal, as geometry shaders are unsupported.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D16452
Porting conservative depth rendering to use non-geometry shader path for
Metal.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D16424
Additional mat3 constructors added, global variable namespace collisions
for uniform and object color avoided via re-name.
Metal vertex format compatibility added for shaders wherein vertex data
goes through a double-conversion and cannot be implicitly converted during
Metal vertex assembly e.g. bitmasks passed directly as unsigned type in
shader interface for certain shader interfaces.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D16433
Required by Metal backend for efficient shader compilation. EEVEE material
resource binding permutations now controlled via CreateInfo and selected
based on material options. Other existing CreateInfo's also modified to
ensure explicitness for depth-writing mode. Other missing bindings also
addressed to ensure full compliance with the Metal backend.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D16243
This new inheritance behavior is more beneficial for the metal Backend.
Also change the default depth write behavior of shaders to be unchanged.
This makes fragment shader depth amendment more explicit.
This also add the missing depth_write for metal kernels.
This patch moves the GLSL shaders and their infos to the compositor
module as decided by the EEVEE & Viewport module. This is a non
functional change.
Differential Revision: https://developer.blender.org/D16360
Reviewed By: Clement Foucault
These implementations remove dependency on the Geometry pass by instead invoking one vertex shader instance for each expected output vertex, matching what a geometry shader would emit. Each vertex shader instance is then responsible for calculating the same output position based on its vertex_id as the logic would in the geometry shader version.
SSBO Vertex fetch enables full random-access into a vertex buffer by binding it as a read-only SSBO. This enables each instance to read neighbouring vertex data to perform contextual calculations as a geometry shader would, for cases where attribute Multiload is not supported.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D15901
This is a new implementation of the draw manager using modern
rendering practices and GPU driven culling.
This only ports features that are not considered deprecated or to be
removed.
The old DRW API is kept working along side this new one, and does not
interfeer with it. However this needed some more hacking inside the
draw_view_lib.glsl. At least the create info are well separated.
The reviewer might start by looking at `draw_pass_test.cc` to see the
API in usage.
Important files are `draw_pass.hh`, `draw_command.hh`,
`draw_command_shared.hh`.
In a nutshell (for a developper used to old DRW API):
- `DRWShadingGroups` are replaced by `Pass<T>::Sub`.
- Contrary to DRWShadingGroups, all commands recorded inside a pass or
sub-pass (even binds / push_constant / uniforms) will be executed in order.
- All memory is managed per object (except for Sub-Pass which are managed
by their parent pass) and not from draw manager pools. So passes "can"
potentially be recorded once and submitted multiple time (but this is
not really encouraged for now). The only implicit link is between resource
lifetime and `ResourceHandles`
- Sub passes can be any level deep.
- IMPORTANT: All state propagate from sub pass to subpass. There is no
state stack concept anymore. Ensure the correct render state is set before
drawing anything using `Pass::state_set()`.
- The drawcalls now needs a `ResourceHandle` instead of an `Object *`.
This is to remove any implicit dependency between `Pass` and `Manager`.
This was a huge problem in old implementation since the manager did not
know what to pull from the object. Now it is explicitly requested by the
engine.
- The pases need to be submitted to a `draw::Manager` instance which can
be retrieved using `DRW_manager_get()` (for now).
Internally:
- All object data are stored in contiguous storage buffers. Removing a lot
of complexity in the pass submission.
- Draw calls are sorted and visibility tested on GPU. Making more modern
culling and better instancing usage possible in the future.
- Unit Tests have been added for regression testing and avoid most API
breakage.
- `draw::View` now contains culling data for all objects in the scene
allowing caching for multiple views.
- Bounding box and sphere final setup is moved to GPU.
- Some global resources locations have been hardcoded to reduce complexity.
What is missing:
- ~~Workaround for lack of gl_BaseInstanceARB.~~ Done
- ~~Object Uniform Attributes.~~ Done (Not in this patch)
- Workaround for hardware supporting a maximum of 8 SSBO.
Reviewed By: jbakker
Differential Revision: https://developer.blender.org/D15817
This is a complete rewrite of the draw debug drawing module in C++.
It uses `GPUStorageBuf` to store the data to be drawn and use indirect
drawing. This makes it easier to do a mirror API for GPU shaders.
The C++ API class is exposed through `draw_debug.hh` and should be used
when possible in new code.
However, the debug drawing will not work for platform not yet supporting
`GPUStorageBuf`. Also keep in mind that this module must only be used
in debug build for performance and compatibility reasons.