This patch adds the maximum number of supported image units to the GPU
capabilities module. Currently, the GPU module assume a maximum of 8
units, so the patch is not currently particularly useful, but we can
consider committing it for the future anyways.
Pull Request: https://projects.blender.org/blender/blender/pulls/119057
Adds an option to set the capture title when using renderdoc
`GPU_debug_capture_begin` has an optional `title` parameter to set
the title of the renderdoc capture.
Pull Request: https://projects.blender.org/blender/blender/pulls/118649
Blender uses some vertex attributes that are not (and sometimes
never) supported by a GPU. OpenGL silently converted these changes
but for Metal/Vulkan we need to convert then when uploading the
data.
This PR will write to console invalid usages which we should remove
from Blender code-base. Note it is still possible to create attributes
that still need conversions by using the PyGPU API.
Span is preferrable since it's agnostic of the source container,
makes it clearer that there is no ownership, is 8 bytes smaller,
and can be passed by value.
`GLBatch::draw_indirect` has additional overhead compared to
`GLBatch::draw`, and can become a bottleneck in scenes that require
many draw calls (ie. with too many unique meshes).
The performance difference is almost exclusively caused by the
`GL_COMMAND_BARRIER_BIT` barrier that happens on every call.
This PR adds a `GPU_storagebuf_sync_as_indirect_buffer` function that
can be used to place the barrier only once after filling the indirect
buffer content.
This function is a no-op in Vulkan and Metal since they don't need the
barrier.
Pull Request: https://projects.blender.org/blender/blender/pulls/117561
Previously a storage buffer was used to store draw list commands as it
matches already existing APIs. Unfortunately StorageBuffers prefers to
be stored on the GPU device and would reduce the benefit of a dynamic
draw list.
This PR replaces the storage buffer with a regular buffer, which keeps
more control where to store the buffer.
Pull Request: https://projects.blender.org/blender/blender/pulls/117712
A draw list bundles multiple draw commands for the same geometry
and sends the draw commands in a single command. This reduces
the overhead of pipeline checking, resource validation and can
keep the load higher on the gpu as more work needs to be done.
Previously the draw list didn't bundle any commands and would still
send each call separately to the GPU. This PR implements the bundling
of the commands.
Pull Request: https://projects.blender.org/blender/blender/pulls/117548
Ensure attachment states and load/store configs don't get out of sync
with the framebuffer layout.
In theory, a Framebuffer could have empty attachments interleaved with
valid ones so checking just the attachments "length" is not enough.
What this does instead is to ensure that valid attachments have a valid
config and that null attachments either don't have a matching config or
have an IGNORE/DONT_CARE one.
Pull Request: https://projects.blender.org/blender/blender/pulls/117073
This PR adds support for specialization constants for the OpenGL
backend. The minimum OpenGL version we are targetting doesn't
have native support for specialization constants. We simulate this
by keeping track of shader programs for each set of specialization
constants that are being used.
Specialization constants can be used to reduce shader complexity
and improve performance as less registry and/or spilling is done.
This requires the ability to recompile GLShaders. In order to do this
we need to keep track of the sources that are used when the shader
was compiled. For static sources we only store references
(`GLSource::source_ref`), for dynamically generated sources we keep
a copy of the source (`GLSource::source`).
When recompiling the shader GLSL source-code is generated for the
constants stored in `Shader::constants`. When compiling the previous
GLSource that contains specialization constants is then replaced
by the new version.
Pull Request: https://projects.blender.org/blender/blender/pulls/116926
When anisotropic filtering is enabled on a sampler its value must be
between 1 and 16. In blender it is possible to set a value lower than 1.
0 actually means that anisotropic filtering is disabled in Blender.
This would trigger a validation error in Vulkan.
Pull Request: https://projects.blender.org/blender/blender/pulls/117018
The Specialization Shader workaround generated code that wasn't Vulkan
GLSL compliant. The uintBitsToFloat cannot be used during global const
initialization.
This is a temporary work around until we implement the Specialization
Constants for Vulkan.
Pull Request: https://projects.blender.org/blender/blender/pulls/116942
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.
If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
This adds some `#line` directive between the
source file injection so that the log parser knowns
which file the errors originated from.
This is then followed by a scan over the combined
source to find out the real row number.
This needed some changes in the `Shader::plint_log`
to skip lines to avoid outputing redundant information.
Adds API to allow usage of specialization constants in shaders.
Specialization constants are dynamic runtime constants which can
be compiled into a shader pipeline state object (PSO) to improve
runtime performance by reducing shader complexity through
shader compiler constant-folding.
This API allows specialization constant values to be specified
along with a default value if no constant value has been declared.
Each GPU backend is then responsible for caching PSO permutations
against the current specialization configuration.
This patch adds support for specialization constants in the
Metal backend and provides a generalised high-level solution
which can be adopted by other graphics APIs supporting
this feature.
Authored by Apple: Michael Parkin-White
Authored by Blender: Clément Foucault (files in gpu/test folder)
Pull Request: https://projects.blender.org/blender/blender/pulls/115193
GPU module assumes that image and textures uses different bind
namespaces. In Vulkan this isn't the case, leading that some shaders
generate incorrect bind types, when the state has bindings that are not
used by the shader, but would conflict due to namespace differences.
This PR will only return the binding when after validating it is from
the expected namespace. This removes several validation warnings.
This was done in order to debug EEVEE using modern toolsets. These
toolsets don't support OpenGL and we use Vulkan as a workaround if
possible.
Pull Request: https://projects.blender.org/blender/blender/pulls/116465
This patch adds an alternative path for devices/OSs
which do not support native texture atomics in Metal.
Support is encapsulated within the backend, ensuring
any allocated texture with the USAGE_ATOMIC flag is
allocated with a backing buffer, upon which atomic
operations happen.
The shader generation is also changed for the atomic
case, which instructs the backend to insert additional
buffer bind-points for the buffer resource. As Metal
also only supports buffer-backed textures for
textureBuffers or 2D textures, TextureArrays and
3D textures are emulated within a 2D texture, with
sample locations being indirected.
All usage of atomic textures MUST now utilise the
correct atomic texture types in the high level shader
and GPUShaderCreateInfo declarations.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/115956
During normal execution it isn't possible to switch workarounds.
However when running test cases we want to check if shaders and
other tests work when work arounds are enabled. Currently shader
patches are cached globally.
This PR moves the cached shader patch to the device level which
is rebuild every time a device needs to be reinitialized.
For OpenGL this is also an issue, but harder to solve as the concept
device doesn't exist there.
Pull Request: https://projects.blender.org/blender/blender/pulls/116042
When debugging the descriptor sets are unnamed. This PR sets the
active shader name. This helps when debugging so we don't need
to track down the shader it is complaining about.
```
the descriptor (VkDescriptorSet 0x66da6f0000001c58[workbench_prepass_mesh_opaque_studio_texture_no_clip_1022]
binding 7, index 0) is being used in draw but has never been updated via vkUpdateDescriptorSets() or a similar call.
```
This message direct directly to the shader including what part is
needed to be checked. No need to add break points and that sort
of things.
Pull Request: https://projects.blender.org/blender/blender/pulls/115944
This change adds timeline semaphores to track submissions. The previous
implementation used a fence.
Timeline semaphores can be tracked in more detail as it is an counter.
For each submission the counter can be stored locally and when waiting
for completion the counter can be retrieved again and checked if is
known to be succeeded by a higher value.
The timeline semaphore is stored next to the queue and can also be used
to synchronize between multiple contexts.
Pull Request: https://projects.blender.org/blender/blender/pulls/115357
Currently all buffer types were stored in host memory, which is visible to the GPU as well.
This is typically slow as the data would be transferred over the PCI bus when used.
Most of the time Index and Vertex buffers are written once and read many times so it makes
more sense to locate them on the GPU. Storage buffers typically require quick access as they
are created for shading/compute purposes.
This PR will try to store vertex buffers, index buffers and storage buffers on device memory
to improve the performance.
Uniform buffers are still located on host memory as they can be uploaded during binding process.
This can (will) reset the graphics pipeline triggering draw calls using unattached resources.
In future this could be optimized further as in:
* using different pools for allocating specific buffers, with a fallback when buffers cannot be
stored on the GPU anymore.
* store uniform buffers in device memory
Pull Request: https://projects.blender.org/blender/blender/pulls/115343
This PR adds the default vulkan pipeline cache to when creating
pipelines.
Pipeline caching reuses pipelines that have already been created.
Pipeline creation can be somewhat costly - it has to compile the
shaders at creation time for example.
The cache is recreated at each run of Blender.
Pull Request: https://projects.blender.org/blender/blender/pulls/115346
Some test cases inside the vulkan backend don't rely on an initialized
Vulkan Context and can always be run.
This PR enables those test cases when `WITH_GTEST` and
`WITH_VULKAN_BACKEND` is on. Allowing some tests to run on the
buildbot.
Pull Request: https://projects.blender.org/blender/blender/pulls/114574
Multi indirect drawing would bind an offset index buffer, but
indirect drawing parameters also offset the index buffer so
incorrect geometry was drawn.
Fixes drawing of meshes with multiple materials.
Pull Request: https://projects.blender.org/blender/blender/pulls/115190
This PR shows the memory footprint in the statusbar when activated.
Only memory allocated on the VRAM is counted. Memory allocated on host
memory is not counted.

Pull Request: https://projects.blender.org/blender/blender/pulls/115184
Currently we keep track of 3 command buffers. Data transfer, compute and
graphics. The reason is that data transfer and compute commands cannot
be recorded in a command buffer that has an active render pass.
This PR simplifies the implementation by combining data transfer and
compute commands as there is no need to separate those in individual
command buffers.
This is in preparation of improving submission performance.
Pull Request: https://projects.blender.org/blender/blender/pulls/115033
In Blender a context should not be shared between threads. In Vulkan a
command pool must not be shared between threads. In the current
implementation the command pool are stored on device level and could
therefore be shared between multiple context which made the implementation
not matching these rules.
This PR moves the command pool from device to command buffers where it
would not conflict between other contexts. This PR doesn't make the Vulkan
backend fully multithreaded. The access to the queue is still missing.
Pull Request: https://projects.blender.org/blender/blender/pulls/114977