Add dispatch indirect node. Also refactored the dispatch (direct) node
so more logic could be reused. The context only stores a `VKResourceAccessInfo`
struct which is reused by both the dispatch and dispatch indirect node.
Pull Request: https://projects.blender.org/blender/blender/pulls/120993
This PR adds support for compute shaders to the render graph. Only direct dispatch
is supported. Indirect dispatch will be added in a future PR.
This change enables the following test cases to be supported when using render graphs:
- `GPUVulkanTest.push_constants*`
- `GPUVulkanTest.shader_compute_*`
- `GPUVulkanTest.buffer_texture`
- `GPUVulkanTest.specialization_constants_compute`
- `GPUVulkanTest.compute_direct`
```
[==========] 95 tests from 2 test suites ran. (24059 ms total)
[ PASSED ] 95 tests.
```
Specialization constants are supported when using the render graph. This should conclude
the conversion of the prototype of the render graph.
Pull Request: https://projects.blender.org/blender/blender/pulls/120963
VKPipeline class is deprecated and will be phased out in the near future.
This PR moves the push constants to VKShader as they were wrongly placed in the
pipeline.
Pull Request: https://projects.blender.org/blender/blender/pulls/120980
In Vulkan, a Blender shader is organized in multiple
objects. A VkPipeline is the highest level concept and roughly represents
what we call a shader. A pipeline is a device/platform-optimized
version of the shader that is uploaded to and executed on the GPU device.
A key difference with shaders is that its usage is also compiled
in. When using the same shader with a different blending, a new pipeline
needs to be created.
In the current implementation of the Vulkan backend the pipeline is
re-created when any pipeline parameter changes. This triggers many
pipeline compilations, especially when common shaders are used in
different parts of the drawing code.
A requirement of our render graph implementation is that changes
of the pipeline can be detected based on the VkPipeline handle.
We only want to rebind the pipeline handle when the handle actually
changes. This improves performance, especially on NVIDIA devices,
where pipeline binds are known to be costly.
The solution of this PR is to add a pipeline pool. This holds all
pipelines and can find an already created pipeline based on pipeline
infos. Only support for compute pipelines has been added.
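The lookup idea can be sketched as follows. This is a minimal illustration, not the backend code: `ComputeInfo`, `PipelinePool`, `get_or_create_compute_pipeline` and the integer `PipelineHandle` are hypothetical names, and the place where `vkCreateComputePipelines` would be called is only marked by a comment.

```cpp
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical stand-in for VkPipeline; a real pool would store Vulkan handles.
using PipelineHandle = std::uint64_t;

// Hypothetical key: everything that influences compute pipeline creation.
struct ComputeInfo {
  std::uint64_t shader_module;
  std::uint64_t pipeline_layout;
  std::uint32_t specialization_hash;

  bool operator<(const ComputeInfo &other) const
  {
    return std::tie(shader_module, pipeline_layout, specialization_hash) <
           std::tie(other.shader_module, other.pipeline_layout, other.specialization_hash);
  }
};

class PipelinePool {
  std::map<ComputeInfo, PipelineHandle> compute_pipelines_;
  PipelineHandle next_handle_ = 1;

 public:
  // Number of pipelines that were actually created (cache misses).
  int created_count = 0;

  // Return the cached pipeline for `info`, creating it only on first use.
  PipelineHandle get_or_create_compute_pipeline(const ComputeInfo &info)
  {
    auto it = compute_pipelines_.find(info);
    if (it != compute_pipelines_.end()) {
      return it->second;
    }
    // A real implementation would call vkCreateComputePipelines here.
    PipelineHandle handle = next_handle_++;
    created_count++;
    compute_pipelines_.emplace(info, handle);
    return handle;
  }
};
```

Because equal infos always map to the same handle, a caller can compare handles to decide whether a rebind is needed.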
# Future enhancements
- Recent drivers replace `VkShaderModule` with pipeline libraries.
This improves sharing of pipeline stages and reduces pipeline creation times.
- GPUMaterials should be removed from the pipeline pool when they are
destroyed. Details on this will be more clear when EEVEE support is
added.
Pull Request: https://projects.blender.org/blender/blender/pulls/120899
The resource access info contains lists. In a future setup the resource access info
will be kept in the VKContext and reused. This requires a reset function to
clean up the instance for reuse.
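A minimal sketch of such a reset, assuming a simplified struct with two lists. The struct layout and field names are hypothetical and only illustrate the idea of clearing the lists so the same instance can be refilled.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified entries of a resource access info.
struct BufferAccess {
  std::uint64_t buffer;
  std::uint32_t access_mask;
};

struct ImageAccess {
  std::uint64_t image;
  std::uint32_t access_mask;
};

struct ResourceAccessInfo {
  std::vector<BufferAccess> buffers;
  std::vector<ImageAccess> images;

  // Empty the lists; std::vector::clear keeps the allocated capacity,
  // so refilling the instance does not need to reallocate.
  void reset()
  {
    buffers.clear();
    images.clear();
  }
};
```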
Pull Request: https://projects.blender.org/blender/blender/pulls/120962
When building the resource access used when adding dispatch/draw commands
to the render graph, the access mask is required. This PR stores the
access mask in the shader interface. When binding the resources referenced
by the state manager, the resource access info struct is populated with
the access flags.
In the near future the resource access info will be passed when adding
a dispatch/draw node to the render graph to generate the links.
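The flow described above could look roughly like this. All types here (`ShaderInterface`, `StateManager`, `apply_bindings`, the access flag constants) are simplified, hypothetical stand-ins; the backend would use `VkAccessFlags` and its own classes.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical access flags; the backend would use VkAccessFlags values.
using AccessFlags = std::uint32_t;
constexpr AccessFlags ACCESS_SHADER_READ = 1u << 0;
constexpr AccessFlags ACCESS_SHADER_WRITE = 1u << 1;

// Hypothetical shader interface: the access mask per binding slot.
struct ShaderInterface {
  std::map<int, AccessFlags> access_mask_by_binding;
};

struct BufferAccess {
  std::uint64_t buffer;
  AccessFlags access_mask;
};

struct ResourceAccessInfo {
  std::vector<BufferAccess> buffers;
};

// Hypothetical state manager: which buffer is bound to which slot.
struct StateManager {
  std::map<int, std::uint64_t> bound_buffers;

  // Add every bound buffer that the shader references to `r_access_info`,
  // using the access mask stored in the shader interface.
  void apply_bindings(const ShaderInterface &shader_interface,
                      ResourceAccessInfo &r_access_info) const
  {
    for (const auto &item : bound_buffers) {
      auto it = shader_interface.access_mask_by_binding.find(item.first);
      if (it == shader_interface.access_mask_by_binding.end()) {
        continue;  // Bound, but not referenced by this shader.
      }
      r_access_info.buffers.push_back({item.second, it->second});
    }
  }
};
```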
Pull Request: https://projects.blender.org/blender/blender/pulls/120908
Compute shaders are required since 4.0. There was one occasion where
an older AMD driver failed and support was turned off. This driver
is now marked unsupported.
This PR includes:
- removing the check in viewport compositing
- removing properties from system info
- always constructing the draw manager
- removing unused pass logic in draw hair/curves
- adding a deprecation warning when accessed from Python
Pull Request: https://projects.blender.org/blender/blender/pulls/120909
There was a memory leak in the render graph where nodes were freed,
but not the data they could keep. Detected while adding support for
compute shaders and running the draw tests.
Pull Request: https://projects.blender.org/blender/blender/pulls/120906
This PR implements render graph support for VKTexture. During the
implementation some tweaks to the render graph were made
to support depth and stencil textures.
The render graph will record the image aspect being used
for each node. This will then be used to generate barriers
for the correct aspect.
Also fixes an issue where uploading array textures didn't
allocate a large enough staging buffer.
Pull Request: https://projects.blender.org/blender/blender/pulls/120821
A developer can switch `vk_common.hh#use_render_graph` to enable the render graph.
When enabled, the buffers and images are tracked by the device resource state
tracker. The storage buffer commands are recorded to the context render graph.
The following unit tests will pass:
- GPUVulkanTest.storage_buffer_create_update_read
- GPUVulkanTest.storage_buffer_clear_zero
- GPUVulkanTest.storage_buffer_clear
- GPUVulkanTest.storage_buffer_copy_from_vertex_buffer
The pattern to migrate to render graph is:
- always construct CreateInfo for class.
- based on `use_render_graph` call `context.command_buffers.something`
or `context.render_graph.add_node`.
- Hide calls to `context.flush` when `use_render_graph` is true.
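The migration pattern above can be sketched like this. The `Context` type, its member functions and `FillBufferCreateInfo` are hypothetical placeholders that only log which path was taken; they are not the backend API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toggle mirroring the `use_render_graph` switch.
static bool use_render_graph = false;

// Hypothetical create info for a fill-buffer command.
struct FillBufferCreateInfo {
  std::uint64_t buffer;
  std::uint32_t size;
  std::uint32_t data;
};

// Hypothetical context that only logs which path was taken.
struct Context {
  std::vector<std::string> log;

  void command_buffers_fill(const FillBufferCreateInfo &) { log.push_back("command_buffers.fill"); }
  void render_graph_add_node(const FillBufferCreateInfo &) { log.push_back("render_graph.add_node"); }
  void flush() { log.push_back("flush"); }
};

void clear_buffer(Context &context, std::uint64_t buffer, std::uint32_t size)
{
  // 1. Always construct the create info.
  FillBufferCreateInfo fill_buffer = {buffer, size, 0u};

  // 2. Pick the path based on the toggle.
  if (use_render_graph) {
    context.render_graph_add_node(fill_buffer);
  }
  else {
    context.command_buffers_fill(fill_buffer);
    // 3. The explicit flush only happens on the old path.
    context.flush();
  }
}
```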
Pull Request: https://projects.blender.org/blender/blender/pulls/120812
f2ae04db10 introduces missing binding
tracking for SSBO and UBOs. Vulkan relies on validation layers to
report on missing bindings, but the binding information should still
be cleared in the context state manager.
Pull Request: https://projects.blender.org/blender/blender/pulls/120814
**Design Task**: blender/blender#118330
This PR adds the core of the render graph. The render graph isn't used yet.
The current implementation of the Vulkan backend is slow by design. We
focused on stability before performance. With the newly introduced render
graph the focus will shift to performance while keeping the stability where
it is.
Some highlights:
- Every context will get its own render graph. (`VKRenderGraph`).
- Resources (and resource state tracking) are device specific (`VKResourceStateTracker`).
- No node reordering / sub graph execution has been implemented. Currently
all nodes in the graph are executed in the order they were added. (`VKScheduler`).
- The links inside the graph describe the resources the nodes read from (input links)
or write to (output links).
- When resources are written to, a resource stamp is incremented, allowing keeping
track of which nodes need which stamp of a resource.
- At each link the access information (how the node accesses the resource)
and image layout (for image resources) are stored. This allows the render graph
to find out how a resource was used in the past and will be used in the future.
That is important to construct pipeline barriers that don't stall the whole GPU.
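The stamp mechanism from the highlights can be illustrated with a small sketch. The `Graph` struct and its functions are hypothetical, and whether a write links to the stamp before or after the increment is an assumption made here for illustration.

```cpp
#include <cstdint>
#include <map>
#include <vector>

using ResourceHandle = std::uint64_t;
using NodeHandle = std::uint64_t;

// A resource at a specific version.
struct ResourceWithStamp {
  ResourceHandle handle;
  std::uint64_t stamp;
};

struct Link {
  NodeHandle node;
  ResourceWithStamp resource;
};

// Hypothetical, heavily simplified stamp tracking and link storage.
struct Graph {
  std::map<ResourceHandle, std::uint64_t> current_stamp;
  std::vector<Link> inputs;
  std::vector<Link> outputs;

  // Reading links the node to the current version of the resource.
  void add_read(NodeHandle node, ResourceHandle resource)
  {
    inputs.push_back({node, {resource, current_stamp[resource]}});
  }

  // Writing produces a new version: increment the stamp, then link to it
  // (assumption: the output link refers to the new stamp).
  void add_write(NodeHandle node, ResourceHandle resource)
  {
    std::uint64_t stamp = ++current_stamp[resource];
    outputs.push_back({node, {resource, stamp}});
  }
};
```

With this, a node that reads a buffer after the first write is linked to stamp 1, while a node that reads after a second write is linked to stamp 2, so the two readers can be told apart.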
# Defined nodes
This implementation has nodes for:
- Blit image
- Clear color image
- Copy buffers to buffers
- Copy buffers to images
- Copy images to images
- Copy images to buffers
- Dispatch compute shader
- Fill buffers
- Synchronization
Each node has a node info, create info and data struct. The create info
contains all data to construct the node, including the links of the graph.
The data struct only contains the data stored inside the node. The node info
contains the node specific implementation.
> NOTE: Other nodes will be added after this PR lands to main.
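As an illustration of the three-struct layout, here is a sketch for a fill-buffer node. The struct and function names are hypothetical, and `build_commands` returns a description string instead of recording `vkCmdFillBuffer` so the example stays self-contained.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Data: only what the node itself stores (hypothetical fill-buffer node).
struct FillBufferData {
  std::uint64_t buffer;
  std::uint32_t size;
  std::uint32_t value;
};

// Create info: everything needed to construct the node and its links.
struct FillBufferCreateInfo {
  FillBufferData node_data;
};

// Links produced for a node, reduced to resource handles.
struct Links {
  std::vector<std::uint64_t> inputs;
  std::vector<std::uint64_t> outputs;
};

// Node info: the node-specific implementation.
struct FillBufferNodeInfo {
  static FillBufferData set_node_data(const FillBufferCreateInfo &create_info)
  {
    return create_info.node_data;
  }

  static void build_links(const FillBufferCreateInfo &create_info, Links &r_links)
  {
    // A fill only writes to the buffer, so it adds a single output link.
    r_links.outputs.push_back(create_info.node_data.buffer);
  }

  // Describe the command; a real node would record vkCmdFillBuffer instead.
  static std::string build_commands(const FillBufferData &data)
  {
    return "vkCmdFillBuffer(buffer=" + std::to_string(data.buffer) +
           ", size=" + std::to_string(data.size) +
           ", data=" + std::to_string(data.value) + ")";
  }
};
```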
# Resources
Before a render graph can be used, the resources should be registered
to `VKResourceStateTracker`. In the final implementation this will be owned by
the `VKDevice`. Registration of resources can be done by calling
`VKResources.add_buffer` or `VKResources.add_image`.
# Render graph
Nodes can be added to the render graph. When adding a node its read/write
dependencies are extracted and converted into links
(`VKNodeInfo.build_links`).
When the caller wants to have a resource up to date the functions
`VKRenderGraph.submit_for_read` or `VKRenderGraph.submit_for_present`
can be called.
These functions will select and order the nodes that are needed
and convert them to `vkCmd*` commands. These commands include pipeline
barriers and image layout transitions.
The `vkCmd*` commands are recorded into a command buffer which is sent to the
device queue.
## Walking the graph
Walking the render graph isn't implemented yet. The idea is to have a
`Map<ResourceWithStamp, Vector<NodeHandle>> consumers` and
`Map<ResourceWithStamp, NodeHandle> producers`. These attributes can
be stored in the render graph and created when building the links, or
can be created inside the VKScheduler as a variable. Which one
would be better is unclear, as there aren't any users yet. Once
the scheduler needs them, we need to figure out the best
way to store and retrieve the consumers/producers.
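A sketch of how such lookup tables could be filled while building links. It uses `std::map` instead of Blender's `Map` and `Vector` so it is self-contained; the `GraphLookup` struct and its function names are hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

using NodeHandle = std::uint64_t;

struct ResourceWithStamp {
  std::uint64_t handle;
  std::uint64_t stamp;

  bool operator<(const ResourceWithStamp &other) const
  {
    return std::tie(handle, stamp) < std::tie(other.handle, other.stamp);
  }
};

// Hypothetical lookup tables, filled while the links are built.
struct GraphLookup {
  std::map<ResourceWithStamp, std::vector<NodeHandle>> consumers;
  std::map<ResourceWithStamp, NodeHandle> producers;

  // An input link: `node` reads this version of the resource.
  void add_input_link(NodeHandle node, const ResourceWithStamp &resource)
  {
    consumers[resource].push_back(node);
  }

  // An output link: `node` produces this version of the resource.
  void add_output_link(NodeHandle node, const ResourceWithStamp &resource)
  {
    producers[resource] = node;
  }
};
```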
# Unit tests
The render graph can be tested by enabling `WITH_GTEST` and using
`vk_render_graph` as a filter.
```
bin/tests/blender_test --gtest_filter="vk_render_graph*"
```
Pull Request: https://projects.blender.org/blender/blender/pulls/120427
Trying to narrow down why some tests are failing on Windows, but not on
Linux/Mac. These tests use a string compare. This PR adds two modifications
to the current vk_to_string.
- use `std::endl`; although not required, it is important for portability
- print Vulkan handles in a uniform way.
Pull Request: https://projects.blender.org/blender/blender/pulls/120780
When reviewing the render graph core PR, we discussed where specific
backend tests should be located. The code style is clear that they
need to be located in a tests folder next to the code they test.
This PR moves the tests from next to the files they test into
a tests folder.
Pull Request: https://projects.blender.org/blender/blender/pulls/120777
- Removed ignored message ids as this is also part of vkconfig, which should be used.
This is also advised by the Vulkan Tools WG.
- Also initialize logging when platform doesn't have debugging extensions.
Pull Request: https://projects.blender.org/blender/blender/pulls/120776
This PR adds a context function to consider all
buffer bindings obsolete. This is in order to
track missing binds and invalid lingering states
across `draw::Pass`es.
The functions `GPU_storagebuf_debug_unbind_all`
and `GPU_uniformbuf_debug_unbind_all` do nothing
more than resetting the internal debug slot bits
to zero. This is what OpenGL backend does as it
doesn't track the bindings themselves.
Other backends might have other ways to detect
missing bindings. If not, they should be
implemented separately anyway.
I renamed the function to `debug_unbind_all` to
denote that it actually does something related to
debugging.
This also adds an SSBO binding check for OpenGL as it
was also missing.
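The debug slot bits can be pictured as one bitmask per buffer type. The sketch below is an assumption-laden illustration: `DebugBindState` and its members are hypothetical names, not the actual context fields.

```cpp
#include <cstdint>

// Hypothetical debug state: one bit per binding slot.
struct DebugBindState {
  std::uint32_t bound_ssbo_slots = 0;
  std::uint32_t bound_ubo_slots = 0;

  void bind_ssbo(int slot) { bound_ssbo_slots |= (1u << slot); }
  void bind_ubo(int slot) { bound_ubo_slots |= (1u << slot); }

  // Consider all bindings obsolete by resetting the debug bits to zero.
  void debug_unbind_all_ssbo() { bound_ssbo_slots = 0; }
  void debug_unbind_all_ubo() { bound_ubo_slots = 0; }

  // Slots the shader requires that are not currently marked as bound.
  std::uint32_t missing_ssbo(std::uint32_t required_slots) const
  {
    return required_slots & ~bound_ssbo_slots;
  }
};
```

After a `debug_unbind_all_*` call, any slot a shader requires shows up as missing until it is bound again, which is what exposes lingering state from a previous pass.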
#### Future
This error checking logic is pretty much backend
agnostic. While it would be nice to move it to the
`gpu::Context` level, we don't have the resources
for that now.
Pull Request: https://projects.blender.org/blender/blender/pulls/120716
Previously each shader had its own descriptor set layout handle. This made it
difficult to see if currently bound resources can be reused when switching shaders.
To work around this limitation, the Vulkan backend rebound the resources over
and over again.
This PR is part of the render graph where changing shaders can reuse previously
bound resources. This PR only makes sure that the layout handles are the same
so we can identify 'compatible' pipelines. The behavior to limit rebinding of
resources will be added as part of the render graph in a later commit.
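A sketch of how shared layout handles make compatibility checks cheap. The `DescriptorSetLayouts` class, the `LayoutInfo` key and the integer handle are hypothetical simplifications; a real implementation would create the handle with `vkCreateDescriptorSetLayout`.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for VkDescriptorSetLayout.
using LayoutHandle = std::uint64_t;

enum class DescriptorType : std::uint32_t {
  UniformBuffer,
  StorageBuffer,
  SampledImage,
  StorageImage,
};

// Hypothetical key: the descriptor types in binding order.
using LayoutInfo = std::vector<DescriptorType>;

class DescriptorSetLayouts {
  std::map<LayoutInfo, LayoutHandle> layouts_;
  LayoutHandle next_handle_ = 1;

 public:
  // Shaders with identical layout infos receive the same handle.
  LayoutHandle get_or_create(const LayoutInfo &info)
  {
    auto it = layouts_.find(info);
    if (it != layouts_.end()) {
      return it->second;
    }
    // A real implementation would call vkCreateDescriptorSetLayout here.
    LayoutHandle handle = next_handle_++;
    layouts_.emplace(info, handle);
    return handle;
  }
};

struct Shader {
  LayoutHandle layout;
};

// With shared handles, compatibility is a handle comparison.
bool can_reuse_bound_resources(const Shader &previous, const Shader &next)
{
  return previous.layout == next.layout;
}
```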
Pull Request: https://projects.blender.org/blender/blender/pulls/120562
MoltenVK's original intent was to let developers work on a Mac system developing
for the Vulkan ecosystem. MoltenVK doesn't support all the features that we
require and would require additional workarounds to be actually supported.
It is not expected that we will release Blender with MoltenVK for this reason.
But it still has value for shader developers to validate shaders on Metal and
Vulkan on a single platform.

Pull Request: https://projects.blender.org/blender/blender/pulls/117940
Textures that are GPU-compressed already (in practice: from DDS files
that are DXT1/DXT3/DXT5 compressed) now can stay GPU compressed
in Vulkan, similar to how that works on OpenGL.
Additionally, fixed lack of mipmaps in Vulkan textures. The textures
were created with mipmaps (good), the sampler too (good), but
the vulkan image view was always saying "yo, this is mip 0 only"
because mip range variables were never set to anything other than zero.
Pull Request: https://projects.blender.org/blender/blender/pulls/119866
Every vulkan installation has a vk.xml file containing the vulkan specification
in a machine readable fashion.
This PR uses the vk.xml to generate to_string functions for data types blender uses.
When updating to a new specification or when changing features/extensions we
should re-generate the to_string functions.
The generator is implemented in `vk_to_string.py`.
Pull Request: https://projects.blender.org/blender/blender/pulls/119880
Now that all relevant code is C++, the indirection from the C struct
`GPUVertBuf` to the C++ `blender::gpu::VertBuf` class just adds
complexity and necessitates a wrapper API, making more cleanups like
use of RAII or other C++ types more difficult.
This commit replaces the C wrapper structs with direct use of the
vertex and index buffer base classes. In C++ we can choose which parts
of a class are private, so we don't risk exposing too many
implementation details here.
Pull Request: https://projects.blender.org/blender/blender/pulls/119825
This patch adds the maximum number of supported image units to the GPU
capabilities module. Currently, the GPU module assumes a maximum of 8
units, so the patch is not particularly useful yet, but we can
consider committing it for the future anyways.
Pull Request: https://projects.blender.org/blender/blender/pulls/119057
Adds an option to set the capture title when using renderdoc.
`GPU_debug_capture_begin` has an optional `title` parameter to set
the title of the renderdoc capture.
Pull Request: https://projects.blender.org/blender/blender/pulls/118649
Blender uses some vertex attributes that are not (and sometimes
never will be) supported by a GPU. OpenGL silently converted these attributes,
but for Metal/Vulkan we need to convert them when uploading the
data.
This PR will write invalid usages to the console, which we should remove
from the Blender code-base. Note it is still possible to create attributes
that need conversion by using the PyGPU API.
Span is preferable since it's agnostic of the source container,
makes it clearer that there is no ownership, is 8 bytes smaller,
and can be passed by value.
`GLBatch::draw_indirect` has additional overhead compared to
`GLBatch::draw`, and can become a bottleneck in scenes that require
many draw calls (i.e. with too many unique meshes).
The performance difference is almost exclusively caused by the
`GL_COMMAND_BARRIER_BIT` barrier that happens on every call.
This PR adds a `GPU_storagebuf_sync_as_indirect_buffer` function that
can be used to place the barrier only once after filling the indirect
buffer content.
This function is a no-op in Vulkan and Metal since they don't need the
barrier.
Pull Request: https://projects.blender.org/blender/blender/pulls/117561
Previously a storage buffer was used to store draw list commands as it
matches already existing APIs. Unfortunately storage buffers prefer to
be stored on the GPU device, which would reduce the benefit of a dynamic
draw list.
This PR replaces the storage buffer with a regular buffer, which gives
more control over where to store the buffer.
Pull Request: https://projects.blender.org/blender/blender/pulls/117712
A draw list bundles multiple draw commands for the same geometry
and sends the draw commands in a single command. This reduces
the overhead of pipeline checking, resource validation and can
keep the load higher on the gpu as more work needs to be done.
Previously the draw list didn't bundle any commands and would still
send each call separately to the GPU. This PR implements the bundling
of the commands.
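The bundling idea can be sketched as follows. `DrawList`, `append` and `submit` are hypothetical names here; the `DrawCommand` fields mirror `VkDrawIndirectCommand`, and the actual multi-draw call (for example `vkCmdDrawIndirect` with a `drawCount` above one) is only indicated by a comment.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fields mirror VkDrawIndirectCommand.
struct DrawCommand {
  std::uint32_t vertex_count;
  std::uint32_t instance_count;
  std::uint32_t first_vertex;
  std::uint32_t first_instance;
};

// Hypothetical draw list that bundles commands for the same geometry.
class DrawList {
  std::vector<DrawCommand> commands_;
  std::uint64_t batch_ = 0;

 public:
  int submit_count = 0;
  std::size_t last_submit_size = 0;

  // Collect commands for the same batch; a different batch forces a submit.
  void append(std::uint64_t batch, const DrawCommand &command)
  {
    if (!commands_.empty() && batch != batch_) {
      submit();
    }
    batch_ = batch;
    commands_.push_back(command);
  }

  // Send all collected commands in one go.
  void submit()
  {
    if (commands_.empty()) {
      return;
    }
    // A real backend would issue a single multi-draw call here.
    submit_count++;
    last_submit_size = commands_.size();
    commands_.clear();
  }
};
```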
Pull Request: https://projects.blender.org/blender/blender/pulls/117548
Ensure attachment states and load/store configs don't get out of sync
with the framebuffer layout.
In theory, a Framebuffer could have empty attachments interleaved with
valid ones so checking just the attachments "length" is not enough.
What this does instead is to ensure that valid attachments have a valid
config and that null attachments either don't have a matching config or
have an IGNORE/DONT_CARE one.
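The check can be sketched per slot rather than by length. The types below are hypothetical simplifications: a single `Action` enum stands in for the load/store configuration, and a "valid config" is assumed to mean that a config entry exists for the slot.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical simplified config action.
enum class Action { Ignore, DontCare, Clear, Load };

struct Attachment {
  bool valid;  // Whether a texture is attached to this slot.
};

struct Config {
  Action action;
};

// Check every slot instead of comparing lengths, so empty attachments
// interleaved with valid ones are handled.
bool configs_match_layout(const std::vector<Attachment> &attachments,
                          const std::vector<Config> &configs)
{
  const std::size_t slots = std::max(attachments.size(), configs.size());
  for (std::size_t i = 0; i < slots; i++) {
    const bool has_attachment = i < attachments.size() && attachments[i].valid;
    const bool has_config = i < configs.size();
    if (has_attachment) {
      // Valid attachments need a config (assumption: any present config counts).
      if (!has_config) {
        return false;
      }
    }
    else if (has_config) {
      // Null attachments may only have an Ignore/DontCare config.
      const Action action = configs[i].action;
      if (action != Action::Ignore && action != Action::DontCare) {
        return false;
      }
    }
  }
  return true;
}
```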
Pull Request: https://projects.blender.org/blender/blender/pulls/117073