test2/source/blender/gpu/vulkan/render_graph/vk_command_builder.cc

/* SPDX-FileCopyrightText: 2024 Blender Authors
 *
 * SPDX-License-Identifier: GPL-2.0-or-later */

/** \file
 * \ingroup gpu
 */

#include "vk_command_builder.hh"
#include "vk_backend.hh"
#include "vk_render_graph.hh"
#include "vk_to_string.hh"

#include <sstream>

namespace blender::gpu::render_graph {

/* -------------------------------------------------------------------- */
/** \name Build nodes
 * \{ */

void VKCommandBuilder::build_nodes(VKRenderGraph &render_graph,
                                   VKCommandBufferInterface &command_buffer,
                                   Span<NodeHandle> node_handles)
{
  groups_init(render_graph, node_handles);
  groups_extract_barriers(
      render_graph, node_handles, command_buffer.use_dynamic_rendering_local_read);
}
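
/* Record the commands of the given nodes into the command buffer. Uses the node groups and
 * barriers that were extracted by `build_nodes`. */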
void VKCommandBuilder::record_commands(VKRenderGraph &render_graph,
                                       VKCommandBufferInterface &command_buffer,
                                       Span<NodeHandle> node_handles)
{
  groups_build_commands(render_graph, command_buffer, node_handles);
}
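
/* Split the scheduled nodes into groups and store them in `group_nodes_` as index ranges into
 * `node_handles`. A run of consecutive rendering nodes forms one group (a new group starts at
 * each BEGIN_RENDERING node); every other node gets its own single-node group. */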
void VKCommandBuilder::groups_init(const VKRenderGraph &render_graph,
                                   Span<NodeHandle> node_handles)
{
  group_nodes_.clear();
  IndexRange nodes_range = node_handles.index_range();
  while (!nodes_range.is_empty()) {
    IndexRange node_group = nodes_range.slice(0, 1);
    NodeHandle node_handle = node_handles[nodes_range.first()];
    const VKRenderGraphNode &node = render_graph.nodes_[node_handle];
    while (node_type_is_rendering(node.type) && node_group.size() < nodes_range.size()) {
      NodeHandle node_handle = node_handles[nodes_range[node_group.size()]];
      const VKRenderGraphNode &node = render_graph.nodes_[node_handle];
      if (!node_type_is_rendering(node.type) || node.type == VKNodeType::BEGIN_RENDERING) {
        break;
      }
      node_group = nodes_range.slice(0, node_group.size() + 1);
    }
    group_nodes_.append(node_group);
    nodes_range = nodes_range.drop_front(node_group.size());
  }
}
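
/* Extract the pipeline barriers for every node group. Group pre/post barriers and per-node pre
 * barriers (used for dynamic rendering local read) are stored in `barrier_list_` and referenced
 * by index ranges, so they can be recorded later by `groups_build_commands`. */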
void VKCommandBuilder::groups_extract_barriers(VKRenderGraph &render_graph,
                                               Span<NodeHandle> node_handles,
                                               bool use_local_read)
{
  barrier_list_.clear();
  vk_buffer_memory_barriers_.clear();
  vk_image_memory_barriers_.clear();

  LayeredImageTracker layered_tracker(*this);

  /* Extract barriers. */
  group_pre_barriers_.clear();
  group_post_barriers_.clear();
  node_pre_barriers_.resize(node_handles.size());

  /* Keep track of the post barriers that need to be added. The pre barriers will be stored
   * directly in `barrier_list_`, but the post barriers must not be mixed in with them. Most
   * barriers are group pre barriers. */
  Vector<Barrier> post_barriers;
  /* Keep track of the node pre barriers that need to be added. They will also be stored in
   * `barrier_list_`, but must not be mixed in with the group barriers. */
  Vector<Barrier> node_pre_barriers;

  NodeHandle rendering_scope;
  bool rendering_active = false;

  for (const int64_t group_index : group_nodes_.index_range()) {
    /* Extract the pre-barriers of this group. */
    Barriers group_pre_barriers(barrier_list_.size(), 0);
    const GroupNodes &node_group = group_nodes_[group_index];
    for (const int64_t group_node_index : node_group) {
      NodeHandle node_handle = node_handles[group_node_index];
      VKRenderGraphNode &node = render_graph.nodes_[node_handle];
      Barrier barrier = {};
      build_pipeline_barriers(
          render_graph, node_handle, node.pipeline_stage_get(), layered_tracker, barrier);
      if (!barrier.is_empty()) {
#if 0
        std::cout << __func__ << ": node_group=" << group_index
                  << ", node_group_range=" << node_group.first() << "-" << node_group.last()
                  << ", node_handle=" << node_handle << ", node_type=" << node.type
                  << ", debug_group=" << render_graph.full_debug_group(node_handle) << "\n";
        std::cout << __func__ << ": " << to_string_barrier(barrier);
#endif
        barrier_list_.append(barrier);
      }

      /* Check for additional barriers when resuming rendering.
       *
       * Between suspending and resuming rendering, the state/layout of resources can change and
       * require additional barriers. */
      if (node.type == VKNodeType::BEGIN_RENDERING) {
        /* Begin rendering scope. */
        BLI_assert(!rendering_active);
        rendering_scope = node_handle;
        rendering_active = true;
        layered_tracker.begin(render_graph, node_handle);
      }
      else if (node.type == VKNodeType::END_RENDERING) {
        /* End rendering scope. */
        BLI_assert(rendering_active);
        rendering_scope = 0;
        rendering_active = false;
        /* Any specific layout changes need to be reverted, so the global resource state tracker
         * reflects the correct state. These barriers need to be added as post barriers. We
         * assume that END_RENDERING is always the last node of a group. */
        Barrier barrier = {};
        layered_tracker.end(barrier, use_local_read);
        if (!barrier.is_empty()) {
          post_barriers.append(barrier);
        }
      }
      else if (rendering_active && !node_type_is_within_rendering(node.type)) {
        /* Suspend active rendering scope. */
        rendering_active = false;
        /* Any specific layout changes need to be reverted, so the global resource state tracker
         * reflects the correct state. These barriers need to be added as post barriers. */
        Barrier barrier = {};
        layered_tracker.suspend(barrier, use_local_read);
        if (!barrier.is_empty()) {
          post_barriers.append(barrier);
        }
      }
      else if (!rendering_active && node_type_is_within_rendering(node.type)) {
        /* Resume rendering scope. */
        VKRenderGraphNode &rendering_node = render_graph.nodes_[rendering_scope];
        Barrier barrier = {};
        build_pipeline_barriers(render_graph,
                                rendering_scope,
                                rendering_node.pipeline_stage_get(),
                                layered_tracker,
                                barrier);
        if (!barrier.is_empty()) {
          barrier_list_.append(barrier);
        }

        /* Resume layered tracking. Each layer that has an override will be transitioned back to
         * the layer-specific image layout. */
        barrier = {};
        layered_tracker.resume(barrier, use_local_read);
        if (!barrier.is_empty()) {
          barrier_list_.append(barrier);
        }
        rendering_active = true;
      }

      /* Extract pre barriers for nodes. */
      if (use_local_read && node_type_is_within_rendering(node.type) &&
          node_has_input_attachments(render_graph, node_handle))
      {
        Barrier barrier = {};
        build_pipeline_barriers(
            render_graph, node_handle, node.pipeline_stage_get(), layered_tracker, barrier, true);
        if (!barrier.is_empty()) {
          /* Remember which local entry belongs to this node; the range is relocated into
           * `barrier_list_` once the whole group has been processed. */
          node_pre_barriers_[node_handle] = IndexRange(node_pre_barriers.size(), 1);
          node_pre_barriers.append(barrier);
        }
      }
    }

    if (rendering_active) {
      /* Suspend the layered image tracker. When rendering is still active here, the next group
       * will always be a compute/data transfer group.
       *
       * Any specific layout changes need to be reverted, so the global resource state tracker
       * reflects the correct state. These barriers need to be added as post barriers. */
      Barrier barrier = {};
      layered_tracker.suspend(barrier, use_local_read);
      if (!barrier.is_empty()) {
        post_barriers.append(barrier);
      }
      rendering_active = false;
    }

    /* Update the group pre and post barriers. Pre barriers are already stored in
     * `barrier_list_`. The post barriers are appended after the pre barriers. */
    int64_t barrier_list_size = barrier_list_.size();
    group_pre_barriers_.append(group_pre_barriers.with_new_end(barrier_list_size));
    barrier_list_.extend(std::move(post_barriers));
    group_post_barriers_.append(
        IndexRange::from_begin_end(barrier_list_size, barrier_list_.size()));

    if (!node_pre_barriers.is_empty()) {
      barrier_list_size = barrier_list_.size();
      barrier_list_.extend(std::move(node_pre_barriers));
      /* Shift all node pre barrier references to their new location in `barrier_list_`. */
      for (const int64_t group_node_index : node_group) {
        NodeHandle node_handle = node_handles[group_node_index];
        if (!node_pre_barriers_[node_handle].is_empty()) {
          node_pre_barriers_[node_handle] = IndexRange::from_begin_size(
              node_pre_barriers_[node_handle].start() + barrier_list_size, 1);
        }
      }
    }
  }

  BLI_assert(group_pre_barriers_.size() == group_nodes_.size());
  BLI_assert(group_post_barriers_.size() == group_nodes_.size());
}
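
/* Record the node groups into the command buffer: group pre barriers first, then the commands of
 * the nodes themselves (handling debug groups and suspending/resuming rendering where needed),
 * and finally the group post barriers. */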
void VKCommandBuilder::groups_build_commands(VKRenderGraph &render_graph,
                                             VKCommandBufferInterface &command_buffer,
                                             Span<NodeHandle> node_handles)
{
  DebugGroups debug_groups = {};
  VKBoundPipelines active_pipelines = {};
  NodeHandle rendering_scope = 0;
  bool rendering_active = false;

  for (int64_t group_index : group_nodes_.index_range()) {
    IndexRange group_nodes = group_nodes_[group_index];
    Span<NodeHandle> group_node_handles = node_handles.slice(group_nodes);

    /* Record group pre barriers. */
    for (BarrierIndex barrier_index : group_pre_barriers_[group_index]) {
      BLI_assert_msg(!rendering_active,
                     "Pre group barriers must be executed outside a rendering scope.");
      Barrier &barrier = barrier_list_[barrier_index];
#if 0
      std::cout << __func__ << ": node_group=" << group_index
                << ", node_group_range=" << group_node_handles.first() << "-"
                << group_node_handles.last() << ", pre_barrier=(" << to_string_barrier(barrier)
                << ")\n";
#endif
      send_pipeline_barriers(command_buffer, barrier, false);
    }

    /* Record group node commands. */
    for (NodeHandle node_handle : group_node_handles) {
      VKRenderGraphNode &node = render_graph.nodes_[node_handle];
      if (G.debug & G_DEBUG_GPU) {
        activate_debug_group(render_graph, command_buffer, debug_groups, node_handle);
      }

      if (node.type == VKNodeType::BEGIN_RENDERING) {
        rendering_scope = node_handle;
        rendering_active = true;
        /* Check whether the group spans a full rendering scope. In that case we don't need to
         * set the VK_RENDERING_SUSPENDING_BIT. */
        const VKRenderGraphNode &last_node = render_graph.nodes_[group_node_handles.last()];
        bool will_be_suspended = last_node.type != VKNodeType::END_RENDERING;
        if (will_be_suspended) {
          render_graph.storage_.begin_rendering[node.storage_index].vk_rendering_info.flags =
              VK_RENDERING_SUSPENDING_BIT;
        }
      }
      else if (node.type == VKNodeType::END_RENDERING) {
        rendering_active = false;
      }
      else if (node_type_is_within_rendering(node.type)) {
        if (!rendering_active) {
          /* Resume rendering scope. */
          VKRenderGraphNode &rendering_node = render_graph.nodes_[rendering_scope];
          render_graph.storage_.begin_rendering[rendering_node.storage_index]
              .vk_rendering_info.flags = VK_RENDERING_RESUMING_BIT;
          rendering_node.build_commands(command_buffer, render_graph.storage_, active_pipelines);
          rendering_active = true;
        }
      }

      /* Record group node barriers. (VK_EXT_dynamic_rendering_local_read) */
      for (BarrierIndex node_pre_barrier_index : node_pre_barriers_[node_handle]) {
        Barrier &barrier = barrier_list_[node_pre_barrier_index];
#if 0
        std::cout << __func__ << ": node_group=" << group_index
                  << ", node_group_range=" << group_node_handles.first() << "-"
                  << group_node_handles.last() << ", node_pre_barrier=("
                  << to_string_barrier(barrier) << ")\n";
#endif
        // TODO: Barrier should already contain the changes for local read.
        send_pipeline_barriers(command_buffer, barrier, true);
      }

#if 0
      std::cout << __func__ << ": node_group=" << group_index
                << ", node_group_range=" << group_node_handles.first() << "-"
                << group_node_handles.last() << ", node_handle=" << node_handle
                << ", node_type=" << node.type
                << ", debug group=" << render_graph.full_debug_group(node_handle) << "\n";
#endif
      node.build_commands(command_buffer, render_graph.storage_, active_pipelines);
    }

    if (rendering_active) {
      /* Suspend rendering as the next node group will contain data transfer/dispatch
       * commands. */
      rendering_active = false;
      if (command_buffer.use_dynamic_rendering) {
        command_buffer.end_rendering();
      }
      else {
        command_buffer.end_render_pass();
      }
      VKRenderGraphNode &rendering_node = render_graph.nodes_[rendering_scope];
      render_graph.storage_.begin_rendering[rendering_node.storage_index]
          .vk_rendering_info.flags = VK_RENDERING_RESUMING_BIT;
    }

    /* Record group post barriers. */
    for (BarrierIndex barrier_index : group_post_barriers_[group_index]) {
      BLI_assert_msg(!rendering_active,
                     "Post group barriers must be executed outside a rendering scope.");
      Barrier &barrier = barrier_list_[barrier_index];
#if 0
      std::cout << __func__ << ": node_group=" << group_index
                << ", node_group_range=" << group_node_handles.first() << "-"
                << group_node_handles.last() << ", post_barrier=(" << to_string_barrier(barrier)
                << ")\n";
#endif
      send_pipeline_barriers(command_buffer, barrier, false);
    }
  }

  finish_debug_groups(command_buffer, debug_groups);
}
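
/* Return true when any input link of the node reads from an input attachment
 * (VK_ACCESS_INPUT_ATTACHMENT_READ_BIT). */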
bool VKCommandBuilder::node_has_input_attachments(const VKRenderGraph &render_graph,
                                                  NodeHandle node)
{
  const VKRenderGraphNodeLinks &links = render_graph.links_[node];
  const Vector<VKRenderGraphLink> &inputs = links.inputs;
  return std::any_of(inputs.begin(), inputs.end(), [](const VKRenderGraphLink &input) {
    return input.vk_access_flags & VK_ACCESS_INPUT_ATTACHMENT_READ_BIT;
  });
}
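
/* Activate the debug group of the given node by ending/beginning debug utils labels on the
 * command buffer until the label stack matches the node's debug group. */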
void VKCommandBuilder::activate_debug_group(VKRenderGraph &render_graph,
VKCommandBufferInterface &command_buffer,
DebugGroups &debug_groups,
NodeHandle node_handle)
{
VKRenderGraph::DebugGroupID debug_group = render_graph.debug_.node_group_map[node_handle];
if (debug_group == debug_groups.active_debug_group_id) {
return;
}
/* Determine the number of pops and pushes that will happen on the debug stack. */
int num_ends = 0;
int num_begins = 0;
if (debug_group == -1) {
num_ends = debug_groups.debug_level;
}
else {
Vector<VKRenderGraph::DebugGroupNameID> &to_group =
render_graph.debug_.used_groups[debug_group];
if (debug_groups.active_debug_group_id != -1) {
Vector<VKRenderGraph::DebugGroupNameID> &from_group =
render_graph.debug_.used_groups[debug_groups.active_debug_group_id];
num_ends = max_ii(from_group.size() - to_group.size(), 0);
int num_checks = min_ii(from_group.size(), to_group.size());
for (int index : IndexRange(num_checks)) {
if (from_group[index] != to_group[index]) {
num_ends += num_checks - index;
break;
}
}
}
num_begins = to_group.size() - (debug_groups.debug_level - num_ends);
}
/* Perform the pops from the debug stack. */
for (int index = 0; index < num_ends; index++) {
command_buffer.end_debug_utils_label();
}
debug_groups.debug_level -= num_ends;
/* Perform the pushes to the debug stack. */
if (num_begins > 0) {
Vector<VKRenderGraph::DebugGroupNameID> &to_group =
render_graph.debug_.used_groups[debug_group];
VkDebugUtilsLabelEXT debug_utils_label = {};
debug_utils_label.sType = VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT;
for (int index : IndexRange(debug_groups.debug_level, num_begins)) {
const VKRenderGraph::DebugGroup &debug_group = render_graph.debug_.groups[to_group[index]];
debug_utils_label.pLabelName = debug_group.name.c_str();
copy_v4_v4(debug_utils_label.color, debug_group.color);
command_buffer.begin_debug_utils_label(&debug_utils_label);
}
}
debug_groups.debug_level += num_begins;
debug_groups.active_debug_group_id = debug_group;
}
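/**
 * End all debug group labels that are still open on the command buffer and reset the debug
 * level so the next command buffer starts with an empty debug stack.
 */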
void VKCommandBuilder::finish_debug_groups(VKCommandBufferInterface &command_buffer,
DebugGroups &debug_groups)
{
for (int i = 0; i < debug_groups.debug_level; i++) {
command_buffer.end_debug_utils_label();
}
debug_groups.debug_level = 0;
}
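/**
 * Build the pipeline barrier that needs to be recorded before the commands of `node_handle`
 * are added to the command buffer. Image and buffer barriers are collected into `r_barrier`.
 */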
void VKCommandBuilder::build_pipeline_barriers(VKRenderGraph &render_graph,
NodeHandle node_handle,
VkPipelineStageFlags pipeline_stage,
LayeredImageTracker &layered_tracker,
Barrier &r_barrier,
bool within_rendering)
{
reset_barriers(r_barrier);
add_image_barriers(
render_graph, node_handle, pipeline_stage, layered_tracker, r_barrier, within_rendering);
add_buffer_barriers(render_graph, node_handle, pipeline_stage, r_barrier);
}
/** \} */

/* -------------------------------------------------------------------- */
/** \name Pipeline barriers
 * \{ */
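/** Reset the source/destination stage masks before collecting the barriers of the next node. */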
void VKCommandBuilder::reset_barriers(Barrier &r_barrier)
{
r_barrier.dst_stage_mask = r_barrier.src_stage_mask = VK_PIPELINE_STAGE_NONE;
}
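/**
 * Record the collected barrier as a single `vkCmdPipelineBarrier` on the command buffer.
 * An empty barrier is skipped. When called inside dynamic rendering only frame-buffer-space
 * stages may be used (see the VUID referenced below).
 */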
void VKCommandBuilder::send_pipeline_barriers(VKCommandBufferInterface &command_buffer,
const Barrier &barrier,
bool within_rendering)
{
if (barrier.is_empty()) {
return;
}
/* When no resources have been used, we can start the barrier at the top of the pipeline.
* It is not allowed to set it to None. */
/* TODO: VK_KHR_synchronization2 allows setting src_stage_mask to NONE. */
VkPipelineStageFlags src_stage_mask = (barrier.src_stage_mask == VK_PIPELINE_STAGE_NONE) ?
VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT :
VkPipelineStageFlagBits(barrier.src_stage_mask);
VkPipelineStageFlags dst_stage_mask = barrier.dst_stage_mask;
/* TODO: this should be done during barrier extraction, making `within_rendering` obsolete. */
if (within_rendering) {
/* See: VUID - `vkCmdPipelineBarrier` - `srcStageMask` - 09556
* If `vkCmdPipelineBarrier` is called within a render pass instance started with
* `vkCmdBeginRendering`, this command must only specify frame-buffer-space stages in
* `srcStageMask` and `dstStageMask`. */
src_stage_mask = dst_stage_mask = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT |
VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT |
VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT |
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
}
Span<VkBufferMemoryBarrier> buffer_barriers = vk_buffer_memory_barriers_.as_span().slice(
barrier.buffer_memory_barriers);
Span<VkImageMemoryBarrier> image_barriers = vk_image_memory_barriers_.as_span().slice(
barrier.image_memory_barriers);
command_buffer.pipeline_barrier(src_stage_mask,
dst_stage_mask,
VK_DEPENDENCY_BY_REGION_BIT,
0,
nullptr,
buffer_barriers.size(),
buffer_barriers.data(),
image_barriers.size(),
image_barriers.data());
}
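/**
 * Add the buffer barriers needed by `node_handle` to `r_barrier`. The barrier stores the range
 * of entries that were appended to `vk_buffer_memory_barriers_` for this node.
 */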
void VKCommandBuilder::add_buffer_barriers(VKRenderGraph &render_graph,
NodeHandle node_handle,
VkPipelineStageFlags node_stages,
Barrier &r_barrier)
{
r_barrier.buffer_memory_barriers = IndexRange(vk_buffer_memory_barriers_.size(), 0);
add_buffer_read_barriers(render_graph, node_handle, node_stages, r_barrier);
add_buffer_write_barriers(render_graph, node_handle, node_stages, r_barrier);
r_barrier.buffer_memory_barriers = r_barrier.buffer_memory_barriers.with_new_end(
vk_buffer_memory_barriers_.size());
}
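/**
 * Add barriers for the buffers that `node_handle` reads from. Reads that are already covered
 * by the current access mask and pipeline stages of the resource are skipped.
 */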
void VKCommandBuilder::add_buffer_read_barriers(VKRenderGraph &render_graph,
NodeHandle node_handle,
VkPipelineStageFlags node_stages,
Barrier &r_barrier)
{
for (const VKRenderGraphLink &link : render_graph.links_[node_handle].inputs) {
if (!link.is_link_to_buffer()) {
continue;
}
const ResourceWithStamp &versioned_resource = link.resource;
VKResourceStateTracker::Resource &resource = render_graph.resources_.resources_.lookup(
versioned_resource.handle);
VKResourceBarrierState &resource_state = resource.barrier_state;
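/* A new stamp indicates the first read since the resource was last written; in that case a
 * barrier is always added, even when the access flags and stages are already covered. */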
const bool is_first_read = resource_state.is_new_stamp();
if (!is_first_read &&
(resource_state.vk_access & link.vk_access_flags) == link.vk_access_flags &&
(resource_state.vk_pipeline_stages & node_stages) == node_stages)
{
/* Has already been covered in a previous call; no need to add this one. */
continue;
}
const VkAccessFlags wait_access = resource_state.vk_access;
r_barrier.src_stage_mask |= resource_state.vk_pipeline_stages;
r_barrier.dst_stage_mask |= node_stages;
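/* On the first read of a new resource stamp the tracked state is replaced; subsequent reads
 * within the same stamp accumulate the access flags and pipeline stages. */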
if (is_first_read) {
resource_state.vk_access = link.vk_access_flags;
resource_state.vk_pipeline_stages = node_stages;
}
else {
resource_state.vk_access |= link.vk_access_flags;
resource_state.vk_pipeline_stages |= node_stages;
}
add_buffer_barrier(resource.buffer.vk_buffer, r_barrier, wait_access, link.vk_access_flags);
}
}
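
/* Add barriers for buffers written by this node. The previous access and pipeline stages of the
 * resource become the barrier source; the tracked state is then replaced by the node's access
 * flags and pipeline stages. */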
void VKCommandBuilder::add_buffer_write_barriers(VKRenderGraph &render_graph,
NodeHandle node_handle,
VkPipelineStageFlags node_stages,
Barrier &r_barrier)
{
for (const VKRenderGraphLink &link : render_graph.links_[node_handle].outputs) {
if (!link.is_link_to_buffer()) {
continue;
}
const ResourceWithStamp &versioned_resource = link.resource;
VKResourceStateTracker::Resource &resource = render_graph.resources_.resources_.lookup(
versioned_resource.handle);
VKResourceBarrierState &resource_state = resource.barrier_state;
const VkAccessFlags wait_access = resource_state.vk_access;
r_barrier.src_stage_mask |= resource_state.vk_pipeline_stages;
r_barrier.dst_stage_mask |= node_stages;
resource_state.vk_access = link.vk_access_flags;
resource_state.vk_pipeline_stages = node_stages;
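/* Only record a buffer barrier when the resource has been accessed before; there is nothing to
 * wait for when the previous access is VK_ACCESS_NONE. */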
if (wait_access != VK_ACCESS_NONE) {
add_buffer_barrier(resource.buffer.vk_buffer, r_barrier, wait_access, link.vk_access_flags);
}
}
}
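
/* Record a VkBufferMemoryBarrier for `vk_buffer`, reusing a barrier already recorded in the
 * current batch when its access masks cover (or can be extended to cover) the request. */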
void VKCommandBuilder::add_buffer_barrier(VkBuffer vk_buffer,
Barrier &r_barrier,
VkAccessFlags src_access_mask,
VkAccessFlags dst_access_mask)
{
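/* First try to merge with a barrier already recorded for this buffer in the current batch. */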
for (VkBufferMemoryBarrier &vk_buffer_memory_barrier :
vk_buffer_memory_barriers_.as_mutable_span().drop_front(
r_barrier.buffer_memory_barriers.start()))
{
if (vk_buffer_memory_barrier.buffer == vk_buffer) {
/* When registering read/write buffers, the node may internally require both read and write
 * access. In that case we extend the dstAccessMask of the read barrier. */
if ((vk_buffer_memory_barrier.dstAccessMask & src_access_mask) == src_access_mask) {
vk_buffer_memory_barrier.dstAccessMask |= dst_access_mask;
return;
}
/* When re-registering resources we can skip the barrier if the access masks already contain
 * all the flags. */
if ((vk_buffer_memory_barrier.dstAccessMask & dst_access_mask) == dst_access_mask &&
(vk_buffer_memory_barrier.srcAccessMask & src_access_mask) == src_access_mask)
{
return;
}
}
}
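/* No existing barrier could be reused; record a new barrier covering the whole buffer. */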
vk_buffer_memory_barriers_.append({VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER,
nullptr,
src_access_mask,
dst_access_mask,
VK_QUEUE_FAMILY_IGNORED,
VK_QUEUE_FAMILY_IGNORED,
vk_buffer,
0,
VK_WHOLE_SIZE});
}
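
/* Collect image barriers and layout transitions for the images this node reads and writes, and
 * store the range of recorded barriers in `r_barrier.image_memory_barriers`. */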
void VKCommandBuilder::add_image_barriers(VKRenderGraph &render_graph,
NodeHandle node_handle,
VkPipelineStageFlags node_stages,
LayeredImageTracker &layered_tracker,
Barrier &r_barrier,
bool within_rendering)
{
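/* Start with an empty range at the current end of the barrier list; it is widened below once the
 * read and write barriers have been collected. */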
r_barrier.image_memory_barriers = IndexRange(vk_image_memory_barriers_.size(), 0);
add_image_read_barriers(
render_graph, node_handle, node_stages, layered_tracker, r_barrier, within_rendering);
add_image_write_barriers(
render_graph, node_handle, node_stages, layered_tracker, r_barrier, within_rendering);
r_barrier.image_memory_barriers = r_barrier.image_memory_barriers.with_new_end(
vk_image_memory_barriers_.size());
}
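
/* Add barriers for images read by this node. Links whose access flags, pipeline stages and image
 * layout are already covered by the tracked resource state are skipped. */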
void VKCommandBuilder::add_image_read_barriers(VKRenderGraph &render_graph,
NodeHandle node_handle,
VkPipelineStageFlags node_stages,
LayeredImageTracker &layered_tracker,
Barrier &r_barrier,
bool within_rendering)
{
for (const VKRenderGraphLink &link : render_graph.links_[node_handle].inputs) {
if (link.is_link_to_buffer()) {
continue;
}
const ResourceWithStamp &versioned_resource = link.resource;
VKResourceStateTracker::Resource &resource = render_graph.resources_.resources_.lookup(
versioned_resource.handle);
VKResourceBarrierState &resource_state = resource.barrier_state;
const bool is_first_read = resource_state.is_new_stamp();
if ((!is_first_read) &&
(resource_state.vk_access & link.vk_access_flags) == link.vk_access_flags &&
(resource_state.vk_pipeline_stages & node_stages) == node_stages &&
resource_state.image_layout == link.vk_image_layout)
{
/* Already covered by a previous barrier, no need to add this one. */
continue;
}
if (within_rendering && link.vk_image_layout != VK_IMAGE_LAYOUT_RENDERING_LOCAL_READ_KHR) {
/* Only local read barriers are allowed inside a rendering scope. */
continue;
}
if (resource_state.image_layout != link.vk_image_layout &&
layered_tracker.contains(resource.image.vk_image))
{
layered_tracker.update(resource.image.vk_image,
link.layer_base,
link.layer_count,
resource_state.image_layout,
link.vk_image_layout,
r_barrier);
continue;
}
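/* Whole-image barrier path: remember the access mask of the previous use as the barrier's
 * source access and widen the barrier's source/destination stage masks. */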
VkAccessFlags wait_access = resource_state.vk_access;
r_barrier.src_stage_mask |= resource_state.vk_pipeline_stages;
r_barrier.dst_stage_mask |= node_stages;
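/* The first read replaces the access/stage masks left by the previous use; subsequent reads
 * accumulate so the tracked state covers every reading stage. */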
if (is_first_read) {
resource_state.vk_access = link.vk_access_flags;
resource_state.vk_pipeline_stages = node_stages;
}
else {
resource_state.vk_access |= link.vk_access_flags;
resource_state.vk_pipeline_stages |= node_stages;
}
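/* Record the image barrier, transitioning from the currently tracked layout to the layout
 * required by this read. */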
add_image_barrier(resource.image.vk_image,
r_barrier,
wait_access,
link.vk_access_flags,
resource_state.image_layout,
link.vk_image_layout,
link.vk_image_aspect);
resource_state.image_layout = link.vk_image_layout;
}
}
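/* Add barriers for the images written by the given node: transition each image to the layout
 * required by the write, respecting the local-read restriction inside a rendering scope and the
 * per-layer tracking of attachment images. */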
void VKCommandBuilder::add_image_write_barriers(VKRenderGraph &render_graph,
NodeHandle node_handle,
VkPipelineStageFlags node_stages,
LayeredImageTracker &layered_tracker,
Barrier &r_barrier,
bool within_rendering)
{
for (const VKRenderGraphLink link : render_graph.links_[node_handle].outputs) {
if (link.is_link_to_buffer()) {
continue;
}
const ResourceWithStamp &versioned_resource = link.resource;
VKResourceStateTracker::Resource &resource = render_graph.resources_.resources_.lookup(
versioned_resource.handle);
VKResourceBarrierState &resource_state = resource.barrier_state;
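/* Capture the access mask of the previous use before the tracked state is overwritten below;
 * it is used as the barrier's source access. */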
const VkAccessFlags wait_access = resource_state.vk_access;
if (within_rendering && link.vk_image_layout != VK_IMAGE_LAYOUT_RENDERING_LOCAL_READ_KHR) {
/* Only local read barriers are allowed inside a rendering scope. */
continue;
}
if (layered_tracker.contains(resource.image.vk_image) &&
resource_state.image_layout != link.vk_image_layout)
{
layered_tracker.update(resource.image.vk_image,
link.layer_base,
link.layer_count,
resource_state.image_layout,
link.vk_image_layout,
r_barrier);
continue;
}
r_barrier.src_stage_mask |= resource_state.vk_pipeline_stages;
r_barrier.dst_stage_mask |= node_stages;
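/* A write resets the tracked state: subsequent users only need to wait for this write. */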
resource_state.vk_access = link.vk_access_flags;
resource_state.vk_pipeline_stages = node_stages;
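/* Skip the barrier when nothing needs to be waited on and the image is already in the
 * required layout. */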
if (wait_access != VK_ACCESS_NONE || link.vk_image_layout != resource_state.image_layout) {
add_image_barrier(resource.image.vk_image,
r_barrier,
wait_access,
link.vk_access_flags,
resource_state.image_layout,
link.vk_image_layout,
link.vk_image_aspect);
resource_state.image_layout = link.vk_image_layout;
}
}
}
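/**
 * Add an image memory barrier for `vk_image` to the barrier that is being built in `r_barrier`.
 *
 * When the same image already has a barrier in the current batch, the requested access is merged
 * into it, or the request is skipped when the existing barrier already covers the requested
 * access masks and no layout change is needed. Otherwise a new `VkImageMemoryBarrier` covering
 * all mip levels of the given layer range is appended.
 */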
void VKCommandBuilder::add_image_barrier(VkImage vk_image,
Barrier &r_barrier,
VkAccessFlags src_access_mask,
VkAccessFlags dst_access_mask,
VkImageLayout old_layout,
VkImageLayout new_layout,
VkImageAspectFlags aspect_mask,
uint32_t layer_base,
uint32_t layer_count)
{
BLI_assert(aspect_mask != VK_IMAGE_ASPECT_NONE);
for (VkImageMemoryBarrier &vk_image_memory_barrier :
vk_image_memory_barriers_.as_mutable_span().drop_front(
r_barrier.image_memory_barriers.start()))
{
if (vk_image_memory_barrier.image == vk_image) {
/* When registering read/write resources, a node can internally require both read and write
 * access. In that case we extend the dstAccessMask of the existing read barrier. Examples are
 * EEVEE's HiZ update compute shader and shadow tagging. */
if ((vk_image_memory_barrier.dstAccessMask & src_access_mask) == src_access_mask) {
vk_image_memory_barrier.dstAccessMask |= dst_access_mask;
return;
}
/* When re-registering resources we can skip the barrier if the access masks already contain
 * all the requested flags and the image layout does not change. */
if ((vk_image_memory_barrier.dstAccessMask & dst_access_mask) == dst_access_mask &&
(vk_image_memory_barrier.srcAccessMask & src_access_mask) == src_access_mask &&
old_layout == new_layout)
{
return;
}
}
}
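/* No existing barrier could be reused: append a new barrier covering all mip levels of the
 * requested layer range. */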
vk_image_memory_barriers_.append(
{VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
nullptr,
src_access_mask,
dst_access_mask,
old_layout,
new_layout,
VK_QUEUE_FAMILY_IGNORED,
VK_QUEUE_FAMILY_IGNORED,
vk_image,
{aspect_mask, 0, VK_REMAINING_MIP_LEVELS, layer_base, layer_count}});
}
/** \} */
/* -------------------------------------------------------------------- */
/** \name Sub-resource tracking
* \{ */
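/**
 * Start layer tracking for the rendering scope that begins at `node_handle`.
 *
 * Records the `VkImage` handle of every output attachment of the `BEGIN_RENDERING` node that has
 * multiple layers, so the layouts of individual layers can be tracked until the scope ends.
 */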
void VKCommandBuilder::LayeredImageTracker::begin(const VKRenderGraph &render_graph,
NodeHandle node_handle)
{
BLI_assert(render_graph.nodes_[node_handle].type == VKNodeType::BEGIN_RENDERING);
layered_attachments.clear();
layered_bindings.clear();
const VKRenderGraphNodeLinks &links = render_graph.links_[node_handle];
for (const VKRenderGraphLink &link : links.outputs) {
VKResourceStateTracker::Resource &resource = render_graph.resources_.resources_.lookup(
link.resource.handle);
if (resource.has_multiple_layers()) {
layered_attachments.add(resource.image.vk_image);
}
}
}
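/**
 * Track the layout of a range of layers of `vk_image` (starting at `layer`) inside the current
 * rendering scope.
 *
 * When the layer is already tracked in `new_layout` nothing is added. Otherwise the layer range
 * is recorded and an image barrier transitioning it from `old_layout` to `new_layout` is added to
 * `r_barrier`.
 */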
void VKCommandBuilder::LayeredImageTracker::update(VkImage vk_image,
uint32_t layer,
uint32_t layer_count,
VkImageLayout old_layout,
VkImageLayout new_layout,
Barrier &r_barrier)
{
for (const TrackedImage &binding : layered_bindings) {
if (binding.vk_image == vk_image && binding.layer == layer) {
BLI_assert_msg(binding.vk_image_layout == new_layout,
"Transitioning the same layer multiple times during a single rendering "
"scope isn't supported.");
/* Early exit as the layer is already in the correct layout. This is the common case: we expect
 * multiple draw commands within a rendering scope to access the same layer. */
return;
}
}
layered_bindings.append({vk_image, new_layout, layer, layer_count});
/* We should be able to do better: `ALL_COMMANDS` is really a worst case barrier. */
r_barrier.src_stage_mask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT;
r_barrier.dst_stage_mask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT;
command_builder.add_image_barrier(vk_image,
r_barrier,
VK_ACCESS_TRANSFER_WRITE_BIT,
VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT |
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT |
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
VK_ACCESS_TRANSFER_READ_BIT | VK_ACCESS_TRANSFER_WRITE_BIT,
old_layout,
new_layout,
VK_IMAGE_ASPECT_COLOR_BIT,
layer,
layer_count);
}
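/**
 * End the current rendering scope: transition all tracked layers back to the attachment layout
 * and clear the tracked state.
 */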
void VKCommandBuilder::LayeredImageTracker::end(Barrier &r_barrier, bool use_local_read)
{
suspend(r_barrier, use_local_read);
layered_attachments.clear();
layered_bindings.clear();
}
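/**
 * Suspend the current rendering scope: add image barriers that transition all tracked layers back
 * to `VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL` (or `VK_IMAGE_LAYOUT_RENDERING_LOCAL_READ_KHR`
 * when local read is used). Does nothing when no layers are tracked.
 */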
void VKCommandBuilder::LayeredImageTracker::suspend(Barrier &r_barrier, bool use_local_read)
{
if (layered_bindings.is_empty()) {
return;
}
command_builder.reset_barriers(r_barrier);
/* We should be able to do better: `ALL_COMMANDS` is really a worst case barrier. */
r_barrier.src_stage_mask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT;
r_barrier.dst_stage_mask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT;
int64_t start_index = command_builder.vk_image_memory_barriers_.size();
r_barrier.image_memory_barriers = IndexRange::from_begin_size(start_index, 0);
for (const TrackedImage &binding : layered_bindings) {
command_builder.add_image_barrier(
binding.vk_image,
r_barrier,
VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT |
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
VK_ACCESS_TRANSFER_READ_BIT | VK_ACCESS_TRANSFER_WRITE_BIT,
VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT |
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
VK_ACCESS_TRANSFER_READ_BIT | VK_ACCESS_TRANSFER_WRITE_BIT,
binding.vk_image_layout,
use_local_read ? VK_IMAGE_LAYOUT_RENDERING_LOCAL_READ_KHR :
VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
VK_IMAGE_ASPECT_COLOR_BIT,
binding.layer,
binding.layer_count);
r_barrier.image_memory_barriers = r_barrier.image_memory_barriers.with_new_end(
command_builder.vk_image_memory_barriers_.size());
#if 0
std::cout << __func__ << ": transition layout image=" << binding.vk_image
<< ", layer=" << binding.layer << ", count=" << binding.layer_count
<< ", from_layout=" << to_string(binding.vk_image_layout)
<< ", to_layout=" << to_string(VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) << "\n";
#endif
}
}
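/**
 * Resume a suspended rendering scope: add image barriers that transition all tracked layers from
 * the attachment layout back to the layout they were tracked in. Does nothing when no layers are
 * tracked.
 */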
void VKCommandBuilder::LayeredImageTracker::resume(Barrier &r_barrier, bool use_local_read)
{
if (layered_bindings.is_empty()) {
return;
}
command_builder.reset_barriers(r_barrier);
/* We should be able to do better: `ALL_COMMANDS` is really a worst case barrier. */
r_barrier.src_stage_mask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT;
r_barrier.dst_stage_mask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT;
int64_t start_index = command_builder.vk_image_memory_barriers_.size();
r_barrier.image_memory_barriers = IndexRange::from_begin_size(start_index, 0);
for (const TrackedImage &binding : layered_bindings) {
command_builder.add_image_barrier(
binding.vk_image,
r_barrier,
VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT |
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
VK_ACCESS_TRANSFER_READ_BIT | VK_ACCESS_TRANSFER_WRITE_BIT,
VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT |
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
VK_ACCESS_TRANSFER_READ_BIT | VK_ACCESS_TRANSFER_WRITE_BIT,
use_local_read ? VK_IMAGE_LAYOUT_RENDERING_LOCAL_READ_KHR :
VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
binding.vk_image_layout,
VK_IMAGE_ASPECT_COLOR_BIT,
binding.layer,
binding.layer_count);
#if 0
std::cout << __func__ << ": transition layout image=" << binding.vk_image
<< ", layer=" << binding.layer << ", count=" << binding.layer_count
<< ", from_layout=" << to_string(VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL)
<< ", to_layout=" << to_string(binding.vk_image_layout) << "\n";
#endif
}
}
/** \} */
/* -------------------------------------------------------------------- */
/** \name Debugging tools
* \{ */
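/** Build a human readable description of `barrier` and its buffer/image memory barriers. */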
std::string VKCommandBuilder::to_string_barrier(const Barrier &barrier)
{
std::stringstream ss;
ss << "src_stage_mask=" << to_string_vk_pipeline_stage_flags(barrier.src_stage_mask)
<< ", dst_stage_mask=" << to_string_vk_pipeline_stage_flags(barrier.dst_stage_mask) << "\n";
for (const VkBufferMemoryBarrier &buffer_memory_barrier :
vk_buffer_memory_barriers_.as_span().slice(barrier.buffer_memory_barriers))
{
ss << " - src_access_mask=" << to_string_vk_access_flags(buffer_memory_barrier.srcAccessMask)
<< ", dst_access_mask=" << to_string_vk_access_flags(buffer_memory_barrier.dstAccessMask)
<< ", vk_buffer=" << to_string(buffer_memory_barrier.buffer) << "\n";
}
for (const VkImageMemoryBarrier &image_memory_barrier :
vk_image_memory_barriers_.as_span().slice(barrier.image_memory_barriers))
{
ss << " - src_access_mask=" << to_string_vk_access_flags(image_memory_barrier.srcAccessMask)
<< ", dst_access_mask=" << to_string_vk_access_flags(image_memory_barrier.dstAccessMask)
<< ", vk_image=" << to_string(image_memory_barrier.image)
<< ", old_layout=" << to_string(image_memory_barrier.oldLayout)
<< ", new_layout=" << to_string(image_memory_barrier.newLayout)
<< ", subresource_range=" << to_string(image_memory_barrier.subresourceRange, 2) << "\n";
}
return ss.str();
}
/** \} */
} // namespace blender::gpu::render_graph