Commit Graph

27 Commits

Author SHA1 Message Date
Jeroen Bakker
13d6ba1f62 Vulkan: Reduce memory overhead rendergraph
For performance reasons render graphs can keep memory allocated so it
could be reused. This PR optimizes the memory usage inside the
rendergraph to keep it within normal usage.

I didn't detect any performance regression with this change but reduces
the memory when performing final image rendering of heavy scenes.

Partial fix for #137382. the amount of memory still increases with 4mb
per render. It fixes the main difference when using large scenes.

Pull Request: https://projects.blender.org/blender/blender/pulls/137660
2025-04-17 13:45:47 +02:00
Jeroen Bakker
e6b3cc8983 Vulkan: Device command builder
This PR implements a new the threading model for building render graphs
based on tests performed last month. For out workload multithreaded
command building will block in the driver or device. So better to use a
single thread for command building.

Details of the internal working is documented at https://developer.blender.org/docs/features/gpu/vulkan/render_graph/

- When a context is activated on a thread the context asks for a
  render graph it can use by calling `VKDevice::render_graph_new`.
- Parts of the GPU backend that requires GPU commands will add a
  specific render graph node to the render graph. The nodes also
  contains a reference to all resources it needs including the
  access it needs and the image layout.
- When the context is flushed the render graph is submitted to the
  device by calling `VKDevice::render_graph_submit`.
- The device puts the render graph in `VKDevice::submission_pool`.
- There is a single background thread that gets the next render
  graph to send to the GPU (`VKDevice::submission_runner`).
  - Reorder the commands of the render graph to comply with Vulkan
    specific command order rules and reducing possible bottlenecks.
    (`VKScheduler`)
  - Generate the required barriers `VKCommandBuilder::groups_extract_barriers`.
    This is a separate step to reduce resource locking giving other
    threads access to the resource states when they are building
    the render graph nodes.
  - GPU commands and pipeline barriers are recorded to a VkCommandBuffer.
    (`VKCommandBuilder::record_commands`)
  - When completed the command buffer can be submitted to the device
    queue. `vkQueueSubmit`
  - Render graphs that have been submitted can be reused by a next
    thread. This is done by pushing the render graph to the
    `VKDevice::unused_render_graphs` queue.

Pull Request: https://projects.blender.org/blender/blender/pulls/132681
2025-01-27 08:55:23 +01:00
Jeroen Bakker
ff804882bd Refactor: Vulkan: Store large data in separate vectors
VKRenderGraphNode is 892 bytes and most of the bytes are used for
specific nodes. By storing large structs in separate vectors we can
reduce the needed memory and improve cache pre-fetching.

With this change the VKRenderGraphNode is reduced to 64 bytes.

On a (50 frames shader_balls.blend) the end user performance is improved by
2%.

| **Platform**     | **Before** | **After** |
| ---------------- | ---------- | --------- |
| AMD W7700        | 1409 ms    | 1383 ms   |
| NVIDIA RTX 6000  | 1443 ms    | 1428 ms   |

Pull Request: https://projects.blender.org/blender/blender/pulls/133317
2025-01-21 14:49:23 +01:00
Jeroen Bakker
390ca01685 Cleanup: Vulkan: Remove resource ownership
Images used to be tracked with ownership in order to reset swap chain
images to its original layout. This isn't used anymore as we always mark
them in VK_IMAGE_LAYOUT_UNDEFINED to make the first pipeline barrier a
nop.

This change reduces unneeded complexity and safe a few CPU cycles.

Pull Request: https://projects.blender.org/blender/blender/pulls/133197
2025-01-17 14:46:22 +01:00
Jeroen Bakker
fbe05ac60b Cleanup: Vulkan: Remove unused code/parameters
Initial design had a more complex use case for render graphs.
They are not really used and will not in the near term. This PR
removes some code that doesn't do a thing

Pull Request: https://projects.blender.org/blender/blender/pulls/133047
2025-01-14 14:09:50 +01:00
Jiarui-Yan
b356c2bec1 Vulkan: Colored debug groups
This PR adds color to debug group so that RenderDoc can give color to different
debug groups and sub-groups in the parent debug group.

Ref: #124099

Pull Request: https://projects.blender.org/blender/blender/pulls/129340
2024-11-29 07:52:00 +01:00
Jeroen Bakker
c69b107a28 Fix #130121: Vulkan: Lightbaking resources freed to early
When lighting baking is used in a background render the resources are
freed to early. The cause is that light baking does some initialization
within a context, that isn't send to the GPU. The first iteration of
light baking is expecting that it can free resources, what leads to GPU
resources to be deleted that are still used by commands that are
scheduled to be send to the GPU.

This PR fixes this by using multiple resource pools when background
rendering and ensure that contexts are send to the GPU when rendering
ends.

Pull Request: https://projects.blender.org/blender/blender/pulls/131094
2024-11-28 16:05:59 +01:00
Jeroen Bakker
faf2c6f0ca Vulkan: Add method to query debug group of node
When debugging render graph the debug group name can narrow down the
place where a node originates from. This PR adds a function to retrieve
the full debug group name of a specific node.

Pull Request: https://projects.blender.org/blender/blender/pulls/131081
2024-11-28 12:00:21 +01:00
Campbell Barton
b9f055459a Cleanup: ensure trailing space around comment blocks 2024-11-27 19:01:00 +11:00
Jeroen Bakker
acb205763e Fix: Vulkan: Cycles CPU Synchronization
When using cycles in the viewport there it uses render threads and
workers to update the viewport. All threads can record commands to the
queue and needs to be synchronized. This didn't happen leading to
incorrect renders.

Detected when researching #128608

Pull Request: https://projects.blender.org/blender/blender/pulls/128975
2024-10-14 15:38:22 +02:00
Jeroen Bakker
4d8581c7ee Vulkan: Add node to update buffers
This change adds the option to update a buffer via the render
graph via `vkCmdUpdateBuffer`. This is only enabled for
uniform buffers as they are small and aligned/sized correctly.

Pull Request: https://projects.blender.org/blender/blender/pulls/128416
2024-10-01 14:22:56 +02:00
Jeroen Bakker
7a7ae5defe Vulkan: Occlusion Queries
Implements occlusion queries using VkQueryPools.

Pull Request: https://projects.blender.org/blender/blender/pulls/123883
2024-08-20 11:27:33 +02:00
Jeroen Bakker
ef3ceb3629 Vulkan: Resource Pools
This PR implements #126353; In short: keep discard list as part of swap chain images. This allows
better determination when resources are actually not in use anymore.

## Resource pool

Resource pools keep track of the resources for a swap chain image.

In Blender this is a bit more complicated due to the way GPUContext work. A single thread can have
multiple contexts. Some of them have a swap chain (GHOST Window) other don't (draw manager). The
resource pool should be shared between the contexts running on the same thread.

When opening multiple windows there are also multiple swap chains to consider.

### Discard pile

Resource handles that are deleted and stored in the discard pile. When we are sure that these
resources are not used on the GPU anymore these are destroyed.

### Reusable resources

There are other resources as well like:
- Descriptor sets
- Descriptor pools

## Open issues

There are some limitations that require future PRs to fix including:
- Background rendering
- Handling multiple windows
- Improve CPU/GPU synchronization
- Reuse staging buffers

Pull Request: https://projects.blender.org/blender/blender/pulls/126353
2024-08-19 15:37:48 +02:00
Jeroen Bakker
0d71d83d47 Vulkan: Share rendergraphs on context inside same thread
This PR will share render graphs between all contexts that run in
the same thread. This allows the draw manager commands to be added
to the same render graph as the UI.

- Fixes debug groups hiearchy. Draw manager would restart a hierarchy as
  it wasn't aware of the debug groups already added by the UI
- Removes cpu sync when switching between contexts.

In a future change this is needed to improve discarding resources.

Pull Request: https://projects.blender.org/blender/blender/pulls/124715
2024-07-15 16:03:51 +02:00
Jeroen Bakker
bf3c6a3480 Vulkan: Improve debugging render graph
Adds debug print function to output a node with its inputs and outputs.
Also keep track of the name of the resource (only images) what will
be presented. Tracking of the resource name is only done in debug builds.

Pull Request: https://projects.blender.org/blender/blender/pulls/124033
2024-07-02 13:29:34 +02:00
Jeroen Bakker
ee0b7b9a95 Vulken: Mix array aspect of image views
The image views type can change depending based on how they are bound
to shaders. When a shader accesses a view without array operations,
the image view should not be an array. This was previously ignored.

Pull Request: https://projects.blender.org/blender/blender/pulls/123726
2024-06-25 15:15:18 +02:00
Jeroen Bakker
f5b173188e Vulkan: Fix incorrect read image barrier
When having a sequential read image barriers for the same resource
and the second one requires an image layout transition the incorrect
barriers where generated.

This was fixed by aligning the implementation with write image barriers.

Pull Request: https://projects.blender.org/blender/blender/pulls/123712
2024-06-25 11:05:07 +02:00
Jeroen Bakker
4353b7ffba Vulkan: Remove unused code
Vulkan backend has recently switched to a render graph approach. Many
code was left so we could develop the render graph beside the previous
implementation. Last week we removed the switch. This PR will remove
most of the unused code. There might be some left and will be removed
when detected.

Pull Request: https://projects.blender.org/blender/blender/pulls/123422
2024-06-20 11:34:19 +02:00
Jeroen Bakker
22d352dacc Vulkan: Render graph drawing
This PR adds drawing support to the render graph. It adds support for
draw, indirect draw, indexed draw and indexed indirect draw.

Draw commands can only be executed within a rendering scope. Data
transfer commands and dispatch commands cannot be executed within a
rendering scope. Blender can still send in commands in any order and
the render graph needs to find out the best order to minimize context
switches (rendering/begin/end). This is the responsibility of the
scheduler.

The scheduler will push data transfer and dispatch commands outside the
rendering scope:
- data transfer and dispatch commands at the beginning are done before
  the rendering begin.
- data transfer and dispatch commands at the end are done after the
  rendering end.
- data transfer and dispatches in between draw commands will be pushed
  to the beginning if they are not yet being used.
- for all other data transfer and dispatch commands the rendering is
  suspenderd and will be continued afterwards.

Within a rendering context it is not allowed to perform synchronization
commands. Any synchronization commands inside a rendering scope will be
performed before the rendering scope begins. Nodes are now organized
in groups to simplify the code around this area.

Pull Request: https://projects.blender.org/blender/blender/pulls/123168
2024-06-13 09:37:17 +02:00
Jeroen Bakker
5ee46b0ef3 Vulkan: Use macros to reduce code replication
Use macros to reduce code replication.

- `VKRenderGraph::add_node`
- `VKRenderGraphNode::build_commands`
- `VKRenderGraphNode::free_data`

Pull Request: https://projects.blender.org/blender/blender/pulls/122141
2024-05-23 11:21:55 +02:00
Jeroen Bakker
c3c4e948b1 Vulkan: Fixing issues in debugging groups
* Debugging groups were not being applied as that part of the code
  wasn't ported to the original patch
* Debugging groups didn't account for nodes that weren't owned by
  any debug group.

Pull Request: https://projects.blender.org/blender/blender/pulls/122136
2024-05-23 09:48:25 +02:00
Jeroen Bakker
488e74a209 Vulkan: Render graph debug groups
This PR implements debug groups in the render graph. Each node contains
a reference to the debug group they belong to. During scheduling the
nodes can be reordered and the correct debug group needs to be
activated.

This is done by keeping track of the current debug group. When a
different debug group is needed, the needed ends/begins are added
to the command buffer.

This mechanism also cleans up debug groups that are not used at all
as they don't have any nodes associated to it.

Pull Request: https://projects.blender.org/blender/blender/pulls/122054
2024-05-21 17:34:55 +02:00
Jeroen Bakker
40f2df1a81 Vulkan: Render graph clear attachments
Adds support for clear attachments to the render graph.
Clearing attachments require that the command buffer is
being set for rendering (vkCmdBeginRendering).

**What works**

- Clear attachments are working with dynamic rendering and the render graph
- `GPUVulkanTest.framebuffer_clear_color_single_attachment`
- `GPUVulkanTest.framebuffer_clear_color_multiple_attachments`
- `GPUVulkanTest.framebuffer_clear_multiple_color_multiple_attachments`
- `GPUVulkanTest.framebuffer_scissor_test`
- `GPUVulkanTest.framebuffer_clear_depth`

**What still needs to be addressed**

When a command buffer is rendering it cannot do any pipeline
barriers. For clearing attachments this isn't needed, but when
we address drawing we might need to register drawing resource
dependencies to the dependency list of the begin rendering node.
This will be solved when drawing will be implemented. [#121648]

The begin rendering node is large as it has to store data for
framebuffers with 8 color attachments as well. This might
become an overhead and we could solve this by splicing the
data of larger nodes into 2 lists. This will be addressed
after we are able to draw a screen so we can collect data
on this topic. [#121649]

Pull Request: https://projects.blender.org/blender/blender/pulls/121073
2024-05-10 15:39:56 +02:00
Jeroen Bakker
c8ccf77564 Vulkan: Render graph dispatch indirect
Add dispatch indirect node. Also refactored the dispatch (direct) node
so more logic could be reused. The context only stores a `VKResourceAccessInfo`
struct which is reused by both the dispatch and dispatch indirect node.

Pull Request: https://projects.blender.org/blender/blender/pulls/120993
2024-04-24 21:28:45 +02:00
Campbell Barton
019d3ef939 Cleanup: spelling in comments 2024-04-24 10:48:45 +10:00
Jeroen Bakker
be75f1ac2b Vulkan: Render graph textures
This PR implements render graph for VKTexture. During the
implementation some tweaks to the render graph was done
to support depth and stencil textures.

The render graph will record the image aspect being used
for each node. This will then be used to generate barriers
for the correct aspect.

Also fixes an issue that uploading of array textures didn't
allocate a large enough staging buffer.

Pull Request: https://projects.blender.org/blender/blender/pulls/120821
2024-04-19 14:55:39 +02:00
Jeroen Bakker
adab06bc67 Vulkan: Render graph core
**Design Task**: blender/blender#118330

This PR adds the core of the render graph. The render graph isn't used.
Current implementation of the Vulkan Backend is slow by design. We
focused on stability, before performance. With the new introduced render
graph the focus will shift to performance and keep the stability at where
it is.

Some highlights:
- Every context will get its own render graph. (`VKRenderGraph`).
- Resources (and resource state tracking) is device specific (`VKResourceStateTracker`).
- No node reordering / sub graph execution has been implemented. Currently
  All nodes in the graph is executed in the order they were added. (`VKScheduler`).
- The links inside the graph describe the resources the nodes read from (input links)
  or writes to (output links)
- When resources are written to a resource stamp is incremented allowing keeping
  track of which nodes needs which stamp of a resource.
- At each link the access information (how does the node accesses the resource)
  and image layout (for image resources) are stored. This allows the render graph
  to find out how a resource was used in the past and will be used in the future.
  That is important to construct pipeline barriers that don't stall the whole GPU.

# Defined nodes

This implementation has nodes for:
- Blit image
- Clear color image
- Copy buffers to buffers
- Copy buffers to images
- Copy images to images
- Copy images to buffers
- Dispatch compute shader
- Fill buffers
- Synchronization

Each node has a node info, create info and data struct. The create info
contains all data to construct the node, including the links of the graph.
The data struct only contains the data stored inside the node. The node info
contains the node specific implementation.

> NOTE: Other nodes will be added after this PR lands to main.

# Resources

Before a render graph can be used, the resources should be registered
to `VKResourceStateTracker`. In the final implementation this will be owned by
the `VKDevice`. Registration of resources can be done by calling
`VKResources.add_buffer` or `VKResources.add_image`.

# Render graph

Nodes can be added to the render graph. When adding a node its read/
write dependencies are extracted and converted into links (`VKNodeInfo.
build_links`).
When the caller wants to have a resource up to date the functions
`VKRenderGraph.submit_for_read` or `VKRenderGraph.submit_for_present`
can be called.

These functions will select and order the nodes that are needed
and convert them to `vkCmd*` commands. These commands include pipeline
barrier and image layout transitions.

The `vkCmd` are recorded into a command buffer which is sent to the
device queue.

## Walking the graph

Walking the render graph isn't implemented yet. The idea is to have a
`Map<ResourceWithStamp, Vector<NodeHandle>> consumers` and
`Map<ResourceWithStamp, NodeHandle> producers`. These attributes can
be stored in the render graph and created when building the links, or
can be created inside the VKScheduler as a variable. The exact detail
which one would be better is unclear as there aren't any users yet. At
the moment the scheduler would need them we need to figure out the best
way to store and retrieve the consumers/producers.

# Unit tests

The render graph can be tested by enabling `WITH_GTEST` and use
`vk_render_graph` as a filter.

```
bin/tests/blender_test --gtest_filter="vk_render_graph*"
```

Pull Request: https://projects.blender.org/blender/blender/pulls/120427
2024-04-19 10:46:50 +02:00