Commit Graph

111 Commits

Author SHA1 Message Date
Jeroen Bakker
e24e58f293 Vulkan: Recycle descriptor pools
This PR adds recycling of descriptor pools. Currently descriptor pools
are discarded when full or context is flushed. This PR allows
descriptor pools to be discarded for reuse.
It is also more conservative and only discard
Descriptor pools when they are full or fragmented.

When using the Vulkan backend a small amount of descriptor memory can leak. Even
when we clean up all resources, drivers can still keep data around on the
GPU. Eventually this can lead to out of memory issues depending on how
the GPU driver actually manages descriptor sets.

When the descriptor sets of the descriptor pool aren't used anymore
the VKDiscardPool will recycle the pools back to its original VKDescriptorPools.
It needs to be the same instance as descriptor pools/sets are owned by
a single thread.

Pull Request: https://projects.blender.org/blender/blender/pulls/144992
2025-08-22 17:11:26 +02:00
Jeroen Bakker
0ea1feabd9 Vulkan: HDR support for Windows
This PR adds HDR support for Windows for `VK_COLOR_SPACE_EXTENDED_SRGB_LINEAR_EXT`
on `VK_FORMAT_R16G16B16A16_SFLOAT` swapchains .

For nonlinear surface formats (sRGB and extended sRGB) the back buffer is blit into the swapchain,
When VK_COLOR_SPACE_EXTENDED_SRGB_LINEAR_EXT is used as surface format a compute shader
is used to flip and invert the gamma.

SDR white level is updated from a few window event changes, but actually
none of them immediately respond to SDR white level changes in the system.
That requires using the WinRT API, which we don't do so far.

Current limitations:
- Intel GPU support
- Dual GPU support

In the future we may add controls inside Blender for absolute HDR nits,
across different platforms. But this makes behavior closer to macOS.

See !144565 for details

Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/144717
2025-08-22 10:11:55 +02:00
Jeroen Bakker
ea83567811 Vulkan: Destroy resources in submission thread
This PR moves the responsibility of destroying discarded resources to
the submission thread. Previous implementation could be blocked and
would not always run.

This solves memory leak when rendering in background and keeps the
overall memory usage lower as all is done in a single location.

Pull Request: https://projects.blender.org/blender/blender/pulls/144440
2025-08-12 15:58:49 +02:00
Clément Foucault
1388a70914 GPU: Remove wrapper type for gpu::Shader
This is the first step into merging DRW_gpu_wrapper.hh into
the GPU module.

This is very similar to #119825.

Pull Request: https://projects.blender.org/blender/blender/pulls/144229
2025-08-11 09:34:28 +02:00
Jeroen Bakker
073b4d4d7b Fix #144048: Vulkan: Crash minimized windows
Swapchain handling of minimized windows wasn't correct. On some
platforms it still tried to create images with no surface.

This PR will discard swapchains of minimized windows, but still being
able to flush the render graph and free resources.

Pull Request: https://projects.blender.org/blender/blender/pulls/144189
2025-08-08 11:03:02 +02:00
Jeroen Bakker
cd00d8ca00 Fix: Vulkan: Use after free when switching scenes
Unreported issue introduced from recent changes. (memory leak in immediate mode)

Pull Request: https://projects.blender.org/blender/blender/pulls/144116
2025-08-07 08:56:59 +02:00
Jeroen Bakker
1f47a51335 Fix #142305: Vulkan: Memory leak immediate mode
Resetting and recycling of immediate drawing buffers was never done and
would leak memory as the buffers were only destroyed when Blender
exited.

This is solved by not recycling or resetting the buffers and rely on the
discard pools. Additional cleanup of removing unused code-paths is also
part of this change so it can be backported to 4.5.

Pull Request: https://projects.blender.org/blender/blender/pulls/143995
2025-08-05 14:40:54 +02:00
Clément Foucault
32d64d35bb Refactor: GPU: Texture: Replace eGPUTextureFormat by TextureFormat
This offers better semantic and safety of the API.

Part of #130632

Pull Request: https://projects.blender.org/blender/blender/pulls/142818
2025-07-22 14:58:54 +02:00
Jeroen Bakker
4d9c5ebd97 Vulkan: Move Wayland/HDR support out of experimental
This PR moves Wayland/HDR support out of experimental.
This allows more people to test and provide feedback. We
can always decide later to disable it for the release, but so
far we only got positive feedback.

Pull Request: https://projects.blender.org/blender/blender/pulls/141666
2025-07-09 13:24:31 +02:00
Jeroen Bakker
e63a20fee1 Vulkan: HDR support for Wayland
This change enables HDR support for wayland as an experimental feature.
It supports both non-linear extended sRGB and un-clamped sRGB.

Windows isn't supported as the HDR settings are not accessible via an
API and would require similar settings that games use to configure the
monitor. Adding those sliders isn't what we would like to add.

Vulkan (working group) is working on new extensions that might change
the shortcomings. It isn't clear yet what the extension will do and what
the impact is for applications that want to use it. When the extension
is out we should review at the situation again.

Pull Request: https://projects.blender.org/blender/blender/pulls/133159
2025-06-24 11:51:14 +02:00
Jeroen Bakker
ea1652dca3 Fix #140229: Vulkan: Crash during depth aware-navigation
Depth navigation sends many small render graphs to the device. It can be
that a subsequent render graph uses the same shader as the previous one
with the same descriptor set tracker. The descriptor set tracker didn't
cleared its full state and a subsequent render graph was generating
commands assuming that the device was in a certain state.

However it wasn't and a command to bind a descriptor set was skipped
resulting in a device out of bound write. Depending on the platform this
could overwrite any data on the GPU, including shader programs as the
select shader writes to a storage buffer. This clarifies why the issue
resulted in very odd and none consistent behavior.

This PR fixes this by clearing the VKPipelineData and
VKDescriptorTracker.

Pull Request: https://projects.blender.org/blender/blender/pulls/140526
2025-06-17 11:17:03 +02:00
Clément Foucault
023865b314 Vulkan: Add CPU profiling
This has limited use cases since it doesn't
profile the heavy part of the vulkan backend.

Almost 1:1 port of the metal implementation from #139551.
Doesn't cover rendergraph submission nor GPU timings.

Pull Request: https://projects.blender.org/blender/blender/pulls/139899
2025-06-06 14:39:51 +02:00
Jeroen Bakker
66d361bd29 Vulkan: Add support for descriptor buffers
Descriptor sets/pools are known to be troublesome as it doesn't match
how GPUs work, or how application want to work, adding more complexity
than needed. This results is quite an overhead allocating and
deallocating descriptor sets.

This PR will use descriptor buffers when they are available. Most platforms
support descriptor buffers. When not available descriptor pools/sets
will be used.

Although this is a feature I would like to land it in 4.5 due to the API changes.
This makes it easier to fix issues when 4.5 is released.
The feature can easily be disabled by setting the feature to false if it has
to many problems.

Pull Request: https://projects.blender.org/blender/blender/pulls/138266
2025-06-06 10:20:36 +02:00
Jeroen Bakker
283a267c13 Fix #139558: Incorrect shader when using edge slide even
In this case a triangle shader was used to render points.

This change entails:
- Using point shaders in this case
- Add support for `GPU_point_size` to update the uniform of the point
  shader.

Pull Request: https://projects.blender.org/blender/blender/pulls/139574
2025-05-29 10:12:33 +02:00
Clément Foucault
caac241c84 GPU: Make Shader Specialization Constant API Thread Safe
This allows multiple threads to request different specializations without
locking usage of all specialized shaders program when a new specialization
is being compiled.

The specialization constants are bundled in a structure that is being
passed to the `Shader::bind()` method. The structure is owned by the
calling thread and only used by the `Shader::bind()`.
Only querying for the specialized shader (Map lookup) is locking the shader
usage.

The variant compilation is now also locking and ensured that
multiple thread trying to compile the same variant will never result
in race condition.

Note that this removes the `is_dirty` optimization. This can be added
back if this becomes a bottleneck in the future. Otherwise, the
performance impact is not noticeable.

Pull Request: https://projects.blender.org/blender/blender/pulls/136991
2025-05-19 17:42:55 +02:00
Jeroen Bakker
2143eb7a4f Refactor: Vulkan/OpenXR: Import memory handles only once
Importing memory is done to often. when memory doens't change the
previous imported memory can be used.

The idea is to keep track of the last used buffer and keep reusing
it until the view/resolution has changed. This should not happen during
a session.

Pull Request: https://projects.blender.org/blender/blender/pulls/138984
2025-05-19 12:32:08 +02:00
Jeroen Bakker
1c72dca726 Cleanup: Remove unused code 2025-05-15 12:00:09 +02:00
Jeroen Bakker
3b3cab471a Fix #138843: Vulkan: Swapchain issues
- Reduce artifacts during resizing to also recreate the swapchain
  when acquire image is suboptimal
- Do not stretch when backbuffer and swapchain have a different size

Pull Request: https://projects.blender.org/blender/blender/pulls/138925
2025-05-15 11:57:44 +02:00
Miguel Pozo
992e7c95a7 GPU: Converge ShaderCompiler implementations
Part of #136993.

Share as much of the ShaderCompiler implementations as possible.
Remove the ShaderCompiler/ShaderCompilerGeneric split and make most of
its functions non virtual.
Move the `get_compiler` function from `Context` to `GPUBackend` and
creation/deletion to `GPUBackend::init/delete_resources`.
Add a `batch_cancel` function to `ShaderCompiler` (needed for the
GPUPass refactor).

As a nice extra, the multithreaded OpenGL compilation has become faster
too.
The barbershop materials + EEVEE static shaders have gone from 27s to
22s.

I have not observed any performance difference on Vulkan or Metal.

Pull Request: https://projects.blender.org/blender/blender/pulls/136676
2025-05-08 18:16:47 +02:00
Jeroen Bakker
5102f33ef9 Fix #137081: Vulkan: Memory leak in descriptor pools.
Descriptor pools were never discarded, leading to out of memory issues
when running for a long time. This PR discards used descriptor pool when
the render graph is submitted to the device.

- Detected that to descriptor sets could be uploaded multiple times
  however once was always empty.
- When render graph is flushed all descriptor pools are discarded.
- Improved debugging of discard pools.

Pull Request: https://projects.blender.org/blender/blender/pulls/137521
2025-04-15 12:18:34 +02:00
Jeroen Bakker
b4028ee28f Fix #137395: Vulkan: Memory reset to early
A better solution to solve the memory leak needs to be checked. Partial
revert of 3c70758f00 as it can reset GPUs
or data buffers.
2025-04-14 09:45:14 +02:00
Josh Belanich
3c70758f00 Fix #137081: Vulkan: Crash during animation playback
A couple of memory leak fixes for the vulkan backend.

We increment the submission_id on render_graphs upon reset. This
triggers cleanup of anything tracked as a VKResourceTracker. Notably
uniform buffers created for push constant fallbacks. This fixes a memory
leak that was accumulating VKUniformBuffers every frame without cleaning
them up.

Reset resource pools when a swapchain image is presented. This ends up
calling vkResetDescriptorPool, freeing up descriptor set resources. This
fixes a memory leak that was accumulate descriptor sets and pools over
time without freeing them.

Pull Request: https://projects.blender.org/blender/blender/pulls/137305
2025-04-11 14:46:35 +02:00
Jeroen Bakker
b65b6febb9 Fix: Vulkan/OpenXR: Use correct data format for CPU transfers
Incorrect data format was selected when using CPU data transfers in
OpenXR. It always used `GPU_DATA_HALF_FLOAT`, also when the swapchains
where `GPU_RGBA8`. This resulted in black screens in release mode, and
asserts in debud mode.

Fixed by selecting the correct data transfer data type based on the
swapchain format.

Co-authored-by: jeroen@blender.org <Jeroen Bakker>
Pull Request: https://projects.blender.org/blender/blender/pulls/137269
2025-04-10 14:22:55 +02:00
Jeroen Bakker
7ecacbc3e6 Vulkan/OpenXR: Support VK_KHR_external_memory_win32
This PR add support to use a win32 handle to perform share render
result with the OpenXR vulkan instance. This is only possible when
the GPU matches. Otherwise a CPU roundtrip will be performed.

Pull Request: https://projects.blender.org/blender/blender/pulls/137093
2025-04-08 15:21:55 +02:00
Miguel Pozo
a5ed5dc4bf GPU: Support deferred compilation in ShaderCompilerGeneric
Update the `ShaderCompilerGeneric` to support deferred compilation
using the batch compilation API, so we can get rid of
`drw_manager_shader`.
This approach also allows supporting non-blocking compilation
for static shaders.

This shouldn't cause any behavior changes at the moment, since batch
compilation is not yet used when parallel compilation is disabled.

This adds a `GPUWorker` and a `GPUSecondaryContext` as an easy to use
wrapper for managing secondary GPU contexts.

(Part of #133674)
Pull Request: https://projects.blender.org/blender/blender/pulls/136518
2025-04-07 15:26:25 +02:00
Jeroen Bakker
a46643af0f Vulkan/OpenXR: Add support for VK_KHR_external_memory_fd
Current implementation uses a CPU roundtrip to transfer render result
to the Xr Swapchain. This PR adds support for sharing the render result
on Linux systems by using file descriptors.

To extend this solution to win32 or dx handles can be done by extending
the data transfer modes, register the correct extensions. When not
using the same GPU between Blender and OpenXR the CPU roundtrip
will still be used.

Solution has been validated with monado simulator and seems to be as
fast as OpenGL.

Performance can be improved by using GPU based synchronization.
Current API is limited as we cannot chain the different renders and
swapchains.

Pull Request: https://projects.blender.org/blender/blender/pulls/136933
2025-04-04 16:01:06 +02:00
Jeroen Bakker
aed9f22233 Refactor: Vulkan: swapchain
This PR refactors the way how swapchains are used.

Allow scaling of the swapchain content to the actual resolution of the swapchain.
can reduce artefacts when resizing windows when supported.

When frame rate is to fast the previous implementation could use a semaphore
that were still in use, leading to unwanted stuttering on certain platforms. Waiting
when the rendering has finished (GHOST_Frame.submission_fence), before the
next image is acquired from the swap chain.

Mailbox has been disabled as it can calculate more frames then actually been
presented, leading to a lag and increased  power usage on others.

Pull Request: https://projects.blender.org/blender/blender/pulls/136603
2025-04-01 16:01:22 +02:00
Jeroen Bakker
5e26f5cc2a Vulkan: Reduce lag on certain platforms.
After reviewing the locations where `GPU_flush()` are used it doesn't seem
to be harmfull to include these for the Vulkan backend as well. Hopefully
will save some lag that can happen when submitting one huge render graph.

Improved playback of rain_restaurant.blend where frames could be dropped
resulting into UI lag.

Pull Request: https://projects.blender.org/blender/blender/pulls/136654
2025-03-31 12:16:48 +02:00
Jeroen Bakker
3885a37541 Vulkan: Initial OpenXR support
The Blender's VkInstance cannot be shared with OpenXR VkInstance. The
reason is a chicken and egg problem where OpenXR needs to be started
before Vulkan. OpenXR can add special vulkan specific requirements
(instance&device) that are only available when the user starts an OpenXR
session.

The goal implementation is to share memory between both instances using
[VK_KHR_external_memory](https://registry.khronos.org/vulkan/specs/latest/man/html/VK_KHR_external_memory.html) and related extensions. However this seems
to be a bridge to far as a initial step. Reason: There are not that many
samples/ guides and documentation to be found to handle the workflow that
we require. We want to do a smaller step by step approach to gain the needed
knowledge.

For that reason this PR does the most stupidest thing that can be done to
share memory between instances. Download the render result to CPU RAM share
the host pointer with the OpenXR instance which copies it to the swap chain.
Also the synchronization is done using wait idle commands.

<video src="attachments/32a0d69b-c3fa-4272-aea0-d207609afaaf" title="Screencast From 2025-03-18 11-16-17.webm" controls></video>

**Gaining knowledge**

- Experiment with `VK_KHR_external_memory_host` extension for uploading vertex buffers (not related to OpenXR).
- Import host pointer with `VK_KHR_external_memory_host`. This reduces the additional
  memcpy on OpenXR side.
- Export host pointer from Blender side from a mappable buffer.
- Replace host pointers with fd/dmabuf/winhandle
- Remove mappable buffer.

Ref #133718

Pull Request: https://projects.blender.org/blender/blender/pulls/133824
2025-03-27 16:57:51 +01:00
Jeroen Bakker
409ce2b976 Vulkan: Swapchain synchronization
This PR adds swapchain synchronization. When the swapchain swaps the
buffers it can add a wait semaphore/signal semaphore to support GPU
based synchronization

10 times playback of `rain_restaurant.blend` on AMD RX 7700
Before: 10 × Animation playback: 72347.5540 ms, average: 7234.75539684 ms
After: 10 × Animation playback: 41523.2441 ms, average: 4152.32441425 ms

Getting around the OpenGL performance target.

Pull Request: https://projects.blender.org/blender/blender/pulls/136259
2025-03-24 10:28:52 +01:00
Jeroen Bakker
4429cc7e84 Fix: Vulkan: Incorrect framebuffer selection
When swap chain is updated the logic could select an incorrect
framebuffer. This isn't actually the case during normal usage, but has
been detected during the development of OpenXR support. Here it did
matter.

Pull Request: https://projects.blender.org/blender/blender/pulls/136115
2025-03-18 11:49:52 +01:00
Brecht Van Lommel
3dab100860 Fix: ASAN errors after addition of texture pool
Same fix as #132504. Free the texture pool before the derived GPU context
class, as that one is used as part of freeing the texture pool.

Pull Request: https://projects.blender.org/blender/blender/pulls/135444
2025-03-04 16:54:05 +01:00
Jeroen Bakker
e6b3cc8983 Vulkan: Device command builder
This PR implements a new the threading model for building render graphs
based on tests performed last month. For out workload multithreaded
command building will block in the driver or device. So better to use a
single thread for command building.

Details of the internal working is documented at https://developer.blender.org/docs/features/gpu/vulkan/render_graph/

- When a context is activated on a thread the context asks for a
  render graph it can use by calling `VKDevice::render_graph_new`.
- Parts of the GPU backend that requires GPU commands will add a
  specific render graph node to the render graph. The nodes also
  contains a reference to all resources it needs including the
  access it needs and the image layout.
- When the context is flushed the render graph is submitted to the
  device by calling `VKDevice::render_graph_submit`.
- The device puts the render graph in `VKDevice::submission_pool`.
- There is a single background thread that gets the next render
  graph to send to the GPU (`VKDevice::submission_runner`).
  - Reorder the commands of the render graph to comply with Vulkan
    specific command order rules and reducing possible bottlenecks.
    (`VKScheduler`)
  - Generate the required barriers `VKCommandBuilder::groups_extract_barriers`.
    This is a separate step to reduce resource locking giving other
    threads access to the resource states when they are building
    the render graph nodes.
  - GPU commands and pipeline barriers are recorded to a VkCommandBuffer.
    (`VKCommandBuilder::record_commands`)
  - When completed the command buffer can be submitted to the device
    queue. `vkQueueSubmit`
  - Render graphs that have been submitted can be reused by a next
    thread. This is done by pushing the render graph to the
    `VKDevice::unused_render_graphs` queue.

Pull Request: https://projects.blender.org/blender/blender/pulls/132681
2025-01-27 08:55:23 +01:00
Jeroen Bakker
390ca01685 Cleanup: Vulkan: Remove resource ownership
Images used to be tracked with ownership in order to reset swap chain
images to its original layout. This isn't used anymore as we always mark
them in VK_IMAGE_LAYOUT_UNDEFINED to make the first pipeline barrier a
nop.

This change reduces unneeded complexity and safe a few CPU cycles.

Pull Request: https://projects.blender.org/blender/blender/pulls/133197
2025-01-17 14:46:22 +01:00
Jeroen Bakker
2f18e4fe29 Vulkan: Add debug group for swapchain
Improves debugging swapchains when using renderdoc.

Pull Request: https://projects.blender.org/blender/blender/pulls/133190
2025-01-17 11:40:11 +01:00
Jeroen Bakker
80ec04b4ef Cleanup: Vulkan: Use full surface format
GHOST_ContextVK used to pass only the surface texture format to
the GPU backend, it didn't pass the color space. This PR also includes
the color space.

Pull Request: https://projects.blender.org/blender/blender/pulls/133185
2025-01-17 10:28:22 +01:00
Jeroen Bakker
04e64b27ea Vulkan: Ignore swapchain image layout/content
Blender always updates all pixels of the swap chain. As an optimization
we can skip the initial layout transition from present to transfer
destination as all pixels will be rewritten.

Pull Request: https://projects.blender.org/blender/blender/pulls/133061
2025-01-14 17:08:04 +01:00
Brecht Van Lommel
24e5226ff0 Fix #128186: Invalid GPU framebuffer free from context
Framebuffers are getting freed in the GPUContext base class destructor. But
the framebuffer destructors use the MTL/VK/GLContext derived class, whose
destructor has already completed at this point. So these contexts are no
longer valid to use.

Now free the framebuffers earlier.

This caused ASAN warnings, it's not known to cause actual bugs.

Pull Request: https://projects.blender.org/blender/blender/pulls/132504
2025-01-06 11:32:02 +01:00
Guillermo Venegas
7f7f9e987f Fix #130817: Make resource pool to cycle when swapchain images are presented
Its not standard how `Present Engines` return images for presentation, and
currently is expected that they cycle between swap-chain images with each
`vkAcquireNextImageKHR` call.

However present engines could return any available image, that can mean
to reuse the last presented one if available. (This seem to be the behavior
using `Layered on DXGI Swapchain` the default `Present Method` used
with latest NVIDIA drivers on Windows).

Since resource pools expects to images to cycle in a sequential order, if any
present engine always return the same image for presentation only a single
resource pool would be used for each rendered frame, and since resources
are only released by cycling between resource pools, this resource pool would
overflow since it never releases any resource.

This changes makes resource pools to cycle each time a image is presented.

Pull Request: https://projects.blender.org/blender/blender/pulls/131129
2024-11-29 12:35:44 +01:00
Jeroen Bakker
c69b107a28 Fix #130121: Vulkan: Lightbaking resources freed to early
When lighting baking is used in a background render the resources are
freed to early. The cause is that light baking does some initialization
within a context, that isn't send to the GPU. The first iteration of
light baking is expecting that it can free resources, what leads to GPU
resources to be deleted that are still used by commands that are
scheduled to be send to the GPU.

This PR fixes this by using multiple resource pools when background
rendering and ensure that contexts are send to the GPU when rendering
ends.

Pull Request: https://projects.blender.org/blender/blender/pulls/131094
2024-11-28 16:05:59 +01:00
Jeroen Bakker
c2695e2dcc Vulkan: Add support for legacy platforms
Dynamic rendering is a Vulkan 1.3 feature. Most platforms have support
for them, but there are several legacy platforms that don't support dynamic
rendering or have driver bugs that don't allow us to use it.

This change will make dynamic rendering optional allowing legacy
platforms to use Vulkan.

**Limitations**

`GPU_LOADACTION_CLEAR` is implemented as clear attachments.
Render passes do support load clear, but adding support to it would
add complexity as it required multiple pipeline variations to support
suspend/resume rendering. It isn't clear when which variation should
be used what lead to compiling to many pipelines and branches in the
codebase. Using clear attachments doesn't require the complexity
for what is expected to be only used by platforms not supported by
the GPU vendors.

Subpass inputs and dual source blending are not supported as
Subpass inputs can alter the exact binding location of attachments.
Fixing this would add code complexity that is not used.

Ref: #129063

**Current state**

![image](/attachments/9ce012e5-2d88-4775-a636-2b74de812826)

Pull Request: https://projects.blender.org/blender/blender/pulls/129062
2024-11-19 16:30:31 +01:00
Jeroen Bakker
c1379ff2b3 Fix #130161: Vulkan: Grid overlay artifact when copying to swap chain
When copying the window to the swap chain the image needs to be copied
upside down to match Vulkan/OpenGL image coordinate differences.

There was an of by 1 error when copying resulting in minor drawing
glitch which was noticeable when looking at the viewport grid.

Pull Request: https://projects.blender.org/blender/blender/pulls/130328
2024-11-15 17:12:24 +01:00
Jeroen Bakker
791f90ab8d Vulkan: Remove guardedalloc option
WITH_VULKAN_GUARDEDALLOC is a development option to use Blenders guarded
allocator when allocating internal vulkan driver resources. It does not provide any benefits
as this should be covered by vulkan validation and drivers are often ignoring this. This
change will remove the option from cmake and source code.

Pull Request: https://projects.blender.org/blender/blender/pulls/129039
2024-10-15 13:46:00 +02:00
Jeroen Bakker
d35cd15e12 Fix #128608: Vulkan: Sync issues when sharing context between threads
Resources are shared, when running multiple contexts on the same thread.
Cycles uses the same context on multiple threads and expected same resources.

This change will introduce a single render graph per context and an updated
resource management. Render graphs are not shared anymore; Resource pools
are still shared, but garbage collection depends on the thread and if
background rendering is used.

Pull Request: https://projects.blender.org/blender/blender/pulls/128983
2024-10-14 15:42:46 +02:00
Jeroen Bakker
0eff22dd2a Fix #128258: Vulkan: Memory leak preview job rendering
When performing preview job rendering the memory wasn't recycled leading
to a memory leak. For background rendering we already recycled memory in
a correct way. This change enables the same branch during preview
rendering.

Also adds a better `VKDevice::debug_print` to see the resources being
tracked by the different threads and resource pools.

Pull Request: https://projects.blender.org/blender/blender/pulls/128377
2024-10-01 09:09:42 +02:00
Jeroen Bakker
725b5027fb Vulkan: Refactor immediate mode
Immediate mode uses the old 'resource tracker' which has been replaced
by swap chain resource pools. This PR optimizes immediate mode buffers
by utilizing resource pools.

Pull Request: https://projects.blender.org/blender/blender/pulls/128188
2024-09-26 16:01:30 +02:00
Jeroen Bakker
88b5467e0e Vulkan: Batch Upload Descriptor Sets
Descriptor sets will be uploaded in batch. This allows drivers to
do additional optimizations or at least push some looping to the
driver side.

Pull Request: https://projects.blender.org/blender/blender/pulls/128167
2024-09-26 12:04:09 +02:00
Jeroen Bakker
d75cf2efd4 Vulkan: Refactor resource binding
Resource binding was over-complicated as I didn't understood the state
manager and vulkan to make the correct decisions at that time. This
refactor will remove a lot of the complexity and improves the performance.

**Performance**

The performance improvement is noticeable in complex grease pencil
scenes.

Grease pencil benchmark file picknick:
- `NVIDIA Quadro RTX 6000` 17 fps -> 24 fps
- `Intel(R) Arc(tm) A750 Graphics (DG2)` 6 -> 21 fps

**Bottle-neck**

The performance improvements originates from moving the update entry
point from state manager to shader interface. The previous implementation
(state manager) had to loop over all the bound resources and find in the
shader interface where it was located in the descriptor set. Ignoring
resources that were not used by the shader. But also making it hard
to determine if descriptor sets actually changed. Previous implementation
assumed descriptor sets always changed.

When descriptor set changed a new descriptor set needed to be allocated.
Most drivers this is a fast operation, but on Intel/Mesa this was measurable
slow. Using an allocation pool doesn't fit the Vulkan API as you are only
able to reuse when the layout matches exactly. Of course doable, but requires
another structure to keep track of the actual layouts.

**Solution**

By using the shader interface as entry point we can:
1. Keep track if there are any changes in the state manager. If not and the
   layout is the same, the previous shader can be reused.
2. In stead of looping over each bound resource, we loop over bind points.

**Future extensions**

Bundle all descriptor set uploads just before use. This would be more
in line with how 'modern' Vulkan should be implemented. This PR already
separates the uploading from the updating and technically allows to upload
more than one descriptor set.

Instead of looking 1 set back we should measure if we can handle multiple
or keep track of the different layouts resources to improve the performance
even further.

Optional use `VK_KHR_descriptor_buffer` when available.

Pull Request: https://projects.blender.org/blender/blender/pulls/128068
2024-09-26 10:59:45 +02:00
Jeroen Bakker
56b7ff256f Vulkan: Fix validation error push constants for compute shaders
Since parallel compilations was introduced, a validation error
was signalling that push constants for compute shaders didn't have
the correct pipeline binding. The root cause was that the pipeline
binding was determined, before the type of shader was known.

This PR fixes this by detemining if a shader is a compute shader up
front. It also removes some code that could lead to issues.

Pull Request: https://projects.blender.org/blender/blender/pulls/128010
2024-09-23 09:44:29 +02:00
Jeroen Bakker
ec7fc8fef4 Vulkan: Parallel shader compilation
This PR introduces parallel shader compilation for Vulkan shader
modules. This will improve shader compilation when switching to material
preview or EEVEE render preview. It also improves material compilation.
However in order to measure the differences shaderc needs to be updated.

PR has been created so we can already start with the code review. This
PR doesn't include SPIR-V caching, what will land in a separate PR as
it needs more validation.

Parallel shader compilation has been tested on AMD/NVIDIA on Linux.
Testing on other platforms is planned in the upcoming days.

**Performance**

```
AMD Ryzen™ 9 7950X × 32, 64GB Ram
Operating system: Linux-6.8.0-44-generic-x86_64-with-glibc2.39 64 Bits, X11 UI
Graphics card: Quadro RTX 6000/PCIe/SSE2 NVIDIA Corporation 4.6.0 NVIDIA 550.107.02
```

*Test*: Start blender, open barbershop_interior.blend and wait until the viewport
has fully settled.

| Backend | Test                      | Duration |
| ------- | ------------------------- | -------- |
| OpenGL  | Coldstart/No subprocesses | 1:52     |
| OpenGL  | Coldstart/8 Subprocesses  | 0:54     |
| OpenGL  | Warmstart/8 Subprocesses  | 0:06     |
| Vulkan  | Coldstart Without PR      | 0:59     |
| Vulkan  | Warmstart Without PR      | 0:58     |
| Vulkan  | Coldstart With PR         | 0:33     |
| Vulkan  | Warmstart With PR         | 0:08     |

The difference in time (why OpenGL is faster in a warm start is that all
shaders are cached). Vulkan in this case doesn't cache anything and all
shaders are recompiled each time. Caching the shaders will be part of
a future PR. Main reason not to add it to this PR directly is that SPIR-V
cannot easily be validated and would require a sidecar to keep SPIR-V
compatible with external tools..

**NOTE**:
- This PR was extracted from #127418
- This PR requires #127564 to land and libraries to update. Linux lib
  is available as attachment in this PR. It works without, but is as slow as
  single threaded compilation.

Pull Request: https://projects.blender.org/blender/blender/pulls/127698
2024-09-20 08:30:09 +02:00