After reviewing the locations where `GPU_flush()` are used it doesn't seem
to be harmfull to include these for the Vulkan backend as well. Hopefully
will save some lag that can happen when submitting one huge render graph.
Improved playback of rain_restaurant.blend where frames could be dropped
resulting into UI lag.
Pull Request: https://projects.blender.org/blender/blender/pulls/136654
The Blender's VkInstance cannot be shared with OpenXR VkInstance. The
reason is a chicken and egg problem where OpenXR needs to be started
before Vulkan. OpenXR can add special vulkan specific requirements
(instance&device) that are only available when the user starts an OpenXR
session.
The goal implementation is to share memory between both instances using
[VK_KHR_external_memory](https://registry.khronos.org/vulkan/specs/latest/man/html/VK_KHR_external_memory.html) and related extensions. However this seems
to be a bridge to far as a initial step. Reason: There are not that many
samples/ guides and documentation to be found to handle the workflow that
we require. We want to do a smaller step by step approach to gain the needed
knowledge.
For that reason this PR does the most stupidest thing that can be done to
share memory between instances. Download the render result to CPU RAM share
the host pointer with the OpenXR instance which copies it to the swap chain.
Also the synchronization is done using wait idle commands.
<video src="attachments/32a0d69b-c3fa-4272-aea0-d207609afaaf" title="Screencast From 2025-03-18 11-16-17.webm" controls></video>
**Gaining knowledge**
- Experiment with `VK_KHR_external_memory_host` extension for uploading vertex buffers (not related to OpenXR).
- Import host pointer with `VK_KHR_external_memory_host`. This reduces the additional
memcpy on OpenXR side.
- Export host pointer from Blender side from a mappable buffer.
- Replace host pointers with fd/dmabuf/winhandle
- Remove mappable buffer.
Ref #133718
Pull Request: https://projects.blender.org/blender/blender/pulls/133824
Followup to 9b70851d91.
Return buffers by value rather than creating an empty/uninitialized
buffer first, then initializing it in an extraction function. This generally
makes the code easier to follow. And avoiding these half-created buffers
is an essential step to adding some sort of more global cache.
Pull Request: https://projects.blender.org/blender/blender/pulls/136570
The initial goal of this PR is to avoid creating vertex and index
buffers as part of the "request" phase of the drawing loop. Conflating
requesting and creating index buffers might not sound so bad, but it
ends up significantly complicating the whole process. It is also
incompatible with a future buffer cache that would allow avoiding
re-uploading mesh buffers.
Specifically, this means removing the use of `DRW_vbo_request` and
`DRW_ibo_request` from the mesh batch extraction process. Instead, a
list of buffer types is gathered based on the requested batches. Then
that list is filtered to find the batches that haven't been requested
yet. Overall I find the new process much easier to understand.
A few examples of simplifications this allows are avoiding allocating
`MeshRenderData` on the heap, and the removal of its `use_final_mesh`
member. That's just replaced by passing the necessary information
through the call stack.
Another notable difference is that for meshes, EEVEE's velocity module
now requests a batch that contains the buffer rather than just requesting
the buffer itself. This is just simpler to get working since it doesn't require
a separate code path.
The task graph argument for extraction is unused after this change. It wasn't
used effectively anyway; a simpler method of multithreading extractions is
used in this PR. I didn't remove it completely because it will probably be
repurposed in the next step of this project.
The next step in this project is to replace `MeshBufferList` with a
global cache that's keyed based on the mesh data that compromises each
batch, when possible (i.e. for non edit-mode meshes). This changes above
should be applied to other object types too.
Pull Request: https://projects.blender.org/blender/blender/pulls/135699
This makes it possible to restore previous Blender 4.3 behavior of bump
mapping, where the large filter width was sometimes (ab)used to get a bevel
like effect on stepwise textures.
For bump from the displacement socket, filter width remains fixed at 0.1.
Ref #133991, #135841
Pull Request: https://projects.blender.org/blender/blender/pulls/136465
The crash regression comes from 583e2b7240.
Pass the s_sharedHGLRC directly to wglCreateContextAttribsARB instead
of using wglShareLists.
Context: https://github.com/baldurk/renderdoc/issues/1224
This doesn't only fix the recent regression, but solves all the long
standing issues with Renderdoc on Windows (F12 rendering support,
multiple windows, deferred compilation...).
(Fix suggested by @LazyDodo)
Pull Request: https://projects.blender.org/blender/blender/pulls/136140
This PR adds swapchain synchronization. When the swapchain swaps the
buffers it can add a wait semaphore/signal semaphore to support GPU
based synchronization
10 times playback of `rain_restaurant.blend` on AMD RX 7700
Before: 10 × Animation playback: 72347.5540 ms, average: 7234.75539684 ms
After: 10 × Animation playback: 41523.2441 ms, average: 4152.32441425 ms
Getting around the OpenGL performance target.
Pull Request: https://projects.blender.org/blender/blender/pulls/136259
Potential fix for legacy AMD driver issue.
- Updating drivers using a clean install has proven to fix the issue as well.
As driver can leave parts of an older driver active what it actually the cau
- Solution comments out the `#line` by replacing the first two characters.
Pull Request: https://projects.blender.org/blender/blender/pulls/136231
This patch adds a new Image Info node which returns information about
compositor images. The node has three sources:
The node returns the following information:
- Pixel Coordinates: The coordinates of the centers of the pixels in the
image. Those are essentially the integer coordinates with half pixels
offsets added.
- Texture Coordinates: Zero centered pixel coordinates normalized along
the greater dimension. Somewhat analogous to Object coordinates in
shader nodes.
- Resolution: The resolution of the image.
- Location: The location of the image in the virtual compositing space.
- Rotation: The rotation of the image in the virtual compositing space.
- Scale: The scale of the image in the virtual compositing space.
This node is very useful to allow greater flexibility and procedural
creations. For instance, coordinates can be used to create procedural
effects like vignette using very simple math nodes. And size can be used
to compute size-relative parameters for pixel-parameter nodes.
Pull Request: https://projects.blender.org/blender/blender/pulls/135104
When swap chain is updated the logic could select an incorrect
framebuffer. This isn't actually the case during normal usage, but has
been detected during the development of OpenXR support. Here it did
matter.
Pull Request: https://projects.blender.org/blender/blender/pulls/136115
This is just the shader change.
It allows more freedom for the UI team to tweak the appearance.
The is not functional changes in this patch.
Rel #126334
When using vec3[] as push constants it selected the incorrect
branch resulting in uploading incorrect data to the shader.
This resulted in not seeing the clipping bounds in vulkan.
Ref: #131111
Since the introduction of storage buffers in Blender, the calling
code has been responsible for ensuring the buffer meets allocation
requirements. All backends require the allocation size to be divisible
by 16 bytes. Until now, this was sufficient, but with GPU subdivision
changes, an external library must also adhere to these requirements.
For OpenSubdiv (OSD), some buffers are not 16-byte aligned, leading
to potential misallocation. Currently, this is mitigated by allocating
a few extra bytes, but this approach has the drawback of potentially
reading unintended bytes beyond the source buffer.
This PR adopts a similar approach to vertex buffers: the backend handles
extra byte allocation while ensuring data uploads and downloads function
correctly without requiring those additional bytes.
No changes were needed for Metal, as its allocation size is already
aligned to 256 bytes.
**Alternative solutions considered**:
- Copying the CPU buffer to a larger buffer when needed (performance impact).
- Modifying OSD buffers to allocate extra space (requires changes to an external library).
- Implementing GPU_storagebuf_update_sub.
Ref #135873
Pull Request: https://projects.blender.org/blender/blender/pulls/135716
Resolves several int -> uint conversion warnings. Warnings like the
following will be printed otherwise:
```
|
225 | uint shadow_type = flags & 0xF;
| ^
| gpu_shader_text_vert.glsl:17:22: Warning: some implementations
may not support implicit int -> uint conversions for `&'
operators; consider casting explicitly for portability
```
Pull Request: https://projects.blender.org/blender/blender/pulls/135890
The patch strings did not have thread safe initialization.
The string might hav been returned null or incomplete
which might trigger compilation errors.
Vulkan handles are currently only requested once. In the future OpenXR
also needs acces to these handles and additional handles will be needed
when introducing copy queues and async compute.
This PR will collect the handles in a struct to ensure we don't need to
alter the GHOST interface for every change.
Pull Request: https://projects.blender.org/blender/blender/pulls/135905
This PR enabled GPU based subdivision on Metal.
Most work is done in #135296.
- Metal max storage bindings for compute shaders were never set.
Some performance figures: Suzanne 6 subdivision levels
| Machine | CPU Subdivision | GPU Subdivision |
| --------------- | --------------- | --------------- |
| M1 Studio Ultra | 7fps | 12 fps |
| M2 Air | 3fps | 11 fps |
Pull Request: https://projects.blender.org/blender/blender/pulls/135628
`GPU_vertbuf_update_sub` is used by GPU based subdivision to integrate
quads, triangles and edges. This is just an implementation to make it
work as we are planning bigger changes to improve performance of
uploading data to the GPU.
Pull Request: https://projects.blender.org/blender/blender/pulls/135774
The Mix Color shader node does not retain the alpha channel of the first
input in both the Linear Light and Soft Light modes, while it is retain
for other modes. Further, result clamping also ignores the alpha due to
using the vector clamp function, which introduces implicit conversion
that removes the alpha.
This does not matter for EEVEE because it does nothing with the alpha
channel. But the code will now be shared with the compositor, which does
care about the alpha channel. So adjust the code accordingly to retain
the alpha in those cases.
Pull Request: https://projects.blender.org/blender/blender/pulls/135632
Blender already had its own copy of OpenSubDiv containing some local fixes
and code-style. This code still used gl-calls. This PR updates the calls
to use GPU module. This allows us to use OpenSubDiv to be usable on other
backends as well.
This PR was tested on OpenGL, Vulkan and Metal. Metal can be enabled,
but Vulkan requires some API changes to work with loose geometry.

# Considerations
**ShaderCreateInfo**
intern/opensubdiv now requires access to GPU module. This to create buffers
in the correct context and trigger correct dispatches. ShaderCreateInfo is used
to construct the shader for cross compilation to Metal/Vulkan. However opensubdiv
shader caching structures are still used.
**Vertex buffers vs storage buffers**
Implementation tries to keep as close to the original OSD implementation. If
they used storage buffers for data, we will use GPUStorageBuf. If it uses vertex
buffers, we will use gpu::VertBuf.
**Evaluator const**
The evaluator cannot be const anymore as the GPU module API only allows
updating SSBOs when constructing. API could be improved to support updating
SSBOs.
Current implementation has a change to use reads out of bounds when constructing
SSBOs. An API change is in the planning to remove this issue. This will be fixed in
an upcoming PR. We wanted to land this PR as the visibility of the issue is not
common and multiple other changes rely on this PR to land.
Pull Request: https://projects.blender.org/blender/blender/pulls/135296
It has been confirmed that the latest release of AMD drivers has fixed
issues for both OpenGL and Vulkan. Users should use AMD driver 25.3.1
or later. Removing the workaround as it has performance penalties on
RDNA2 based GPUs.
Reference: #135516
Pull Request: https://projects.blender.org/blender/blender/pulls/135630