When clearing only the depth of a depth/stencil image, only the depth
part of the image aspect was stored in the node. This is invalid
when the image needs to be transitioned.
Pull Request: https://projects.blender.org/blender/blender/pulls/123713
When two sequential read image barriers target the same resource
and the second one requires an image layout transition, incorrect
barriers were generated.
This was fixed by aligning the implementation with write image barriers.
Pull Request: https://projects.blender.org/blender/blender/pulls/123712
Internally the image and texture resources were kept in a vector
whose elements were referenced. When using more than 16 images
this vector is reallocated and previous references become invalid.
This is a quick fix and should be replaced with something more
stable.
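
The failure mode is the usual one for references into a growing `std::vector`. A minimal sketch of one more stable alternative, handing out indices instead of references; `ImageResource` and `ResourceTable` are hypothetical names and are not taken from the Blender sources:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an image/texture resource entry.
struct ImageResource {
  int binding;
};

// Hands out indices rather than references, so growth of the vector
// cannot invalidate a handle that was returned earlier.
class ResourceTable {
  std::vector<ImageResource> resources_;

 public:
  std::size_t add(int binding)
  {
    resources_.push_back(ImageResource{binding});
    return resources_.size() - 1;
  }

  ImageResource &get(std::size_t handle)
  {
    assert(handle < resources_.size());
    return resources_[handle];
  }
};
```

A handle obtained before many later `add()` calls still resolves to the same entry, which a stored reference would not guarantee.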
Pull Request: https://projects.blender.org/blender/blender/pulls/123656
Due to incompatible binding namespaces between Vulkan and OpenGL we
offset the images in the ubo list. In the previous implementation
this could still go wrong as the image and texture bindings were
sequential. When EEVEE binds resources it can also try to bind resources
that aren't valid for the current shader. In this case it was still
possible that the incorrect binding was chosen.
This is fixed by offsetting the images by a large number.
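
The idea can be sketched as follows. The offset value and the names `IMAGE_BINDING_OFFSET`, `BindType` and `unified_binding` are assumptions made for illustration; the value used by the actual patch is not stated here:

```cpp
#include <cassert>

// Hypothetical offset; chosen larger than any realistic texture slot.
constexpr int IMAGE_BINDING_OFFSET = 512;

enum class BindType { Texture, Image };

// Map an OpenGL-style (type, slot) pair onto a single binding namespace.
// Images are shifted far away from textures so the two ranges cannot overlap.
constexpr int unified_binding(BindType type, int slot)
{
  return type == BindType::Image ? slot + IMAGE_BINDING_OFFSET : slot;
}
```

With a large offset, a stray image bind for a slot the shader does not use can no longer land on a valid texture binding.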
Pull Request: https://projects.blender.org/blender/blender/pulls/123649
When drawing a lot of text using BLF, the glyph texture could be rewritten.
In this case the new upload should be done in a separate render
graph node group. This wasn't the case and resulted in
validation warnings about the glyph texture being in a layout
that wasn't expected.
This PR simplifies the group extraction a bit by looking ahead
to where the group ends.
Pull Request: https://projects.blender.org/blender/blender/pulls/123547
Due to incorrect logic, any multi-viewport setup could be cleared
when dynamic rendering begins. This patch moves the clearing
of the viewport/scissor setup to binding time.
This fixes shadow rendering in EEVEE.
Pull Request: https://projects.blender.org/blender/blender/pulls/123484
`VK_EXT_shader_stencil_export` isn't supported by NVIDIA devices.
This extension was recently added to support EEVEE PBR layer selection.
This PR makes this extension optional and selects the workaround
when it is not supported by the physical device.
Fixes #114385
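
A minimal sketch of the selection logic, assuming the device's extension names have already been collected into a list (in Vulkan this would come from `vkEnumerateDeviceExtensionProperties`). `has_extension`, `StencilPath` and `select_stencil_path` are hypothetical names:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Returns true when `name` is in the list of extensions reported by the device.
bool has_extension(const std::vector<std::string> &available, const std::string &name)
{
  return std::find(available.begin(), available.end(), name) != available.end();
}

enum class StencilPath { ShaderStencilExport, Workaround };

// Prefer the extension and fall back to the workaround when it is missing.
StencilPath select_stencil_path(const std::vector<std::string> &available)
{
  return has_extension(available, "VK_EXT_shader_stencil_export") ?
             StencilPath::ShaderStencilExport :
             StencilPath::Workaround;
}
```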
Pull Request: https://projects.blender.org/blender/blender/pulls/123470
The Vulkan backend has recently switched to a render graph approach. A lot
of code was left in place so we could develop the render graph beside the previous
implementation. Last week we removed the switch. This PR removes
most of the unused code. Some may be left; it will be removed
when detected.
Pull Request: https://projects.blender.org/blender/blender/pulls/123422
Add a `.data<T>()` method that retrieves a mutable span. This is
increasingly useful as we change to filling in vertex buffer data arrays
directly, and compared to raw pointers it's safer too because of asserts
in debug builds.
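
A rough sketch of what such an accessor can look like. `MutableSpanSketch` and `VertBufSketch` are simplified stand-ins written for this example, not Blender's `MutableSpan` or vertex buffer class:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal bounds-checked view; the assert only fires in debug builds.
template<typename T> struct MutableSpanSketch {
  T *ptr;
  std::size_t len;

  T &operator[](std::size_t i) const
  {
    assert(i < len);
    return ptr[i];
  }
  std::size_t size() const
  {
    return len;
  }
};

// Hypothetical vertex buffer that owns raw bytes.
class VertBufSketch {
  std::vector<unsigned char> bytes_;

 public:
  explicit VertBufSketch(std::size_t size_in_bytes) : bytes_(size_in_bytes) {}

  // Typed, mutable view over the whole buffer.
  template<typename T> MutableSpanSketch<T> data()
  {
    assert(bytes_.size() % sizeof(T) == 0);
    return {reinterpret_cast<T *>(bytes_.data()), bytes_.size() / sizeof(T)};
  }
};
```

Callers write through the span (`buf.data<float>()[i] = value;`) and get an assert instead of silent memory corruption when the index is out of range.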
Pull Request: https://projects.blender.org/blender/blender/pulls/123338
Texture update ignored layer-based offsets and extents. This was
an oversight due to misinterpreting the API. This fixes uploading the
utility texture of EEVEE and enables correct material shading.
Pull Request: https://projects.blender.org/blender/blender/pulls/123371
When using GPU_SAMPLER_EXTEND_MODE_EXTEND the incorrect sampler
was created, making the OCIO shader fail. This PR selects the
correct wrapping mode (CLAMP_TO_EDGE).
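
The mapping can be sketched with local stand-in enums. Only the `Extend` to `ClampToEdge` pairing is taken from the text above; the other pairings and all names are assumptions made so the example is self-contained (the real code maps Blender's sampler enums to `VkSamplerAddressMode`):

```cpp
#include <cassert>

// Local stand-ins for the GPU module's extend modes and Vulkan's address modes.
enum class ExtendMode { Extend, Repeat, MirroredRepeat, ClampToBorder };
enum class AddressMode { ClampToEdge, Repeat, MirroredRepeat, ClampToBorder };

AddressMode to_address_mode(ExtendMode mode)
{
  switch (mode) {
    case ExtendMode::Extend:
      // Extending the edge texel corresponds to clamping to the edge.
      return AddressMode::ClampToEdge;
    case ExtendMode::Repeat:
      return AddressMode::Repeat;
    case ExtendMode::MirroredRepeat:
      return AddressMode::MirroredRepeat;
    case ExtendMode::ClampToBorder:
      return AddressMode::ClampToBorder;
  }
  return AddressMode::ClampToEdge;
}
```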
Pull Request: https://projects.blender.org/blender/blender/pulls/123234
This PR hooks up the Vulkan backend with the render graph
for drawing. It can run Blender better than the previous
implementation so we also flipped it to be the default
implementation.
**Some highlights**
- Adds support for framebuffer load/store operations
- Adds support for framebuffer subpass transitions
- Fixes workbench shadows
- Performance is just below OpenGL performance when comparing
fps, but the screen feels smoother when using complex
scenes.
- Current performance is without doing any optimizations so
will improve in the future.
- EEVEE will not crash but has artifacts and many parts that
require more work.
**Related to**
- #121648
- #118330
**Known Limitation**
- Similar to the previous implementation, resources can be freed while
still in use, crashing Blender. This is typically the case when
playing back an animation or updating a material icon.
**Next steps**
- Remove old implementation
- Get EEVEE to work
- Fix double resource freeing
- Improve performance by identifying hotspots and optimizing them
Pull Request: https://projects.blender.org/blender/blender/pulls/121787
This PR adds drawing support to the render graph. It adds support for
draw, indirect draw, indexed draw and indexed indirect draw.
Draw commands can only be executed within a rendering scope. Data
transfer commands and dispatch commands cannot be executed within a
rendering scope. Blender can still send in commands in any order and
the render graph needs to find the best order to minimize context
switches (rendering begin/end). This is the responsibility of the
scheduler.
The scheduler will push data transfer and dispatch commands outside the
rendering scope:
- data transfer and dispatch commands at the beginning are done before
the rendering begin.
- data transfer and dispatch commands at the end are done after the
rendering end.
- data transfer and dispatch commands in between draw commands will be pushed
to the beginning if they are not yet being used.
- for all other data transfer and dispatch commands the rendering is
suspended and will be continued afterwards.
Within a rendering context it is not allowed to perform synchronization
commands. Any synchronization commands inside a rendering scope will be
performed before the rendering scope begins. Nodes are now organized
in groups to simplify the code around this area.
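
The reordering rules above can be illustrated with a heavily simplified sketch. It is not the scheduler from this PR: the `Node` struct, the `conflicts_with_earlier_draw` flag (standing in for real resource tracking) and the string-based output are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Kind { Draw, Other };  // Other = data transfer or dispatch

struct Node {
  Kind kind;
  std::string name;
  // True when the node touches a resource an earlier draw already uses.
  bool conflicts_with_earlier_draw;
};

std::vector<std::string> schedule(const std::vector<Node> &nodes)
{
  int first_draw = -1;
  int last_draw = -1;
  for (int i = 0; i < int(nodes.size()); i++) {
    if (nodes[i].kind == Kind::Draw) {
      if (first_draw == -1) {
        first_draw = i;
      }
      last_draw = i;
    }
  }
  std::vector<std::string> result;
  if (first_draw == -1) {
    // No draws at all: no rendering scope is needed.
    for (const Node &node : nodes) {
      result.push_back(node.name);
    }
    return result;
  }
  // Leading nodes and hoistable in-between nodes run before rendering begins.
  for (int i = 0; i < last_draw; i++) {
    const Node &node = nodes[i];
    if (node.kind == Kind::Other && (i < first_draw || !node.conflicts_with_earlier_draw)) {
      result.push_back(node.name);
    }
  }
  result.push_back("begin_rendering");
  for (int i = first_draw; i <= last_draw; i++) {
    const Node &node = nodes[i];
    if (node.kind == Kind::Draw) {
      result.push_back(node.name);
    }
    else if (node.conflicts_with_earlier_draw) {
      // Cannot be hoisted: suspend rendering, run the node, then resume.
      result.push_back("end_rendering");
      result.push_back(node.name);
      result.push_back("begin_rendering");
    }
  }
  result.push_back("end_rendering");
  // Trailing nodes run after rendering ends.
  for (int i = last_draw + 1; i < int(nodes.size()); i++) {
    result.push_back(nodes[i].name);
  }
  return result;
}
```

In this sketch only a conflicting in-between node costs an extra rendering end/begin pair; everything else stays outside a single rendering scope.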
Pull Request: https://projects.blender.org/blender/blender/pulls/123168
Allow precompiling specialization constant variations in parallel.
Like the rest of the batch compilation API, this is only supported on
OpenGL; on the other backends the function is a no-op.
This also moves the `SpecializationConstant` from
`gpu_shader_create_info` (private API) into `GPU_common_types`
(public API).
Pull Request: https://projects.blender.org/blender/blender/pulls/122796
This is the first commit of the several required to support
subprocess-based parallel compilation on OpenGL.
This provides the base API and implementation, and exposes the max
subprocesses setting on the UI, but it's not used by any code yet.
More information and the rest of the code can be found in #121925.
This one includes:
- A new `GPU_shader_batch` API that allows requesting the compilation
of multiple shaders at once, allowing the GPU backend to compile them in
parallel and asynchronously without blocking the Blender UI.
- A virtual `ShaderCompiler` class that backends can use to add their
own implementation.
- A `ShaderCompilerGeneric` class that implements synchronous/blocking
compilation of batches for backends that don't have their own
implementation yet.
- A `GLShaderCompiler` that supports parallel compilation using
subprocesses.
- A new `BLI_subprocess` API, including IPC (required for the
`GLShaderCompiler` implementation).
- The implementation of the subprocess program in
`GPU_compilation_subprocess`.
- A new `Max Shader Compilation Subprocesses` option in
`Preferences > System > Memory & Limits` to enable parallel shader
compilation and the max number of subprocesses to allocate (each
subprocess has a relatively high memory footprint).
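
To make the split between the virtual compiler class and the blocking fallback concrete, here is a sketch under stated assumptions. The class names carry a `Sketch` suffix and every method name and signature below is hypothetical; none of them are copied from the Blender sources.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using BatchHandle = std::size_t;

struct CompiledShader {
  std::string source;
  bool ok;
};

// Interface sketch: a backend overrides these to provide its own strategy,
// for example a pool of compilation subprocesses.
class ShaderCompilerSketch {
 public:
  virtual ~ShaderCompilerSketch() = default;
  virtual BatchHandle batch_compile(const std::vector<std::string> &sources) = 0;
  virtual bool batch_is_ready(BatchHandle handle) = 0;
  virtual std::vector<CompiledShader> batch_finalize(BatchHandle handle) = 0;
};

// Blocking fallback: compiles everything immediately inside batch_compile(),
// so a batch is always ready by the time the caller asks.
class GenericCompilerSketch : public ShaderCompilerSketch {
  std::vector<std::vector<CompiledShader>> batches_;

 public:
  BatchHandle batch_compile(const std::vector<std::string> &sources) override
  {
    std::vector<CompiledShader> compiled;
    for (const std::string &source : sources) {
      // Placeholder "compilation": an empty source counts as a failure.
      compiled.push_back(CompiledShader{source, !source.empty()});
    }
    batches_.push_back(compiled);
    return batches_.size() - 1;
  }

  bool batch_is_ready(BatchHandle handle) override
  {
    return handle < batches_.size();
  }

  std::vector<CompiledShader> batch_finalize(BatchHandle handle) override
  {
    assert(handle < batches_.size());
    return batches_[handle];
  }
};
```

A caller requests a batch, polls `batch_is_ready()` from the UI loop, and only calls `batch_finalize()` once it returns true; with an asynchronous backend the polling step is what keeps the UI responsive.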
Implementation Overview:
There's a single `GLShaderCompiler` shared by all OpenGL contexts.
This class stores a pool of up to `GCaps.max_parallel_compilations`
subprocesses that can be used for compilation.
Each subprocess has a shared memory pool used for sending the shader
source code from the main Blender process and for receiving the already
compiled shader binary from the subprocess. This is synchronized using
a series of shared semaphores.
The subprocesses maintain a shader cache on disk inside a
`BLENDER_SHADER_CACHE` folder at the OS temporary folder.
Shaders that fail to compile are compiled again locally to get
proper error reports.
Hung subprocesses are currently detected using a timeout of 30s.
Pull Request: https://projects.blender.org/blender/blender/pulls/122232
BLF font rendering isn't compatible with render graph as it
rewrites buffers that are not yet drawn. To work around this issue
the vertex buffers should always be created on device and not
directly altered by CPU code.
Pull Request: https://projects.blender.org/blender/blender/pulls/122648
The mesh triangulation data is stored in CPU memory with the same format
as the triangles GPU index buffer. Because of that we can skip creating a
temporary copy owned by the GPU API. One way to do that is to just
upload the data directly and avoid keeping a reference to it. However, we
can only upload GPU data from the main thread with OpenGL, so instead
we reference the data and keep track of whether to free it.
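
A minimal sketch of the "reference the data and track whether to free it" idea. `IndexBufSketch` and its methods are hypothetical and much simpler than the GPU module's index buffer:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class IndexBufSketch {
  const std::uint32_t *data_;
  std::size_t size_;
  // Only owned data is freed on destruction.
  bool owns_data_;

  IndexBufSketch(const std::uint32_t *data, std::size_t size, bool owns)
      : data_(data), size_(size), owns_data_(owns)
  {
  }

 public:
  // Reference caller-owned data; the caller must keep it alive until upload.
  static IndexBufSketch reference(const std::uint32_t *data, std::size_t size)
  {
    return IndexBufSketch(data, size, false);
  }

  // Take a private copy, as was done before this change.
  static IndexBufSketch copy(const std::uint32_t *data, std::size_t size)
  {
    std::uint32_t *owned = new std::uint32_t[size];
    std::copy(data, data + size, owned);
    return IndexBufSketch(owned, size, true);
  }

  IndexBufSketch(IndexBufSketch &&other) noexcept
      : data_(other.data_), size_(other.size_), owns_data_(other.owns_data_)
  {
    other.data_ = nullptr;
    other.owns_data_ = false;
  }
  IndexBufSketch(const IndexBufSketch &) = delete;
  IndexBufSketch &operator=(const IndexBufSketch &) = delete;

  ~IndexBufSketch()
  {
    if (owns_data_) {
      delete[] data_;
    }
  }

  const std::uint32_t *data() const { return data_; }
  std::size_t size() const { return size_; }
  bool owns_data() const { return owns_data_; }
};
```

The referencing path avoids the allocation and copy entirely, at the cost of a lifetime requirement on the caller.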
When drawing a mesh with a single material and 1.8 million faces, this
change gives a 12-15% improvement in framerate, from about 32 to 37 FPS.
Part of #116901.
Pull Request: https://projects.blender.org/blender/blender/pulls/122175
When synchronizing the image for presentation the incorrect access
mask was used. This PR changes the access mask from data transfer
write to memory write. The data transfer write is not allowed to
change the image layout.
Pull Request: https://projects.blender.org/blender/blender/pulls/122138
* Debugging groups were not being applied as that part of the code
wasn't ported to the original patch
* Debugging groups didn't account for nodes that weren't owned by
any debug group.
Pull Request: https://projects.blender.org/blender/blender/pulls/122136
This PR implements debug groups in the render graph. Each node contains
a reference to the debug group they belong to. During scheduling the
nodes can be reordered and the correct debug group needs to be
activated.
This is done by keeping track of the current debug group. When a
different debug group is needed, the needed ends/begins are added
to the command buffer.
This mechanism also cleans up debug groups that are not used at all
as they don't have any nodes associated with them.
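
One way to compute the needed ends/begins is to compare the nesting path of the current group with that of the wanted group. This is a sketch of that idea, not the code from this PR; representing a debug group as a path of names and commands as strings is an assumption:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A debug group is identified by its nesting path, outermost name first.
using DebugGroupPath = std::vector<std::string>;

// Returns the end/begin commands needed to move from `current` to `wanted`.
std::vector<std::string> switch_debug_group(const DebugGroupPath &current,
                                            const DebugGroupPath &wanted)
{
  // Length of the shared prefix: these groups stay open.
  std::size_t common = 0;
  while (common < current.size() && common < wanted.size() &&
         current[common] == wanted[common])
  {
    common++;
  }
  std::vector<std::string> commands;
  // Close the groups that are no longer needed, innermost first.
  for (std::size_t i = current.size(); i > common; i--) {
    commands.push_back("end:" + current[i - 1]);
  }
  // Open the groups that are missing, outermost first.
  for (std::size_t i = common; i < wanted.size(); i++) {
    commands.push_back("begin:" + wanted[i]);
  }
  return commands;
}
```

Because commands are only emitted when a node actually needs a group, a group without nodes never produces a begin/end pair.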
Pull Request: https://projects.blender.org/blender/blender/pulls/122054
Previously the VKShaderInterface was constructed twice. This was
due to a limitation of the Shader API. Specialization constants
introduced a `Shader::init` function which allows pre-initializing
the shader interface before a shader is finalized.
Pull Request: https://projects.blender.org/blender/blender/pulls/122049
Move ownership of image views to VKTexture. VKFramebuffer can request
access to the image views. This allows reconfiguring framebuffers when
the previous configuration is still in use by the render graph.
Pull Request: https://projects.blender.org/blender/blender/pulls/121727
This fixes several issues with push constants.
Push constant tests were failing due to setting incorrect parameters.
The pipeline stage was used as if it were a shader stage.
When using the push constants fallback, the uniform was loaded into the
descriptor set after the descriptor set was checked.
Suppress updating push constants of the non render graph implementation
when the render graph is active.
Pull Request: https://projects.blender.org/blender/blender/pulls/121772
Compute tests were failing due to recent changes.
* Incorrect image layout was used for writable image bindings.
* Incorrect pipeline stage was used for indirect command buffers.
Pull Request: https://projects.blender.org/blender/blender/pulls/121744
The overlay engine extra layer can record draw list commands with an
empty index buffer. These would not affect any pixels and should be
ignored.
The issue was detected when loading the default scene with Vulkan
validation layers turned on.
Pull Request: https://projects.blender.org/blender/blender/pulls/121736
There is an implementation flaw in the render graph where local pointers
cannot be updated, but the data it refers to can be reallocated to
another location.
The cause of this is that the nodes use a union, which can only contain
simple structs (e.g. ones that can be copied with memcpy). This union is stored in a vector,
which can relocate the union. Any local pointers can (and will) become
invalid.
This PR is a quick fix that updates the pointers just before sending
them to the command buffer. In the future a better fix needs to be done
as part of #121649.
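
The problem and the quick fix can be shown with a small struct that points into itself. `NodeData` and `fix_pointers` are hypothetical names used only for this sketch:

```cpp
#include <cassert>
#include <vector>

// Hypothetical node payload: `active` is meant to point into `values`.
// Copying or relocating the struct copies the pointer bit-for-bit, so the
// copy keeps pointing at the old location.
struct NodeData {
  int values[4];
  const int *active;
};

// Re-derive the internal pointer from the struct's current address.
// In the quick fix this kind of update happens right before the data is
// handed to the command buffer.
inline void fix_pointers(NodeData &node)
{
  node.active = node.values;
}
```

After a node is copied into a vector its `active` pointer still refers to the source object; calling `fix_pointers()` on the stored element makes it self-consistent again.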
Pull Request: https://projects.blender.org/blender/blender/pulls/121723
The render graph tests initialized a command buffer wrapper that
requires a working device. The wrapper was not used, but when
destructing it would try to deallocate the command buffer, which
cannot be done as that requires a working device as well.
This PR removes the unneeded command buffer wrappers in the test
cases.
The result isn't consistent on Windows due to copying structs with arrays.
This is a quick fix and will be looked at next week. The code isn't used at the moment.
Pull Request: https://projects.blender.org/blender/blender/pulls/121670