These functions are trivial and shouldn't add the cost of a call.
They appeared in profiles, which they shouldn't since they mostly
just return access to member variables. Inlining them reduces
the backend's overhead when sculpting.
Also reserve a Vector before repeated appending.
Pull Request: https://projects.blender.org/blender/blender/pulls/138349
This PR implements dynamic viewport state for the Vulkan gpu backend.
By doing so, it fixes#130914.
The following high-level changes were made:
1. The pipeline pool no longer uses the viewport and scissor
states to identify graphics pipelines, only the number of viewports
and the number of scissors. Graphics pipelines are configured with
dynamic viewport and scissor states upon construction.
2. The desired viewport and scissor configurations for drawing are set
in the data of the draw nodes in the render graph.
3. The draw nodes use these viewport and scissors settings in
`build_commands`. If the viewport and scissor settings have changed
between nodes, then vkCmdSetViewport and vkCmdSetScissor commands
are sent to the command buffer.
4. Tests are updated to verify that set_viewport and set_scissor commands
are executed the correct number of times. (Also note that I needed to
#136987 in order to avoid skipping some Vulkan tests).
See the attached screencast for verification. The number of graphics pipelines
no longer grow when resizing the viewport.
Pull Request: https://projects.blender.org/blender/blender/pulls/137002
This PR implements a new the threading model for building render graphs
based on tests performed last month. For out workload multithreaded
command building will block in the driver or device. So better to use a
single thread for command building.
Details of the internal working is documented at https://developer.blender.org/docs/features/gpu/vulkan/render_graph/
- When a context is activated on a thread the context asks for a
render graph it can use by calling `VKDevice::render_graph_new`.
- Parts of the GPU backend that requires GPU commands will add a
specific render graph node to the render graph. The nodes also
contains a reference to all resources it needs including the
access it needs and the image layout.
- When the context is flushed the render graph is submitted to the
device by calling `VKDevice::render_graph_submit`.
- The device puts the render graph in `VKDevice::submission_pool`.
- There is a single background thread that gets the next render
graph to send to the GPU (`VKDevice::submission_runner`).
- Reorder the commands of the render graph to comply with Vulkan
specific command order rules and reducing possible bottlenecks.
(`VKScheduler`)
- Generate the required barriers `VKCommandBuilder::groups_extract_barriers`.
This is a separate step to reduce resource locking giving other
threads access to the resource states when they are building
the render graph nodes.
- GPU commands and pipeline barriers are recorded to a VkCommandBuffer.
(`VKCommandBuilder::record_commands`)
- When completed the command buffer can be submitted to the device
queue. `vkQueueSubmit`
- Render graphs that have been submitted can be reused by a next
thread. This is done by pushing the render graph to the
`VKDevice::unused_render_graphs` queue.
Pull Request: https://projects.blender.org/blender/blender/pulls/132681
Immediate mode uses the old 'resource tracker' which has been replaced
by swap chain resource pools. This PR optimizes immediate mode buffers
by utilizing resource pools.
Pull Request: https://projects.blender.org/blender/blender/pulls/128188
Resource binding was over-complicated as I didn't understood the state
manager and vulkan to make the correct decisions at that time. This
refactor will remove a lot of the complexity and improves the performance.
**Performance**
The performance improvement is noticeable in complex grease pencil
scenes.
Grease pencil benchmark file picknick:
- `NVIDIA Quadro RTX 6000` 17 fps -> 24 fps
- `Intel(R) Arc(tm) A750 Graphics (DG2)` 6 -> 21 fps
**Bottle-neck**
The performance improvements originates from moving the update entry
point from state manager to shader interface. The previous implementation
(state manager) had to loop over all the bound resources and find in the
shader interface where it was located in the descriptor set. Ignoring
resources that were not used by the shader. But also making it hard
to determine if descriptor sets actually changed. Previous implementation
assumed descriptor sets always changed.
When descriptor set changed a new descriptor set needed to be allocated.
Most drivers this is a fast operation, but on Intel/Mesa this was measurable
slow. Using an allocation pool doesn't fit the Vulkan API as you are only
able to reuse when the layout matches exactly. Of course doable, but requires
another structure to keep track of the actual layouts.
**Solution**
By using the shader interface as entry point we can:
1. Keep track if there are any changes in the state manager. If not and the
layout is the same, the previous shader can be reused.
2. In stead of looping over each bound resource, we loop over bind points.
**Future extensions**
Bundle all descriptor set uploads just before use. This would be more
in line with how 'modern' Vulkan should be implemented. This PR already
separates the uploading from the updating and technically allows to upload
more than one descriptor set.
Instead of looking 1 set back we should measure if we can handle multiple
or keep track of the different layouts resources to improve the performance
even further.
Optional use `VK_KHR_descriptor_buffer` when available.
Pull Request: https://projects.blender.org/blender/blender/pulls/128068
De-interleaved vertex buffers offsets the attribute in the buffer to
the de-interleaved position. The vertex attribute offset is limited by a
constrained and would raise an error when the buffers just a bit larger.
*VUID-VkVertexInputAttributeDescription-offset-00622*: offset must
be less than or equal to `VkPhysicalDeviceLimits::maxVertexInputAttributeOffset`
This PR fixes this by offsetting the buffer in stead of the attribute.
Offsetting buffers is limited by the amount of memory.
Pull Request: https://projects.blender.org/blender/blender/pulls/128031
Vulkan backend has recently switched to a render graph approach. Many
code was left so we could develop the render graph beside the previous
implementation. Last week we removed the switch. This PR will remove
most of the unused code. There might be some left and will be removed
when detected.
Pull Request: https://projects.blender.org/blender/blender/pulls/123422
This PR hooks up the vulkan backend with the render graph
for drawing. It can run Blender better than the previous
implementation so we also flipped it to be the default
implementation.
**Some highlights**
- Adds support for framebuffer load/store operations
- Adds support for framebuffer subpass transitions
- Fixes workbench shadows
- Performance is just below OpenGL performance when comparing
fps. But the screen feels more fluent when using complex
scenes.
- Current performance is without doing any optimizations so
will improve in the future.
- EEVEE will not crash but has artifacts and many parts that
require more work.
**Related to**
- #121648
- #118330
**Known Limitation**
- Similar to previous implementation resources can be freed when
still in use crashing Blender. This is typically the case when
playing back an animation or updating a material icon.
**Next steps**
- Remove old implementation
- Get EEVEE to work
- Fix double resource freeing
- Improve performance by identifying hotspots and change them
Pull Request: https://projects.blender.org/blender/blender/pulls/121787
When building the resource access used when adding dispatch/draw commands
to the render graph, the access mask is required. This PR stores the
access mask in the shader interface. When binding the resources referenced
by the state manager, the resource access info struct is populated with
the access flags.
In the near future the resource access info will be passed when adding
a dispatch/draw node to the render graph to generate the links.
Pull Request: https://projects.blender.org/blender/blender/pulls/120908
Previously a storage buffer was used to store draw list commands as it
matches already existing APIs. Unfortunately StorageBuffers prefers to
be stored on the GPU device and would reduce the benefit of a dynamic
draw list.
This PR replaces the storage buffer with a regular buffer, which keeps
more control where to store the buffer.
Pull Request: https://projects.blender.org/blender/blender/pulls/117712
Currently all buffer types were stored in host memory, which is visible to the GPU as well.
This is typically slow as the data would be transferred over the PCI bus when used.
Most of the time Index and Vertex buffers are written once and read many times so it makes
more sense to locate them on the GPU. Storage buffers typically require quick access as they
are created for shading/compute purposes.
This PR will try to store vertex buffers, index buffers and storage buffers on device memory
to improve the performance.
Uniform buffers are still located on host memory as they can be uploaded during binding process.
This can (will) reset the graphics pipeline triggering draw calls using unattached resources.
In future this could be optimized further as in:
* using different pools for allocating specific buffers, with a fallback when buffers cannot be
stored on the GPU anymore.
* store uniform buffers in device memory
Pull Request: https://projects.blender.org/blender/blender/pulls/115343
Multi indirect drawing would bind an offset index buffer, but
indirect drawing parameters also offset the index buffer so
incorrect geometry was drawn.
Fixes drawing of meshes with multiple materials.
Pull Request: https://projects.blender.org/blender/blender/pulls/115190
Some minor tweaks to the vulkan backend to support grease pencil
drawing. The changes include:
* Add support for GPU_DATA_10_11_11_REV clearing
* Use correct index buffer start and count
Anti aliasing isn't working as they require different samplers being
configured and that require some design work.
Effects haven't been tested.
Pull Request: https://projects.blender.org/blender/blender/pulls/114659
Goal is to reduce the number of command buffer flushes by tracking what is
happening in the different command queues. This is an initial step towards
advanced queue-ing strategies.
The new (intermediate) strategy records commands to different command
buffers based on what they do. There is a command buffer for data transfers,
compute pipelines and graphics pipelines.
When a compute command is recorded it ensures that all graphic commands
are finished. When a graphic command is recorded it ensures all compute
commands are finished. When a graphic or compute command is scheduled
all recorded data transfer commands are scheduled as well.
Some improvements are expected as multiple compute and data transfers
commands can now be scheduled at the same time and don't need to unbind
and rebind render passes. Especially when using EEVEE-Next which is
compute centric the performance change is visible for the user.
Pull Request: https://projects.blender.org/blender/blender/pulls/114104
This PR implements indirect drawing for the Vulkan backend. Indirect
drawing is a requirement for workbench-next.
NOTE: that this is one of multiple changes needed to get to the same
support level. With this patch only objects at the center of the world
are drawn correctly.
Pull Request: https://projects.blender.org/blender/blender/pulls/111334
Listing the "Blender Foundation" as copyright holder implied the Blender
Foundation holds copyright to files which may include work from many
developers.
While keeping copyright on headers makes sense for isolated libraries,
Blender's own code may be refactored or moved between files in a way
that makes the per file copyright holders less meaningful.
Copyright references to the "Blender Foundation" have been replaced with
"Blender Authors", with the exception of `./extern/` since these this
contains libraries which are more isolated, any changed to license
headers there can be handled on a case-by-case basis.
Some directories in `./intern/` have also been excluded:
- `./intern/cycles/` it's own `AUTHORS` file is planned.
- `./intern/opensubdiv/`.
An "AUTHORS" file has been added, using the chromium projects authors
file as a template.
Design task: #110784
Ref !110783.
A lot of files were missing copyright field in the header and
the Blender Foundation contributed to them in a sense of bug
fixing and general maintenance.
This change makes it explicit that those files are at least
partially copyrighted by the Blender Foundation.
Note that this does not make it so the Blender Foundation is
the only holder of the copyright in those files, and developers
who do not have a signed contract with the foundation still
hold the copyright as well.
Another aspect of this change is using SPDX format for the
header. We already used it for the license specification,
and now we state it for the copyright as well, following the
FAQ:
https://reuse.software/faq/
Initial graphic pipeline targeting. The goal of this PR is to have an initial
graphics pipeline with missing features. It should help identifying
areas that requires engineering.
Current state is that developers of the GPU module can help with the many
smaller pieces that needs to be engineered in order to get it working. It is not
intended for users or developers from other modules, but your welcome to learn
and give feedback on the code and engineering part.
We do expect that large parts of the code still needs to be re-engineered into
a more future-proof implementation.
**Some highlights**:
- In Vulkan the state is kept in the pipeline. Therefore the state is tracked
per pipeline. In the near future this could be used as a cache. More research
is needed against the default pipeline cache that vulkan already provides.
- This PR is based on the work that Kazashi Yoshioka already did. And include
work from him in the next areas
- Vertex attributes
- Vertex data conversions
- Pipeline state
- Immediate support working.
- This PR modifies the VKCommandBuffer to keep track of the framebuffer and its
binding state(render pass). Some Vulkan commands require no render pass to be
active, other require a render pass. As the order of our commands on API level
can not be separated this PR introduces a state engine to keep track of the
current state and desired state. This is a temporary solution, the final
solution will be proposed when we have a pixel on the screen. At that time
I expect that we can design a command encoder that supports all the cases
we need.
**Notices**:
- This branch works on NVIDIA GPUs and has been validated on a Linux system. AMD
is known not to work (stalls) and Intel GPUs have not been tested at all. Windows might work
but hasn't been validated yet.
- The graphics pipeline is implemented with pixels in mind, not with performance. Currently
when a draw call is scheduled it is flushed and waited until it is finished drawing, before
other draw calls can be scheduled. We expected the performance to be worse that it actually
is, but we expect huge performance gains in the future.
- Any advanced drawing (that is used by the image editor, compositor or 3d viewport) isn't
implemented and might crash when used.
- Using multiple windows or resizing of window isn't supported and will stall the system.
Pull Request: https://projects.blender.org/blender/blender/pulls/106224
The goal is to solve confusion of the "All rights reserved" for licensing
code under an open-source license.
The phrase "All rights reserved" comes from a historical convention that
required this phrase for the copyright protection to apply. This convention
is no longer relevant.
However, even though the phrase has no meaning in establishing the copyright
it has not lost meaning in terms of licensing.
This change makes it so code under the Blender Foundation copyright does
not use "all rights reserved". This is also how the GPL license itself
states how to apply it to the source code:
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software ...
This change does not change copyright notice in cases when the copyright
is dual (BF and an author), or just an author of the code. It also does
mot change copyright which is inherited from NaN Holding BV as it needs
some further investigation about what is the proper way to handle it.
For example
```
OIIOOutputDriver::~OIIOOutputDriver()
{
}
```
becomes
```
OIIOOutputDriver::~OIIOOutputDriver() {}
```
Saves quite some vertical space, which is especially handy for
constructors.
Pull Request: https://projects.blender.org/blender/blender/pulls/105594
This patch adds a placeholder for the vulkan backend.
When activated (`WITH_VULKAN_BACKEND=On` and `--gpu-backend vulkan`)
it might open a blender screen, but nothing should be visible as
none of the functions are implemented or otherwise crash on a nullptr.
This is expected as this is just a placeholder. The goal is to add shader compilation
+validation to this backend as one of the next steps so we can validate
changes to existing shaders on OpenGL, Metal and Vulkan at the same time.
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D16338