Part of #136993.
Share as much of the ShaderCompiler implementations as possible.
Remove the ShaderCompiler/ShaderCompilerGeneric split and make most of
its functions non virtual.
Move the `get_compiler` function from `Context` to `GPUBackend` and
creation/deletion to `GPUBackend::init/delete_resources`.
Add a `batch_cancel` function to `ShaderCompiler` (needed for the
GPUPass refactor).
As a nice extra, the multithreaded OpenGL compilation has become faster
too.
The barbershop materials + EEVEE static shaders have gone from 27s to
22s.
I have not observed any performance difference on Vulkan or Metal.
Pull Request: https://projects.blender.org/blender/blender/pulls/136676
Multiple threads would be setting the globals
`g_shader_builtin_srgb_transform` and
`g_shader_builtin_srgb_is_dirty`.
These are use for color management inside the builtin
shaders. But the render thread could modify these
values even if its shader have no use of these.
The fix is to move these globals to the `gpu::Context`
class. This way we remove the race condition.
Update the `ShaderCompilerGeneric` to support deferred compilation
using the batch compilation API, so we can get rid of
`drw_manager_shader`.
This approach also allows supporting non-blocking compilation
for static shaders.
This shouldn't cause any behavior changes at the moment, since batch
compilation is not yet used when parallel compilation is disabled.
This adds a `GPUWorker` and a `GPUSecondaryContext` as an easy to use
wrapper for managing secondary GPU contexts.
(Part of #133674)
Pull Request: https://projects.blender.org/blender/blender/pulls/136518
This patch adds the texture pool functionality that was previously
only available in the DRW module to the GPU module.
This allows to not rely on global `DST` variable for the managment
of these temporary textures.
Moreover, this can be extended using dedicated GPU backend
specific behavior to reduce the amount of memory needed
to render.
The implementation is mostly copy pasted from the draw implementation
but with more documentation. Also it is simplified since the
`DRW_texture_pool_query` functionality is not needed.
Pull Request: https://projects.blender.org/blender/blender/pulls/134403
Framebuffers are getting freed in the GPUContext base class destructor. But
the framebuffer destructors use the MTL/VK/GLContext derived class, whose
destructor has already completed at this point. So these contexts are no
longer valid to use.
Now free the framebuffers earlier.
This caused ASAN warnings, it's not known to cause actual bugs.
Pull Request: https://projects.blender.org/blender/blender/pulls/132504
This port is not so straightforward.
This shader is used in different configurations and is
available to python bindings. So we need to keep
compatibility with different attributes configurations.
This is why attributes are loaded per component and a
uniform sets the length of the component.
Since this shader can be used from both the imm and batch
API, we need to inject some workarounds to bind the buffers
correctly.
The end result is still less versatile than the previous
metal workaround (i.e.: more attribute fetch mode supported),
but it is also way less code.
### Limitations:
The new shader has some limitation:
- Both `color` and `pos` attributes need to be `F32`.
- Each attribute needs to be 4byte aligned.
- Fetch type needs to be `GPU_FETCH_FLOAT`.
- Primitive type needs to be `GPU_PRIM_LINES`, `GPU_PRIM_LINE_STRIP` or `GPU_PRIM_LINE_LOOP`.
- If drawing using an index buffer, it must contain no primitive restart.
Rel #127493
Co-authored-by: Jeroen Bakker <jeroen@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/129315
Printf buffer read needs to be inside render boundaries
to work. Since render boundaries can be nested, use a stack.
Fixes assert when quitting blender.
This allows much easier debugging of shader programs.
Usage is as simple as adding `printf` calls inside shaders.
example: `printf("Formating %d\n", my_var);`
Contrary to the `drw_print`, this is not limited
to draw manager shader dispatch/draws. It is compatible
with any shader inside blender.
Most notably, this doesn't need a viewport to display.
So this can be used to debug render pipeline.
Data formating is currently limited to only `%x`, `%d`,
`%u` and `%f`. This could be easily extended if this is
really needed.
There is no type checking, so values are directly reinterpreted
as specified by the printf format.
The current approach for making this work is to bind a
storage buffer inside `GPU_shader_bind`, making it
available to any shader that needs it. The storage buffer
is downloaded back to CPU after a frame or a render
step and the content printed to the console.
This scheduling means that you cannot rely on these printfs
to detect crashes. We could add a mode to force flushing
at shader binding to avoid this limitation.
The values are written from the shaders in binary form and
only formated on the CPU. This avoid issues with manual
printing like with `drw_print`.
Pull Request: https://projects.blender.org/blender/blender/pulls/125071
This is the first commit of the several required to support
subprocess-based parallel compilation on OpenGL.
This provides the base API and implementation, and exposes the max
subprocesses setting on the UI, but it's not used by any code yet.
More information and the rest of the code can be found in #121925.
This one includes:
- A new `GPU_shader_batch` API that allows requesting the compilation
of multiple shaders at once, allowing GPU backed to compile them in
parallel and asynchronously without blocking the Blender UI.
- A virtual `ShaderCompiler` class that backends can use to add their
own implementation.
- A `ShaderCompilerGeneric` class that implements synchronous/blocking
compilation of batches for backends that don't have their own
implementation yet.
- A `GLShaderCompiler` that supports parallel compilation using
subprocesses.
- A new `BLI_subprocess` API, including IPC (required for the
`GLShaderCompiler` implementation).
- The implementation of the subprocess program in
`GPU_compilation_subprocess`.
- A new `Max Shader Compilation Subprocesses` option in
`Preferences > System > Memory & Limits` to enable parallel shader
compilation and the max number of subprocesses to allocate (each
subprocess has a relatively high memory footprint).
Implementation Overview:
There's a single `GLShaderCompiler` shared by all OpenGL contexts.
This class stores a pool of up to `GCaps.max_parallel_compilations`
subprocesses that can be used for compilation.
Each subprocess has a shared memory pool used for sending the shader
source code from the main Blender process and for receiving the already
compiled shader binary from the subprocess. This is synchronized using
a series of shared semaphores.
The subprocesses maintain a shader cache on disk inside a
`BLENDER_SHADER_CACHE` folder at the OS temporary folder.
Shaders that fail to compile are tried to be compiled again locally for
proper error reports.
Hanged subprocesses are currently detected using a timeout of 30s.
Pull Request: https://projects.blender.org/blender/blender/pulls/122232
This PR adds a context function to consider all
buffer bindings obsolete. This is in order to
track missing binds and invalid lingering states
accross `draw::Pass`es.
The functions `GPU_storagebuf_debug_unbind_all`
and `GPU_uniformbuf_debug_unbind_all` do nothing
more than resetting the internal debug slot bits
to zero. This is what OpenGL backend does as it
doesn't track the bindings themselves.
Other backends might have other way to detect
missing bindings. If not they should be
implemented separately anyway.
I renamed the function to `debug_unbind_all` to
denote that it actually does something related to
debugging.
This also add SSBO binding check for OpenGL as it
was also missing.
#### Future
This error checking logic is pretty much backend
agnostic. While it would be nice to move it at
`gpu::Context` level, we don't have the resources
for that now.
Pull Request: https://projects.blender.org/blender/blender/pulls/120716
Adds an option to set the capture title when using renderdoc
`GPU_debug_capture_begin` has an optional `title` parameter to set
the title of the renderdoc capture.
Pull Request: https://projects.blender.org/blender/blender/pulls/118649
A lot of files were missing copyright field in the header and
the Blender Foundation contributed to them in a sense of bug
fixing and general maintenance.
This change makes it explicit that those files are at least
partially copyrighted by the Blender Foundation.
Note that this does not make it so the Blender Foundation is
the only holder of the copyright in those files, and developers
who do not have a signed contract with the foundation still
hold the copyright as well.
Another aspect of this change is using SPDX format for the
header. We already used it for the license specification,
and now we state it for the copyright as well, following the
FAQ:
https://reuse.software/faq/
Adds two modes of GPU frame capture support for
enhanced debugging. GPU frame capture begin/end
allow instantaneous frame capture of all GPU commands
within the capture boundary.
GPU frame capture scopes allow several user-defined capture
regions which can wrap key parts of code. These scopes are
exposed to connected GPU tools allowing the user to manually
trigger a capture of a known scope at the desired time.
This is currently integrated with the Metal backend for
support with Xcode.
Related to #105591
Pull Request: https://projects.blender.org/blender/blender/pulls/105717
Full support for translation and compilation of shaders in Metal, using
GPUShaderCreateInfo. Includes render pipeline state creation and management,
enabling all standard GPU viewport rendering features in Metal.
Authored by Apple: Michael Parkin-White, Marco Giordano
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D15563
MTLFrameBuffer has been implemented to support creation of RenderCommandEncoders, along with supporting functionality in the Metal Context.
Optimisation stubs for GPU_framebuffer_bind_ext has been added, which enables specific assignment of attachment load-store ops at the bind level, rather than on a framebuffer object as a whole.
Begin and end frame markers are used to encapsulate frame boundaries for explicit workload submission. This is required for explicit APIs where implicit flushing of work does not occur.
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D15027
Use a shorter/simpler license convention, stops the header taking so
much space.
Follow the SPDX license specification: https://spdx.org/licenses
- C/C++/objc/objc++
- Python
- Shell Scripts
- CMake, GNUmakefile
While most of the source tree has been included
- `./extern/` was left out.
- `./intern/cycles` & `./intern/atomic` are also excluded because they
use different header conventions.
doc/license/SPDX-license-identifiers.txt has been added to list SPDX all
used identifiers.
See P2788 for the script that automated these edits.
Reviewed By: brecht, mont29, sergey
Ref D14069
Debug groups makes it easier to view from where an error comes from.
The backend can also implement its own callback to make it easier to
follow the API call structure in frame debuggers.
This makes the GPUContext follow the same naming convention as the rest
of the module.
Also add a static getter for extra bonus style (no need for casts):
- Context::get()
- GLContext::get()
This is part of the Vulkan backend task T68990.
This is mostly a cleanup, however, there is a small change:
We don't use a special Vertex Array binding function for Immediate
anymore and just reuse the one for batches.
This might create a bit more state changes but this could be fixed
easily if it causes perf regression.
# Conflicts:
# source/blender/gpu/intern/gpu_context.cc
This is related to the Vulkan port T68990.
This is a full cleanup of the Framebuffer module and a separation
of OpenGL related functions.
There is some changes with how the default framebuffers are handled.
Now the default framebuffers are individually wrapped inside special
GLFrameBuffers. This make it easier to keep track of the currently bound
framebuffer state and have some specificity for operations on these
framebuffers.
Another change is dropping the optimisation of only configuring the
changed attachements during framebuffers update. This does not give
any benefits and add some complexity to the code. This might be brought
back if it has a performance impact on some systems.
This also adds support for naming framebuffers but it is currently not
used.
This is in preparation of the Framebuffer GL backend.
This is a just changing types and moving some code.
No logic is changed... almost... it just removes the context attach.
i.e: `gpu_context_add/remove_framebuffer()`
This is not needed for now and was even disabled in release.
This is part of T68990.
This changes the drawing paradigm a bit. The VAO configuration is done
JIT-style and depends on context active shader.
This is to allow more flexibility for implementations to do optimization
at lower level.
The vao cache is now its own class to isolate the concept. It is this
class that is reference by the GLContext for ownership of the containing
VAO ids.