Cleanup and simplification of GPUMaterial and GPUPass compilation.
See #133674 for details/goals.
- Remove the `draw_manage_shader` thread.
Deferred compilation is now handled by the gpu::ShaderCompiler
through the batch compilation API.
Batch management is handled by the `GPUPassCache`.
- Simplify `GPUMaterial` status tracking so it just queries the
`GPUPass` status.
- Split the `GPUPass` and the `GPUCodegen` code.
- Replaced the (broken) `GPU_material_recalc_flag_get` with the new
`GPU_pass_compilation_timestamp`.
- Add the `GPU_pass_cache_wait_for_all` and
`GPU_shader_batch_wait_for_all`, and remove the busy waits from
EEVEE.
- Remove many unused functions, properties, includes...
Pull Request: https://projects.blender.org/blender/blender/pulls/135637
It can happen that the previous context drew with a different
colorspace. In the case where the new context is drawing with
the same shader that was previously bound (shader binding
optimization), the uniform would not be set again because the
dirty flag would not have been set (since the color space of
this new context never changed). The shader would reuse the same
colorspace as the previous context framebuffer (see #137855).
Fix#137855
Pull Request: https://projects.blender.org/blender/blender/pulls/139226
This caused UB in the tests now that tests are all ran
inside the same context.
A shader could be free but its pointer would be dangling
inside the `Context`. A new shader could have the same
address and generate UB after binding.
This is not the best way to solve this issue but at least
we prevent the use of the UB.
Pull Request: https://projects.blender.org/blender/blender/pulls/139109
This patch adds a new `BLI_mutex.hh` header which adds `blender::Mutex` as alias
for either `tbb::mutex` or `std::mutex` depending on whether TBB is enabled.
Description copied from the patch:
```
/**
* blender::Mutex should be used as the default mutex in Blender. It implements a subset of the API
* of std::mutex but has overall better guaranteed properties. It can be used with RAII helpers
* like std::lock_guard. However, it is not compatible with e.g. std::condition_variable. So one
* still has to use std::mutex for that case.
*
* The mutex provided by TBB has these properties:
* - It's as fast as a spin-lock in the non-contended case, i.e. when no other thread is trying to
* lock the mutex at the same time.
* - In the contended case, it spins a couple of times but then blocks to avoid draining system
* resources by spinning for a long time.
* - It's only 1 byte large, compared to e.g. 40 bytes when using the std::mutex of GCC. This makes
* it more feasible to have many smaller mutexes which can improve scalability of algorithms
* compared to using fewer larger mutexes. Also it just reduces "memory slop" across Blender.
* - It is *not* a fair mutex, i.e. it's not guaranteed that a thread will ever be able to lock the
* mutex when there are always more than one threads that try to lock it. In the majority of
* cases, using a fair mutex just causes extra overhead without any benefit. std::mutex is not
* guaranteed to be fair either.
*/
```
The performance benchmark suggests that the impact is negilible in almost
all cases. The only benchmarks that show interesting behavior are the once
testing foreach zones in Geometry Nodes. These tests are explicitly testing
overhead, which I still have to reduce over time. So it's not unexpected that
changing the mutex has an impact there. What's interesting is that on macos the
performance improves a lot while on linux it gets worse. Since that overhead
should eventually be removed almost entirely, I don't really consider that
blocking.
Links:
* Documentation of different mutex flavors in TBB:
https://www.intel.com/content/www/us/en/docs/onetbb/developer-guide-api-reference/2021-12/mutex-flavors.html
* Older implementation of a similar mutex by me:
https://archive.blender.org/developer/differential/0016/0016711/index.html
* Interesting read regarding how a mutex can be this small:
https://webkit.org/blog/6161/locking-in-webkit/
Pull Request: https://projects.blender.org/blender/blender/pulls/138370
- Make the module global and allow usage from anywhere.
- Remove the matrix API for thread safety reason
- Add lifetime management
- Make display linked to the overlays for easy toggling
## Notes
- Lifetime is in redraw. If there is 4 viewport redrawing, the lifetime decrement by 4 for each window redraw. This allows one viewport to be producer and another one being the consumer.
- Display happens at the start of overlays. Any added visuals inside of the overlays drawing functions will be displayed the next redraw.
- Redraw is not given to happen. It is only given if there is some scene / render update which tags the viewport to redraw.
- Persistent lines are not reset on file load.
Rel #137006
Pull Request: https://projects.blender.org/blender/blender/pulls/137106
Includes win32 specific extensions definitions when including
`vk_common.hh`. Inside `gpu_context.cc` vulkan needs to be
included before opengl, otherwise windows 10 builders will
report a warning.
```
[6421/7520] Building CXX object source\blender\gpu\CMakeFiles\bf_gpu.dir\intern\gpu_context.cc.obj
C:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared\minwindef.h(130): warning C4005: 'APIENTRY': macro redefinition
C:\Users\blender\git\blender-vexp\blender.git\lib\windows_x64\epoxy\include\epoxy/gl.h(59): note: see previous definition of 'APIENTRY'
```
Pull Request: https://projects.blender.org/blender/blender/pulls/137134
Update the `ShaderCompilerGeneric` to support deferred compilation
using the batch compilation API, so we can get rid of
`drw_manager_shader`.
This approach also allows supporting non-blocking compilation
for static shaders.
This shouldn't cause any behavior changes at the moment, since batch
compilation is not yet used when parallel compilation is disabled.
This adds a `GPUWorker` and a `GPUSecondaryContext` as an easy to use
wrapper for managing secondary GPU contexts.
(Part of #133674)
Pull Request: https://projects.blender.org/blender/blender/pulls/136518
The previous fix 8f00f068ad
doesn't work as the printf buffer gets recreated.
Ensure render boundaries at lower level and do the printf
flush manually.
This patch adds the texture pool functionality that was previously
only available in the DRW module to the GPU module.
This allows to not rely on global `DST` variable for the managment
of these temporary textures.
Moreover, this can be extended using dedicated GPU backend
specific behavior to reduce the amount of memory needed
to render.
The implementation is mostly copy pasted from the draw implementation
but with more documentation. Also it is simplified since the
`DRW_texture_pool_query` functionality is not needed.
Pull Request: https://projects.blender.org/blender/blender/pulls/134403
Framebuffers are getting freed in the GPUContext base class destructor. But
the framebuffer destructors use the MTL/VK/GLContext derived class, whose
destructor has already completed at this point. So these contexts are no
longer valid to use.
Now free the framebuffers earlier.
This caused ASAN warnings, it's not known to cause actual bugs.
Pull Request: https://projects.blender.org/blender/blender/pulls/132504
Rendering animations from Python scripts via `bpy.ops.render.opengl()`
did not trigger any of the notifications in the Metal back-end to
indicate a frame had been rendered and that the associated resources
could be released. This adds a call to GPU_render_step() after each
render. For the original asset in the bug report this reduces the high
memory watermark from 30gb to 13gb for 500 frames. 13gb is likely
still too high and therefore it is likely there are additional leaks
that need to be addressed so this should only be considered a partial
fix.
Authored by Apple: James McCarthy
Co-authored-by: James McCarthy <jamesmccarthy@apple.com>
Co-authored-by: Clément Foucault <foucault.clem@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/131085
This port is not so straightforward.
This shader is used in different configurations and is
available to python bindings. So we need to keep
compatibility with different attributes configurations.
This is why attributes are loaded per component and a
uniform sets the length of the component.
Since this shader can be used from both the imm and batch
API, we need to inject some workarounds to bind the buffers
correctly.
The end result is still less versatile than the previous
metal workaround (i.e.: more attribute fetch mode supported),
but it is also way less code.
### Limitations:
The new shader has some limitation:
- Both `color` and `pos` attributes need to be `F32`.
- Each attribute needs to be 4byte aligned.
- Fetch type needs to be `GPU_FETCH_FLOAT`.
- Primitive type needs to be `GPU_PRIM_LINES`, `GPU_PRIM_LINE_STRIP` or `GPU_PRIM_LINE_LOOP`.
- If drawing using an index buffer, it must contain no primitive restart.
Rel #127493
Co-authored-by: Jeroen Bakker <jeroen@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/129315
Printf buffer read needs to be inside render boundaries
to work. Since render boundaries can be nested, use a stack.
Fixes assert when quitting blender.
Make sure all printing happens inside render boundaries
since it needs to read a storage buffer which needs to
record some commands inside command buffers.
This PR allows users to select a GPU backend.
In the system tab of the user preferences the GPU backend can be selected in the `Display Graphics` panel.
It will require a restart of Blender before the changes become effective.
During startup minimum requirements are checked. Blender will switch automatically
to OpenGL when no compatible Vulkan device could be detected. A dialog will be shown
to inform the user.
The setting of the in the `Display Graphics` panel are still overridden when blender is started
using the `--gpu-backend` option. When starting blender with `--debug-gpu` the backend
detection will print to the console.
See PR for detailed information and screenshots of the UI.
Implements #126504
Pull Request: https://projects.blender.org/blender/blender/pulls/126545
This allows much easier debugging of shader programs.
Usage is as simple as adding `printf` calls inside shaders.
example: `printf("Formating %d\n", my_var);`
Contrary to the `drw_print`, this is not limited
to draw manager shader dispatch/draws. It is compatible
with any shader inside blender.
Most notably, this doesn't need a viewport to display.
So this can be used to debug render pipeline.
Data formating is currently limited to only `%x`, `%d`,
`%u` and `%f`. This could be easily extended if this is
really needed.
There is no type checking, so values are directly reinterpreted
as specified by the printf format.
The current approach for making this work is to bind a
storage buffer inside `GPU_shader_bind`, making it
available to any shader that needs it. The storage buffer
is downloaded back to CPU after a frame or a render
step and the content printed to the console.
This scheduling means that you cannot rely on these printfs
to detect crashes. We could add a mode to force flushing
at shader binding to avoid this limitation.
The values are written from the shaders in binary form and
only formated on the CPU. This avoid issues with manual
printing like with `drw_print`.
Pull Request: https://projects.blender.org/blender/blender/pulls/125071
Don't assume existence of GPU backend in (background) preview rendering.
Also add null pointer checks and rely on assert instead to detect
invalid usage of GPU_render_begin/end, so that potential future mistakes
don't cause crashes.
Pull Request: https://projects.blender.org/blender/blender/pulls/112971
With the introduction of metal and vulkan we use a different
GPU backend selection which broke the dialog box for unsupported
platforms.
Blender asserted and segfaulted before the dialog was being shown
to the user.
This patch solves this by introducing a dummy GPU backend in case no
GPU backend was supported by OpenGL, Metal and Vulkan Backend.
It also adds the showMessageBox to GHOST_SystemCocoa.
Related to #110335
Pull Request: https://projects.blender.org/blender/blender/pulls/110919
The maximum OpenGL versions supported on mac
doesn't meet the minimum required version (>=4.3) anymore.
This removes all the OpenGL paths in GHOST
Cocoa backend and from the drop down menu in
the user preferences.
Pull Request: https://projects.blender.org/blender/blender/pulls/110185
A lot of files were missing copyright field in the header and
the Blender Foundation contributed to them in a sense of bug
fixing and general maintenance.
This change makes it explicit that those files are at least
partially copyrighted by the Blender Foundation.
Note that this does not make it so the Blender Foundation is
the only holder of the copyright in those files, and developers
who do not have a signed contract with the foundation still
hold the copyright as well.
Another aspect of this change is using SPDX format for the
header. We already used it for the license specification,
and now we state it for the copyright as well, following the
FAQ:
https://reuse.software/faq/
(MacOS) only: In the System tab of the user preferences the user has the
ability to select a GPU backend that Blender will use. After changing
the GPU backend setting, the user has to restart Blender before the
setting is used.
It was added to start collecting feedback on the Metal backend without
using the command lines.
By default Blender will select OpenGL as backend. When Metal is selected
(via `--gpu-backend metal` or via user preferences) OpenGL will be used as
fallback when the platform isn't capable of running Metal.
This patch adds a placeholder for the vulkan backend.
When activated (`WITH_VULKAN_BACKEND=On` and `--gpu-backend vulkan`)
it might open a blender screen, but nothing should be visible as
none of the functions are implemented or otherwise crash on a nullptr.
This is expected as this is just a placeholder. The goal is to add shader compilation
+validation to this backend as one of the next steps so we can validate
changes to existing shaders on OpenGL, Metal and Vulkan at the same time.
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D16338
Add command line argument to switch gpu backend. Add `--gpu-backend` option to
override the gpu backend selected by Blender.
Values for this option that will be available in releases for now are:
* opengl: Force blender to select OpenGL backend.
During development and depending on compile options additional values can exist:
* metal: Force Blender to select Metal backend.
When this option isn't provided the internal logic for GPU backend selection will be used.
Note that this is at the time of writing the same as always selecting the opengl backend.
Reviewed By: fclem, brecht, MichaelPW
Differential Revision: https://developer.blender.org/D16297
MTLContext provides functionality for command encoding, binding management and graphics device management. MTLImmediate provides simple draw enablement with dynamically encoded data. These draws utilise temporary scratch buffer memory to provide minimal bandwidth overhead during workload submission.
This patch also contains empty placeholders for MTLBatch and MTLDrawList to enable testing of first pixels on-screen without failure.
The Metal API also requires access to the GHOST_Context to ensure the same pre-initialized Metal GPU device is used by the viewport. Given the explicit nature of Metal, explicit control is also needed over presentation, to ensure correct work scheduling and rendering pipeline state.
Authored by Apple: Michael Parkin-White
Ref T96261
(The diff is based on 043f59cb3b)
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D15953
Full support for translation and compilation of shaders in Metal, using
GPUShaderCreateInfo. Includes render pipeline state creation and management,
enabling all standard GPU viewport rendering features in Metal.
Authored by Apple: Michael Parkin-White, Marco Giordano
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D15563
Backend initialization needs to be delayed until after the OpenGL context
is created. This worked fine in foreground mode because the OpenGL context
already exists for the window at the point GPU_backend_init_once was called,
but not for background mode.
Create the backend just in time in GPU_context_create as before, and
automatically free it when the last context id discarded. But check if any
GPU backend is supported before creating the OpenGL context.
Ref D15463, D15465
This causes an assert with libepoxy, but was wrong already regardless.
Refactor logic to work as follows:
* GPU_exit() deletes backend resources
* Destroy UI GPU resources with the context active
* Call GPU_backend_exit() after deleting the context
Ref D15291
Differential Revision: https://developer.blender.org/D15465
When rendering with headless builds, show an error instead of crashing.
Previously GPU_backend_init was called indirectly from
DRW_opengl_context_create, a new function is now called from the window
manager (GPU_backend_init_once), so it's possible to check if the GPU
has a back-end.
This also disables the `bgl` Python module when building WITH_HEADLESS.
Reviewed By: fclem
Ref D15463
MTLFrameBuffer has been implemented to support creation of RenderCommandEncoders, along with supporting functionality in the Metal Context.
Optimisation stubs for GPU_framebuffer_bind_ext has been added, which enables specific assignment of attachment load-store ops at the bind level, rather than on a framebuffer object as a whole.
Begin and end frame markers are used to encapsulate frame boundaries for explicit workload submission. This is required for explicit APIs where implicit flushing of work does not occur.
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D15027