644fb2b679 fixed a long-standing issue
where the offscreen example showed the wrong colors. However, the fix assumes
that the input texture color space is always sRGB.
This adds a shader variation that draws textures stored in scene-referred
linear color space (like all of our Image data-blocks).
Co-authored-by: Clément Foucault <foucault.clem@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/147788
Dependencies were previously merged manually
inside the generated_sources by EEVEE.
This caused issues with double includes.
Instead, we now only gather the name of the
nodetree dependencies and add them to the
dependencies of the `GeneratedSource`.
This also makes the compositor use the `GeneratedSource`
mechanism.
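
As a rough illustration of gathering dependency names instead of merging their code, here is a minimal standalone sketch. `GeneratedSourceSketch` and `add_dependencies` are hypothetical names for this example and do not reflect the actual `GeneratedSource` layout.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

/* Hypothetical stand-in for a generated source: code plus dependency names. */
struct GeneratedSourceSketch {
  std::string content;
  std::vector<std::string> dependencies;
};

/* Record each dependency name once instead of pasting its code into `content`. */
void add_dependencies(GeneratedSourceSketch &source, const std::vector<std::string> &names)
{
  std::set<std::string> seen(source.dependencies.begin(), source.dependencies.end());
  for (const std::string &name : names) {
    assert(!name.empty());
    if (seen.insert(name).second) {
      /* First time this name is seen: keep it, preserving insertion order. */
      source.dependencies.push_back(name);
    }
  }
}
```

Because only names are stored, a file requested by several nodes ends up listed once, which is what avoids the double include.
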
Pull Request: https://projects.blender.org/blender/blender/pulls/146106
The goal of this patch is to reduce final shader code footprint to
hopefully reduce shader compile time (see #145347).
This also contains a pass over most shader files to remove unused
includes or use more granular ones to reduce final shader code
length.
Testing with the same setup as #145347:
| | main (ms) | PR (ms) | Delta (ms) |
| -------- | ----------- | ------------ |------------ |
| Nvidia | 257 | 207 (1.24x) | 50 |
| Mesa AMD | 323 | 295 (1.09x) | 28 |
In the barbershop test scene, however, the savings are less noticeable:
| | main (s) | PR (s) | Delta (s) |
| -------- | ----------- | ------------ |------------ |
| Nvidia (OpenGL) | 40 | 39 (1.02x) | 1 |
| Nvidia (Vulkan) | 29 | 29 (1.0x) | 0 |
Pull Request: https://projects.blender.org/blender/blender/pulls/145803
162a24e05d had to be reverted, since it
didn't take into account other types of dynamically generated
`ShaderCreateInfo` (external shaders like OCIO or Python ones).
This just marks `ShaderCreateInfo`s as generated by default and only
sets the ones from `gpu_shader_create_info_list` as non-generated.
Pull Request: https://projects.blender.org/blender/blender/pulls/145128
Shader compilation no longer uses the `WM_job` API.
Add a `GPU_shader_batch_is_compiling` function to query if there's any
shader compilation happening, and update `bpy_app_is_job_running` to
handle this as a special case.
Pull Request: https://projects.blender.org/blender/blender/pulls/143559
`pygpu_shader_attrs_info_get` tries to query information for all
vertex attributes that are added via `VERTEX_IN`. However, some drivers
optimize compiled shaders and remove vertex attributes that are not
used. This fix makes sure that the input length that
is used in `GPU_shader_get_attribute_len` does not exceed the actual max
binding number.
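
The clamping idea can be sketched in standalone C++ as follows. `CompiledInterface` and `query_attr_names` are hypothetical names made up for this example; the real code works on Blender's shader interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

/* Hypothetical stand-in for the reflected interface of a compiled shader. */
struct CompiledInterface {
  /* Attributes that survived driver optimization. */
  std::vector<std::string> active_attrs;
};

/* Return the attribute names that can safely be queried. */
std::vector<std::string> query_attr_names(const CompiledInterface &iface,
                                          std::size_t declared_len)
{
  /* Never iterate past what the driver actually kept. */
  const std::size_t len = std::min(declared_len, iface.active_attrs.size());
  assert(len <= iface.active_attrs.size());
  return std::vector<std::string>(iface.active_attrs.begin(),
                                  iface.active_attrs.begin() + len);
}
```
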
Pull Request: https://projects.blender.org/blender/blender/pulls/137584
Prevent race conditions caused by calling `GPUWorker::wake_up` when the
worker is not waiting.
Found to be an issue in #139627, since `wake_up` is likely to be called
before the thread has fully started.
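
One common way to make an early `wake_up` harmless is to record it in a flag guarded by the same mutex as the condition variable. The sketch below shows that pattern in standalone C++. `WakeSignal` is a hypothetical class, and this is not necessarily how `GPUWorker` implements the fix.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

class WakeSignal {
  std::mutex mutex_;
  std::condition_variable cv_;
  /* Remembers a wake-up that arrived while nobody was waiting. */
  bool pending_ = false;

 public:
  void wake_up()
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      pending_ = true;
    }
    cv_.notify_one();
  }

  void wait()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    /* Returns immediately if `wake_up` was already called. */
    cv_.wait(lock, [this] { return pending_; });
    assert(pending_);
    pending_ = false;
  }

  bool has_pending()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return pending_;
  }
};
```

With the flag in place, a `wake_up` issued before the worker thread reaches `wait` is not lost.
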
Pull Request: https://projects.blender.org/blender/blender/pulls/139842
This allows generating source files that will
be injected into a predefined source dependency tree.
This allows a much cleaner shader workflow where
all sources are explicitly referenced from the
main source file.
Pull Request: https://projects.blender.org/blender/blender/pulls/140047
This works by wrapping the entry point call inside a
`main` function.
Since resources are still defined in global space,
functions accessing them are marked with a custom
attribute. This custom attribute expands into an
`#ifdef` guard for the matching stage.
This is a temporary solution and will eventually
be lifted once we support SRD.
### TODO
- [ ] Implement `[[gpu::vertex/fragment_function]]`.
Pull Request: https://projects.blender.org/blender/blender/pulls/139233
This has limited use cases since it doesn't
profile the heavy part of the Vulkan backend.
Almost 1:1 port of the Metal implementation from #139551.
Doesn't cover rendergraph submission or GPU timings.
Pull Request: https://projects.blender.org/blender/blender/pulls/139899
Allow adding compilation batches to different priority queues.
Set priorities so static shaders are always compiled first,
then materials, and optimized materials last.
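
The ordering described above can be sketched with one FIFO queue per priority level. `CompilationPriority` and `BatchQueues` are hypothetical names for this standalone example, and batches are represented by plain strings.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

/* Hypothetical priority levels; a lower value is compiled first. */
enum class CompilationPriority { Static = 0, Material = 1, OptimizedMaterial = 2 };

class BatchQueues {
  std::array<std::deque<std::string>, 3> queues_;

 public:
  void push(CompilationPriority priority, std::string batch)
  {
    const std::size_t index = static_cast<std::size_t>(priority);
    assert(index < queues_.size());
    queues_[index].push_back(std::move(batch));
  }

  /* Pop the oldest batch from the highest-priority non-empty queue. */
  bool pop(std::string &r_batch)
  {
    for (std::deque<std::string> &queue : queues_) {
      if (!queue.empty()) {
        r_batch = std::move(queue.front());
        queue.pop_front();
        return true;
      }
    }
    return false;
  }
};
```
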
Pull Request: https://projects.blender.org/blender/blender/pulls/139456
Cleanup and simplification of GPUMaterial and GPUPass compilation.
See #133674 for details/goals.
- Remove the `draw_manage_shader` thread.
Deferred compilation is now handled by the gpu::ShaderCompiler
through the batch compilation API.
Batch management is handled by the `GPUPassCache`.
- Simplify `GPUMaterial` status tracking so it just queries the
`GPUPass` status.
- Split the `GPUPass` and the `GPUCodegen` code.
- Replaced the (broken) `GPU_material_recalc_flag_get` with the new
`GPU_pass_compilation_timestamp`.
- Add the `GPU_pass_cache_wait_for_all` and
`GPU_shader_batch_wait_for_all`, and remove the busy waits from
EEVEE.
- Remove many unused functions, properties, includes...
Pull Request: https://projects.blender.org/blender/blender/pulls/135637
This caused UB in the tests now that all tests are run
inside the same context.
A shader could be freed but its pointer would be left dangling
inside the `Context`. A new shader could have the same
address and trigger UB after binding.
This is not the best way to solve this issue but at least
we prevent the UB.
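
The message does not spell out the exact change, so the following is only one plausible shape of such a guard: clear the cached pointer when the shader it refers to is freed. `ShaderSketch`, `ContextSketch`, `shader_bind` and `shader_free` are hypothetical names.

```cpp
#include <cassert>

struct ShaderSketch {
  int id;
};

struct ContextSketch {
  /* Last bound shader; must never outlive the shader itself. */
  ShaderSketch *bound_shader = nullptr;
};

void shader_bind(ContextSketch &ctx, ShaderSketch *shader)
{
  assert(shader != nullptr);
  ctx.bound_shader = shader;
}

void shader_free(ContextSketch &ctx, ShaderSketch *shader)
{
  /* Drop the cached pointer so a later allocation at the same
   * address cannot be mistaken for the already bound shader. */
  if (ctx.bound_shader == shader) {
    ctx.bound_shader = nullptr;
  }
  delete shader;
}
```
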
Pull Request: https://projects.blender.org/blender/blender/pulls/139109
This allows multiple threads to request different specializations without
locking usage of all specialized shader programs while a new specialization
is being compiled.
The specialization constants are bundled in a structure that is
passed to the `Shader::bind()` method. The structure is owned by the
calling thread and only used by `Shader::bind()`.
Only querying for the specialized shader (map lookup) locks the shader
usage.
The variant compilation now also locks, ensuring that
multiple threads trying to compile the same variant never result
in a race condition.
Note that this removes the `is_dirty` optimization. This can be added
back if this becomes a bottleneck in the future. Otherwise, the
performance impact is not noticeable.
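
A much simplified sketch of the scheme: the caller owns the constants and passes them to `bind`, and the variant map is only touched under a mutex. Here the whole lookup-or-compile step shares one lock, which is coarser than what the text describes. `SpecializationConstants` (a plain vector) and `ShaderSketch` are hypothetical stand-ins, and "compiling" just hands out a new handle.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <vector>

/* Hypothetical: constant values owned by the calling thread. */
using SpecializationConstants = std::vector<int>;

class ShaderSketch {
  std::mutex mutex_;
  /* Maps a set of constants to a compiled variant handle. */
  std::map<SpecializationConstants, int> variants_;
  int compile_count_ = 0;

 public:
  /* Return the variant for `constants`, compiling it once if missing. */
  int bind(const SpecializationConstants &constants)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = variants_.find(constants);
    if (it == variants_.end()) {
      /* Stand-in for the real compilation of a new variant. */
      it = variants_.emplace(constants, ++compile_count_).first;
    }
    assert(it->second > 0);
    return it->second;
  }

  int compile_count()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return compile_count_;
  }
};
```

Since no shared "current constants" state lives on the shader, two threads binding different specializations do not overwrite each other's values.
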
Pull Request: https://projects.blender.org/blender/blender/pulls/136991
Fix the recently implemented `ShaderCompiler::batch_cancel`.
Expose it with `GPU_shader_batch_cancel` and
`GPU_shader_specialization_batch_cancel`.
Use them in the EEVEE `ShaderModule` destructor, to prevent blocking on
destruction when there are in-flight compilations.
Pull Request: https://projects.blender.org/blender/blender/pulls/138774
Part of #136993.
Share as much of the ShaderCompiler implementations as possible.
Remove the ShaderCompiler/ShaderCompilerGeneric split and make most of
its functions non virtual.
Move the `get_compiler` function from `Context` to `GPUBackend` and
creation/deletion to `GPUBackend::init/delete_resources`.
Add a `batch_cancel` function to `ShaderCompiler` (needed for the
GPUPass refactor).
As a nice extra, the multithreaded OpenGL compilation has become faster
too.
The barbershop materials + EEVEE static shaders have gone from 27s to
22s.
I have not observed any performance difference on Vulkan or Metal.
Pull Request: https://projects.blender.org/blender/blender/pulls/136676
Multiple threads would be setting the globals
`g_shader_builtin_srgb_transform` and
`g_shader_builtin_srgb_is_dirty`.
These are used for color management inside the builtin
shaders. But the render thread could modify these
values even if its shaders have no use for them.
The fix is to move these globals to the `gpu::Context`
class. This way we remove the race condition.
Update the `ShaderCompilerGeneric` to support deferred compilation
using the batch compilation API, so we can get rid of
`drw_manager_shader`.
This approach also allows supporting non-blocking compilation
for static shaders.
This shouldn't cause any behavior changes at the moment, since batch
compilation is not yet used when parallel compilation is disabled.
This adds a `GPUWorker` and a `GPUSecondaryContext` as an easy to use
wrapper for managing secondary GPU contexts.
(Part of #133674)
Pull Request: https://projects.blender.org/blender/blender/pulls/136518
This is getting in the way of making the
GPUShader API more threadsafe.
This getter already doesn't work for Vulkan
and Metal, and has very limited usage.
The Python function is kept to avoid errors
and now displays a deprecation warning.
Pull Request: https://projects.blender.org/blender/blender/pulls/136983
Move the `StaticShader` class from Workbench to `GPU_shader` and make
compilation thread-safe (Shader usage is still not thread-safe).
Use `StaticShader`s for all shader caches.
Subdivision shaders are still not ported.
(Part of #134690)
Pull Request: https://projects.blender.org/blender/blender/pulls/134812
The removal of the loose uniform made the shader not compile.
This patch adds a new define for this type of shader and adds
back the loose uniform.
Note that these shaders might no longer work on Metal as
the source is not parsed anymore.
Pull Request: https://projects.blender.org/blender/blender/pulls/134341
Compiling graphics shaders via gpu crashed. The Vulkan backend found
a compute source and continued the evaluation as if it were a compute
shader.
The compute source was added by the preprocessor that wraps the shader
source. Even empty sources were wrapped. Detection based on empty shader
sources failed.
This is not a Vulkan-only issue, as other platforms would have similar issues when
creating a compute shader.
Pull Request: https://projects.blender.org/blender/blender/pulls/133036
Use static CreateInfos for Overlay-Next shaders using a similar approach to Workbench shader variations.
Remove unused infos and shader sources.
Remove the `gpu_shader_create_info_get_unfinalized_copy` workaround.
Pull Request: https://projects.blender.org/blender/blender/pulls/131514
In Blender 4.4 (since commit 00a8d006fe), polyline shaders stopped
using geometry shaders and now rely on SSBOs.
In C++, workarounds allow these shaders to function as before, albeit
with some limitations.
However, this change broke the `batch_for_shader` function in Python,
as `GPUShader.attrs_info_get()` only reads attributes and does not
support SSBOs.
To address this, the method now treats polyline shaders differently,
accessing SSBO inputs instead of attributes.
This port is not so straightforward.
This shader is used in different configurations and is
available to the Python bindings, so we need to keep
compatibility with different attribute configurations.
This is why attributes are loaded per component and a
uniform sets the length of the component.
Since this shader can be used from both the imm and batch
API, we need to inject some workarounds to bind the buffers
correctly.
The end result is still less versatile than the previous
Metal workaround (which supported more attribute fetch modes),
but it is also far less code.
### Limitations:
The new shader has some limitations:
- Both `color` and `pos` attributes need to be `F32`.
- Each attribute needs to be 4-byte aligned.
- Fetch type needs to be `GPU_FETCH_FLOAT`.
- Primitive type needs to be `GPU_PRIM_LINES`, `GPU_PRIM_LINE_STRIP` or `GPU_PRIM_LINE_LOOP`.
- If drawing using an index buffer, it must contain no primitive restart.
Rel #127493
Co-authored-by: Jeroen Bakker <jeroen@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/129315
The printf buffer read needs to happen inside render boundaries
to work. Since render boundaries can be nested, use a stack.
Fixes an assert when quitting Blender.
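
The message does not say what exactly is kept on the stack, so the sketch below assumes one pending printf buffer per open render boundary, read back just before that boundary closes. `RenderScopeStack` and its methods are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class RenderScopeStack {
  /* Assumption: one pending printf buffer per open render boundary. */
  std::vector<std::vector<std::string>> scopes_;
  std::vector<std::string> read_back_;

 public:
  void render_begin()
  {
    scopes_.emplace_back();
  }

  void print(std::string message)
  {
    assert(!scopes_.empty());
    scopes_.back().push_back(std::move(message));
  }

  /* Read the innermost buffer while still inside its boundary, then close it. */
  void render_end()
  {
    assert(!scopes_.empty());
    for (std::string &message : scopes_.back()) {
      read_back_.push_back(std::move(message));
    }
    scopes_.pop_back();
  }

  const std::vector<std::string> &read_back() const
  {
    return read_back_;
  }

  std::size_t depth() const
  {
    return scopes_.size();
  }
};
```
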
Avoid measuring the length of strings repeatedly by passing their
length along with their data with `StringRefNull`. Null termination
seems to be necessary still for passing the shader sources to OpenGL.
Though I doubt this is a bottleneck, it's still nice to avoid overhead from
string operations and this helps move in that direction.
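
To illustrate carrying the length along with the data, here is a minimal standalone sketch. `NullTerminatedRef` is a hypothetical, much reduced stand-in for `StringRefNull`, and `total_source_length` is an invented helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

/* Hypothetical, reduced stand-in for a null-terminated string reference. */
struct NullTerminatedRef {
  const char *data;
  std::size_t size;

  /* The length is measured once, when the reference is created. */
  explicit NullTerminatedRef(const char *str) : data(str), size(std::strlen(str)) {}

  /* Still usable where an API expects a null-terminated string. */
  const char *c_str() const
  {
    assert(data[size] == '\0');
    return data;
  }
};

/* Sum the lengths of all sources without calling `strlen` again. */
std::size_t total_source_length(const NullTerminatedRef *sources, std::size_t count)
{
  std::size_t total = 0;
  for (std::size_t i = 0; i < count; i++) {
    total += sources[i].size;
  }
  return total;
}
```
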
Pull Request: https://projects.blender.org/blender/blender/pulls/127702