Cleanup and simplification of GPUMaterial and GPUPass compilation.
See #133674 for details/goals.
- Remove the `draw_manage_shader` thread.
Deferred compilation is now handled by the gpu::ShaderCompiler
through the batch compilation API.
Batch management is handled by the `GPUPassCache`.
- Simplify `GPUMaterial` status tracking so it just queries the
`GPUPass` status.
- Split the `GPUPass` and the `GPUCodegen` code.
- Replace the (broken) `GPU_material_recalc_flag_get` with the new
`GPU_pass_compilation_timestamp`.
- Add the `GPU_pass_cache_wait_for_all` and
`GPU_shader_batch_wait_for_all`, and remove the busy waits from
EEVEE.
- Remove many unused functions, properties, includes...
Pull Request: https://projects.blender.org/blender/blender/pulls/135637
This caused UB in the tests now that all tests are run
inside the same context.
A shader could be freed while its pointer was left dangling
inside the `Context`. A new shader could then be allocated at the
same address and trigger UB after binding.
This is not the best way to solve this issue, but at least
it prevents the UB.
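The hazard can be illustrated with a minimal sketch. `Context` and `Shader` below are hypothetical stand-ins, not the real GPU module classes, and clearing the pointer in the destructor is only one possible guard, not necessarily what this patch does:

```cpp
#include <cassert>

// Hypothetical stand-ins for the GPU module types. Names are illustrative only.
struct Shader;

struct Context {
  // Non-owning pointer to the currently bound shader.
  Shader *bound_shader = nullptr;
};

struct Shader {
  Context &ctx;

  explicit Shader(Context &context) : ctx(context) {}

  ~Shader()
  {
    // Clear the cached pointer so it cannot dangle once this shader is freed.
    if (ctx.bound_shader == this) {
      ctx.bound_shader = nullptr;
    }
  }

  void bind()
  {
    ctx.bound_shader = this;
  }
};

// Returns true if the context still references a shader after it was destroyed.
inline bool context_dangles_after_free()
{
  Context ctx;
  {
    Shader shader(ctx);
    shader.bind();
    assert(ctx.bound_shader == &shader);
  }
  return ctx.bound_shader != nullptr;
}
```

Without the check in the destructor, a later shader allocated at the same address would compare equal to the stale pointer and be treated as already bound.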
Pull Request: https://projects.blender.org/blender/blender/pulls/139109
This allows multiple threads to request different specializations without
locking usage of all specialized shader programs while a new specialization
is being compiled.
The specialization constants are bundled in a structure that is being
passed to the `Shader::bind()` method. The structure is owned by the
calling thread and only used by the `Shader::bind()`.
Only querying for the specialized shader (Map lookup) is locking the shader
usage.
Variant compilation now also takes a lock, which ensures that
multiple threads trying to compile the same variant never end up
in a race condition.
Note that this removes the `is_dirty` optimization. This can be added
back if this becomes a bottleneck in the future. Otherwise, the
performance impact is not noticeable.
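A simplified model of this locking scheme, as a sketch only. All names (`SpecializationConstants`, `Variant`, `compile_count`) are hypothetical and the "compilation" is a counter increment. The map lookup holds a short shader-wide lock, while compilation holds a per-variant lock so other variants stay usable:

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical, simplified model. None of these names are Blender's real API.
struct SpecializationConstants {
  // Owned by the calling thread and only read during `Shader::bind()`.
  std::vector<int> values;
};

class Shader {
  struct Variant {
    std::mutex mutex;
    bool compiled = false;
    int program = 0;
  };

  std::mutex map_mutex_;
  std::map<std::vector<int>, std::unique_ptr<Variant>> variants_;
  std::atomic<int> compile_count_{0};

 public:
  int bind(const SpecializationConstants &constants)
  {
    Variant *variant = nullptr;
    {
      // Only the map lookup (and slot creation) locks the whole shader.
      std::lock_guard<std::mutex> lock(map_mutex_);
      std::unique_ptr<Variant> &slot = variants_[constants.values];
      if (!slot) {
        slot = std::make_unique<Variant>();
      }
      variant = slot.get();
    }
    // Per-variant lock: threads asking for the same variant wait here,
    // threads using other variants are not blocked.
    std::lock_guard<std::mutex> lock(variant->mutex);
    if (!variant->compiled) {
      // Stand-in for the real compilation step.
      variant->program = ++compile_count_;
      variant->compiled = true;
    }
    assert(variant->program != 0);
    return variant->program;
  }

  int compile_count() const
  {
    return compile_count_.load();
  }
};
```

Binding the same constants twice returns the same program and compiles only once. Binding different constants produces a separate variant.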
Pull Request: https://projects.blender.org/blender/blender/pulls/136991
This adds support for the extension and always
sets the clip state value to 0..1 to align with Vulkan
and Metal. Moreover, this is needed for the Reverse Z
implementation.
Note that this is an OpenGL 4.5 feature and is not
required to start Blender, so there must still be
a fallback path for now.
Rel #138898
Pull Request: https://projects.blender.org/blender/blender/pulls/138941
Replace `DRW_Attributes` with a VectorSet of std::string. The max number of
attributes is still the same. The inline buffer size is 4, and std::string's inline
buffer is smaller than the previous char array size of 64, but it seems
reasonable to save those optimizations for shorter attribute names and
fewer attributes. In return we significantly decrease the size of the batch
caches, simplify the code, and remove the attribute name length limit.
I observed roughly an 8% performance increase in the 30k cube objects file, a change from
12 to 13 FPS. I'm guessing this is mostly because `VectorSet<std::string>` is
smaller than `DRW_Attributes`.
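To illustrate the data structure change, here is a rough stand-in for an insertion-ordered set of attribute names with a maximum count. It is an assumption-laden sketch: the real `VectorSet` is hash based, and `AttributeNameSet` and `max_size` are hypothetical names:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for an insertion-ordered set of attribute names. This sketch uses a
// linear search for brevity. It is not how `VectorSet` is implemented.
class AttributeNameSet {
  std::vector<std::string> names_;
  std::size_t max_size_;

 public:
  explicit AttributeNameSet(const std::size_t max_size) : max_size_(max_size) {}

  // Returns true if the name is part of the set after the call.
  bool add(const std::string &name)
  {
    if (std::find(names_.begin(), names_.end(), name) != names_.end()) {
      return true;
    }
    if (names_.size() >= max_size_) {
      return false;
    }
    names_.push_back(name);
    return true;
  }

  std::size_t size() const
  {
    return names_.size();
  }

  const std::string &operator[](const std::size_t index) const
  {
    assert(index < names_.size());
    return names_[index];
  }
};
```

Because names are stored as `std::string`, a name longer than the old 64 character array is kept in full, while the number of requests stays bounded.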
Pull Request: https://projects.blender.org/blender/blender/pulls/138946
This avoids stalling the viewport when a preview job is running.
The stall happened because both were fighting for the same GPU context.
This doesn't remove the blocking, but allows removing it using #136991.
Pull Request: https://projects.blender.org/blender/blender/pulls/138882
This commit finishes removing the uses of the integer to float
vertex buffer fetch mode. Previous commits noted below already started
that process. The last usage was geometry attributes. Now integers are
converted to floats as part of the existing upload process.
The change makes the Vulkan vertex buffer type conversion unused, so
it's removed. That's nice because Vulkan vertex buffers go from 1040 to
568 bytes in size and have significantly less overhead on creation.
Related:
- 153abc372e
- 1e1ac2bb9b
- 617858e453
Pull Request: https://projects.blender.org/blender/blender/pulls/138873
The conversion from int to float is not supported natively
so it ends up happening beforehand on the CPU or as a
step before the vertex buffer can be used. It's better to just
upload floats in the first place.
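A minimal sketch of doing the conversion during the fill step, so the buffer can be declared with a plain float format. `convert_to_float` is a hypothetical helper, not a function from this patch:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Convert integer attribute values to floats on the CPU, once, while filling
// the data that will be uploaded.
inline std::vector<float> convert_to_float(const std::vector<std::int32_t> &src)
{
  std::vector<float> dst;
  dst.reserve(src.size());
  for (const std::int32_t value : src) {
    dst.push_back(static_cast<float>(value));
  }
  assert(dst.size() == src.size());
  return dst;
}
```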
Related to:
- 1e1ac2bb9b
- 617858e453
Pull Request: https://projects.blender.org/blender/blender/pulls/138855
Caused by 617858e453.
These formats should use types aligned to 4 bytes. That's generally
required by modern GPUs. Uploading with these types also avoids
automatic conversion by the Vulkan backend which is something
we're hoping to remove fully.
In the end this PR removes a bunch of code related to supporting
the older single-byte formats.
Pull Request: https://projects.blender.org/blender/blender/pulls/138836
This unifies vertex and texture data formats
into a single base enum class.
`TextureFormat` and `VertexFormat` then mask
the invalid format for their respective usage.
Having a base enum makes casting between
`TextureFormat` and `VertexFormat` possible
(needed for Buffer Textures).
It also makes it easier to write and read data
to buffers/textures as each format will have an
associated host type.
These enums are generated from macro expansion.
This allows centralizing all information about
the formats in one place and avoids duplicating
the list of enums for each backend.
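The macro expansion approach can be sketched as follows. The format names, the columns (name, host type, component count) and the `EXAMPLE_*` macros are assumptions made for illustration. The real lists may be organized differently:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical format lists: (name, host type, component count).
#define EXAMPLE_VERTEX_FORMATS(impl) \
  impl(SFLOAT_32, float, 1) \
  impl(SFLOAT_32_32, float, 2)

#define EXAMPLE_OTHER_FORMATS(impl) \
  impl(UINT_8_8_8_8, std::uint8_t, 4)

#define EXAMPLE_ALL_FORMATS(impl) \
  EXAMPLE_VERTEX_FORMATS(impl) \
  EXAMPLE_OTHER_FORMATS(impl)

// Base enum holding every format.
enum class DataFormat : std::uint8_t {
  Invalid = 0,
#define DECLARE(name, host_type, comp_len) name,
  EXAMPLE_ALL_FORMATS(DECLARE)
#undef DECLARE
};

// Usage-specific enum: only the valid subset, sharing values with the base.
enum class VertexFormat : std::uint8_t {
  Invalid = 0,
#define DECLARE(name, host_type, comp_len) \
  name = static_cast<std::uint8_t>(DataFormat::name),
  EXAMPLE_VERTEX_FORMATS(DECLARE)
#undef DECLARE
};

// Shared values make the cast between the enums trivial.
constexpr DataFormat to_data_format(const VertexFormat format)
{
  return static_cast<DataFormat>(format);
}

// The same list also generates per-format information, here the element size.
constexpr std::size_t format_size(const DataFormat format)
{
  switch (format) {
#define CASE(name, host_type, comp_len) \
  case DataFormat::name: \
    return sizeof(host_type) * comp_len;
    EXAMPLE_ALL_FORMATS(CASE)
#undef CASE
    case DataFormat::Invalid:
      break;
  }
  return 0;
}
```

Adding a format then means adding one line to a list. The enums and the size table cannot drift apart.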
This only creates the new enum. Porting older enums will
be done in other PRs.
Normalized integer CPU formats are missing and waiting for #130640.
Rel #130632
Pull Request: https://projects.blender.org/blender/blender/pulls/138069
Similar to 93be6baa9c.
It doesn't make sense to store the type as part of the request
since we upload any generic attribute, and the vertex buffer
type is just chosen depending on the attribute's type in the
geometry anyway.
The code in `mesh_cd_calc_used_gpu_layers` is unfortunately
still way too complicated to remove the custom data layer lookup,
but this gets us one step closer.
Pull Request: https://projects.blender.org/blender/blender/pulls/138791
Similar to b21cb20eeb.
This time the domain is removed. The idea is that the domain doesn't
change anything about how the attribute is stored on the vertex buffers
so it doesn't make sense as part of the request. If we continue that
logic to also remove the data type, we can avoid searching through the
geometry when creating the requests, instead handling invalid requests
when creating the buffers.
The complexity of the change comes from the fact that the request's
domain was used to determine whether the Curves drawing code needed to
interpolate the attribute to the evaluated points. This is now stored
separately in the curves cache. The change in the sculpt code is also
non-trivial since we delay more of the logic until after we have
looked up the attribute from the geometry.
Pull Request: https://projects.blender.org/blender/blender/pulls/138619
The goal is to separate the draw attribute request from the
CustomData implementation. For the layer index this was
already started a while ago; it's only used in a couple places
and lookups are mostly name based anyway.
Conceptually the attribute request is just a request that the
extraction system create a buffer for a certain attribute name.
Information about where the attribute is stored doesn't fit.
Pull Request: https://projects.blender.org/blender/blender/pulls/138570
When a texture wrapper wraps a second texture it doesn't free its
local resources based on the previous texture. This resulted in texture
views still being used where the backing memory could already be reused
by other allocations.
In OpenGL this might be solved inside the driver by not freeing the
backing texture unless all views have been freed. However, our Vulkan
backend doesn't do this, leading to crashes when resizing the viewport
when displaying a workbench volume. In OpenGL this could lead to small
resizing artifacts, although we haven't noticed them. Overlay also wraps
existing textures.
Pull Request: https://projects.blender.org/blender/blender/pulls/138582
`DRW_gpencil_engine_needed` only checks whether grease pencil is
excluded in the viewport or whether a grease pencil ID exists.
This cannot handle cases where grease pencil strokes are generated from
other types of objects in geometry nodes. Now this function is split into
two calls, `gpencil_engine_needed_viewport` and
`gpencil_excluded`, to provide more granular logic for
places that need it.
Pull Request: https://projects.blender.org/blender/blender/pulls/138386
This was caused by the material indices not referring to an existing
material slot. Clamping the value to the maximum material index for
the given object fixes the issue.
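The clamping amounts to the following sketch. `clamp_material_index` and `slot_count` are hypothetical names, and the object is assumed to have at least one slot:

```cpp
#include <algorithm>
#include <cassert>

// Clamp a material index to the range of material slots on the object.
inline int clamp_material_index(const int index, const int slot_count)
{
  assert(slot_count > 0);
  return std::max(0, std::min(index, slot_count - 1));
}
```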
Pull Request: https://projects.blender.org/blender/blender/pulls/138544
This patch adds a new `BLI_mutex.hh` header which adds `blender::Mutex` as alias
for either `tbb::mutex` or `std::mutex` depending on whether TBB is enabled.
Description copied from the patch:
```
/**
* blender::Mutex should be used as the default mutex in Blender. It implements a subset of the API
* of std::mutex but has overall better guaranteed properties. It can be used with RAII helpers
* like std::lock_guard. However, it is not compatible with e.g. std::condition_variable. So one
* still has to use std::mutex for that case.
*
* The mutex provided by TBB has these properties:
* - It's as fast as a spin-lock in the non-contended case, i.e. when no other thread is trying to
* lock the mutex at the same time.
* - In the contended case, it spins a couple of times but then blocks to avoid draining system
* resources by spinning for a long time.
* - It's only 1 byte large, compared to e.g. 40 bytes when using the std::mutex of GCC. This makes
* it more feasible to have many smaller mutexes which can improve scalability of algorithms
* compared to using fewer larger mutexes. Also it just reduces "memory slop" across Blender.
* - It is *not* a fair mutex, i.e. it's not guaranteed that a thread will ever be able to lock the
* mutex when there are always more than one threads that try to lock it. In the majority of
* cases, using a fair mutex just causes extra overhead without any benefit. std::mutex is not
* guaranteed to be fair either.
*/
```
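A sketch of how such an alias can be declared and used with `std::lock_guard`. The `EXAMPLE_WITH_TBB` switch, the `example` namespace and `SharedValues` are hypothetical, and the TBB branch assumes a oneTBB version that provides `tbb::mutex`:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// `EXAMPLE_WITH_TBB` is a hypothetical build switch used only in this sketch.
#ifdef EXAMPLE_WITH_TBB
#  include <tbb/mutex.h>
#endif

namespace example {

#ifdef EXAMPLE_WITH_TBB
using Mutex = tbb::mutex;
#else
using Mutex = std::mutex;
#endif

// Small container guarded by the aliased mutex type.
class SharedValues {
  Mutex mutex_;
  std::vector<int> values_;

 public:
  // Appends a value and returns its index.
  std::size_t append(const int value)
  {
    // The RAII helper works with either alias target.
    std::lock_guard<Mutex> lock(mutex_);
    values_.push_back(value);
    return values_.size() - 1;
  }

  int get(const std::size_t index)
  {
    std::lock_guard<Mutex> lock(mutex_);
    assert(index < values_.size());
    return values_[index];
  }
};

}  // namespace example
```

Code that needs `std::condition_variable` would still declare a `std::mutex` directly, as the description above notes.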
The performance benchmark suggests that the impact is negligible in almost
all cases. The only benchmarks that show interesting behavior are the ones
testing foreach zones in Geometry Nodes. These tests are explicitly testing
overhead, which I still have to reduce over time. So it's not unexpected that
changing the mutex has an impact there. What's interesting is that on macOS the
performance improves a lot while on Linux it gets worse. Since that overhead
should eventually be removed almost entirely, I don't really consider that
blocking.
Links:
* Documentation of different mutex flavors in TBB:
https://www.intel.com/content/www/us/en/docs/onetbb/developer-guide-api-reference/2021-12/mutex-flavors.html
* Older implementation of a similar mutex by me:
https://archive.blender.org/developer/differential/0016/0016711/index.html
* Interesting read regarding how a mutex can be this small:
https://webkit.org/blog/6161/locking-in-webkit/
Pull Request: https://projects.blender.org/blender/blender/pulls/138370
In `grease_pencil_wire_batch_ensure`, the function was using
`points_by_curve` instead of `evaluated_points_by_curve`
when computing the indices.
Now the correct indices are used.
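Why the choice of offsets matters can be shown with a small sketch. `build_line_indices` and the offsets layout are simplified assumptions, not the actual batch code. If the vertex buffer holds evaluated points, indices built from the control point offsets address the wrong vertices and cover too few segments:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offsets array: curve `i` covers the half-open point range
// [offsets[i], offsets[i + 1]).
using Offsets = std::vector<int>;

// Build line-segment index pairs for every curve from the given offsets.
inline std::vector<int> build_line_indices(const Offsets &offsets)
{
  assert(!offsets.empty());
  std::vector<int> indices;
  for (std::size_t curve = 0; curve + 1 < offsets.size(); curve++) {
    for (int point = offsets[curve]; point + 1 < offsets[curve + 1]; point++) {
      indices.push_back(point);
      indices.push_back(point + 1);
    }
  }
  return indices;
}
```

With two curves of 2 control points that each evaluate to 5 points, the control offsets `{0, 2, 4}` give 2 segments, while the evaluated offsets `{0, 5, 10}` give the 8 segments that match the evaluated point buffer.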
Pull Request: https://projects.blender.org/blender/blender/pulls/138489
* Remove `DEG_get_evaluated_object` in favor of `DEG_get_evaluated`.
* Remove `DEG_is_original_object` in favor of `DEG_is_original`.
* Remove `DEG_is_evaluated_object` in favor of `DEG_is_evaluated`.
Pull Request: https://projects.blender.org/blender/blender/pulls/138317
These are not the same. Debug scopes are for scoped
capture. These should be static to avoid flooding
debug tools with scopes. Since this code is
unlikely to be debugged, it is better to use
debug groups.
This removes the minimum thickness clamping in the shader.
The reason why this was clamped in the first place was to reduce
aliasing artifacts. With the new super sampling method for
rendering, this should no longer be an issue.
Note: This can break visual compatibility, but the previous radii
were arguably "wrong". This essentially fixes this and renders
the strokes with the actual radii from the attribute.
Previous files that might have relied on the clamping will have
to be updated.
Pull Request: https://projects.blender.org/blender/blender/pulls/138119
This implements the design detailed in #135935.
A new per object property called `Shadow Terminator Normal Offset` is
introduced to shift the shadowed position along the shading normal.
The amount of shift is defined in object space on the object datablock.
This amount is modulated by the facing ratio to the light. Faces
already facing the light will get no offset. This avoids most light
leaking artifacts.
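One way to picture the modulation, purely as an illustrative sketch: the linear `1 - facing` falloff, the function name and the `float3` alias are assumptions, and the actual shader code may use a different curve and space:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

using float3 = std::array<float, 3>;

inline float dot(const float3 &a, const float3 &b)
{
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Shift the shadowed position `P` along the shading normal `N`.
// `N` and `L` (direction to the light) are assumed to be normalized.
// Assumed falloff: full offset at grazing angles, none when facing the light.
inline float3 offset_shadow_position(const float3 &P,
                                     const float3 &N,
                                     const float3 &L,
                                     const float normal_offset)
{
  assert(std::fabs(dot(N, N) - 1.0f) < 1e-4f);
  const float facing = std::min(1.0f, std::max(0.0f, dot(N, L)));
  const float amount = normal_offset * (1.0f - facing);
  return {P[0] + N[0] * amount, P[1] + N[1] * amount, P[2] + N[2] * amount};
}
```

A face whose normal points straight at the light keeps its position, and a face at a grazing angle is shifted by the full offset.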
In case of multiple shading normals, the normal used for the shift
is arbitrary. Note that this is the same behavior as for other biases.
The magnitude of the bias is controlled by `Shadow Terminator Normal Offset`.
The number of faces affected by the bias is controlled using
`Shadow Terminator Geometry Offset`, just like in Cycles.
Tweaking the `Shadow Terminator Geometry Offset` helps avoid too much
shadow distortion on surfaces with bump mapping.
Cycles properties are copied from the Cycles object datablock to the
Blender datablock. This breaks the Python API for Cycles.
The defaults are set to no bias because:
- There is no good default. The best value depends on the geometry.
- The best value might depend on real-time displacement.
- Any bias will introduce light leaking on surfaces that do not need it.
- There is an additional cost of enabling it, which is proportional
to the amount of pixels on screen using it.
Pull Request: https://projects.blender.org/blender/blender/pulls/136935
This was caused by missing depth of objects.
The old implementation was relying on the external engine
to provide the correct depth of the objects.
This patch does exactly this.
The downside is that, if overlays are present, the prepass
will also be drawn by the overlay engine.
Pull Request: https://projects.blender.org/blender/blender/pulls/138004
Switching to lattice edit mode when it has an armature modifier can
crash if the armature modifier's `show_in_editmode` is turned on. Now
prevent null `editlatt` access.
Pull Request: https://projects.blender.org/blender/blender/pulls/137701