Caused by 617858e453.
These formats should use types aligned to 4 bytes. That's generally
required by modern GPUs. Uploading with these types also avoids
automatic conversion by the Vulkan backend which is something
we're hoping to remove fully.
In the end this PR removes a bunch of code related to supporting
the older single-byte formats.
Pull Request: https://projects.blender.org/blender/blender/pulls/138836
This unifies vertex and texture data formats
into a single base enum class.
`TextureFormat` and `VertexFormat` then mask
the invalid format for their respective usage.
Having a base enum allows casting between
`TextureFormat` and `VertexFormat` possible
(needed for Buffer Textures).
It also makes it easier to write and read data
to buffers/textures as each format will have an
associated host type.
These enum is generated from MACRO expansion.
This allow to centralize all information about
the formats in one place. This avoid duplicating
the list of enums for each backend.
This only creates the new enum. Porting older enums will
be done in other PRs.
Normalized integer CPU format are missing and waiting for #130640
Rel #130632
Pull Request: https://projects.blender.org/blender/blender/pulls/138069
Many of the upfront specialized variants
were not needed/ They are only used if
some scene render setting changes, which
we can detect upfront.
This is noticeable on OpenGL which doesn't support
specialization constant and has to do full shader
recompilation for each variants.
Pull Request: https://projects.blender.org/blender/blender/pulls/138589
No functional changes intended.
This just replaces all calls to `PBONE_VISIBLE` and `EBONE_VISIBLE`
with appropriate functions.
In the case of editbones it is just the function that the macro already contained.
For pose bones, a new function was added that mirrors what the macro had.
Using a function will make it easier to change how selection is queried in the future.
part of #138482
Pull Request: https://projects.blender.org/blender/blender/pulls/138819
Similar to 93be6baa9c.
It doesn't make sense to store the type as part of the request
since we upload any generic attribute, and the vertex buffer
type is just chosen depending on the attribute's type in the
geometry anyway.
The code in `mesh_cd_calc_used_gpu_layers` is unfortunately
still way too complicated to remove the custom data layer lookup,
but this gets us one step closer.
Pull Request: https://projects.blender.org/blender/blender/pulls/138791
Similar to b21cb20eeb.
This time the domain is removed. The idea is that the domain doesn't
change anything about how the attribute is stored on the vertex buffers
so it doesn't make sense as part of the request. If we continue that
logic to also remove the data type, we can avoid searching through the
geometry when creating the requests, instead handling invalid requests
when creating the buffers.
The complexity of the change comes from the fact that the request's
domain was used to determine whether the Curves drawing code needed to
interpolate the attribute to the evaluated points. This is now stored
separately in the curves cache. The change in the sculpt code is also
non-trivial since we delay more of the logic until after we have
looked up the attribute from the geometry.
Pull Request: https://projects.blender.org/blender/blender/pulls/138619
Fix the recently implemented ShaderCompiler::batch_cancel.
Expose it with GPU_shader_batch_cancel and
GPU_shader_specialization_batch_cancel.
Use them in the EEVEE ShaderModule destructor, to prevent blocking on
destruction when there are in-flight compilations.
Pull Request: https://projects.blender.org/blender/blender/pulls/138774
Previously if a scene has Grease Pencil objects that are drawn in-front
and objects that are not, Grease Pencil drawing sorting call would
concat those two drawing lists into one. Since "separate pass" rendering
will call Grease Pencil drawing again, this list was concatenated twice,
resulting in a looped list which goes on indefinitely when iterated.
Now Grease Pencil will only sort once per drawing instance.
Thanks to @antonioya and @mmendio for testing and reporting!
Pull Request: https://projects.blender.org/blender/blender/pulls/138528
This adds a new `DNA_sdna_type_ids.hh` header:
```cpp
namespace blender::dna {
/**
* Each DNA struct has an integer identifier which is unique within a specific
* Blender build, but not necessarily across different builds. The identifier
* can be used to index into `SDNA.structs`.
*/
template<typename T> int sdna_struct_id_get();
/**
* The maximum identifier that will be returned by #sdna_struct_id_get in this
* Blender build.
*/
int sdna_struct_id_get_max();
} // namespace blender::dna
```
The `sdna_struct_id_get` function is used as replacement of
`SDNA_TYPE_FROM_STRUCT` in all places except the DNA defaults system. The
defaults system is C code and therefore can't use the template. There is ongoing
work to replace the defaults system as well though: #134531.
Using this templated function has some benefits over the old approach:
* No need to rely on macros.
* Can use type inferencing in functions like `BLO_write_struct` which avoids
redundancy on the call site. E.g. `BLO_write_struct(writer, ActionStrip,
strip);` can become `BLO_write_struct(writer, strip);` which could even become
`writer.write_struct(strip);`. None of that is implemented as part of this
patch though.
* No need to include the generated `dna_type_offsets.h` file which contains a
huge enum.
Implementation wise, this is done using explicit template instantiations in a
new file generated by `makesdna.cc`: `dna_struct_ids.cc`. The generated file
looks like so:
```cpp
namespace blender::dna {
template<typename T> int sdna_struct_id_get();
int sdna_struct_id_get_max();
int sdna_struct_id_get_max() { return 951; }
}
struct IDPropertyUIData;
template<> int blender:🧬:sdna_struct_id_get<IDPropertyUIData>() { return 1; }
struct IDPropertyUIDataEnumItem;
template<> int blender:🧬:sdna_struct_id_get<IDPropertyUIDataEnumItem>() { return 2; }
```
I tried using static variables instead of separate functions, but I didn't
manage to link it properly. Not quite sure yet if that's an actual limitation or
if I was just missing something.
Pull Request: https://projects.blender.org/blender/blender/pulls/138706
While in 3D View Perspective Mode you are able to disable the Floor
and/or the XYZ axis lines. But in Orthographic views turning off the
axes lines has no effect, but turning off the grid also turns off the
lines. This PR just makes these work independently, as you'd expect
from the options in Viewport Overlays.
Pull Request: https://projects.blender.org/blender/blender/pulls/138694
Part of #136993.
Share as much of the ShaderCompiler implementations as possible.
Remove the ShaderCompiler/ShaderCompilerGeneric split and make most of
its functions non virtual.
Move the `get_compiler` function from `Context` to `GPUBackend` and
creation/deletion to `GPUBackend::init/delete_resources`.
Add a `batch_cancel` function to `ShaderCompiler` (needed for the
GPUPass refactor).
As a nice extra, the multithreaded OpenGL compilation has become faster
too.
The barbershop materials + EEVEE static shaders have gone from 27s to
22s.
I have not observed any performance difference on Vulkan or Metal.
Pull Request: https://projects.blender.org/blender/blender/pulls/136676
The goal is to separate the draw attribute request from the
CustomData implementation. For the layer index this was
already started a while ago; it's only used in a couple places
and lookups are mostly name based anyway.
Conceptually the attribute request is just a request that the
extraction system create a buffer for a certain attribute name.
Information about where the attribute is stored doesn't fit.
Pull Request: https://projects.blender.org/blender/blender/pulls/138570
Image prepass pass didn't initialize the retopology offset, resulting in
clip test failures on Vulkan. Users noticed that the cycles rendering
didn't update during rendering, only when finihsed.
Pull Request: https://projects.blender.org/blender/blender/pulls/138596
When a texture wrapper wraps a second texture it doesn't free its
local resources based on the previous texture. This resulted in texture
views still being used where the backed memory could already be reused
by other allocations.
In OpenGL this might be solved inside the driver by not freeing the
backed texture unless all views have been freed. However our Vulkan
backend doesn't do this, leading to crashes when resizing the viewport
when displaying a workbench volume. In OpenGL this could lead to small
resizing artifacts, although we haven't noticed them. Overlay also wraps
existing textures.
Pull Request: https://projects.blender.org/blender/blender/pulls/138582
Like other runtime structs, it doesn't make sense to write this
data to files. And moving it out of DNA to an allocated C++ struct
means we can use other C++ features in it, like the new Mutex
type which I switched to in this commit.
Pull Request: https://projects.blender.org/blender/blender/pulls/138551
`DRW_gpencil_engine_needed` only checks whether grease pencil is
excluded in the viewport or whether there's grease pencil ID exists,
this can not handle cases where grease pencil strokes are generated from
other types of object in geometry nodes. Now this function is split into
two `gpencil_engine_needed_viewport` and
`gpencil_excluded` calls to provide more granulated logic for
places that need them.
Pull Request: https://projects.blender.org/blender/blender/pulls/138386
This was caused by the material indices not referring to an existing
material slot. Clamping the value to the maximum material index for
the given object fixes the issue.
Pull Request: https://projects.blender.org/blender/blender/pulls/138544
This is caused by the pointcloud attribute header not being added
to the volume shader.
The fix is to make volume shaders not require volume attribute
for geometry types that do not support it.
Pull Request: https://projects.blender.org/blender/blender/pulls/138212
Allows basic support for using `namespace X {}` and `X::symbol`
syntax.
Benefit:
- More sharing possible with host C++ code.
- Isolation of symbols when including shader files as C++.
Requirements:
- Nesting must be done using `namespace A::B{}` rather than
`namespace A{ namespace B {}}`, which is unsupported.
- No support for `using namespace`.
- Support of `using X` and `using X = Y` inside of function scope.
- Support of `using X` and `using X = Y` inside of namespace scope.
However, this is only to bring symbols from the same namespace
declared in another block (potentially inside another file).
- Only support namespace elision for symbols defined and used
inside of the same namespace scope.
Note that this is currently limited to blender GLSL files and
not for the shared headers. This is because we need to port a lot
of code to use namespaces before allowing this.
### Follow Up:
Nesting like `namespace A{ namespace B {}}` shouldn't be hard to
support and could be added if needed.
Rel #137446
Pull Request: https://projects.blender.org/blender/blender/pulls/137445
This patch adds a new `BLI_mutex.hh` header which adds `blender::Mutex` as alias
for either `tbb::mutex` or `std::mutex` depending on whether TBB is enabled.
Description copied from the patch:
```
/**
* blender::Mutex should be used as the default mutex in Blender. It implements a subset of the API
* of std::mutex but has overall better guaranteed properties. It can be used with RAII helpers
* like std::lock_guard. However, it is not compatible with e.g. std::condition_variable. So one
* still has to use std::mutex for that case.
*
* The mutex provided by TBB has these properties:
* - It's as fast as a spin-lock in the non-contended case, i.e. when no other thread is trying to
* lock the mutex at the same time.
* - In the contended case, it spins a couple of times but then blocks to avoid draining system
* resources by spinning for a long time.
* - It's only 1 byte large, compared to e.g. 40 bytes when using the std::mutex of GCC. This makes
* it more feasible to have many smaller mutexes which can improve scalability of algorithms
* compared to using fewer larger mutexes. Also it just reduces "memory slop" across Blender.
* - It is *not* a fair mutex, i.e. it's not guaranteed that a thread will ever be able to lock the
* mutex when there are always more than one threads that try to lock it. In the majority of
* cases, using a fair mutex just causes extra overhead without any benefit. std::mutex is not
* guaranteed to be fair either.
*/
```
The performance benchmark suggests that the impact is negilible in almost
all cases. The only benchmarks that show interesting behavior are the once
testing foreach zones in Geometry Nodes. These tests are explicitly testing
overhead, which I still have to reduce over time. So it's not unexpected that
changing the mutex has an impact there. What's interesting is that on macos the
performance improves a lot while on linux it gets worse. Since that overhead
should eventually be removed almost entirely, I don't really consider that
blocking.
Links:
* Documentation of different mutex flavors in TBB:
https://www.intel.com/content/www/us/en/docs/onetbb/developer-guide-api-reference/2021-12/mutex-flavors.html
* Older implementation of a similar mutex by me:
https://archive.blender.org/developer/differential/0016/0016711/index.html
* Interesting read regarding how a mutex can be this small:
https://webkit.org/blog/6161/locking-in-webkit/
Pull Request: https://projects.blender.org/blender/blender/pulls/138370
The GLSL processor appears to dislike a member named "select" when a function named "select" exists.
The member has been renamed to object_select to avoid collisions.
Pull Request: https://projects.blender.org/blender/blender/pulls/138466
Avoid initializing passes (and requesting their shaders) unless they're
actually needed.
Reduces the number of compiled Overlay shaders at startup
from 70 to 22.
Improves startup times.
Pull Request: https://projects.blender.org/blender/blender/pulls/138457
In the `grease_pencil_wire_batch_ensure` when computing the indices,
the function was using `points_by_curve` instead of
`evaluated_points_by_curve`.
Now the correct indices are used.
Pull Request: https://projects.blender.org/blender/blender/pulls/138489
Implementation of #137341
This adds support for using references to any variable in a local scope
inside the shader codebase.
Example:
```cpp
int a = 0;
int &b = a;
b++; /* a == 1 */
```
Using `auto` is supported for reference definition as the type is not
preserved by the copy paste procedure. Type checking is done by the
C++ shader compilation or after the copy paste procedure during shader
compilation. `auto` is still unsupported for other variable declarations.
Reference to opaque types (`image`, `sampler`) are supported since
they are never really assigned to a temp variable.
This implements all safety feature related to the implementation being
copy pasting the definition string. That is:
- No `--`, `++` operators.
- No function calls.
- Array subscript index needs to be int constants or constant variable.
The copy pasting does not replace member access:
`auto &a = b; a.a = c;` becomes `b.a = c;`
The copy pasting does not replace function calls:
`auto &a = b; a = a();` becomes `b = a();`
While limited, this already allows for nicer syntax (aliasing) for
accessing SSBOs and the potential overhead of a copy semantic:
```cpp
ViewMatrices matrices = drw_view_buf[0];
matrices.viewmat = float4x4(1);
drw_view_buf[0] = matrices;
```
Can now be written as;
```cpp
ViewMatrices &matrices = drw_view_buf[0];
matrices.viewmat = float4x4(1);
```
Which expands to;
```cpp
drw_view_buf[0].viewmat = float4x4(1);
```
Note that the reference semantic is not carried through function call
because arguments are transformed to `inout` in GLSL. `inout` has
copy semantic but it is often implemented as reference by some
implementations.
Another important note is that this copy-pasting doesn't check if a
symbol is a variable. It can match a typename. But given that our
typenames have different capitalizations style this is unlikely to be
an issue. If that issue arise, we can add a check for it.
Rel #137446
Pull Request: https://projects.blender.org/blender/blender/pulls/138412
Allows basic usage of templated functions.
There is no support for templated struct.
Benefit:
- More readable than macros in shader sources.
- Compatible with C++ tools.
- More sharing possible with host C++ code.
Requirements/Limitations:
- No default arguments to template parameters.
- Must use explicit instantiation for all variant needed.
- Explicit instantiation needs to **not** use argument deduction.
- Calls to template needs to have all template argument explicit
or all implicit.
- Template overload is not supported (redefining the same template
with different template argument or function argument types).
Currently implemented as Macros inside the build-time pre-pocessor,
but that could change to copy-paste to allow better error reporting.
However, the Macros keep the shader code reduced in the final binary
and allow different file to declare different instantiation.
The implementation is done by declaring overloads for each explicit
instantiation.
If a template has arguments not present in function
arguments, then all arguments **values** are appended to the
function name. The explicit template callsite is then modified to use
`TEMPLATE_GLUE` which will call the correct function. This is
why template argument deduction is not supported in this case.
Rel #137446
Pull Request: https://projects.blender.org/blender/blender/pulls/137441
In vertex & face modes the selection remained visible when the UV
opacity was zero (or near zero).
In those cases applying the opacity to the edge selection makes sense,
however in edge-select mode this caused UV's to become invisible.
- Use "select_{vert/edge/face}_" convention for UV's
matching edit-mesh naming.
- Rename `show_face_` to `show_face_overlay_` for 3D & UV views.
- Rename `show_face_dots_` to `select_face_dots_` since they're
they display selection and aren't used in object mode.
The same value was used to show selected UV faces & object mode face
overlays. This made logic a little more difficult to follow as the
same value was set & used differently based on the context.
Previously sync-select in edge-select mode behaved in much the same
way as vertex selection, since a selected edge could cause a vertex
on an a disconnected UV island to be selected.
Now single vertices are no longer considered selected when the
Sticky-Mode is set to "Location" (the default).
Notes on changes when sync-select is enabled:
- The main change from a user perspective is edge & face select modes
show the selected edges.
- This resolves a problem in edge & face selection modes where it wasn't
possible to differentiate between a selected edge and two selected
vertices on either side of an unselected edge.