This opt-in functionnality enabled developper keep track of unused
resources present in the `GPUShaderCreateInfo` descriptors of their
shaders.
The output is pretty noisy at the moment so we do not enforce its usage.
This uses a light parser / string modification pass to convert
C++ enum declaration syntax to GLSL compatible one.
GLSL having no support for enums, we are forced to convert the
enum values to a series of constant uints.
The parser (not really one by the way), being stupidly simple,
will not change anything to the values and thus make some C++
syntax (like omitting the values) not work.
The string replacement happens on all GLSL files on startup.
I did not measure significant changes in blender startup speed.
There is plans to do all of this at compile time.
We limit the scope of the search to `.h` and `.hh` files to prevent
confusing syntax in `.glsl` files.
There is basic error reporting with file, line and char logging
for easy debuggabiliy.
The requirements to use this enum sharing system are already listed in
`gpu_shader_shared_utils.h` and repeated on top of the preprocessor
function.
The VSE and node editor only uses an overlay buffer to draw to the screen. The
GPUViewport assumes that platforms clears all textures during creation, but
they do not on selected platforms. What would lead to drawing from
uncleared memory.
This patch fixes this by clearing all viewport textures during creation.
Also adds a few things to GPUShader for easily create shaders.
Heavy usage of macros to compose the createInfo and avoid
duplications and copy paste bugs.
This makes the link between the shader request functions
(in workbench_shader.cc) and the actual createInfo a bit
obscure since the names are composed and not searchable.
Reviewed By: jbakker
Differential Revision: https://developer.blender.org/D13910
Use more efficient logic for detecting when gizmos are under the cursor.
Even though this isn't a bottleneck, it runs on cursor motion in the
3D viewport, so avoiding any lag here is beneficial.
The common case for cursor motion without any gizmos was always
drawing two passes (one small, then again if nothing was found).
Now a single draw call at the larger size is used.
In isolation this gives around 1.2x-1.4x speedup.
When there are multiple gizmos a depth-buffer picking is used
(similar to object / bone selection) which is more involved but
still only performs 2x draw calls since the result is cached for reuse.
See note in gizmo_find_intersected_3d for a more detailed explanation.
Also restore the depth values in the selection result as they're
needed for gizmos to use selection bias.
Broken since support for GL_SELECT was removed.
When calling GPU_select_cache_begin, checking the selection mode used
the last used selection mode, not the one about to be used.
Using border select, then picking would not use the selection cache.
This wasn't noticeable by users as failing to use cache just completes
the selection without it (drawing the depth buffer unnecessarily).
In order to use a workaround builtin uniform, we need to count it
just like other uniforms and give it some space in the name buffer.
This also fixes extensions being added after the uniform declaration.
All `#extension` directives are now part of the gl backend.
With (center) position, radius and random value outputs.
Eevee does not yet support rendering point clouds, but an untested
implementation of this node was added for when it does.
Ref T92573
This makes optionnal the use of a different interface for the geometry
shader stage output. When the vertex and geometry interface instance name
matches, a `_in` and `_out` suffix is added to the end of the instance name.
This makes it easier to have optional geometry shader stages.
# Conflicts:
# source/blender/gpu/intern/gpu_shader_create_info.hh
This patch migrates the draw manager hair refine compute shader to use
GPUShaderCreateInfo.
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D13915
Currently there are many function declarations in `BKE_node.h` that
don't actually have implementations in blenkernel. This commit moves
the declarations to `NOD_composite.h`, `NOD_texture.h`, and
`NOD_shader.h` instead. This helps to clarify the purpose of the
different modules.
Differential Revision: https://developer.blender.org/D13869
Fixes T93680
For current drivers of Intel HD Graphics 4400 and 4600, various Program Introspection functions appear broken and return incorrect values, causing crashes in the current handling of SSBOs. Disable use of this feature on those devices. Add checks to features that use SSBOs (Hair and Subdivision Modifier).
Reviewed By: fclem, jbakker
Maniphest Tasks: T93680
Differential Revision: https://developer.blender.org/D13806
Some drivers/glsl compilers will not warn about multiple resources using
the same binding, creating silent errors.
This patch checks for this case and outputs a descriptive error message if
a particular createInfo merge error is founds.
Other validation can be added later.
This shader needs to use the same interface as the OCIO shader. On Linux
and Windows this seems to be the case. On MacOS however there is a
mismatch that makes the overlay texture to be completely black when
switching to workbench in the Shader workspace.
This is just a temporarily work-around as this should be solved. Due to
the poor GPU debugging facilities on Mac we haven't been able to
pin-point the root cause.
This adds vertex creasing support for OpenSubDiv for modeling, rendering,
Alembic and USD I/O.
For modeling, vertex creasing follows the edge creasing implementation with an
operator accessible through the Vertex menu in Edit Mode, and some parameter in
the properties panel. The option in the Subsurf and Multires to use edge
creasing also affects vertex creasing.
The vertex crease data is stored as a CustomData layer, unlike edge creases
which for now are stored in `MEdge`, but will in the future also be moved to
a `CustomData` layer. See comments for details on the difference in behavior
for the `CD_CREASE` layer between egdes and vertices.
For Cycles this adds sockets on the Mesh node to hold data about which vertices
are creased (one socket for the indices, one for the weigths).
Viewport rendering of vertex creasing reuses the same color scheme as for edges
and creased vertices are drawn bigger than uncreased vertices.
For Alembic and USD, vertex crease support follows the edge crease
implementation, they are always read, but only exported if a `Subsurf` modifier
is present on the Mesh.
Reviewed By: brecht, fclem, sergey, sybren, campbellbarton
Differential Revision: https://developer.blender.org/D10145
This merge the description into one struct only that can be more easily
copied during `finalize()`.
The in and out layout parameters are better named and extended with the
invocation count (with fallback support)
Route cause was data alignment mismatch between GPU and CPU. This
mismatch would not allow us to bind the UBO where data wasn't available
on the GPU.
Fixed by using float4 in stead of float2. This could eventually be
packed, but that would lead to less readable code.
Cause was incorrect logic when generating the resource layout. It the
explicit_location_support setting was ignored and the binding were
generated for image, uniform buffers and storage buffers.
Cause of the issue isn't that clear, but the NVIDIA GLSL compiler
complained that it couldn't find an overloaded function when the second
parameter is an interger. This change fixes it by using a float.
This patch converts GPU_SHADER_2D_IMAGE_MULTI_RECT_COLOR shader to use
the GPUShaderCreateInfo pattern. It can be used as a reference when
converting other shaders.
In this special case the flat uniform vector cannot be used anymore as it
doesn't fit as push constants. To solve this a uniform buffer is used.
This is a first part of the Shader Create Info system could be.
A shader create info provides a way to define shader structure, resources
and interfaces. This makes for a quick way to provide backend agnostic
binding informations while also making shader variations easy to declare.
- Clear source input (only one file). Cleans up the GPU api since we can create a
shader from one descriptor
- Resources and interfaces are generated by the backend (much simpler than parsing).
- Bindings are explicit from position in the array.
- GPUShaderInterface becomes a trivial translation of enums and string copy.
- No external dependency to third party lib.
- Cleaner code, less fragmentation of resources in several libs.
- Easy to modify / extend at runtime.
- no parser involve, very easy to code.
- Does not hold any data, can be static and kept on disc.
- Could hold precompiled bytecode for static shaders.
This also includes a new global dependency system.
GLSL shaders can include other sources by using #pragma BLENDER_REQUIRE(...).
This patch already migrated several builtin shaders. Other shaders should be migrated
one at a time, and could be done inside master.
There is a new compile directive `WITH_GPU_SHADER_BUILDER` this is an optional
directive for linting shaders to increase turn around time.
What is remaining:
- pyGPU API {T94975}
- Migration of other shaders. This could be a community effort.
Reviewed By: jbakker
Maniphest Tasks: T94975
Differential Revision: https://developer.blender.org/D13360
As described in T91186, this commit moves mesh vertex normals into a
contiguous array of float vectors in a custom data layer, how face
normals are currently stored.
The main interface is documented in `BKE_mesh.h`. Vertex and face
normals are now calculated on-demand and cached, retrieved with an
"ensure" function. Since the logical state of a mesh is now "has
normals when necessary", they can be retrieved from a `const` mesh.
The goal is to use on-demand calculation for all derived data, but
leave room for eager calculation for performance purposes (modifier
evaluation is threaded, but viewport data generation is not).
**Benefits**
This moves us closer to a SoA approach rather than the current AoS
paradigm. Accessing a contiguous `float3` is much more efficient than
retrieving data from a larger struct. The memory requirements for
accessing only normals or vertex locations are smaller, and at the
cost of more memory usage for just normals, they now don't have to
be converted between float and short, which also simplifies code
In the future, the remaining items can be removed from `MVert`,
leaving only `float3`, which has similar benefits (see T93602).
Removing the combination of derived and original data makes it
conceptually simpler to only calculate normals when necessary.
This is especially important now that we have more opportunities
for temporary meshes in geometry nodes.
**Performance**
In addition to the theoretical future performance improvements by
making `MVert == float3`, I've done some basic performance testing
on this patch directly. The data is fairly rough, but it gives an idea
about where things stand generally.
- Mesh line primitive 4m Verts: 1.16x faster (36 -> 31 ms),
showing that accessing just `MVert` is now more efficient.
- Spring Splash Screen: 1.03-1.06 -> 1.06-1.11 FPS, a very slight
change that at least shows there is no regression.
- Sprite Fright Snail Smoosh: 3.30-3.40 -> 3.42-3.50 FPS, a small
but observable speedup.
- Set Position Node with Scaled Normal: 1.36x faster (53 -> 39 ms),
shows that using normals in geometry nodes is faster.
- Normal Calculation 1.6m Vert Cube: 1.19x faster (25 -> 21 ms),
shows that calculating normals is slightly faster now.
- File Size of 1.6m Vert Cube: 1.03x smaller (214.7 -> 208.4 MB),
Normals are not saved in files, which can help with large meshes.
As for memory usage, it may be slightly more in some cases, but
I didn't observe any difference in the production files I tested.
**Tests**
Some modifiers and cycles test results need to be updated with this
commit, for two reasons:
- The subdivision surface modifier is not responsible for calculating
normals anymore. In master, the modifier creates different normals
than the result of the `Mesh` normal calculation, so this is a bug
fix.
- There are small differences in the results of some modifiers that
use normals because they are not converted to and from `short`
anymore.
**Future improvements**
- Remove `ModifierTypeInfo::dependsOnNormals`. Code in each modifier
already retrieves normals if they are needed anyway.
- Copy normals as part of a better CoW system for attributes.
- Make more areas use lazy instead of eager normal calculation.
- Remove `BKE_mesh_normals_tag_dirty` in more places since that is
now the default state of a new mesh.
- Possibly apply a similar change to derived face corner normals.
Differential Revision: https://developer.blender.org/D12770