This allows multiple threads to request different specializations without
locking usage of all specialized shaders program when a new specialization
is being compiled.
The specialization constants are bundled in a structure that is being
passed to the `Shader::bind()` method. The structure is owned by the
calling thread and only used by the `Shader::bind()`.
Only querying for the specialized shader (Map lookup) is locking the shader
usage.
The variant compilation is now also locking and ensured that
multiple thread trying to compile the same variant will never result
in race condition.
Note that this removes the `is_dirty` optimization. This can be added
back if this becomes a bottleneck in the future. Otherwise, the
performance impact is not noticeable.
Pull Request: https://projects.blender.org/blender/blender/pulls/136991
Cause by trying to deference a null batch.
In normal case, these calls never create a command
and are discarded early. 4.4 introduced the
polyline_draw_workaround to remove the use of geometry
shaders. These were not guarded against zero vertices
calls.
Adding an early out clause fixes the issue.
Pull Request: https://projects.blender.org/blender/blender/pulls/136840
This was caused by the deinterleaved format being
incorrectly decoded by the `bind_attribute_as_ssbo`
function.
Accumulating the offset should be done for all attributes
and not only the one being used. Furthermore, this needs
to happen only once per attribute and not once per name.
Moving the offset computation out of the name loop
fixes the issue.
Pull Request: https://projects.blender.org/blender/blender/pulls/136821
This port is not so straightforward.
This shader is used in different configurations and is
available to python bindings. So we need to keep
compatibility with different attributes configurations.
This is why attributes are loaded per component and a
uniform sets the length of the component.
Since this shader can be used from both the imm and batch
API, we need to inject some workarounds to bind the buffers
correctly.
The end result is still less versatile than the previous
metal workaround (i.e.: more attribute fetch mode supported),
but it is also way less code.
### Limitations:
The new shader has some limitation:
- Both `color` and `pos` attributes need to be `F32`.
- Each attribute needs to be 4byte aligned.
- Fetch type needs to be `GPU_FETCH_FLOAT`.
- Primitive type needs to be `GPU_PRIM_LINES`, `GPU_PRIM_LINE_STRIP` or `GPU_PRIM_LINE_LOOP`.
- If drawing using an index buffer, it must contain no primitive restart.
Rel #127493
Co-authored-by: Jeroen Bakker <jeroen@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/129315
This PR introduces the concept of primitive expansion draws.
This allows to create a drawcall that will generate N amount of new
primitive for an original primitive in a `gpu::Batch`. The intent is to
phase out the use of geometry shader for this purpose.
This adds a new `Frequency::GEOMETRY` only available for SSBOs.
The resources using this will be fed the current `gpu::Batch` VBOs
using name matching.
A dedicated slot is reserved for the index buffer, which has its own
internal lib to decode the index buffer content.
A new attribute lib is added to ease the loading of unaligned attribute.
This should be revisited and made obsolete once more refactor
lands.
It is similar to the Metal backend SSBO vertex fetch path but it is
defined on a different level. The main difference is that this PR is
backend independant and modify the draw module instead of the GPU
module. However, it doesn't cover all possible attribute conversion
cases. This will only be added if needed.
This system is less automatic than the Metal backend one and needs
more care to make sure the data matches what the shader expects.
The Metal system will be removed once all its usage have been
converted.
This PR only shows example usage for workbench shadows. Cleanup PRs
will follow this one.
Rel #105221
Pull Request: https://projects.blender.org/blender/blender/pulls/125782
For Batch::verts some values weren't cleared,
for Batch::inst values after the array would be cleared,
although as these were already zeroed this probably didn't cause
problems in practice.
Now that all relevant code is C++, the indirection from the C struct
`GPUVertBuf` to the C++ `blender::gpu::VertBuf` class just adds
complexity and necessitates a wrapper API, making more cleanups like
use of RAII or other C++ types more difficult.
This commit replaces the C wrapper structs with direct use of the
vertex and index buffer base classes. In C++ we can choose which parts
of a class are private, so we don't risk exposing too many
implementation details here.
Pull Request: https://projects.blender.org/blender/blender/pulls/119825
A lot of files were missing copyright field in the header and
the Blender Foundation contributed to them in a sense of bug
fixing and general maintenance.
This change makes it explicit that those files are at least
partially copyrighted by the Blender Foundation.
Note that this does not make it so the Blender Foundation is
the only holder of the copyright in those files, and developers
who do not have a signed contract with the foundation still
hold the copyright as well.
Another aspect of this change is using SPDX format for the
header. We already used it for the license specification,
and now we state it for the copyright as well, following the
FAQ:
https://reuse.software/faq/
Documented all functions, adding use case and side effects.
Also replace the use of shortened argument name by more meaningful ones.
Renamed `GPU_batch_instbuf_add_ex` and `GPU_batch_vertbuf_add_ex` to remove
the `ex` suffix as they are the main version used (removed the few usage
of the other version).
Renamed `GPU_batch_draw_instanced` to `GPU_batch_draw_instance_range` and
make it consistent with `GPU_batch_draw_range`.
This is a new implementation of the draw manager using modern
rendering practices and GPU driven culling.
This only ports features that are not considered deprecated or to be
removed.
The old DRW API is kept working along side this new one, and does not
interfeer with it. However this needed some more hacking inside the
draw_view_lib.glsl. At least the create info are well separated.
The reviewer might start by looking at `draw_pass_test.cc` to see the
API in usage.
Important files are `draw_pass.hh`, `draw_command.hh`,
`draw_command_shared.hh`.
In a nutshell (for a developper used to old DRW API):
- `DRWShadingGroups` are replaced by `Pass<T>::Sub`.
- Contrary to DRWShadingGroups, all commands recorded inside a pass or
sub-pass (even binds / push_constant / uniforms) will be executed in order.
- All memory is managed per object (except for Sub-Pass which are managed
by their parent pass) and not from draw manager pools. So passes "can"
potentially be recorded once and submitted multiple time (but this is
not really encouraged for now). The only implicit link is between resource
lifetime and `ResourceHandles`
- Sub passes can be any level deep.
- IMPORTANT: All state propagate from sub pass to subpass. There is no
state stack concept anymore. Ensure the correct render state is set before
drawing anything using `Pass::state_set()`.
- The drawcalls now needs a `ResourceHandle` instead of an `Object *`.
This is to remove any implicit dependency between `Pass` and `Manager`.
This was a huge problem in old implementation since the manager did not
know what to pull from the object. Now it is explicitly requested by the
engine.
- The pases need to be submitted to a `draw::Manager` instance which can
be retrieved using `DRW_manager_get()` (for now).
Internally:
- All object data are stored in contiguous storage buffers. Removing a lot
of complexity in the pass submission.
- Draw calls are sorted and visibility tested on GPU. Making more modern
culling and better instancing usage possible in the future.
- Unit Tests have been added for regression testing and avoid most API
breakage.
- `draw::View` now contains culling data for all objects in the scene
allowing caching for multiple views.
- Bounding box and sphere final setup is moved to GPU.
- Some global resources locations have been hardcoded to reduce complexity.
What is missing:
- ~~Workaround for lack of gl_BaseInstanceARB.~~ Done
- ~~Object Uniform Attributes.~~ Done (Not in this patch)
- Workaround for hardware supporting a maximum of 8 SSBO.
Reviewed By: jbakker
Differential Revision: https://developer.blender.org/D15817
Previously, objects and geometries were mapped between frames
using different hash tables in a way that is incompatible with
geometry instances. That is because the geometry mapping happened
without looking at the `persistent_id` of instances, which is not possible
anymore. Now, there is just one mapping that identifies the same
object at multiple points in time.
There are also two new caches for duplicated vbos and textures used for
motion blur. This data has to be duplicated, otherwise it would be freed
when another time step is evaluated. This caching existed before, but is
now a bit more explicit and works for geometry instances as well.
Differential Revision: https://developer.blender.org/D13497
Use a shorter/simpler license convention, stops the header taking so
much space.
Follow the SPDX license specification: https://spdx.org/licenses
- C/C++/objc/objc++
- Python
- Shell Scripts
- CMake, GNUmakefile
While most of the source tree has been included
- `./extern/` was left out.
- `./intern/cycles` & `./intern/atomic` are also excluded because they
use different header conventions.
doc/license/SPDX-license-identifiers.txt has been added to list SPDX all
used identifiers.
See P2788 for the script that automated these edits.
Reviewed By: brecht, mont29, sergey
Ref D14069
Modernize loops by using the `for(type variable : container)` syntax.
Some loops were trivial to fix, whereas others required more attention
to avoid semantic changes. I couldn't address all old-style loops, so
this commit doesn't enable the `modernize-loop-convert` rule.
Although Clang-Tidy's auto-fixer prefers to use `auto` for the loop
variable declaration, I made as many declarations as possible explicit.
To me this increases local readability, as you don't need to fully
understand the container in order to understand the loop variable type.
No functional changes.
This allows subsequent redraw to work just like before.
However a better safeguard system against setting the uniforms in the
wrong shader would be nice to have.
This makes the GPUContext follow the same naming convention as the rest
of the module.
Also add a static getter for extra bonus style (no need for casts):
- Context::get()
- GLContext::get()