`GLBatch::draw_indirect` has additional overhead compared to
`GLBatch::draw`, and can become a bottleneck in scenes that require
many draw calls (i.e. scenes with many unique meshes).
The performance difference is almost exclusively caused by the
`GL_COMMAND_BARRIER_BIT` barrier that happens on every call.
This PR adds a `GPU_storagebuf_sync_as_indirect_buffer` function that
can be used to place the barrier only once after filling the indirect
buffer content.
This function is a no-op in Vulkan and Metal since they don't need the
barrier.
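The intended call pattern can be sketched with a mock that only counts
barriers. Only `GPU_storagebuf_sync_as_indirect_buffer`,
`GLBatch::draw_indirect` and `GL_COMMAND_BARRIER_BIT` come from this
change; `MockBackend` and the helper functions are hypothetical
stand-ins, not Blender code.

```cpp
#include <cassert>

/* Hypothetical mock, not Blender code: counts issued command barriers. */
struct MockBackend {
  int barrier_count = 0;
  int draw_count = 0;

  /* Stand-in for a barrier with GL_COMMAND_BARRIER_BIT. */
  void command_barrier() { barrier_count++; }

  /* Old behavior: every indirect draw places its own barrier. */
  void draw_indirect_with_barrier()
  {
    command_barrier();
    draw_count++;
  }

  /* New behavior: the draw assumes the buffer was synced beforehand. */
  void draw_indirect() { draw_count++; }

  /* Stand-in for GPU_storagebuf_sync_as_indirect_buffer(). */
  void sync_as_indirect_buffer() { command_barrier(); }
};

/* Issue `draws` indirect draws the old way and return the barrier count. */
inline int barriers_per_draw(int draws)
{
  assert(draws >= 0);
  MockBackend backend;
  for (int i = 0; i < draws; i++) {
    backend.draw_indirect_with_barrier();
  }
  return backend.barrier_count;
}

/* Fill the buffer, sync once, then draw; return the barrier count. */
inline int barriers_synced_once(int draws)
{
  assert(draws >= 0);
  MockBackend backend;
  backend.sync_as_indirect_buffer();
  for (int i = 0; i < draws; i++) {
    backend.draw_indirect();
  }
  return backend.barrier_count;
}
```

In this mock the barrier count grows with the number of draws in the
first helper and stays at one in the second.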
Pull Request: https://projects.blender.org/blender/blender/pulls/117561
Some test cases are not supported when used with the OpenGL backend. These
test cases are easier to support when using Vulkan as we do control the
GPU->CPU data conversion logic.
We remove the test cases that aren't working yet for any backend and
skip test cases where OpenGL support is failing.
Specialization constants tests use points render primitives, but the
shader isn't capable of point rendering. For the test results it doesn't
matter as it only validates the vertex output, but it would trigger an
assert when using the Vulkan backend. The Vulkan backend is more strict and
currently signals these common errors.
Previously a storage buffer was used to store draw list commands as it
matches already existing APIs. Unfortunately, storage buffers prefer to
be stored on the GPU device, which would reduce the benefit of a dynamic
draw list.
This PR replaces the storage buffer with a regular buffer, which gives
more control over where the buffer is stored.
Pull Request: https://projects.blender.org/blender/blender/pulls/117712
The output of the Color Ramp node in the GPU compositor and EEVEE is
slightly off. That's because the factor is evaluated directly at the
sampler without proper half-pixel offsets to account for the sampler's
linear interpolation, which this patch adds.
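The usual half-pixel correction for a linearly filtered lookup texture
can be sketched as follows. This is a minimal illustration, assuming a
ramp texture `size` texels wide that is sampled with normalized
coordinates; the function name is hypothetical and the patch itself may
compute the offset differently.

```cpp
#include <cassert>

/* Hypothetical helper, not Blender code: maps a factor in [0, 1] to a
 * normalized texture coordinate so that 0 lands on the center of the first
 * texel and 1 on the center of the last texel. Sampling at `factor` directly
 * would instead start and end on the texture edges. */
inline float ramp_sample_coordinate(float factor, int size)
{
  assert(size > 0);
  return (factor * float(size - 1) + 0.5f) / float(size);
}
```

With a 256 texel ramp, a factor of 0 maps to 0.5 / 256 and a factor of 1
maps to 255.5 / 256.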
Pull Request: https://projects.blender.org/blender/blender/pulls/117677
While investigating Blender compilation time for windows-arm64, we
identified two compilation units that were taking a long time to compile
(~1h each). This affects windows-x64 builds as well.
Pull Request: https://projects.blender.org/blender/blender/pulls/117534
A draw list bundles multiple draw commands for the same geometry
and sends them to the GPU in a single command. This reduces
the overhead of pipeline checking and resource validation, and can
keep the load on the GPU higher as more work is submitted at once.
Previously the draw list didn't bundle any commands and would still
send each call separately to the GPU. This PR implements the bundling
of the commands.
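The bundling can be sketched as a small command queue that only submits
when the geometry changes or the queue is full. All names here
(`MockDrawList`, `DrawCommand`, `max_commands`) are hypothetical; the
real draw list talks to the GPU backend rather than counting
submissions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical sketch, not Blender code. */
struct DrawCommand {
  int vertex_first, vertex_count, instance_first, instance_count;
};

class MockDrawList {
  std::vector<DrawCommand> commands_;
  std::size_t max_commands_;
  int geometry_ = -1; /* Id of the geometry the pending commands belong to. */

 public:
  int submit_count = 0;  /* Number of bundled submissions. */
  int commands_sent = 0; /* Total number of draw commands sent. */

  explicit MockDrawList(std::size_t max_commands) : max_commands_(max_commands)
  {
    assert(max_commands > 0);
  }

  /* Queue a command; flush first if it cannot join the pending bundle. */
  void append(int geometry, const DrawCommand &command)
  {
    if (geometry != geometry_ || commands_.size() == max_commands_) {
      submit();
    }
    geometry_ = geometry;
    commands_.push_back(command);
  }

  /* Send all pending commands as one submission. */
  void submit()
  {
    if (commands_.empty()) {
      return;
    }
    submit_count++;
    commands_sent += int(commands_.size());
    commands_.clear();
  }
};

/* Returns how many submissions are needed for the given workload. */
inline int submissions_for(int draws_per_geometry, int geometries, std::size_t max_commands)
{
  MockDrawList list(max_commands);
  for (int g = 0; g < geometries; g++) {
    for (int i = 0; i < draws_per_geometry; i++) {
      list.append(g, DrawCommand{0, 3, i, 1});
    }
  }
  list.submit();
  assert(list.commands_sent == draws_per_geometry * geometries);
  return list.submit_count;
}
```

In this sketch, 10 draws for each of 3 geometries need 3 submissions
instead of 30 when the queue is large enough.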
Pull Request: https://projects.blender.org/blender/blender/pulls/117548
Small change to always opt in to using
invariant position in the vertex shader.
This ensures precision between position
calculations from different shaders which
need to produce the exact same result, by
disabling fastMath on only those instructions.
After benchmarking, this change does not appear
to affect performance bottlenecks, but it will
reduce the need for additional bias calculations.
Authored by Apple: Michael Parkin-White.
Pull Request: https://projects.blender.org/blender/blender/pulls/117478
When a shader performs a geometry shader injection to work around
features that are not supported natively on the GPU (viewport,
barycentric coordinates, layered rendering), linking would fail.
The reason was that the geometry shader was stored in a slot that was
patched by the specialization constants, resulting in an empty geometry
shader. An empty shader can be compiled, but doesn't match the interface
with other stages, so the linking would fail.
This fixes an EEVEE crash on Intel iGPUs, which don't natively
support the viewport feature.
Pull Request: https://projects.blender.org/blender/blender/pulls/117440
This PR changes where shader stages are attached to glPrograms.
Previously this was done when the shader stages were created, in the
function create_shader_stage.
This PR attaches the shader stages during program linking instead,
ensuring that create_shader_stage doesn't alter the program, a side
effect that isn't clear from its name.
Pull Request: https://projects.blender.org/blender/blender/pulls/117407
The term `PIL` stands for "platform independent library." It has existed since the `Initial Revision`
commit from 2002. Nowadays, we generally just use the `BLI` (blenlib) prefix for such code
and the `PIL` prefix feels more confusing than useful. Therefore, this patch renames
`PIL` to `BLI`.
Pull Request: https://projects.blender.org/blender/blender/pulls/117325
Ensure attachment states and load/store configs don't get out of sync
with the framebuffer layout.
In theory, a Framebuffer could have empty attachments interleaved with
valid ones so checking just the attachments "length" is not enough.
What this does instead is to ensure that valid attachments have a valid
config and that null attachments either don't have a matching config or
have an IGNORE/DONT_CARE one.
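The rule can be sketched as a per-slot check. The types below are
hypothetical and much simpler than the real framebuffer state, and the
sketch assumes that a "valid config" means any config other than
IGNORE.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

/* Hypothetical config values for illustration, not Blender's enums. */
enum class Config { Ignore, DontCare, Load, Clear };

/* Returns true when the configs agree with the attachment layout.
 * `configs` may be shorter than the attachment list, in which case the
 * trailing slots have no matching config. */
inline bool configs_match_layout(const std::vector<bool> &attachment_is_valid,
                                 const std::vector<Config> &configs)
{
  const std::size_t slots = std::max(attachment_is_valid.size(), configs.size());
  for (std::size_t i = 0; i < slots; i++) {
    const bool valid = i < attachment_is_valid.size() && attachment_is_valid[i];
    const bool has_config = i < configs.size();
    if (valid) {
      /* Assumption: a valid attachment needs a config that is not Ignore. */
      if (!has_config || configs[i] == Config::Ignore) {
        return false;
      }
    }
    else if (has_config && configs[i] != Config::Ignore && configs[i] != Config::DontCare) {
      /* A null attachment must not carry a real load/store config. */
      return false;
    }
  }
  return true;
}
```

Because every slot is checked, a layout with a null attachment between
two valid ones is handled, which comparing only the lengths would miss.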
Pull Request: https://projects.blender.org/blender/blender/pulls/117073
Generated copies of GLSL sources are kept in a std::string, but they
were always accessed through a long-lived StringRefNull, which led
to potential reads from unallocated memory as std::strings are
not null terminated.
Pull Request: https://projects.blender.org/blender/blender/pulls/117120
Allows specification of per-shader threadgroup memory tuning
to optimise performance through increase of GPU occupancy.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/115238
This PR adds support for specialization constants for the OpenGL
backend. The minimum OpenGL version we are targeting doesn't
have native support for specialization constants. We simulate this
by keeping track of shader programs for each set of specialization
constants that are being used.
Specialization constants can be used to reduce shader complexity
and improve performance as fewer registers are used and/or less
spilling is done.
This requires the ability to recompile GLShaders. In order to do this
we need to keep track of the sources that are used when the shader
was compiled. For static sources we only store references
(`GLSource::source_ref`), for dynamically generated sources we keep
a copy of the source (`GLSource::source`).
When recompiling the shader, GLSL source code is generated for the
constants stored in `Shader::constants`. When compiling, the previous
GLSource that contains the specialization constants is then replaced
by the new version.
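The simulation can be sketched as a cache of programs keyed by the
current constant values. `MockShader`, `SpecializationConstant` and
`constants_source` are hypothetical names, and the exact form of the
generated GLSL declarations is an assumption; the real backend compiles
and links GL programs instead of handing out counters.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

/* Hypothetical sketch, not Blender's GLShader. */
struct SpecializationConstant {
  std::string name;
  int value;
};

/* Generates the GLSL that would replace the previous constants source. */
inline std::string constants_source(const std::vector<SpecializationConstant> &constants)
{
  std::string source;
  for (const SpecializationConstant &constant : constants) {
    source += "const int " + constant.name + " = " + std::to_string(constant.value) + ";\n";
  }
  return source;
}

class MockShader {
  std::vector<SpecializationConstant> constants_;
  std::map<std::vector<int>, int> programs_; /* Constant values -> program handle. */
  int next_program_ = 1;

 public:
  int compile_count = 0;

  explicit MockShader(std::vector<SpecializationConstant> constants)
      : constants_(std::move(constants))
  {
  }

  void set_constant(std::size_t index, int value)
  {
    assert(index < constants_.size());
    constants_[index].value = value;
  }

  /* Returns the program for the current constants, compiling on first use. */
  int program_get()
  {
    std::vector<int> key;
    for (const SpecializationConstant &constant : constants_) {
      key.push_back(constant.value);
    }
    const auto it = programs_.find(key);
    if (it != programs_.end()) {
      return it->second;
    }
    /* A real backend would compile constants_source(constants_) together
     * with the stored sources and link a new program here. */
    compile_count++;
    const int program = next_program_++;
    programs_.emplace(key, program);
    return program;
  }
};
```

Switching a constant back to a value that was used before returns the
cached program, so each set of constants is compiled only once.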
Pull Request: https://projects.blender.org/blender/blender/pulls/116926
This PR cleans up shader builder CMake files and fixes Vulkan includes
that could not be found on all platforms.
* Reduce code duplication
* Use private var for MANIFEST on Windows
* Add system includes when compiling
Pull Request: https://projects.blender.org/blender/blender/pulls/115889
When anisotropic filtering is enabled on a sampler its value must be
between 1 and 16. In Blender it is possible to set a value lower than 1.
0 actually means that anisotropic filtering is disabled in Blender.
This would trigger a validation error in Vulkan.
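One way to map the Blender value onto the sampler state can be sketched
as below. `AnisotropyState` is a hypothetical mirror of the
`anisotropyEnable` and `maxAnisotropy` fields of `VkSamplerCreateInfo`,
and the upper bound of 16 is taken from the description above; this is
an illustration, not the actual patch.

```cpp
#include <algorithm>

/* Hypothetical mirror of two VkSamplerCreateInfo fields. */
struct AnisotropyState {
  bool enable;
  float max_anisotropy;
};

/* Values below 1 (including 0) disable anisotropic filtering. Enabled
 * values are clamped to the 1..16 range. */
inline AnisotropyState anisotropy_state(float user_value)
{
  if (user_value < 1.0f) {
    return {false, 1.0f};
  }
  return {true, std::min(user_value, 16.0f)};
}
```

With this mapping a value of 0 never reaches the sampler as an enabled
anisotropy level, which is what triggered the validation error.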
Pull Request: https://projects.blender.org/blender/blender/pulls/117018