Adds support for subpass transition for AMD/Intel IMR
GPUs. This enables correct functioning of EEVEE Next
deferred lighting pass on AMD platforms.
The emulation is consistent with the OpenGL approach
of generating additional texture bindings in the shader
for subpass inputs, and splitting render passes across
sub-pass boundaries.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/119784
Now that all relevant code is C++, the indirection from the C struct
`GPUVertBuf` to the C++ `blender::gpu::VertBuf` class just adds
complexity and necessitates a wrapper API, making more cleanups like
use of RAII or other C++ types more difficult.
This commit replaces the C wrapper structs with direct use of the
vertex and index buffer base classes. In C++ we can choose which parts
of a class are private, so we don't risk exposing too many
implementation details here.
Pull Request: https://projects.blender.org/blender/blender/pulls/119825
* Only works on machines with a Qualcomm Snapdragon 8cx Gen3 or above.
Older generation devices are not, and will not be, supported due to
some driver issues.
* Requires VS2022 for building.
* Uses new MSVC preprocessor for sse2neon compatibility.
* SIMD is not enabled, waiting on conversion of blenlib to C++.
Ref #119126
Pull Request: https://projects.blender.org/blender/blender/pulls/117036
This patch adds the maximum number of supported image units to the GPU
capabilities module. Currently, the GPU module assumes a maximum of 8
units, so the patch is not particularly useful yet, but we can
consider committing it for the future anyway.
Pull Request: https://projects.blender.org/blender/blender/pulls/119057
Adds an option to set the capture title when using RenderDoc.
`GPU_debug_capture_begin` has an optional `title` parameter to set
the title of the RenderDoc capture.
Pull Request: https://projects.blender.org/blender/blender/pulls/118649
Span is preferable since it's agnostic of the source container,
makes it clearer that there is no ownership, is 8 bytes smaller,
and can be passed by value.
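As a rough illustration of the container-agnostic, pass-by-value point, here is a minimal sketch. `SpanSketch` and `sum` are hypothetical names invented for this example; this is not Blender's `Span` class, only a pointer-plus-size stand-in showing the same idea.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a non-owning span; not Blender's actual class.
template<typename T> struct SpanSketch {
  const T *data = nullptr;
  std::size_t size = 0;

  // Implicit conversions from different source containers.
  SpanSketch(const std::vector<T> &v) : data(v.data()), size(v.size()) {}
  template<std::size_t N>
  SpanSketch(const std::array<T, N> &a) : data(a.data()), size(N) {}

  const T *begin() const { return data; }
  const T *end() const { return data + size; }
};

// Taken by value: copying the span copies a pointer and a size, never the elements.
inline int sum(SpanSketch<int> values)
{
  int total = 0;
  for (const int value : values) {
    total += value;
  }
  return total;
}
```

The same `sum` accepts a `std::vector<int>` or a `std::array<int, N>` without overloads, and the signature makes it clear that the callee does not own the data.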
`GLBatch::draw_indirect` has additional overhead compared to
`GLBatch::draw`, and can become a bottleneck in scenes that require
many draw calls (i.e. with many unique meshes).
The performance difference is almost exclusively caused by the
`GL_COMMAND_BARRIER_BIT` barrier that happens on every call.
This PR adds a `GPU_storagebuf_sync_as_indirect_buffer` function that
can be used to place the barrier only once after filling the indirect
buffer content.
This function is a no-op in Vulkan and Metal since they don't need the
barrier.
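The difference between a barrier per draw and a single sync can be sketched with a mock. Everything below (`MockBackend` and both helper functions) is hypothetical and only counts calls; it does not use the real GPU module API.

```cpp
#include <cassert>

// Mock of a backend that counts memory barriers; hypothetical, not Blender code.
struct MockBackend {
  int barrier_count = 0;
  int draw_count = 0;

  void command_barrier() { barrier_count++; }
  void draw_indirect_raw() { draw_count++; }
};

// Old pattern: every indirect draw issues its own barrier.
inline void draw_all_barrier_per_call(MockBackend &backend, int draws)
{
  assert(draws >= 0);
  for (int i = 0; i < draws; i++) {
    backend.command_barrier();
    backend.draw_indirect_raw();
  }
}

// New pattern: one explicit sync after the indirect buffer is filled,
// followed by draws that no longer need a barrier of their own.
inline void draw_all_single_sync(MockBackend &backend, int draws)
{
  assert(draws >= 0);
  backend.command_barrier();
  for (int i = 0; i < draws; i++) {
    backend.draw_indirect_raw();
  }
}
```

With 100 draws the first variant issues 100 barriers and the second issues one, which is the saving the change is after.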
Pull Request: https://projects.blender.org/blender/blender/pulls/117561
When a shader performs a geometry shader injection to work around
features that are not supported natively on the GPU (viewport,
barycentric coordinates, layered rendering), linking would fail.
The reason was that the geometry shader was stored in a slot that was
patched by the specialization constants, resulting in an empty geometry
shader. An empty shader can be compiled, but doesn't match the interface
with other stages, so the linking would fail.
This fixes the issue that EEVEE crashed on Intel iGPUs. These GPUs
don't support viewports.
Pull Request: https://projects.blender.org/blender/blender/pulls/117440
This PR changes where shader stages are attached to GL programs.
Previously this was done when shader stages were created, in the function
`create_shader_stage`.
With this PR the shader stages are attached during program linking
instead, ensuring that `create_shader_stage` doesn't alter the program,
a side effect that its name doesn't make clear.
Pull Request: https://projects.blender.org/blender/blender/pulls/117407
Ensure attachment states and load/store configs don't get out of sync
with the framebuffer layout.
In theory, a Framebuffer could have empty attachments interleaved with
valid ones so checking just the attachments "length" is not enough.
What this does instead is to ensure that valid attachments have a valid
config and that null attachments either don't have a matching config or
have an IGNORE/DONT_CARE one.
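The rule can be sketched as a per-slot check rather than a length comparison. The types below are hypothetical (`ConfigKind`, `Slot`), and "valid config" is assumed here to mean simply that a config is present for the slot; the real check may be stricter.

```cpp
#include <cassert>
#include <vector>

// Hypothetical config states; Ignore and DontCare mirror IGNORE/DONT_CARE.
enum class ConfigKind { Missing, Ignore, DontCare, Load, Clear };

struct Slot {
  bool has_texture;   // false for an empty (null) attachment
  ConfigKind config;  // load/store config supplied for this slot
};

inline bool config_matches_layout(const std::vector<Slot> &slots)
{
  for (const Slot &slot : slots) {
    const bool passive = slot.config == ConfigKind::Missing ||
                         slot.config == ConfigKind::Ignore ||
                         slot.config == ConfigKind::DontCare;
    if (slot.has_texture) {
      // Assumption: a bound attachment needs some config to be present.
      if (slot.config == ConfigKind::Missing) {
        return false;
      }
    }
    else if (!passive) {
      // A null attachment must not carry an active load/store config.
      return false;
    }
  }
  return true;
}
```

Because each slot is checked on its own, a framebuffer with an empty attachment between two valid ones is handled correctly, which a plain count of attachments would miss.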
Pull Request: https://projects.blender.org/blender/blender/pulls/117073
Generated copies of GLSL sources are kept in a std::string and
it was always accessed through a long-lived StringRefNull, which led
to potential reads from unallocated memory as std::strings are
not null terminated.
Pull Request: https://projects.blender.org/blender/blender/pulls/117120
This PR adds support for specialization constants for the OpenGL
backend. The minimum OpenGL version we are targeting doesn't
have native support for specialization constants. We simulate this
by keeping track of shader programs for each set of specialization
constants that are being used.
Specialization constants can be used to reduce shader complexity
and improve performance as fewer registers are used and/or less spilling is done.
This requires the ability to recompile GLShaders. In order to do this
we need to keep track of the sources that are used when the shader
was compiled. For static sources we only store references
(`GLSource::source_ref`), for dynamically generated sources we keep
a copy of the source (`GLSource::source`).
When recompiling the shader, GLSL source code is generated for the
constants stored in `Shader::constants`. When compiling, the previous
GLSource that contains specialization constants is then replaced
by the new version.
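A minimal sketch of the emulation idea: keep one compiled program per set of constant values and regenerate the constants source only on a cache miss. `ShaderSketch` is hypothetical, the "program" is just the final source string, and the `const int name = value;` form of the generated declarations is an assumption made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a shader that emulates specialization constants.
struct ShaderSketch {
  std::string static_source;                           // source without the constants
  std::vector<std::pair<std::string, int>> constants;  // name and current value
  std::map<std::vector<int>, std::string> programs;    // one "program" per value set
  int compile_count = 0;

  // Generate the source fragment that bakes the current values in.
  std::string constants_source() const
  {
    std::string result;
    for (const auto &constant : constants) {
      result += "const int " + constant.first + " = " + std::to_string(constant.second) + ";\n";
    }
    return result;
  }

  // Return the program for the current values, "compiling" only on a cache miss.
  const std::string &program_get()
  {
    std::vector<int> key;
    for (const auto &constant : constants) {
      key.push_back(constant.second);
    }
    auto it = programs.find(key);
    if (it == programs.end()) {
      compile_count++;
      it = programs.emplace(key, constants_source() + static_source).first;
    }
    return it->second;
  }
};
```

Switching a constant back to a value that was used before hits the cache, so the cost of recompilation is paid once per distinct value set.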
Pull Request: https://projects.blender.org/blender/blender/pulls/116926
This avoids the cost of creating the tiles themselves, which uses a lot of
texture writes. This was a bottleneck on Apple GPUs.
Also the per pixel classification allows us to remove certain checks in
the deferred lighting shader making it faster.
### TODO
- [x] Add gl_FragStencilRefARB support on other backends
- [x] Add workaround for when gl_FragStencilRefARB isn't supported
Pull Request: https://projects.blender.org/blender/blender/pulls/116704
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.
If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
This adds `#line` directives between the
injected source files so that the log parser knows
which file the errors originated from.
This is then followed by a scan over the combined
source to find out the real row number.
This needed some changes in `Shader::print_log`
to skip lines and avoid outputting redundant information.
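The two steps (inject directives, then scan for the real row) could look roughly like the sketch below. Both function names are hypothetical, and the directive form `#line 1 <source index>` is an assumption: it relies on the GLSL convention where the line following `#line N` is numbered N.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Join sources, prefixing each with a directive that records its index.
inline std::string combine_sources(const std::vector<std::string> &sources)
{
  std::string combined;
  for (std::size_t i = 0; i < sources.size(); i++) {
    combined += "#line 1 " + std::to_string(i) + "\n";
    combined += sources[i];
    if (combined.back() != '\n') {
      combined += '\n';
    }
  }
  return combined;
}

// Scan the combined source for the directive of `source_index` and return the
// 1-based row of `local_line` within the combined text, or -1 if not found.
inline int find_combined_row(const std::string &combined, int source_index, int local_line)
{
  assert(local_line >= 1);
  const std::string directive = "#line 1 " + std::to_string(source_index);
  std::istringstream stream(combined);
  std::string line;
  int row = 0;
  while (std::getline(stream, line)) {
    row++;
    if (line == directive) {
      return row + local_line;
    }
  }
  return -1;
}
```

A compiler message that reports "source 1, line 2" can then be mapped back to the row in the combined text that the user actually sees in the log.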
Adds API to allow usage of specialization constants in shaders.
Specialization constants are dynamic runtime constants which can
be compiled into a shader pipeline state object (PSO) to improve
runtime performance by reducing shader complexity through
shader compiler constant-folding.
This API allows specialization constant values to be specified
along with a default value if no constant value has been declared.
Each GPU backend is then responsible for caching PSO permutations
against the current specialization configuration.
This patch adds support for specialization constants in the
Metal backend and provides a generalised high-level solution
which can be adopted by other graphics APIs supporting
this feature.
Authored by Apple: Michael Parkin-White
Authored by Blender: Clément Foucault (files in gpu/test folder)
Pull Request: https://projects.blender.org/blender/blender/pulls/115193
According to the issue, not all legacy AMD platforms that required the
high quality normals workaround were enabled. I have not been able to
reproduce the issue due to hardware availability.
This PR will enable the workaround for all HD ATI GPUs.
Pull Request: https://projects.blender.org/blender/blender/pulls/116340
Due to recent changes a cached patch string in GLShader grew out of
its bounds. This resulted in incorrect shader generation on selected
platforms (reported on Windows/NVIDIA). The patch string can differ
based on the features that the GPU supports.
This PR replaces the old C-style string generation with a C++-style
string stream, making sure that the allocated memory grows with the
size of the string.
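A small sketch of the stream-based approach, under the assumption that the patch is a GLSL preamble built from a list of enabled extensions. `build_patch` and its contents are hypothetical; the point is only that the stream grows with the content instead of writing into a fixed-size buffer.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical patch builder; the real patch depends on the GPU's capabilities.
inline std::string build_patch(const std::vector<std::string> &extensions)
{
  std::stringstream ss;
  ss << "#version 430\n";
  for (const std::string &extension : extensions) {
    // Each supported feature appends text; no fixed upper bound is needed.
    ss << "#extension " << extension << " : enable\n";
  }
  return ss.str();
}
```

However many features are enabled, the returned string holds the whole patch, so adding a new feature line cannot overflow anything.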
Pull Request: https://projects.blender.org/blender/blender/pulls/116085
This patch adds an alternative path for devices/OSs
which do not support native texture atomics in Metal.
Support is encapsulated within the backend, ensuring
any allocated texture with the USAGE_ATOMIC flag is
allocated with a backing buffer, upon which atomic
operations happen.
The shader generation is also changed for the atomic
case, which instructs the backend to insert additional
buffer bind-points for the buffer resource. As Metal
also only supports buffer-backed textures for
textureBuffers or 2D textures, TextureArrays and
3D textures are emulated within a 2D texture, with
sample locations being indirected.
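To illustrate what indirecting sample locations can mean, here is a sketch that maps a texel of a layered texture into a 2D atlas. The grid layout, `Texel2D`, `atlas_coord` and the `layers_per_row` parameter are all assumptions made for this example; the layout the backend actually uses is not described in the text.

```cpp
#include <cassert>

struct Texel2D {
  int x;
  int y;
};

// Map (x, y, layer) of a width*height layered texture into a 2D atlas where
// layers are tiled left to right, `layers_per_row` tiles per row (assumed layout).
inline Texel2D atlas_coord(int x, int y, int layer, int width, int height, int layers_per_row)
{
  assert(x >= 0 && x < width);
  assert(y >= 0 && y < height);
  assert(layer >= 0 && layers_per_row > 0);
  const int tile_x = layer % layers_per_row;
  const int tile_y = layer / layers_per_row;
  return {tile_x * width + x, tile_y * height + y};
}
```

Under this layout, texel (3, 2) of layer 5 in a 16x8 texture with 4 layers per row lands at (19, 10) in the atlas.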
All usage of atomic textures MUST now utilise the
correct atomic texture types in the high level shader
and GPUShaderCreateInfo declarations.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/115956
This de-duplicates some passes in the raytracing
pipeline and makes it more ready for the adoption
of arbitrary closure evaluation. This last part
means the removal of some per closure type
options.
This puts in common the tile classification step,
which is now done only once for all 3 closure
types. It also adds some speedup to the tile
compaction phase, which is now twice as
fast.
The horizon-scan setup was also de-duplicated
and runs only if needed, which can save up to
0.5ms in complex scenes.
However, this moves the max-roughness and
resolution scaling to a common parameter.
This is to be able to support arbitrary closure
evaluation where multiple closures with conflicting
parameters could be evaluated in one tracing pass.
Pull Request: https://projects.blender.org/blender/blender/pulls/116009
NDEBUG is part of the C standard and disables asserts. Only this will
now be used to decide if asserts are enabled.
DEBUG was a Blender specific define, that has now been removed.
_DEBUG is a Visual Studio define for builds in Debug configuration.
Blender defines this for all platforms. This is still used in a few
places in the draw code, and in external libraries Bullet and Mantaflow.
Pull Request: https://projects.blender.org/blender/blender/pulls/115774
These drivers crash on startup due to a driver bug. This includes the
latest drivers for legacy Intel CPUs with a HD 4000/HD 5000 series GPU.
To be on the safe side all drivers with version 20.19.15.51* will be marked
unsupported as we don't have the platforms to identify the precise driver
versions that fail.
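Matching every driver whose version starts with `20.19.15.51` amounts to a prefix test. The helper below is a hypothetical sketch of that idea and is not the actual platform detection code.

```cpp
#include <cassert>
#include <string>

// Returns true when the version string starts with the blocked prefix.
inline bool is_blocked_driver(const std::string &version)
{
  const std::string prefix = "20.19.15.51";
  // compare() on a too-short version yields a mismatch, so no bounds check is needed.
  return version.compare(0, prefix.size(), prefix) == 0;
}
```

Any build number that follows the prefix is treated the same way, which is the "safe side" behaviour described above.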
See #113124 for more information.
Pull Request: https://projects.blender.org/blender/blender/pulls/115228
OpenGL uses a depth range between -1 and 1, which is then normalized.
Metal & Vulkan use a depth range between 0 and 1, which is already normalized.
The final plan would be to default to a depth range between 0 and 1, but
for now the depth ranges are retargeted so they won't be clipped away.
This solves the following issues for users:
- Navigate control will be rendered correctly
- Orthographic view clipping artifacts
- EEVEE light evaluation
Retargeting happens at the end of the vertex stage or, when a geometry
stage is present, at the end of the geometry stage. Derivatives using
depth would have a different value compared to OpenGL, but would match
Metal backend. OpenGL performs clipping and generates derivatives based
on the original depth value.
`gl_FragCoord` and clipping would have some precision differences as clipping
and normalizing are done in a different order but would match Metal.
Geometry shaders should use `gpu_EmitVertex` to ensure that the retargeting
is done per vertex.
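For reference, the usual way to remap an OpenGL-style clip-space depth (z in [-w, w]) to a 0..1 style range (z in [0, w]) is z' = (z + w) / 2. The sketch below applies that formula on the CPU; `Vec4` and `retarget_depth` are hypothetical names, and the assumption is that the retargeting described above uses this standard remap.

```cpp
#include <cassert>

struct Vec4 {
  float x, y, z, w;
};

// Remap clip-space z from [-w, w] to [0, w]; x, y and w are left untouched.
inline Vec4 retarget_depth(Vec4 position)
{
  position.z = (position.z + position.w) * 0.5f;
  return position;
}
```

After the perspective divide, the near plane (z = -w) ends up at depth 0 and the far plane (z = w) at depth 1, so geometry that was valid in the OpenGL range is no longer clipped away.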
Pull Request: https://projects.blender.org/blender/blender/pulls/114669