Commit Graph

653 Commits

Author SHA1 Message Date
Clément Foucault
d955ebce30 EEVEE: Default Startup Speedup
Low hanging fruit optimizations for improving default
startup time.

Went from 7.2sec to 4sec on my system.

Pull Request: https://projects.blender.org/blender/blender/pulls/139278
2025-05-22 15:57:41 +02:00
Miguel Pozo
7654be9e88 Fix: GPU: Profiling for compilation contexts 2025-05-22 15:55:27 +02:00
Clément Foucault
caac241c84 GPU: Make Shader Specialization Constant API Thread Safe
This allows multiple threads to request different specializations without
locking usage of all specialized shaders program when a new specialization
is being compiled.

The specialization constants are bundled in a structure that is being
passed to the `Shader::bind()` method. The structure is owned by the
calling thread and only used by the `Shader::bind()`.
Only querying for the specialized shader (Map lookup) is locking the shader
usage.

The variant compilation is now also locking and ensured that
multiple thread trying to compile the same variant will never result
in race condition.

Note that this removes the `is_dirty` optimization. This can be added
back if this becomes a bottleneck in the future. Otherwise, the
performance impact is not noticeable.

Pull Request: https://projects.blender.org/blender/blender/pulls/136991
2025-05-19 17:42:55 +02:00
Clément Foucault
ca88983af2 EEVEE: Reverse-Z implementation
This feature greatly increase depth buffer precision.
This is very noticeable in large view distance scenes.

This is enabled by default on GPUs that supports it (most of the hardware we
support already supports this). This makes rendering different on the GPUs
that do not support that feature (`glClipControl`).

While this give much better depth precision as before, we also have a lot of
imprecision caused by our vertex transformations. This can be improved in
another task.

Pull Request: https://projects.blender.org/blender/blender/pulls/138898
2025-05-19 16:29:26 +02:00
Clément Foucault
8ac5940e33 GPU: Add GL_ARB_clip_control support to the GL backend
This adds support for the extension and always
set the clip state value to 0..1 to align with vulkan
and metal. Moreover this is needed for the Reverse Z
implementation.

Note that this is a OpenGL 4.5 feature and is not
required to start Blender. So there must still be
a fallback path for now.

Rel #138898

Pull Request: https://projects.blender.org/blender/blender/pulls/138941
2025-05-16 13:53:36 +02:00
Hans Goudey
91627b3d47 GPU: Remove int float fetch mode combination
This commit finishes removing the uses of the integer to float
vertex buffer fetch mode. Previous commits noted below already started
that process. The last usage was geometry attributes. Now integers are
converted to floats as part of the existing upload process.

The change makes the Vulkan vertex buffer type conversion unused, so
it's removed. That's nice because Vulkan vertex buffers go from 1040 to
568 bytes in size and have significantly less overhead on creation.

Related:
- 153abc372e
- 1e1ac2bb9b
- 617858e453

Pull Request: https://projects.blender.org/blender/blender/pulls/138873
2025-05-15 15:29:12 +02:00
Miguel Pozo
c16aba915f GPU: Add GPU_shader_batch_cancel
Fix the recently implemented ShaderCompiler::batch_cancel.
Expose it with GPU_shader_batch_cancel and
GPU_shader_specialization_batch_cancel.
Use them in the EEVEE ShaderModule destructor, to prevent blocking on
destruction when there are in-flight compilations.

Pull Request: https://projects.blender.org/blender/blender/pulls/138774
2025-05-12 19:54:03 +02:00
Campbell Barton
2cd2f2ea4d Cleanup: quiet missing parenthesis & unused function warnings 2025-05-09 02:13:33 +00:00
Miguel Pozo
992e7c95a7 GPU: Converge ShaderCompiler implementations
Part of #136993.

Share as much of the ShaderCompiler implementations as possible.
Remove the ShaderCompiler/ShaderCompilerGeneric split and make most of
its functions non virtual.
Move the `get_compiler` function from `Context` to `GPUBackend` and
creation/deletion to `GPUBackend::init/delete_resources`.
Add a `batch_cancel` function to `ShaderCompiler` (needed for the
GPUPass refactor).

As a nice extra, the multithreaded OpenGL compilation has become faster
too.
The barbershop materials + EEVEE static shaders have gone from 27s to
22s.

I have not observed any performance difference on Vulkan or Metal.

Pull Request: https://projects.blender.org/blender/blender/pulls/136676
2025-05-08 18:16:47 +02:00
Clément Foucault
8dee08996e GPU: Shader: Add wrapper to stage agnostic function
This avoid having to guards functions that are
only available in fragment shader stage.

Calling the function inside another stage is still
invalid and will yield a compile error on Metal.

The vulkan and opengl glsl patch need to be modified
per stage to allow the fragment specific function
to be defined.

This is not yet widely used, but a good example is
the change in `film_display_depth_amend`.

Rel #137261

Pull Request: https://projects.blender.org/blender/blender/pulls/138280
2025-05-05 09:59:00 +02:00
Sergey Sharybin
806a633609 GPU: Implement blending mode for piercing a hole in alpha
The use-case of this blend mode is to be able to make parts of an
viewport overlay transparent.

The future user of this blend mode is sequencer preview drawing
where frame will be drawn to an HDR render frame-buffer, and overlays
drawn on-top. In a way it is similar to the image engine, but without
need to have custom shader.

Ref #138094

Pull Request: https://projects.blender.org/blender/blender/pulls/138307
2025-05-02 10:40:23 +02:00
Campbell Barton
43af16a4c1 Cleanup: spelling in comments, correct comment block formatting
Also use doxygen comments more consistently.
2025-05-01 11:44:33 +10:00
Hans Goudey
ec33994c62 Cleanup: Remove unused VertBuf::duplicate() function
This is completely unused, not implemented for the Vulkan backend, and
seems to add quite a bit of complexity to the Metal and OpenGL backends.
It was added for EEVEE legacy motion blur, and the last use was removed
along with EEVEE legacy. We're probably better off not maintaining it since
we should avoid duplicating vertex buffer data anyway.

Pull Request: https://projects.blender.org/blender/blender/pulls/138226
2025-04-30 22:35:47 +02:00
Brecht Van Lommel
b8b7f71520 Vulkan: Implement native handles for pixel buffers
* Pixel buffer is always allocated with export and dedicated memory flags.
* Returns an opaque file descriptor (Unix) or handle (Windows).
* Native handle now includes memory size as it may be slightly bigger
  than the requested size.

Pull Request: https://projects.blender.org/blender/blender/pulls/137363
2025-04-28 11:38:56 +02:00
Clément Foucault
59df50c326 GPU: Refactor Qualifier and ImageType
This allow to use types closer to GLSL in resource
declaration.

These are aliased for clarity in the GPU
module (i.e. `isampler2D` is shortened to `Int2D`).

Rel #137446

Pull Request: https://projects.blender.org/blender/blender/pulls/137954
2025-04-24 14:38:13 +02:00
Miguel Pozo
41ae990b29 Cleanup: Remove StencilViewWorkaround and GPU_texture_view_support()
Texture views should be fully supported everywhere now.

Pull Request: https://projects.blender.org/blender/blender/pulls/137863
2025-04-23 20:57:00 +02:00
Brecht Van Lommel
fb2ba20b67 Refactor: Use more typed MEM_calloc<> and MEM_malloc<>
Pull Request: https://projects.blender.org/blender/blender/pulls/137822
2025-04-22 11:22:18 +02:00
Brecht Van Lommel
388a21e260 Refactor: Eliminate various void pointers passed to MEM_freeN
It's safer to pass a type so that it can be checked if delete should be
used instead. Also changes a few void pointer casts to const_cast so that
if the data becomes typed it's an error.

Pull Request: https://projects.blender.org/blender/blender/pulls/137404
2025-04-21 17:59:41 +02:00
Brecht Van Lommel
637c6497e9 Refactor: Use more typed MEM_calloc<>, avoid unnecessary size_t cast
Handle some cases that were missed in previous refactor. And eliminate
unnecessary size_t casts as these could hide issues.

Pull Request: https://projects.blender.org/blender/blender/pulls/137404
2025-04-21 17:59:41 +02:00
Campbell Barton
3933f45f52 Cleanup: move doc-strings to declarations
Move into headers or to the top of the function body for internal
implementation details, in some cases remove duplicate doc-strings.
2025-04-18 22:58:36 +10:00
Campbell Barton
64f5dee6d7 Cleanup: spelling in comments (make check_spelling_*) 2025-04-17 12:06:12 +10:00
Clément Foucault
9990273d04 GPU: Change Type enum to use lower case values
This is to help for future resource declaration
using macros.

Rel #137261

Pull Request: https://projects.blender.org/blender/blender/pulls/137367
2025-04-11 22:39:01 +02:00
Miguel Pozo
a5ed5dc4bf GPU: Support deferred compilation in ShaderCompilerGeneric
Update the `ShaderCompilerGeneric` to support deferred compilation
using the batch compilation API, so we can get rid of
`drw_manager_shader`.
This approach also allows supporting non-blocking compilation
for static shaders.

This shouldn't cause any behavior changes at the moment, since batch
compilation is not yet used when parallel compilation is disabled.

This adds a `GPUWorker` and a `GPUSecondaryContext` as an easy to use
wrapper for managing secondary GPU contexts.

(Part of #133674)
Pull Request: https://projects.blender.org/blender/blender/pulls/136518
2025-04-07 15:26:25 +02:00
Clément Foucault
9d06508837 Fix #137052: GPU: Crash on startup caused by legacy pyGPU API
The removed legacy API was still in used by the pyGPU API.
Add a deprecation warning instead.

This partially reverts commit 3179cb0069.
2025-04-07 12:27:48 +02:00
Campbell Barton
a6da9e3ae7 Cleanup: quiet unused variable warning 2025-04-05 08:37:07 +00:00
наб
6935ec2fa7 OpenGL: Some legacy AMD drivers not detected
My version "4.6.14760 Core Profile Context 21.2.3 27.20.14535.3005"
was not caught by the spot-check. This change replaces the check with
a parser and range check.

Pull Request: https://projects.blender.org/blender/blender/pulls/136803
2025-04-04 20:13:52 +02:00
Clément Foucault
3179cb0069 Cleanup: GPU: Remove unused legacy_resource_location 2025-04-04 18:21:52 +02:00
Clément Foucault
3562433ae7 pyGPU: Deprecate Shader.program getter
This is getting in the way of making the
GPUShader API more threadsafe.

This getter already doesn't work for vulkan
and Metal, and has very limited usage.

Keeping the python function to avoid errors
and display a deprecation warning.

Pull Request: https://projects.blender.org/blender/blender/pulls/136983
2025-04-04 14:23:09 +02:00
Omar Emara
56b0b709ea Compositor: Support GPU OIDN denoising
This patch supports GPU OIDN denoising in the compositor. A new
compositor performance option was added to allow choosing between CPU,
GPU, and Auto device selection. Auto will use whatever the compositor is
using for execution.

The code is two folds, first, denoising code was adapted to use buffers
as opposed to passing in pointers to filters directly, this is needed to
support GPU devices. Second, device creation is now a bit more involved,
it tries to choose the device is being used by the compositor for
execution.

Matching GPU devices is done by choosing the OIDN device that matches
the UUID or LUID of the active GPU platform. We need both UUID and LUID
because not all platforms support both. UUID is supported on all
platforms except MacOS Metal, while LUID is only supported on Window and
MacOS metal.

If there is no active GPU device or matching is unsuccessful, we let
OIDN choose the best device, which is typically the fastest.

To support this case, UUID and LUID identifiers were added to the
GPUPlatformGlobal and are initialized by the GPU backend if supported.
OpenGL now requires GL_EXT_memory_object and GL_EXT_memory_object_win32
to support this use case, but it should function without it.

Pull Request: https://projects.blender.org/blender/blender/pulls/136660
2025-04-04 11:17:08 +02:00
Campbell Barton
74900afa56 Cleanup: quiet unused warnings 2025-04-04 10:33:33 +11:00
Clément Foucault
f8de6c31bc EEVEE: Move Object ID storage to gbuffer header layer
This allow to store the full object ID inside a `uint32`
buffer. This allows to get the per object data in deferred
passes and avoid to store object data inside the Gbuffer.

This data is only written if needed.

This had to modify the implementation of subpass input
for all backend to be able to bind layered texture.
This currently work because only the layer 0 is bound to the
framebuffer. This is fragile but I don't see a good builtin way
to fix it.

Rel #135935

#### Tasks
- [x] Replace light linking bits in Gbuffer
- [x] Replace Object ID in GBuffer for SSS
- [x] Conditional storage
- [x] Dummy storage if not needed

Pull Request: https://projects.blender.org/blender/blender/pulls/136428
2025-04-03 14:00:55 +02:00
Campbell Barton
90fd070c28 Cleanup: spelling in comments (make check_spelling_*) 2025-04-02 03:02:01 +00:00
Campbell Barton
42ad772a1f Cleanup: spelling & repeated terms (make check_spelling_*)
Also use comment blocks for English text.
2025-03-27 01:13:34 +00:00
Miguel Pozo
dcaa945293 Fix: Renderdoc sessions crash on startup (WGL)
The crash regression comes from 583e2b7240.

Pass the s_sharedHGLRC directly to wglCreateContextAttribsARB instead
of using wglShareLists.
Context: https://github.com/baldurk/renderdoc/issues/1224

This doesn't only fix the recent regression, but solves all the long
standing issues with Renderdoc on Windows (F12 rendering support,
multiple windows, deferred compilation...).

(Fix suggested by @LazyDodo)

Pull Request: https://projects.blender.org/blender/blender/pulls/136140
2025-03-25 15:34:48 +01:00
Jeroen Bakker
2f41aa6a52 Fix #132968: OpenGL: Strip line directives on legacy AMD
Potential fix for legacy AMD driver issue.

- Updating drivers using a clean install has proven to fix the issue as well.
As driver can leave parts of an older driver active what it actually the cau
- Solution comments out the `#line` by replacing the first two characters.

Pull Request: https://projects.blender.org/blender/blender/pulls/136231
2025-03-24 07:31:29 +01:00
Jeroen Bakker
15d88e544a GPU: Storage buffer allocation alignment
Since the introduction of storage buffers in Blender, the calling
code has been responsible for ensuring the buffer meets allocation
requirements. All backends require the allocation size to be divisible
by 16 bytes. Until now, this was sufficient, but with GPU subdivision
changes, an external library must also adhere to these requirements.

For OpenSubdiv (OSD), some buffers are not 16-byte aligned, leading
to potential misallocation. Currently, this is mitigated by allocating
a few extra bytes, but this approach has the drawback of potentially
reading unintended bytes beyond the source buffer.

This PR adopts a similar approach to vertex buffers: the backend handles
extra byte allocation while ensuring data uploads and downloads function
correctly without requiring those additional bytes.

No changes were needed for Metal, as its allocation size is already
aligned to 256 bytes.

**Alternative solutions considered**:

- Copying the CPU buffer to a larger buffer when needed (performance impact).
- Modifying OSD buffers to allocate extra space (requires changes to an external library).
- Implementing GPU_storagebuf_update_sub.

Ref #135873

Pull Request: https://projects.blender.org/blender/blender/pulls/135716
2025-03-13 15:05:16 +01:00
Clément Foucault
c02dea2e26 Fix: GL: Race condition in shader compilation
The patch strings did not have thread safe initialization.
The string might hav been returned null or incomplete
which might trigger compilation errors.
2025-03-13 14:04:59 +01:00
Jeroen Bakker
ba22e5e6be Merge branch 'blender-v4.4-release' 2025-03-10 08:49:37 +01:00
Jeroen Bakker
eceb81b21f GPU: Remove RDNA2 shader viewport workaround
It has been confirmed that the latest release of AMD drivers has fixed
issues for both OpenGL and Vulkan. Users should use AMD driver 25.3.1
or later. Removing the workaround as it has performance penalties on
RDNA2 based GPUs.

Reference: #135516
Pull Request: https://projects.blender.org/blender/blender/pulls/135630
2025-03-10 07:22:02 +01:00
Falk David
e39c83c881 Merge branch 'blender-v4.4-release' 2025-03-06 21:19:31 +01:00
Clément Foucault
b4a1a140d7 Fix #134509: GPU: Node editor links are invisible on Intel GPU
This bug also affects integrated GPU as well.
Remove the GPU familly check.
2025-03-06 18:48:56 +01:00
Jeroen Bakker
be4f9c0ac8 Merge branch 'blender-v4.4-release' 2025-03-06 16:30:16 +01:00
Jeroen Bakker
37d781aa2a Fix #135516: Vulkan: Shader output viewport broken on RDNA2
When using the official RDNA2 driver +vulkan we see the same issue we
as #123787. Adding the same workaround to vulkan as well.

Pull Request: https://projects.blender.org/blender/blender/pulls/135565
2025-03-06 16:28:47 +01:00
Campbell Barton
5b856ba447 Merge branch 'blender-v4.4-release' 2025-03-06 10:35:59 +11:00
Campbell Barton
b85fc32cae Cleanup: spelling & repeated words in comments
Address warnings from check_spelling.py
2025-03-06 10:33:21 +11:00
Clément Foucault
04fcf2f907 Merge branch 'blender-v4.4-release' 2025-03-05 12:06:00 +01:00
Clément Foucault
326ce59961 Fix #134509: GPU: Add workaround for Intel ARC nodelink driver bug
This disables the instancing optimization for this specific
hardware.

Pull Request: https://projects.blender.org/blender/blender/pulls/135458
2025-03-05 12:05:34 +01:00
Jacques Lucke
ba4cf3f738 Cleanup: add clarifying comment at assert checking if vbo is empty
I've hit this a couple of times and disabling it always worked fine for me. So
it's good to make it more obvious that there is an actual bug instead of a
missed optimization.

Pull Request: https://projects.blender.org/blender/blender/pulls/135467
2025-03-04 18:03:59 +01:00
Brecht Van Lommel
3dab100860 Fix: ASAN errors after addition of texture pool
Same fix as #132504. Free the texture pool before the derived GPU context
class, as that one is used as part of freeing the texture pool.

Pull Request: https://projects.blender.org/blender/blender/pulls/135444
2025-03-04 16:54:05 +01:00
Miguel Pozo
6b43873cf9 Cleanup: Remove unused variable 2025-02-18 16:04:27 +01:00