Commit Graph

5457 Commits

Author SHA1 Message Date
Clément Foucault
d401b36509 GPU: Metal: Fix background render crash
GHOST_CreateSystemBackground was not being
followed by the now mandatory GPU_backend_ghost_system_set
2024-09-30 14:20:12 +02:00
Jason Fielder
eb3fe75392 Metal: Add support for parallel compilation and precompilation specialisation
This speeds up EEVEE startup and material compilation time.

Authored by Apple: James McCarthy
Pull Request: https://projects.blender.org/blender/blender/pulls/125657
2024-09-30 11:21:28 +02:00
Jeroen Bakker
6ec35c619f Vulkan: Reduce Pipeline Logging
Pipeline pool could log to much information that confused developers who
are not up to date what pipelines are. This PR will hide the confusing
messages. When working on Vulkan these messages can still be shown by
raising the log level.

See !128254

Pull Request: https://projects.blender.org/blender/blender/pulls/128352
2024-09-30 08:46:17 +02:00
Jeroen Bakker
b6aa4a3142 OpenGL: Add Intel HD 405 to limited support.
This rare GPU has z-fighting issues in editor mode. Might be fixable by
changing the bias, but would decrease precision on other platforms as
well. Better to move this GPU to limited support. It is working, just
has some drawing artifacts.

See #128179

Pull Request: https://projects.blender.org/blender/blender/pulls/128351
2024-09-30 08:44:12 +02:00
Jeroen Bakker
481c8fd5a7 Vulkan: Memory leak in immediate mode during exit
When exiting the immediate buffers are discarded, but where not
destroyed making the buffers still leak.
Detected when looking into descriptor set freeze issue.

Pull Request: https://projects.blender.org/blender/blender/pulls/128249
2024-09-27 15:01:10 +02:00
Jeroen Bakker
5250e57294 Fix #127288: Vulkan: Report Marketed Driver Version
A driver (package) installed by the user can have many different drivers
and they can all report a different version. For AMD the version we
reported was from their Vulkan driver. This version isn't useful during
bug triaging.

This PR will use the driver info and driver name from the driver
properties to construct a driver version string that will be used for
reporting.

Pull Request: https://projects.blender.org/blender/blender/pulls/128232
2024-09-27 09:24:15 +02:00
Campbell Barton
33b80415aa Cleanup: use const, correct arg names, spelling, use ELEMN(..) 2024-09-27 11:01:37 +10:00
Jeroen Bakker
725b5027fb Vulkan: Refactor immediate mode
Immediate mode uses the old 'resource tracker' which has been replaced
by swap chain resource pools. This PR optimizes immediate mode buffers
by utilizing resource pools.

Pull Request: https://projects.blender.org/blender/blender/pulls/128188
2024-09-26 16:01:30 +02:00
Campbell Barton
381898b6dc Refactor: move BLI_path_util header to C++, rename to BLI_path_utils
Move to a C++ header to allow C++ features to be used there,
use the "utils" suffix as it's preferred for new files.

Ref !128147
2024-09-26 21:13:39 +10:00
Jeroen Bakker
70313f68ce Vulkan: Log selected device
Currently the log only contained the first compatible device. It is
more important to the user to know which device is used.

This PR increases the level of the first compatible device so it is only
visible when increasing the log level. It reports the device, driver and
vendor when starting blender with `--debug-gpu`.

Pull Request: https://projects.blender.org/blender/blender/pulls/128168
2024-09-26 12:05:09 +02:00
Jeroen Bakker
88b5467e0e Vulkan: Batch Upload Descriptor Sets
Descriptor sets will be uploaded in batch. This allows drivers to
do additional optimizations or at least push some looping to the
driver side.

Pull Request: https://projects.blender.org/blender/blender/pulls/128167
2024-09-26 12:04:09 +02:00
Jeroen Bakker
d75cf2efd4 Vulkan: Refactor resource binding
Resource binding was over-complicated as I didn't understood the state
manager and vulkan to make the correct decisions at that time. This
refactor will remove a lot of the complexity and improves the performance.

**Performance**

The performance improvement is noticeable in complex grease pencil
scenes.

Grease pencil benchmark file picknick:
- `NVIDIA Quadro RTX 6000` 17 fps -> 24 fps
- `Intel(R) Arc(tm) A750 Graphics (DG2)` 6 -> 21 fps

**Bottle-neck**

The performance improvements originates from moving the update entry
point from state manager to shader interface. The previous implementation
(state manager) had to loop over all the bound resources and find in the
shader interface where it was located in the descriptor set. Ignoring
resources that were not used by the shader. But also making it hard
to determine if descriptor sets actually changed. Previous implementation
assumed descriptor sets always changed.

When descriptor set changed a new descriptor set needed to be allocated.
Most drivers this is a fast operation, but on Intel/Mesa this was measurable
slow. Using an allocation pool doesn't fit the Vulkan API as you are only
able to reuse when the layout matches exactly. Of course doable, but requires
another structure to keep track of the actual layouts.

**Solution**

By using the shader interface as entry point we can:
1. Keep track if there are any changes in the state manager. If not and the
   layout is the same, the previous shader can be reused.
2. In stead of looping over each bound resource, we loop over bind points.

**Future extensions**

Bundle all descriptor set uploads just before use. This would be more
in line with how 'modern' Vulkan should be implemented. This PR already
separates the uploading from the updating and technically allows to upload
more than one descriptor set.

Instead of looking 1 set back we should measure if we can handle multiple
or keep track of the different layouts resources to improve the performance
even further.

Optional use `VK_KHR_descriptor_buffer` when available.

Pull Request: https://projects.blender.org/blender/blender/pulls/128068
2024-09-26 10:59:45 +02:00
Jason Fielder
57f7d6380c Fix #126542 Fix UV Edge overlays in Metal
Takes into account any offset that must be added to the vertex index
(usually supplied as baseVertex or startVertex in the Metal draw call)
in the code that emulates the SSBO vertex fetch.

Authored by Apple: James McCarthy

Pull Request: https://projects.blender.org/blender/blender/pulls/127864
2024-09-25 15:22:30 +02:00
Laurynas Duburas
84dedfaf4b Curves: smooth handles
Adds antialiasing to curve's handles and thickness to active ones.
Also handles now react to
 `Preferences > Interface > Display > Resolution Scale` and
`Preferences > Themes > 3D Viewport > Edge Width` as they do in
legacy curves.

Pull Request: https://projects.blender.org/blender/blender/pulls/122910
2024-09-25 15:21:31 +02:00
Campbell Barton
01825a85cb Cleanup: quiet compiler warnings 2024-09-25 20:14:06 +10:00
Vitalijs Komasilovs
2e11331dfc Fix #127286: fixing memory release after light probe bake
GPU resources created during Light probe bake job were added to discard pool, but the pool itself was never notified by worker thread to release resources.
Bake job creates dedicated `GPUContext` for its needs and later deletes it within the same thread.

Pull Request: https://projects.blender.org/blender/blender/pulls/127977
2024-09-24 10:19:49 +02:00
Jeroen Bakker
13fa6d6ae1 Vulkan: Refactor of descriptor set
Removes two levels of indirection when updating descriptor sets.
These are the easy ones to remove. Others will be removed in
a future PR.

This is part of reworking of how descriptor sets are used.
This PR Mostly reduces complexity.

Pull Request: https://projects.blender.org/blender/blender/pulls/127915
2024-09-24 10:03:16 +02:00
Jeroen Bakker
fe18daacda Vulkan: Validation error when using de-interleaved vertex buffers
De-interleaved vertex buffers offsets the attribute in the buffer to
the de-interleaved position. The vertex attribute offset is limited by a
constrained and would raise an error when the buffers just a bit larger.

*VUID-VkVertexInputAttributeDescription-offset-00622*:  offset must
be less than or equal to `VkPhysicalDeviceLimits::maxVertexInputAttributeOffset`

This PR fixes this by offsetting the buffer in stead of the attribute.
Offsetting buffers is limited by the amount of memory.

Pull Request: https://projects.blender.org/blender/blender/pulls/128031
2024-09-23 15:10:57 +02:00
Jeroen Bakker
ddb2179e37 Vulkan: GPU device selection
Allows users to override the auto detection for GPU
selection. Normally the GPU selection is done by looping
over the order Vulkan provides and finding the highest
performing device based on its type (discrete, integrated,
software).

However users might have multiple discrete cards and want
to switch between them. Or developers want to validate other
GPUs without rebooting.

This PR adds the ability to override the auto detection
for the vulkan backend.

![image](/attachments/5d9198a8-af08-4eee-aa73-363edea11cd9)

**Future improvements**:
- This PR does not include a command line option. This can be added
  later for render farms.

Pull Request: https://projects.blender.org/blender/blender/pulls/127860
2024-09-23 11:18:24 +02:00
Campbell Barton
f4c8845b60 Cleanup: remove duplicate check for AMD W6800 GPU
This read like a typo (which missed W6700) however the GPU
wasn't released. Remove the duplicate check and add a note.

Ref !127993
2024-09-23 19:09:14 +10:00
Jeroen Bakker
56b7ff256f Vulkan: Fix validation error push constants for compute shaders
Since parallel compilations was introduced, a validation error
was signalling that push constants for compute shaders didn't have
the correct pipeline binding. The root cause was that the pipeline
binding was determined, before the type of shader was known.

This PR fixes this by detemining if a shader is a compute shader up
front. It also removes some code that could lead to issues.

Pull Request: https://projects.blender.org/blender/blender/pulls/128010
2024-09-23 09:44:29 +02:00
Aras Pranckevicius
c6f5c89669 BLI: faster float<->half array conversions, use in Vulkan
In addition to float<->half functions to convert one number (#127708), add
float_to_half_array and half_to_float_array functions:
- On x64, this uses SSE2 4-wide implementation to do the conversion
  (2x faster half->float, 4x faster float->half compared to scalar),
  - There's also an AVX2 codepath that uses CPU hardware F16C instructions
    (8-wide), to be used when/if blender codebase will start to be built
    for AVX2 (today it is not yet).
- On arm64, this uses NEON VCVT instructions to do the conversion.

Use these functions in Vulkan buffer/texture conversion code. Time taken to
convert float->half texture while viewing EXR file in image space (22M
numbers to convert): 39.7ms -> 10.1ms (would be 6.9ms if building for AVX2)

Pull Request: https://projects.blender.org/blender/blender/pulls/127838
2024-09-22 17:39:54 +02:00
Jeroen Bakker
ec7fc8fef4 Vulkan: Parallel shader compilation
This PR introduces parallel shader compilation for Vulkan shader
modules. This will improve shader compilation when switching to material
preview or EEVEE render preview. It also improves material compilation.
However in order to measure the differences shaderc needs to be updated.

PR has been created so we can already start with the code review. This
PR doesn't include SPIR-V caching, what will land in a separate PR as
it needs more validation.

Parallel shader compilation has been tested on AMD/NVIDIA on Linux.
Testing on other platforms is planned in the upcoming days.

**Performance**

```
AMD Ryzen™ 9 7950X × 32, 64GB Ram
Operating system: Linux-6.8.0-44-generic-x86_64-with-glibc2.39 64 Bits, X11 UI
Graphics card: Quadro RTX 6000/PCIe/SSE2 NVIDIA Corporation 4.6.0 NVIDIA 550.107.02
```

*Test*: Start blender, open barbershop_interior.blend and wait until the viewport
has fully settled.

| Backend | Test                      | Duration |
| ------- | ------------------------- | -------- |
| OpenGL  | Coldstart/No subprocesses | 1:52     |
| OpenGL  | Coldstart/8 Subprocesses  | 0:54     |
| OpenGL  | Warmstart/8 Subprocesses  | 0:06     |
| Vulkan  | Coldstart Without PR      | 0:59     |
| Vulkan  | Warmstart Without PR      | 0:58     |
| Vulkan  | Coldstart With PR         | 0:33     |
| Vulkan  | Warmstart With PR         | 0:08     |

The difference in time (why OpenGL is faster in a warm start is that all
shaders are cached). Vulkan in this case doesn't cache anything and all
shaders are recompiled each time. Caching the shaders will be part of
a future PR. Main reason not to add it to this PR directly is that SPIR-V
cannot easily be validated and would require a sidecar to keep SPIR-V
compatible with external tools..

**NOTE**:
- This PR was extracted from #127418
- This PR requires #127564 to land and libraries to update. Linux lib
  is available as attachment in this PR. It works without, but is as slow as
  single threaded compilation.

Pull Request: https://projects.blender.org/blender/blender/pulls/127698
2024-09-20 08:30:09 +02:00
Campbell Barton
0fc27c8d81 Cleanup: spelling in comments 2024-09-20 13:14:57 +10:00
Jeroen Bakker
214a47f15c Vulkan: Make Unused Attachments Optional
Windows/Intel and Apple drivers do not support dynamic
rendering unused attachments. Due to mistakes we made
this extension partly optional. Eg. the extension was
optional, but its settings were not.

This PR makes the extension fully optional. However
without the extension some drivers might make incorrect
assumptions. This should be solved when it is more clear
why some drivers are still crashing when using dynamic
rendering.

Pull Request: https://projects.blender.org/blender/blender/pulls/127839
2024-09-19 13:03:50 +02:00
Aras Pranckevicius
92544d6d76 BLI: add float<->half conversion functions with correct math, use in Vulkan
Blender codebase had two ways to convert half (FP16) to float (FP32):

- BLI_math_bits.h half_to_float. Out of 64k possible half values, it converts
  4096 of them incorrectly. Mostly denormals and NaNs, which is perhaps not too
  relevant. But more importantly, it converts half zero to float 0.000030517578
  which does not sound ideal.
- Functions in Vulkan vk_data_conversion.hh. This one converts 2046 possible
  half values incorrectly.

Function to convert float (FP32) to half (FP16) was in Vulkan
vk_data_conversion.hh, and it got a bunch of possible inputs wrong. I guess it
did not do proper "round to nearest even" that CPU/GPU hardware does.

This PR:

- Adds BLI_math_half.hh with float_to_half and half_to_float functions.
    - Documentation and test coverage.
    - When compiling on ARM NEON, use hardware VCVT instructions.
- Removes the incorrect half_to_float from BLI_math_bits.h and replaces single
  usage of it in View3D color picking to use the new function.
- Changes Vulkan FP32<->FP16 conversion code to use the new functions, to fix
  correctness issues (makes eevee_next_bsdf_vulkan test pass). This makes it
  faster too.

Pull Request: https://projects.blender.org/blender/blender/pulls/127708
2024-09-18 13:15:00 +02:00
Campbell Barton
b70925a8cc Cleanup: prefer ASCII characters
Use ASCII quotes, punctuation so strings are easily editable.
2024-09-17 17:28:01 +10:00
Jeroen Bakker
4be5d7f99f Vulkan: Refactor cached compiler instance
ShaderC compiler was cached on the Vulkan backend. The compiler itself
is light-weight and doesn't require any caching. This PR removes the
cached instance from the backend.

Pull Request: https://projects.blender.org/blender/blender/pulls/127693
2024-09-16 15:51:55 +02:00
Jeroen Bakker
f8ff74f821 GPU: Update stubs of shader builder 2024-09-16 15:02:49 +02:00
Clément Foucault
e90a84469f EEVEE: Simplify barycentric_distances_get
This uses the path that metal was using.

This doesn't seems to create any difference in render
tests. This simplify the backend code and avoid
specific path for metal.

Idea suggested by Kevin Chuang

Pull Request: https://projects.blender.org/blender/blender/pulls/127687
2024-09-16 14:19:53 +02:00
Jeroen Bakker
a407186dbf GPU: Make shader cache clearing backend independent
Parallel shader compilation introduced `GPU_shader_cache_dir_clear_old`.
The implementation was specific to OpenGL and could not be overwritten
by other backends. This PR improves the implementation so the backend
can have its own implementation.

This is needed for upcoming changes to the Vulkan backend where we
want to use similar mechanisms to speed up shader compilation and caching.

Pull Request: https://projects.blender.org/blender/blender/pulls/127680
2024-09-16 14:03:14 +02:00
Campbell Barton
10e8f2f889 Cleanup: various non-functional changes 2024-09-15 23:22:22 +10:00
Campbell Barton
9be29e1bbc Cleanup: match function & declaration names 2024-09-15 23:14:07 +10:00
Campbell Barton
6a1bd2ff40 Cleanup: use C++ comments for disabled code 2024-09-14 12:35:00 +10:00
Campbell Barton
81e2ccbf2b Cleanup: spelling in comments 2024-09-13 10:56:26 +10:00
Clément Foucault
4021a517a8 GPU: Add DebugScope class
This class allows to define capture scopes
that can be chosen during gpu work capture.

This reduces the amount of command captured
and allow for faster replay and easier
navigation inside the debug tools like Xcode
or RenderDoc.
2024-09-11 18:37:42 +02:00
Clément Foucault
c82ddedb9b Overlay-Next: Image Space
Port all Image editor overlays.

Rel #102179

Pull Request: https://projects.blender.org/blender/blender/pulls/127366
2024-09-11 18:26:34 +02:00
Miguel Pozo
a602e5530a Fix #127437: Crash with parallel shader compilation
Avoid race conditions and handle pending async compilations
when compiling synchronously.
2024-09-11 16:34:57 +02:00
Campbell Barton
e00fed43e6 Cleanup: redundant struct declarations 2024-09-11 16:25:25 +10:00
Jeroen Bakker
59fa82118e Vulkan: Blendfile thumbnail generation
Vulkan would crash when generating a thumbnail in case there is no
3D viewport in the active workspace. When this happens the front buffer
is downscaled as thumbnail. We didn't create a front buffer as swap
chains are handled differently.

We work around this issue in the same way as Metal does. Create a dummy
front framebuffer and share the surface texture. When the thumbnail
generator reads from the front buffer it will read the data that was
created by the back buffer.

Pull Request: https://projects.blender.org/blender/blender/pulls/127393
2024-09-10 11:02:43 +02:00
Jeroen Bakker
e22fc36fd5 Vulkan: Incorrect selection in edit mode
Edit mode selection returns incorrect indices. The reason is
that the extent of the area downloaded to the CPU for evaluation was
incorrect and read pixels could not in the place where they are
actually stored.

Was an oversight when fixing face selection. It wasn't that noticeable
there as faces typically cover a larger space.

Pull Request: https://projects.blender.org/blender/blender/pulls/127386
2024-09-10 10:58:37 +02:00
Clément FOUCAULT
897f7a8482 GPU: Fix assertion when trying to use shader printf on metal
Make sure all printing happens inside render boundaries
since it needs to read a storage buffer which needs to
record some commands inside command buffers.
2024-09-09 16:30:36 +02:00
Jeroen Bakker
994e05accd Vulkan: Encoding of Mat3 in Std430 struct
The encoding of mat3 in std430 was incorrect leading to a drawing
artifact in the direction control of sunlight in sky textures.

The error was that every 3 floats requires an additional float
as each row of the mat3 is aligned to 16 bytes.

Pull Request: https://projects.blender.org/blender/blender/pulls/127246
2024-09-09 12:44:22 +02:00
Jeroen Bakker
2eb0803b43 Vulkan: Cannot use inside Renderdoc
Renderdoc only export extensions it supports.
`VK_EXT_dynamic_rendering_unused_attachments` is not supported by
renderdoc and therefor no devices could be found to start the vulkan
backend.

In GHOST we already work around this issue by not checking this specific
extension. We should do the same in VKBackend.

Pull Request: https://projects.blender.org/blender/blender/pulls/127236
2024-09-06 14:40:29 +02:00
Jeroen Bakker
2b43900ccc Vulkan: Reading subtexture
Subtexture reading is supported via GPUFramebuffer. The input
parameters was an rect, but is called an area which includes
width and height.

Due to inconsistent naming the area was assumed to be a region leading
to incorrect sub texture being downloaded or crashes due to out of
bound writes.

This fix crashes when selecting in edit mode.

Pull Request: https://projects.blender.org/blender/blender/pulls/127229
2024-09-06 11:55:36 +02:00
Jeroen Bakker
458faa6486 Preferences: GPU backend selection
This PR allows users to select a GPU backend.

In the system tab of the user preferences the GPU backend can be selected in the `Display Graphics` panel.
It will require a restart of Blender before the changes become effective.

During startup minimum requirements are checked. Blender will switch automatically
to OpenGL when no compatible Vulkan device could be detected. A dialog will be shown
to inform the user.

The setting of the in the `Display Graphics` panel are still overridden when blender is started
using the `--gpu-backend` option. When starting blender with `--debug-gpu` the backend
detection will print to the console.

See PR for detailed information and screenshots of the UI.

Implements #126504
Pull Request: https://projects.blender.org/blender/blender/pulls/126545
2024-09-06 08:28:41 +02:00
Jeroen Bakker
b2fdae6f0e Vulkan: GPU depth picking
GPU depth picking was not working on all GPUs. When a GPU requires a
DEPTH32F to store depths the conversion to unsigned normalized could
wrap around. Making depths of 1.0 become 0. In stead of MAX_DEPTH.

This solves depth picking issues on Intel and AMD GPUs.
2024-09-05 15:14:40 +02:00
Jeroen Bakker
98feb87f40 Vulkan: Disk cache for static pipelines
This PR introduces disk cache for static pipelines. The pipelines are
stored in `<cache folder>/vk-pipeline-caches/static-shaders.bin`.

Due to limitations in some drivers we add a custom header to the
cache file to identify if the cache file was created by the same driver
for the same GPU for the same Blender.

Reading/writing the cache is skipped when running blender with
`--debug-gpu` as that would generate different shader modules. For
now that isn't a problem, but the final implementation would check
before compiling a shader if a certain key is in the pipeline cache if
that is the case the compilation step is skipped and the cached shader
module is used.

Reference: #126229
Pull Request: https://projects.blender.org/blender/blender/pulls/127110
2024-09-05 13:02:40 +02:00
Jeroen Bakker
52a84b6c01 Vulkan: Don't overwrite pipeline base when already set
Base pipelines are used to optimize the vulkan pipeline creation.
Instead of building each pipeline from scratch a base pipeline can
be used to share resources and faster code-paths.

We used to overwrite the base pipeline with the last created pipeline.
This PR doesn't overwrite the base pipeline after it was initially set
giving less confusion when working with base pipelines.

Pull Request: https://projects.blender.org/blender/blender/pulls/127181
2024-09-05 13:00:39 +02:00
Anthony Roberts
5f34338296 Windows: Disable shader draw parameter support on certain Qualcomm GPUs
This works around an issue where eevee was rendering a pure black cube in certain shader configurations in the default scene (#122837). This only affects X Elite devices (8cx Gen3 is unaffected).

Pull Request: https://projects.blender.org/blender/blender/pulls/127148
2024-09-04 18:19:54 +02:00