Commit Graph

5804 Commits

Author SHA1 Message Date
Weizhen Huang
d2db9927ed Fix #86648: reduce ray differentials size for bump mapping
Use sub-pixel differentials for bump mapping helps with reducing
artifacts when objects are moving or when textures have high frequency
details.

Currently we scale it by 0.1 because it seems to work good in practice,
we can adjust the value in the future if it turns out to be impractical.

Ref: #122892

Pull Request: https://projects.blender.org/blender/blender/pulls/133991
2025-02-05 13:39:27 +01:00
Campbell Barton
b02bbbdb37 Cleanup: rename "GL" selection to "DEPTH"
The reference to OpenGL is no longer relevant.
2025-02-05 11:21:50 +11:00
Jeroen Bakker
3d20d39115 Cleanup: Vulkan: Use is_link_to_buffer
Previous implementation used the resource state tracker which is a hash
table lookup. `is_link_to_buffer` is a bit cheaper as it is compares
already loaded data.
2025-02-04 16:28:46 +01:00
Jeroen Bakker
aa535f1a5f Cleanup: Vulkan: Remove resource locking when reordering nodes
This PR changes the resource locking when reordering render graph
nodes. Reordering could be done without locking resources. No measurable
speedup detected.

Pull Request: https://projects.blender.org/blender/blender/pulls/134032
2025-02-04 13:24:01 +01:00
Jeroen Bakker
7f04a4fef3 Fix: Vulkan: Stalling shader compilation
This PR fixes an issue that shaders compilation could stall. This could
be seen in the viewport (sometime not showing first EEVEE render) but
was more prominent when running test cases.

Pull Request: https://projects.blender.org/blender/blender/pulls/134020
2025-02-04 09:49:28 +01:00
Jeroen Bakker
dda23c53f8 Metal: Add native tile input to workarounds
Native tile input wasn't part of the MTLCapability struct, but stored locally
in the shader generator and checked in MTLFramebuffer. This PR moves it
to the MTLCapability struct and disables it when workarounds are forced.

Pull Request: https://projects.blender.org/blender/blender/pulls/133818
2025-02-03 16:36:15 +01:00
Jeroen Bakker
27b9173081 Cleanup: Code-style
Remove commented out parameter-name from header.
2025-02-03 08:06:10 +01:00
Campbell Barton
4cd827870d Cleanup: quiet check_spelling_* targets
Also correct outdated references to `ghash`.
2025-02-02 13:58:34 +11:00
Clément Foucault
976ed42533 Cleanup: GPU: Use functional cast for scalar casting 2025-01-31 18:26:44 +01:00
Brecht Van Lommel
c7502b092d Cleanup: Various clang-tidy warnings in gpu
Pull Request: https://projects.blender.org/blender/blender/pulls/133734
2025-01-31 17:03:18 +01:00
Clément Foucault
651ae0e47c Metal: Add OOB coordinate rejection to image atomic functions
These should have been guarded but are not, creating
buffer out of bound access error on Apple devices.
2025-01-31 16:17:58 +01:00
Clément Foucault
636147053d Metal: Add support for repeating byte sequence for buffer clearing
This allows to run with the --debug-gpu option (which
does NAN and 0xF0F0F0F0 clearing) without asserts
even when the texture atomic workaround is enabled.
2025-01-31 16:13:56 +01:00
Clément Foucault
067f6767d4 Fix #129571: Metal: Broken texture atomic workaround
The refactor 9c0321ae9b
had the wrong mental model of the backing texture
layout for the atomic workaround.

For 3D textures, the layout is breaking the 3D texture
and reinterpreting the linear location as its 2D
linear location. This breaks the 3D texture Z slices
into non contiguous regions in 2D.

Comments have been added to avoid future confusion.

Pull Request: https://projects.blender.org/blender/blender/pulls/133830
2025-01-31 16:10:59 +01:00
Jeroen Bakker
96c9153c5e Fix: Vulkan: Use after free of VkBufferView
VkBufferViews could be used after they were freed. The reason is that
they were not managed by the discard pool. Detected when looking in
failing render tests (pointcloud_motion.blend).

This part of the API is used by motion blur in EEVEE. Fixes the next
render tests

- `eevee_next_motion_blur_vulkan`
- `eevee_next_pointcloud_vulkan`
- `eevee_next_hair_vulkan`

Related: #133546
Pull Request: https://projects.blender.org/blender/blender/pulls/133856
2025-01-31 11:41:39 +01:00
Jeroen Bakker
4dbb9b34c6 Fix: Vulkan: Compositor cryptomatte
When using cryptomatte the last identifier was never used due to a
memory alignment issue. Scalar types should not be aligned, but they
were.

Pull Request: https://projects.blender.org/blender/blender/pulls/133815
2025-01-30 15:51:09 +01:00
Clément Foucault
6ab4e99cf7 Fix #133645: Metal: Crash when activating EEVEE on MacOS 13.7.2 with AMD
This was caused by the subpass input workaround for non-tilebased
GPU using `texelFetch` on an `image`. This was supported before
the cleanup 9c0321ae9b.
But is against the GLSL specification and was removed inside the
cleanup.

Using `imageLoad` instead of `texelFetch` fixes the crash.
However rendering seems to be broken for other reasons.
2025-01-30 15:32:17 +01:00
Jeroen Bakker
4b99bc8515 Fix: Renderdoc: Corruption in debug stack
In renderdoc the debug stack got corrupted when render graphs where
reused. The previous usage didn't clear the stack. This PR clears
the debug stack when render graphs are reset.
2025-01-30 13:46:21 +01:00
Jacques Lucke
e1753900b7 BLI: improve UTF-8 safety when copying StringRef to char buffers
Previously, there was a `StringRef.copy` method which would copy the string into
the given buffer. However, it was not defined for the case when the buffer was
too small. It moved the responsibility of making sure the buffer is large enough
to the caller.

Unfortunately, in practice that easily hides bugs in builds without asserts
which don't come up in testing much. Now, the method is replaced with
`StringRef.copy_utf8_truncated` which has much more well defined semantics and
also makes sure that the string remains valid utf-8.

This also renames `unsafe_copy` to `copy_unsafe` to make the naming more similar
to `copy_utf8_truncated`.

Pull Request: https://projects.blender.org/blender/blender/pulls/133677
2025-01-29 12:12:27 +01:00
Campbell Barton
bd1ded952b Cleanup: spelling in comments 2025-01-29 12:31:19 +11:00
Clément Foucault
10c0b19213 Cleanup: GPUMaterial: Remove leftover EEVEE Legacy code 2025-01-28 17:48:34 +01:00
Jeroen Bakker
3eaa70c251 Fix #133690: Vulkan: Faster downloading of textures.
Somehow incorrect memory is selected when not setting the host write random on
a buffer that is only read.

Pull Request: https://projects.blender.org/blender/blender/pulls/133721
2025-01-28 16:41:43 +01:00
Jeroen Bakker
2d3d1d249b Cleanup: Remove compilation warnings GCC13
Pull Request: https://projects.blender.org/blender/blender/pulls/133702
2025-01-28 11:50:59 +01:00
Iliya Katushenock
c63b44eaec Fix #131095: EEVEE: Support long property path as attribute name
Attribute name could be a path built from multiple object/property names
while each of them can be 64 symbols long.
This was fixed by cff53fdb53, so Cycles
can handle this. But eevee need additional change.

Pull Request: https://projects.blender.org/blender/blender/pulls/131183
2025-01-27 12:07:32 +01:00
Jeroen Bakker
efff379ea5 Metal: Add support to force workarounds.
Recently it came to out attention that macOs13 doesn't always work due
to texture atomics not supported by that version of the OS.

Development happens most of the time on newer versions of the OS without
ability to check if it still works on the older versions.

This PR enables to disable some Metal capabilities to better check how
Blender works on those OS's. The capabilities that will be disabled
are texture gathering and texture atomics. It doesn't disable the
capabilities that are required to start Blender, which are still
part of the `MTLCapabilities` struct.

This allows us to reproduce issues like #129571

Pull Request: https://projects.blender.org/blender/blender/pulls/133636
2025-01-27 11:07:20 +01:00
Jeroen Bakker
e6b3cc8983 Vulkan: Device command builder
This PR implements a new the threading model for building render graphs
based on tests performed last month. For out workload multithreaded
command building will block in the driver or device. So better to use a
single thread for command building.

Details of the internal working is documented at https://developer.blender.org/docs/features/gpu/vulkan/render_graph/

- When a context is activated on a thread the context asks for a
  render graph it can use by calling `VKDevice::render_graph_new`.
- Parts of the GPU backend that requires GPU commands will add a
  specific render graph node to the render graph. The nodes also
  contains a reference to all resources it needs including the
  access it needs and the image layout.
- When the context is flushed the render graph is submitted to the
  device by calling `VKDevice::render_graph_submit`.
- The device puts the render graph in `VKDevice::submission_pool`.
- There is a single background thread that gets the next render
  graph to send to the GPU (`VKDevice::submission_runner`).
  - Reorder the commands of the render graph to comply with Vulkan
    specific command order rules and reducing possible bottlenecks.
    (`VKScheduler`)
  - Generate the required barriers `VKCommandBuilder::groups_extract_barriers`.
    This is a separate step to reduce resource locking giving other
    threads access to the resource states when they are building
    the render graph nodes.
  - GPU commands and pipeline barriers are recorded to a VkCommandBuffer.
    (`VKCommandBuilder::record_commands`)
  - When completed the command buffer can be submitted to the device
    queue. `vkQueueSubmit`
  - Render graphs that have been submitted can be reused by a next
    thread. This is done by pushing the render graph to the
    `VKDevice::unused_render_graphs` queue.

Pull Request: https://projects.blender.org/blender/blender/pulls/132681
2025-01-27 08:55:23 +01:00
Jeroen Bakker
e2dddea124 Fix: Vulkan: Thread safe cache folder
Vulkan shader compiler accesses the cache folder via multiple threads.
GHOST part isn't thread safe and can return and overwrite the returned
cache path. This resulted into crashes when performing background
rendering and failing test cases, loading of incorrect shaders etc.

This PR fixes this to cache the cache folder location in the
VKShaderCompiler, which is loaded via the main thread when the vulkan
backend is initialized.

Pull Request: https://projects.blender.org/blender/blender/pulls/133535
2025-01-24 12:20:43 +01:00
Jeroen Bakker
2feb435780 Fix: Vulkan: Memory allocation on no-rebar capable platforms
Memory areas was requested to be preferable host visible. On some
platforms this would fail to allocate. Best is to not add preferable
host visible for typically large allocations.

This PR also gives the caller the responsibility to set the allocation flags.

Pull Request: https://projects.blender.org/blender/blender/pulls/133528
2025-01-24 11:54:59 +01:00
Jonas Holzman
b701ba6554 macOS: Fix WITH_GPU_DRAW_TESTS build linking error
The issue was twofold, the `draw_tests` library was missing a link
dependency on `gpu_tests`, and the `gpu_tests` would only be generated
if `WITH_GPU_BACKEND_TESTS` or `WITH_VULKAN_BACKEND` were also ON due
to a superflous condition.

Pull Request: https://projects.blender.org/blender/blender/pulls/133511
2025-01-24 11:00:34 +01:00
Clément Foucault
1ac4651778 Cleanup: DRW: Remove legacy common_view_lib.glsl
No functional changes. Only moving and renaming stuff.

Pull Request: https://projects.blender.org/blender/blender/pulls/131558
2025-01-23 18:06:22 +01:00
Miguel Pozo
8d392d41c2 Fix: GPU: GPU_indexbuf_bind_as_ssbo
Make the behavior consistent across all backends.
Clean up the GL function.

Follow-up from #132712.

Pull Request: https://projects.blender.org/blender/blender/pulls/133383
2025-01-23 15:40:45 +01:00
Jeroen Bakker
2bd4e101a0 Fix #130106: Vulkan: Pixelbuffer performance
Cycles uses pixel buffers to update the display. Due to making things
work the vulkan backend downloaded the GPU allocated pixel buffer to the
CPU, Copied it to a GPU allocated staging buffer and update the display
texture using the staging buffer. Needless to say that a (CPU->)GPU->CPU->GPU
roundtrip is a bottleneck.

This PR fixes this by allowing the pixel buffer to act as a staging
buffer as well.

Viewport and final image rendering performance is now also similar.

| **Render** | **GPU Backend** | **Path tracing** | **Display** |
| ---------- | --------------- | ---------------- | ----------- |
| Viewport   | OpenGL          | 2.7              | 0.06        |
| Viewport   | Vulkan          | 2.7              | 0.04        |
| Image      | OpenGL          | 3.9              | 0.02        |
| Image      | Vulkan          | 3.9              | 0.02        |

Tested on:
```
Operating system: Linux-6.8.0-49-generic-x86_64-with-glibc2.39 64 Bits, X11 UI
Graphics card: AMD Radeon Pro W7700 (RADV NAVI32) Advanced Micro Devices radv Mesa 24.3.1 - kisak-mesa PPA Vulkan Backend
```

Pull Request: https://projects.blender.org/blender/blender/pulls/133485
2025-01-23 14:58:49 +01:00
Brecht Van Lommel
7a0a173d39 Fix: Missed case of OCIO luminance coefficients in EEVEE
Following up on #133368. Thanks Omar for spotting this.

Pull Request: https://projects.blender.org/blender/blender/pulls/133400
2025-01-22 11:19:02 +01:00
Brecht Van Lommel
5e02b4e6f1 EEVEE: Use OpenColorIO for luminance
Color to grayscale conversions should take into account the colorspace,
and these are considered to be in scene linear colorspace.

Note the RBG to BW node implementation is used for implicit conversions,
so that is covered as well.

No change with the default configuration.

Pull Request: https://projects.blender.org/blender/blender/pulls/133368
2025-01-21 18:05:56 +01:00
Jeroen Bakker
ff804882bd Refactor: Vulkan: Store large data in separate vectors
VKRenderGraphNode is 892 bytes and most of the bytes are used for
specific nodes. By storing large structs in separate vectors we can
reduce the needed memory and improve cache pre-fetching.

With this change the VKRenderGraphNode is reduced to 64 bytes.

On a (50 frames shader_balls.blend) the end user performance is improved by
2%.

| **Platform**     | **Before** | **After** |
| ---------------- | ---------- | --------- |
| AMD W7700        | 1409 ms    | 1383 ms   |
| NVIDIA RTX 6000  | 1443 ms    | 1428 ms   |

Pull Request: https://projects.blender.org/blender/blender/pulls/133317
2025-01-21 14:49:23 +01:00
Jeroen Bakker
781aeb1b3f Fix #133155: Vulkan: Initial uniform buffer upload failing
In some cases the initial uniform buffer upload fails. Using the
indirect upload (via render graph) does succeed. This PR uses the
render graph approach.

Pull Request: https://projects.blender.org/blender/blender/pulls/133293
2025-01-20 11:00:27 +01:00
Campbell Barton
90b03d2344 Cleanup: spelling in comments 2025-01-20 11:19:23 +11:00
Jeroen Bakker
390ca01685 Cleanup: Vulkan: Remove resource ownership
Images used to be tracked with ownership in order to reset swap chain
images to its original layout. This isn't used anymore as we always mark
them in VK_IMAGE_LAYOUT_UNDEFINED to make the first pipeline barrier a
nop.

This change reduces unneeded complexity and safe a few CPU cycles.

Pull Request: https://projects.blender.org/blender/blender/pulls/133197
2025-01-17 14:46:22 +01:00
Jeroen Bakker
2f18e4fe29 Vulkan: Add debug group for swapchain
Improves debugging swapchains when using renderdoc.

Pull Request: https://projects.blender.org/blender/blender/pulls/133190
2025-01-17 11:40:11 +01:00
Jeroen Bakker
56f14a0083 Vulkan: Add support for GPU_DATA_UBYTE to F16 data conversion
This data conversion is needed to download a HDR framebuffer for
color-picking and saving screenshots.

Pull Request: https://projects.blender.org/blender/blender/pulls/133187
2025-01-17 10:58:28 +01:00
Jeroen Bakker
80ec04b4ef Cleanup: Vulkan: Use full surface format
GHOST_ContextVK used to pass only the surface texture format to
the GPU backend, it didn't pass the color space. This PR also includes
the color space.

Pull Request: https://projects.blender.org/blender/blender/pulls/133185
2025-01-17 10:28:22 +01:00
Jeff Moguillansky
ea4d01923b Fix: Vulkan: local_read incorrect attachment load/store ops
This fixes a rendering issue when local read enabled.
Before fix, the output image is too bright.  This is due to incorrect load/store.

With this fix, the logic for attachment load/store ops with local_read on matches the logic with local_read off inside subpass_transition_impl...

Pull Request: https://projects.blender.org/blender/blender/pulls/133111
2025-01-16 08:21:14 +01:00
Campbell Barton
a29fe64e53 Cleanup: quiet compiler warning 2025-01-15 17:29:31 +11:00
Jeroen Bakker
264681344f Vulkan: Provide more control on memory allocation
This change give access to preferred and required allocation flags
allowing better control over where memory is allocated.

Pull Request: https://projects.blender.org/blender/blender/pulls/133059
2025-01-14 17:41:35 +01:00
Jeroen Bakker
04e64b27ea Vulkan: Ignore swapchain image layout/content
Blender always updates all pixels of the swap chain. As an optimization
we can skip the initial layout transition from present to transfer
destination as all pixels will be rewritten.

Pull Request: https://projects.blender.org/blender/blender/pulls/133061
2025-01-14 17:08:04 +01:00
Jeroen Bakker
b898fd09b4 Fix: Vulkan: Incorrect swizzling
Swizzling is supported when sampling. Outside samplers the swizzling
must always be the initial swizzling.

Detected when playing rain_restaurant.blend. EEVEE motion vectors use
swizzling.

Pull Request: https://projects.blender.org/blender/blender/pulls/133043
2025-01-14 14:09:57 +01:00
Jeroen Bakker
fbe05ac60b Cleanup: Vulkan: Remove unused code/parameters
Initial design had a more complex use case for render graphs.
They are not really used and will not in the near term. This PR
removes some code that doesn't do a thing

Pull Request: https://projects.blender.org/blender/blender/pulls/133047
2025-01-14 14:09:50 +01:00
Jeroen Bakker
9091085277 Fix: GPU: Compiling python gpu shaders
Compiling of graphics shaders via gpu crashed. The vulkan backend found
a compute source and continued the evaluation as if it was a compute
shader.

The compute source was added by the preprocessor that wraps the shader
source. Even empty sources were wrapped. Detection based on empty shader
sources failed.

This is not a Vulkan only issue as other platforms would have similar issues when
creating a compute shader.

Pull Request: https://projects.blender.org/blender/blender/pulls/133036
2025-01-14 10:51:24 +01:00
Jeroen Bakker
8a199dc77f Vulkan: pipeline barriers extraction
Pipeline barriers were extracted when recording commands. This works,
but had the downside that it locked the device resources. Extracting
pipeline barriers is fairly small task compared to recording commands.

This PR will perform the extraction of pipelines separate from command
recording. Code is easier to follow and when working with multiple threads
this will reduce locking (enabling this will be done in separate PR).

Original developed in !131965

Pull Request: https://projects.blender.org/blender/blender/pulls/132989
2025-01-14 09:50:42 +01:00
Jeroen Bakker
a4914b8972 Vulkan: Disable local read on non Qualcomm devices
Only enable by default dynamic rendering local read on Qualcomm devices. NVIDIA, AMD and Intel
performance is better when disabled (20%). On Qualcomm devices the improvement can be
substantial (16% on shader_balls.blend).

`--debug-gpu-vulkan-local-read` can be used to use dynamic rendering local read on any
supported platform.

Future: Check if bottleneck is during command building. If so we could fine-tune this after the
device command building landed (#T132682).

Pull Request: https://projects.blender.org/blender/blender/pulls/132981
2025-01-13 09:29:16 +01:00
Jeff Moguillansky
75dc76bceb Vulkan: Add support for dynamic rendering local read
This will add support for `VK_KHR_dynamic_rendering_local_read` when supported.
The extension allows reading from an attachment that has been written to by a
previous command.

Per platform optimizations still need to happen in future changes. Change will
 be limited to Qualcomm devices (in a future commit).

On Qualcomm devices this provides an uplift of 16% when using shader_balls.blend

Pull Request: https://projects.blender.org/blender/blender/pulls/131053
2025-01-13 08:10:31 +01:00