Depth navigation sends many small render graphs to the device. It can be
that a subsequent render graph uses the same shader as the previous one
with the same descriptor set tracker. The descriptor set tracker didn't
cleared its full state and a subsequent render graph was generating
commands assuming that the device was in a certain state.
However it wasn't and a command to bind a descriptor set was skipped
resulting in a device out of bound write. Depending on the platform this
could overwrite any data on the GPU, including shader programs as the
select shader writes to a storage buffer. This clarifies why the issue
resulted in very odd and none consistent behavior.
This PR fixes this by clearing the VKPipelineData and
VKDescriptorTracker.
Pull Request: https://projects.blender.org/blender/blender/pulls/140526
When quiting Blender the timeline doesn't get updated and an assert is
triggered that the order isn't correct. The order isn't that important
anymore as the mechanism has been tested. The assert was useful during
initial development.
This PR removes the assert as it isn't valid in all cases.
In several cases synchronization issues around discarded resources could lead
to crashes. These crashes where more prominent on NVIDIA as they reuse
their handles more often.
This PR requires external synchronization when data is moved from a
discard pool to the main orphaned data discard pool. Also the device
timeline should be guarded by the same mutex.
Pull Request: https://projects.blender.org/blender/blender/pulls/140274
We added descriptor buffers to 4.5 as it contains some core API changes.
However there have been various reports that the system isn't fully
mature resulting in lower performance especially on NVIDIA GPUs.
This PR will disable descriptor buffer feature for NVIDIA GPUs.
Pull Request: https://projects.blender.org/blender/blender/pulls/140263
The check for Mesa needed to be before the more coarse
check about an AMD GPU. Also, it seems the newer drivers
do not have `X.Org` in the vendor string.
Checking for `Mesa` in the version string seems to be
the correct way.
Pull Request: https://projects.blender.org/blender/blender/pulls/140204
Prevent race conditions caused by calling `GPUWorker::wake_up` when the
worker is not waiting.
Found to be an issue in #139627, since `wake_up` is likely to be called
before the thread has fully started.
Pull Request: https://projects.blender.org/blender/blender/pulls/139842
Previous implementation allows to move VKBuffers, but didn't do a
proper std::move. On second thought it is a bad idea to be able to move
GPU resources. This PR removes the ability to move the buffer and
replace the usages with a unique ptr.
Pull Request: https://projects.blender.org/blender/blender/pulls/140103
This allows to reduce the waiting time caused by
shader compilation on some GPU-driver combo.
A new settings in the User Preferences make it
possible to override the default amount of worker
threads and optionally use subprocesses.
We still use only one worker thread in cases where
there is no benefit with adding more workers
(like AMD pro driver and Intel windows).
It doesn't scale as much as subprocesses for material
shader compilation but that is for other reasons
explained in #139818.
Add some heuristic to avoid too much memory usage
and / or too many stalls.
Also add some heuristic to the default number of subprocess for
the platform that shows scalling.
Historically, multithreaded compilation was prevented by the
need of context per thread inside `DRWShader` module.
Also there was no good scaling at that time. But
nowadays numbers shows different results with
good scaling with reasonable amount of threads on many
platforms.
Even if we are going for vulkan in the next release
most of the legacy hardware will still use OpenGL for
a few other releases. So it is relevant to make this
easy improvement.
See pull request for measurements.
Pull Request: https://projects.blender.org/blender/blender/pulls/139821
This has limited use cases since it doesn't
profile the heavy part of the vulkan backend.
Almost 1:1 port of the metal implementation from #139551.
Doesn't cover rendergraph submission nor GPU timings.
Pull Request: https://projects.blender.org/blender/blender/pulls/139899
Color picking reads 3 values from a texture that is backed by 4
channels. The conversion would also convert the 4th channel.
This is a short term fix. We should reconsider the usage of reading a
different number of elements than backed by the texture. But that
requires work in the color picking and python GPU module.
Pull Request: https://projects.blender.org/blender/blender/pulls/139929
When drawing image scope the vulkan backend raised some asserts. After
checking them the issue was in the calling code, that could lead to
undefined behavior on other platforms as well.
It isn't allowed to have an immediate mode shader bound, when performing
batch drawing. There was also a point batch that didn't use any point
shader resulting in undefined behavior as well.
For 5.0 we should add this as a GPU module check.
Pull Request: https://projects.blender.org/blender/blender/pulls/139926
Descriptor sets/pools are known to be troublesome as it doesn't match
how GPUs work, or how application want to work, adding more complexity
than needed. This results is quite an overhead allocating and
deallocating descriptor sets.
This PR will use descriptor buffers when they are available. Most platforms
support descriptor buffers. When not available descriptor pools/sets
will be used.
Although this is a feature I would like to land it in 4.5 due to the API changes.
This makes it easier to fix issues when 4.5 is released.
The feature can easily be disabled by setting the feature to false if it has
to many problems.
Pull Request: https://projects.blender.org/blender/blender/pulls/138266
When buffers/images are allocated that use larger limits than supported
by the GPU blender would crash. This PR adds some safety mechanism to
allow Blender to recover from allocation errors.
This has been tested on NVIDIA drivers.
Pull Request: https://projects.blender.org/blender/blender/pulls/139876
This avoid having to compile specializations JIT and
use the same API as subprocess compilation.
This bridges the gap between subprocess and threaded
compilation.
Pull Request: https://projects.blender.org/blender/blender/pulls/139702
Compilation constants are constants defined in the create info.
They cannot be changed after the shader is created.
It is a replacement to macros with added type safety.
Reuse most of the logic from Specialization constants.
Pull Request: https://projects.blender.org/blender/blender/pulls/139703
If the compilation subprocess crashes due to an internal driver error,
the end semaphore is never signaled and leaves Blender hanging.
This replaces the `decrement` calls with `try_decrement` and checks
every second if the subprocess is lost.
Additionally, if the issue comes from a broken binary, the crashes will
happen every time a Blender session tries to load it.
So now we store the shader hash in the shared memory before trying to
load the binary. If the subprocess is lost mid compilation, the main
process will delete the broken cached binary.
Pull Request: https://projects.blender.org/blender/blender/pulls/139747
This PR changes loading of implicit vulkan layers. See #139543 where we
detected that there are vulkan layers installed on systems that try to
impersonate other software, but crashes when used in Blender.
This PR will reuse the VkInstance of GHOST_ContextVK when querying for
possible compatible devices. Previously a temporary VkInstance was
created but could trigger an error in the Vulkan loader.
Pull Request: https://projects.blender.org/blender/blender/pulls/139640
This was missing from the initial implementation.
This did not affect any user code path since they were not
checking for completion and only blocking on first `get`.
This allow to easily request async compilation
in a safe way for every static shader.
The get function will always wait if async
compilation has been requested.
Allow adding compilation batches to different priority queues.
Set priorities so static shaders are always compiled first,
then materials, and optimized materials last.
Pull Request: https://projects.blender.org/blender/blender/pulls/139456
This PR adds builtin shaders for drawing points. Using `FLAT_COLOR`,
`SMOOTH_COLOR`, `UNIFORM_COLOR` can lead to undesired behavior
on Metal and Vulkan backends. To ensure future compatibility this PR
adds `POINT_FLAT_COLOR` and `POINT_UNIFORM_COLOR`.
The point size can be set using `gpu.state.point_size_set`.
Pull Request: https://projects.blender.org/blender/blender/pulls/139583
Race condition between the main thread and worker thread. In this case
the worker thread was waiting for a device idle, but the main thread was
waiting on a specific timeline. As long as the timeline didn't pass the
device was locked.
Releaved the race condition to check on queue idle as the timeline event
isn't part of the queue.
On windows drivers could be reset and could lead to other crashes as handles
were not used anymore.
Pull Request: https://projects.blender.org/blender/blender/pulls/139579