The polyline workarounds were not working as expected
since #139627 as it was not garanteed that the polylines
shader would be correctly initialized with the workaround
tag.
Adding a wrapper class to ensure the initialization fixes
the issue.
The check for Mesa needed to be before the more coarse
check about an AMD GPU. Also, it seems the newer drivers
do not have `X.Org` in the vendor string.
Checking for `Mesa` in the version string seems to be
the correct way.
Pull Request: https://projects.blender.org/blender/blender/pulls/140204
This reduces the time needed to get to the first pixel
on screen by multithreading the builtin shader compilation.
We avoid doing this if subprocess compilation is on as
the overhead of potentially partially starting all subprocess
is far greater than the benefit of paralllel compilation.
For some reasons, the compilation is much slower when
done async for these shaders (on Metal ~200ms > ~1.2ms),
so the saving might not be substantial.
Mac M1: First frame 6s > 5s.
Pull Request: https://projects.blender.org/blender/blender/pulls/139627
The Extend Bounds input has no effect when the Fast Gaussian filter is
used. Similarly, it has no effect if the Bokeh Blur node is using
variable size. This is a known limitation and was just not implemented.
So to fix this, we implement a general solution that works globally
across the node by pre-padding the inputs of the blur. This uses more
memory but also speeds up the base case when Extend Bounds is disabled,
while also reducing the binary size due to fewer blur specializations.
The variable size Bokeh Blur test was updated since it Extend Bounds was
silently ignored.
Pull Request: https://projects.blender.org/blender/blender/pulls/140192
This patch removes the older handle selection and transformation logic,
which was accessible through user preferences by switching off the
default "simple tweaking" mode in Editing -> Video Sequencer.
There was some initial bugginess with the new handle behavior, hence
the option to revert to legacy mode, but with recent updates this no
longer applies. With the new system, all selection workflows
are still possible -- this just drops older code to make things simpler.
No functional changes intended.
Pull Request: https://projects.blender.org/blender/blender/pulls/140031
Prevent race conditions caused by calling `GPUWorker::wake_up` when the
worker is not waiting.
Found to be an issue in #139627, since `wake_up` is likely to be called
before the thread has fully started.
Pull Request: https://projects.blender.org/blender/blender/pulls/139842
This allows to generate source file that will
be injected in a predefined source dependance tree.
This allow much cleaner shader workflow where
all sources are explicitly referenced from the
main source file.
Pull Request: https://projects.blender.org/blender/blender/pulls/140047
This prevents the use of unaligned data types in
vertex formats. These formats are not supported on many
platform.
This simplify the `GPUVertexFormat` class a lot as
we do not need packing shenanigans anymore and just
compute the vertex stride.
The old enums are kept for progressive porting of the
backends and user code.
This will break compatibility with python addons.
TODO:
- [x] Deprecation warning for PyGPU (4.5)
- [x] Deprecate matrix attributes
- [x] Error handling for PyGPU (5.0)
- [x] Backends
- [x] Metal
- [x] OpenGL
- [x] Vulkan
Pull Request: https://projects.blender.org/blender/blender/pulls/138846
This works by wrapping the entry point call inside a
`main` function.
Since resources are still defined in global space,
function accessing these are marked with a custom
attribute. This custom attribute expands in a
`#ifdef` guard for the matching stage.
This is a temporary solution and will eventually
be lifted once we support SRD.
### TODO
- [ ] Implement `[[gpu::vertex/fragment_function]]`.
Pull Request: https://projects.blender.org/blender/blender/pulls/139233
Previous implementation allows to move VKBuffers, but didn't do a
proper std::move. On second thought it is a bad idea to be able to move
GPU resources. This PR removes the ability to move the buffer and
replace the usages with a unique ptr.
Pull Request: https://projects.blender.org/blender/blender/pulls/140103
This allows to reduce the waiting time caused by
shader compilation on some GPU-driver combo.
A new settings in the User Preferences make it
possible to override the default amount of worker
threads and optionally use subprocesses.
We still use only one worker thread in cases where
there is no benefit with adding more workers
(like AMD pro driver and Intel windows).
It doesn't scale as much as subprocesses for material
shader compilation but that is for other reasons
explained in #139818.
Add some heuristic to avoid too much memory usage
and / or too many stalls.
Also add some heuristic to the default number of subprocess for
the platform that shows scalling.
Historically, multithreaded compilation was prevented by the
need of context per thread inside `DRWShader` module.
Also there was no good scaling at that time. But
nowadays numbers shows different results with
good scaling with reasonable amount of threads on many
platforms.
Even if we are going for vulkan in the next release
most of the legacy hardware will still use OpenGL for
a few other releases. So it is relevant to make this
easy improvement.
See pull request for measurements.
Pull Request: https://projects.blender.org/blender/blender/pulls/139821
This has limited use cases since it doesn't
profile the heavy part of the vulkan backend.
Almost 1:1 port of the metal implementation from #139551.
Doesn't cover rendergraph submission nor GPU timings.
Pull Request: https://projects.blender.org/blender/blender/pulls/139899
Color picking reads 3 values from a texture that is backed by 4
channels. The conversion would also convert the 4th channel.
This is a short term fix. We should reconsider the usage of reading a
different number of elements than backed by the texture. But that
requires work in the color picking and python GPU module.
Pull Request: https://projects.blender.org/blender/blender/pulls/139929
When drawing image scope the vulkan backend raised some asserts. After
checking them the issue was in the calling code, that could lead to
undefined behavior on other platforms as well.
It isn't allowed to have an immediate mode shader bound, when performing
batch drawing. There was also a point batch that didn't use any point
shader resulting in undefined behavior as well.
For 5.0 we should add this as a GPU module check.
Pull Request: https://projects.blender.org/blender/blender/pulls/139926
Descriptor sets/pools are known to be troublesome as it doesn't match
how GPUs work, or how application want to work, adding more complexity
than needed. This results is quite an overhead allocating and
deallocating descriptor sets.
This PR will use descriptor buffers when they are available. Most platforms
support descriptor buffers. When not available descriptor pools/sets
will be used.
Although this is a feature I would like to land it in 4.5 due to the API changes.
This makes it easier to fix issues when 4.5 is released.
The feature can easily be disabled by setting the feature to false if it has
to many problems.
Pull Request: https://projects.blender.org/blender/blender/pulls/138266
When buffers/images are allocated that use larger limits than supported
by the GPU blender would crash. This PR adds some safety mechanism to
allow Blender to recover from allocation errors.
This has been tested on NVIDIA drivers.
Pull Request: https://projects.blender.org/blender/blender/pulls/139876
This avoid having to compile specializations JIT and
use the same API as subprocess compilation.
This bridges the gap between subprocess and threaded
compilation.
Pull Request: https://projects.blender.org/blender/blender/pulls/139702
Compilation constants are constants defined in the create info.
They cannot be changed after the shader is created.
It is a replacement to macros with added type safety.
Reuse most of the logic from Specialization constants.
Pull Request: https://projects.blender.org/blender/blender/pulls/139703
If the compilation subprocess crashes due to an internal driver error,
the end semaphore is never signaled and leaves Blender hanging.
This replaces the `decrement` calls with `try_decrement` and checks
every second if the subprocess is lost.
Additionally, if the issue comes from a broken binary, the crashes will
happen every time a Blender session tries to load it.
So now we store the shader hash in the shared memory before trying to
load the binary. If the subprocess is lost mid compilation, the main
process will delete the broken cached binary.
Pull Request: https://projects.blender.org/blender/blender/pulls/139747
This PR changes loading of implicit vulkan layers. See #139543 where we
detected that there are vulkan layers installed on systems that try to
impersonate other software, but crashes when used in Blender.
This PR will reuse the VkInstance of GHOST_ContextVK when querying for
possible compatible devices. Previously a temporary VkInstance was
created but could trigger an error in the Vulkan loader.
Pull Request: https://projects.blender.org/blender/blender/pulls/139640
This was missing from the initial implementation.
This did not affect any user code path since they were not
checking for completion and only blocking on first `get`.
This allow to easily request async compilation
in a safe way for every static shader.
The get function will always wait if async
compilation has been requested.