This feature greatly increase depth buffer precision.
This is very noticeable in large view distance scenes.
This is enabled by default on GPUs that supports it (most of the hardware we
support already supports this). This makes rendering different on the GPUs
that do not support that feature (`glClipControl`).
While this give much better depth precision as before, we also have a lot of
imprecision caused by our vertex transformations. This can be improved in
another task.
Pull Request: https://projects.blender.org/blender/blender/pulls/138898
Part of #136993.
Share as much of the ShaderCompiler implementations as possible.
Remove the ShaderCompiler/ShaderCompilerGeneric split and make most of
its functions non virtual.
Move the `get_compiler` function from `Context` to `GPUBackend` and
creation/deletion to `GPUBackend::init/delete_resources`.
Add a `batch_cancel` function to `ShaderCompiler` (needed for the
GPUPass refactor).
As a nice extra, the multithreaded OpenGL compilation has become faster
too.
The barbershop materials + EEVEE static shaders have gone from 27s to
22s.
I have not observed any performance difference on Vulkan or Metal.
Pull Request: https://projects.blender.org/blender/blender/pulls/136676
GPUViewport is creating a bunch of framebuffer textures for itself, but
some space types never initialize/use them. E.g. Sequencer, Nodes etc.
only ever use the "overlay" texture. Eventually when viewport is
"drawn", it combines this uninitialized texture data and then only by
luck it happens that most of the time it is black. But not always!
The textures were only cleared (right now) on Metal backend, under
GPU_clear_viewport_workaround as if it was some driver workaround. Stop
doing that, and just clear them always.
However, there was seemingly a performance issue on OpenGL, when this
clear was being done. At least on my machine (Win10, Geforce RTX
3080Ti), the overhead of doing the clears is measurable, and is caused
by usage of GL4.4 glClearTexImage instead of a framebuffer clear. As if
glClearTexImage makes "pixel data to exist" on the CPU side and then
later on binding this framebuffer sends off that data to the GPU, or
somesuch.
More details in the PR.
Pull Request: https://projects.blender.org/blender/blender/pulls/131518
This happened because NVidia GPUs require higher alignment
for SSBO binds than for vertex inputs.
This is related to #131103 which fixed it for vulkan.
Add a common capability option for that.
Some Vulkan platforms don't support framebuffers with gaps between the
color attachments. Workbench framebuffers can create gaps.
(`in_front_fb`, `main_fb` when used for wire frame drawing).
This PR implements a detection mechanism to detect gaps. It also disables
features that are not able to comply to this requirement.
Detected when working on #129062
Pull Request: https://projects.blender.org/blender/blender/pulls/130258
Adding a dummy storage buffer to the classification shader
seems to fix the issue on Qualcomm drivers (WoA).
The workaround is added to the force workaround option to
allow other platforms to test the fix.
Rel #122837
Pull Request: https://projects.blender.org/blender/blender/pulls/129857
This is the first commit of the several required to support
subprocess-based parallel compilation on OpenGL.
This provides the base API and implementation, and exposes the max
subprocesses setting on the UI, but it's not used by any code yet.
More information and the rest of the code can be found in #121925.
This one includes:
- A new `GPU_shader_batch` API that allows requesting the compilation
of multiple shaders at once, allowing GPU backed to compile them in
parallel and asynchronously without blocking the Blender UI.
- A virtual `ShaderCompiler` class that backends can use to add their
own implementation.
- A `ShaderCompilerGeneric` class that implements synchronous/blocking
compilation of batches for backends that don't have their own
implementation yet.
- A `GLShaderCompiler` that supports parallel compilation using
subprocesses.
- A new `BLI_subprocess` API, including IPC (required for the
`GLShaderCompiler` implementation).
- The implementation of the subprocess program in
`GPU_compilation_subprocess`.
- A new `Max Shader Compilation Subprocesses` option in
`Preferences > System > Memory & Limits` to enable parallel shader
compilation and the max number of subprocesses to allocate (each
subprocess has a relatively high memory footprint).
Implementation Overview:
There's a single `GLShaderCompiler` shared by all OpenGL contexts.
This class stores a pool of up to `GCaps.max_parallel_compilations`
subprocesses that can be used for compilation.
Each subprocess has a shared memory pool used for sending the shader
source code from the main Blender process and for receiving the already
compiled shader binary from the subprocess. This is synchronized using
a series of shared semaphores.
The subprocesses maintain a shader cache on disk inside a
`BLENDER_SHADER_CACHE` folder at the OS temporary folder.
Shaders that fail to compile are tried to be compiled again locally for
proper error reports.
Hanged subprocesses are currently detected using a timeout of 30s.
Pull Request: https://projects.blender.org/blender/blender/pulls/122232
Compute shaders are required since 4.0. There was one occasion where
an older AMD driver failed and support was turned off. This driver
is now marked unsupported.
This PR includes:
- removing the check in viewport compositing
- remove properties from system info
- always construct draw manager.
- remove unused pass logic in draw hair/curves
- add deprecation warning when accessed from python
Pull Request: https://projects.blender.org/blender/blender/pulls/120909