This PR replaces submissions with pipeline barriers when generating
mipmaps. This would reduce the amount of submissions and improve
performance during mipmap generation as dependencies will be tracked
between commands.
Pull Request: https://projects.blender.org/blender/blender/pulls/114306
Adjust clamping of inputs in the Principled BSDF to avoid errors and
inconsistencies between render engines, while trying to leave as many
inputs as possible unclamped for artisitc purposes.
Pull Request: https://projects.blender.org/blender/blender/pulls/112895
The last good commit was 8474716abb.
After this commits from main were pushed to blender-v4.0-release. These are
being reverted.
Commits a4880576dc from to b26f176d1a that happend afterwards were meant for
4.0, and their contents is preserved.
Goal is to reduce the number of command buffer flushes by tracking what is
happening in the different command queues. This is an initial step towards
advanced queue-ing strategies.
The new (intermediate) strategy records commands to different command
buffers based on what they do. There is a command buffer for data transfers,
compute pipelines and graphics pipelines.
When a compute command is recorded it ensures that all graphic commands
are finished. When a graphic command is recorded it ensures all compute
commands are finished. When a graphic or compute command is scheduled
all recorded data transfer commands are scheduled as well.
Some improvements are expected as multiple compute and data transfers
commands can now be scheduled at the same time and don't need to unbind
and rebind render passes. Especially when using EEVEE-Next which is
compute centric the performance change is visible for the user.
Pull Request: https://projects.blender.org/blender/blender/pulls/114104
When using Voronoi shader nodes on legacy Intel platforms (HD4400) Blender would crash
due to a driver bug. The bug is related to generating the `fractal_voronoi_x_fx` functions.
It doesn't effect all drivers, but mainly from vendors that don't allow installing the official
intel drivers.
We have tried several approaches including using unique function names and unroll only the
function of the body. But none worked on the failing platform.
In the future we could solve this by including our own GLSL compiler, but that is still very
experimental and requires a lot of testing.#113938
Pull Request: https://projects.blender.org/blender/blender/pulls/113834
Some of these devices are not capable of running >=4.0, due to issues
with Mesa's Compute Shaders and their D3D drivers.
This PR marks those GPUs as unsupported, and prints info to stdout.
A driver update will be available for 8cx Gen3 on the 17th October
from here:
https://www.qualcomm.com/products/mobile/snapdragon/pcs-and-tablets/snapdragon-8-series-mobile-compute-platforms/snapdragon-8cx-gen-3-compute-platform#Software
It will take longer via the standard MS Windows Update channels,
as there is certification, testing, etc required, but it is possible
to get the drivers, at least.
This issue applies even when using emulated x64.
If this does not get merged, all WoA devices will break with 4.0,
where older ones will just launch a grey screen and crash, and newer
ones will open, but scenes will not render correctly in Workbench.
These devices work by using Mesa's D3D12 Gallium driver ("GLOn12"),
which is why we have to read the DirectX driver version - the version
reported by OpenGL is the mesa version, which is independent of the
driver (which is the part with the bug).
Pull Request: https://projects.blender.org/blender/blender/pulls/113674
PR Introduces GPU_storagebuf_sync_to_host as an explicit routine to
flush GPU-resident storage buffer memory back to the host within the
GPU command stream.
The previous implmentation relied on implicit synchronization of
resources using OpenGL barriers which does not match the
paradigm of explicit APIs, where indiviaul resources may need
to be tracked.
This patch ensures GPU_storagebuf_read can be called without
stalling the GPU pipeline while work finishes executing. There are
two possible use cases:
1) If GPU_storagebuf_read is called AFTER an explicit call to
GPU_storagebuf_sync_to_host, the read will be synchronized.
If the dependent work is still executing on the GPU, the host
will stall until GPU work has completed and results are available.
2) If GPU_storagebuf_read is called WITHOUT an explicit call to
GPU_storagebuf_sync_to_host, the read will be asynchronous
and whatever memory is visible to the host at that time will be used.
(This is the same as assuming a sync event has already been signalled.)
This patch also addresses a gap in the Metal implementation where
there was missing read support for GPU-only storage buffers.
This routine now uses a staging buffer to copy results if no
host-visible buffer was available.
Reading from a GPU-only storage buffer will always stall
the host, as it is not possible to pre-flush results, as no
host-resident buffer is available.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/113456
The previous formula for adjusting Coat Tint intensity resulted
in strong tints and sudden colour changes when using a low coat weight.
This commit fixes these issues by mixing between a white tint (no tint)
and the chosen tint based on the Coat Weight.
Pull Request: https://projects.blender.org/blender/blender/pulls/113468
This adds correct object bounds estimation.
This works by creating an occupancy texture where one
bit represents one froxel. A geometry pre-pass fill this
occupancy texture and doesn't do any shading. Each bit
set to 0 will not be considered occupied by the object
volume and will discard the material compute shader for
this froxel.
There is 2 method of computing the occupancy map:
- Atomic XOR: For each fragment we compute the amount of
froxels **center** in-front of it. We then convert that
into occupancy bitmask that we apply to the occupancy
texture using `imageAtomicXor`. This is straight forward
and works well for any manifold geometry.
- Hit List: For each fragment we write the fragment depth
in a list (contained in one array texture). This list
is then processed by a fullscreen pass (see
`eevee_occupancy_convert_frag.glsl`) that sorts and
converts all the hits to the occupancy bits. This
emulate Cycles behavior by considering only back-face
hits as exit events and front-face hits as entry events.
The result stores it to the occupancy texture using
bit-wise `OR` operation to compose it with other non-hit
list objects. This also decouple the hit-list evaluation
complexity from the material evaluation shader.
## Limitations
### Fast
- Non-manifolds geometry objects are rendered incorrectly.
- Non-manifolds geometry objects will affect other objects
in front of them.
### Accurate
- Limited to 16 hits per layer for now.
- Non-manifolds geometry objects will affect other objects
in front of them.
Pull Request: https://projects.blender.org/blender/blender/pulls/113731
Allow binding of framebuffers without sRGB to linear transform.
`GPU_framebuffer_bind_no_srgb`. This Patch removes color transform
artifacts in node, image and sequence editor.
When the framebuffer is an srgb framebuffer and it is bound without
the transformation, the SRGB textures are bound as UNORM variants.
As framebuffer, render pass and subpass recreation is ensured by
`VKCommandBuffer` we don't need to mark the framebuffer dirty at
this time. Later on we can optimize this by adding a state changed
detection for framebuffers and render passes.
Pull Request: https://projects.blender.org/blender/blender/pulls/113838
With the shift to GPU-driven rendering pipeline,
the SSBO vertex fetch paradigm used to
implement workbench shadows on Metal
instead of utilising the geometry shader
path no longer worked correctly.
This is because the draw submission
required vertex amplification up-front,
based on the expected output geometry
amount for a given input geometry.
This patch aims to resolve this
issue through addition of API to
enable the features within the
GPU driven pipeline.
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/113498
Changes to ensure all supported texture tests are passing with the
Metal backend and add additional tests to cover texture_3d and
texture 1d test cases.
Authored by Apple: Michael Parkin-White
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/113889
Caused by a1cc621e1e. The "is_hair_length" value needs to be
set before `attr_input_name` is called because it uses the value to set
the attribute name with the "l" alias.
When drawing node sockets an immediate mode buffer is created that can
contain all the node sockets of the node. Only visible node sockets will
then be added. Vulkan assumed that all elements in the buffer needed to
be drawn, resulting using uninitialzed memory for drawing node sockets.
This resulted in very colorful and big artifacts rendered in the node
editor. This was detected during drawing of node sockets, but would have
been visible in other places as well.
Pull Request: https://projects.blender.org/blender/blender/pulls/113830
Replaces all usage by the the gpu_shader_math
equivalent. This is because the old shader
library was quite tangled.
This avoids dependency hell trying to
mix libraries.
Changes are split into isolated commits until
I had to do mass changes because of inter-
dependencies.
Pull Request: https://projects.blender.org/blender/blender/pulls/113631
Shaders that require transform feedback should not be validated on
backends that don't support transform feedback.
In Vulkan transform feedback is implemented as an extension and
supported by half of the platforms. It isn't decided yet if we want to
support transform feedback as it is currently used as a fallback for
hair compute shader. In vulkan compute is available on all platforms.
During validation the shader printed a not implemented message.
This change hides that message.
Pull Request: https://projects.blender.org/blender/blender/pulls/113655
When Vulkan is started with validation layers the console is flooded.
Somewhere in the Vulkan validation layers or mesa driver (or the
combination) there is an issue where maxBufferSize is reported by the
driver to be 4GB, but the validation layers are reporting any buffer
size to be larger than 4GB.
For now we skip this message to be logged.
Pull Request: https://projects.blender.org/blender/blender/pulls/113652
With the shift to GPU-driven rendering pipeline,
the SSBO vertex fetch paradigm used to
implement workbench shadows on Metal
instead of utilising the geometry shader
path no longer worked correctly.
This is because the draw submission
required vertex amplification up-front,
based on the expected output geometry
amount for a given input geometry.
This WIP patch aims to resolve this
issue through addition of API to
enable the features within the
GPU driven pipeline.
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/113498
`eevee_shadow_page_tile_store` shader uses `VIEWPORT_INDEX` and `LAYER`.
Both use an optional extension in OpenGL and Vulkan. When the extension
isn't available a geometry shader is injected to emulate the
extension. The generated geometry shader requires `instance_name` to be
set. This wasn't the case for `eevee_shadow_page_tile_store` shader.
This PR also adds a detection for incompatible shader infos.
- Shaders that use `VIEWPORT_INDEX` or `LAYER` cannot have a geometry stage.
This check is done in debug and release builds.
- Shaders that use a fallback shader should have instance names set in
the stage interfaces. This check is only done in debug builds.
Pull Request: https://projects.blender.org/blender/blender/pulls/113649
This PR adds workarounds for platforms that don't support `shaderOutputLayer`
or `shaderOutputViewportIndex`. Some NVIDIA laptop GPUs and ARM GPUs don't
have those device features.
The workaround uses the same approach as OpenGL. A geometry shader is injected
to emulate the feature.
For testing the workarounds they have also been connected to the
`--debug-gpu-force-workarounds` command line argument.
Fixes#113475 by implementing #113529
Pull Request: https://projects.blender.org/blender/blender/pulls/113605
Vulkan backend doesn't support subpass transition yet. Currently
shader_builder fails as when compiling shaders that use subpass
transitions.
This PR adds a dummy global variable to the GLSL so the compilation
continues.
Development of subpass would take some time as the current API doesn't
fit Vulkan.
Pull Request: https://projects.blender.org/blender/blender/pulls/113604
The OpenGL extension `GL_ARB_shader_viewport_layer_array` wasn't turned
off when starting blender with `--debug-gpu-force-workarounds`. Although
all known supported platforms support this extension and we haven't seen
any failures with it, it was mentioned to be phased out as we consider
it to be a core extension, which it isn't.
This PR will enable the workaround for this extension and remove the
extension from the phase out list.
Pull Request: https://projects.blender.org/blender/blender/pulls/113572