There were some issues with workbench shadow drawing. This PR
does some tweaks to fix the shadow drawing on vulkan.
* Framebuffer stencil clearing when write stencil is disabled
* Tweaks to stencil operation and tests
* Disable restart for line adjacency
Pull Request: https://projects.blender.org/blender/blender/pulls/114673
Some minor tweaks to the vulkan backend to support grease pencil
drawing. The changes include:
* Add support for GPU_DATA_10_11_11_REV clearing
* Use correct index buffer start and count
Anti aliasing isn't working as they require different samplers being
configured and that require some design work.
Effects haven't been tested.
Pull Request: https://projects.blender.org/blender/blender/pulls/114659
On some platforms `VK_FORMAT_R8G8B8_*` are not supported as vertex buffers. The
obvious workaround for this is to use `VK_FORMAT_R8G8B8A8_*`. Using unsupported
vertex formats would crash Blender as it is not able to compile the graphics
pipelines that use them.
Known platforms are:
- NVIDIA Mobile GPUs (Quadro M1000M)
- AMD Polaris (open source drivers)
This PR adds the initial workings for other unsupported vertex buffer formats we
need to fix in the future.
`VKDevice.workarounds.vertex_formats` contain booleans if the workaround for
a specific format should be turned on (`r8g8b8 = true`). `VertexFormatConverter` can be
used to identify if conversions are needed and perform the conversion.
Pull Request: https://projects.blender.org/blender/blender/pulls/114572
Ensure that on closing of the application, all pending
SafeFreeLists are released. A new change to ensure
release of SafeFreeLists was always deferred until
full completion of GPU buffers meant that a pending
MTLSafeFreeList container may not be released on
shutdown.
This change ensures that the container is fully
destructed on final shutdown.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/114532
Fixes a harsh transistion between diffuse and subsurface scattering
materials in the Principled BSDF as a user increases the Subsurface
Scattering Weight from 0 to 1.
Pull Request: https://projects.blender.org/blender/blender/pulls/114500
The null byte wasn't taken into account when allocating memory
to strcpy into.
The calculation to check if allocation was needed was also wrong,
causing allocation for every string.
In practice it's not so likely users would ever hit this since
the function tended to over allocate, even in the case an off by one
error occurred, in all likelihood the room would already be available.
Ref !114512
Vulkan API uses Flags and FlagBits for enumerations. The FlagBits
contains the options that can be hold with the Flags data type.
This wasn't well understood at the beginning of the project and
the FlagBits where used where Flags should have been used. This cleanup
fixes this, improving the readability of the code where bit
manipulations where used.
Pull Request: https://projects.blender.org/blender/blender/pulls/114459
This adds a new way of computing occlusion using visibility bitmask. To
make it more algorithm agnostic, we name it horizon scan.
This cleans-up / simplify the code compared to the Horizon based solution.
There is no more trickery for fading influence of distant samples which
makes the result match cycles closer.
This introduces a new thickness option. Maintaining it relatively low
makes it possible to avoid over occlusion because of in front geometry.
Making it too low will cause under occlusion.
Related #112979
Pull Request: https://projects.blender.org/blender/blender/pulls/114150
When using EEVEE-Next the final render data is readback from texture
views. This wasn't implemented yet.
This PR adds support for reading back texture views. It also makes sure
the correct layer is read when reading back data from framebuffers and
adds internal support to read back a partial texture.
Pull Request: https://projects.blender.org/blender/blender/pulls/114411
After previous changes to allow command buffers to not require
execution and completion in submission order,
guarantees for releasing freed buffers back to the memory pool
within the frame life time had changed.
This could mean a released buffer could be returned to the
memory pool prematurely, if a subsequent command buffer
completes before a previously submitted one, flagging a resource as no
longer in use by the GPU, while it still may be in use by the orignal
command buffer.
This PR defers final reference count release for buffers
being actively used until the following call to GPU_render_step,
to ensure that buffers freed will be available for the lifetime of
the frame, covering all command submissions, rather than just
within the lifetime of the command buffer submission within which a
buffer was freed.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/114329
This PR replaces submissions with pipeline barriers when generating
mipmaps. This would reduce the amount of submissions and improve
performance during mipmap generation as dependencies will be tracked
between commands.
Pull Request: https://projects.blender.org/blender/blender/pulls/114306
Adjust clamping of inputs in the Principled BSDF to avoid errors and
inconsistencies between render engines, while trying to leave as many
inputs as possible unclamped for artisitc purposes.
Pull Request: https://projects.blender.org/blender/blender/pulls/112895
The last good commit was 8474716abb.
After this commits from main were pushed to blender-v4.0-release. These are
being reverted.
Commits a4880576dc from to b26f176d1a that happend afterwards were meant for
4.0, and their contents is preserved.
Goal is to reduce the number of command buffer flushes by tracking what is
happening in the different command queues. This is an initial step towards
advanced queue-ing strategies.
The new (intermediate) strategy records commands to different command
buffers based on what they do. There is a command buffer for data transfers,
compute pipelines and graphics pipelines.
When a compute command is recorded it ensures that all graphic commands
are finished. When a graphic command is recorded it ensures all compute
commands are finished. When a graphic or compute command is scheduled
all recorded data transfer commands are scheduled as well.
Some improvements are expected as multiple compute and data transfers
commands can now be scheduled at the same time and don't need to unbind
and rebind render passes. Especially when using EEVEE-Next which is
compute centric the performance change is visible for the user.
Pull Request: https://projects.blender.org/blender/blender/pulls/114104
When using Voronoi shader nodes on legacy Intel platforms (HD4400) Blender would crash
due to a driver bug. The bug is related to generating the `fractal_voronoi_x_fx` functions.
It doesn't effect all drivers, but mainly from vendors that don't allow installing the official
intel drivers.
We have tried several approaches including using unique function names and unroll only the
function of the body. But none worked on the failing platform.
In the future we could solve this by including our own GLSL compiler, but that is still very
experimental and requires a lot of testing.#113938
Pull Request: https://projects.blender.org/blender/blender/pulls/113834
Some of these devices are not capable of running >=4.0, due to issues
with Mesa's Compute Shaders and their D3D drivers.
This PR marks those GPUs as unsupported, and prints info to stdout.
A driver update will be available for 8cx Gen3 on the 17th October
from here:
https://www.qualcomm.com/products/mobile/snapdragon/pcs-and-tablets/snapdragon-8-series-mobile-compute-platforms/snapdragon-8cx-gen-3-compute-platform#Software
It will take longer via the standard MS Windows Update channels,
as there is certification, testing, etc required, but it is possible
to get the drivers, at least.
This issue applies even when using emulated x64.
If this does not get merged, all WoA devices will break with 4.0,
where older ones will just launch a grey screen and crash, and newer
ones will open, but scenes will not render correctly in Workbench.
These devices work by using Mesa's D3D12 Gallium driver ("GLOn12"),
which is why we have to read the DirectX driver version - the version
reported by OpenGL is the mesa version, which is independent of the
driver (which is the part with the bug).
Pull Request: https://projects.blender.org/blender/blender/pulls/113674
PR Introduces GPU_storagebuf_sync_to_host as an explicit routine to
flush GPU-resident storage buffer memory back to the host within the
GPU command stream.
The previous implmentation relied on implicit synchronization of
resources using OpenGL barriers which does not match the
paradigm of explicit APIs, where indiviaul resources may need
to be tracked.
This patch ensures GPU_storagebuf_read can be called without
stalling the GPU pipeline while work finishes executing. There are
two possible use cases:
1) If GPU_storagebuf_read is called AFTER an explicit call to
GPU_storagebuf_sync_to_host, the read will be synchronized.
If the dependent work is still executing on the GPU, the host
will stall until GPU work has completed and results are available.
2) If GPU_storagebuf_read is called WITHOUT an explicit call to
GPU_storagebuf_sync_to_host, the read will be asynchronous
and whatever memory is visible to the host at that time will be used.
(This is the same as assuming a sync event has already been signalled.)
This patch also addresses a gap in the Metal implementation where
there was missing read support for GPU-only storage buffers.
This routine now uses a staging buffer to copy results if no
host-visible buffer was available.
Reading from a GPU-only storage buffer will always stall
the host, as it is not possible to pre-flush results, as no
host-resident buffer is available.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/113456
The previous formula for adjusting Coat Tint intensity resulted
in strong tints and sudden colour changes when using a low coat weight.
This commit fixes these issues by mixing between a white tint (no tint)
and the chosen tint based on the Coat Weight.
Pull Request: https://projects.blender.org/blender/blender/pulls/113468
This adds correct object bounds estimation.
This works by creating an occupancy texture where one
bit represents one froxel. A geometry pre-pass fill this
occupancy texture and doesn't do any shading. Each bit
set to 0 will not be considered occupied by the object
volume and will discard the material compute shader for
this froxel.
There is 2 method of computing the occupancy map:
- Atomic XOR: For each fragment we compute the amount of
froxels **center** in-front of it. We then convert that
into occupancy bitmask that we apply to the occupancy
texture using `imageAtomicXor`. This is straight forward
and works well for any manifold geometry.
- Hit List: For each fragment we write the fragment depth
in a list (contained in one array texture). This list
is then processed by a fullscreen pass (see
`eevee_occupancy_convert_frag.glsl`) that sorts and
converts all the hits to the occupancy bits. This
emulate Cycles behavior by considering only back-face
hits as exit events and front-face hits as entry events.
The result stores it to the occupancy texture using
bit-wise `OR` operation to compose it with other non-hit
list objects. This also decouple the hit-list evaluation
complexity from the material evaluation shader.
## Limitations
### Fast
- Non-manifolds geometry objects are rendered incorrectly.
- Non-manifolds geometry objects will affect other objects
in front of them.
### Accurate
- Limited to 16 hits per layer for now.
- Non-manifolds geometry objects will affect other objects
in front of them.
Pull Request: https://projects.blender.org/blender/blender/pulls/113731
Allow binding of framebuffers without sRGB to linear transform.
`GPU_framebuffer_bind_no_srgb`. This Patch removes color transform
artifacts in node, image and sequence editor.
When the framebuffer is an srgb framebuffer and it is bound without
the transformation, the SRGB textures are bound as UNORM variants.
As framebuffer, render pass and subpass recreation is ensured by
`VKCommandBuffer` we don't need to mark the framebuffer dirty at
this time. Later on we can optimize this by adding a state changed
detection for framebuffers and render passes.
Pull Request: https://projects.blender.org/blender/blender/pulls/113838
With the shift to GPU-driven rendering pipeline,
the SSBO vertex fetch paradigm used to
implement workbench shadows on Metal
instead of utilising the geometry shader
path no longer worked correctly.
This is because the draw submission
required vertex amplification up-front,
based on the expected output geometry
amount for a given input geometry.
This patch aims to resolve this
issue through addition of API to
enable the features within the
GPU driven pipeline.
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/113498
Changes to ensure all supported texture tests are passing with the
Metal backend and add additional tests to cover texture_3d and
texture 1d test cases.
Authored by Apple: Michael Parkin-White
Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/113889