Metal and Vulkan don't support line smoothing. There is a workaround
implemented. This workaround is only enabled when linesmooth value is
larger than 1. However When using smooth lines it should also be used.
This is fixed by adding a `GPU_line_smooth_get` function for getting the
current line smooth state.
Pull Request: https://projects.blender.org/blender/blender/pulls/138123
* Pixel buffer is always allocated with export and dedicated memory flags.
* Returns an opaque file descriptor (Unix) or handle (Windows).
* Native handle now includes memory size as it may be slightly bigger
than the requested size.
Pull Request: https://projects.blender.org/blender/blender/pulls/137363
It's safer to pass a type so that it can be checked if delete should be
used instead. Also changes a few void pointer casts to const_cast so that
if the data becomes typed it's an error.
Pull Request: https://projects.blender.org/blender/blender/pulls/137404
This makes no sense to have in the draw namespace.
Also take the opportunity for making the coordinates
a float2 and rename them to something more descriptive.
This unify the C++ and GLSL codebase style.
The GLSL types are still in the backend compatibility
layers to support python shaders. However, the C++
shader compilation layer doesn't have them to enforce
correct type usage.
Note that this is going to break pretty much all PRs
in flight that targets shader code.
Rel #137261
Pull Request: https://projects.blender.org/blender/blender/pulls/137369
- Make the module global and allow usage from anywhere.
- Remove the matrix API for thread safety reason
- Add lifetime management
- Make display linked to the overlays for easy toggling
## Notes
- Lifetime is in redraw. If there is 4 viewport redrawing, the lifetime decrement by 4 for each window redraw. This allows one viewport to be producer and another one being the consumer.
- Display happens at the start of overlays. Any added visuals inside of the overlays drawing functions will be displayed the next redraw.
- Redraw is not given to happen. It is only given if there is some scene / render update which tags the viewport to redraw.
- Persistent lines are not reset on file load.
Rel #137006
Pull Request: https://projects.blender.org/blender/blender/pulls/137106
Includes win32 specific extensions definitions when including
`vk_common.hh`. Inside `gpu_context.cc` vulkan needs to be
included before opengl, otherwise windows 10 builders will
report a warning.
```
[6421/7520] Building CXX object source\blender\gpu\CMakeFiles\bf_gpu.dir\intern\gpu_context.cc.obj
C:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared\minwindef.h(130): warning C4005: 'APIENTRY': macro redefinition
C:\Users\blender\git\blender-vexp\blender.git\lib\windows_x64\epoxy\include\epoxy/gl.h(59): note: see previous definition of 'APIENTRY'
```
Pull Request: https://projects.blender.org/blender/blender/pulls/137134
Update the `ShaderCompilerGeneric` to support deferred compilation
using the batch compilation API, so we can get rid of
`drw_manager_shader`.
This approach also allows supporting non-blocking compilation
for static shaders.
This shouldn't cause any behavior changes at the moment, since batch
compilation is not yet used when parallel compilation is disabled.
This adds a `GPUWorker` and a `GPUSecondaryContext` as an easy to use
wrapper for managing secondary GPU contexts.
(Part of #133674)
Pull Request: https://projects.blender.org/blender/blender/pulls/136518
This is getting in the way of making the
GPUShader API more threadsafe.
This getter already doesn't work for vulkan
and Metal, and has very limited usage.
Keeping the python function to avoid errors
and display a deprecation warning.
Pull Request: https://projects.blender.org/blender/blender/pulls/136983
This patch supports GPU OIDN denoising in the compositor. A new
compositor performance option was added to allow choosing between CPU,
GPU, and Auto device selection. Auto will use whatever the compositor is
using for execution.
The code is two folds, first, denoising code was adapted to use buffers
as opposed to passing in pointers to filters directly, this is needed to
support GPU devices. Second, device creation is now a bit more involved,
it tries to choose the device is being used by the compositor for
execution.
Matching GPU devices is done by choosing the OIDN device that matches
the UUID or LUID of the active GPU platform. We need both UUID and LUID
because not all platforms support both. UUID is supported on all
platforms except MacOS Metal, while LUID is only supported on Window and
MacOS metal.
If there is no active GPU device or matching is unsuccessful, we let
OIDN choose the best device, which is typically the fastest.
To support this case, UUID and LUID identifiers were added to the
GPUPlatformGlobal and are initialized by the GPU backend if supported.
OpenGL now requires GL_EXT_memory_object and GL_EXT_memory_object_win32
to support this use case, but it should function without it.
Pull Request: https://projects.blender.org/blender/blender/pulls/136660
This allow to store the full object ID inside a `uint32`
buffer. This allows to get the per object data in deferred
passes and avoid to store object data inside the Gbuffer.
This data is only written if needed.
This had to modify the implementation of subpass input
for all backend to be able to bind layered texture.
This currently work because only the layer 0 is bound to the
framebuffer. This is fragile but I don't see a good builtin way
to fix it.
Rel #135935
#### Tasks
- [x] Replace light linking bits in Gbuffer
- [x] Replace Object ID in GBuffer for SSS
- [x] Conditional storage
- [x] Dummy storage if not needed
Pull Request: https://projects.blender.org/blender/blender/pulls/136428
Cause by trying to deference a null batch.
In normal case, these calls never create a command
and are discarded early. 4.4 introduced the
polyline_draw_workaround to remove the use of geometry
shaders. These were not guarded against zero vertices
calls.
Adding an early out clause fixes the issue.
Pull Request: https://projects.blender.org/blender/blender/pulls/136840
This was caused by the deinterleaved format being
incorrectly decoded by the `bind_attribute_as_ssbo`
function.
Accumulating the offset should be done for all attributes
and not only the one being used. Furthermore, this needs
to happen only once per attribute and not once per name.
Moving the offset computation out of the name loop
fixes the issue.
Pull Request: https://projects.blender.org/blender/blender/pulls/136821
Followup to 9b70851d91.
Return buffers by value rather than creating an empty/uninitialized
buffer first, then initializing it in an extraction function. This generally
makes the code easier to follow. And avoiding these half-created buffers
is an essential step to adding some sort of more global cache.
Pull Request: https://projects.blender.org/blender/blender/pulls/136570
This makes it possible to restore previous Blender 4.3 behavior of bump
mapping, where the large filter width was sometimes (ab)used to get a bevel
like effect on stepwise textures.
For bump from the displacement socket, filter width remains fixed at 0.1.
Ref #133991, #135841
Pull Request: https://projects.blender.org/blender/blender/pulls/136465
Potential fix for legacy AMD driver issue.
- Updating drivers using a clean install has proven to fix the issue as well.
As driver can leave parts of an older driver active what it actually the cau
- Solution comments out the `#line` by replacing the first two characters.
Pull Request: https://projects.blender.org/blender/blender/pulls/136231
This patch adds a new Image Info node which returns information about
compositor images. The node has three sources:
The node returns the following information:
- Pixel Coordinates: The coordinates of the centers of the pixels in the
image. Those are essentially the integer coordinates with half pixels
offsets added.
- Texture Coordinates: Zero centered pixel coordinates normalized along
the greater dimension. Somewhat analogous to Object coordinates in
shader nodes.
- Resolution: The resolution of the image.
- Location: The location of the image in the virtual compositing space.
- Rotation: The rotation of the image in the virtual compositing space.
- Scale: The scale of the image in the virtual compositing space.
This node is very useful to allow greater flexibility and procedural
creations. For instance, coordinates can be used to create procedural
effects like vignette using very simple math nodes. And size can be used
to compute size-relative parameters for pixel-parameter nodes.
Pull Request: https://projects.blender.org/blender/blender/pulls/135104
Since the introduction of storage buffers in Blender, the calling
code has been responsible for ensuring the buffer meets allocation
requirements. All backends require the allocation size to be divisible
by 16 bytes. Until now, this was sufficient, but with GPU subdivision
changes, an external library must also adhere to these requirements.
For OpenSubdiv (OSD), some buffers are not 16-byte aligned, leading
to potential misallocation. Currently, this is mitigated by allocating
a few extra bytes, but this approach has the drawback of potentially
reading unintended bytes beyond the source buffer.
This PR adopts a similar approach to vertex buffers: the backend handles
extra byte allocation while ensuring data uploads and downloads function
correctly without requiring those additional bytes.
No changes were needed for Metal, as its allocation size is already
aligned to 256 bytes.
**Alternative solutions considered**:
- Copying the CPU buffer to a larger buffer when needed (performance impact).
- Modifying OSD buffers to allocate extra space (requires changes to an external library).
- Implementing GPU_storagebuf_update_sub.
Ref #135873
Pull Request: https://projects.blender.org/blender/blender/pulls/135716
Blender already had its own copy of OpenSubDiv containing some local fixes
and code-style. This code still used gl-calls. This PR updates the calls
to use GPU module. This allows us to use OpenSubDiv to be usable on other
backends as well.
This PR was tested on OpenGL, Vulkan and Metal. Metal can be enabled,
but Vulkan requires some API changes to work with loose geometry.

# Considerations
**ShaderCreateInfo**
intern/opensubdiv now requires access to GPU module. This to create buffers
in the correct context and trigger correct dispatches. ShaderCreateInfo is used
to construct the shader for cross compilation to Metal/Vulkan. However opensubdiv
shader caching structures are still used.
**Vertex buffers vs storage buffers**
Implementation tries to keep as close to the original OSD implementation. If
they used storage buffers for data, we will use GPUStorageBuf. If it uses vertex
buffers, we will use gpu::VertBuf.
**Evaluator const**
The evaluator cannot be const anymore as the GPU module API only allows
updating SSBOs when constructing. API could be improved to support updating
SSBOs.
Current implementation has a change to use reads out of bounds when constructing
SSBOs. An API change is in the planning to remove this issue. This will be fixed in
an upcoming PR. We wanted to land this PR as the visibility of the issue is not
common and multiple other changes rely on this PR to land.
Pull Request: https://projects.blender.org/blender/blender/pulls/135296
The general idea is to keep the 'old', C-style MEM_callocN signature, and slowly
replace most of its usages with the new, C++-style type-safer template version.
* `MEM_cnew<T>` allocation version is renamed to `MEM_callocN<T>`.
* `MEM_cnew_array<T>` allocation version is renamed to `MEM_calloc_arrayN<T>`.
* `MEM_cnew<T>` duplicate version is renamed to `MEM_dupallocN<T>`.
Similar templates type-safe version of `MEM_mallocN` will be added soon
as well.
Following discussions in !134452.
NOTE: For now static type checking in `MEM_callocN` and related are slightly
different for Windows MSVC. This compiler seems to consider structs using the
`DNA_DEFINE_CXX_METHODS` macro as non-trivial (likely because their default
copy constructors are deleted). So using checks on trivially
constructible/destructible instead on this compiler/system.
Pull Request: https://projects.blender.org/blender/blender/pulls/134771
When blender is compiled with `WITH_OPENSUBDIV=Off` Blender just works
fine. However when compiling all the static shaders the OpenSubDiv
shaders are also compiled and fail as they rely on OpenSubDiv.
This PR fixes this by only adding the shaders when OpenSubDiv is
available.
This issue could be reproduced using the `--debug-gpu-compile-shaders`
option or running GPU test cases.
Pull Request: https://projects.blender.org/blender/blender/pulls/135285
Move the `StaticShader` class from Workbench to `GPU_shader` and make
compilation thread-safe (Shader usage is still not thread-safe).
Use `StaticShader`s for all shader caches.
Subdivision shaders are still not ported.
(Part of #134690)
Pull Request: https://projects.blender.org/blender/blender/pulls/134812
This PR migrates the subdiv_patch_evaluation_comp.glsl to use
shader create info.
The part of OSD that is used is included as a typedef source (osd_patch_basis.glsl).
Pull Request: https://projects.blender.org/blender/blender/pulls/134917
And slightly simplify two string processing functions in this API,
`GPU_vertformat_safe_attr_name` and `copy_attr_name`.
This makes the API easier to interface with from C++ code,
and can avoid unnecessary string length measurements.
Pull Request: https://projects.blender.org/blender/blender/pulls/134882
Move the code dealing with converting float3 to GPU normals
out of the vertex format header into a separate header. Use a
proper C++ namespace and remove duplication by only using
the more recently added C++ templated conversions.
Most of the diff comes from the removal of the indirect includes
from GPU_vertex_format.hh. A lot of files ended up mistakenly
depending on that.
Pull Request: https://projects.blender.org/blender/blender/pulls/134873