Blender already had its own copy of OpenSubDiv containing some local fixes
and code-style changes. This code still used GL calls. This PR updates the calls
to use the GPU module, which allows OpenSubDiv to be used on other
backends as well.
This PR was tested on OpenGL, Vulkan and Metal. Metal can be enabled,
but Vulkan requires some API changes to work with loose geometry.

# Considerations
**ShaderCreateInfo**
intern/opensubdiv now requires access to the GPU module. This is needed to create buffers
in the correct context and trigger correct dispatches. ShaderCreateInfo is used
to construct the shader for cross compilation to Metal/Vulkan. However, the OpenSubDiv
shader caching structures are still used.
**Vertex buffers vs storage buffers**
The implementation tries to stay as close as possible to the original OSD implementation. Where
OSD uses storage buffers for data, we use GPUStorageBuf. Where it uses vertex
buffers, we use gpu::VertBuf.
**Evaluator const**
The evaluator cannot be const anymore as the GPU module API only allows
updating SSBOs when constructing. The API could be improved to support updating
SSBOs.
The current implementation has a chance of reading out of bounds when constructing
SSBOs. An API change is planned to remove this issue and will land in
an upcoming PR. We wanted to land this PR now as the issue rarely
shows up and multiple other changes rely on this PR.
Pull Request: https://projects.blender.org/blender/blender/pulls/135296
It has been confirmed that the latest release of AMD drivers has fixed
issues for both OpenGL and Vulkan. Users should use AMD driver 25.3.1
or later. This removes the workaround, as it has performance penalties on
RDNA2-based GPUs.
Reference: #135516
Pull Request: https://projects.blender.org/blender/blender/pulls/135630
The general idea is to keep the 'old', C-style MEM_callocN signature, and slowly
replace most of its usages with the new, C++-style type-safer template version.
* `MEM_cnew<T>` allocation version is renamed to `MEM_callocN<T>`.
* `MEM_cnew_array<T>` allocation version is renamed to `MEM_calloc_arrayN<T>`.
* `MEM_cnew<T>` duplicate version is renamed to `MEM_dupallocN<T>`.
Similar type-safe template versions of `MEM_mallocN` will be added soon
as well.
Following discussions in !134452.
NOTE: For now, static type checking in `MEM_callocN` and related functions is slightly
different for Windows MSVC. This compiler seems to consider structs using the
`DNA_DEFINE_CXX_METHODS` macro as non-trivial (likely because their default
copy constructors are deleted). So checks on trivially
constructible/destructible are used instead on this compiler/system.
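The split between an untyped C-style allocator and a type-checked template can be sketched roughly as follows. This is a minimal illustration, not the guardedalloc implementation: `example_calloc`, `example_calloc_array` and `ExamplePoint` are hypothetical names, and `std::calloc` stands in for the real allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

/* Hypothetical stand-in for the C-style signature: untyped, zero-initialized. */
inline void *example_calloc(std::size_t len)
{
  return std::calloc(1, len);
}

/* Hypothetical type-safe version: rejects, at compile time, types for which
 * zero-filled memory without a constructor call would be unsafe. */
template<typename T> inline T *example_calloc()
{
  static_assert(std::is_trivial_v<T>, "Zero-initialized allocation requires a trivial type");
  return static_cast<T *>(example_calloc(sizeof(T)));
}

/* Hypothetical array version with the same compile-time check. */
template<typename T> inline T *example_calloc_array(std::size_t count)
{
  static_assert(std::is_trivial_v<T>, "Zero-initialized allocation requires a trivial type");
  return static_cast<T *>(std::calloc(count, sizeof(T)));
}

/* A trivial struct that passes the check. */
struct ExamplePoint {
  float x, y, z;
  int flag;
};
```

With this shape, `example_calloc<ExamplePoint>()` returns a typed pointer without a cast at the call site, while a type with a user-provided constructor would fail to compile instead of being silently zero-filled.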
Pull Request: https://projects.blender.org/blender/blender/pulls/134771
I've hit this a couple of times and disabling it always worked fine for me. So
it's good to make it more obvious that there is an actual bug instead of a
missed optimization.
Pull Request: https://projects.blender.org/blender/blender/pulls/135467
When Blender is compiled with `WITH_OPENSUBDIV=Off`, Blender works
fine. However, when compiling all the static shaders, the OpenSubDiv
shaders are also compiled and fail as they rely on OpenSubDiv.
This PR fixes this by only adding the shaders when OpenSubDiv is
available.
This issue could be reproduced using the `--debug-gpu-compile-shaders`
option or running GPU test cases.
Pull Request: https://projects.blender.org/blender/blender/pulls/135285
Move the `StaticShader` class from Workbench to `GPU_shader` and make
compilation thread-safe (Shader usage is still not thread-safe).
Use `StaticShader`s for all shader caches.
Subdivision shaders are still not ported.
(Part of #134690)
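The idea of thread-safe compilation for a statically declared shader can be sketched as below. This is only an illustration under assumptions: `ExampleStaticShader`, `ExampleShader` and `example_compile_from_threads` are hypothetical names, and "compilation" is reduced to constructing a small struct. As in the description above, only the lazy compilation is guarded; using the returned shader from several threads is not made safe by this.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

/* Hypothetical stand-in for a compiled shader handle. */
struct ExampleShader {
  std::string name;
};

/* Hypothetical lazily compiled shader: the first caller compiles, later
 * callers reuse the result. The mutex only protects the compilation step. */
class ExampleStaticShader {
 private:
  std::string name_;
  std::mutex mutex_;
  std::unique_ptr<ExampleShader> shader_;
  int compile_count_ = 0;

 public:
  explicit ExampleStaticShader(std::string name) : name_(std::move(name)) {}

  ExampleShader *get()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (!shader_) {
      /* Placeholder for the actual (expensive) shader compilation. */
      shader_ = std::make_unique<ExampleShader>(ExampleShader{name_});
      compile_count_++;
    }
    return shader_.get();
  }

  int compile_count()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return compile_count_;
  }
};

/* Request the same shader from several threads and report how many times it
 * was actually compiled. */
inline int example_compile_from_threads(ExampleStaticShader &shader, int thread_count)
{
  std::vector<std::thread> threads;
  for (int i = 0; i < thread_count; i++) {
    threads.emplace_back([&shader]() { shader.get(); });
  }
  for (std::thread &thread : threads) {
    thread.join();
  }
  return shader.compile_count();
}
```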
Pull Request: https://projects.blender.org/blender/blender/pulls/134812
This PR migrates the subdiv_patch_evaluation_comp.glsl to use
shader create info.
The part of OSD that is used is included as a typedef source (osd_patch_basis.glsl).
Pull Request: https://projects.blender.org/blender/blender/pulls/134917
This patch refactors GPU shaders to remove includes to the utility
gpu_shader_common_math.glsl file. This is done because it has duplicate
functions that exist in other files, and it was really created for use
in GPU material nodes.
The safe_divide and hypot functions were removed since they exist in
gpu_shader_math_base_lib.glsl.
The compatible_[mod|pow] and wrap functions were moved into
gpu_shader_math_base_lib.glsl.
The floor_to_int function was inlined since it was trivial and only used
in one place.
The quick_floor was removed because it was unused.
The euler_to_mat3 function was replaced with the from_rotation function
from gpu_shader_math_matrix_lib.glsl.
Now the file only contains some GPU material node utility functions.
Pull Request: https://projects.blender.org/blender/blender/pulls/135160
This patch removes common_math_utils includes from compositor shaders
and replaces them with math lib includes. This involves moving some
functions from that file to the math lib files.
Pull Request: https://projects.blender.org/blender/blender/pulls/135157
And slightly simplify two string processing functions in this API,
`GPU_vertformat_safe_attr_name` and `copy_attr_name`.
This makes the API easier to interface with from C++ code,
and can avoid unnecessary string length measurements.
Pull Request: https://projects.blender.org/blender/blender/pulls/134882
Move the code dealing with converting float3 to GPU normals
out of the vertex format header into a separate header. Use a
proper C++ namespace and remove duplication by only using
the more recently added C++ templated conversions.
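A templated float-to-packed-normal conversion in its own namespace could look roughly like the sketch below. The namespace `example::gpu_normals`, the `float3`/`short3` structs and both function names are hypothetical, and the clamping and rounding rules are assumptions for illustration; they are not taken from the actual header.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <type_traits>

namespace example::gpu_normals {

/* Hypothetical minimal vector types for the sketch. */
struct float3 {
  float x, y, z;
};
struct short3 {
  std::int16_t x, y, z;
};

/* Map a float in [-1, 1] to the symmetric range of a small signed integer
 * type. Values outside the range are clamped first. */
template<typename IntT> inline IntT unit_float_to_signed_int(const float value)
{
  static_assert(std::is_integral_v<IntT> && std::is_signed_v<IntT>, "Expected a signed integer");
  static_assert(sizeof(IntT) <= 2, "Sketch only covers 8 and 16 bit targets");
  const float clamped = std::clamp(value, -1.0f, 1.0f);
  return static_cast<IntT>(std::lround(clamped * float(std::numeric_limits<IntT>::max())));
}

/* Convert a float normal to three 16 bit signed normalized components. */
inline short3 convert_normal(const float3 &normal)
{
  return {unit_float_to_signed_int<std::int16_t>(normal.x),
          unit_float_to_signed_int<std::int16_t>(normal.y),
          unit_float_to_signed_int<std::int16_t>(normal.z)};
}

}  // namespace example::gpu_normals
```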
Most of the diff comes from the removal of the indirect includes
from GPU_vertex_format.hh. A lot of files ended up mistakenly
depending on that.
Pull Request: https://projects.blender.org/blender/blender/pulls/134873
The previous fix 8f00f068ad
doesn't work as the printf buffer gets recreated.
Ensure render boundaries at a lower level and do the printf
flush manually.
Subdivision shaders currently fail to compile using Metal as it doesn't recognize
packed_float3 as an internal data type. This PR includes packed_float3 as an
internal data type.
Without this, `blender --debug-gpu-compile-shaders` fails because the generated type name includes a namespace prefix.
```
ERROR (gpu.shader): subdiv_normals_accumulate Compute Shader:
|
| source/blender/gpu/metal/mtl_shader_generator.mm:971:9: Error: no type named 'packed_float3' in 'MTLShaderComputeImpl'; did you mean simply 'packed_float3'?
|
| device MTLShaderComputeImpl::packed_float3* normals[[buffer(MTL_storage_buffer_base_index+4)]],
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| packed_float3
|
| /System/Library/PrivateFrameworks/GPUCompiler.framework/Versions/32023/Libraries/lib/clang/32023.196/include/metal/metal_packed_vector:145:58: Note: 'packed_float3' declared here
|
| typedef __attribute__((__packed_vector_type__(3))) float packed_float3;
| ^
```
Pull Request: https://projects.blender.org/blender/blender/pulls/134925
Followup to 48e26c3afe, and discussions in !134771 about keeping
'C-style' and 'C++ template type-safe style' implementations of our
guardedalloc separated. It also makes the `MEM_freeN<T>` code simpler.
Also skip type-checking in `MEM_freeN<T>` only with MSVC, as clang-cl on
windows-arm64 does work fine with DNA structs using
`DNA_DEFINE_CXX_METHODS`.
Pull Request: https://projects.blender.org/blender/blender/pulls/134861
This change migrates the first 2 subdiv shaders to use the ShaderCreateInfo.
Other shaders will follow in separate PRs.
- Should compile when using `WITH_GPU_SHADER_CPP_COMPILATION`
- A `subdiv_` prefix is added only to the functions related to `PosNorLoop`.
But eventually the prefix should also be added to other lib functions.
- Due to Metal restrictions `subdiv_set_vertex_*` is implemented using a
functional paradigm. Our Metal backend only supports the `inout` qualifier
on thread-local data structures.
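The difference between the two styles mentioned in the last point can be sketched in C++ as below. `ExampleVertex` and both functions are hypothetical and only illustrate the pattern; the real functions are GLSL and operate on the subdivision buffers.

```cpp
#include <cassert>

/* Hypothetical vertex layout for the sketch. */
struct ExampleVertex {
  float position[3];
  float normal[3];
};

/* Mutating style, comparable to an `inout` parameter: the callee writes
 * through a reference to the caller's data. */
inline void example_set_position_inout(ExampleVertex &vertex, float x, float y, float z)
{
  vertex.position[0] = x;
  vertex.position[1] = y;
  vertex.position[2] = z;
}

/* Functional style: the callee receives a copy and returns the updated value.
 * The caller decides where the result is stored. */
inline ExampleVertex example_set_position(ExampleVertex vertex, float x, float y, float z)
{
  vertex.position[0] = x;
  vertex.position[1] = y;
  vertex.position[2] = z;
  return vertex;
}
```

In the functional form a call site reads like `buffer[i] = example_set_position(buffer[i], x, y, z);`, so the write to non-local memory happens in the caller rather than through a reference passed into the function.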
Pull Request: https://projects.blender.org/blender/blender/pulls/134218
When compiling shaders using GCC there are warnings about functions
being declared twice. This PR removes those warnings as they are
false positives. The warnings exist to identify typing errors.
Pull Request: https://projects.blender.org/blender/blender/pulls/134832
Though "Point Cloud" written as two words is technically correct and should be used in the UI, as one word it's typically easier to write and parse when reading. We had a mix of both before this patch, so better to unify this as well.
This commit also renames the editor/intern/ files to remove the pointcloud_ prefix.
point_cloud was only preserved in the user-facing strings:
* is_type_point_cloud
* use_new_point_cloud_type
Pull Request: https://projects.blender.org/blender/blender/pulls/134803
Add a `--profile-gpu` launch argument.
When set, it generates a profile in the Trace Event Format with CPU and
GPU metrics based on GPU debug scopes.
https://profilerpedia.markhansen.co.nz/formats/trace-event-format/
The profiles are best viewed at https://ui.perfetto.dev/
Notes:
- The profiler captures everything from app start to exit.
- Being JSON based the profiles can become relatively large, but they
compress very well.
- Only OpenGL profiling is supported for now, but the report formatting
code can be shared across backends.
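For readers unfamiliar with the Trace Event Format, a minimal writer for "complete" events (`"ph":"X"`, with `ts` and `dur` in microseconds) might look like the sketch below. `ExampleScope` and `example_trace_json` are hypothetical names, the fixed `pid` is an assumption, and scope names are written without JSON escaping to keep the sketch short. This is not the report code added by this change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

/* Hypothetical record for one timed debug scope. */
struct ExampleScope {
  std::string name;
  std::int64_t start_us;
  std::int64_t duration_us;
  int tid;
};

/* Serialize scopes as a JSON array of complete ("X") trace events.
 * Assumes names contain no characters that need JSON escaping. */
inline std::string example_trace_json(const std::vector<ExampleScope> &scopes)
{
  std::ostringstream out;
  out << "[";
  for (std::size_t i = 0; i < scopes.size(); i++) {
    const ExampleScope &scope = scopes[i];
    if (i != 0) {
      out << ",";
    }
    out << "{\"name\":\"" << scope.name << "\",\"ph\":\"X\",\"ts\":" << scope.start_us
        << ",\"dur\":" << scope.duration_us << ",\"pid\":1,\"tid\":" << scope.tid << "}";
  }
  out << "]";
  return out.str();
}
```

A file containing such an array can be opened in a trace viewer; CPU and GPU scopes could be separated by giving them different `tid` values.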
Pull Request: https://projects.blender.org/blender/blender/pulls/133557
The Vulkan backend was implemented with async in mind, however the one place
where Blender uses it asynchronously was implemented as a blocking call. This PR splits the
readback into flushing the commands and waiting for the readback.
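The split into a submit step and a wait step can be illustrated with the sketch below. It is a CPU-only analogy under assumptions: `ExampleReadback` is a hypothetical class, and `std::async` stands in for submitting commands to the GPU. It does not use the Vulkan backend API.

```cpp
#include <cassert>
#include <future>
#include <utility>
#include <vector>

/* Hypothetical readback helper that separates submission from waiting. */
class ExampleReadback {
 private:
  std::future<std::vector<float>> pending_;

 public:
  /* Step 1: submit the work and return immediately. The background task is a
   * placeholder for the GPU copying data into a host-visible buffer. */
  void flush(std::vector<float> gpu_data)
  {
    pending_ = std::async(std::launch::async,
                          [data = std::move(gpu_data)]() { return data; });
  }

  /* Step 2: block only at the point where the result is actually needed. */
  std::vector<float> wait()
  {
    return pending_.get();
  }
};
```

Between `flush()` and `wait()` the caller is free to do other work, which is where a gain over a single blocking call would come from.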
**Performance**
Improvement of animation playback performance of shader balls.blend is around 10%.
Shader balls.blend frame: 1-100, 10 x animation playback
| Branch | Total time | Average time |
| -------------------- | ---------- | ------------ |
| blender-v4.4-release | 26851 ms | 2685 ms |
| This PR | 23675 ms | 2367 ms |
Pull Request: https://projects.blender.org/blender/blender/pulls/134227