I've hit this a couple of times and disabling it always worked fine for me. So
it's good to make it more obvious that there is an actual bug instead of a
missed optimization.
Pull Request: https://projects.blender.org/blender/blender/pulls/135467
When blender is compiled with `WITH_OPENSUBDIV=Off` Blender just works
fine. However when compiling all the static shaders the OpenSubDiv
shaders are also compiled and fail as they rely on OpenSubDiv.
This PR fixes this by only adding the shaders when OpenSubDiv is
available.
This issue could be reproduced using the `--debug-gpu-compile-shaders`
option or running GPU test cases.
Pull Request: https://projects.blender.org/blender/blender/pulls/135285
Move the `StaticShader` class from Workbench to `GPU_shader` and make
compilation thread-safe (Shader usage is still not thread-safe).
Use `StaticShader`s for all shader caches.
Subdivision shaders are still not ported.
(Part of #134690)
Pull Request: https://projects.blender.org/blender/blender/pulls/134812
This PR migrates the subdiv_patch_evaluation_comp.glsl to use
shader create info.
The part of OSD that is used is included as a typedef source (osd_patch_basis.glsl).
Pull Request: https://projects.blender.org/blender/blender/pulls/134917
This patch refactors GPU shaders to remove includes to the utility
gpu_shader_common_math.glsl file. This is done because it has duplicate
functions that exist in other files, and it was really created for use
in GPU material nodes.
The safe_divide and hypot functions were removed since they exist in
gpu_shader_math_base_lib.glsl.
The compatible_[mod|pow] and wrap functions were moved into
gpu_shader_math_base_lib.glsl.
The floor_to_int function was inlined since it was trivial and only used
in one place.
The quick_floor was removed because it was unused.
The euler_to_mat3 function was replaced with the from_rotation function
from gpu_shader_math_matrix_lib.glsl.
Now the file only contains some GPU material node utility functions.
Pull Request: https://projects.blender.org/blender/blender/pulls/135160
This patches removes common_math_utils includes from compositor shaders
and replaces them with math lib includes. This involves moving some
functions from that file to to the math lib files.
Pull Request: https://projects.blender.org/blender/blender/pulls/135157
And slightly simplify two string processing functions in this API,
`GPU_vertformat_safe_attr_name` and `copy_attr_name`.
This makes the API easier to interface with from C++ code,
and can avoid unnecessary string length measurements.
Pull Request: https://projects.blender.org/blender/blender/pulls/134882
Move the code dealing with converting float3 to GPU normals
out of the vertex format header into a separate header. Use a
proper C++ namespace and remove duplication by only using
the more recently added C++ templated conversions.
Most of the diff comes from the removal of the indirect includes
from GPU_vertex_format.hh. A lot of files ended up mistakenly
depending on that.
Pull Request: https://projects.blender.org/blender/blender/pulls/134873
The previous fix 8f00f068ad
doesn't work as the printf buffer gets recreated.
Ensure render boundaries at lower level and do the printf
flush manually.
Subdivision shaders currently fail to compile using Metal as it doesn't recognize
packed_float3 as an internal data type. This PR includes packed_float3 as an
internal data type.
Without this `blender --debug-gpu-compile-shaders` will fail as it includes a namespace.
```
ERROR (gpu.shader): subdiv_normals_accumulate Compute Shader:
|
| source/blender/gpu/metal/mtl_shader_generator.mm:971:9: Error: no type named 'packed_float3' in 'MTLShaderComputeImpl'; did you mean simply 'packed_float3'?
|
| device MTLShaderComputeImpl::packed_float3* normals[[buffer(MTL_storage_buffer_base_index+4)]],
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| packed_float3
|
| /System/Library/PrivateFrameworks/GPUCompiler.framework/Versions/32023/Libraries/lib/clang/32023.196/include/metal/metal_packed_vector:145:58: Note: 'packed_float3' declared here
|
| typedef __attribute__((__packed_vector_type__(3))) float packed_float3;
| ^
```
Pull Request: https://projects.blender.org/blender/blender/pulls/134925
Followup to 48e26c3afe, and discussions in !134771 about keeping
'C-style' and 'C++ template type-safe style' implementations of our
guardedalloc separated. And it makes `MEM_freeN<T>` code simpler.
Also skip type-checking in `MEM_freeN<T>` only with MSVC, as clang-cl on
windows-arm64 does work fine with DNA structs using
`DNA_DEFINE_CXX_METHODS`.
Pull Request: https://projects.blender.org/blender/blender/pulls/134861
This change migrates the first 2 subdiv shaders to use the ShaderCreateInfo.
Other shaders will follow in separate PRs.
- Should compile when using `WITH_GPU_SHADER_CPP_COMPILATION`
- A `subdiv_` prefix is added only to the functions related to `PosNorLoop`.
But eventually the prefix should also be added to other lib functions.
- Due to Metal restrictions `subdiv_set_vertex_*` is implemented using a
functional paradigma. Our Metal backend only supports `inout` qualifier
on thead local data structures.
Pull Request: https://projects.blender.org/blender/blender/pulls/134218
When compiling shaders using GCC there are warnings about functions
being declared twice. This PR will remove those warnings as they are
false positives. The warnings exists to identify typing errors.
Pull Request: https://projects.blender.org/blender/blender/pulls/134832
Though "Point Cloud" written as two words is technically correct and should be used in the UI, as one word it's typically easier to write and parse when reading. We had a mix of both before this patch, so better to unify this as well.
This commit also renames the editor/intern/ files to remove pointcloud_ prefix.
point_cloud was only preserved on the user facing strings:
* is_type_point_cloud
* use_new_point_cloud_type
Pull Request: https://projects.blender.org/blender/blender/pulls/134803
Add a `--profile-gpu` launch argument.
When set, it generates a profile in the Trace Event Format with CPU and
GPU metrics based on GPU debug scopes.
https://profilerpedia.markhansen.co.nz/formats/trace-event-format/
The profiles are best viewed at https://ui.perfetto.dev/
Notes:
- The profiler captures everything form app start to exit.
- Being JSON based the profiles can become relatively large, but they
compress very well.
- Only OpenGL profiling is supported for now, but the report formatting
code can be shared across backends.
Pull Request: https://projects.blender.org/blender/blender/pulls/133557
The vulkan backend was implemented with async in mind, however the one place
where Blender uses for async was implemented blocking. This PR splits the
readback into flushing the command and waiting for readback.
**Performance**
Improvement of animation playback performance of shader balls.blend is around 10%.
Shader balls.blend frame: 1-100, 10 x animation playback
| Branch | Total time | Average time |
| -------------------- | ---------- | ------------ |
| blender-v4.4-release | 26851 ms | 2685 ms |
| This PR | 23675 ms | 2367 ms |
Pull Request: https://projects.blender.org/blender/blender/pulls/134227
This commit adds the "is volume scatter" output to the light path node
in the shader editor.
All the funcitonal code for this feature already exists in Cycles SVM
and OSL, but the output wasn't exposed on the node.
EEVEE does not support the feature, so it's output will
always be zero.
Pull Request: https://projects.blender.org/blender/blender/pulls/134343
This patch adds the texture pool functionality that was previously
only available in the DRW module to the GPU module.
This allows to not rely on global `DST` variable for the managment
of these temporary textures.
Moreover, this can be extended using dedicated GPU backend
specific behavior to reduce the amount of memory needed
to render.
The implementation is mostly copy pasted from the draw implementation
but with more documentation. Also it is simplified since the
`DRW_texture_pool_query` functionality is not needed.
Pull Request: https://projects.blender.org/blender/blender/pulls/134403
There seems to be a pattern where this commonly failed.
This patch adds the async flush (which is effectively not async)
when there were no previous call to `async_flush_to_host`.
This is only done on Intel Macs (or any mac that has non
unified memory arch).
Pull Request: https://projects.blender.org/blender/blender/pulls/134216
The removal of the loose uniform made the shader not compile.
This patch adds a new define for these type of shaders and add
back the loose uniform.
Note that these shaders might no longer work on Metal as
the source is not parsed anymore.
Pull Request: https://projects.blender.org/blender/blender/pulls/134341
Use sub-pixel differentials for bump mapping helps with reducing
artifacts when objects are moving or when textures have high frequency
details.
Currently we scale it by 0.1 because it seems to work good in practice,
we can adjust the value in the future if it turns out to be impractical.
Ref: #122892
Pull Request: https://projects.blender.org/blender/blender/pulls/133991