Avoid measuring the length of strings repeatedly by passing their
length along with their data with `StringRefNull`. Null termination
seems to be necessary still for passing the shader sources to OpenGL.
Though I doubt this is a bottleneck, it's still nice to avoid overhead from
string operations and this helps move in that direction.
Pull Request: https://projects.blender.org/blender/blender/pulls/127702
Running Xcode memory graphs and the Instruments tools revealed
memory leaks caused, in the main, by over-retained objects.
This removes the unnecessary 'retains' and adds some asserts
to guard against over-retaining in the future.
There are a few memory leaks remaining involving PyUnicode_DecodeUTF8
but I am unable to identify the cause of these at this time.
Authored by Apple: James McCarthy
Pull Request: https://projects.blender.org/blender/blender/pulls/129117
This removes the need for the geometry shader and the
workaround path for Metal.
Note that creating 2 batches for each stroke might become
a bottleneck in bigger scenes. But currently the bottleneck
is always be the fill algorithm. It can be optimized further
if needed.
Rel #127493
Pull Request: https://projects.blender.org/blender/blender/pulls/129274
Addresses the case when Blender is shutdown before the
parallel compiler has finished processing all the shader batches.
The parallel compiler destructor will now attempt to terminate all
of the outstanding batches and free the shaders.
Authored by Apple: James McCarthy
Pull Request: https://projects.blender.org/blender/blender/pulls/129172
The issue was that the shader `gpu_shader_gpencil_stroke_vert_no_geom.glsl`
assumed a wrong format of the color attribute (`uchar4` instead of `float4`).
The fix uses `vertex_fetch_attribute` with `float4`.
Pull Request: https://projects.blender.org/blender/blender/pulls/129072
We cannot do this for the MSL files. Also
they are written in MSL which should not be
processed. So use datatoc for them.
Also add quote linting for GLSL files.
The goal is to reduce the startup time cost of
all of these parsing and string replacement.
All comments are now stripped at compile time.
This comment check added noticeable slowdown at
startup in debug builds and during preprocessing.
Put all metadatas between start and end token.
Use very simple parsing using `StringRef` and
hash all identifiers.
Move all the complexity to the preprocessor that
massagess the metadata into a well expected input
to the runtime parser.
All identifiers are compile time hashed so that no string
comparison is made at runtime.
Speed up the source loading:
- from 10ms to 1.6ms (6.25x speedup) in release
- from 194ms to 6ms (32.3x speedup) in debug
Follow up #129009
Pull Request: https://projects.blender.org/blender/blender/pulls/128927
This avoid cmake shenanigans to try to make proper
dependency tracking.
The previous code was not tracking changes inside
the create info files.
There is no real benefit for having these headers listed in
the cmakefile itself.
Pull Request: https://projects.blender.org/blender/blender/pulls/129027
WITH_VULKAN_GUARDEDALLOC is a development option to use Blenders guarded
allocator when allocating internal vulkan driver resources. It does not provide any benefits
as this should be covered by vulkan validation and drivers are often ignoring this. This
change will remove the option from cmake and source code.
Pull Request: https://projects.blender.org/blender/blender/pulls/129039
Intel Windows drivers for 10th gen and lower has some strange behavior
when using dynamic rendering. It requires pipeline conditions to be met,
when beginning a new rendering scope. This is strange as specs + VVL
notes that these conditions should be met during vkCmdDraw* commands.
Pull Request: https://projects.blender.org/blender/blender/pulls/129055
This change makes unused attachments extension optional.
This extension is fairly new and not all drivers have support for it.
The workaround will create additional pipelines when attachments are
not set.
Pull Request: https://projects.blender.org/blender/blender/pulls/129046
Reported by @fclem. On Metal GPU_finish enacts a CPU<-> GPU sync by
submitting a command buffer and waiting on the completion event.
However if there was no command buffer to submit then the call was
returning immediately, regardless of any outstanding GPU work.
This fixes that case by keeping track of all outstanding work and
blocking on that.
Authored by Apple: James McCarthy
Co-authored-by: James McCarthy <jamesmccarthy@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/128987
VK_EXT_dynamic_rendering_unused_attachments is required for correct working.
Renderdoc hides this extension, but most platforms do work. However the
Windows Intel driver crashes when using iGPUs; they don't support this
extension at all.
This change does a more strict extension test so drivers that do not
support this extension will fallback to OpenGL. When using renderdoc it
is now allowed to compile blender with `WITH_RENDERDOC=On`.
Future developments are needed to add support for Intel iGPUs on
Windows.
Pull Request: https://projects.blender.org/blender/blender/pulls/128986
Resources are shared, when running multiple contexts on the same thread.
Cycles uses the same context on multiple threads and expected same resources.
This change will introduce a single render graph per context and an updated
resource management. Render graphs are not shared anymore; Resource pools
are still shared, but garbage collection depends on the thread and if
background rendering is used.
Pull Request: https://projects.blender.org/blender/blender/pulls/128983
Cycles uses multiple threads to send commands to the GPU. The current
command buffer structure assumed that all commands from the same context
were send via the same thread. This wasn't the case and could lead to
recording commands to command buffers that are still pending (preparing
commands to send to GPU).
This is fixed by creating a command buffer each time a render graph
submits its work.
Detected when researching #128608
Pull Request: https://projects.blender.org/blender/blender/pulls/128978
When allocating a host visible buffer it could be that the returned
buffer was not host visible and access to the buffer would write to
unallocated memory.
Detected when researching #128608
Pull Request: https://projects.blender.org/blender/blender/pulls/128977