Previous approach was not clear enough and caused problems.
UBOs were taking slots and not release them after a shading group even
if this UBO was only for this Shading Group (notably the nodetree ubo,
since we now share the same GPUShader for identical trees).
So I choose to have a better defined approach:
- Standard texture and ubo calls are assured to be valid for the shgrp
they are called from.
- (new) Persistent texture and ubo calls are assured to be valid accross
shgrps unless the shader changes.
The standards calls are still valids for the next shgrp but are not assured
to be so if this new shgrp binds a new texture.
This enables some optimisations by not adding redundant texture and ubo
binds.
Basically, don't do full in-place copy of node tree datablock if it's
already expanded. Current way how node tree is evaluated is fully
built around the idea that evaluation copies values from original
to copied datablocks.
Changing links is handled on another level.
Old solution was to create a new vbo and copy it to the location of
the old vbo hoping for the batches to update their vaos before drawing.
The issue is that the new VAO caching is not updating the VAOs at all
unless the shader interface changes.
So unless we expand gawain to support updating of vertex buffers (in a
better way than the current "dynamic" buffer) we need to delete every batch
linked to the vbo we want to recreate.
This solution might have performance implications.
Requires BLI_utildefines.h to be included first,
(already noted in other inline code).
Possible alternative could be to move BLI_assert into own header.
For IDProps IDarray, IDP_EqualsProperties was called for each item,
instead of IDP_EqualsProperties_ex, discarding value of `is_strict`
option.
Probably not an issue with current code, though.
The issue was only visible with copy-on-write enabled, and related to the
fact, that dependency graph builder binds original FCurves.
For now use smallest patch possible to make things to work and to make
draguu happy.
Need to think of a smarter way to deal with drivers, bones and view layers.
I tried to use real multisampling but the main problem is the outline detection that needs to have matching depth samples.
So adding FXAA instead. Always on for now, may add a parameter for it later.
One thing to note is that we need to copy the final output once again to the main color buffer because we cannot swap the dtxl textures (they can be referenced elsewhere like GPUOffscreen).
We could improve upon this and add TAA on top if viewport is still.
This means that rendering clay with AO only needs 1 geometry pass.
Thus greatly improving performance of poly heavy scene.
This also fix a self shadow issue in the AO, making low sample count
way better.
We also do not need to blit the depth anymore since we
are doing a fullscreen shading pass.
The constant cost of running the a deferred shading pass is negligeable.
This include quite a bit of code cleanup inside clay_engine.c.
The deferred pipeline is only enabled if at least one material needs it.
Multisampling is not supported yet.
Small hacks when doing deferred:
- We invert the normal before encoding it for precision.
- We put the facing direction into the sign of the mat_id.
- We dither the normal to fight the low bitdepth artifacts of the normal
buffer (which is 8bits per channel to reduce bandwidth usage).
Sequencer rendering can use multisample render targets. Be sure to sync
thoses after rendering.
Also disable the sample loop when not needed.
Do note that currently the color correction is broken with the sequencer.
Looks like someone changed the signature of some RNA update callback,
and for some reason that 'change skey' update function was not updated
(or later got merged from master)...
We'll need RNA to check for its func signatures, some day...
When adding scene strips to the sequencer, the wrong scenes were
getting getting added if some were skipped. For example:
Given 4 scenes (A, B, C, D) if you're trying to add the last 3 scenes
(B, C, D) as strips to the first scene (A), it would ended up adding
"A, B, C" instead of "B, C, D" as expected.
Fix provided by Andrew (signal9).
Calling the rendering operator seems to kill any other WM_job running, leaving
uncompiled materials into a GPU_MAT_QUEUED state. This then made the probe update
looping indefinitely (all_materials_updated remaining to false).
To fix this, we resume compilation for materials that are in this state.
Cancelling Render before all material compilation could make certain material
remain uncompiled. Fortunately, this is not allowed as of now.