This will only be noticeable when drawing many instances.
In a contrived use case with many instances and `USE_PROFILE` disabled,
this can nearly double playback FPS.
The option to disable this is left in the code in case we want to
debug memory use.
See D2756 for details.
The old performance debug code issued queries every frame, even when not debugging performance.
It also did not record when a pass was drawn multiple times, leading to incorrect measurements.
The new module also allows grouping the timers to limit the amount of information displayed.
Also fix the background CPU draw timer.
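
The accumulation and grouping described above can be sketched roughly as follows. This is a minimal CPU-side illustration, not the actual module: `TimerStats`, `PassTimer`, `stats_record()` and `stats_group_total()` are hypothetical names, and the elapsed time is passed in directly instead of being read back from a GPU query.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_TIMERS 32 /* Hypothetical fixed capacity for this sketch. */

typedef struct PassTimer {
  const char *group; /* Group name used to collapse the display. */
  const char *name;  /* Pass name. */
  double elapsed_ms; /* Accumulated time over every draw of the pass. */
  int draw_count;    /* How many times the pass was drawn this frame. */
} PassTimer;

typedef struct TimerStats {
  bool enabled; /* When false, nothing is recorded at all. */
  int len;
  PassTimer timers[MAX_TIMERS];
} TimerStats;

/* Record one draw of a pass. Repeated draws accumulate instead of overwriting. */
void stats_record(TimerStats *stats, const char *group, const char *name, double elapsed_ms)
{
  if (!stats->enabled) {
    return; /* No per-frame cost when not debugging performance. */
  }
  for (int i = 0; i < stats->len; i++) {
    PassTimer *timer = &stats->timers[i];
    if (strcmp(timer->group, group) == 0 && strcmp(timer->name, name) == 0) {
      timer->elapsed_ms += elapsed_ms;
      timer->draw_count++;
      return;
    }
  }
  if (stats->len < MAX_TIMERS) {
    PassTimer *timer = &stats->timers[stats->len++];
    timer->group = group;
    timer->name = name;
    timer->elapsed_ms = elapsed_ms;
    timer->draw_count = 1;
  }
}

/* Sum every timer of a group, so only one line per group needs to be shown. */
double stats_group_total(const TimerStats *stats, const char *group)
{
  double total = 0.0;
  for (int i = 0; i < stats->len; i++) {
    if (strcmp(stats->timers[i].group, group) == 0) {
      total += stats->timers[i].elapsed_ms;
    }
  }
  return total;
}
```

With `enabled` left false, `stats_record()` returns immediately, which mirrors the goal of not paying for queries when perf debugging is off.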
For users that means you can tweak shaders in the nodetree and things
are way faster. This is a huge improvement, particularly in
systems that have no shader cache.
From the code perspective it means we are no longer re-compiling the
shader every time a value is tweaked in the UI. We are using uniforms
for those values.
It would be slow to add that many uniforms for all the shaders. So
instead we are using UBOs (Uniform Buffer Objects).
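
As a rough illustration of the idea, the sketch below keeps node values in one block of data that is copied on bind, so tweaking a value never triggers a recompile. Everything here is hypothetical (`Material`, `NodeValueBlock`, `material_set_value()`, `material_bind()`), and the "GPU" side is simulated with a plain memory copy; in real code the upload would be a buffer update on the UBO.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_NODE_VALUES 16 /* Hypothetical limit for this sketch. */

/* One slot per node value. Slots are 4 floats wide, following the
 * 16-byte alignment that std140 uses for vec3/vec4 members. */
typedef struct NodeValueBlock {
  float values[MAX_NODE_VALUES][4];
} NodeValueBlock;

typedef struct Material {
  int compile_count;        /* How many times the shader was compiled. */
  bool ubo_dirty;           /* CPU copy changed since the last upload. */
  NodeValueBlock cpu_block; /* Values edited from the UI. */
  NodeValueBlock gpu_block; /* Stand-in for the buffer stored on the GPU. */
} Material;

/* Only needed when the nodetree structure changes, not on value tweaks. */
void material_compile(Material *mat)
{
  mat->compile_count++;
  mat->ubo_dirty = true;
}

/* Tweaking a value only touches the CPU copy and flags it for upload. */
bool material_set_value(Material *mat, int slot, const float value[4])
{
  if (slot < 0 || slot >= MAX_NODE_VALUES) {
    return false;
  }
  memcpy(mat->cpu_block.values[slot], value, sizeof(float[4]));
  mat->ubo_dirty = true;
  return true;
}

/* On bind, upload the whole block once if anything changed. */
void material_bind(Material *mat)
{
  if (mat->ubo_dirty) {
    memcpy(&mat->gpu_block, &mat->cpu_block, sizeof(NodeValueBlock));
    mat->ubo_dirty = false;
  }
}
```

The point of the design is that any number of tweaks between two binds costs a single block upload and zero compilations.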
This fixes the main issue of T51467. However, GWN_shaderinterface_create() still
needs to be improved. When opening a .blend all shaders are compiled once, so
optimizing it will have a measurable impact.
========================================================================
NOTE: This breaks updating of Cycles materials when tweaking nodetree
nodes. It will be fixed separately by depsgraph, once tackling T51925
(Animated Eevee values slowdown).
The idea is to make Depsgraph update more granular. The XXX TODO in
rna_nodetree.c will be tackled at that time as well.
========================================================================
Reviewers: sergey, brecht, fclem
Differential Revision: https://developer.blender.org/D2739
UVs need specific data in the VBO, which is not computed unless the
shaders assigned to the mesh actually use UVs. When adding UVs to the
shader, the VBOs were not being recomputed to include the required data.
This adds a DEG relation between the shader and the mesh, and recomputes
the required data if the shader changed.
Thanks Sergey, for all the DEG stuff...
Read from the GPUMaterial to find custom-data layers used for drawing.
This resolves a problem where having UVs would always calculate tangents,
causing a noticeable slowdown compared to 2.7x.
This also renames some flags/variables to be more generic for updating
purposes. The call used here was previously only used for updating
paint data, but as it was reused here, flags and variables were renamed
to accommodate the new usages more clearly.
This commit introduces the computation of a depth pyramid containing the min and max depth values of the original depth buffer.
This is useful for Clustered Light Culling but also for raytracing on the depth buffer (SSR).
It's also useful to be able to fetch higher mips in order to improve texture cache usage.
As of now, the 1st mip (highest res) is half the resolution of the depth buffer, but everything is already in place to make a full-res copy of the depth buffer in the 1st mip instead of downsampling.
Also, the texture format used is RG_32F, which is more than needed to cover the 24 bits of the depth buffer. Reducing the texture size would make this noticeably faster.
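
For reference, the min/max reduction can be sketched on the CPU as below. This is only an illustration of the technique, not the shader used in the commit: the function and type names are hypothetical, and the sketch assumes even dimensions at every level so each output texel covers exactly a 2x2 block.

```c
/* One texel of the pyramid: the two channels of the RG texture. */
typedef struct DepthMinMax {
  float min;
  float max;
} DepthMinMax;

/* Build the 1st mip (half resolution) from the full-res depth buffer.
 * Assumes `width` and `height` are even. */
void depth_minmax_from_depth(const float *depth, int width, int height, DepthMinMax *out)
{
  const int out_width = width / 2;
  const int out_height = height / 2;
  for (int y = 0; y < out_height; y++) {
    for (int x = 0; x < out_width; x++) {
      float lo = depth[(2 * y) * width + 2 * x];
      float hi = lo;
      for (int dy = 0; dy < 2; dy++) {
        for (int dx = 0; dx < 2; dx++) {
          const float d = depth[(2 * y + dy) * width + (2 * x + dx)];
          lo = (d < lo) ? d : lo;
          hi = (d > hi) ? d : hi;
        }
      }
      out[y * out_width + x].min = lo;
      out[y * out_width + x].max = hi;
    }
  }
}

/* Build the next mip from the previous one: min of mins, max of maxes.
 * Assumes `width` and `height` of the source level are even. */
void depth_minmax_downsample(const DepthMinMax *src, int width, int height, DepthMinMax *out)
{
  const int out_width = width / 2;
  const int out_height = height / 2;
  for (int y = 0; y < out_height; y++) {
    for (int x = 0; x < out_width; x++) {
      DepthMinMax result = src[(2 * y) * width + 2 * x];
      for (int dy = 0; dy < 2; dy++) {
        for (int dx = 0; dx < 2; dx++) {
          const DepthMinMax s = src[(2 * y + dy) * width + (2 * x + dx)];
          result.min = (s.min < result.min) ? s.min : result.min;
          result.max = (s.max > result.max) ? s.max : result.max;
        }
      }
      out[y * out_width + x] = result;
    }
  }
}
```

Each coarser texel then bounds the depth range of the whole region it covers, which is what lets culling or a depth-buffer raymarch skip regions without reading the full-res depth.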