This optimisation only works if no material in the scene require the AO pass.
For this either set the AO distance to 0 or both Cavity and Edges factors to 0.
This double the performance of scenes with very high triangle count.
This is an optimization / cleanup commit.
The use of a global ubo remove lots of uniform lookups and only transfert data when needed.
Lots of renaming for more consistent codestyle.
Both object level and camera datablock properties animation did not work with
copy on write enabled.
The root of the issue is going to the fact, that all interface elements are
referencing original datablock. For example, View3D has pointer to camera it's
using, and all areas which does access v3d->camera should in fact query for
the evaluated version of that camera, within the current context.
Annoying part of this change is that we now need to pass depsgraph in lots
of places. Which is rather annoying.
Alternative would be to cache evaluated camera in viewport itself, but then
it makes it annoying to keep things in sync.
Not sure if there is nicer solution here.
Reviewers: dfelinto, campbellbarton, mont29
Subscribers: dragoneex
Differential Revision: https://developer.blender.org/D3007
Sun is treated as a unit distant disk like in cycles.
Opti: Since computing the diffuse contribution via LTC is the same as not using the Linear Transformation, we can bypass most of the LTC code.
This replaces the sphere analytical diffuse computation as it gives a more pleasing result very close to cycles' AND cheaper.
Lights power have been retweaked to be coherent with cycles (except sun lamp with large radius where cycles has a non-uniform light distribution).
This is an improvement on the old spining quad method that was giving artifacts when the reflection ray was nearly aligned with the sphere center.
This might be a bit heavier but it's worth it.
This leads to a ~3ms improvement of CPU time during drawing.
This prevent the rendering from being stalled waiting for the texture data to be transfered.
This is because certain part of the engine may require a blank framebuffer to bind textures to.
This is the case when using only array textures, unsupported by DRW_framebuffer_init().
Now hashed alpha materials are stable when moving the camera/not using TAA.
It also converge to a noise free image when using TAA. No more numerical imprecision.
There still can be situations with multiple overlapping transparent surfaces that can lead to residual noise.
Using GL_RG16I texture for the hit coordinates increase tremendously the precision of the hit.
The sign of the integer is used to 2 flags (has_hit and is_planar).
We do not store the depth and retrieve it from the depth buffer (increasing bandwith by +8bit/px).
The PDF is stored into another GL_R16F texture.
We remove the raycount for simplicity and to reduce compilation time (less branching in refraction shader).
Instead of creating non temp textures only at framebuffer creation, we create them and bind them if their pointer is NULL.
This should simplify the framebuffers creation code.
The reason for the crash is still a bit confusing, but on Windows with Intel HD Graphics 4000 it always happens when you enable `Use Nodes` or when you try to connect the Pricipled Shader node to the output without the `Subsurface Scattering` and `Subsurface Translucency` options enabled.
Was due to the fact that the instances don't have a "static" obmat that can be referenced to use as a uniform.
Solution : precompute the full matrix for each bone and pass it as instance data. (theses are copied into a buffer and can be discarded right away)
Note: this could be optimized further and make only one drawcall (shgroup) to draw all bone instance of one type (vs. one call per armature).
Tests on my system with ~1200 objects with 128 shadow casting lamps (current max) show a significant perf improvment (cache timing : 22ms -> 9ms)
With a baseline with no shadow casting light at 6ms this give a reduction of the overhead from 16ms to 3ms.
This remove pretty much all allocations during the cache phase. Leading to a big improvement for scene with a large number of lights & shadowcasters.
The lamps storage has been replace by a union to remove the need to free/allocate everyframe (also reducing memory fragmentation).
We replaced the linked list system used to track shadow casters by a huge bitflag.
We gather the lights shadows bounds as well as the shadow casters AABB during the cache populate phase and put them in big arrays cache friendly.
Then in the cache finish phase, it's easier to iterate over the lamps shadow SphereBounds and test for intersection.
We use a double buffer system for the shadow casters arrays to detect deleted shadow casters.
Unfortunatly, it seems that deleting an object trigger an update for all other objects (thus tagging most shadow casting lamps to update), defeating the purpose of this tracking.
This needs further investigation.