There is even a chance the compilers handles this itself, but we should try to
use the internal storage as much as possible (and save 0.000001s in the process)
For experimental options, outside the scope of typical preferences.
While templates are developed we might want to make changes
to behavior which aren't fully compatible with typical work-flows.
Instead of mixing these options in with current preferences
expose separately (we could even force disable them when templates
aren't int use)
The code for vertical line was assuming that we necessarily neeeded vertical
lines for all the elements. Which is not true since we are not drawing
vertical and horizontal lines for collections.
Patch made in contribution with Philippe Schmid (@Quetzal).
Was due to the fact that the instances don't have a "static" obmat that can be referenced to use as a uniform.
Solution : precompute the full matrix for each bone and pass it as instance data. (theses are copied into a buffer and can be discarded right away)
Note: this could be optimized further and make only one drawcall (shgroup) to draw all bone instance of one type (vs. one call per armature).
Remove the critical OMP sections used to protect mem allocation.
First one can be done in a separate loop before main, parallelized one.
Second one only affect 'private' data, so we only need to ensure
guardedalloc thread safety is enabled.
This is committed as separated step to ease troubleshooting in case
bisecting becomes necesary.
Tests on my system with ~1200 objects with 128 shadow casting lamps (current max) show a significant perf improvment (cache timing : 22ms -> 9ms)
With a baseline with no shadow casting light at 6ms this give a reduction of the overhead from 16ms to 3ms.
This remove pretty much all allocations during the cache phase. Leading to a big improvement for scene with a large number of lights & shadowcasters.
The lamps storage has been replace by a union to remove the need to free/allocate everyframe (also reducing memory fragmentation).
We replaced the linked list system used to track shadow casters by a huge bitflag.
We gather the lights shadows bounds as well as the shadow casters AABB during the cache populate phase and put them in big arrays cache friendly.
Then in the cache finish phase, it's easier to iterate over the lamps shadow SphereBounds and test for intersection.
We use a double buffer system for the shadow casters arrays to detect deleted shadow casters.
Unfortunatly, it seems that deleting an object trigger an update for all other objects (thus tagging most shadow casting lamps to update), defeating the purpose of this tracking.
This needs further investigation.
Gives about 40% speedup of object which has simple-ish deformation applied
on top of subdivided mesh.
This might easily happen with single character animation.
Helps in cases of not very complex scenes and lots of system threads available.
A bit hard to measure change on it's own, it works best with the upcoming
changes and gives measurable improvements.
Mutex is now local to particular CCGDM, and guarding edge hash which is only
used by a single function only. There is no need to acquire read lock after
edge hash was created.
Unfortunately, we cannot perform set/unset checks on 'resolved'
properties (i.e. from actual IDProperties pointers, and not virtual RNA
placeholders)... IDProps in RNA are rather challenging topic. :|
This should fully fix T53715: 2.8: Removing keymap items no longer works