This patch fixes a 32-bit overflow that occurs on 64-bit systems due to a numeric literal being treated as 32-bit.
This patch allows for the generation of images that occupy more than 4GB of RAM, which previously caused a crash.
Reviewers: sergey
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D2975
Brushes themselves are still affected by the mask, but the viewport is not
showing the mask. This way it's easier to see details while sculpting.
Studio request by Julien Kaspar
There is even a chance the compilers handles this itself, but we should try to
use the internal storage as much as possible (and save 0.000001s in the process)
For experimental options, outside the scope of typical preferences.
While templates are developed we might want to make changes
to behavior which aren't fully compatible with typical work-flows.
Instead of mixing these options in with current preferences
expose separately (we could even force disable them when templates
aren't int use)
The code for vertical line was assuming that we necessarily neeeded vertical
lines for all the elements. Which is not true since we are not drawing
vertical and horizontal lines for collections.
Patch made in contribution with Philippe Schmid (@Quetzal).
Was due to the fact that the instances don't have a "static" obmat that can be referenced to use as a uniform.
Solution : precompute the full matrix for each bone and pass it as instance data. (theses are copied into a buffer and can be discarded right away)
Note: this could be optimized further and make only one drawcall (shgroup) to draw all bone instance of one type (vs. one call per armature).
Remove the critical OMP sections used to protect mem allocation.
First one can be done in a separate loop before main, parallelized one.
Second one only affect 'private' data, so we only need to ensure
guardedalloc thread safety is enabled.
This is committed as separated step to ease troubleshooting in case
bisecting becomes necesary.
Tests on my system with ~1200 objects with 128 shadow casting lamps (current max) show a significant perf improvment (cache timing : 22ms -> 9ms)
With a baseline with no shadow casting light at 6ms this give a reduction of the overhead from 16ms to 3ms.
This remove pretty much all allocations during the cache phase. Leading to a big improvement for scene with a large number of lights & shadowcasters.
The lamps storage has been replace by a union to remove the need to free/allocate everyframe (also reducing memory fragmentation).
We replaced the linked list system used to track shadow casters by a huge bitflag.
We gather the lights shadows bounds as well as the shadow casters AABB during the cache populate phase and put them in big arrays cache friendly.
Then in the cache finish phase, it's easier to iterate over the lamps shadow SphereBounds and test for intersection.
We use a double buffer system for the shadow casters arrays to detect deleted shadow casters.
Unfortunatly, it seems that deleting an object trigger an update for all other objects (thus tagging most shadow casting lamps to update), defeating the purpose of this tracking.
This needs further investigation.
Gives about 40% speedup of object which has simple-ish deformation applied
on top of subdivided mesh.
This might easily happen with single character animation.