971f9e1c25daeb8064b8e6dad31dc0cfa55ba3e3
This time, we have over 300% speedup! No, this is not due to the switch to BLI_task (which 'only' gives the usual 15% speedup), but to an enhancement of the algorithm: flattening the loop over the covariance matrix items now allows computing (usually) all items in parallel, instead of having at most 3 or 4 working threads (with unbalanced load, even)...
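To illustrate the idea of flattening the loop over covariance matrix items, here is a minimal sketch. It is not the code from this commit: the function names `covariance_item` and `covariance_flat`, the row-major point layout, and the choice of population covariance (dividing by the point count) are all assumptions made for the example. The flat loop is written sequentially here; since each iteration writes only its own item, it is the kind of loop that could be handed to a parallel range task.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: compute a single covariance item from its flat index.
// `points` stores `count` points of `dims` components each, contiguously.
double covariance_item(const std::vector<double> &points,
                       const std::vector<double> &mean,
                       std::size_t dims,
                       std::size_t flat_index)
{
  // Map the flat index back to a (row, column) pair of the matrix.
  const std::size_t i = flat_index / dims;
  const std::size_t j = flat_index % dims;
  const std::size_t count = points.size() / dims;

  double sum = 0.0;
  for (std::size_t p = 0; p < count; p++) {
    sum += (points[p * dims + i] - mean[i]) * (points[p * dims + j] - mean[j]);
  }
  // Assumption: population covariance (divide by count, not count - 1).
  return sum / static_cast<double>(count);
}

// Hypothetical driver: one loop over all dims * dims items instead of
// nested per-row loops, so the work splits into many small, equal units.
std::vector<double> covariance_flat(const std::vector<double> &points, std::size_t dims)
{
  assert(dims > 0 && !points.empty() && points.size() % dims == 0);
  const std::size_t count = points.size() / dims;

  std::vector<double> mean(dims, 0.0);
  for (std::size_t p = 0; p < count; p++) {
    for (std::size_t d = 0; d < dims; d++) {
      mean[d] += points[p * dims + d];
    }
  }
  for (std::size_t d = 0; d < dims; d++) {
    mean[d] /= static_cast<double>(count);
  }

  std::vector<double> cov(dims * dims);
  // Each iteration reads shared data and writes only cov[k], so the
  // iterations are independent of each other.
  for (std::size_t k = 0; k < dims * dims; k++) {
    cov[k] = covariance_item(points, mean, dims, k);
  }
  return cov;
}
```

With a nested loop parallelized over rows only, a 3x3 or 4x4 matrix offers at most 3 or 4 units of work; the flat index exposes 9 or 16 equally sized units instead, which is one plausible reading of why the load balances better.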
Languages: C++ 78%, Python 14.9%, C 2.9%, GLSL 1.9%, CMake 1.2%, Other 0.9%