Sergey Sharybin 01e8e7dc6d Task scheduler: Optimize parallel loop over lists
The goal is to address performance regression when going from
few threads to 10s of threads. On a systems with more than 32
CPU threads the benefit of threaded loop was actually harmful.

There are following tweaks now:

- The chunk size is adaptive for the number of threads, which
  minimizes scheduling overhead.

- The number of tasks is adaptive to the list size and chunk
  size.

Here comes performance comparison on the production shot:

 Number of threads        DEG time before        DEG time after
       44                     0.09                   0.02
       32                     0.055                  0.025
       16                     0.025                  0.025
       8                      0.035                  0.033
2018-11-20 14:58:17 +01:00
2018-11-15 17:16:40 +01:00
2018-10-09 07:58:06 +11:00
2013-12-24 22:57:27 +06:00
2010-10-13 14:44:22 +00:00
Description
No description provided
841 MiB
Languages
C++ 78%
Python 14.9%
C 2.9%
GLSL 1.9%
CMake 1.2%
Other 0.9%