Gives a global speedup of about 5% in smoke simulation, measured on a simple setup with two generators (one from mesh, one from particles), an obstacle and a wind field. As usual, the parallelized chunks themselves are about 15-25% quicker with BLI_task than with OpenMP.