`BM_mesh_normals_update` was converted from OMP to new parallel iterator code,
basic test with heavily subdivided cube (24.5k faces) gives:
- old OMP code: average 10ms per run.
- new BLI_task code: average 6ms per run.
So new code seems to be easily 40% quicker, in addition to getting rid of OMP. ;)
Reviewers: sergey, campbellbarton
Differential Revision: https://developer.blender.org/D2930