threading::parallel_for implements its own check to avoid threading when the data length is below the grain size, and its overhead is lower anyway, since it doesn't use a function call for every element.

Pull Request: https://projects.blender.org/blender/blender/pulls/135316
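
For illustration, the two properties described above (a grain size check that skips threading, and a callback invoked once per range rather than once per element) can be sketched roughly as follows. This is not Blender's implementation: `Range`, `parallel_for_sketch` and `doubled` are hypothetical names, the exact comparison against the grain size is an assumption, and spawning one `std::thread` per chunk stands in for a real task scheduler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

/* Hypothetical stand-in for an index range; not Blender's IndexRange. */
struct Range {
  std::int64_t start;
  std::int64_t size;
};

/* Hypothetical sketch of a grain-size-aware parallel loop. */
template<typename Fn>
void parallel_for_sketch(const Range range, const std::int64_t grain_size, const Fn &fn)
{
  assert(grain_size > 0);
  if (range.size <= 0) {
    return;
  }
  if (range.size < grain_size) {
    /* Below the grain size: run inline on the calling thread, no threading cost.
     * Whether the real check uses < or <= is an assumption here. */
    fn(range);
    return;
  }
  /* One thread per chunk keeps the sketch short; a real implementation
   * would hand the chunks to a task scheduler instead. */
  std::vector<std::thread> workers;
  for (std::int64_t offset = 0; offset < range.size; offset += grain_size) {
    const Range chunk{range.start + offset, std::min(grain_size, range.size - offset)};
    workers.emplace_back([&fn, chunk]() { fn(chunk); });
  }
  for (std::thread &worker : workers) {
    worker.join();
  }
}

/* Usage: the callback is called once per chunk and loops over its elements
 * itself, so there is no function call per element. */
std::vector<int> doubled(const std::vector<int> &src, const std::int64_t grain_size)
{
  std::vector<int> dst(src.size());
  parallel_for_sketch({0, std::int64_t(src.size())}, grain_size, [&](const Range r) {
    for (std::int64_t i = r.start; i < r.start + r.size; i++) {
      dst[i] = src[i] * 2;
    }
  });
  return dst;
}
```

With a grain size larger than the input, `doubled` runs entirely on the calling thread; with a smaller one, each chunk writes to a disjoint slice of `dst`, so no synchronization beyond the final join is needed.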