This increases the framerate in a production file from about 2.3 to 2.5
FPS, and reduces gaps in a profile where the CPU was waiting for just
a few threads to finish the BVH tree lookups. If BVH lookups become
faster in the future, this grain size could be increased.