Splitting the write loop into stages makes it easier to understand, debug, and time.
All the complex filtering in `gather_local_ids_to_write` is the same as before.
Together with some previous refactors, this also makes it much easier to have a clean
`write_id` function, which should later work for embedded IDs as well.
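
As a rough illustration of the staged structure described above, here is a minimal, self-contained sketch. It is not the actual Blender code: `FakeID`, its flags, `write_all_ids`, and the string-based output are hypothetical stand-ins, and the signatures given to `gather_local_ids_to_write` and `write_id` are assumptions made only for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

/* Hypothetical, simplified stand-in for an ID data-block (not Blender's `ID` struct). */
struct FakeID {
  std::string name;
  bool is_local;   /* Assumed flag: linked (non-local) IDs are not written here. */
  bool is_runtime; /* Assumed flag: runtime-only IDs are not written at all. */
};

/* Stage 1: all filtering happens here, so later stages only see IDs to write.
 * The signature is an assumption for this sketch. */
std::vector<const FakeID *> gather_local_ids_to_write(const std::vector<FakeID> &all_ids)
{
  std::vector<const FakeID *> ids_to_write;
  for (const FakeID &id : all_ids) {
    if (!id.is_local || id.is_runtime) {
      continue;
    }
    ids_to_write.push_back(&id);
  }
  return ids_to_write;
}

/* Stage 2: write a single ID. It does no filtering of its own and relies on
 * stage 1, which keeps it small. Output is modeled as a list of strings. */
void write_id(const FakeID &id, std::vector<std::string> &output)
{
  assert(id.is_local && !id.is_runtime);
  output.push_back("ID:" + id.name);
}

/* The write loop itself is reduced to "gather, then write each".
 * Hypothetical driver function, named only for this example. */
std::vector<std::string> write_all_ids(const std::vector<FakeID> &all_ids)
{
  std::vector<std::string> output;
  for (const FakeID *id : gather_local_ids_to_write(all_ids)) {
    write_id(*id, output);
  }
  return output;
}
```

With this shape, each stage can be timed or debugged separately, and a per-ID function that does no filtering is the kind of function that could plausibly be reused for other callers.
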
Pull Request: https://projects.blender.org/blender/blender/pulls/130983