Those three ones were actually giving no significant benefits, in fact
even slowing things down in one case compared to no parallelization at
all (in `BM_mesh_elem_table_ensure()`).
Point being, once more, parallelizing *very* small tasks (like index or
flag setting, etc.) is nearly never worth it.
Also note that we could not easlily use per-item parallel looping in
those three cases, since they are heavily relying on valid
loop-generated index (or are doing non-threadable things like allocation
from a mempool)...