Render to Render workload depedency not correctly syncing
in Metal. PR adds hard pass break where GPU_memory_barrier's
occur during render workloads to ensure non-pixel-local writes
are visible to subsequent render invocations as needed. This is
required to support full pass dependencies on a tile-based GPU
architecture.
Note that these barriers are therefore expensive, so are skipped
where dependencies are local and fragment execution order is
well-represented via either blend order or explicit
raster_order_groups.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/116656