The Keying node produces different mattes between the GPU and CPU
evaluators.
This is because the CPU implementation doesn't use the full argmax to
determine the channel indices; instead, it considers only the first
argmax and takes the minimum and maximum of the other two channels,
presumably as a form of determinism or stability.
The algorithm seems arbitrary and makes little sense to me, so for now,
the CPU implementation was ported to the GPU evaluator to ensure
consistent results.