The GPU implementation of the morphological feather erosion operator is
different from the CPU implementation. This is because the CPU does
erosion by doing dilation sandwiched between two inverse functions. So
this patch fixes the different by following the CPU implementation.