linearrgb_to_srgb_ushort4_predivide() was calling `linearrgb_to_srgb(t) * alpha` twice in the FTOUSHORT macro, which gcc didn't optimize out.
linearrgb_to_srgb_ushort4_predivide() was calling `linearrgb_to_srgb(t) * alpha` twice in the FTOUSHORT macro, which gcc didn't optimize out.