This enables most of the GPU compiler's optimizations even when -ffast-math is not set at the DPC++ level. It yields an overall 1% speedup and currently does not change the unit-test pass rate.