transform_direction() can't handle parameters in the constant address space. Creating a local copy of the parameter satisfies the OpenCL compiler. The CUDA and CPU compilers should hopefully be able to optimize the copy away.
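
A minimal sketch of the workaround, not the actual kernel code. It assumes the transform is the parameter that lives in constant memory. The names `Vec3`, `Xform`, `DEMO_CONSTANT`, `xform_direction()` and `direction_from_constant()` are hypothetical stand-ins; `xform_direction()` plays the role of transform_direction().

```c
/* Hypothetical qualifier macro: expands to the real address-space keyword
 * when built as OpenCL C, and to nothing on CPU (and similar) builds. */
#ifdef __OPENCL_VERSION__
#  define DEMO_CONSTANT __constant
#else
#  define DEMO_CONSTANT
#endif

/* Simplified stand-in types, assumed for illustration only. */
typedef struct { float x, y, z; } Vec3;
typedef struct { float m[3][4]; } Xform; /* 3x4 affine matrix, row-major */

/* Stand-in for transform_direction(): takes a pointer in the default
 * (private) address space, so a __constant pointer cannot be passed to it
 * in OpenCL. Applies only the 3x3 part; the translation column is ignored. */
static Vec3 xform_direction(const Xform *t, Vec3 d)
{
    Vec3 r;
    r.x = t->m[0][0] * d.x + t->m[0][1] * d.y + t->m[0][2] * d.z;
    r.y = t->m[1][0] * d.x + t->m[1][1] * d.y + t->m[1][2] * d.z;
    r.z = t->m[2][0] * d.x + t->m[2][1] * d.y + t->m[2][2] * d.z;
    return r;
}

/* The workaround: copy the constant-space parameter into a local variable
 * and pass the address of the copy instead. */
static Vec3 direction_from_constant(DEMO_CONSTANT const Xform *tfm, Vec3 d)
{
    Xform local = *tfm; /* local copy in the private address space */
    return xform_direction(&local, d);
}
```

On the CPU build the macro expands to nothing, so the copy is a plain struct assignment that an optimizing compiler is free to remove.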