Cycles: Add some more math optimizations

This enables three additional math optimizations:
- -ffp-contract=fast (enables FMA generation)
- -freciprocal-math (enables x/y -> x*(1/y))
- -fassociative-math (enables e.g. a*b + c*b -> (a+c)*b)

These are used on Windows and HIP anyways, so our code can't expect exact IEEE semantics in any case.

The only difference between the new set and -ffast-math is that we don't use -ffinite-math-only since this causes issues with the BVH (see ce1f2e271d) and breaks e.g. isnan.

This causes a ~1.5% speedup in my very quick test, but might be higher for some more math-intensive cases.

Pull Request: https://projects.blender.org/blender/blender/pulls/128342
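As a rough illustration of why these flags give up exact IEEE semantics, here is a minimal C++ sketch. It is not Cycles code. The helpers sum_left, sum_right and is_nan are hypothetical names, and the example uses a plain reassociated sum, a simpler case than the a*b + c*b example in the message. Built without any of the flags above, the two groupings return different values for the same inputs. With -fassociative-math the compiler is allowed to pick either grouping, so code must not depend on one of them.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helpers for illustration only; none of this is Cycles code.

// Sum evaluated with the grouping (a + b) + c.
float sum_left(float a, float b, float c)
{
  const float ab = a + b;
  return ab + c;
}

// The same sum with the grouping a + (b + c).
// Under strict IEEE evaluation this can round differently from sum_left.
float sum_right(float a, float b, float c)
{
  const float bc = b + c;
  return a + bc;
}

// A NaN check of the kind that relies on NaNs being representable.
// With -ffinite-math-only the compiler may assume x is never NaN,
// which is why that flag is left out of the new set.
bool is_nan(float x)
{
  return std::isnan(x);
}
```

With strict evaluation, sum_left(1e20f, -1e20f, 1.0f) yields 1.0f, while sum_right with the same arguments yields 0.0f, because 1.0f is lost when it is added to -1e20f first.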
2024-09-29 23:38:05 +02:00
parent 908d3c387f
commit f96f3c33af
1 changed files with 8 additions and 0 deletions
--- a/intern/cycles/CMakeLists.txt
+++ b/intern/cycles/CMakeLists.txt
@@ -158,6 +158,14 @@ elseif(CMAKE_COMPILER_IS_GNUCC OR (CMAKE_CXX_COMPILER_ID MATCHES "Clang"))
  list(APPEND CYCLES_MATH_FLAGS "-fno-math-errno")
  # Let compiler optimize 0.0 - x without worrying about signed zeros.
  list(APPEND CYCLES_MATH_FLAGS "-fno-signed-zeros")
+  # Let the compiler generate fused multiply-add instructions
+  list(APPEND CYCLES_MATH_FLAGS "-ffp-contract=fast")
+  # Let the compiler replace x/y with x*(1/y)
+  list(APPEND CYCLES_MATH_FLAGS "-freciprocal-math")
+  # Let the compiler reorder terms to save operations
+  list(APPEND CYCLES_MATH_FLAGS "-fassociative-math")
+  # Don't enable -ffinite-math-only since the BVH code relies on NaNs.
+  # Otherwise, we could just use -ffast-math.

  if(CMAKE_COMPILER_IS_GNUCC)
    # Assume no signal trapping for better code generation.