dde740bcd723f3fc149b44d06155396fb425ce4a
* On sm_30 and above there is no change (was not inlined already before), this just fixes a speed regression from yesterday. 6359c36ba4
* On sm_2x (tested with sm_21), I get a nice 8% speedup in the bmw scene with this. As a bonus, cubin compilation time and memory usage is significantly reduced. Regular cubin size went from 2.5MB to 2.0MB, Experimental one from 3.8MB to 2.5MB.
Description
No description provided
Languages
C++
78%
Python
14.9%
C
2.9%
GLSL
1.9%
CMake
1.2%
Other
0.9%