The Glow effect was doing the correct thing algorithmically (separable
Gaussian blur), but it 1) was completely single-threaded, and 2) made
several passes over the source images instead of doing the work in one go.

- Adds multi-threading to the Glow effect.
- Combines some operations, e.g. instead of calling
IMB_buffer_float_from_byte followed by IMB_buffer_float_premultiply, calls
IMB_colormanagement_transform_from_byte_threaded, which achieves the same
result more efficiently.
- Simplifies the code: removing separate loops around image boundaries is
both less code and slightly faster; use float4 vector type for more
compact code; use Array classes instead of manual memory allocation, etc.
- Removes IMB_buffer_float_unpremultiply and IMB_buffer_float_premultiply,
since nothing uses them anymore.
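
For illustration only, here is a minimal sketch of two of the ideas above: one pass of a separable Gaussian blur with the rows split across threads, and a single inner loop that clamps sample coordinates instead of using separate loops for the image borders. This is not the code from this change. It uses only standard C++17, works on a single-channel float image, and every name in it (make_gaussian_weights, blur_horizontal_threaded), the sigma choice, and the clamp-to-edge border behavior are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: normalized 1D Gaussian weights for taps -radius..radius.
// The sigma choice (radius / 3) is an assumption made for this sketch.
static std::vector<float> make_gaussian_weights(int radius)
{
  const float sigma = std::max(float(radius) / 3.0f, 0.5f);
  std::vector<float> weights(std::size_t(2 * radius + 1));
  float sum = 0.0f;
  for (int i = -radius; i <= radius; i++) {
    const float w = std::exp(-float(i * i) / (2.0f * sigma * sigma));
    weights[std::size_t(i + radius)] = w;
    sum += w;
  }
  for (float &w : weights) {
    w /= sum;
  }
  return weights;
}

// Horizontal blur pass over a single-channel float image, with the rows
// divided into contiguous chunks that are processed by separate threads.
static void blur_horizontal_threaded(const std::vector<float> &src,
                                     std::vector<float> &dst,
                                     int width,
                                     int height,
                                     int radius,
                                     int num_threads)
{
  assert(width > 0 && height > 0 && radius >= 0);
  assert(src.size() == std::size_t(width) * std::size_t(height));
  assert(dst.size() == src.size());

  const std::vector<float> weights = make_gaussian_weights(radius);

  // One loop handles interior and border pixels alike: out-of-range sample
  // coordinates are clamped to the nearest valid column (assumed behavior).
  auto blur_rows = [&](int y_begin, int y_end) {
    for (int y = y_begin; y < y_end; y++) {
      const std::size_t row_start = std::size_t(y) * std::size_t(width);
      for (int x = 0; x < width; x++) {
        float accum = 0.0f;
        for (int i = -radius; i <= radius; i++) {
          const int sx = std::clamp(x + i, 0, width - 1);
          accum += src[row_start + std::size_t(sx)] * weights[std::size_t(i + radius)];
        }
        dst[row_start + std::size_t(x)] = accum;
      }
    }
  };

  // Each thread writes a disjoint range of rows, so no locking is needed.
  num_threads = std::clamp(num_threads, 1, height);
  const int rows_per_thread = (height + num_threads - 1) / num_threads;
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; t++) {
    const int y_begin = t * rows_per_thread;
    const int y_end = std::min(height, y_begin + rows_per_thread);
    if (y_begin >= y_end) {
      break;
    }
    workers.emplace_back(blur_rows, y_begin, y_end);
  }
  for (std::thread &worker : workers) {
    worker.join();
  }
}
```

A full separable blur would follow this with an equivalent vertical pass over the intermediate result. Clamping costs a little per sample, but it keeps one code path for the whole image.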

Applying Glow to 4K UHD sequencer output, on Windows (Ryzen 5950X):
- Blur distance 4: 935ms -> 109ms (8.5x faster)
- Blur distance 20: 3526ms -> 336ms (10.5x faster)

Same on a Mac (M1 Max):
- Blur distance 4: 732ms -> 126ms (5.8x faster)
- Blur distance 20: 3047ms -> 528ms (5.7x faster)

Pull Request: https://projects.blender.org/blender/blender/pulls/115818