Color management: multi-thread IMB_colormanagement_transform_from_byte_threaded simple case

This function is only used in Sequencer code to convert source byte images
into float images. For a "simple case" where there is no color space
conversion needed, it was doing that on a single thread, in two passes over
the image (first byte -> float conversion, then alpha premultiply).

VSE 4K resolution playback, with two image strips where one of them has
"convert to float" option, on Windows / Ryzen 5950X:

- Overall playback FPS: 3.40 -> 4.03. Still very slow, and a lot of time
  is spent in allocating, clearing and deleting various temp images, as
  well as final color conversion for display.
- `seq_imbuf_to_sequencer_space` part: 217.9ms -> 86.6ms. Most of
  remaining cost is just in just memory allocation+clear, which is why
  it's not drastically faster due to threading.

Pull Request: https://projects.blender.org/blender/blender/pulls/115628
This commit is contained in:
Aras Pranckevicius
2023-12-01 16:18:47 +01:00
committed by Aras Pranckevicius
parent 5af7d3e2be
commit f3ce0645e4

View File

@@ -32,6 +32,7 @@
#include "BLI_rect.h"
#include "BLI_string.h"
#include "BLI_task.h"
#include "BLI_task.hh"
#include "BLI_threads.h"
#include "BKE_appdir.h"
@@ -2081,25 +2082,31 @@ void IMB_colormanagement_transform_from_byte_threaded(float *float_buffer,
const char *from_colorspace,
const char *to_colorspace)
{
using namespace blender;
ColormanageProcessor *cm_processor;
if (from_colorspace == nullptr || from_colorspace[0] == '\0') {
return;
}
if (STREQ(from_colorspace, to_colorspace)) {
/* Because this function always takes a byte buffer and returns a float buffer, it must
* always do byte-to-float conversion of some kind. To avoid threading overhead
* IMB_buffer_float_from_byte is used when color spaces are identical. See #51002.
*/
IMB_buffer_float_from_byte(float_buffer,
byte_buffer,
IB_PROFILE_SRGB,
IB_PROFILE_SRGB,
false,
width,
height,
width,
width);
IMB_premultiply_rect_float(float_buffer, 4, width, height);
if (STREQ(from_colorspace, to_colorspace) && channels == 4) {
/* Color spaces are the same, just do a simple byte->float conversion. */
int64_t pixel_count = int64_t(width) * height;
threading::parallel_for(IndexRange(pixel_count), 256 * 1024, [&](IndexRange pix_range) {
float *dst_ptr = float_buffer + pix_range.first() * channels;
uchar *src_ptr = byte_buffer + pix_range.first() * channels;
for ([[maybe_unused]] const int i : pix_range) {
/* Equivalent of rgba_uchar_to_float + premultiply. */
float cr = float(src_ptr[0]) * (1.0f / 255.0f);
float cg = float(src_ptr[1]) * (1.0f / 255.0f);
float cb = float(src_ptr[2]) * (1.0f / 255.0f);
float ca = float(src_ptr[3]) * (1.0f / 255.0f);
dst_ptr[0] = cr * ca;
dst_ptr[1] = cg * ca;
dst_ptr[2] = cb * ca;
dst_ptr[3] = ca;
src_ptr += 4;
dst_ptr += 4;
}
});
return;
}
cm_processor = IMB_colormanagement_colorspace_processor_new(from_colorspace, to_colorspace);