There exist a bunch of "give me a (filtered) image pixel at this location"
functions, some with duplicated functionality, some with almost the same but
not quite, some that look similar but behave slightly differently, etc.
Some of them were in BLI, some were in ImBuf.
This commit tries to improve the situation by:
* Adding low level interpolation functions to `BLI_math_interp.hh`
- With documentation on their behavior,
- And with more unit tests.
* At `ImBuf` level, there are only convenience inline wrappers to the above BLI
functions (split off into a separate header `IMB_interp.hh`). However, since
these wrappers are inline, some things get a tiny bit faster as a side
effect. E.g. VSE image strip, scaling to 4K resolution (Windows/Ryzen5950X):
- Nearest filter: 2.33 -> 1.94ms
- Bilinear filter: 5.83 -> 5.69ms
- Subsampled3x3 filter: 28.6 -> 22.4ms
Details on the functions:
- All of them have `_byte` and `_fl` suffixes.
- They exist in 4-channel byte (uchar4) and float (float4), as well as
explicitly passed amount of channels for other float images.
- New functions in BLI `blender::math` namespace:
- `interpolate_nearest`
- `interpolate_bilinear`
- `interpolate_bilinear_wrap`. Note that unlike previous "wrap" function,
this one no longer requires the caller to do their own wrapping.
- `interpolate_cubic_bspline`. Previous similar function was called just
"bicubic" which could mean many different things.
- Same functions exist in `IMB_interp.hh`, they are just convenience that takes
ImBuf and uses data pointer, width, height from that.
Other bits:
- Renamed `mod_f_positive` to `floored_fmod` (better matches `safe_floored_modf`
and `floored_modulo` that exist elsewhere), made it branchless and added more
unit tests.
- `interpolate_bilinear_wrap_fl` no longer clamps result to 0..1 range. Instead,
moved the clamp to be outside of the call in `paint_image_proj.cc` and
`paint_utils.cc`. Though the need for clamping in there is also questionable.
Pull Request: https://projects.blender.org/blender/blender/pulls/117387
Part of overall "improve filtering situation" (#116980) task:
* Add Bicubic filtering option to strip Transform "Filter" setting.
Previously this option only existed in Transform Effect "Interpolation"
setting.
- With this addition, it feels like the transform effect could
possibly be marked as legacy/deprecated, since the regular Transform
that is on all strips can do everything that Transform Effect did?
* Speed up bicubic filtering (used now in VSE, but also in CPU Compositor,
image paint, etc.) by slightly simplifying the code and using some SIMD.
Upscaling 96x54 image to 3840x2160 resolution, using Bicubic filtering:
- Windows (VS2022, Ryzen 5950X): 35.5ms -> 15.1ms
- Mac (clang 15, M1 Max): 29.6ms -> 24.4ms
* Add gtest coverage for bicubic functionality.
Pull Request: https://projects.blender.org/blender/blender/pulls/117100
IMB_transform is used by Sequencer (and other places) to do image
translation/rotation/scale on the CPU. This PR speeds up parts of it,
particularly when bilinear filtering is used. No behavior changes are
expected.
- Don't use virtual function calls inside inner loop. The code was using
class hierarchies with virtual calls just to do equivalent of "outside
of image? ignore" and "wrap UV coordinates or not?" decisions. Make those
use non-virtual function based code.
- Simplify pixel sampling functions to only do the work as needed by
anything within Blender codebase. For example, bilinear sampling of uchar
images always uses 4 RGBA channels and never does "UV wrap" logic.
- Bilinear interpolation uchar: completely branchless SIMD code now.
- Bilinear interpolation float: 2x floor() calls instead of 4x floor() +
2x ceil(), and final sample blending is done with SIMD.
Sequencer at 4K UHD resolution, with two image strips that need a transform,
playback framerate:
- Windows Ryzen 5950X: 18.7fps -> 26.2fps (IMB_transform time per frame goes
26.3ms -> 11.2ms)
- Mac M1 Max: 27.3fps -> 31.4fps
At that point the IMB_transform is not the slowest part of where playback
takes time (but rather sequencer effect application etc.).
Note: the amount of _actual code_ got a bit smaller. But I've added 100 lines
of unit tests in BLI_math_interp_test.cc, the bilinear interpolation
functions were only tested very indirectly by CPU compositor template
image tests.
Pull Request: https://projects.blender.org/blender/blender/pulls/115653
Using ClangBuildAnalyzer on the whole Blender build, it was pointing
out that BLI_math.h is the heaviest "header hub" (i.e. non tiny file
that is included a lot).
However, there's very little (actually zero) source files in Blender
that need "all the math" (base, colors, vectors, matrices,
quaternions, intersection, interpolation, statistics, solvers and
time). A common use case is source files needing just vectors, or
just vectors & matrices, or just colors etc. Actually, 181 files
were including the whole math thing without needing it at all.
This change removes BLI_math.h completely, and instead in all the
places that need it, includes BLI_math_vector.h or BLI_math_color.h
and so on.
Change from that:
- BLI_math_color.h was included 1399 times -> now 408 (took 114.0sec
to parse -> now 36.3sec)
- BLI_simd.h 1403 -> 418 (109.7sec -> 34.9sec).
Full rebuild of Blender (Apple M1, Xcode, RelWithDebInfo) is not
affected much (342sec -> 334sec). Most of benefit would be when
someone's changing BLI_simd.h or BLI_math_color.h or similar files,
that now there's 3x fewer files result in a recompile.
Pull Request #110944
Replace `typedef struct X {} X;` with `struct X {};`
In some cases the first and last name didn't match although this
is rarely useful, even a typo in some cases, e.g. TrachPathPoint.
Before this change the ImBuf struct had dedicated fields for the
buffer data. Now the color space is stored inside of the struct
which wraps around the buffer information.
This only changes the field placement, without changing the way
it is handled. In the future one might imagine that operations
like stealing buffer data should null-ify the buffer colorspace
pointer. Such changes would need to have more accurate thinking
before implementation.
Should be no functional changes.
Pull Request: https://projects.blender.org/blender/blender/pulls/109291
A lot of files were missing copyright field in the header and
the Blender Foundation contributed to them in a sense of bug
fixing and general maintenance.
This change makes it explicit that those files are at least
partially copyrighted by the Blender Foundation.
Note that this does not make it so the Blender Foundation is
the only holder of the copyright in those files, and developers
who do not have a signed contract with the foundation still
hold the copyright as well.
Another aspect of this change is using SPDX format for the
header. We already used it for the license specification,
and now we state it for the copyright as well, following the
FAQ:
https://reuse.software/faq/
The goal is to make it more explicit and centralized operation to
assign and steal buffer data, with proper ownership tracking.
The buffers and ownership flags are wrapped into their dedicated
structures now.
There should be no functional changes currently, it is a preparation
for allowing implicit sharing of the ImBuf buffers. Additionally, in
the future it is possible to more buffer-specific information (such
as color space) next to the buffer data itself. It is also possible
to clean up the allocation flags (IB_rect, ...) to give them more
clear naming and not have stored in the ImBuf->flags as they are only
needed for allocation.
The most dangerous part of this change is the change of byte buffer
data from `int*` to `uint8_t*`. In a lot of cases the byte buffer was
cast to `uchar*`, so those casts are now gone. But some code is
operating on `int*` so now there are casts in there. In practice this
should be fine, since we only support 64bit platforms, so allocations
are aligned. The real things to watch out for here is the fact that
allocation and offsetting from the byte buffer now need an explicit 4
channel multiplier.
Once everything is C++ it will be possible to simplify public
functions even further.
Pull Request: https://projects.blender.org/blender/blender/pulls/107609