The difficulty of implementing this iterator is that it requires lots of operator
overloads which are usually very simple to implement, but result in a lot of code.
The goal of this patch is to abstract the common parts so that it becomes easier
to implement random accessor iterators. Many algorithms can work more
efficiently with random access iterators than with other iterator types.
Also see https://en.cppreference.com/w/cpp/iterator/random_access_iterator
Pull Request: https://projects.blender.org/blender/blender/pulls/118113
The convex hull tests included a reference AABB-fitting function for
comparison which was used to validate the optimized implementation.
This wasn't great as it depended on matching exact return values and
didn't test the logic of AABB-fitting worked usefully.
Replace this with a more general test that creates random polygons with
known bounds, apply a random rotation & translation, then use
AABB-fitting to un-rotate the points, passing when the bounds are no
larger than the size of the generated input.
Details:
- Make BLI_convexhull_aabb_fit_hull_2d a static function again as it was
only exposed for tests. Use BLI_convexhull_aabb_fit_points_2d instead.
- Remove brute force reference implementation from tests,
moving this to an assertion within convexhull_2d
(disabled by default since it's quite slow).
Use EXPECT_NEAR instead of EXPECT_EQ to account for a differences in
atan2 implementation on macOS, more generally relying on exact
float comparison for tests is error prone.
Move BLI_convexhull_aabb_fit_points_2d to a public function to be able
to compare compare fitting one convex hull with a simple reference
method.
One test is disabled as it exposes an error in convex hull calculation
which needs further investigation.
The result of cross_poly_v2 was flipped compared with cross_tri_v2 &
cross_poly_v3 (with the Z values zeroed).
Ensure cross_poly_v2/3, cross_tri_v2/3 return compatible results and
updating the doc-strings noting that a negative Z is for clock-wise
polygons.
This adds two new constructors to `IndexMask`:
* `from_repeating(mask_to_repeat, repetitions, stride, initial_offset, memory)`:
It allows repeating an existing index mask with a stride.
* `from_every_nth(n, indices_num, initial_offset, memory)`: Creates an index
mask like `{0, 2, 4, 6, ...}`.
`from_every_nth` is implemented in terms of `from_repeating` which is optimized
to handle this case (and other cases) very efficiently.
Pull Request: https://projects.blender.org/blender/blender/pulls/118084
Previously, it was only possible to `find` a specific index in an `IndexMask`.
Now it's also possible to find the closest larger/smaller index if an exact
match doesn't exist. This could be used for slicing the mask so that it only
contains certain indices.
Pull Request: https://projects.blender.org/blender/blender/pulls/117852
isect_point_tri_v2 and isect_point_quad_v2 are handling tris/quads
in either clockwise or counter-clockwise vertex orderings. However,
for clockwise order it was considering points that lie on the edges
or vertices as "inside", whereas for counter-clockwise it was treating
them as "outside".
Visibly affected place is VSE: it has an optimization that checks
whether a fully opaque strip image fully covers the rendered area.
When the strip was scaled up to *exactly* cover the rendered area,
the check was failing since isect_point_quad_v2 was saying that a
point is outside the rect.
As far as I can tell, the functions have been "slightly wrong" in
this way for at least 15 years; harder to see through earlier
history in git.
Added a bunch of unit tests to cover this. Without the fix, "edge"
and "corner" cases against "cw" tri/quad were failing.
Performance (checked on clang15 on M1 Max):
- isect_point_tri_v2 is pretty much the same performance (assembly
several instructions shorter),
- isect_point_quad_v2 is about three times *faster* (assembly 2x
shorter), seemingly the compiler is able to use some SIMD now.
Pull Request: https://projects.blender.org/blender/blender/pulls/117786
Part of overall "improve image filtering situation" (#116980), this PR addresses
two issues:
- Bilinear (default) image filtering makes half a source pixel wide transparent
border around the image. This is very noticeable when scaling images/movies up
in VSE. However, when there is no scaling up but you have slightly rotated
image, this creates a "somewhat nice" anti-aliasing around the edge.
- The other filtering kinds (e.g. cubic) do not have this behavior. So they do
not create unexpected transparency when scaling up (yay), however for slightly
rotated images the edge is "jagged" (oh no).
More detail and images in PR.
Pull Request: https://projects.blender.org/blender/blender/pulls/117717
Part of overall "improve filtering situation" (#116980) task:
Add "Cubic Mitchell" filtering option to VSE strips. This is a cubic (4x4)
filter that generally looks better than bilinear, while not blurring the image
as much as the Cubic BSpline filter that exists elsewhere within Blender. It is
also default in many other apps.
Rename the (very recently added) VSE Bicubic filter option to Cubic BSpline.
Images in the PR.
Pull Request: https://projects.blender.org/blender/blender/pulls/117517
There exist a bunch of "give me a (filtered) image pixel at this location"
functions, some with duplicated functionality, some with almost the same but
not quite, some that look similar but behave slightly differently, etc.
Some of them were in BLI, some were in ImBuf.
This commit tries to improve the situation by:
* Adding low level interpolation functions to `BLI_math_interp.hh`
- With documentation on their behavior,
- And with more unit tests.
* At `ImBuf` level, there are only convenience inline wrappers to the above BLI
functions (split off into a separate header `IMB_interp.hh`). However, since
these wrappers are inline, some things get a tiny bit faster as a side
effect. E.g. VSE image strip, scaling to 4K resolution (Windows/Ryzen5950X):
- Nearest filter: 2.33 -> 1.94ms
- Bilinear filter: 5.83 -> 5.69ms
- Subsampled3x3 filter: 28.6 -> 22.4ms
Details on the functions:
- All of them have `_byte` and `_fl` suffixes.
- They exist in 4-channel byte (uchar4) and float (float4), as well as
explicitly passed amount of channels for other float images.
- New functions in BLI `blender::math` namespace:
- `interpolate_nearest`
- `interpolate_bilinear`
- `interpolate_bilinear_wrap`. Note that unlike previous "wrap" function,
this one no longer requires the caller to do their own wrapping.
- `interpolate_cubic_bspline`. Previous similar function was called just
"bicubic" which could mean many different things.
- Same functions exist in `IMB_interp.hh`, they are just convenience that takes
ImBuf and uses data pointer, width, height from that.
Other bits:
- Renamed `mod_f_positive` to `floored_fmod` (better matches `safe_floored_modf`
and `floored_modulo` that exist elsewhere), made it branchless and added more
unit tests.
- `interpolate_bilinear_wrap_fl` no longer clamps result to 0..1 range. Instead,
moved the clamp to be outside of the call in `paint_image_proj.cc` and
`paint_utils.cc`. Though the need for clamping in there is also questionable.
Pull Request: https://projects.blender.org/blender/blender/pulls/117387
`UUID` generally stands for "universally unique identifier". The session identifier that
we use is neither universally unique, nor does it follow the standard. Therefor, the term
"session uuid" is confusing and should be replaced.
In #116888 we briefly talked about a better name and ended up with "session uid".
The reason for "uid" instead of "id" is that the latter is a very overloaded term in Blender
already.
This patch changes all uses of "uuid" to "uid" where it's used in the context of a
"session uid". It's not always trivial to see whether a specific mention of "uuid" refers
to an actual uuid or something else. Therefore, I might have missed some renames.
I can't think of an automated way to differentiate the case.
BMesh also uses the term "uuid" sometimes in a the wrong context (e.g. `UUIDFaceStepItem`)
but there it also does not mean "session uid", so it's *not* changed by this patch.
Pull Request: https://projects.blender.org/blender/blender/pulls/117350
The term `PIL` stands for "platform independent library." It exists since the `Initial Revision`
commit from 2002. Nowadays, we generally just use the `BLI` (blenlib) prefix for such code
and the `PIL` prefix feels more confusing then useful. Therefore, this patch renames the
`PIL` to `BLI`.
Pull Request: https://projects.blender.org/blender/blender/pulls/117325
Part of overall "improve filtering situation" (#116980) task:
* Add Bicubic filtering option to strip Transform "Filter" setting.
Previously this option only existed in Transform Effect "Interpolation"
setting.
- With this addition, it feels like the transform effect could
possibly be marked as legacy/deprecated, since the regular Transform
that is on all strips can do everything that Transform Effect did?
* Speed up bicubic filtering (used now in VSE, but also in CPU Compositor,
image paint, etc.) by slightly simplifying the code and using some SIMD.
Upscaling 96x54 image to 3840x2160 resolution, using Bicubic filtering:
- Windows (VS2022, Ryzen 5950X): 35.5ms -> 15.1ms
- Mac (clang 15, M1 Max): 29.6ms -> 24.4ms
* Add gtest coverage for bicubic functionality.
Pull Request: https://projects.blender.org/blender/blender/pulls/117100
Bundling many tests in a single binary reduces build time and disk space
usage, but is less convenient for running individual tests command line
as filter flags need to be used.
This adds WITH_TESTS_SINGLE_BINARY to generate one executable file per
source file. Note that enabling this option requires a significant amount
of disk space.
Due to refactoring, the resulting ctest names are a bit different than
before. The number of tests is also a bit different depending if this
option is used, as one uses gtests discovery and the other is organized
purely by filename, which isn't always 1:1.
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/114604
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.
If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
The only user was the Python API. Convert that to use the C++ API.
That simplifies things a bit even, since the encoding of "arrays of arrays"
is a fair amount simpler with the C++ data structures. The motivation
is to simplify the changes from #111061.
IMB_transform is used by Sequencer (and other places) to do image
translation/rotation/scale on the CPU. This PR speeds up parts of it,
particularly when bilinear filtering is used. No behavior changes are
expected.
- Don't use virtual function calls inside inner loop. The code was using
class hierarchies with virtual calls just to do equivalent of "outside
of image? ignore" and "wrap UV coordinates or not?" decisions. Make those
use non-virtual function based code.
- Simplify pixel sampling functions to only do the work as needed by
anything within Blender codebase. For example, bilinear sampling of uchar
images always uses 4 RGBA channels and never does "UV wrap" logic.
- Bilinear interpolation uchar: completely branchless SIMD code now.
- Bilinear interpolation float: 2x floor() calls instead of 4x floor() +
2x ceil(), and final sample blending is done with SIMD.
Sequencer at 4K UHD resolution, with two image strips that need a transform,
playback framerate:
- Windows Ryzen 5950X: 18.7fps -> 26.2fps (IMB_transform time per frame goes
26.3ms -> 11.2ms)
- Mac M1 Max: 27.3fps -> 31.4fps
At that point the IMB_transform is not the slowest part of where playback
takes time (but rather sequencer effect application etc.).
Note: the amount of _actual code_ got a bit smaller. But I've added 100 lines
of unit tests in BLI_math_interp_test.cc, the bilinear interpolation
functions were only tested very indirectly by CPU compositor template
image tests.
Pull Request: https://projects.blender.org/blender/blender/pulls/115653
`Vector::remove_if` allows certain elements to be removed based on a predicate.
However `std::remove_if` only shifts the elements that will not be deleted
to the beginning of the container and then `Vector::remove_if` only
updates the pointer to the new last element after using `std::remove_if`.
This works well if `Vector` is used with trivial types that don't have a destructor.
Having a `Vector<std::unique_ptr>` for example, can generate undefined behavior
if the predicate gives `true` to elements that are contiguous at the end, if
`Vector::remove_if` only updates the end pointer in these specific cases, these
makes these smart pointers useless because they will not be freed by themselves.
To prevent that, also destruct the elements being removed.
Pull Request: https://projects.blender.org/blender/blender/pulls/115914
The constness of the `ImplicitSharingPtr` does not imply the constness of the
referenced data, because that is determined by the user count. Therefore,
`ImplicitSharingPtr` should never give a non-const pointer to the underlying data.
Instead, one always has to check the user count, before one can do a `const_cast`.
Pull Request: https://projects.blender.org/blender/blender/pulls/115652
This commit aim at making the behaviors of `BLI_rename` and
`BLI_rename_overwrite` more consistent and coherent across all
supported platforms.
* `BLI_rename` now only succeeds in case the target `to` path does not
exists (similar to Windows `rename` behavior).
* `BLI_rename_overwrite` allows to replace an existing target `to` file
or (empty) directory (similar to Unix `rename` behavior).
NOTE: In case the target is open by some process on the system, trying
to overwrite it will still fail on Windows, while it should succeed on
Unix-like systems.
The main change for Windows is the usage of `MoveFileExW`
instead of `_wrename`, which allows for 'native support' of file
overwrite (using the `MOVEFILE_REPLACE_EXISTING` flag). Directories
still need to be explicitly removed though.
The main change for *nix systems is the use of `renamex_np` (OSX) or
`renameat2` (most Linux systems) to allow forbidding renaming to an
already existing target in an 'atomic' way.
NOTE: While this commit aims at avoiding the TOC/TOU problem as
much as possible by using available system's primitives for most
common cases, there are some situations where race conditions
(filesystem changes between checks on FS state, and actual rename
operation) remain possible.
Pull Request: https://projects.blender.org/blender/blender/pulls/115096
The ImplicitSharingPtr has an implicit constructor for raw pointers.
This has unintended effects when comparing an ImplicitSharingPtr to a
raw pointer: The raw pointer is implicitly converted to the shared
pointer (without change in refcount) and when going out of scope will
decrement user count, eventually freeing the data.
Conversion from raw pointer to shared pointer should not happen
implicitly. The constructor is made explicit now. This requires a little
more boilerplate when constructing a sharing pointer. A special
constructor for the nullptr is added so comparison with nullptr can
still happen without writing out a constructor.
Pull Request: https://projects.blender.org/blender/blender/pulls/115476