The CPU implementation had the following downsides:
- The kernels sizes below 3 did not make much sense, often leading
to a full white image.
- The way how the total number of pixels in the kernel was calculated
quite strangely.
From these points of view the GPU compositor behaves more user friendly.
There is a versioning code which lowers the kernel size, to match the
result prior to these tweaks.
Pull Request: https://projects.blender.org/blender/blender/pulls/117505
There exist a bunch of "give me a (filtered) image pixel at this location"
functions, some with duplicated functionality, some with almost the same but
not quite, some that look similar but behave slightly differently, etc.
Some of them were in BLI, some were in ImBuf.
This commit tries to improve the situation by:
* Adding low level interpolation functions to `BLI_math_interp.hh`
- With documentation on their behavior,
- And with more unit tests.
* At `ImBuf` level, there are only convenience inline wrappers to the above BLI
functions (split off into a separate header `IMB_interp.hh`). However, since
these wrappers are inline, some things get a tiny bit faster as a side
effect. E.g. VSE image strip, scaling to 4K resolution (Windows/Ryzen5950X):
- Nearest filter: 2.33 -> 1.94ms
- Bilinear filter: 5.83 -> 5.69ms
- Subsampled3x3 filter: 28.6 -> 22.4ms
Details on the functions:
- All of them have `_byte` and `_fl` suffixes.
- They exist in 4-channel byte (uchar4) and float (float4), as well as
explicitly passed amount of channels for other float images.
- New functions in BLI `blender::math` namespace:
- `interpolate_nearest`
- `interpolate_bilinear`
- `interpolate_bilinear_wrap`. Note that unlike previous "wrap" function,
this one no longer requires the caller to do their own wrapping.
- `interpolate_cubic_bspline`. Previous similar function was called just
"bicubic" which could mean many different things.
- Same functions exist in `IMB_interp.hh`, they are just convenience that takes
ImBuf and uses data pointer, width, height from that.
Other bits:
- Renamed `mod_f_positive` to `floored_fmod` (better matches `safe_floored_modf`
and `floored_modulo` that exist elsewhere), made it branchless and added more
unit tests.
- `interpolate_bilinear_wrap_fl` no longer clamps result to 0..1 range. Instead,
moved the clamp to be outside of the call in `paint_image_proj.cc` and
`paint_utils.cc`. Though the need for clamping in there is also questionable.
Pull Request: https://projects.blender.org/blender/blender/pulls/117387
This commit fixes node_cryptomatte test in the matte category
when using GPU compositor on Linux with NVidia GPU. Before this
fix the result image was squished horizontally.
The issue was caused by mismatch in the shader info and the
actual allocation of the matte image.
Pull Request: https://projects.blender.org/blender/blender/pulls/117475
This patch changes the operation space of the Flip node from the global
space to the local space. This means that the Flip node will now flip
the image without changing its location.
The calculation of the matte could lead to a division-by-zero
leading to completely different results on different graphics
backends.
Perform the same checks of the input and key saturation before
performing division as it is done in the CPU compositors.
The `node keying matte` test is still failing on Metal, but the
look of the result is much closer to what it would be.
The compositor Texture node differ between CPU and GPU. That's because
CPU falls back to the texture intensity as the alpha channel if no alpha
channel exists. And it also broadcast the texture intensity to all color
channels if no RGB evaluation exist. So adjust the GPU implementation to
follow that.
The Tonemap node has a wrong luminance scale. This is because the
parallel reduction shader for logarithmic sum had a wrong identity
value. In particular, its identity was set to 0.0, but since its
initialization macro computed the log, the zero becomes a rather large
negative value.
To fix this, the general structure of the parallel reduction shader was
changed such that the identity is used as is, and not passed to the
INITIALIZE or LOAD macros. This simplifies the implementation and even
avoid the extra texel fetches at the boundary.
The "Keying Screen" texture is written by the compositor_keying_screen
shader and is not read by the shader. Use appropriate usage flag which
matches the shader behavior.
Note that the texture is read by a node down the road, so both flags are
needed.
Pull Request: https://projects.blender.org/blender/blender/pulls/117340
The term `PIL` stands for "platform independent library." It exists since the `Initial Revision`
commit from 2002. Nowadays, we generally just use the `BLI` (blenlib) prefix for such code
and the `PIL` prefix feels more confusing then useful. Therefore, this patch renames the
`PIL` to `BLI`.
Pull Request: https://projects.blender.org/blender/blender/pulls/117325
When using the MapUV node for certain NPR workflows (for example palette
based remapping of colors) it can be useful to not use the default
anisotropic filtering.
In preparation of potentially adding more filter modes at a later stage
and to keep things consistent with the 'transform' node we use the full
set of interpolation modes in the enum, but expose only the implemented
ones in RNA..
The Plane Deform mask output retains the original alpha of the input
image, while it should be a binary mask. This patch ignores the alpha of
the image and uses a binary mask instead.
The GPU Directional Blur node distorts the inputs if the resolution is
not square. This is because rotation is performed on normalized
coordinates. This patch performs the normalization at the texture
evaluation.
The old implementation was a simple rounding operation and was not
implemented for full-frame compositor.
The issue with the old implementation is that it will not give
satisfactory results for images with high frequency details,
including cases when is used for a preview on Cycles render with
low number of samples. Additionally, when applied on animated
footage it produces very noisy result.
The new algorithm uses an explicit pixel size setting, which allows
the node to be used on its own, without need to have scale-down and
scale-up nodes. It also uses neighbour averaging, which produces
better looking result during animation and noisy input images.
The old tiled compositor setup will render without changes with
this change. This commit does not include modifications in the GPU
compositor implementation.
Ref #88150
Pull Request: https://projects.blender.org/blender/blender/pulls/117223
For high radii Kuwahara, we use a Summed Area Table (SAT) implementation
to accelerate the classic variant of the algorithm. The problem is that
due to limited floating point precision, the SAT can produce artifacts
in its output.
An attempt to fix this was implemented in #114191, and while that patch
improved precision by 10x, the artifacts still existed, albeit less
noticeable. But since the improved precision also meant a performance
penalty, it was decided that the improvement is not worth it.
Since the artifacts are only noticeable for scenes with very high
values, this patch adds a High Precision option that defaults to false
and can be enabled by the user upon noticing any artifacts. The option
simply uses direction convolution instead of SAT in this case. The
downside, of course, is that it can be orders of magnitude slower.
An alternative to using this option is for the user to clamp the input
or downsample the image. Both methods should be documented in the
documentation.
Fixes: #113578.
Pull Request: https://projects.blender.org/blender/blender/pulls/115763
Part of overall "improve filtering situation" (#116980) task:
* Add Bicubic filtering option to strip Transform "Filter" setting.
Previously this option only existed in Transform Effect "Interpolation"
setting.
- With this addition, it feels like the transform effect could
possibly be marked as legacy/deprecated, since the regular Transform
that is on all strips can do everything that Transform Effect did?
* Speed up bicubic filtering (used now in VSE, but also in CPU Compositor,
image paint, etc.) by slightly simplifying the code and using some SIMD.
Upscaling 96x54 image to 3840x2160 resolution, using Bicubic filtering:
- Windows (VS2022, Ryzen 5950X): 35.5ms -> 15.1ms
- Mac (clang 15, M1 Max): 29.6ms -> 24.4ms
* Add gtest coverage for bicubic functionality.
Pull Request: https://projects.blender.org/blender/blender/pulls/117100
The previous commit introduced a new `RPT_()` macro to translate
strings which are not tooltips or regular interface elements, but
longer reports or statuses.
This commit uses the new macro to translate many strings all over the
UI.
Most of it is a simple replace from `TIP_()` or `IFACE_()` to
`RPT_()`, but there are some additional changes:
- A few translations inside `BKE_report()` are removed altogether
because they are already handled by the translation system.
- Messages inside `UI_but_disable()` are no longer translated
manually, but they are handled by a new regex in the translation
system.
Pull Request: https://projects.blender.org/blender/blender/pulls/116804
Pull Request: https://projects.blender.org/blender/blender/pulls/116804
This patch implements the Vector Blur node for the Realtime Compositor.
The implementation is a direct and mostly identical port of the EEVEE
motion blur implementation with the necessary adjustments to make it
work with the compositor.
The exposed parameters in the node does not match those exposed in
EEVEE, so only the parameters shared between both are currently
implemented. In the future, we should make a decision to either unify
both, or just consider them independent implementations, with the
possibility of sharing the full or part of the code.
Further, it would also make sense to port the implementation to the CPU
compositor, since the new implementation is higher in quality while also
being faster.
The default value of the node shutter setting was changed to 0.25 to
approximately match the default settings of EEVEE and Cycles, since in
their default settings, they evaluate the previous and next frames at
plus and minus 0.25.
Pull Request: https://projects.blender.org/blender/blender/pulls/116977
This patch redesigns the Sun Beams node. The old implementation produces
unexpected results with an arbitrary alpha channel, due to the use of a
step count normalization instead of a weighted normalization. The
quality is also questionable due to the use of nearest neighbour
interpolation as well as a low resolution integration.
This patches redesign the node to be a simple line integration towards
the source, limited by the maximum ray length. The sampling resolution
covers the entire path of the integration and the image is sampled using
bilinear filtering, producing smoother results. No alpha is introduced.
Pull Request: https://projects.blender.org/blender/blender/pulls/116787
The GPU Directional Blur node does not match CPU. This is because GPU
accumulates the scale, while the CPU increments it. This patch
incremenets the scale for the GPU to make them match.
This commit adds a new helper to define expected properties when a
target needs to use the unity build feature.
That new helper does what was already done for existing cases, and in
addition add the target to the Ninja 'heavy' pooljobs if relevant.
Pull Request: https://projects.blender.org/blender/blender/pulls/116791
The GPU implementation of the morphological feather erosion operator is
different from the CPU implementation. This is because the CPU does
erosion by doing dilation sandwiched between two inverse functions. So
this patch fixes the different by following the CPU implementation.
Bundling many tests in a single binary reduces build time and disk space
usage, but is less convenient for running individual tests command line
as filter flags need to be used.
This adds WITH_TESTS_SINGLE_BINARY to generate one executable file per
source file. Note that enabling this option requires a significant amount
of disk space.
Due to refactoring, the resulting ctest names are a bit different than
before. The number of tests is also a bit different depending if this
option is used, as one uses gtests discovery and the other is organized
purely by filename, which isn't always 1:1.
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/114604
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.
If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
Followup to #114755. Initialize operation results explicitly if the input is constant. This patch unifies the size inference behavior of the following nodes/operation:
- Flip node
- Kuwahara (classic)
- Lens distortion
- Tonemap node
Pull Request: https://projects.blender.org/blender/blender/pulls/115733
This patch unifies the behavior of the scale node by removing the arbitrary limit of scaling images. The change applies to scale and transform nodes.
Note: with this change, Blender can crash for large resizing factors (around 10 000 x 10 000 relative factors). We think it's very unlikely users will run into this issue, so we agreed errors coming from failed memory allocation won't be handled, as is the case for GPU compositor.
Pull Request: https://projects.blender.org/blender/blender/pulls/114764