This patch refactors the compositor File Output mechanism and
implements the file output node for the Realtime Compositor. The
refactor was done for the following reasons:
1. The existing file output mechanism relied on a global EXR image
resource where the result of each compositor execution for each
view was accumulated and stored in the global resource, until the
last view is executed, when the EXR is finally saved. Aside from
relying on global resources, this can cause effective memory leaks
since the compositor can be interrupted before the EXR is written and
closed.
2. We need common code to share between all compositors since we now
have multiple compositor implementations.
3. We needed to take the opportunity to fix some of the issues with the
existing implementation, like lossy compression of data passes,
and inability to save single value passes.
The refactor first introduced a new structure called the Compositor
Render Context. This context stores compositor information related to
the render pipeline and is persistent across all compositor executions
of all views. Its extended lifetime relative to a single compositor
execution lends itself well to storing data that is accumulated across
views. The context currently has a map of File Output objects. Those
objects wrap a Render Result structure and can be used to construct
multi-view images which can then be saved after all views are executed
using the existing BKE_image_render_write function.
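The lifetime argument above can be illustrated with a minimal sketch.
All names here (RenderContext, FileOutput, get_file_output,
save_file_outputs) are hypothetical stand-ins and not the actual
classes in the patch. The sketch only shows a context that outlives a
single execution, accumulates views per output path, and saves once at
the end:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a File Output object wrapping a render result.
struct FileOutput {
  std::vector<std::string> views;  // Names of the views accumulated so far.
  bool saved = false;

  void add_view(const std::string &view_name) { views.push_back(view_name); }
};

// Hypothetical stand-in for the Compositor Render Context. It outlives any
// single compositor execution, so it can accumulate data across views.
class RenderContext {
 public:
  // Returns the file output for the given path, creating it on first use.
  FileOutput &get_file_output(const std::string &path)
  {
    return file_outputs_[path];
  }

  // Called once after all views were executed. Returns the number of
  // outputs that were saved.
  int save_file_outputs()
  {
    int saved_count = 0;
    for (auto &item : file_outputs_) {
      assert(!item.second.views.empty());
      item.second.saved = true;  // Real code would write the image here.
      saved_count++;
    }
    return saved_count;
  }

 private:
  std::map<std::string, FileOutput> file_outputs_;
};
```

Each view execution would call get_file_output() with the same path
and add its view, and the render pipeline would call
save_file_outputs() once after the last view.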
Minor adjustments were made to the BKE and RE modules to allow saving
using the BKE_image_render_write function. Namely, the function now
allows the use of a source image format for saving as well as the
ability to not save the render result as a render by introducing two new
default arguments. Further, for multi-layer EXR saving, the existence of
a single unnamed render layer will omit the layer name from the EXR
channel full name, and only the pass, view, and channel ID will remain.
Finally, the Render Result to Image Buffer conversion now takes the number
of channels into account, instead of always assuming color channels.
The patch implements the File Output node in the Realtime Compositor
using the aforementioned mechanisms, replaces the implementation of the
CPU compositor using the same Realtime Compositor implementation, and
sets up the necessary logic in the render pipeline code.
Pull Request: https://projects.blender.org/blender/blender/pulls/113982
The Realtime compositor currently relies on the GPU cache in image IDs.
That cache only supports single layer images, so multi-layer images will
be acquired without a cache, introducing significant IO bottlenecks for
the GPU compositor.
This patch ignores the image GPU cache and stores the images in the
static cache manager of the compositor. Draw data was introduced to the
image ID for proper cache invalidation, like other IDs such as masks.
The downside is that the cache will no longer be shared between EEVEE
and the compositor. But realistically, images are not typically shared
between materials and compositors.
This is just a temporary solution until we have proper GPU storage
support for image buffers.
Pull Request: https://projects.blender.org/blender/blender/pulls/115511
This patch creates a static cached resource from bokeh kernel images,
for better performance and reusability, since it will be used by the
Defocus node as well.
Due to changes in the build environment shader_builder wasn't able to
compile on macOS. This patch reverts several recent changes to CMake files.
* dbb2844ed9
* 94817f64b9
* 1b6cd937ff
The idea is that in the near future shader_builder will run on the buildbot as
part of any regular build to ensure that changes to the CMake files don't break
shader_builder without being detected for a few days.
Pull Request: https://projects.blender.org/blender/blender/pulls/115929
Changes:
- Renamed Split Viewer Node to Split Node
- Split Node is now under `Utilities` (similar to Switch node)
- Versioning: split viewer from 4.0 and before is replaced with the new split node connected to a new viewer node.
Pull Request: https://projects.blender.org/blender/blender/pulls/114245
This patch implements a new mechanism for compositor results to wrap
external images, such as those cached in the static cache manager.
This enables zero-cost use of those resources, which previously
needed a copy at each evaluation.
Pull Request: https://projects.blender.org/blender/blender/pulls/115574
This patch rewrites the Inpaint node in the Realtime Compositor. The old
method suffered from discontinuities and singularities in the inpainting
regions. Furthermore, it ignored semi-transparent areas.
The new method is inspired by a two pass method described by the paper:
Rosner, Jakub, et al. "Fast GPU-based image warping and inpainting for
frame interpolation." International Conferences on Computer Graphics,
Vision and Mathematics. 2010.
In particular, we first fill the inpainting region using jump flooding,
then we apply a variable size blur pass whose size is proportional to
the distance to the inpainting boundary. The smoothed region is then
mixed with the input using its alpha.
The new method is much closer to the Bertalmio-style diffusion-based
inpainting methods, and thus can more accurately close holes than
existing methods.
The aforementioned method requires variable size blur, which is quite
expensive for this use case, so a new implementation was added that
approximates the method using a separable implementation, which provides
a visually pleasing result assuming a sufficiently smooth radius field,
which is true for our case since the field is an SDF.
Fixes: #114422
Pull Request: https://projects.blender.org/blender/blender/pulls/114849
This patch adds support for full precision compositing for the Realtime
Compositor. A new precision option was added to the compositor to change
between half and full precision compositing, where the Auto option uses
half for the viewport compositor and the interactive render compositor,
while full is used for final renders.
The compositor context now needs to implement the get_precision() method
to indicate its preferred precision. Intermediate results will be stored
using the context's precision, with a number of exceptions that can use
a different precision regardless of the context's precision. For
instance, summed area tables are always stored in full float results
even if the context specified half float. Conversely, jump flooding
tables are always stored in half integer results even if the context
specified full. The former requires full float while the latter has no
use for it.
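The precision rules above can be sketched as follows. Only
get_precision() is named in this message; the enums, the
resolve_precision() helper, and the class names are assumptions made
for illustration and may not match the actual code:

```cpp
#include <cassert>

// Hypothetical enums, not the actual Blender types.
enum class PrecisionOption { Auto, Half, Full };
enum class ResultPrecision { Half, Full };

// Resolves the user-facing option to a concrete precision. Auto picks half
// for interactive use and full for final renders.
inline ResultPrecision resolve_precision(PrecisionOption option,
                                         bool is_final_render)
{
  switch (option) {
    case PrecisionOption::Half:
      return ResultPrecision::Half;
    case PrecisionOption::Full:
      return ResultPrecision::Full;
    case PrecisionOption::Auto:
      break;
  }
  return is_final_render ? ResultPrecision::Full : ResultPrecision::Half;
}

// Hypothetical context interface. Each context reports its preference.
class Context {
 public:
  virtual ~Context() = default;
  virtual ResultPrecision get_precision() const = 0;
};

// A viewport context that is assumed to use the Auto option.
class ViewportContext : public Context {
 public:
  ResultPrecision get_precision() const override
  {
    return resolve_precision(PrecisionOption::Auto, false);
  }
};

// Exception to the rule: summed area tables ignore the precision of the
// context and are always stored in full precision.
inline ResultPrecision summed_area_table_precision(const Context & /*context*/)
{
  return ResultPrecision::Full;
}
```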
Since shaders are created for a specific precision, we need two variants
of each compositor shader to account for the context's possible
precision. However, to avoid doubling the shader info count and reduce
boilerplate code and development time, an automated mechanism was
employed. A single shader info of whatever precision needs to be added,
then, at runtime, the shader info can be adjusted to change the
precision of the outputs. That shader variant is then cached in the
static cache manager for future processing-free shader retrieval.
Therefore, the shader manager was removed in favor of a cached shader
container in the static cache manager.
A number of utilities were added to make the creation of results as well as
the retrieval of shaders with the target precision easier. Further, a
number of precision-specific shaders were removed in favor of more
generic ones that utilize the aforementioned shader retrieval
mechanism.
Pull Request: https://projects.blender.org/blender/blender/pulls/113476
The Cryptomatte node is not searchable in the link drag search operator.
That's because it still uses socket templates, which are no longer
supported for search since f5e6d4e4b0.
This patch fixes that by using the declare method instead of socket
templates.
Pull Request: https://projects.blender.org/blender/blender/pulls/114537
The compositor sometimes produces straight alpha even though
premultiplied alpha is expected. Moreover, there is an inconsistency
between the CPU and GPU compositors.
For the GPU compositor, this is because GPU textures sometimes store
straight alpha, while the compositor always expects premultiplied alpha,
so we need to premultiply the alpha in those cases.
For the CPU compositor, this is because the image operation didn't
premultiply the alpha of byte textures, so we need to ensure
premultiplied alpha in those cases.
There is a data loss issue in case of byte images, since the IMB module
unpremultiplies premultiplied images then the compositor premultiplies
it again. But this will be handled in a different patch since it requires
some design and refactoring first.
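For reference, premultiplying a straight alpha color scales the color
channels by alpha and leaves alpha untouched. The sketch below shows
only that conversion. The Color struct and the premultiply() name are
illustrative and are not the functions touched by this patch:

```cpp
#include <cassert>

// Illustrative RGBA color with float channels.
struct Color {
  float r, g, b, a;
};

// Converts a straight (unassociated) alpha color to premultiplied alpha by
// scaling the color channels by alpha. The alpha channel is unchanged.
inline Color premultiply(const Color &straight)
{
  assert(straight.a >= 0.0f && straight.a <= 1.0f);
  return {straight.r * straight.a,
          straight.g * straight.a,
          straight.b * straight.a,
          straight.a};
}
```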
Pull Request: https://projects.blender.org/blender/blender/pulls/114305
This patch changes how wrapped translations are handled by the Realtime
Compositor. Previously, translations were always stored on the result
and delayed until automatically realized later. The wrapping status was
also stored to control this later automatic realization.
This patch changes that such that translations are immediately realized
for the axes that have wrapping enabled. Consequently, the image will not
get translated, but its content will, in a clip on one side, wrap on the
opposite side manner.
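The "clip on one side, wrap on the opposite side" behavior can be
sketched on a single row of pixels. This is a simplified CPU
illustration with an integer translation. translate_wrapped() is a
hypothetical name, and the actual implementation operates on GPU
results:

```cpp
#include <cassert>
#include <vector>

// Shifts the content of a row by the given number of pixels. Content that
// is clipped on one side reappears on the opposite side, so the size and
// position of the row itself do not change.
inline std::vector<float> translate_wrapped(const std::vector<float> &row,
                                            int translation)
{
  const int size = static_cast<int>(row.size());
  assert(size > 0);
  std::vector<float> result(row.size());
  for (int x = 0; x < size; x++) {
    // The double modulo keeps the index non-negative for any translation.
    const int source_x = ((x - translation) % size + size) % size;
    result[x] = row[source_x];
  }
  return result;
}
```

Translating the row {1, 2, 3, 4} by one pixel moves every value one
step to the right and brings the clipped 4 back in at the start.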
Another change is that wrapping information is no longer propagated to
future automatic realizations, so tiling or repeating an image is no
longer possible. An alternative method of repetition will be introduced
in a later patch.
Pull Request: https://projects.blender.org/blender/blender/pulls/113669
The last good commit was 8474716abb.
After this, commits from main were pushed to blender-v4.0-release. These are
being reverted.
Commits from a4880576dc to b26f176d1a that happened afterwards were meant for
4.0, and their contents are preserved.
When searching for a new node by dragging from a socket, some results
were untranslated. This is because they did not use a translation
context matching other occurrences, from which the strings were
extracted to the translation files.
Three nodes using operations were affected: Mix, Mix RGB, and Vector Math.
- Vector Math used the default context when it should have used
NodeTree.
- Mix and Mix RGB used NodeTree when they should have used the default
context.
Pull Request: https://projects.blender.org/blender/blender/pulls/113485
This helps solve the problem encountered in #113553. The problem is that we
currently can't support link-drag-search for nodes which have a dynamic declaration.
With this patch, there is only a single `declare` function per node type, instead of
the separate `declare` and `declare_dynamic` functions. The new `declare` function
has access to the node and tree. However, both are allowed to be null. The final
node declaration has a flag for whether it depends on the node context or not.
Nodes that previously had a dynamic declaration should now create as much of
the declaration as possible that does not depend on the node. This allows code
like for link-drag-search to take those sockets into account even if the other
sockets are dynamic.
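A rough sketch of the idea, using simplified stand-in types (Node,
NodeDeclaration, and this declare() signature are assumptions, not the
real node API): the static part is always declared, and the dynamic
part is only added when a node is available.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the real node types.
struct Node {
  int extra_input_count;
};

struct NodeDeclaration {
  std::vector<std::string> inputs;
  bool is_context_dependent = false;
};

// Single declare function. The node is allowed to be null, for example
// when link-drag-search asks which sockets a node type can have.
inline NodeDeclaration declare(const Node *node)
{
  NodeDeclaration declaration;
  // Static part: independent of the node, so it is always declared.
  declaration.inputs.push_back("Value");
  // Flag that the final declaration depends on the node context.
  declaration.is_context_dependent = true;
  if (node == nullptr) {
    return declaration;
  }
  // Dynamic part: only known when the node is available.
  for (int i = 0; i < node->extra_input_count; i++) {
    declaration.inputs.push_back("Item " + std::to_string(i));
  }
  return declaration;
}
```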
For node declarations that have dynamic types (e.g. Switch node), we can also
add extra information to the static node declaration, like the identifier of the socket
with the dynamic type. This is not part of this patch though.
I can think of two main alternatives to the approach implemented here:
* Define two separate functions for dynamic nodes. One that creates the "static
declaration" without node context, and one that creates the actual declaration with
node context.
* Have a single declare function that generates "build instructions" for the actual
node declaration. So instead of building the final declaration directly, one can for
example add a socket whose type depends on a specific rna path in the node.
The actual node declaration is then automatically generated based on the build
instructions. This becomes quite a bit more tricky with dynamic amounts of sockets
and introduces another indirection between declarations and what sockets the node
actually has.
I found the approach implemented in this patch to lead to the least amount of
boilerplate (doesn't require a separate "build instructions" data structure) and code
duplication (socket properties are still only defined in one place). At the same time,
it offers more flexibility to how nodes can be dynamic.
Pull Request: https://projects.blender.org/blender/blender/pulls/113742
The GPU compositor crops the viewed images to the render resolution,
while the original size and content of the input to the viewer should be
retained as is.
This patch fixes that by specializing compositors that can use composite
outputs to be able to view images of any arbitrary size. This is still
missing the translation offset of the viewer, but this shall be tackled
separately.
This patch immediately realizes the scale and rotation components of
transformations at the point of transform nodes. The translate component is
still delayed and only realized when really needed to avoid clipping.
Transformed results are always realized in an expanded domain that avoids
clipping due to rotation or scaling. The size of the transformed domain is
clipped to the GPU texture size limit for now until we have support for huge
textures. That limit is typically 16k.
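As an illustration of the expanded domain for the rotation case, the
size of the axis-aligned bounding box of a rotated image can be
computed as below. This is a sketch and not the patch's code:
rotated_domain_size() is a hypothetical name, scaling is left out, and
the texture limit is passed in by the caller rather than queried from
the GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Size {
  int width, height;
};

// Computes the size of the axis-aligned bounding box of an image after
// rotating it by the given angle in radians, then clamps the result to the
// maximum texture size. The epsilon prevents rounding errors from growing
// an exact size by one pixel.
inline Size rotated_domain_size(Size size, double angle, int max_texture_size)
{
  assert(size.width > 0 && size.height > 0 && max_texture_size > 0);
  const double c = std::abs(std::cos(angle));
  const double s = std::abs(std::sin(angle));
  const double epsilon = 1e-6;
  const int width = static_cast<int>(
      std::ceil(size.width * c + size.height * s - epsilon));
  const int height = static_cast<int>(
      std::ceil(size.width * s + size.height * c - epsilon));
  return {std::min(width, max_texture_size),
          std::min(height, max_texture_size)};
}
```

A quarter turn swaps the width and height, and any axis that exceeds
the given limit is clamped to it.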
A potential optimization is to join all consecutive transform and realize
operations into a single realize operation.
Fixes #112332.
Pull Request: https://projects.blender.org/blender/blender/pulls/112332
The goal here is to make it easier to use the socket declaration builder
for cases where the actual socket type is not known at compile time.
For that purpose, all the methods that are not dependent on the specific
socket type are moved to the base socket declaration builder.
A nice side effect of this is reduced templated boilerplate and that more
code can be moved out of the header.
With this patch, one is now forced to put type specific method calls before
generic method calls in a chain. For example `.default_value(...).supports_field()`
instead of `supports_field().default_value(...)`. In theory, we could keep
support for both orders but that would involve a lot of additional boilerplate
code. Enforcing this order is simple enough. Note that this limitation only
applies when chaining multiple method calls. This is still possible:
```
auto &decl = b.add_input<decl::Vector>("Value");
decl.supports_field();
decl.default_value(...);
```
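Why the order is enforced can be seen in a stripped-down sketch of the
pattern. The classes below are simplified stand-ins and not the real
builder classes: generic methods return a reference to the base
builder, so type-specific methods are no longer reachable after them
in a chain.

```cpp
#include <cassert>
#include <string>

struct SocketDeclaration {
  std::string name;
  bool supports_field = false;
  float default_value = 0.0f;
};

// Generic methods live in the base builder and return a base reference.
class BaseSocketDeclarationBuilder {
 public:
  explicit BaseSocketDeclarationBuilder(SocketDeclaration &decl) : decl_(decl)
  {
  }

  BaseSocketDeclarationBuilder &supports_field()
  {
    decl_.supports_field = true;
    return *this;
  }

 protected:
  SocketDeclaration &decl_;
};

// Type-specific methods live in the derived builder and return a derived
// reference, so generic methods can still follow them in a chain.
class FloatDeclarationBuilder : public BaseSocketDeclarationBuilder {
 public:
  using BaseSocketDeclarationBuilder::BaseSocketDeclarationBuilder;

  FloatDeclarationBuilder &default_value(float value)
  {
    decl_.default_value = value;
    return *this;
  }
};

inline SocketDeclaration build_example()
{
  SocketDeclaration decl;
  decl.name = "Value";
  FloatDeclarationBuilder builder(decl);
  // Type-specific call first, generic call second.
  builder.default_value(1.0f).supports_field();
  // The reverse order would not compile, because supports_field() returns
  // the base builder, which has no default_value() method:
  // builder.supports_field().default_value(1.0f);
  return decl;
}
```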
Pull Request: https://projects.blender.org/blender/blender/pulls/113410
This patch implements the Keying Screen node for the Realtime
Compositor. Draw data was introduced to the Movie Clip ID to allow
caching of the keying screen.
Pull Request: https://projects.blender.org/blender/blender/pulls/113055
This patch changes the interpolation algorithm utilized by the Keying
Screen node to a Gaussian Radial Basis Function Interpolation. This is
proposed because the current Voronoi triangulation based interpolation
has the following properties:
- Not temporally stable since the triangulation can abruptly change as
tracking markers change position.
- Not smooth in the mathematical sense, which is also readily visible in
the artistic sense.
- Computationally expensive due to the triangulation and naive
rasterization algorithm.
On the other hand, the RBF interpolation method is temporally stable and
continuous, smooth and infinitely differentiable, and relatively simple
to compute assuming a low number of markers, which is typically the case
for keying screen objects.
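A minimal CPU sketch of Gaussian RBF interpolation follows, to show
why it is cheap for a handful of markers: one small linear solve for
the weights, then a weighted sum per pixel. The names, the scalar
marker value, and the shape parameter are assumptions for
illustration. The node itself interpolates colors, and its actual
solver and parameters may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct Marker {
  double x, y, value;
};

// Gaussian radial basis function of the squared distance.
inline double gaussian_rbf(double squared_distance, double shape)
{
  return std::exp(-shape * shape * squared_distance);
}

// Solves for weights such that the weighted sum of RBFs reproduces the
// value of every marker. Uses Gaussian elimination with partial pivoting.
inline std::vector<double> compute_rbf_weights(const std::vector<Marker> &markers,
                                               double shape)
{
  const std::size_t n = markers.size();
  // Augmented matrix: n columns of RBF values plus one column of targets.
  std::vector<std::vector<double>> m(n, std::vector<double>(n + 1));
  for (std::size_t i = 0; i < n; i++) {
    for (std::size_t j = 0; j < n; j++) {
      const double dx = markers[i].x - markers[j].x;
      const double dy = markers[i].y - markers[j].y;
      m[i][j] = gaussian_rbf(dx * dx + dy * dy, shape);
    }
    m[i][n] = markers[i].value;
  }
  for (std::size_t column = 0; column < n; column++) {
    std::size_t pivot = column;
    for (std::size_t row = column + 1; row < n; row++) {
      if (std::abs(m[row][column]) > std::abs(m[pivot][column])) {
        pivot = row;
      }
    }
    std::swap(m[column], m[pivot]);
    assert(std::abs(m[column][column]) > 1e-12);  // Markers must be distinct.
    for (std::size_t row = column + 1; row < n; row++) {
      const double factor = m[row][column] / m[column][column];
      for (std::size_t k = column; k <= n; k++) {
        m[row][k] -= factor * m[column][k];
      }
    }
  }
  // Back substitution.
  std::vector<double> weights(n);
  for (std::size_t i = n; i-- > 0;) {
    double sum = m[i][n];
    for (std::size_t j = i + 1; j < n; j++) {
      sum -= m[i][j] * weights[j];
    }
    weights[i] = sum / m[i][i];
  }
  return weights;
}

// Evaluates the interpolant at an arbitrary position.
inline double evaluate_rbf(const std::vector<Marker> &markers,
                           const std::vector<double> &weights,
                           double shape,
                           double x,
                           double y)
{
  double result = 0.0;
  for (std::size_t j = 0; j < markers.size(); j++) {
    const double dx = x - markers[j].x;
    const double dy = y - markers[j].y;
    result += weights[j] * gaussian_rbf(dx * dx + dy * dy, shape);
  }
  return result;
}
```

The interpolant passes through every marker value, and because it is a
sum of Gaussians, it varies smoothly as markers move.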
This breaks backward compatibility, but the keying screen is only used
as a secondary input for keying in typical compositor setups, so one
should expect minimal difference in outputs.
Pull Request: https://projects.blender.org/blender/blender/pulls/112480
This patch changes the image type used in the Jump Flooding Algorithm to
be Int2 instead of Float4. That's because we used to store the distance
along with the texel location, which we no longer do, so we are left
with the 2D texel location only which can be stored in an Int2 image.
We no longer store the distance because it is not necessarily needed, it
introduces a sqrt in each of the JFA passes, and it is less precise due
to storage in 16F images. Developers should compute the distance in the
user shader instead.
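The storage change can be illustrated with a small CPU sketch of jump
flooding that stores only the 2D location of the closest seed and
leaves the distance to the caller. This is not the compositor's shader
code: the names are hypothetical, and the real passes run on the GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Int2 {
  int x, y;
};

// Marks texels that have no seed assigned yet.
constexpr Int2 invalid_texel = {-1, -1};

inline long squared_distance(Int2 a, Int2 b)
{
  const long dx = a.x - b.x;
  const long dy = a.y - b.y;
  return dx * dx + dy * dy;
}

// CPU sketch of jump flooding. The input stores, for each seed texel, its
// own location, and invalid_texel everywhere else. The output stores, for
// each texel, the location of an approximately closest seed. No distance
// is stored.
inline std::vector<Int2> jump_flood(std::vector<Int2> texels, int width, int height)
{
  assert(static_cast<int>(texels.size()) == width * height);
  int step = 1;
  while (step * 2 < std::max(width, height)) {
    step *= 2;
  }
  for (; step >= 1; step /= 2) {
    std::vector<Int2> output = texels;
    for (int y = 0; y < height; y++) {
      for (int x = 0; x < width; x++) {
        Int2 &closest = output[y * width + x];
        for (int j = -1; j <= 1; j++) {
          for (int i = -1; i <= 1; i++) {
            const int nx = x + i * step;
            const int ny = y + j * step;
            if (nx < 0 || ny < 0 || nx >= width || ny >= height) {
              continue;
            }
            const Int2 candidate = texels[ny * width + nx];
            if (candidate.x < 0) {
              continue;
            }
            if (closest.x < 0 || squared_distance(candidate, {x, y}) <
                                     squared_distance(closest, {x, y}))
            {
              closest = candidate;
            }
          }
        }
      }
    }
    texels = output;
  }
  return texels;
}

// The distance is computed on demand from the stored seed location.
inline float distance_to_seed(const std::vector<Int2> &texels, int width, Int2 texel)
{
  const Int2 seed = texels[texel.y * width + texel.x];
  return std::sqrt(static_cast<float>(squared_distance(seed, texel)));
}
```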
This is a non-functional change, but results in less memory usage,
higher performance, and higher precision.
Pull Request: https://projects.blender.org/blender/blender/pulls/112941
This patch implements the Inpaint node for the Realtime Compositor. The
inpainting region is filled by sampling the color of the nearest boundary pixel
if it is not further than the user supplied distance. Additionally, a lateral
blur is applied in the tangential path to the inpainting boundary to smooth out
the inpainted region.
The implementation is not identical to the existing CPU implementation due to
technical infeasibility. In particular, the CPU implementation uses a Manhattan
distance transform, while the GPU implementation uses a Euclidean one, which is
a consequence of the use of the Jump Flooding algorithm. Furthermore, the CPU
uses a serial convolution starting from the boundary outwards, while the GPU
uses a lateral Gaussian blur in the direction tangent to the boundary.
Pull Request: https://projects.blender.org/blender/blender/pulls/111792
This patch implements the Double Edge Mask node for the Realtime
Compositor. The implementation is primarily based on the 1+JFA Jump
Flooding algorithm, which was also introduced in this commit.
Pull Request: https://projects.blender.org/blender/blender/pulls/112223
Now that specific menus can be searched directly (see 7f9d51853c),
there is no need to maintain separate search functionality for adding
nodes. This PR removes the add node search. In a way this brings us
closer to the `NodeItem` situation before, but the setup is more
flexible since the menus are more standard and easier to customize.
In the few ways we customized the node search items before, this gives
us the same results as before. Overall the searching is less flexible,
but I think that is just a tradeoff we have to accept for the simplicity
of searching menus. In the future menus could be made more dynamic,
with each builtin node's menu path stored on the node type, similar to
assets. That might be a nice compromise. In the meantime this code
is just dead weight.
Pull Request: https://projects.blender.org/blender/blender/pulls/112056
There are a couple of functions that create rna pointers. For example
`RNA_main_pointer_create` and `RNA_pointer_create`. Currently, those
take an output parameter `r_ptr` as last argument. This patch changes
it so that the functions actually return a `PointerRNA` instead of using
the output parameters.
This has a few benefits:
* Output parameters should only be used when there is an actual benefit.
Otherwise, one should default to returning the value.
* It's simpler to use the API in the large majority of cases (note that this
patch reduces the number of lines of code).
* It allows the `PointerRNA` to be const on the call-site, if that is desired.
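The difference at the call site can be sketched with simplified
stand-ins. The struct and both functions below are hypothetical and
much smaller than the real RNA API. They only contrast the
output-parameter style with the return-by-value style:

```cpp
#include <cassert>

// Simplified, hypothetical stand-in for the real PointerRNA struct.
struct PointerRNA {
  void *owner_id;
  void *data;
};

// Old style: the result is written to an output parameter.
inline void pointer_create_old(void *owner_id, void *data, PointerRNA *r_ptr)
{
  assert(r_ptr != nullptr);
  r_ptr->owner_id = owner_id;
  r_ptr->data = data;
}

// New style: the result is returned by value.
inline PointerRNA pointer_create_new(void *owner_id, void *data)
{
  return {owner_id, data};
}

// Builds the same pointer with both styles and compares the results.
inline bool styles_agree(void *owner_id, void *data)
{
  // Old call site: two statements and a mutable variable.
  PointerRNA old_ptr;
  pointer_create_old(owner_id, data, &old_ptr);
  // New call site: one statement, and the variable can be const.
  const PointerRNA new_ptr = pointer_create_new(owner_id, data);
  return old_ptr.owner_id == new_ptr.owner_id && old_ptr.data == new_ptr.data;
}
```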
No performance regression has been measured in production files.
If one of these functions happened to be called in a hot loop where
there is a regression, the solution should be to use an inline function
there which allows the compiler to optimize it even better.
Pull Request: https://projects.blender.org/blender/blender/pulls/111976
The hash tables and vector blenlib headers were pulling many more
headers than they actually need, including the C base math header,
our C string API header, and the StringRef header. All of this
potentially slows down compilation and pollutes autocomplete
with unrelated information.
Also remove the `ListBase` constructor for `Vector`. It wasn't used
much, and making it easy to use `ListBase` isn't worth it for the
same reasons mentioned above.
It turns out a lot of files depended on indirect includes of
`BLI_string.h` and `BLI_listbase.h`, so those are fixed here.
Pull Request: https://projects.blender.org/blender/blender/pulls/111801
This patch adds support for the realization of transformations of
operation inputs in the Realtime Compositor. Input socket declarations
can now include a preference to what sort of realization needs to
happen.
All inputs specify realization on the operation domain by default
because that is needed for the correct operation of most operations.
Nodes may choose not to be realized on the operation domain, like the
MapUV, Plane Deform, and Bokeh Blur nodes; that's because their inputs
are treated as transform-less image objects.
Nodes may choose to realize their rotation or scale, like operations that
are not rotation or scale invariant and thus need images of identity
transformations. No nodes are declared as such so far, as this is still
being considered by developers, and test builds will be published for testing.
This patch coincidentally also fixes #102252 by declaring the Bokeh input
of the Bokeh Blur node to need realization of rotation, which is the only
functional change of the patch.
Fixes #102252.
Pull Request: https://projects.blender.org/blender/blender/pulls/111179
This patch implements the Anisotropic Kuwahara filter for the Realtime
compositor and replaces the existing CPU implementation with a new one to be
compatible with the GPU implementation. The implementation is based on three
papers on Anisotropic Kuwahara filtering, presented and detailed in the code.
The new implementation exposes two extra parameters that control the sharpness
and directionality of the output, giving more artistic freedom.
While the implementation is different from the existing CPU implementation, it
is a higher quality one that is also faster and conforms better to the methods
described in the papers.
Examples can be seen in the pull request description.
Pull Request: https://projects.blender.org/blender/blender/pulls/110786