Cryptomatte layer initialization required the active scene to update the
image user frame, but the image user is already updated through other
mechanisms and is thus redundant to do at draw time, so we remove the
frame update call as well as the scene argument from the call chain. The
frame number of the image user is ignored at compositing time in any
case since it is set to the compositing scene frame.
This is needed for #124738.
This commit moves generated `RNA_blender.h`, `RNA_prototype.h` and
`RNA_blender_cpp.h` headers to become C++ header files.
It also removes the now useless `RNA_EXTERN_C` defines, and just
directly use the `extern` keyword. We do not need anymore `extern "C"`
declarations here.
Pull Request: https://projects.blender.org/blender/blender/pulls/124469
This adds a new mode to the Color Balance node, which applies a white point
transformation similar to the one applied in the view transform.
Unlike the view transform, the compositor node allows specifying both the
source and the destination white point for more flexibility. Both default
to the D65 white point, so just leaving the destination alone achieves the
same behavior.
Pull Request: https://projects.blender.org/blender/blender/pulls/124110
The vector pass and potentially other vectors that store 4 values are
stored wrongly, in particular, the last channel is ignored. To fix this
we identify if a vector pass is 4D and store the information in the
result meta data, then use this information to either save a 3D or a 4D
pass in the File Output node.
This is a CPU implementation of the same mechanism in the GPU compositor
implemented in 57a6832b17. The CPU implementation is a bit more complex
because the CPU compositors stores 4D vectors in color images internally
which can lead to information loss in case of implicit conversion when
the File Output has vector sockets. So what we do is force all vector
inputs to the File Output operation to be color, then save that as 3D or
4D depending on the meta data as well as the original UI socket type.
Pull Request: https://projects.blender.org/blender/blender/pulls/124580
Properly guard the conditionals inside ImageNode::convert_to_operations
against null operations.
I've duplicated the null check in order to not add another layer of
nesting in an already very deep set of conditionals.
Pull Request: https://projects.blender.org/blender/blender/pulls/124473
The Sun beams node produces sharp results when the background has an
alpha channel. That's because the sun beams samples were weights by the
alpha channel, and when the image is mostly transparent, no smooth
falloff can exist.
This is a regression in the Sun Beams redesign that took place in v3.6.
I was mislead by the fact that the alpha was multiplied to the weight.
But the original code actually multiplied the alpha of the border color,
not the sample color. So this fix essentially restores the old behavior.
Pull Request: https://projects.blender.org/blender/blender/pulls/124532
The vector pass and potentially other vectors that store 4 values are
stored wrongly, in particular, the last channel is ignored. To fix this
we identify if a vector pass is 4D and store the information in the
result meta data, then use this information to either save a 3D or a 4D
pass in the File Output node.
This is a partial fix for the GPU compositor only. The complete fix for
the CPU compositor will be submitted separately as it is not
straightforward and will likely require a refactor.
Pull Request: https://projects.blender.org/blender/blender/pulls/124522
The Cryptomatte layers in the compositor are saved compressed and saved
in half precision in the File Output node. That's because the EXR writer
makes decision about compression and half float based on channel names.
And Cryptomatte are currently saved using RGBA channel names like other
color images.
To fix this, we follow the Cycles convention of using lowercase rgba for
Cryptomatte images, allowing them to be saved without compression and
always in full precision since they are otherwise useless.
Fixes#124208, #87988.
Pull Request: https://projects.blender.org/blender/blender/pulls/124508
This patch adds support for meta data in the GPU compositor much like
the mechanism that already exist in the CPU compositor. Only Cryptomatte
meta data is handled at the moment because that is the only meta data
that the compositor supports.
The is_data member of the result was moved to the meta data structure for
consistency with the CPU compositor.
Fixes#124222.
Pull Request: https://projects.blender.org/blender/blender/pulls/124460
The Bloom mode of the Glare node crashes if the input image is too
small. This is because bloom is computed using a down-sampling followed
by an up-sampling chain, and if the user supplied size is not maximum,
the computed chain length might be zero or negative, which is not
handled gracefully. To fix this, we just sanitize the chain length.
Pull Request: https://projects.blender.org/blender/blender/pulls/124089
The GPU compositor frees some of the cached resources when it gets
canceled during interactive editing, making the experience less smooth.
This is because when the compositor gets canceled mid-evaluation, some
of the operations won't get the chance to mark their used resources as
still in use.
To fix this, we skip the cache manager reset after canceling,
effectively only resetting when a full evaluation happens, giving all
operations the chance to keep their cached resources.
Pull Request: https://projects.blender.org/blender/blender/pulls/123886
Linked images that share the same name as an image in the file will fail
to load in the GPU compositor. That's because cached resources are keyed
using only their ID name, while they should also be keyed by the library
name.
Pull Request: https://projects.blender.org/blender/blender/pulls/123898
The Plane Track Deform node produces wrong outputs in the GPU compositor
in case the input size was different from the movie size. That's because
the coordinates were normalized based on the input size, while they
should be normalized based on the output size, which is what this
patches does.
The first input of the compositor Mix node determines resolution,
leading to situation when the second input will always be attempted
to be evaluated. If the first input is a longer image sequence than
the second input it leads to a crash.
Do a null-pointer check and return transparent image in this cases,
similar to what the Movie Clip operation is doing.
Pull Request: https://projects.blender.org/blender/blender/pulls/123493
The File Output node ignores color space overrides for EXR images. To
fix this, we save the images using save_as_render set to true. We don't
need to provide this as an option similar to other image types because
even when save_as_render is set to true, it will not have an effect
unless the user chooses to override the color space explicitly, since it
is not affected by view transforms and the like.
Pull Request: https://projects.blender.org/blender/blender/pulls/122791
The File Output node forces all inputs to have the same size, which
should only be the case for multilayer files. This is a regression in
931c188ce5. To fix this, we allow inputs to have any size, except for
multilayer files, which are realized on the automatic operation domain
of the operation.
Pull Request: https://projects.blender.org/blender/blender/pulls/122824
The CPU compositor does not recognize viewer groups inside node groups
unless a viewer node exists at the top level node group. This is caused
by bad logic when registering viewer node at the node operation builder.
And the fix is just to correct the logic to always register viewers if
no active viewer already exists, which can then later be overwritten if
a viewer node that takes precedence was discovered.
Pull Request: https://projects.blender.org/blender/blender/pulls/122869
The Glare node shifts the color of the highlights when the threshold is
high. That's because the thresholding algorithm simply subtracts the
threshold from the RGB data, which is not expected to retain the same hue
of the color.
To fix this, we do the thresholding only on the luminance of the color
in HSV color space. This eliminates the color shifting and also helps to
smooth the edges of the highlights.
This is a breaking change, but it is more of a fix rather than a change
of behavior.
Pull Request: https://projects.blender.org/blender/blender/pulls/122570
This change fixes:
- Constant folder uses rect of size INT_MIN .. INT_MAX, which overflows integer
when calculating size.
- Calculation of an offset inside of input can lead to integer overflow for big
resolutions.
- texture_bilinear_extend() and texture_nearest_extend() do not seem to handle
single element buffers correctly.
- Kuwahara has division of zero when the input size is 0.
Pull Request: https://projects.blender.org/blender/blender/pulls/122674
The issue is visible when adding an assert in the delta accessors of
the TranslateOperation operation (get_delta_x and get_delta_y), and
rendering compositor-nodes-desintegrate-wipe-01.blend (either command
line or F12, doesn't matter).
Seems that under certain circumstances the system might skip determining
the area of interest. For such cases ensure delta from the beginning of
the threaded code.
Pull Request: https://projects.blender.org/blender/blender/pulls/122686
This patch also fixes a crash when image input of `Stabilize2DNode` is unconnected and interpolation is set to bicubic.
Differences to GPU compositor:
- ~1px difference is observed due to different rounding of domain size vs. canvas size. This difference is constant for all image sizes.
- If image input is unconnected but other inputs are, CPU compositor doesn't consider the operation to be constant, whereas GPU compositor still outputs a constant result.
Pull Request: https://projects.blender.org/blender/blender/pulls/122288
This patch adds support for timing GPU compositor executions. This was
previously not possible since there was no mechanism to measure GPU
calls, which is still the case. However, since 2cf8b5c4e1, we now flush
GPU calls immediately for interactive editing, so we can now measure the
GPU evaluation on the host, which is not a very accurate method, but it
is better than having no timing information. Therefore, timing is only
implemented for interactive editing.
This is different from the CPU implementation in that it measures the
total evaluation time, including any preprocessing of the inputs like
implicit type conversion as well as things like previews.
The profiling implementation was moved to the realtime compositor since
the compositor module is optional.
Pull Request: https://projects.blender.org/blender/blender/pulls/122230
The issue was caused by a non-initialized ibuf_ used to set the non-color
space to. For now limit the tweaks to the image data-block, which is always
ensured. This makes it consistent with the GPU compositor which does not
have access to ibuf at the moment when the meta-data is being checked.
Ideally we do need to set the color space, but it needs to happen consistently,
and in a thread-safe manner. The way how the CPU compositor releases the
ibuf right after acquisition does not feel safe, so less we rely on it is
better.
The issue originates to the change in default view transform from Filmic
to AgX, which does slightly different clipping, and clips color to black
if there is any negative values.
This change implements an idea of skipping view transform for viewer
node when it is connected to the Pick output of the cryptomatte node.
It actually goes a bit deeper than this and any operation can tag its
result as a non-color data, and the viewer node will respect that.
It is achieved by passing some extra meta-data along the evaluation
pipeline. For the CPU compositor it is done via MetaData, and for the
GPU compositor it is done as part of Result.
Connecting any other node in-between of viewer and Cryptomatte's Pick
will treat the result as color values, and apply color management.
Connecting Pick to the Composite output will also consider it as color,
since there is no concept of non-color-managed render result.
An alternative approaches were tested, including:
- Doing negative value clamping at the viewer node.
It does not work for legacy cryptomatte node, as it needs to have
access to original non-modified Pick result.
- Change the order of components, and store ID in another channel.
Using one of other of Green or Blue channels might work for some view
transforms, but it does not work for AgX.
Using Alpha channel seemingly works better, but it is has different
issues caused by the fact that display transform de-associates alpha,
leading to over-exposed regions which are hard to see in the file from
the report. And might lead to the similar issues as the initial report
with other objects or view transforms.
- Use positive values in the Pick channel.
It does make things visible, but they are all white due to the nature
of how AgX works, making it not so useful as a result.
Pull Request: https://projects.blender.org/blender/blender/pulls/122177
Conversion of compositor node tree to operation is done in a job thread,
and the main thread might modify the image data-block at the same time.
This change fixes it by making it so compositor uses acquire/release
semantic for the image data-block, and making it so the image locks its
render result, preventing other threads from modifying it.
Ref #121761
Pull Request: https://projects.blender.org/blender/blender/pulls/122105
Image's render result might get freed from another thread while the
compositor is running.
Add an utility function which invokes callback on the image's stamp
data from a thread-guarded block.
Ref #118337, #121761
Pull Request: https://projects.blender.org/blender/blender/pulls/121907
Image operation's get_im_buf() function was not thread-safe:
- It had TOCTOU issue around calculating multi-layer indices and
requesting to load the image buffer.
- It accessed render result, render layer and pass pointers without
any thread guards.
This change moves all the logic needed to access the image buffer
into a single function with proper guards around the access. The
result is user-counted, so it is usable in a thread even if another
thread modifies the image.
The is still potential TOCTOU in the compositor since the image is
acquired twice: once from init_execution(), and once from the
determine_canvas(). It could cause issues if image resolution is
changed between these calls. It is still to be looked into.
Ref #118337, #121761
This is avoids reference to data which can potentially be freed from the main
thread while the compositor job is running.
There is still some direct access to RenderResult and access to its layers
and passes in the operation implementation, but is is all internal and will
be worked on later. The purpose of this patch is to avoid unsafe pointers in
the API of the operation.
Should be no functional changes.
Ref #118337, #121761
There are a couple of goals achieved with this change:
- The logic itself is de-duplicated between the Image and Cryptomatte
nodes.
- The logic which accesses render results, images, etc is more local
to the place where it needs to be used. Currently it does not matter
too much, but it allows to properly guard the access to be thread
safe.
Ref #118337, #121761
Allows to modify the user without worrying to store/restore old values,
potentially resolving threading conflicts.
Should be no changes on user level.
Ref #118337, #121761
This patch implements the Fog Glow Glare node by porting the CPU
implementation, so it is not GPU accelerated and is not expected to be
realtime. However, after d4bf23771d, it is now fast enough to be usable,
see that commit for more information on the implementation.
The only difference is that the kernel part of the convolution is cached
in the realtime compositor, so it should be about 30% faster than CPU
for interactive editing.
In the future, this implementation will be replaced by a proper GPU
implementation, likely based on VkFFT.
Optimize the Fog Glow glare code by making sure TBB is used for
threading, it only uses the needed space for the frequency domain, and
only load the TLD storage once for every threaded invocation.
The Sun Beams node produces NaNs when the ray length option is zero.
This is due to zero division in the code, which we avoid by skipping
computation altogether when the ray length is zero.
This patches optimizes the Fog Glow Glare node to be about 25x faster
for 4K images. This is mainly achieved by utilizing the FFTW library and
multi-threading support code. Further improvements are still possible by
caching kernels, but the CPU compositor does not support caching yet.
The old Hartley transform was removed, so the node no longer works when
FFTW is disabled as a build time option, much like the OIDN node. A new
BLI library was introduced for FFTW, it includes some helper routines
relevant for FFTW as well as an initialization routine that sets up
multithreading using TBB as well as thread safety.
Build system support for threaded FFTW was also added, which defines the
relevant variables to detect threading support as well as add the
relevant libraries.
We do not currently have the threaded FFTW libs in our precompiled libs,
so the threading code is disabled until the libs lands in the coming
weeks. So currently, the code is only about 9x faster.
The only functional change is that the kernel is now odd sized, which
should produce more accurate results, but the final result is almost
identical and mostly undetectable.
The plan is to port this to the GPU as well similar to how we implement
OIDN until we have a GPU FFT implementation. GPU compositor can also do
caching, so it should be faster, being able to compute a 4K image in
under half a second.
Pull Request: https://projects.blender.org/blender/blender/pulls/121653
Effectively, make GPU compositor available without need to enable
an experimental feature set.
The compositor device is now exposed in the Performance panel of
Render Buttons. It is also still available in the compositor's
N-panel, together with some other options which are more about how
editing works, and not exactly related to render performance.
Pull Request: https://projects.blender.org/blender/blender/pulls/121398
Move all header file into namespace.
Unnecessary namespaces was removed from implementations file.
Part of forward declarations in header was moved in the top part
of file just to do not have a lot of separate namespaces.
Pull Request: https://projects.blender.org/blender/blender/pulls/121637
This allows to expose these settings in the Performance panel in the
render buttons. Also moves compositor-specific options away from the
generic node tree structure.
For the backwards-compatibility the options are still present in the
DNA for the bNodeTree. This is to minimize the impact on the Studio
which has used the GPU compositor for a while now. They can be
removed in a future release.
There is no functional changes expected.
Pull Request: https://projects.blender.org/blender/blender/pulls/121583
It was meant to be included into the previous commit in the area,
but was forgotten due to some technicalities.
Also remove the DisplaceSimpleOperation, which is now not used.
Pull Request: https://projects.blender.org/blender/blender/pulls/121580
The setting was only affecting some of the blur operations, which
does not typically results in a measurable performance boost in real
compositor setups.
For the simplicity of settings on user level remove setting which
potentially makes compositor output worse, without much benefit.
There are better ways to gain performance, like compositing on a
lower resolution, exposing "preview" as an input to the node tree
(similar to the geometry nodes) etc.
Pull Request: https://projects.blender.org/blender/blender/pulls/121576
There are few issues with the logic and implementation of this option:
- While the first pass is faster in the terms of a wall-clock time, it
is often not giving usable results to artists, as the final look of
the result is so much different from what it is expected to be.
- It is not supported by the GPU compositor.
- It is based on some static rules based on the node type, rather than
on the apparent computational complexity.
The performance settings are planned to be moved to the RenderData, and
it is unideal to carry on such limited functionality to more places. There
are better approaches to quickly provide approximated results, which we can
look into later.
Pull Request: https://projects.blender.org/blender/blender/pulls/121558