The CPU and GPU implementations of the compositor Texture node sample
the texture at different locations, producing differences. This is
primarly an issue with half pixel offsets to evaluate the texture at the
center of the pixels.
The CPU compositor only adds the offsets if interpolation is disabled,
regardless if the texture is of type image or not. Further the offset
was wrong, since it was not a half pixel offset, but a full pixel one.
The offsets should always be added, since it won't make a difference in
the non interpolated case, and it should always be added especially if
interpolation is enabled. A comment in the old code mentions something
about artifacts, but it is possible that this is a consequence of the
wrong offset that was used.
This patch always adds a half pixel offset, following the GPU code.
Pull Request: https://projects.blender.org/blender/blender/pulls/118377
Currently, some compositor operations produce en empty output buffer by
specifying a COM_AREA_NONE canvas to indicate an invalid output, for
instance, when the Mask operation references an invalid mask. The
intention is that this buffer would signal that a fallback value should
fill the canvas of consumer operations.
The aforementioned behavior is currently implemented in a rather hacky
way, where it is implemented using the Translate operation as part of
canvas conversion in COM_convert_canvas, where the operation would clear
the entire buffer with zeros since out of bounds checking would always
take the out of bound case due to the empty buffer.
This behavior is problematic because we can't control the fallback
value, which would ideally be an opaque black color. Moreover, since
implicit type conversion happen before canvas conversion by design,
value typed buffers would eventually become transparent, which is rather
unexpected to the end user since float/value outputs can't have
transparency.
This is not a good design or implementation, but a redesign will be too
complex for now. So to fix this, we workaround it by handling the empty
buffer case explicitly in the Translate operation and fill the output
using a fallback black color, which works for both value and color typed
buffers, since this would also be the output of the value to color
implicit conversion.
Pull Request: https://projects.blender.org/blender/blender/pulls/118340
The File Output node crashes if it has no image input. That's because we
would be attempting to save a zero sized image. So ensure that the node
has a non zero canvas before saving anything.
This patch adds support for canceling compositor evaluations for the
realtime compositor. Only the canceling of the Denoise operation is
supported for now. That's because inter-operation canceling is not
feasible since all work will have been submitted to the GPU driver
before the user is able to cancel. So some kind of blocking operations
would need to used to actually allow canceling, which is not something
we are going to investigate as part of this patch.
Pull Request: https://projects.blender.org/blender/blender/pulls/117725
The compositor assumes the entire viewport as its compositing space even
in camera view. The current design decision was to limit the compositing
space by the camera region only if the camera passepartout is opaque,
that is, areas outside of the camera are not visible.
This patch changes that behavior to always limit the compositing space
by the camera region. The downside is that areas outside of the camera
will be left uncomposited.
This is useful to match viewport compositing to final render compositing
in terms of maintaining the same space, but not necessarily the same
resolution. However, this still has the limitation that space will be
different when the camera region intersects the viewport, since we only
composite their intersection in that case.
Pull Request: https://projects.blender.org/blender/blender/pulls/118241
Span is preferrable since it's agnostic of the source container,
makes it clearer that there is no ownership, is 8 bytes smaller,
and can be passed by value.
The tiled compositor code is mainly still around, which is only
expected to be a short-lived period. Eventually it will also be
removed.
The OpenCL, Group Buffers, and Chunk size options are already removed.
Pull Request: https://projects.blender.org/blender/blender/pulls/118010
Caused by previous change for full-frame compositor.
The determine_canvas() is still used by the tiled compositor,
so restore it to the state which makes both compositors to work.
Pull Request: https://projects.blender.org/blender/blender/pulls/118254
This aligns the node behavior to GPU compositor.
The implementation is a bit of a stretch of the full-frame
constant-foldable operations, but this is as good as it can
get in a short time.
The alternative could be to somehow inject a SetValue
operation, but since it expects ahead-of-time known value it
is not as straight forward.
Pull Request: https://projects.blender.org/blender/blender/pulls/118248
The Translate node crashed for single inputs in the Full Frame
compositor, that's because it always assumed image inputs. So fix this
by handling the single value case.
The GPU compositor Displace node produces NaN pixels if connected to the
Image node. That's because the Displace node access the MIP levels of
the texture produces by the Image node, but those levels were never
initialized, so fix this by completing the levels.
This patch adjusts the Variable Size Bokeh Blur node such that it
matches between CPU and GPU. The GPU implementation is mostly followed
for the reasons stated below.
The first difference is a bug in the CPU implementation, where the upper
limit of the blur window is not considered, but the lower limit is.
The second difference is due to CPU ignoring outside pixels instead of
clamping them to border, which is done until an option is added to the
node to control the boundary condition.
The third difference is due to CPU ignoring the bounding box input.
The fourth difference is that CPU doesn't allow zero maximum blur
radius, which is a valid option.
The fifth difference is that the threshold option, which is only used
for the Defocus node, was considered in a greater than manner, while it
should be greater than or equal. Since the default threshold of one
should allow a blur size of one.
The GPU implementation now considers the maximum size of its input,
following the CPU implementation.
Pull Request: https://projects.blender.org/blender/blender/pulls/117947
The Inpaint node crashes for large images. That's because the output
buffer was only allocated for a single chunk. So allocate the entire
output buffer in the tile initialization step.
Visually it is the same as the execution time implemented for the
geometry nodes, and it is to be enabled in the overlay popover.
The implementation is separate from the geometry nodes, as it is
not easy or practical to re-use the geometry nodes implementation.
The execution time is stored in a run-time hash, indexed by a node
instance key. This is similar to the storage of the mode preview
images, but is stored on the scene runtime data and not on the node
tree. Indexing the storage by key allows to easily copy execution
statistics from localized tree to its original version.
The time is only implemented for full-frame compositor, as for the
tiled compositor it could be tricky to calculate reliable time for
pixel processing nodes which process one pixel at a time.
Pull Request: https://projects.blender.org/blender/blender/pulls/117885
This patch adjusts the Bokeh Blur node such that it matches between CPU
and GPU. The GPU implementation is followed for the reasons stated
below.
The first difference is a bug in the CPU implementation, where the upper
limit of the blur window is not considered, but the lower limit is.
The second difference is due to an additional weight of 1.0 for blur
size less than 2, which was apparently in place to workaround the
aforementioned bug, since for zero sized blurs, the blur loop will not
run due to the missing upper limit.
The third difference is due to CPU ignoring outside pixels instead of
clamping them to border, which is done until an option is added to the
node to control the boundary condition.
An extra difference existed between Tiled and Full-frame, where the
canvas had different rounding methods, so that was unified.
Pull Request: https://projects.blender.org/blender/blender/pulls/117847
This patch ports the newly redesigned GPU Inpaint node to the CPU. The
code is mostly identical to the GPU code with the necessary adjustments
to make it work with CPU.
See #114849 for more information.
Pull Request: https://projects.blender.org/blender/blender/pulls/117716
The viewport compositor crashes when in camera view, passepartout is
opaque, and camera region is empty, that is, out of view. That's because
operations didn't handle zero sized compositing regions.
This patch fixes that by allocating invalid results for relevant
operations when the compositing region is invalid.
This patch unifies the implementation of the Bilateral Blur node across
CPU and GPU. The difference is due to two things. First, the CPU code
had a bug where the upper limit of the blur window was not included in
the accumulation. Second, CPU ignored pixels outside of the image while
GPU clamped them to the nearest boundary pixel. The latter difference
was aligned with GPU until we eventually add an option to control
boundary handing.
A few utilities were added to the node operation and memory buffer
classes to do clamped pixel reading.
Pull Request: https://projects.blender.org/blender/blender/pulls/117751
Part of overall "improve image filtering situation" (#116980), this PR addresses
two issues:
- Bilinear (default) image filtering makes half a source pixel wide transparent
border around the image. This is very noticeable when scaling images/movies up
in VSE. However, when there is no scaling up but you have slightly rotated
image, this creates a "somewhat nice" anti-aliasing around the edge.
- The other filtering kinds (e.g. cubic) do not have this behavior. So they do
not create unexpected transparency when scaling up (yay), however for slightly
rotated images the edge is "jagged" (oh no).
More detail and images in PR.
Pull Request: https://projects.blender.org/blender/blender/pulls/117717
This change makes it so the OIDN is listening to the node execution
being cancelled, allowing for a more instant user feedback on the
compositor graph changes. It is especially visible when dealing with
4k images, and using pre-filter on the guiding passes.
Similar implementation to what Cycles does.
This change covers tiles and full-frame compositors.
The GPU compositor needs extra work to have this supported.
Pull Request: https://projects.blender.org/blender/blender/pulls/117698
This patch rewrites and optimizes the Double Edge Mask node to be orders
of magnitude faster. For a 1k complex mask, it is 650x faster, while for
a 1k simple mask, it is only 50x faster.
This improvement is attributed to the use of a new Jump Flooding
algorithm as well as multi-threading, matching the GPU implementation.
The result of the new implementation differs in that the boundaries of
the masks are now identified as the last pixels inside the mask.
Pull Request: https://projects.blender.org/blender/blender/pulls/117545
The GPU implementation of the Use Alpha option of the Z Combine node
only worked if the first image is closer, since it sampled the alpha
channel from it and used it for mixing. Instead, the mix factor should
depend on the closer pixel, like the CPU implementation.
The Z Combine node switches its inputs when changing the Use Alpha
option if both Z values are equal. That's because the operations used by
the node internally use two different conditions, less than and less
than or equal. This patch fixes that by unifying it to the less than
case.
This patch changes all anti-aliasing operations to use SMAA instead of
the Scale3x-based operation. That's because SMAA is more accurate while
the Scale3x one is more a hack.
The Lighten mix mode in the Mix RGB node is wrong for factors less than
one. The lighten blend mode is a simple component-wise maximum in EEVEE,
Cycles, and other implementations. While the compositor seems to define
it as a weighted maximum using the factor as the weight. It should be
noted that its Darken counterpart is correctly implemented for the same
node, so Lighten should follow, which this patch do.
GPU compositor shares the EEVEE code, so it is already correct.
Pull Request: https://projects.blender.org/blender/blender/pulls/117630