This patch formalizes the process of data updates of single values.
Previously, this was done internally in the set_single_value method, but
there was a hack in multi-function procedure code to do the same. This
patch adds a new method for single value data updates and uses it
internally in set_single_value.
This patch refactors how values of unlinked sockets are provided to
nodes. Previously, the GPU node stack values were initialized at
construction time and linked by the node's compile methods. This meant
that we needed to handle implicit conversion if the socket is linked to
an unlinked node group input.
Alternatively, we now insert a GPU setter node for each unlinked socket
that carries the value and type of the origin socket, and let the GPU
code generator do implicit conversion at the shader level. This has
three advantages:
- It makes it easier to add new types since we no longer have to handle
those types in shader node code and it reduces code duplication.
- It makes the code more inline with how we implement multi-function
procedures. So refactoring is easier.
- It opens the door to implement things like implicit inputs, which will
be needed later for things like texture nodes.
Blender crashes when canceling a compositor job if a transform node is
used. This is because freeing shared data didn't reset data members, so
it still thinks it has allocated data, which will be double freed
causing crashes. To fix this, we simply clear data members if data is
still shared.
This patch refactors how single values are passed to MF operations by
utilizing generic pointers. This avoids boilerplate and makes it easier
to add new types.
This patch provides const variants to the cpu_data accessors that
returns GSpan. This is done to better const correctness. Some code need
non-const data even for read-only data, so we have to const cast those
for the moment.
This patch refactors GPU implicit conversion code by generating the name
of the shaders on the fly using fmt. This reduces boilerplate and makes
it easier to add new types.
This patch refactors the result class to replace proxy results with the
possibility of doing data sharing through a shared heap allocated data
reference count. This is more robust and simpler since proxy results no
longer need to be handled as a special case in a lot of the results
code. Additionally, it allows stronger const correctness since inputs to
operations can now be const.
This is somewhat similar to implicit sharing used in other parts of
Blender, so we can look into using that in the future.
Pull Request: https://projects.blender.org/blender/blender/pulls/135778
The general idea is to keep the 'old', C-style MEM_callocN signature, and slowly
replace most of its usages with the new, C++-style type-safer template version.
* `MEM_cnew<T>` allocation version is renamed to `MEM_callocN<T>`.
* `MEM_cnew_array<T>` allocation version is renamed to `MEM_calloc_arrayN<T>`.
* `MEM_cnew<T>` duplicate version is renamed to `MEM_dupallocN<T>`.
Similar templates type-safe version of `MEM_mallocN` will be added soon
as well.
Following discussions in !134452.
NOTE: For now static type checking in `MEM_callocN` and related are slightly
different for Windows MSVC. This compiler seems to consider structs using the
`DNA_DEFINE_CXX_METHODS` macro as non-trivial (likely because their default
copy constructors are deleted). So using checks on trivially
constructible/destructible instead on this compiler/system.
Pull Request: https://projects.blender.org/blender/blender/pulls/134771
This patch removes the use of auto_resource_location during shader
operation material compilation. Instead, we assign slot locations
explicitly. Output images are just assigned incremental indices, while
input samplers are also assigned incremental indices, but starting from
the number of textures in the material, because color bands might be
reserving slots already.
Pull Request: https://projects.blender.org/blender/blender/pulls/135453
This patch refactors how the compositor deals with outputs that are
allocated but not needed. Previously we allowed such allocations and
implicitly release them right after operations are computed. This is a
weak design, so we now require developers to only allocate outputs that
are actually needed and assert otherwise. The release mechanism is
therefore removed.
The compositor asserts if an unsupported unavailable socket exists. This
assert should not exist, because the GPU material compiler will itself
gracefully handle such sockets when their type is GPU_NONE, which is
already the case. So remove the assert and add a note about the
behavior.
This patch uses the BKE implicit conversion rules to implement implicit
conversion in multi-function procedure operations in the compositor.
Since conversions use ColorSceneLinear4f to represent colors instead of
float4, we need to insert extra implicit conversions around color
sockets.
Special attention was given to variable destruction in the
implementation because it is now possible for implicit variables to be
outputs, so we need to make sure they are not destructed.
Previously, the vector type in the compositor had 4-components to
accommodated float4 types, while the last component was ignored for the
rest of the vector types. But now that we have a dedicated type for
float4 in #134486. We can reduce that vector type to 3-components.
Pull Request: https://projects.blender.org/blender/blender/pulls/134570
The compositor previously overloaded the vector type to represent
multiple dimensions that are always stored in a 4D float vector. This
patch introduce a dedicated type for float4, leaving the vector type to
always represent a 3D vector, which will be done in a later commit.
This is not exposed to the user as a separate socket type with a
different color, it is only an internal type that uses the same vector
socket shape and color.
Since the vector socket represents both 4D and 3D vectors, code
generally assumes that such sockets represents 3D vectors, and the
developer is expected to set it to a 4D vector if needed in the node
operation constructor, or use the newly added skip_type_conversion flag
for nodes that do not care about types, like the File Output node.
Though this should be redundant once we add a dimension property for
vector sockets.
Pull Request: https://projects.blender.org/blender/blender/pulls/134486
The compositor leaks memory in certain setups where a transformed result
is linked to two inputs in the same pixel node. This happens due to an
overestimation in the computation of reference counts of those results.
Since pixel operations might share the same input for multiple links in
the node tree, the reference count should be corrected to take that
sharing into account.
This sharing was previously accounted for as part of releasing inputs in
the pixel operation, but input processors didn't take that into account,
so the realize on domain input processor would leak memory. So to fix
this, we correct the reference count at the evaluator level instead,
such that input processors can safely operate on the correct reference
count.
Pull Request: https://projects.blender.org/blender/blender/pulls/134666
This patch removes the compositor texture pool implementation which
relies on the DRW texture pool, and replaces it with the new texture
pool implementation from the GPU module.
Since the GPU module texture pool does not rely on the global DST, we
can use it for both the viewport compositor engine and the GPU
compositor, so the virtual texture pool implementation is removed and
the GPU texture pool is used directly.
The viewport compositor engine does not need to reset the pool because
that is done by the draw manager. But the GPU compositor needs to reset
the pool every evaluation. The pool is deleted directly after rendering
using the render pipeline or through RE_FreeUnusedGPUResources for the
interactive compositor.
Pull Request: https://projects.blender.org/blender/blender/pulls/134437
This patch refactors the ShaderNode class to be a concrete class that
is implemented in terms of the node type gpu_fn. This is done to make it
easier to reuse existing nodes in other parts of Blender.
Pull Request: https://projects.blender.org/blender/blender/pulls/134210
This patch refactors the Result class in the compositor to use
GMutableSpan and std::variant to wrap the result's data. This reduces
the complexity of the code and slightly optimizes performance. This will
also make it easier to add new types and interface with other code like
multi-function procedures.
Pull Request: https://projects.blender.org/blender/blender/pulls/134112
The compositor crashes when the user goes into a muted group that has a
viewer node while the backdrop is enabled. The compositor should not
schedule viewer nodes inside muted contexts, so we need to add checks to
prevent this.
Pull Request: https://projects.blender.org/blender/blender/pulls/134093
This patch allows the compositor context to specify exactly which
outputs it needs, selecting from: Composite, Viewer, File Output, and
Previews. Previously, the compositor fully executed if any of those were
needed, without granular control on which outputs are needed exactly.
For the viewport compositor engine, it requests Composite and Viewer,
with no Previews or File Outputs.
For the render pipeline, it requests Composite and File Output, with
node Viewer or Previews.
For the interactive compositor, it requests Viewer if the backdrop is
visible or an image editor with the viewer image is visible, it requests
Compositor if an image editor with the render result is visible, it
requests Previews if a node editor has previews overlay enabled. File
outputs are never requested.
Pull Request: https://projects.blender.org/blender/blender/pulls/133960
The File Output node sometime ignores the transformations of their
inputs. That's due to the fact that transforms are now delayed and the
File Output node does not realize its inputs on its domain in case it
was not multi-layer.
To fix this, add another realization mode for transforms only. And use
that in the File Output node, as well as the Bokeh Blur, UV Map, and
Plane Track Deform, which also need this fix.
Pull Request: https://projects.blender.org/blender/blender/pulls/133850
Nodes that process single values can fail to work for GPU compositing.
That's because multi-function procedures writes to the single values but
never uploads them to the GPU, which is what this patch does.
Pull Request: https://projects.blender.org/blender/blender/pulls/133851
There is a special case in the compositor code where viewer nodes are
treated as composite nodes. This patch renames relevant methods and
updates comments to clarify this use case.
Pull Request: https://projects.blender.org/blender/blender/pulls/133811
In certain setups, nodes whose inputs are single value and whose outputs
are expected to be single value wrongly return an image. That's because
they wrongly join a pixel operation that operates on images.
To fix this, we split pixel operations by their value types. Single
value sub-trees get compiled into their own pixel operation and none
single value sub-trees get compiled into their own pixel operation.
This might unfortunately break up a long chain of pixel operations if
one of them was single value. This is solvable, but will require more
time and research, so we need to fix the bug first then look into it.
Pull Request: https://projects.blender.org/blender/blender/pulls/133701
This patch allows the multi-function procedure pixel operation to
operate on single values. While it previously assumes a 1x1 image for
processing which was later reduced to a single value using input
processors. This is more efficient, but will allow us to use
multi-function procedures for single value sub-trees even in GPU
execution.
This patch introduces a new Derived Resources concept to the compositor.
Derived resources are resources that are computed from a particular
result and cached in it in case it is needed by another operation, which
can greatly improve performance in some cases at the cost of more memory
usage.
The first use case is to store denoised versions of the Denoising Albedo
and Denoising Normals passes if auxiliary pass denoising is enabled in
the denoise node. Consequently, multi-pass denoising setups where the
same auxiliary passes are used in multiple denoise nodes should be much
faster due to caching of the derived resources.
This implementation has the limitation that it can't preemptively
invalidate the cache when the derived resources are no longer needed to
free up memory. This requires a special resource tracking mechanism that
need to happen during node tree compilation, and will be submitted
later. The limitation is not significant in the particular derived
resources that is currently implemented. Since the auxiliary passes are
rarely used outside of denoising.
Fixes#131171.
Pull Request: https://projects.blender.org/blender/blender/pulls/125671
This patch delays applying transformations until realization happens on
some other domain.
Currently, transformations are applied immediately at the point of
transform nodes, this is problematic for a few reasons:
- If that result was then realized on some other domain, interpolation
will have happened two times, at the transform nodes and at the node
that required realization, causing less than ideal precision issues.
- It is not possible to repeat or extend a rotated result because its
empty areas will be zero filled, leaving gaps in its extension. So
this patch is a prerequisite for #132371 if we want full support for
repetition.
- Doing inverse transformations will introduce interpolation artifacts
which might be undesirable. Inverse transformations might be used to
do pixelation for instance, so this change will be undesirable in this
case. But we decided that this is not a use case that we want to
support, and we added explicit pixel size control to the pixelate node
as an alternative.
So this has four implications, two that might be considered bad:
- Transformations will now be higher quality and more precise.
- Repetition and other boundary extension methods will now be possible.
- Downsampling then upsampling will no longer produce pixelated results.
- Realization might happen multiple times with identical results in some
cases.
The last point not a big issue, since domain realization is not a big
bottleneck in the compositor, and the plan is to move realization into
pixel operations, so it will even be more efficient than it is now.
Pull Request: https://projects.blender.org/blender/blender/pulls/133158
This patch uses OCIO luminance for implicit conversion from color to
float in the compositor. This is done to match other node systems, and
because luminance is a much better default than the average formula used
before.
Versioning was added to retain average conversion for old files.
Pull Request: https://projects.blender.org/blender/blender/pulls/133206
The File Output node no longer works inside node groups since the
introduction of the new CPU compositor. This is explicit in the code, so
we just consider all file output nodes recursively to fix this issue.
This uses the following accessor methods in more places in more places:
`is_group()`, `is_group_input()`, `is_group_output()`, `is_muted()`,
`is_frame()` and `is_reroute()`.
This results in simpler code and reduces the use of `bNode.type_legacy`.
Pull Request: https://projects.blender.org/blender/blender/pulls/132899
The new description for `bNode.type_legacy`:
```
/**
* Legacy integer type for nodes. It does not uniquely identify a node type, only the `idname`
* does that. For example, all custom nodes use #NODE_CUSTOM but do have different idnames.
* This is mainly kept for compatibility reasons.
*
* Currently, this type is also used in many parts of Blender, but that should slowly be phased
* out by either relying on idnames, accessor methods like `node.is_reroute()`.
*
* A main benefit of this integer type over using idnames currently is that integer comparison is
* much cheaper than string comparison, especially if many idnames have the same prefix (e.g.
* "GeometryNode"). Eventually, we could introduce cheap-to-compare runtime identifier for node
* types. That could mean e.g. using `ustring` for idnames (where string comparison is just
* pointer comparison), or using a run-time generated integer that is automatically assigned when
* node types are registered.
*/
```
Pull Request: https://projects.blender.org/blender/blender/pulls/132858
The compositor crashes if a node inside a node group is connected to a
group input that have a different type and the node group is used
without a connection to that input. That's because the compositor code
assumes the type of the group input without implicit type conversion to
the expected type of the node. To fix this, handle implicit conversion
for unconnected sockets as well.
This patch adds support for using integer sockets in compositor nodes.
This involves updating the Result class, node tree compiler, implicit
conversion operation, multi-function procedure operation, shader
operation, and some operations that supports multiple types.
Shader operation internally treats integers as floats, doing conversion
to and from int when reading and writing. That's because the GPUMaterial
compiler doesn't support integers. This is also the same workaround used
by the shader system. Though the GPU module are eyeing adding support
for integers, so we will update the code once they do that.
Domain realization is not yet supported for integer types, but this is
an internal limitation so far, as we do not plan to add nodes that
outputs integers soon. We are not yet sure how realization should happen
with regards to interpolation and we do not have base functions to
sample integer images, that's why I decided to delay its implementation
when it is actually needed.
Pull Request: https://projects.blender.org/blender/blender/pulls/132599