Straightforward port. I took the oportunity to remove some C vector
functions (ex: copy_v2_v2).
This makes some changes to DRWView to accomodate the alignement
requirements of the float4x4 type.
Straightforward port. I took the oportunity to remove some C vector
functions (ex: `copy_v2_v2`).
This makes some changes to DRWView to accomodate the alignement
requirements of the float4x4 type.
This patch adds support for compilation and execution of GLSL compute shaders. This, along with a few systematic changes and fixes, enable realtime compositor functionality with the Metal backend on macOS. A number of GLSL source modifications have been made to add the required level of type explicitness, allowing all compilations to succeed.
GLSL Compute shader compilation follows a similar path to Vertex/Fragment translation, with added support for shader atomics, shared memory blocks and barriers.
Texture flags have also been updated to ensure correct read/write specification for textures used within the compositor pipeline. GPU command submission changes have also been made in the high level path, when Metal is used, to address command buffer time-outs caused by certain expensive compute shaders.
Authored by Apple: Michael Parkin-White
Ref T96261
Ref T99210
Reviewed By: fclem
Maniphest Tasks: T99210, T96261
Differential Revision: https://developer.blender.org/D16990
These warnings can reveal errors in logic, so quiet them by checking
if the features are enabled before using variables or by assigning
empty strings in some cases.
- Check CMAKE_THREAD_LIBS_INIT is set before use as CMake docs
note that this may be left unset if it's not needed.
- Remove BOOST/OPENVDB/VULKAN references when disable.
- Define INC_SYS even when empty.
- Remove PNG_INC from freetype (not defined anywhere).
Since internal links are only runtime data, we have the flexibility to
allocating every link individually. Instead we can store links directly
in the node runtime vector. This allows avoiding many small allocations
when copying and changing node trees.
In the future we could use a smaller type like a pair of sockets
instead of `bNodeLink` to save memory.
Differential Revision: https://developer.blender.org/D16960
This patch allows the realtime compositor to be limited to a specific
compositing region that is a subset of the full render region. In the
context of the viewport compositor, when the viewport is in camera view
and has a completely opaque passepartout, the compositing region will be
limited to the visible camera region.
On the user-level, this gives the user the ability to make the result of
the compositor invariant of the aspect ratio, shift, and zoom of the
viewport, making the result in the viewport identical to the final
render compositor assuming size relative operations.
It should be noted that compositing region is the *visible* camera
region, that is, the result of the intersection of the camera region and
the render region. So the user should be careful not to shift or zoom
the view such that the camera border extends outside of the viewport to
have the aforementioned benefits. While we could implement logic to fill
the areas outside of the render region with zeros in some cases, there
are many other ambiguous cases where such a solution wouldn't work,
including the problematic case where the user zooms in very close,
making the camera region much bigger than that of the render region.
Differential Revision: https://developer.blender.org/D16899
Reviewed By: Clement Foucault
When these declarations are built without the help of the special
builder class, it's much more convenient to set them directly rather
than with a constructor, etc. In most other situations the declarations
should be const anyway, so theoretically this doesn't affect safety too
much. Most construction of declarations should still use the builder.
The Realtime Compositor crashes on undo after an operation like Dissolve
node.
The compositor evaluator stored a reference to the compositor node tree
assuming that it will always be valid. This is not guaranteed, however,
and changes to the node tree can invalidate that reference. So we get
the node tree from the context directly every time to fix the crash.
This patch implements the Streaks Glare node. Which is an approximation
to the existing implementation in the CPU compositor. The difference due
to the approximation is bearily visible in artificial test cases, but is
less visible in actual use cases. Since the difference is rather similar
to that we discussed in the Simple Star mode, the decision to allow that
difference would probably hold here.
For the future, we can look into approximating this further using a
closed form IIR recursive filter with parallel interconnection and
block-based parallelism. That's because the streak filter is already
very similar to the causal pass of a fourth order recursive filter,
just with exponential steps.
Differential Revision: https://developer.blender.org/D16789
Reviewed By: Clement Foucault
This patch implements the variable size mode of the blur node. This is
not identical to the CPU implementation, but is visually very close.
That's because of two things. First, the Extend Bounds option introduces
a 2px offset that doesn't make sense, which is likely a bug in the CPU
implementation. Second, the CPU implementation approximate the result
using three passes, the first two of which are separable morphological
operators applied on the size input. But this approximation does not
provide an advantage because the last pass is non-separable anyways. So
the GPU implementation does not attempt this approximation for more
accurate and faster results.
Differential Revision: https://developer.blender.org/D16762
Reviews By: Clement Foucault
Expands Color Mix nodes with new Exclusion mode.
Similar to Difference but produces less contrast.
Requested by Pierre Schiller @3D_director and
@OmarSquircleArt on twitter.
Differential Revision: https://developer.blender.org/D16543
Compile each static shader using shaderc to Spir-V binaries.
The main goal is to make sure that the GLSL created using ShaderCreateInfo and able to compile to Spir-V.
For the second stage a correct pipeline needs to be created and some shader would need more
adjustments (push constants size).
With this patch future changes to GLSL sources can already be checked against vulkan, without the
backend finished.
Mechanism has been tested using MacOS and MoltenVK. For other OS, we should finetune CMake
files to find the right location to shaderc.
```
************************************************************
*** Build Mon 12 Dec 2022 11:08:07 CET
************************************************************
Shader Test compilation result: 463 / 463 passed (skipped 118 for compatibility reasons)
OpenGL backend shader compilation succeeded.
Shader Test compilation result: 529 / 529 passed (skipped 52 for compatibility reasons)
Vulkan backend shader compilation succeeded.
```
Reviewed By: fclem
Maniphest Tasks: T102760
Differential Revision: https://developer.blender.org/D16610
This patch implements the Simple Star Glare node. This is only an approximation
of the existing implementation in the CPU compositor, an approximation that
removes the row-column dependency in the original algorithm, yielding an order
of magnitude faster computations. The difference due to the approximation is
readily visible in artificial test cases, but is less visible in actual use
cases, so it was agreed that this approximation is worthwhile.
For the future, we can look into approximating this further using a closed form
IIR recursive filter with parallel interconnection and block-based parallelism.
Which is expected to yield another order of magnitude faster computations.
The different passes can potentially be combined into a single shader with some
preprocessor tricks, but doing that complicated that code in a way that makes
it difficult to experiment with future optimizations, so I decided to leave it
as is for now.
Differential Revision: https://developer.blender.org/D16724
Reviewed By: Clement Foucault
This patch implements the Ghost Glare node. It is implemented using
direct convolution as opposed to a recursive one, which produces
slightly different results---more accurate ones, however, since the
ghosts are attenuated where it matters, the difference is barely
visible and is acceptable as far as I can tell.
A possible performance improvement is to implement all passes in a
single shader dispatch, where an array of all scales and color
modulators is computed recursively on the host then used in the shader
to add all ghosts, avoiding usage of global memory and unnecessary
copies. This optimization will be implemented separately.
Differential Revision: https://developer.blender.org/D16641
Reviewed By: Clement Foucault
This patch implements the Track Position node for the realtime
compositor.
Differential Revision: https://developer.blender.org/D16387
Reviewed By: Clement Foucault
Make it more obvious in the name that an operation is not
cheap, and that the function operates on a tracks from
object and does not need a global tracking structure.
* This patch just moves runtime data to the runtime struct to cleanup
the dna struct. Arguably, some of this data should not even be there
because it's very use case specific. This can be cleaned up separately.
* `miniwidth` was removed completely, because it was not used anywhere.
The corresponding rna property `width_hidden` is kept to avoid
script breakage, but does not do anything (e.g. node wrangler sets it).
* Since rna is in C, some helper functions where added to access the
C++ runtime data from rna.
* This size of `bNode` decreases from 432 to 368 bytes.
Blender crashes when enabling Use Nodes after the viewport compositor is
already enabled.
This happens because the active viewer key is not yet initialized for the
node tree at this point, which eventually leads to a nullptr.
This patch fixes that by returning the root context in case the active
viewer key is not yet initialized.
Using output nodes inside node groups in compositor node trees doesn't
work for the realtime compositor.
Currently, the realtime compositor only considers top level output
nodes. That means if a user edits a node group and adds an output node
in the group, the output node outside of the node group will still be
used, which breaks the temporary viewers workflow where users debug
results inside a node group.
This patch fixes that by first considering the output nodes in the
active context, then consider the root context as a fallback. This is
mostly consistent with the CPU compositor, but the realtime compositor
allow viewing node group output nodes even if no output nodes exist at
the top level context.
Differential Revision: https://developer.blender.org/D16446
Reviewed By: Clement Foucault
When using two transformed compositor results, the transformation of one
of them is apparently in the local space of the other, while it should
be applied in the global space instead.
In order to realize a compositor result on a certain operation domain,
the domain of the result is projected on the operation domain and later
realized. This is done by multiplying by the inverse of the operation
domain. However, the order of multiplication was inverted, so the
transformation was applied in the local space of the operation domain.
This patch fixes that by inverting the order of multiplication in domain
realization.
The new Xcode 14.1 brings the new Apple Clang compiler which
considers sprintf unsafe and geenrates deprecation warnings
suggesting to sue snprintf instead. This only happens for C++
code by default, and C code can still use sprintf without any
warning.
This changes does the following:
- Whenever is trivial replace sprintf() with BLI_snprintf.
- For all other cases use the newly introduced BLI_sprintf
which is a wrapper around sprintf() but without warning.
There is a discouragement note in the BLI_sprintf comment to
suggest use of BLI_snprintf when the size is known.
Differential Revision: https://developer.blender.org/D16410
This patch introduces the concept of a Cached Resource that can be
cached across compositor evaluations as well as used by multiple
operations in the same evaluation. Additionally, this patch implements a
new structure for the realtime compositor, the Static Cache Manager,
that manages all the cached resources and deletes them when they are no
longer needed.
This improves responsiveness while adjusting compositor node trees and
also conserves memory usage.
Differential Revision: https://developer.blender.org/D16357
Reviewed By: Clement Foucault
This patch moves the GLSL shaders and their infos to the compositor
module as decided by the EEVEE & Viewport module. This is a non
functional change.
Differential Revision: https://developer.blender.org/D16360
Reviewed By: Clement Foucault
Currently, the realtime compositor treat vector types as 3D vectors,
which is true for most operations. However, some operations deal with
vector types as 4D vectors that encode two 2D vectors or one 3D vector
with the last component ignored. So this patch expands vector types to
include a fourth component that is only sometimes used.
Since we already stored vectors in RGBA textures, the necessary changes
are straightforward and are mostly concerned with adjusting the Result
class.
Differential Revision: https://developer.blender.org/D16359
Reviewed By: Clement Foucault
This patch implements the normalize node for the realtime compositor.
Differential Revision: https://developer.blender.org/D16279
Reviewed By: Clement Foucault
This patch implements the tone map node for the realtime compositor
based on the two papers:
Reinhard, Erik, et al. "Photographic tone reproduction for digital
images." Proceedings of the 29th annual conference on Computer graphics
and interactive techniques. 2002.
Reinhard, Erik, and Kate Devlin. "Dynamic range reduction inspired by
photoreceptor physiology." IEEE transactions on visualization and
computer graphics 11.1 (2005): 13-24.
The original implementation should be revisited later due to apparent
incompatibilities with the reference papers, which makes the operation
less useful.
Differential Revision: https://developer.blender.org/D16306
Reviewed By: Clement Foucault
These functions are almost identical, the main difference being
BLI_join_dirfile didn't trim existing slashes when joining paths
however this isn't an important difference that warrants a separate
function.
Currently, the scale node always changes the interpolation of its result
to bilinear. This was done because the scale node does not have an
interpolation option, unlike the Transform node, so a default of
bilinear was assumed. This turned out to be problematic, because in the
pixelation use cases, a nearest interpolation is typically preferred by
the user.
This patch changes the default interpolation of input nodes to bilinear,
makes the scale node keep the interpolation of the input it receives,
and makes the pixelate node changes the interpolation to nearest. In
effect, for non-pixelation use cases, the default bilinear interpolation
will be used, and for pixelation use cases, the nearest interpolation
will be used unless explicitly specified using a node that sets the
interpolation.