Previously, there were two independent algorithms for analysing how anonymous
attributes are used in a node tree: one that just computed the `aal::RelationsInNode`
for an entire node tree and one that performed a more in-depth analysis to
determine how far anonymous attributes should be propagated.
As it turns out, both operations can also be done at the same time and the result
can be cached on the node tree. This reduces the amount of code and allows for
better code reuse.
This simplification is likely only an intermediate step as things will probably have
to be refactored further to support e.g. serial loops (#108896).
Store subdivision surface creases in two new named float attributes:
- `crease_vert`
- `crease_edge`
This is similar to 2a56403cb0.
The attributes are naming conventions, so their data type and domain
aren't enforced, and may be interpolated when necessary. Editing tools
and the subdivision surface modifier use the hard-coded name. It might
be best if these were edited as generic attributes in the future, but
in the meantime using generic attributes helps.
The attributes are visible in the list, which is how they're now meant
to be removed. They are now interchangeable with any tool that works
with the generic attribute system; even tools like vertex paint can
affect creases now.
This is a breaking change. Forward compatibility isn't preserved for
versions before 3.6, and the `crease` property in RNA is removed in
favor of making a smaller API surface area with just the attribute API.
`Mesh.vertex_creases` and `Mesh.edge_creases` now just return the
matching attribute if possible, and are now implemented in Python.
New functions `*ensure` and `*remove` also replace the operators to
add and remove the layers for Python.
A few extrude node test files have to be updated because of different
(now generic) attribute interpolation behavior.
Pull Request: https://projects.blender.org/blender/blender/pulls/108089
Interpolation from edge attributes is unsupported, and the data
of the new point attribute was uninitialized. As a fix, just avoid
interpolating edge attributes in the first place.
Fractal noise is the idea of evaluating the same noise function multiple times with
different input parameters on each layer and then mixing the results. The individual
layers are usually called octaves.
The number of layers is controlled with a "Detail" slider.
The "Lacunarity" input controls a factor by which each successive layer gets scaled.
The existing Noise node already supports fractal noise. Now the Voronoi Noise node
supports it as well. The node also has a new "Normalize" property that ensures that
the output values stay in a [0.0, 1.0] range. The exception is the F2 feature, where
in rare cases the output may be outside that range even with "Normalize" turned on.
How the individual octaves are mixed depends on the feature and output socket:
- F1/Smooth F1/F2:
- Distance/Color output:
The individual Distance/Color octaves are first multiplied by a factor of
`Roughness ^ (#layers - 1.0)` then added together to create the final output.
- Position output:
Each Position octave gets linearly interpolated with the combined output of the
previous octaves. The Roughness input serves as an interpolation factor with
0.0 resulting in only using the combined output of the previous octaves and
1.0 resulting in only using the current highest octave.
- Distance to Edge:
- Distance output:
The Distance octaves are mixed exactly like the Position octaves for F1/Smooth F1/F2.
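The two mixing rules above can be sketched roughly as follows. This is only an
illustration, not the node's actual code: the function names are hypothetical,
the exponent is assumed to be the zero-based index of the octave, and dividing
by the total weight is an assumed stand-in for the "Normalize" property.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Distance/Color style mixing: octave i (counted from zero) is weighted by
// roughness^i and the weighted octaves are added together. Dividing by the
// total weight is an assumption standing in for "Normalize".
double mix_octaves_additive(const std::vector<double> &octaves,
                            const double roughness,
                            const bool normalize)
{
  assert(!octaves.empty());
  double sum = 0.0;
  double total_weight = 0.0;
  double weight = 1.0;
  for (const double value : octaves) {
    sum += value * weight;
    total_weight += weight;
    weight *= roughness;
  }
  return normalize ? sum / total_weight : sum;
}

// Position style mixing: every new octave is linearly interpolated with the
// combined result of the previous octaves, using roughness as the factor.
double mix_octaves_lerp(const std::vector<double> &octaves, const double roughness)
{
  assert(!octaves.empty());
  double combined = octaves[0];
  for (std::size_t i = 1; i < octaves.size(); i++) {
    combined = (1.0 - roughness) * combined + roughness * octaves[i];
  }
  return combined;
}
```

With a roughness of 0.0 the interpolating variant returns the first octave
unchanged, and with 1.0 it returns the highest octave, matching the
description above.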
It should be noted that Voronoi Noise is a relatively slow noise function, especially
at higher dimensions. Increasing the "Detail" makes it even slower. Therefore, when
optimizing a scene one should consider trying to use simpler noise functions instead
of Voronoi if the final result is close enough.
Pull Request: https://projects.blender.org/blender/blender/pulls/106827
* Store per RenderPass in RenderResult.
* Caches are cleared when starting rendering, to make more memory available
to GPU rendering.
* Caches are cleared on UI changes, when no compositing node editor and no
image editor with a render result or viewer node image is visible.
* Store 3 channel RGB passes as such, and set alpha 1 in shader.
This is an intermediate step before implementing GPU backed ImBuf, to
improve performance and figure out cache eviction.
Pull Request: https://projects.blender.org/blender/blender/pulls/108818
For node group operators (#101778), it helps to reuse the existing
geometry nodes execution. This commit moves most
of the geometry computation to the nodes module and gives the
modifier (and in the future the operator) a callback to set up the
execution context.
Pull Request: https://projects.blender.org/blender/blender/pulls/108482
This patch adds support for Viewer and File Output nodes to the realtime
compositor. The experimental render GPU compositor was also extended to
support viewers. While support for File Output nodes was added, it
remains unimplemented.
This is just an experimental implementation; the logic for viewers will
probably be changed once #108656 is agreed upon. Furthermore, the
`NODE_DO_OUTPUT_RECALC` flags need to be taken into account to avoid
superfluous computations.
Pull Request: https://projects.blender.org/blender/blender/pulls/108804
Face maps were added as a prototype of a new rigging solution during
2.8 development. Their storage is redundant with the newer generic
attribute system (specifically with integer face attributes), and
they were never used much. This commit removes the face map list
and converts the storage to an attribute with the name `face_maps`.
There is nowhere to store the face map names anymore, so those
are not kept.
It probably still makes sense to have a feature like mesh face gizmo
selection for rigging. But the design and implementation would likely
have to change significantly, including possibly changing the storage
type, and making use of the generic attribute system instead of a
special type.
See #105317 for more discussion.
This increases the framerate in a production file from about 2.3 to 2.5
FPS, and reduces gaps in a profile where the CPU was waiting for just
a few threads to finish the BVH tree lookups. If BVH lookups become
faster in the future, this grain size could be increased.
The filter is used to reduce noise while preserving edges. It can be used to create a cartoon effect from photorealistic images.
It offers two variations:
1) Classic (isotropic) Kuwahara filter: simpler and faster to compute. The algorithm splits the area around each pixel into four parts and outputs the mean of the part with the lowest standard deviation.
2) Anisotropic Kuwahara filter: improves on the classic approach by taking the direction of structures within each region into account.
This patch implements both approaches above as multi-threaded operations for the full-frame and tiled compositor.
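As a rough illustration of the classic variant, the sketch below filters a single
pixel of a grayscale image. It is not the compositor operation itself: the function
name, the row-major `std::vector<double>` image layout, the overlapping quadrants
and the clamping at the image border are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Classic Kuwahara filter for one pixel of a row-major grayscale image.
// The four quadrants share the center pixel and are clamped to the image.
double kuwahara_pixel(const std::vector<double> &image,
                      const int width,
                      const int height,
                      const int x,
                      const int y,
                      const int radius)
{
  assert(width > 0 && height > 0 && int(image.size()) == width * height);
  assert(x >= 0 && x < width && y >= 0 && y < height && radius >= 0);

  // Inclusive offsets relative to (x, y): {x_min, x_max, y_min, y_max}.
  const int quadrants[4][4] = {
      {-radius, 0, -radius, 0},
      {0, radius, -radius, 0},
      {-radius, 0, 0, radius},
      {0, radius, 0, radius},
  };

  double best_mean = 0.0;
  double best_variance = std::numeric_limits<double>::max();
  for (const auto &q : quadrants) {
    const int x_min = std::max(x + q[0], 0);
    const int x_max = std::min(x + q[1], width - 1);
    const int y_min = std::max(y + q[2], 0);
    const int y_max = std::min(y + q[3], height - 1);
    double sum = 0.0;
    double sum_squared = 0.0;
    int count = 0;
    for (int yy = y_min; yy <= y_max; yy++) {
      for (int xx = x_min; xx <= x_max; xx++) {
        const double value = image[yy * width + xx];
        sum += value;
        sum_squared += value * value;
        count++;
      }
    }
    // Every quadrant contains the center pixel, so count is never zero.
    const double mean = sum / count;
    const double variance = sum_squared / count - mean * mean;
    // The lowest variance is also the lowest standard deviation.
    if (variance < best_variance) {
      best_variance = variance;
      best_mean = mean;
    }
  }
  return best_mean;
}
```

Next to a hard edge, at least one quadrant usually lies entirely on one side of
it, which is why the edge is preserved instead of blurred.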
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/107015
This patch implements the Sun Beams node for the realtime compositor.
The implementation is not identical to the existing CPU implementation,
but is very close. The new implementation is a higher quality one and
resolves some of the artefacts in the existing implementation. This is
achieved by doing a simple line integration toward the source pixel,
while using a number of integration steps that is independent of the
angle to the source.
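The idea of integrating along a line with a fixed step count can be sketched on
the CPU as below. This is a simplified illustration and not the shader from this
patch: the function name, the nearest-pixel sampling, the plain averaging and the
absence of any ray length limit are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Average a grayscale image along the line from a pixel toward the source.
// The number of samples is fixed, so it does not depend on the direction.
double integrate_toward_source(const std::vector<double> &image,
                               const int width,
                               const int height,
                               const double pixel_x,
                               const double pixel_y,
                               const double source_x,
                               const double source_y,
                               const int steps)
{
  assert(width > 0 && height > 0 && int(image.size()) == width * height);
  assert(steps > 0);
  double sum = 0.0;
  for (int i = 0; i < steps; i++) {
    // Parameter along the line, 0 at the pixel and approaching 1 at the source.
    const double t = double(i) / steps;
    const double x = pixel_x + (source_x - pixel_x) * t;
    const double y = pixel_y + (source_y - pixel_y) * t;
    // Nearest-pixel lookup, clamped to the image bounds.
    const int xi = std::min(std::max(int(std::lround(x)), 0), width - 1);
    const int yi = std::min(std::max(int(std::lround(y)), 0), height - 1);
    sum += image[yi * width + xi];
  }
  return sum / steps;
}
```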
Pull Request: https://projects.blender.org/blender/blender/pulls/108718
This patch implements the Movie Distortion node for the realtime
compositor. The distorted coordinates are computed and cached for a
particular set of tracking camera distortion parameters. So for expensive
distortion models, the first run will take some time to compute, but
subsequent runs will be fast.
An alternative implementation would be to implement each of the
distortion modes in the shader, but that was decided against for a few
reasons:
1. We want to hide the implementation details of the distortion models,
since it is provided through an external library (Libmv).
2. Some distortion models are expensive to solve accurately, and can be
quite slow to solve each time the shader runs.
3. The typical usage of the node does not involve interactive editing of
the distortion parameters, rather, the parameters are computed during
camera calibration, so caching seems most fitting in that case.
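The caching approach can be pictured as a lookup keyed by the distortion
parameters, as in the sketch below. All names here are hypothetical, and the
radial polynomial only stands in for the real (Libmv) distortion solve, which
is not reproduced.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical parameter set; the real node uses the tracking camera settings.
struct DistortionParameters {
  double k1;
  double k2;
  int width;
  int height;
};

class DistortionGridCache {
 public:
  // Returns the cached grid, computing it only the first time a parameter
  // set is requested.
  const std::vector<double> &get(const DistortionParameters &params)
  {
    const Key key{params.k1, params.k2, params.width, params.height};
    auto it = cache_.find(key);
    if (it == cache_.end()) {
      computations_++;
      it = cache_.emplace(key, compute(params)).first;
    }
    return it->second;
  }

  int computations() const
  {
    return computations_;
  }

 private:
  using Key = std::tuple<double, double, int, int>;

  // Stand-in for the expensive solve: one radial scale factor per pixel.
  static std::vector<double> compute(const DistortionParameters &params)
  {
    assert(params.width > 0 && params.height > 0);
    std::vector<double> grid(params.width * params.height);
    for (int y = 0; y < params.height; y++) {
      for (int x = 0; x < params.width; x++) {
        const double nx = (x + 0.5) / params.width - 0.5;
        const double ny = (y + 0.5) / params.height - 0.5;
        const double r2 = nx * nx + ny * ny;
        grid[y * params.width + x] = 1.0 + params.k1 * r2 + params.k2 * r2 * r2;
      }
    }
    return grid;
  }

  std::map<Key, std::vector<double>> cache_;
  int computations_ = 0;
};
```

Requesting the same parameters twice only runs the computation once, which is
the behavior described above: a slow first run and fast subsequent runs.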
Pull Request: https://projects.blender.org/blender/blender/pulls/108230
* Enable "Experimental Compositors" in preferences, then choose
Realtime GPU execution mode in node editor sidebar.
* Only supports combined pass input and Render Result combined output.
* No viewer nodes, no file output nodes, and no node previews yet.
Pull Request: https://projects.blender.org/blender/blender/pulls/108629
* Provide render data, node tree and color management directly instead
of going through scene, as these may be modified by the render pipeline.
Also better for cached texture hits this way.
* Change legacy pass type to pass name.
* Skip file output node when not doing final render.
* Gracefully handle incomplete render results.
Pull Request: https://projects.blender.org/blender/blender/pulls/108629
Prefer memcpy when exact sizes have been calculated as this removes the
implication that the string might be smaller than the length argument.
Further, passing in `len + 1` to BLI_strncpy without clamping by the
destination buffer size reads like a common mistake,
where the length of the source may exceed the destination buffer size.
While using `std::min(sizeof(dst), len + 1)` would avoid the confusion,
it complicates a statement which can use memcpy instead.
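A minimal sketch of the preferred pattern, using only the standard library. The
helper name is hypothetical and is not a function added by this commit; it just
shows an exact-size copy with the precondition stated explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy exactly `len` bytes of `src` into `dst` and null-terminate the result.
// The caller has already calculated `len`, so nothing suggests that the
// string might be shorter than the length argument.
void copy_known_length(char *dst, const std::size_t dst_size, const char *src, const std::size_t len)
{
  // The destination must have room for the bytes and the terminator.
  assert(len + 1 <= dst_size);
  std::memcpy(dst, src, len);
  dst[len] = '\0';
}
```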
When the loose edge and vertex status are cached in the source mesh and
the combination of selection domain and mode don't add loose elements,
copy the cache status to avoid recomputation. In a test with a 1 million
face grid:
- All: 23 -> 30 FPS
- Only faces: 22 -> 23.5 FPS
- Only edges and faces: 24 -> 27 FPS
Also remove unnecessary includes and fix a build error introduced in
the last commit to this area from an inconsistent forward declaration.
Replace the implementation of the separate and delete geometry nodes
for meshes. The new code makes more use of the `IndexMask` class, which
was recently optimized. The main goal is to make more of the work scale
with the size of the result mesh rather than the input. For example,
instead of keeping a map from input to output elements, the maps used
to copy attributes go from output to input elements.
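The output-to-input direction amounts to a gather, roughly as sketched here. The
function name and the use of `std::vector` are assumptions for the example; the
actual code uses Blender's own span and attribute types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Gather attribute values for the result. `dst_to_src[i]` is the index of the
// input element that output element `i` was copied from, so the loop length
// depends only on the size of the result.
template<typename T>
std::vector<T> gather_attribute(const std::vector<T> &src, const std::vector<int> &dst_to_src)
{
  std::vector<T> dst(dst_to_src.size());
  for (std::size_t i = 0; i < dst_to_src.size(); i++) {
    const int src_index = dst_to_src[i];
    assert(src_index >= 0 && std::size_t(src_index) < src.size());
    dst[i] = src[src_index];
  }
  return dst;
}
```

When only a few elements are kept from a large mesh, this touches only those
few elements instead of walking the whole input.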
The new implementation is generally 2-4x faster, depending on the mode
and the number of elements selected. The new code is also able to skip
more work when nothing is removed.
This also allows using more existing attribute interpolation code,
allowing the overall removal of over 300 lines. Some of the attribute
utilities from a similar change for curves (f63cfd8e28) are
reused directly.
The indices of the result change, so the test file needs to be updated.
Pull Request: https://projects.blender.org/blender/blender/pulls/108435
A lot of files were missing a copyright field in the header, although
the Blender Foundation contributed to them in the sense of bug
fixing and general maintenance.
This change makes it explicit that those files are at least
partially copyrighted by the Blender Foundation.
Note that this does not make it so the Blender Foundation is
the only holder of the copyright in those files, and developers
who do not have a signed contract with the foundation still
hold the copyright as well.
Another aspect of this change is using SPDX format for the
header. We already used it for the license specification,
and now we state it for the copyright as well, following the
FAQ:
https://reuse.software/faq/
Adds the "Corners of Edge" topology node to geometry nodes.
Combining this node with the "Face of Corner" node allows getting
information about the faces connected to an edge. The behavior is
slightly non-obvious: the node only gives the corner neighbors
that come *before* the current edge in directly neighboring faces.
This allows the operation to be easily reversed and reduces
redundancy between nodes.
See the devtalk thread: https://devtalk.blender.org/t/29379
Pull Request: https://projects.blender.org/blender/blender/pulls/107968
2ffd08e952 introduced a new system to control attribute lifetime.
Some functions required by that system were missing in `Shortest Edge Paths`.
This pull request adds these functions:
1. `reference_pass_all` for socket declarations.
2. `for_each_field_input_recursive` for field input nodes.
Pull Request: https://projects.blender.org/blender/blender/pulls/108460
- Avoid using geometry sets from a different abstraction level
- Deduplicate basic attribute copying propagation code
- Allow more use of implicit sharing when data arrays are unchanged
- Optimize for when a point cloud delete selection is empty
- Handle face corners generically for "only face" case
Pass the curves and points to keep instead of delete. In the same test
file as the previous commit, this gave an increase from 50 to 60 FPS
when deleting curves.
Goals of this refactor:
* Reduce memory consumption of `IndexMask`. The old `IndexMask` uses an
`int64_t` for each index which is more than necessary in pretty much all
practical cases currently. Using `int32_t` might still become limiting
in the future in case we use this to index e.g. byte buffers larger than
a few gigabytes. We also don't want to template `IndexMask`, because
that would cause a split in the "ecosystem", or everything would have to
be implemented twice or templated.
* Allow for more multi-threading. The old `IndexMask` contains a single
array. This is generally good but has the problem that it is hard to fill
from multiple threads when the final size is not known from the beginning.
This is commonly the case when e.g. converting an array of bool to an
index mask. Currently, this kind of code only runs on a single thread.
* Allow for efficient set operations like join, intersect and difference.
It should be possible to multi-thread those operations.
* It should be possible to iterate over an `IndexMask` very efficiently.
The most important part of that is to avoid all memory access when iterating
over continuous ranges. For some core nodes (e.g. math nodes), we generate
optimized code for the cases of irregular index masks and simple index ranges.
To achieve these goals, a few compromises had to be made:
* Slicing of the mask (at specific indices) and random element access is
`O(log #indices)` now, but with a low constant factor. It should be possible
to split a mask into n approximately equally sized parts in `O(n)` though,
making the time per split `O(1)`.
* Using range-based for loops does not work well when iterating over a nested
data structure like the new `IndexMask`. Therefore, `foreach_*` functions with
callbacks have to be used. To avoid extra code complexity at the call site,
the `foreach_*` methods support multi-threading out of the box.
The new data structure splits an `IndexMask` into an arbitrary number of ordered
`IndexMaskSegment`. Each segment can contain at most `2^14 = 16384` indices. The
indices within a segment are stored as `int16_t`. Each segment has an additional
`int64_t` offset which allows storing arbitrary `int64_t` indices. This approach
has the main benefits that segments can be processed/constructed individually on
multiple threads without a serial bottleneck. Also it reduces the memory
requirements significantly.
For more details see comments in `BLI_index_mask.hh`.
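The segment layout can be pictured with the simplified sketch below. It is not
the implementation in `BLI_index_mask.hh`: the type and function names are
hypothetical, each segment owns a `std::vector` for simplicity, and the rule for
starting a new segment is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One segment: 16-bit indices stored relative to a 64-bit offset.
struct Segment {
  std::int64_t offset;
  std::vector<std::int16_t> indices;
};

constexpr std::size_t max_segment_size = 16384;  // 2^14

// Split ascending, non-negative indices into segments. A new segment starts
// when the current one is full or the index no longer fits in 16 bits
// relative to the segment offset.
std::vector<Segment> build_segments(const std::vector<std::int64_t> &sorted_indices)
{
  std::vector<Segment> segments;
  std::int64_t previous = -1;
  for (const std::int64_t index : sorted_indices) {
    assert(index > previous);
    previous = index;
    const bool needs_new_segment = segments.empty() ||
                                   segments.back().indices.size() == max_segment_size ||
                                   index - segments.back().offset > INT16_MAX;
    if (needs_new_segment) {
      segments.push_back(Segment{index, {}});
    }
    Segment &segment = segments.back();
    segment.indices.push_back(std::int16_t(index - segment.offset));
  }
  return segments;
}

// Callback-based iteration over the nested structure.
template<typename Fn> void foreach_index(const std::vector<Segment> &segments, Fn &&fn)
{
  for (const Segment &segment : segments) {
    for (const std::int16_t local : segment.indices) {
      fn(segment.offset + local);
    }
  }
}
```

Because each segment only refers to its own offset and local indices, segments
can in principle be built or processed independently of each other.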
I did a few tests to verify that the data structure generally improves
performance and does not cause regressions:
* Our field evaluation benchmarks take about as much time as before. This is to be
expected because we already made sure that e.g. add node evaluation is
vectorized. The important thing here is to check that changes to the way we
iterate over the indices still allow for auto-vectorization.
* Memory usage by a mask is about 1/4 of what it was before in the average case.
That's mainly caused by the switch from `int64_t` to `int16_t` for indices.
In the worst case, the memory requirements can be larger when there are many
indices that are very far away. However, when they are far away from each other,
that indicates that there aren't many indices in total. In common cases, memory
usage can be way lower than 1/4 of before, because sub-ranges use static memory.
* For some more specific numbers I benchmarked `IndexMask::from_bools` in
`index_mask_from_selection` on 10,000,000 elements at various probabilities for
`true` at every index:
```
Probability   Old       New
0             4.6 ms    0.8 ms
0.001         5.1 ms    1.3 ms
0.2           8.4 ms    1.8 ms
0.5           15.3 ms   3.0 ms
0.8           20.1 ms   3.0 ms
0.999         25.1 ms   1.7 ms
1             13.5 ms   1.1 ms
```
Pull Request: https://projects.blender.org/blender/blender/pulls/104629
- Samplerate -> Sample rate: should be two words.
- "Falloff type the feather": typo.
- JPEG, OpenJPEG and JPEG 2000 are the official spellings of the
respective projects.
- "... boundary of image(including ...": missing space.
- "Points in . direction (cannot be changed ...)":
Plural, it is a collection of multiple points. Also do not use
contraction for "cannot".
- The Bevel modifier's "Only Vertices" option was replaced by a
Vertices mode in 2.90.
- "Metaball Types": affects one object, should be singular.
- "Metaball data-block to defined blobby surfaces": typo.
- "Wire Size": this option has nothing to do with wireframes, I suppose
it's an old terminology.
- "... (for negative speed.)": remove trailing period.
- "Smooth factor effect": the prop describes a factor for an effect,
not an effect for a factor.
- "... assigned to their vertices(ensures ...": missing space.
- "... used when faces have the ObColor mode enabled": ObColor is not
used anywhere else in the UI (since Blender 2.50).
- "Effect Children": typo -> Affect.
- "Distort Min/Max": copy-pasted from another pair of properties.
- "... dismiss menu on release.(in 1/100ths of sec)": replace period
with space.
- "resolution": field names should be capitalized.
Pull Request: https://projects.blender.org/blender/blender/pulls/108227
Combine the newer less efficient C++ implementations and the older
less convenient C functions. The maps now contain one large array of
indices, split into groups by a separate array of offset indices.
Though performance of creating the maps is relatively unchanged, the
new implementation uses 4 bytes less per source element than the C
maps, and 20 bytes less than the newer C++ functions (which also
had more overhead with larger N-gons). The usage syntax is simpler
than the C functions as well.
The reduced memory usage is helpful for when these maps are cached
in the near future. It will also allow sharing the offsets between
maps for different domains like vertex to corner and vertex to face.
A simple `GroupedSpan` class is introduced to make accessing the
topology maps much simpler. It combines offset indices and a separate
span, splitting it into chunks in an efficient way.
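The offsets-plus-indices layout can be sketched as follows. The names are
hypothetical and this is not the `GroupedSpan` class itself; building the map
with a counting pass is also just one possible approach.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal non-owning view of a run of ints.
struct IntSpan {
  const int *data;
  int size;
  const int *begin() const { return data; }
  const int *end() const { return data + size; }
};

// All groups share one index array; group g occupies the range
// [offsets[g], offsets[g + 1]) of that array.
struct GroupedIndices {
  std::vector<int> offsets;
  std::vector<int> indices;

  IntSpan operator[](const int group) const
  {
    assert(group >= 0 && group + 1 < int(offsets.size()));
    const int start = offsets[group];
    return {indices.data() + start, offsets[group + 1] - start};
  }
};

// Build the map from "group of each element", e.g. the vertex of each corner.
GroupedIndices build_groups(const std::vector<int> &group_of_element, const int groups_num)
{
  GroupedIndices result;
  // Count the elements per group, shifted by one for the prefix sum.
  result.offsets.assign(groups_num + 1, 0);
  for (const int group : group_of_element) {
    assert(group >= 0 && group < groups_num);
    result.offsets[group + 1]++;
  }
  // Turn the counts into start offsets.
  for (int g = 0; g < groups_num; g++) {
    result.offsets[g + 1] += result.offsets[g];
  }
  // Scatter every element index into its group's range.
  result.indices.resize(group_of_element.size());
  std::vector<int> cursor(result.offsets.begin(), result.offsets.end() - 1);
  for (int e = 0; e < int(group_of_element.size()); e++) {
    result.indices[cursor[group_of_element[e]]++] = e;
  }
  return result;
}
```

Accessing a group is then a matter of reading two neighboring offsets, and the
offsets array could be shared by several maps with the same grouping.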
Pull Request: https://projects.blender.org/blender/blender/pulls/107861