Goals of this refactor:
* Reduce memory consumption of `IndexMask`. The old `IndexMask` uses an
`int64_t` for each index which is more than necessary in pretty much all
practical cases currently. Using `int32_t` might still become limiting
in the future in case we use this to index e.g. byte buffers larger than
a few gigabytes. We also don't want to template `IndexMask`, because
that would cause a split in the "ecosystem", or everything would have to
be implemented twice or templated.
* Allow for more multi-threading. The old `IndexMask` contains a single
array. This is generally good but has the problem that it is hard to fill
from multiple-threads when the final size is not known from the beginning.
This is commonly the case when e.g. converting an array of bool to an
index mask. Currently, this kind of code only runs on a single thread.
* Allow for efficient set operations like join, intersect and difference.
It should be possible to multi-thread those operations.
* It should be possible to iterate over an `IndexMask` very efficiently.
The most important part of that is to avoid all memory access when iterating
over continuous ranges. For some core nodes (e.g. math nodes), we generate
optimized code for the cases of irregular index masks and simple index ranges.
To achieve these goals, a few compromises had to made:
* Slicing of the mask (at specific indices) and random element access is
`O(log #indices)` now, but with a low constant factor. It should be possible
to split a mask into n approximately equally sized parts in `O(n)` though,
making the time per split `O(1)`.
* Using range-based for loops does not work well when iterating over a nested
data structure like the new `IndexMask`. Therefor, `foreach_*` functions with
callbacks have to be used. To avoid extra code complexity at the call site,
the `foreach_*` methods support multi-threading out of the box.
The new data structure splits an `IndexMask` into an arbitrary number of ordered
`IndexMaskSegment`. Each segment can contain at most `2^14 = 16384` indices. The
indices within a segment are stored as `int16_t`. Each segment has an additional
`int64_t` offset which allows storing arbitrary `int64_t` indices. This approach
has the main benefits that segments can be processed/constructed individually on
multiple threads without a serial bottleneck. Also it reduces the memory
requirements significantly.
For more details see comments in `BLI_index_mask.hh`.
I did a few tests to verify that the data structure generally improves
performance and does not cause regressions:
* Our field evaluation benchmarks take about as much as before. This is to be
expected because we already made sure that e.g. add node evaluation is
vectorized. The important thing here is to check that changes to the way we
iterate over the indices still allows for auto-vectorization.
* Memory usage by a mask is about 1/4 of what it was before in the average case.
That's mainly caused by the switch from `int64_t` to `int16_t` for indices.
In the worst case, the memory requirements can be larger when there are many
indices that are very far away. However, when they are far away from each other,
that indicates that there aren't many indices in total. In common cases, memory
usage can be way lower than 1/4 of before, because sub-ranges use static memory.
* For some more specific numbers I benchmarked `IndexMask::from_bools` in
`index_mask_from_selection` on 10.000.000 elements at various probabilities for
`true` at every index:
```
Probability Old New
0 4.6 ms 0.8 ms
0.001 5.1 ms 1.3 ms
0.2 8.4 ms 1.8 ms
0.5 15.3 ms 3.0 ms
0.8 20.1 ms 3.0 ms
0.999 25.1 ms 1.7 ms
1 13.5 ms 1.1 ms
```
Pull Request: https://projects.blender.org/blender/blender/pulls/104629
Combine the newer less efficient C++ implementations and the older
less convenient C functions. The maps now contain one large array of
indices, split into groups by a separate array of offset indices.
Though performance of creating the maps is relatively unchanged, the
new implementation uses 4 bytes less per source element than the C
maps, and 20 bytes less than the newer C++ functions (which also
had more overhead with larger N-gons). The usage syntax is simpler
than the C functions as well.
The reduced memory usage is helpful for when these maps are cached
in the near future. It will also allow sharing the offsets between
maps for different domains like vertex to corner and vertex to face.
A simple `GroupedSpan` class is introduced to make accessing the
topology maps much simpler. It combines offset indices and a separate
span, splitting it into chunks in an efficient way.
Pull Request: https://projects.blender.org/blender/blender/pulls/107861
This adds `char *simulation_bake_directory` to the nodes modifier. The path is automatically generated the first time the modifier is baked. It is _not_ automatically changed afterwards. The path is relative to the .blend file by default. For now, the path is not exposed in the UI or Python API.
This fixes issues where renaming objects/modifiers can cause the baked data to not work anymore.
Pull Request: https://projects.blender.org/blender/blender/pulls/108201
Store bevel weights in two new named float attributes:
- `bevel_weight_vert`
- `bevel_weight_edge`
These attributes are naming conventions. Blender doesn't enforce
their data type or domain at all, but some editing features and
modifiers use the hard-coded name. Eventually those tools should
become more generic, but this is a simple change to allow more
flexibility in the meantime.
The largest user-visible changes are that the attributes populate the
attribute list, and are propagated by geometry nodes. The method of
removing this data is now the attribute list as well.
This is a breaking change. Forward compatibility is not preserved, and
the vertex and edge `bevel_weight` properties are removed. Python API
users are expected to use the attribute API to get and set the values.
Fixes#106949
Pull Request: https://projects.blender.org/blender/blender/pulls/108023
For realtime use cases, storing the geometry's state in memory at every
frame can be prohibitively expensive. This commit adds an option to
disable the caching, stored per object and accessible in the baking
panel. The default is still to enable caching.
Pull Request: https://projects.blender.org/blender/blender/pulls/107767
Avoid storing redundant and unnecessary data in temporary arrays during
face corner normal calculation with custom normals. Previously we used
96 bytes per normal space (`MLoopNorSpace` (64), `MLoopNorSpace *` (8),
`LinkData` (24)), now we use 36 (`CornerNormalSpace` (32), `int` (4)).
This is achieved with a few changes:
- Avoid sharing the data storage with the BMesh implementation
- Use indices to refer to normal fan spaces rather than pointers
- Only calculate indices of all corners in each fan when necessary
- Don't duplicate automatic normal in space storage
- Avoid storing redundant flags in space struct
Reducing memory usage gives a significant performance improvement in
my test files, which is consistent with findings from previous commits
(see 9fcfba4aae). In my test, the time used to calculate
normals for a character model with 196 thousand faces reduced from
20.2 ms to 14.0 ms, a 44% improvement.
Edit mode isn't affected by this change.
A note about the `reverse_index_array` function added here: this is the
same as the mesh mapping functions (for example for mapping from
vertices to face corners). I'm planning on working this area and hoping
to generalize/reuse the implementation in the near future.
Pull Request: https://projects.blender.org/blender/blender/pulls/107592
The main goal here is to reduce the number of times thread-local data has
to be looked up using e.g. `EnumerableThreadSpecific.local()`. While this
isn't a bottleneck in many cases, it is when the action performed on the local
data is very short and that happens very often (e.g. logging used sockets
during geometry nodes evaluation).
The solution is to simply pass the thread-local data as parameter to many
functions that use it, instead of looking it up in those functions which
generally is more costly.
The lazy-function graph executor now only looks up the local data if
it knows that it might be on a new thread, otherwise it uses the local data
retrieved earlier.
Alongside with `UserData` there is `LocalUserData` now. This allows users
of the lazy-function evaluation (such as geometry nodes) to have custom
thread-local data that is passed to all the lazy-functions automatically.
This is used for logging now.
The maximum delta time is still capped to one frame even if multiple
frames are skipped. Passing in a too large delta time is mostly unexpected
and can reduce the stability of the simulation. Also it can be very slow
when the simulation uses the delta time to determine e.g. how many particles
should be added.
For derived mesh triangulation information, currently the three face
corner indices are stored in the same struct as index of the mesh
polygon the triangle is part of. While those pieces of information are
often used together, they often aren't, and combining them prevents
the indices from being used with generic utilities. It also means that
1/3 more memory has to be written when recalculating the triangulation
after deforming the mesh, and that the entire triangle data has to be
read when only the polygon indices are needed.
This commit splits the polygon index into a separate cache on `Mesh`.
The triangulation data isn't saved to files, so this doesn't affect
.blend files at all.
In a simple test deforming a mesh with geometry nodes, the time used
to recalculate the triangulation reduced from 2.0 ms to 1.6 ms,
increasing overall FPS from 14.6 to 15.
Pull Request: https://projects.blender.org/blender/blender/pulls/106774
The simulation cache is allocated lazily during evaluation when the
depsgraph is active. However, during rendering, the depsgraph is not
active, so it's not created.
This adds support for building simulations with geometry nodes. A new
`Simulation Input` and `Simulation Output` node allow maintaining a
simulation state across multiple frames. Together these two nodes form
a `simulation zone` which contains all the nodes that update the simulation
state from one frame to the next.
A new simulation zone can be added via the menu
(`Simulation > Simulation Zone`) or with the node add search.
The simulation state contains a geometry by default. However, it is possible
to add multiple geometry sockets as well as other socket types. Currently,
field inputs are evaluated and stored for the preceding geometry socket in
the order that the sockets are shown. Simulation state items can be added
by linking one of the empty sockets to something else. In the sidebar, there
is a new panel that allows adding, removing and reordering these sockets.
The simulation nodes behave as follows:
* On the first frame, the inputs of the `Simulation Input` node are evaluated
to initialize the simulation state. In later frames these sockets are not
evaluated anymore. The `Delta Time` at the first frame is zero, but the
simulation zone is still evaluated.
* On every next frame, the `Simulation Input` node outputs the simulation
state of the previous frame. Nodes in the simulation zone can edit that
data in arbitrary ways, also taking into account the `Delta Time`. The new
simulation state has to be passed to the `Simulation Output` node where it
is cached and forwarded.
* On a frame that is already cached or baked, the nodes in the simulation
zone are not evaluated, because the `Simulation Output` node can return
the previously cached data directly.
It is not allowed to connect sockets from inside the simulation zone to the
outside without going through the `Simulation Output` node. This is a necessary
restriction to make caching and sub-frame interpolation work. Links can go into
the simulation zone without problems though.
Anonymous attributes are not propagated by the simulation nodes unless they
are explicitly stored in the simulation state. This is unfortunate, but
currently there is no practical and reliable alternative. The core problem
is detecting which anonymous attributes will be required for the simulation
and afterwards. While we can detect this for the current evaluation, we can't
look into the future in time to see what data will be necessary. We intend to
make it easier to explicitly pass data through a simulation in the future,
even if the simulation is in a nested node group.
There is a new `Simulation Nodes` panel in the physics tab in the properties
editor. It allows baking all simulation zones on the selected objects. The
baking options are intentially kept at a minimum for this MVP. More features
for simulation baking as well as baking in general can be expected to be added
separately.
All baked data is stored on disk in a folder next to the .blend file. #106937
describes how baking is implemented in more detail. Volumes can not be baked
yet and materials are lost during baking for now. Packing the baked data into
the .blend file is not yet supported.
The timeline indicates which frames are currently cached, baked or cached but
invalidated by user-changes.
Simulation input and output nodes are internally linked together by their
`bNode.identifier` which stays the same even if the node name changes. They
are generally added and removed together. However, there are still cases where
"dangling" simulation nodes can be created currently. Those generally don't
cause harm, but would be nice to avoid this in more cases in the future.
Co-authored-by: Hans Goudey <h.goudey@me.com>
Co-authored-by: Lukas Tönne <lukas@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/104924
The length passed to IDP_NewString included the nil terminator
unlike IDP_AssignString. Callers to IDP_AssignString from MOD_nodes.cc
passed in an invalid size for e.g. There where also callers passing in
the strlen(..) of the value unnecessarily.
Resolve with the following changes:
- Add `*MaxSize()` versions of IDP_AssignString & IDP_NewString which
take a size argument.
- Remove the size argument from the existing functions as most callers
don't need to clamp the length of the string, avoid calculating &
passing in length values when they're unnecessary.
- When clamping the size is needed, both functions now include the nil
byte in the size since this is the convention for BLI_string and other
BLI API's dealing with nil terminated strings,
it avoids having to pass in `sizeof(value) - 1`.
These aren't propagated as attributes since interpolating them with the
generic rules often gives strange results, and they're intended to as
an optimization. Though theoretically it would be nice if this
copying became more generic in the future.
Currently `Mesh to Volume` creates a volume using
`openvdb::tools::meshToVolume` which is then filled with the specified
density and the class set to Fog Volume. This is wrong because
`meshToVolume` creates a signed distance field by default that needs
to be converted to a Fog Volume with `openvdb::tools::sdfToFogVolume`
to get a proper Fog volume.
Here is the description of what that function does (from OpenVDB):
"The active and negative-valued interior half of the narrow band
becomes a linear ramp from 0 to 1; the inactive interior becomes
active with a constant value of 1; and the exterior, including the
background and the active exterior half of the narrow band, becomes
inactive with a constant value of 0. The interior, though active,
remains sparse."
This means with this commit old files will not look the same.
There is no way to version this as the options for external band width
and not filling the volume is removed (they don't make any sense for a
proper fog volume).
Pull Request: https://projects.blender.org/blender/blender/pulls/107279
Implicit sharing means attribute ownership is shared between geometry
data-blocks, and the sharing happens automatically. So it's unnecessary
to choose whether to enable it when copying a mesh.
The typical order is vertex, edge, face(polygon), corner(loop), but in
these three functions polys and loops were reversed. Also use more
typical "num" variable names rather than "len"
Add the ability to retrieve implicit sharing info directly from the
C++ attribute API, which simplifies memory usage and performance
optimizations making use of it. This commit uses the additions to
the API to avoid copies in a few places:
- The "rest_position" attribute in the mesh modifier stack
- Instance on Points node
- Instances to points node
- Mesh to points node
- Points to vertices node
Many files are affected because in order to include the new information
in the API's returned data, I had to switch a bunch of types from
`VArray` to `AttributeReader`. This generally makes sense anyway, since
it allows retrieving the domain, which wasn't possible before in some
cases. I overloaded the `*` deference operator for some syntactic sugar
to avoid the (very ugly) `.varray` that would be necessary otherwise.
Pull Request: https://projects.blender.org/blender/blender/pulls/107059
Implements #95966, as the final step of #95965.
This commit changes the storage of mesh edge vertex indices from the
`MEdge` type to the generic `int2` attribute type. This follows the
general design for geometry and the attribute system, where the data
storage type and the usage semantics are separated.
The main benefit of the change is reduced memory usage-- the
requirements of storing mesh edges is reduced by 1/3. For example,
this saves 8MB on a 1 million vertex grid. This also gives performance
benefits to any memory-bound mesh processing algorithm that uses edges.
Another benefit is that all of the edge's vertex indices are
contiguous. In a few cases, it's helpful to process all of them as
`Span<int>` rather than `Span<int2>`. Similarly, the type is more
likely to match a generic format used by a library, or code that
shouldn't know about specific Blender `Mesh` types.
Various Notes:
- The `.edge_verts` name is used to reflect a mapping between domains,
similar to `.corner_verts`, etc. The period means that it the data
shouldn't change arbitrarily by the user or procedural operations.
- `edge[0]` is now used instead of `edge.v1`
- Signed integers are used instead of unsigned to reduce the mixing
of signed-ness, which can be error prone.
- All of the previously used core mesh data types (`MVert`, `MEdge`,
`MLoop`, `MPoly` are now deprecated. Only generic types are used).
- The `vec2i` DNA type is used in the few C files where necessary.
Pull Request: https://projects.blender.org/blender/blender/pulls/106638
- "Lens" can be a transparent object used in cameras, or specifically
its property of focal length
- "Empty" can be an adjective meaning void, or an object type. The
latter is already disambiguated using `ID_ID`
- "New" and "Old" are adjectives that can have agreements in some
languages
- "Modified" is an adjective that can have agreement in some languages
- "Clipping" can be a property of a camera, or a behavior of the
mirror modifier
- "Value" in HSV nodes, see #105113
- "Area" in the Face Area geometry node, can mean a measurement or a
window type
- "New" is an adjective that can have agreement
- "Tab" can be a UI element or a whitespace character
- "Volume" can mean a measurement or an object type. The latter is
already disambiguated using `ID_ID`
These changes introduce the new `BLT_I18NCONTEXT_TIME` translation
context.
They also remove `BLT_I18NCONTEXT_VIRTUAL_REALITY`, which I added at
one point but then couldn't find which messages I wanted to fix with
it.
Ref #43295
Pull Request: #106718
Implements #95967.
Currently the `MPoly` struct is 12 bytes, and stores the index of a
face's first corner and the number of corners/verts/edges. Polygons
and corners are always created in order by Blender, meaning each
face's corners will be after the previous face's corners. We can take
advantage of this fact and eliminate the redundancy in mesh face
storage by only storing a single integer corner offset for each face.
The size of the face is then encoded by the offset of the next face.
The size of a single integer is 4 bytes, so this reduces memory
usage by 3 times.
The same method is used for `CurvesGeometry`, so Blender already has
an abstraction to simplify using these offsets called `OffsetIndices`.
This class is used to easily retrieve a range of corner indices for
each face. This also gives the opportunity for sharing some logic with
curves.
Another benefit of the change is that the offsets and sizes stored in
`MPoly` can no longer disagree with each other. Storing faces in the
order of their corners can simplify some code too.
Face/polygon variables now use the `IndexRange` type, which comes with
quite a few utilities that can simplify code.
Some:
- The offset integer array has to be one longer than the face count to
avoid a branch for every face, which means the data is no longer part
of the mesh's `CustomData`.
- We lose the ability to "reference" an original mesh's offset array
until more reusable CoW from #104478 is committed. That will be added
in a separate commit.
- Since they aren't part of `CustomData`, poly offsets often have to be
copied manually.
- To simplify using `OffsetIndices` in many places, some functions and
structs in headers were moved to only compile in C++.
- All meshes created by Blender use the same order for faces and face
corners, but just in case, meshes with mismatched order are fixed by
versioning code.
- `MeshPolygon.totloop` is no longer editable in RNA. This API break is
necessary here unfortunately. It should be worth it in 3.6, since
that's the best way to allow loading meshes from 4.0, which is
important for an LTS version.
Pull Request: https://projects.blender.org/blender/blender/pulls/105938
For some reason, lattice modifier always depend on self object transform.
This fix just move extra dependencies in to if case statement, also some
cleaning of this code area.
Pull Request: https://projects.blender.org/blender/blender/pulls/105293