Goals of this refactor:
* Reduce memory consumption of `IndexMask`. The old `IndexMask` uses an
`int64_t` for each index which is more than necessary in pretty much all
practical cases currently. Using `int32_t` might still become limiting
in the future in case we use this to index e.g. byte buffers larger than
a few gigabytes. We also don't want to template `IndexMask`, because
that would cause a split in the "ecosystem", or everything would have to
be implemented twice or templated.
* Allow for more multi-threading. The old `IndexMask` contains a single
array. This is generally good but has the problem that it is hard to fill
from multiple-threads when the final size is not known from the beginning.
This is commonly the case when e.g. converting an array of bool to an
index mask. Currently, this kind of code only runs on a single thread.
* Allow for efficient set operations like join, intersect and difference.
It should be possible to multi-thread those operations.
* It should be possible to iterate over an `IndexMask` very efficiently.
The most important part of that is to avoid all memory access when iterating
over continuous ranges. For some core nodes (e.g. math nodes), we generate
optimized code for the cases of irregular index masks and simple index ranges.
To achieve these goals, a few compromises had to made:
* Slicing of the mask (at specific indices) and random element access is
`O(log #indices)` now, but with a low constant factor. It should be possible
to split a mask into n approximately equally sized parts in `O(n)` though,
making the time per split `O(1)`.
* Using range-based for loops does not work well when iterating over a nested
data structure like the new `IndexMask`. Therefor, `foreach_*` functions with
callbacks have to be used. To avoid extra code complexity at the call site,
the `foreach_*` methods support multi-threading out of the box.
The new data structure splits an `IndexMask` into an arbitrary number of ordered
`IndexMaskSegment`. Each segment can contain at most `2^14 = 16384` indices. The
indices within a segment are stored as `int16_t`. Each segment has an additional
`int64_t` offset which allows storing arbitrary `int64_t` indices. This approach
has the main benefits that segments can be processed/constructed individually on
multiple threads without a serial bottleneck. Also it reduces the memory
requirements significantly.
For more details see comments in `BLI_index_mask.hh`.
I did a few tests to verify that the data structure generally improves
performance and does not cause regressions:
* Our field evaluation benchmarks take about as much as before. This is to be
expected because we already made sure that e.g. add node evaluation is
vectorized. The important thing here is to check that changes to the way we
iterate over the indices still allows for auto-vectorization.
* Memory usage by a mask is about 1/4 of what it was before in the average case.
That's mainly caused by the switch from `int64_t` to `int16_t` for indices.
In the worst case, the memory requirements can be larger when there are many
indices that are very far away. However, when they are far away from each other,
that indicates that there aren't many indices in total. In common cases, memory
usage can be way lower than 1/4 of before, because sub-ranges use static memory.
* For some more specific numbers I benchmarked `IndexMask::from_bools` in
`index_mask_from_selection` on 10.000.000 elements at various probabilities for
`true` at every index:
```
Probability Old New
0 4.6 ms 0.8 ms
0.001 5.1 ms 1.3 ms
0.2 8.4 ms 1.8 ms
0.5 15.3 ms 3.0 ms
0.8 20.1 ms 3.0 ms
0.999 25.1 ms 1.7 ms
1 13.5 ms 1.1 ms
```
Pull Request: https://projects.blender.org/blender/blender/pulls/104629
The main issue was the fact that if a Scene is overridden, it's content
will be fully invalidated when updating the liboverride at the end of
the file reading process. Since the FileData keeps a pointer to the
active view layer, it needs to be udated then.
As a side consequence, the liblinking of global data also needs to
happen before liboverrides are updated.
Embedded IDs (master collection of scene, etc.) do not exist in the Main
data-base. However, their tags should follow these from their owners. So
e.g. if a scene is in Main, its master collection should not be tagged
as no-main.
NOTE: this is somewhat also related to our ID tags sanitizing TODO task
(#88555).
Found while invesigating #107913.
Combine the newer less efficient C++ implementations and the older
less convenient C functions. The maps now contain one large array of
indices, split into groups by a separate array of offset indices.
Though performance of creating the maps is relatively unchanged, the
new implementation uses 4 bytes less per source element than the C
maps, and 20 bytes less than the newer C++ functions (which also
had more overhead with larger N-gons). The usage syntax is simpler
than the C functions as well.
The reduced memory usage is helpful for when these maps are cached
in the near future. It will also allow sharing the offsets between
maps for different domains like vertex to corner and vertex to face.
A simple `GroupedSpan` class is introduced to make accessing the
topology maps much simpler. It combines offset indices and a separate
span, splitting it into chunks in an efficient way.
Pull Request: https://projects.blender.org/blender/blender/pulls/107861
This adds `char *simulation_bake_directory` to the nodes modifier. The path is automatically generated the first time the modifier is baked. It is _not_ automatically changed afterwards. The path is relative to the .blend file by default. For now, the path is not exposed in the UI or Python API.
This fixes issues where renaming objects/modifiers can cause the baked data to not work anymore.
Pull Request: https://projects.blender.org/blender/blender/pulls/108201
Before 9f78530d80, the -1 coarse_edge_index values in the
foreach_edge calls would return false in BLI_BITMAP_TEST_BOOL,
which made them look like loose edges. BitSpan doesn't have this
problem, so the return for negative indices must be explicit.
Some `ImagePartialUpdateTest` test are calling code that needs access to
a valid `G_MAIN`. So store the generated main there as part of the setup
step, and reset G_MAIN to its original value (should be NULL) in the
teardown step.
NOTE: Things like `ID_BLEND_PATH_FROM_GLOBAL` and
`BKE_main_blendfile_path_from_global` are pure evil. It may be necessary
in a very few small cases, but their current usages need a lot of strong
cleanup.
Pull Request: https://projects.blender.org/blender/blender/pulls/108189
The usual special ShapeKey case needs yet another extra corner case
special handling... See comments in code for details about that specific
issue.
NOTE: May be worth checking if this can be backported to 3.3 LTS too.
Covers the macro ARRAY_SIZE() and STRNCPY.
The problem this change is aimed to solve it to provide cross-platform
compiler-independent safe way pf ensuring that the functions are used
correctly.
The type safety was only ensured for GCC and only for C. The C++
language and Clang compiler would not have detected issues of passing
bare pointer to neither of those macros.
Now the STRNCPY() will only accept a bounded array as the destination
argument, on any compiler.
The ARRAY_SIZE as well, but there are a bit more complications to it
in terms of transparency of the change.
In one place the ARRAY_SIZE was used on float3 type. This worked in the
old code because the type implements subscript operator, and the type
consists of 3 floats. One would argue this is somewhat hidden/implicit
behavior, which better be avoided. So an in-lined value of 3 is used now
there.
Another place is the ARRAY_SIZE used to define a bounded array of the
size which matches bounded array which is a member of a struct. While
the ARRAY_SIZE provides proper size in this case, the compiler does not
believe that the value is known at compile time and errors out with a
message that construction of variable-size arrays is not supported.
Solved by converting the field to std::array<> and adding dedicated
utility to get size of std::array at compile time. There might be a
better way of achieving the same result, or maybe the approach is
fine and just need to find a better place for such utility.
Surely, more macro from the BLI_string.h can be covered with the C++
inlined functions, but need to start somewhere.
There are also quite some changes to ensure the C linkage is not
enforced by code which includes the headers.
Pull Request: https://projects.blender.org/blender/blender/pulls/108041
Allows to share buffer data between the render result and image buffers.
The storage of the passes and buffers in the render result have been
wrapped into utility structures, with functions to operate on them.
Currently only image buffers which are sharing buffers with the render
results are using the implicit sharing. This allows proper decoupling of
the image buffers from the lifetime of the underlying render result.
Fixes#107248: Compositor ACCESS VIOLATION when updating datablocks from handlers
Additionally, this lowers the memory usage of multi-layer EXR sequences
by avoiding having two copies of render passes in memory.
It is possible to use implicit sharing in more places, but needs
some API to ensure the render result is the only owner of data before
writing to its pixels.
Pull Request: https://projects.blender.org/blender/blender/pulls/108045
Mark `NlaStrip.frame_{start,end}` and `NlaStrip.frame_{start,end}_ui` as
to-be-ignored for the library override system, and add a new set of RNA
properties `frame_{start,end}_raw` that the library override system can
use.
Versioning code ensures that overrides on `frame_{start,end}` are
altered to be applied to the `..._raw` counterpart instead.
The override system uses RNA to update properties one-by-one, and the
RNA code trying its best to keep things consistent / valid. This is very
much desired behaviour while a human is editing the data.
However, when the library override system is doing this, it is not
replaying the individual steps (that each end in a valid configuration),
but just setting each property one by one. As a result, the intermediate
state can be invalid (for example moving one strip into another) even
when the end result is perfectly fine.
This is what the `..._raw` properties do -- they set the values without
doing any validation, so they allow the library overrides system to move
strips around.
This assumes that the result of the override is still valid. Logic to
detect invalid situations, and reshuffle the NLA strips if necessary, is
left for a future commit as it is related to #107990 (NLA Vertical
Reorder).
Additionally, this commit adds functions
`BKE_lib_override_library_property_rna_path_change()` and
`BKE_lib_override_library_property_search_and_delete()` to the library
override API. The former is used to change RNA paths of property
overrides, and the latter is used to remove a property override
identified by its RNA path.
Store bevel weights in two new named float attributes:
- `bevel_weight_vert`
- `bevel_weight_edge`
These attributes are naming conventions. Blender doesn't enforce
their data type or domain at all, but some editing features and
modifiers use the hard-coded name. Eventually those tools should
become more generic, but this is a simple change to allow more
flexibility in the meantime.
The largest user-visible changes are that the attributes populate the
attribute list, and are propagated by geometry nodes. The method of
removing this data is now the attribute list as well.
This is a breaking change. Forward compatibility is not preserved, and
the vertex and edge `bevel_weight` properties are removed. Python API
users are expected to use the attribute API to get and set the values.
Fixes#106949
Pull Request: https://projects.blender.org/blender/blender/pulls/108023
A mistake in the recent ImBuf API refactor: the Libmv image accessor
was copying destination to destination (possibly using the wrong
number of channels as well).
Pull Request: https://projects.blender.org/blender/blender/pulls/108070
While strcpy is safe in this case, it's use requires extra scrutiny
and can cause problems if the strings are later translated.
Also move the ID code assignment into material_init_data
as the ID-code is more of an internal detail.
If there are no loose vertices or edges, it's basically free to
propagate that information to the result and save calculating
it later in case it's necessary. I observed a peformance increase
from 3.6 to 4.1 FPS when extruding a 1 million face grid.
Add a ensure_utf8 argument to WM_clipboard_text_get so callers don't
have to handle validation themselves.
Copying non-utf8 text into the Python console and buttons was possible,
causing invalid cursor position and a UnicodeDecodeError accessing
ConsoleLine.body from Python.
Used to be https://archive.blender.org/developer/D17123.
Internally these are already using the same code path anyways, there's no point in maintaining two distinct nodes.
The obvious approach would be to add Anisotropy controls to the Glossy BSDF node and remove the Anisotropic BSDF node. However, that would break forward compability, since older Blender versions don't know how to handle the Anisotropy input on the Glossy BSDF node.
Therefore, this commit technically removes the Glossy BSDF node, uses versioning to replace them with an Anisotropic BSDF node, and renames that node to "Glossy BSDF".
That way, when you open a new file in an older version, all the nodes show up as Anisotropic BSDF nodes and render correctly.
This is a bit ugly internally since we need to preserve the old `idname` which now no longer matches the UI name, but that's not too bad.
Also removes the "Sharp" distribution option and replaces it with GGX, sets Roughness to zero and disconnects any input to the Roughness socket.
Pull Request: https://projects.blender.org/blender/blender/pulls/104445
Since 44e4f077a9 and related commits, geometry nodes doesn't
try to hide the difference between real geometry data and instances from
the user. Other nodes were updated to only support real geometry, but
the "Mesh Boolean" node was never updated and still implicitly gathered
all the instances. This commit removes the special instance behavior in the
boolean node and adds realize instances nodes to keep existing behavior
in most cases. Typically this doesn't make a difference in the result,
though it could in the union mode for instance inputs. Shifting more of
the work to realizing instances should generally be better for
performance, since it's much faster.
Before 9f78530d80, the -1 coarse_edge_index values in the
foreach_edge calls would return false in BLI_BITMAP_TEST_BOOL,
which made them look like loose edges. BitSpan doesn't have this
problem, so the return for negative indices must be explicit.
The goal is to make it more explicit and centralized operation to
assign and steal buffer data, with proper ownership tracking.
The buffers and ownership flags are wrapped into their dedicated
structures now.
There should be no functional changes currently, it is a preparation
for allowing implicit sharing of the ImBuf buffers. Additionally, in
the future it is possible to more buffer-specific information (such
as color space) next to the buffer data itself. It is also possible
to clean up the allocation flags (IB_rect, ...) to give them more
clear naming and not have stored in the ImBuf->flags as they are only
needed for allocation.
The most dangerous part of this change is the change of byte buffer
data from `int*` to `uint8_t*`. In a lot of cases the byte buffer was
cast to `uchar*`, so those casts are now gone. But some code is
operating on `int*` so now there are casts in there. In practice this
should be fine, since we only support 64bit platforms, so allocations
are aligned. The real things to watch out for here is the fact that
allocation and offsetting from the byte buffer now need an explicit 4
channel multiplier.
Once everything is C++ it will be possible to simplify public
functions even further.
Pull Request: https://projects.blender.org/blender/blender/pulls/107609