BLI code for enums that are meant to be used as "bit flags" defined
an ENUM_OPERATORS macro in BLI_utildefines.h. This cleans up things
related to said macro:
- Move it out into a separate BLI_enum_flags.hh header, instead of
"random bag of things" that is the current place,
- Update it to no longer need manual indication of highest individual
bit value. This originally was added in a31a87f89 (2020 Oct), in
order to silence some UBSan warnings that were coming
from GPU related structures (looking at current GPU code, I don't
think this is happening anymore). However, that caused actual
user-visible bugs due to incorrectly specified max. enum bit value,
and today 14% of all usages have incorrect highest individual
bit value spelled out.
- I have reviewed all usages of operator ~ and none of them are
used for directly producing a DNA-serialized value; all the
usages are for masking out other bits for which the new ~
behavior that just flips all bits is fine.
- Make the macro define flag_is_set() function to ease check of bits
that are set in C++ enum class cases; update existing cases to use
that instead of three other ways that were used.
Pull Request: https://projects.blender.org/blender/blender/pulls/148230
Previously, `VArrayImpl` had a `materialize` and `materialize_to_uninitialized`
function. Now both are merged into one with an additional `bool
dst_is_uninitialized` parameter. The same is done for the
`materialize_compressed` method as all as `GVArrayImpl`.
While this kind of merging is typically not ideal, it reduces the binary size by
~200kb while being basically free performance wise. The cost of this predictable
boolean check is expected to be negligible even if only very few indices are
materialized. Additionally, in most cases, this parameter does not even have to
be checked, because for trivial types it does not matter if the destination
array is already initialized or not when overwriting it.
It saves this much memory, because there are quite a few implementations being
generated with e.g. `VArray::from_func` and a lot of code was duplicated for
each instantiation.
This changes only the actual `(G)VArrayImpl`, but not the `VArray` and `GVArray`
API which is typically used to work with virtual arrays.
Pull Request: https://projects.blender.org/blender/blender/pulls/145144
This adds a new Format String node which simplifies constructing strings from
multiple values. The node takes a format string and a dynamic number of
additional parameters as input. The format string determines how the other
inputs are inserted into the string. Only integer, float and string inputs are
supported for now.
It supports two different format syntaxes:
* Python compatible format syntax which also mostly matches the behavior of the
`fmt` C++ library. Most of this is supported, but there are some small
limitations.
* Syntax of the form `###.##` where each `#` stands for a digit. This is the
syntax that was introduced in #134860.
This node greatly simplifies common string operations which would have required
potentially many nodes before to convert numbers to strings and to concatenate
them. It also makes new conversions possible that were not supported before.
This node can also be used to insert e.g. frame numbers into a file path which
was surprisingly complex before.
This node has special behavior for the name of new inputs. For the purpose of
the node, the name of the inputs must be valid identifiers and it's usually
helpful when they are short. New names are therefore initialized to be single
characters. If possible, the first character of the linked input is used. This
works well when connecting e.g. a Separate Vector/Color node. Otherwise, inputs
are named `a` to `z` by default. If that's not possible, the source socket name
is used instead (converted to be a valid identifier). If that still doesn't
work, the name is made unique using the normal `.001` mechanism except that `_`
instead of `.` is used as separator to make the name a valid identifier.
Python Syntax references:
* Python: https://docs.python.org/3/library/string.html#formatspec
* `fmt`: https://fmt.dev/latest/syntax/
More detailed notes about compatibility with the above syntax specifications:
* Conversion using e.g. `!r` like in Python is not supported (maybe the future).
* Sub-attribute access like `{vector.x}` is not supported (maybe the future).
* Using `%` like in Python is not supported (maybe in future).
* Using `#` for an alternate form is not supported. This might help in the
future to make the syntax compatible with #134860.
* Using `L` like in the `fmt` library is not supported because it depends on the
locale which is not good for determinism.
* Grouping with e.g. thousands separators using e.g. `,` or `_` like in Python
is not supported (maybe in future). Need to think about the locale here too.
* Mixing of unnamed (`{}`) and named (`{x} or {0}`) specifiers is allowed.
However, all unnamed specifiers must come before any named specifier.
The implementation uses the `fmt` library for the actual formatting. However,
the inputs are preprocessed to give us more control over the exact supported
syntax and error messages. The code is already somewhat written so that many
strings could be formatted with the same format but that's not actually used yet
because we don't have string fields yet.
Error messages are propagated using a new mechanism that allows a limited form
of error propagation from multi-functions to the node that evaluates them.
Currently, this only works in fairly limited circumstances, e.g. it does not
work during field evaluation. Since this node is never part of field evaluation
yet, that limitation seems ok, but it's something to work on at some point.
Properly supporting that requires some more changes to propagate enough context
information everywhere. Also showing errors of field evaluation on the field
node itself (instead of on the evaluation node) requires even more work because
our current logging system is not setup to support that yet.
This node comes with a few new requirements for the socket items system: names
must be valid identifiers and they are initialized in a non-trivial way.
Overall, this was fairly straight forward to implement but currently it requires
to adding a bunch of new members to all the accessors that don't really need it.
This is something that we should simplify at some point even if I'm not entirely
sure how yet. The same new requirements used in this node would probably also
exist in a potential future expression node.
Pull Request: https://projects.blender.org/blender/blender/pulls/138860
This patch turns the options of the Color Balance node into inputs.
In the process, each of the wheels were split into two inputs, a base
float and a color. For instance, Gain is controlled using both a Base
Gain and Color Gain, the former controls the gain for all channels while
the latter controls it per channel.
Reference #137223.
Pull Request: https://projects.blender.org/blender/blender/pulls/138610
Previously, the `UserData` and `LocalUserData` classes were only supposed to be
used by the lazy-function system. However, they are generic enough so that they
can also be used by the multi-function system. Therefore, this patch extracts
them into a separate header that can be used in both evaluation systems.
I'm doing this in preparation for being able to pass the geometry nodes logger
to multi-functions, to be able to report errors from there.
Pull Request: https://projects.blender.org/blender/blender/pulls/138861
This patch adds a new `BLI_mutex.hh` header which adds `blender::Mutex` as alias
for either `tbb::mutex` or `std::mutex` depending on whether TBB is enabled.
Description copied from the patch:
```
/**
* blender::Mutex should be used as the default mutex in Blender. It implements a subset of the API
* of std::mutex but has overall better guaranteed properties. It can be used with RAII helpers
* like std::lock_guard. However, it is not compatible with e.g. std::condition_variable. So one
* still has to use std::mutex for that case.
*
* The mutex provided by TBB has these properties:
* - It's as fast as a spin-lock in the non-contended case, i.e. when no other thread is trying to
* lock the mutex at the same time.
* - In the contended case, it spins a couple of times but then blocks to avoid draining system
* resources by spinning for a long time.
* - It's only 1 byte large, compared to e.g. 40 bytes when using the std::mutex of GCC. This makes
* it more feasible to have many smaller mutexes which can improve scalability of algorithms
* compared to using fewer larger mutexes. Also it just reduces "memory slop" across Blender.
* - It is *not* a fair mutex, i.e. it's not guaranteed that a thread will ever be able to lock the
* mutex when there are always more than one threads that try to lock it. In the majority of
* cases, using a fair mutex just causes extra overhead without any benefit. std::mutex is not
* guaranteed to be fair either.
*/
```
The performance benchmark suggests that the impact is negilible in almost
all cases. The only benchmarks that show interesting behavior are the once
testing foreach zones in Geometry Nodes. These tests are explicitly testing
overhead, which I still have to reduce over time. So it's not unexpected that
changing the mutex has an impact there. What's interesting is that on macos the
performance improves a lot while on linux it gets worse. Since that overhead
should eventually be removed almost entirely, I don't really consider that
blocking.
Links:
* Documentation of different mutex flavors in TBB:
https://www.intel.com/content/www/us/en/docs/onetbb/developer-guide-api-reference/2021-12/mutex-flavors.html
* Older implementation of a similar mutex by me:
https://archive.blender.org/developer/differential/0016/0016711/index.html
* Interesting read regarding how a mutex can be this small:
https://webkit.org/blog/6161/locking-in-webkit/
Pull Request: https://projects.blender.org/blender/blender/pulls/138370
This patch turns the options of the Chroma Key node into inputs.
In the process, the minimum and maximum angles were renamed to Minimum
and Maximum for consistency with other matte nodes.
Reference #137223.
Pull Request: https://projects.blender.org/blender/blender/pulls/137812
This reduces verbosity when using `LinearAllocator` or `ResourceScope` to
allocate values for a `CPPType`. Now, this is simplified and one also does not
have to manually add a destructor call anymore.
Pull Request: https://projects.blender.org/blender/blender/pulls/137685
This makes accessing these properties more convenient. Since we only ever have
const references to `CPPType`, there isn't really a benefit to using methods to
avoid mutation.
Pull Request: https://projects.blender.org/blender/blender/pulls/137482
This implements bundles and closures which are described in more detail in this
blog post: https://code.blender.org/2024/11/geometry-nodes-workshop-october-2024/
tl;dr:
* Bundles are containers that allow storing multiple socket values in a single
value. Each value in the bundle is identified by a name. Bundles can be
nested.
* Closures are functions that are created with the Closure Zone and can be
evaluated with the Evaluate Closure node.
To use the patch, the `Bundle and Closure Nodes` experimental feature has to be
enabled. This is necessary, because these features are not fully done yet and
still need iterations to improve the workflow before they can be officially
released. These iterations are easier to do in `main` than in a separate branch
though. That's because this patch is quite large and somewhat prone to merge
conflicts. Also other work we want to do, depends on this.
This adds the following new nodes:
* Combine Bundle: can pack multiple values into one.
* Separate Bundle: extracts values from a bundle.
* Closure Zone: outputs a closure zone for use in the `Evaluate Closure` node.
* Evaluate Closure: evaluates the passed in closure.
Things that will be added soon after this lands:
* Fields in bundles and closures. The way this is done changes with #134811, so
I rather implement this once both are in `main`.
* UI features for keeping sockets in sync (right now there are warnings only).
One bigger issue is the limited support for lazyness. For example, all inputs of
a Combine Bundle node will be evaluated, even if they are not all needed. The
same is true for all captured values of a closure. This is a deeper limitation
that needs to be resolved at some point. This will likely be done after an
initial version of this patch is done.
Pull Request: https://projects.blender.org/blender/blender/pulls/128340
This patch implements the multi-function procedure operation for the new
CPU compositor, which is a concrete implementation of the PixelOperation
abstraction, much like ShaderOperation, but uses the FN system to more
efficiently evaluate a group of pixel-wise operations.
A few changes were done to FN to support development. The multi-function
builder now allows retrieving the built function. A new builder method
construct_and_set_matching_fn_cb was added to allow using the SI_SO
builders with non static functions. A few other SI_SO were added to. And
a CPP type for float4 was added.
Additionally, the Gamma, Math, Brightness, and Normal nodes were
implemented as an example. The Math node implementation reused the
existing GN math node implementation, so the code was moved to a common
file.
Reference #125968.
Pull Request: https://projects.blender.org/blender/blender/pulls/126988
This adds a new type of zone to Geometry Nodes that allows executing some nodes
for each element in a geometry.
## Features
* The `Selection` input allows iterating over a subset of elements on the set
domain.
* Fields passed into the input node are available as single values inside of the
zone.
* The input geometry can be split up into separate (completely independent)
geometries for each element (on all domains except face corner).
* New attributes can be created on the input geometry by outputting a single
value from each iteration.
* New geometries can be generated in each iteration.
* All of these geometries are joined to form the final output.
* Attributes from the input geometry are propagated to the output
geometries.
## Evaluation
The evaluation strategy is similar to the one used for repeat zones. Namely, it
dynamically builds a `lazy_function::Graph` once it knows how many iterations
are necessary. It contains a separate node for each iteration. The inputs for
each iteration are hardcoded into the graph. The outputs of each iteration a
passed to a separate lazy-function that reduces all the values down to the final
outputs. This final output can have a huge number of inputs and that is not
ideal for multi-threading yet, but that can still be improved in the future.
## Performance
There is a non-neglilible amount of overhead for each iteration. The overhead is
way larger than the per-element overhead when just doing field evaluation.
Therefore, normal field evaluation should be preferred when possible. That can
partially still be optimized if there is only some number crunching going on in
the zone but that optimization is not implemented yet.
However, processing many small geometries (e.g. each hair of a character
separately) will likely **always be slower** than working on fewer larger
geoemtries. The additional flexibility you get by processing each element
separately comes at the cost that Blender can't optimize the operation as well.
For node groups that need to handle lots of geometry elements, we recommend
trying to design the node setup so that iteration over tiny sub-geometries is
not required.
An opposite point is true as well though. It can be faster to process more
medium sized geometries in parallel than fewer very large geometries because of
more multi-threading opportunities. The exact threshold between tiny, medium and
large geometries depends on a lot of factors though.
Overall, this initial version of the new zone does not implement all
optimization opportunities yet, but the points mentioned above will still hold
true later.
Pull Request: https://projects.blender.org/blender/blender/pulls/127331
This continues the cmake modernization effort and introduces support for
allowing our optional dependencies to integrate properly. TBB is added
here as it's proven troublesome to maintain correctly.
Currently the only Blender project which uses the TBB headers directly
is `blenlib`. However, all downstream projects which require blenlib as
their dependency, and wish to properly make use of its threading
facilities, needed to define various TBB items in their CMake files. Not
only is this unnecessary and arcane, but several projects didn't do this
and ended up not using threading as well as producing ODR violations
along the way[1].
This PR makes TBB a modern dependency and exposes it PUBLIC'ly from
`blenlib`. All downstream projects which depend on blenlib will now
receive everything they require from TBB automatically. This includes
the `WITH_TBB` define, the headers, and the library itself.
[1] blender/blender@05241f47f5
Pull Request: https://projects.blender.org/blender/blender/pulls/124916
The depsgraph CoW mechanism is a bit of a misnomer. It creates an
evaluated copy for data-blocks regardless of whether the copy will
actually be written to. The point is to have physical separation between
original and evaluated data. This is in contrast to the commonly used
performance improvement of keeping a user count and copying data
implicitly when it needs to be changed. In Blender code we call this
"implicit sharing" instead. Importantly, the dependency graph has no
idea about the _actual_ CoW behavior in Blender.
Renaming this functionality in the despgraph removes some of the
confusion that comes up when talking about this, and will hopefully
make the depsgraph less confusing to understand initially too. Wording
like "the evaluated copy" (as opposed to the original data-block) has
also become common anyway.
Pull Request: https://projects.blender.org/blender/blender/pulls/118338
Bundling many tests in a single binary reduces build time and disk space
usage, but is less convenient for running individual tests command line
as filter flags need to be used.
This adds WITH_TESTS_SINGLE_BINARY to generate one executable file per
source file. Note that enabling this option requires a significant amount
of disk space.
Due to refactoring, the resulting ctest names are a bit different than
before. The number of tests is also a bit different depending if this
option is used, as one uses gtests discovery and the other is organized
purely by filename, which isn't always 1:1.
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/114604
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.
If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
This refactors `SocketValueVariant` with the following goals in mind:
* Support type erasure so that not all users of `SocketValueVariant` have
to know about all the types sockets can have.
* Move towards supporting "rainbow sockets" which are sockets whoose
type is only known at run-time.
* Reduce complexity when dealing with socket values in general. Previously,
one had to use `SocketValueVariantCPPType` a lot to manage uninitialized
memory. This is better abstracted away now.
One related change that I had to do that I didn't see coming at first was that
I had to refactor `set_default_remaining_outputs` because now the default value
of a `SocketValueVariant` would not contain any value. Previously, it was
initialized the zero-value of the template parameter. Similarly, I had to change
how implicit conversions are created, because comparing the `CPPType` of linked
sockets was not enough anymore to determine if a conversion is necessary.
We could potentially use `SocketValueVariant` for the remaining socket types in the
future as well. Not entirely sure if that helps yet. `SocketValueVariant` can easily be
adapted to make that work though. That would also justify the name
"SocketValueVariant" better.
Pull Request: https://projects.blender.org/blender/blender/pulls/116231
Due to changes in the build environment shader_builder wasn't able to
compile on macOs. This patch reverts several recent changes to CMake files.
* dbb2844ed9
* 94817f64b9
* 1b6cd937ff
The idea is that in the near future shader_builder will run on the buildbot as
part of any regular build to ensure that changes to the CMake doesn't break
shader_builder and we only detect it after a few days.
Pull Request: https://projects.blender.org/blender/blender/pulls/115929
NDEBUG is part of the C standard and disables asserts. Only this will
now be used to decide if asserts are enabled.
DEBUG was a Blender specific define, that has now been removed.
_DEBUG is a Visual Studio define for builds in Debug configuration.
Blender defines this for all platforms. This is still used in a few
places in the draw code, and in external libraries Bullet and Mantaflow.
Pull Request: https://projects.blender.org/blender/blender/pulls/115774