Including <iostream> or similar headers is quite expensive, since it
also pulls in things like <locale> and so on. In many BLI headers,
iostreams are only used to implement some sort of "debug print",
or an operator<< for ostream.
Change some of the commonly used places to instead include <iosfwd>,
which is the standard way of forward-declaring iostreams related
classes, and move the actual debug-print / operator<< implementations
into .cc files.
This is not done for templated classes though (it would be possible
to provide explicit operator<< instantiations somewhere in the
source file, but that would lead to hard-to-figure-out linker error
whenever someone would add a different template type). There, where
possible, I changed from full <iostream> include to only the needed
<ostream> part.
For Span<T>, I just removed print_as_lines since it's not used by
anything. It could be moved into a .cc file using a similar approach
as above if needed.
Doing full blender build changes include counts this way:
- <iostream> 1986 -> 978
- <sstream> 2880 -> 925
It does not affect the total build time much though, mostly because
towards the end of it there's just several CPU cores finishing
compiling OpenVDB related source files.
Pull Request: https://projects.blender.org/blender/blender/pulls/111046
Listing the "Blender Foundation" as copyright holder implied the Blender
Foundation holds copyright to files which may include work from many
developers.
While keeping copyright on headers makes sense for isolated libraries,
Blender's own code may be refactored or moved between files in a way
that makes the per file copyright holders less meaningful.
Copyright references to the "Blender Foundation" have been replaced with
"Blender Authors", with the exception of `./extern/` since these this
contains libraries which are more isolated, any changed to license
headers there can be handled on a case-by-case basis.
Some directories in `./intern/` have also been excluded:
- `./intern/cycles/` it's own `AUTHORS` file is planned.
- `./intern/opensubdiv/`.
An "AUTHORS" file has been added, using the chromium projects authors
file as a template.
Design task: #110784
Ref !110783.
A lot of files were missing copyright field in the header and
the Blender Foundation contributed to them in a sense of bug
fixing and general maintenance.
This change makes it explicit that those files are at least
partially copyrighted by the Blender Foundation.
Note that this does not make it so the Blender Foundation is
the only holder of the copyright in those files, and developers
who do not have a signed contract with the foundation still
hold the copyright as well.
Another aspect of this change is using SPDX format for the
header. We already used it for the license specification,
and now we state it for the copyright as well, following the
FAQ:
https://reuse.software/faq/
For example
```
OIIOOutputDriver::~OIIOOutputDriver()
{
}
```
becomes
```
OIIOOutputDriver::~OIIOOutputDriver() {}
```
Saves quite some vertical space, which is especially handy for
constructors.
Pull Request: https://projects.blender.org/blender/blender/pulls/105594
This moves all multi-function related code in the `functions` module
into a new `multi_function` namespace. This is similar to how there
is a `lazy_function` namespace.
The main benefit of this is that many types names that were prefixed
with `MF` (for "multi function") can be simplified.
There is also a common shorthand for the `multi_function` namespace: `mf`.
This is also similar to lazy-functions where the shortened namespace
is called `lf`.
This is the conventional way of dealing with unused arguments in C++,
since it works on all compilers.
Regex find and replace: `UNUSED\((\w+)\)` -> `/*$1*/`
This improves performance of the procedure executor on secondary metrics
(i.e. not for the main use case when many elements are processed together,
but for the use case when a single element is processed at a time).
In my benchmark I'm measuring a 50-60% improvement:
* Procedure with a single function (executed many times): `5.8s -> 2.7s`.
* Procedure with 1000 functions (executed many times): `2.4 -> 1.0s`.
The speedup is mainly achieved in multiple ways:
* Store an `Array` of variable states, instead of a map. The array is indexed
with indices stored in each variable. This also avoids separately allocating
variable states.
* Move less data around in the scheduler and use a `Stack` instead of `Map`.
`Map` was used before because it allows for some optimizations that might
be more important in the future, but they don't matter right now (e.g. joining
execution paths that diverged earlier).
* Avoid memory allocations by giving the `LinearAllocator` some memory
from the stack.
Goals:
* Better high level control over where devirtualization occurs. There is always
a trade-off between performance and compile-time/binary-size.
* Simplify using array devirtualization.
* Better performance for cases where devirtualization wasn't used before.
Many geometry nodes accept fields as inputs. Internally, that means that the
execution functions have to accept so called "virtual arrays" as inputs. Those
can be e.g. actual arrays, just single values, or lazily computed arrays.
Due to these different possible virtual arrays implementations, access to
individual elements is slower than it would be if everything was just a normal
array (access does through a virtual function call). For more complex execution
functions, this overhead does not matter, but for small functions (like a simple
addition) it very much does. The virtual function call also prevents the compiler
from doing some optimizations (e.g. loop unrolling and inserting simd instructions).
The solution is to "devirtualize" the virtual arrays for small functions where the
overhead is measurable. Essentially, the function is generated many times with
different array types as input. Then there is a run-time dispatch that calls the
best implementation. We have been doing devirtualization in e.g. math nodes
for a long time already. This patch just generalizes the concept and makes it
easier to control. It also makes it easier to investigate the different trade-offs
when it comes to devirtualization.
Nodes that we've optimized using devirtualization before didn't get a speedup.
However, a couple of nodes are using devirtualization now, that didn't before.
Those got a 2-4x speedup in common cases.
* Map Range
* Random Value
* Switch
* Combine XYZ
Differential Revision: https://developer.blender.org/D14628
Use a shorter/simpler license convention, stops the header taking so
much space.
Follow the SPDX license specification: https://spdx.org/licenses
- C/C++/objc/objc++
- Python
- Shell Scripts
- CMake, GNUmakefile
While most of the source tree has been included
- `./extern/` was left out.
- `./intern/cycles` & `./intern/atomic` are also excluded because they
use different header conventions.
doc/license/SPDX-license-identifiers.txt has been added to list SPDX all
used identifiers.
See P2788 for the script that automated these edits.
Reviewed By: brecht, mont29, sergey
Ref D14069
Previously, the function names were stored in `std::string` and were often
created dynamically (especially when the function just output a constant).
This resulted in a lot of overhead.
Now the function name is just a `const char *` that should be statically
allocated. This is good enough for the majority of cases. If a multi-function
needs a more dynamic name, it can override the `MultiFunction::debug_name`
method.
In my test file with >400,000 simple math nodes, the execution time improves from
3s to 1s.
Sometimes not all outputs of a multi-function are required by the
caller. In those cases it would be a waste of compute resources
to calculate the unused values anyway. Now, the caller of a
multi-function can specify when a specific output is not used.
The called function can check if an output is unused and may
ignore it. Multi-functions can still computed unused outputs as
before if they don't want to check if a specific output is unused.
The multi-function procedure system has been updated to support
ignored outputs in call instructions. An ignored output just has no
variable assigned to it.
The field system has been updated to generate a multi-function
procedure where unused outputs are ignored.
Now an instruction knows the cursors where it is inserted instead
of just the instruction that references it. This has two benefits:
* An instruction knows when it is the entry instruction.
* The cursor can contain more information, e.g. if it is linked to the
true or false branch of a branch instruction.
This also simplifies updating the procedure in future optimization
passes.
This implements the initial core framework for fields and anonymous
attributes (also see T91274).
The new functionality is hidden behind the "Geometry Nodes Fields"
feature flag. When enabled in the user preferences, the following
new nodes become available: `Position`, `Index`, `Normal`,
`Set Position` and `Attribute Capture`.
Socket inspection has not been updated to work with fields yet.
Besides these changes at the user level, this patch contains the
ground work for:
* building and evaluating fields at run-time (`FN_fields.hh`) and
* creating and accessing anonymous attributes on geometry
(`BKE_anonymous_attribute.h`).
For evaluating fields we use a new so called multi-function procedure
(`FN_multi_function_procedure.hh`). It allows composing multi-functions
in arbitrary ways and supports efficient evaluation as is required by
fields. See `FN_multi_function_procedure.hh` for more details on how
this evaluation mechanism can be used.
A new `AttributeIDRef` has been added which allows handling named
and anonymous attributes in the same way in many places.
Hans and I worked on this patch together.
Differential Revision: https://developer.blender.org/D12414