On GCC, the loops created by `BLI_VEC_OP_IMPL` were not always
unrolled, leading to branching. For `attribute_math::mix4<float3>`,
this lead to a significant performance regression compared to its
older `interp_v3_v3v3v3v3` counterpart.
Instead of a using macros to create the for loops, use variadic
templates to manually unroll them. The compiler might do it anyway
(I didn't observe any effect on Clang in my tests), but there should
be no reason not to unroll these small loops, and making it explicit
and removing use of macros seems better.
On a Ryzen 3700x, this commits doubles the performance of Catmull
Rom curve position evaluation (from 18-19ms to around 9-10ms).
Differential Revision: https://developer.blender.org/D16136
This is the conventional way of dealing with unused arguments in C++,
since it works on all compilers.
Regex find and replace: `UNUSED\((\w+)\)` -> `/*$1*/`
Rewrite PBVH draw to allocate attributes into individual VBOs.
The old system tried to create a single VBO that could feed
every open viewport. This required uploading every color and
UV attribute to the viewport whether needed or not, often exceeding
the VBO limit.
This new system creates one VBO per attribute. Each attribute layout is
given its own GPU batch which is cached inside the owning PBVH node.
Notes:
* This is a full C++ rewrite. The old code is still there; ripping it out
can happen later.
* PBVH nodes now have a collection of batches, PBVHBatches, that keeps
track of all the batches inside the node.
* Batches are built exclusively from a list of attributes.
* Each attribute has its own VBO.
* Overlays, workbench and EEVEE can all have different attribute
layouts, each of which will get its own batch.
Reviewed by: Clement Foucault
Differential Revision: https://developer.blender.org/D15428
Ref D15428
Corrections for caret insertion & movement and deletion for text
strings that include non-precomposed diacritical marks (Unicode
combining characters).
See D15659 for more details and examples.
Differential Revision: https://developer.blender.org/D15659
Reviewed by Campbell Barton
Although rB67e23b4b2967 turned the problem more recurrent, the warning
messages in the console always appear when `BKE_fluid_cache_free_all`
is called.
This is because of a bug in `BLI_filelist_dir_contents` as this function
calls `BLI_strdupcat` instead of `BLI_join_dirfile`
NOTE: Other places in Blender avoid this problem by making sure to add
a `SEP_STR` to the end of the directory.
Differential Revision: https://developer.blender.org/D16043
Match minimum supported versions from the WIKI [0] by raising them to:
- GCC 9.3.1
- CLANG 8.0
- MVCS 2019 (16.9.16 / 1928)
Details:
- Add CMake checks that ensure supported compiler versions early on.
- Previously GCC per-processor version checks served to exclude
`__clang__`, in some cases this has been replaced by explicitly
excluding `__clang__`. This was needed as CLANG treated some of these
flags differently to GCC, causing the build to fail.
- Remove USE_APPLE_OMP_FIX GCC-4.2 OpenMP workaround.
- Remove linking error workaround for old MSVC versions.
[0]: https://wiki.blender.org/wiki/Building_Blender
Reviewed by: brecht, LazyDodo
Ref D16068
Previously removing elements based on a predicate was a bit cumbersome,
especially for hash tables. Now there is a new `remove_if` method in some
data structures which is similar to `std::erase_if`. We could consider adding
`blender::erase_if` in the future to more closely mimic the standard library,
but for now this is using the api design of the surrounding code is used.
This is already the case for most CMake usage.
Although some find modules are an exception to this, as they were
originally maintained externally they use some different conventions.
Also corrected bad indentation in: intern/cycles/CMakeLists.txt
I'm adding some asset APIs/types in C++ that the file-listing code would
use. I prefer porting this code to C++ over adding a C-API for the asset
code.
Includes some minor cleanups that shouldn't change behavior, like using
`MEM_new()`/`MEM_cnew()`, C++ style C-library includes,
`LISTBASE_FOREACH()`, removing unnecessary typedefs, etc.
This was used in early node based particle system development
but is not used anymore. The code also didn't match the standards
of other data structures in blenlib.
In large node setup the threading overhead was sometimes very significant.
That's especially true when most nodes do very little work.
This commit improves the scheduling by not using multi-threading in many
cases unless it's likely that it will be worth it. For more details see the comments
in `BLI_lazy_threading.hh`.
Differential Revision: https://developer.blender.org/D15976
Add new functions to `array_utils` namespace called `gather(..)`.
Versions of `GVArray::materialize_compressed_to_uninitialized(..)` with
threading have been reimplemented locally in multiple geometry node
contexts. The purpose of this patch is therefore to:
* Assemble these implementations in a single file.
* Provide a naming convention that is easier to recognize.
Differential Revision: https://developer.blender.org/D15786
Returns a new range, that contains the intersection of the current one
with the given range.
This is helpful to select a portion of a range without having to deal with
all the asserts of other functions. The resulting range being always a
valid subrange, it can be used to iterate or copy a part of a vector.
The trim functionality is implemented in the geometry module, and
generalized a bit to be potentially useful for bisecting in the future.
The implementation is based on a helper type called `IndexRangeCyclic`
which allows iteration over all control points between two points on a
curve.
Catmull Rom curves are now supported-- trimmed without resampling first.
However, maintaining the exact shape is not possible. NURBS splines are
still converted to polylines using the evaluated curve concept.
Performance is equivalent or faster then a 3.1 build with regards to
node timings. Compared to 3.3 and 3.2, it's easy to observe test cases
where the node is at least 3 or 4 times faster.
Differential Revision: https://developer.blender.org/D14481
This refactors the geometry nodes evaluation system. No changes for the
user are expected. At a high level the goals are:
* Support using geometry nodes outside of the geometry nodes modifier.
* Support using the evaluator infrastructure for other purposes like field evaluation.
* Support more nodes, especially when many of them are disabled behind switch nodes.
* Support doing preprocessing on node groups.
For more details see T98492.
There are fairly detailed comments in the code, but here is a high level overview
for how it works now:
* There is a new "lazy-function" system. It is similar in spirit to the multi-function
system but with different goals. Instead of optimizing throughput for highly
parallelizable work, this system is designed to compute only the data that is actually
necessary. What data is necessary can be determined dynamically during evaluation.
Many lazy-functions can be composed in a graph to form a new lazy-function, which can
again be used in a graph etc.
* Each geometry node group is converted into a lazy-function graph prior to evaluation.
To evaluate geometry nodes, one then just has to evaluate that graph. Node groups are
no longer inlined into their parents.
Next steps for the evaluation system is to reduce the use of threads in some situations
to avoid overhead. Many small node groups don't benefit from multi-threading at all.
This is much easier to do now because not everything has to be inlined in one huge
node tree anymore.
Differential Revision: https://developer.blender.org/D15914
Generally we don't want to do per-element operations on these spans
because of the overhead of the runtime type system, but these operations
on the whole span avoid ugly pointer arithmetic in other areas.