It appears that 8 bit `blend_color_softlight_byte` call used a wrong
blending routing (overlay), while `blend_color_softlight_float` is
correct. Seems that this was never caught. The correct fomula should be
`dst = 2ab + a^2 * (1 - 2b)`.
Pull Request: https://projects.blender.org/blender/blender/pulls/135382
The issue was that sometimes the group inputs of a node group were shuffled
around unexpectedly and thus inputs were passed to the wrong sockets.
The `or_socket_usages` function sorts the given span so that the key is more
likely to be reused, reducing the number of nodes inserted in the graph. The
issue was that `build_warning_node` passes `group_output_used_sockets_` into the
function the order of which is important. It thus should not be reordered.
The fix is to just never reorder the span passed to `or_socket_usages` but to
make a local copy instead which can be sorted without problems. Often this copy
is done already anyway when the span is inserted into
`graph_params.socket_usages_combination_cache` as `Vector`.
This fix also makes an assumption about `Map.lookup_or_add_cb` which was not
documented before. Namely it assumes that the key is moved into the map only
after the callback has been called. This behavior is now documented and there is
a unit test for it.
Pull Request: https://projects.blender.org/blender/blender/pulls/135528
The general idea is to keep the 'old', C-style MEM_callocN signature, and slowly
replace most of its usages with the new, C++-style type-safer template version.
* `MEM_cnew<T>` allocation version is renamed to `MEM_callocN<T>`.
* `MEM_cnew_array<T>` allocation version is renamed to `MEM_calloc_arrayN<T>`.
* `MEM_cnew<T>` duplicate version is renamed to `MEM_dupallocN<T>`.
Similar templates type-safe version of `MEM_mallocN` will be added soon
as well.
Following discussions in !134452.
NOTE: For now static type checking in `MEM_callocN` and related are slightly
different for Windows MSVC. This compiler seems to consider structs using the
`DNA_DEFINE_CXX_METHODS` macro as non-trivial (likely because their default
copy constructors are deleted). So using checks on trivially
constructible/destructible instead on this compiler/system.
Pull Request: https://projects.blender.org/blender/blender/pulls/134771
This adds three new functions to find the first 0 or 1 bit in an arbitrarily
long bit span:
```cpp
blender::bits::find_first_0_index(BitSpan) -> std::optional<int64_t>
blender::bits::find_first_1_index(BitSpan) -> std::optional<int64_t>
blender::bits::find_first_1_index_expr(Expr, BitSpans...) -> std::optional<int64_t>
```
The two first ones are implemented in terms of the third. The `*_expr` variant
allows e.g. finding the first set bit when ORing two bit spans together without
computing the entire intermediate result first. Or it can be used to find the
first index where two bit spans are different.
Pull Request: https://projects.blender.org/blender/blender/pulls/134923
This reimplements the CSV parser used by the (still experimental) Import CSV
node.
Reliability is improved by:
* Properly handling quoted fields.
* Unit tests.
* Generalizing the parser to be able to handle customized delimiter, quote and
escape characters (those are not exposed in the node yet though).
* More accurate detection of column types by actually taking all values of a
column into account instead of only the first row.
Performance is improved by designing the parser in a way that supports
multi-threaded parsing. I'm measuring about 5x performance improvement which
mainly comes from multi-threading. Some files I wanted to use for benchmarking
didn't load in the version that's in `main` but do load fine with this new
version.
The implementation is now split up into two parts:
1. A general CSV parser in `blenlib` that manages splitting a buffer into
records and their fields.
2. Application specific parsing of fields into e.g. floats and integers which
remains in `io/csv/importer`.
This separation simplifies unit testing and makes the core code more reusable.
Pull Request: https://projects.blender.org/blender/blender/pulls/134715
Previously, there was a `StringRef.copy` method which would copy the string into
the given buffer. However, it was not defined for the case when the buffer was
too small. It moved the responsibility of making sure the buffer is large enough
to the caller.
Unfortunately, in practice that easily hides bugs in builds without asserts
which don't come up in testing much. Now, the method is replaced with
`StringRef.copy_utf8_truncated` which has much more well defined semantics and
also makes sure that the string remains valid utf-8.
This also renames `unsafe_copy` to `copy_unsafe` to make the naming more similar
to `copy_utf8_truncated`.
Pull Request: https://projects.blender.org/blender/blender/pulls/133677
Before, it was only possible to apply modifiers through a multires modifier if
they were deform-only. The Geometry Nodes modifier is of course not deform-only.
However, often one can build a node setup, that only deforms and does nothing
else.
To make it possible to apply the Geometry Nodes modifier in such cases the
following things had to be done:
* Update `BKE_modifier_deform_verts` to work with modifiers that implement
`modify_geometry_set` instead of `deform_verts`.
* Add error handling for the case when `modify_geometry_set` does more than just
deformation.
* Allow the Geometry Nodes modifier to be applied through a multi-res modifier.
Two new utility types (`ArrayState` and `MeshTopologyState`) have been
introduced to allow for efficient and accurate checking whether the topology has
been modified. In common cases, they can detect that the topology has not been
changed in constant time, but they fall back to linear time checking if it's not
immediately obvious that the topology has not been changed.
This works with the example files from #98559 and #97603.
Pull Request: https://projects.blender.org/blender/blender/pulls/131904
This can be useful when going from a selection of curves to the corresponding
selection of points.
The simple implementation given here should work quite well in the majority of
cases. There is still optimization potential for some cases involving masks with
many gaps. The implementation is also single threaded for now. Using
multi-threading is possible, but should ideally be done in some way to still
lets us exploit the fact that the index ranges are already in sorted order.
Pull Request: https://projects.blender.org/blender/blender/pulls/133323
When using clangd or running clang-tidy on headers there are
currently many errors. These are noisy in IDEs, make auto fixes
impossible, and break features like code completion, refactoring
and navigation.
This makes source/blender headers work by themselves, which is
generally the goal anyway. But #includes and forward declarations
were often incomplete.
* Add #includes and forward declarations
* Add IWYU pragma: export in a few places
* Remove some unused #includes (but there are many more)
* Tweak ShaderCreateInfo macros to work better with clangd
Some types of headers still have errors, these could be fixed or
worked around with more investigation. Mostly preprocessor
template headers like NOD_static_types.h.
Note that that disabling WITH_UNITY_BUILD is required for clangd to
work properly, otherwise compile_commands.json does not contain
the information for the relevant source files.
For more details see the developer docs:
https://developer.blender.org/docs/handbook/tooling/clangd/
Pull Request: https://projects.blender.org/blender/blender/pulls/132608
Before, it was only possible to clear the entire cache at once or to rely on
automatic clearing when it gets full. This patch adds the ability to remove
cached data based on a predicate function.
This is useful for #124369 for partially invalidating the cache for some files.
Pull Request: https://projects.blender.org/blender/blender/pulls/132605
This adds a new `Map.lookup_try` method that returns a `std::optional<Value>`
for a given key. If the key is not in the map, `std::nullopt` is returned. If it
is in the map, then a copy of the value is returned. Note that the copy is
necessary because `std::optional` can't contain reference types.
This method helps a lot in #132219 to reduce boilerplate.
Pull Request: https://projects.blender.org/blender/blender/pulls/132278
This enables moving elements form one vector to another.
Usually this is being doing by extending a vector with the content
from the secondary vector and then clearing the secondary vector.
However sometimes this being performed to transfer ownership of managed elements,
if elements are copied from the secondary vector, but not cleared, this could
lead to 2 vectors to share ownership of objects.
Pull Request: https://projects.blender.org/blender/blender/pulls/131560
Add `bool MutableSpan<T>::contains(const T &value)` function, which is an
exact copy of the function from the `Span<T>` class. This makes it possible
to call `.contains()` directly on the `MutableSpan`, instead of having to
call `.as_span()` first.
Note that the `contains_ptr()` function was already copied from `Span` to
`MutableSpan` before.
No functional changes to the affected code, just the removal of the now
no longer necessary `.to_span()` call.
Pull Request: https://projects.blender.org/blender/blender/pulls/131149
Remove assert statement from make_update.py, making it ready for any
architecture.
Add riscv 32, 64 and 128 bit cpu architecture with little/big endian
to Blender's BLI_build_config.h and Libmv's build_config.h
Tested (to compile) on riscv64 little endian machine.
Pull Request: https://projects.blender.org/blender/blender/pulls/130920
Implements #130836:
interpolate_*_wrapmode_fl take InterpWrapMode wrap_u and wrap_v arguments.
U and V coordinate axes can have different wrap modes: clamp/extend,
border/zero, wrap/repeat.
Note that this removes inconsistency where cubic interpolation was
returning zero for samples completely outside the image, but all other
functions were not, and the behavior was not matching the function
documentation either.
Use the new functions in the new compositor CPU backend.
Possible performance impact for other places (e.g. VSE): measured on
4K resolution, transformed (scaled and rotated) 4K EXR image:
- Nearest filter: no change,
- Bilinear filter: no change,
- Cubic BSpline filter: slight performance decrease, IMB_transform
19.5 -> 20.7 ms (Ryzen 5950X, VS2022). Feels acceptable.
Pull Request: https://projects.blender.org/blender/blender/pulls/130893
This adds support for constructing a `Map` directly from key-value-pairs.
If the same key exists multiple times, only the first one is taken into account.
Note: the keys and values are copied into the map (not moved). This is because
an `std::initializer_list` is read-only. If move-behavior is needed, it is better to
use the existing `map.add*` functions.
```cpp
Map<int, std::string> map = {{1, "where"}, {3, "when"}, {5, "why"}};
````
Pull Request: https://projects.blender.org/blender/blender/pulls/130470
Cleanup to simplify code by using common terms like _first_ and _last_ in context
of predicate applying in a range just like we do for spans and range containers.
Pull Request: https://projects.blender.org/blender/blender/pulls/130380
This extends the `Vector` API to support transfering ownership of a memory
buffer to and from a `Vector`. This reduces the need for unnecessary copies in
some cases which converting between data-structures. A new
`VectorSet::extract_vector` method is added that takes O(1) time. Previously,
this was only possible in O(n) time by copying the entire array.
The `Vector::release` method can be used to e.g. build a vector with the C++
container, but then extract it for use in DNA data.
Pull Request: https://projects.blender.org/blender/blender/pulls/129736
The length checking wasn't accounting for null bytes within multi-byte
sequences and could step over the null bytes.
For BLI_strlen_utf8 this could result in an out of bounds read.
In practice most UTF8 data is validated so the extra checks
are mainly to prevent errors on invalid or corrupt UTF8 text.
Previously, passing a falsy `std::function` (i.e. it is empty) or a null
function pointer would result in a `FunctionRef` which is thruthy and thus
appeared callable. Now, when the passed in callable is falsy, the resulting
`FunctionRef` will be too.
Pull Request: https://projects.blender.org/blender/blender/pulls/128823
The paths `C:` (on WIN32) and `/` on other systems was detecting as
relative. This meant `BLI_path_abs_from_cwd` would make both paths
CWD relative. Also remove duplicate call to BLI_path_is_unc.
In addition to float<->half functions to convert one number (#127708), add
float_to_half_array and half_to_float_array functions:
- On x64, this uses SSE2 4-wide implementation to do the conversion
(2x faster half->float, 4x faster float->half compared to scalar),
- There's also an AVX2 codepath that uses CPU hardware F16C instructions
(8-wide), to be used when/if blender codebase will start to be built
for AVX2 (today it is not yet).
- On arm64, this uses NEON VCVT instructions to do the conversion.
Use these functions in Vulkan buffer/texture conversion code. Time taken to
convert float->half texture while viewing EXR file in image space (22M
numbers to convert): 39.7ms -> 10.1ms (would be 6.9ms if building for AVX2)
Pull Request: https://projects.blender.org/blender/blender/pulls/127838
Blender codebase had two ways to convert half (FP16) to float (FP32):
- BLI_math_bits.h half_to_float. Out of 64k possible half values, it converts
4096 of them incorrectly. Mostly denormals and NaNs, which is perhaps not too
relevant. But more importantly, it converts half zero to float 0.000030517578
which does not sound ideal.
- Functions in Vulkan vk_data_conversion.hh. This one converts 2046 possible
half values incorrectly.
Function to convert float (FP32) to half (FP16) was in Vulkan
vk_data_conversion.hh, and it got a bunch of possible inputs wrong. I guess it
did not do proper "round to nearest even" that CPU/GPU hardware does.
This PR:
- Adds BLI_math_half.hh with float_to_half and half_to_float functions.
- Documentation and test coverage.
- When compiling on ARM NEON, use hardware VCVT instructions.
- Removes the incorrect half_to_float from BLI_math_bits.h and replaces single
usage of it in View3D color picking to use the new function.
- Changes Vulkan FP32<->FP16 conversion code to use the new functions, to fix
correctness issues (makes eevee_next_bsdf_vulkan test pass). This makes it
faster too.
Pull Request: https://projects.blender.org/blender/blender/pulls/127708
I've done this a few times and would have benefited from a utility
function for it, apparently it's done in a few more places too. The
utilities aren't multithreaded for now, it doesn't seem important
and often multithreading happens at a different level of the call
stack anyway.
Pull Request: https://projects.blender.org/blender/blender/pulls/127517