As part of a more general Objective-C GHOST refactor and in an effort to
modernize the macOS backend for further works, this commit cleans up the
codestyle of Objective-C files. Based off the Blender C/C++ style guide,
in addition to some Objective-C specific style changes.
Changes:
- `const` correctness, use nullptr, initializer list for simple struct
- Reduced variable scope for simple functions, removed unused variables
- Use braces for conditional statements, no else after return
- Annotate inheritted function of GHOST Cocoa classes with override and
use `= default` to define trivial constructors
- Use #import instead of #include for Objective-C headers
This is only for correctness. As the Objective-C #import directive
is really just an #include with an implicit #pragma once.
- Use proper C-style comments instead of #pragma mark
#pragma mark is an XCode feature to mark code chapters, to follow
the Blender codestyle, and make the Objective-C code more editor
agnostic, these were replaced with multi-line C-style comments.
Ref #126772
Pull Request: https://projects.blender.org/blender/blender/pulls/126770
Blender codebase had two ways to convert half (FP16) to float (FP32):
- BLI_math_bits.h half_to_float. Out of 64k possible half values, it converts
4096 of them incorrectly. Mostly denormals and NaNs, which is perhaps not too
relevant. But more importantly, it converts half zero to float 0.000030517578
which does not sound ideal.
- Functions in Vulkan vk_data_conversion.hh. This one converts 2046 possible
half values incorrectly.
Function to convert float (FP32) to half (FP16) was in Vulkan
vk_data_conversion.hh, and it got a bunch of possible inputs wrong. I guess it
did not do proper "round to nearest even" that CPU/GPU hardware does.
This PR:
- Adds BLI_math_half.hh with float_to_half and half_to_float functions.
- Documentation and test coverage.
- When compiling on ARM NEON, use hardware VCVT instructions.
- Removes the incorrect half_to_float from BLI_math_bits.h and replaces single
usage of it in View3D color picking to use the new function.
- Changes Vulkan FP32<->FP16 conversion code to use the new functions, to fix
correctness issues (makes eevee_next_bsdf_vulkan test pass). This makes it
faster too.
Pull Request: https://projects.blender.org/blender/blender/pulls/127708
I've done this a few times and would have benefited from a utility
function for it, apparently it's done in a few more places too. The
utilities aren't multithreaded for now, it doesn't seem important
and often multithreading happens at a different level of the call
stack anyway.
Pull Request: https://projects.blender.org/blender/blender/pulls/127517
This adds a variant of `accumulate_counts_to_offsets` which checks for
overflows. The hot loop stays essentially the same, it just uses a `int64_t`
instead of `int` for the counter now. For now the error state is returned by
using an `std::optional`. Alternatives could be to throw `std::overflow_error`
or to use some Result/Expected type in the future.
Obviously, there are more places that should handle this kind of error. It's
also not obvious how to propagate that error further up yet so that we can
display e.g. a warning in the node. That decision should be applicable to other
nodes too. For now, there is no warning on the node.
Pull Request: https://projects.blender.org/blender/blender/pulls/127184
Calling exit() runs temporary directory cleanup and other atexit
functions that shouldn't be called while Blender runs.
The same issue [0] addresses.
Also add clarifying comments.
[0]: e00b7c4ad4
Port the macOS version of the `BLI_delete_soft` function from raw
runtime `objc_*` calls function to proper Objective-C for increased
readability and long-term maintainability. This new function is placed
in a new `intern/fileops_apple.mm` file, analogous to the existing
`intern/storage_apple.mm` file.
Pull Request: https://projects.blender.org/blender/blender/pulls/126766
This node hashes various types into an integer. Note that hashes
cannot generally used as unique identifiers because they are not
guaranteed to be unique. It can be used to generate somewhat
stable randomness though in cases where White Noise does not
offer enough flexibility.
It uses hash functions from BLI_noise.hh. These are also used in
the White Noise node.
Pull Request: https://projects.blender.org/blender/blender/pulls/110769
The use of `const` for Objective-C object pointer is not standard and
generally unsound. Unlike a C++ class, which has support for const and
non-const methods. An Objective-C object will still respond to mutable
selectors even if its object pointer is const, making it semantically
useless.
Another problem with const Objective-C object is that they cannot be
properly passed into other Objective-C object selectors due to type
differences. Even if that selector didn't modify the underlying object.
For consistency with general Objective-C code style guidelines, usage of
const pointer syntax (`Class *const`) were also removed.
Ref #126772
Pull Request: https://projects.blender.org/blender/blender/pulls/126768
This patch optimizes `IndexMask::from_bits` by making use of the fact that many
bits can be processed at once and one does not have to look at every bit
individual in many cases. Bits are stored as array of `BitInt` (aka `uint64_t`).
So we can process at least 64 bits at a time. On some platforms we can also make
use of SIMD and process up to 128 bits at once. This can significantly improve
performance if all bits are set/unset.
As a byproduct, this patch also optimizes `IndexMask::from_bools` which is now
implemented in terms of `IndexMask::from_bits`. The conversion from bools to
bits has been optimized significantly too by using SIMD intrinsics.
Pull Request: https://projects.blender.org/blender/blender/pulls/126888
When execvp() failed to replace the stack, the forked process would
return to Blender's WM_main(..) loop then hang when attempting to draw
the window.
Resolve by calling _exit() when execvp fails.
It's not entirely clear why TBB requires this. I couldn't find this restriction
in their documentation yet. It is mentioned in a code comment in
`allocate_node_default_construct` in `tbb/concurrent_hash_map.h` though.
This changes how the lazy-loading and unloading of volume grids works. With that
it should also fix#124164.
The cache is now moved to a deeper and more global level. This allows reloadable
volume grids to be unloaded automatically when a memory limit is reached. The
previous system for automatically unloading grids only worked in fairly specific
cases and also did not work all that well with caching (parts of) volume
sequences.
At its core, this patch adds a general cache system in `BLI_memory_cache.hh`. It
has a simple interface of the form `get(key, compute_if_not_cached_fn) ->
value`. To avoid growing the cache indefinitly, it uses the new
`BLI_memory_counter.hh` API to detect when the cache size limit is reached. In
this case it can automatically free some cached values. Currently, this uses an
LRU system, where the items that have not been used in a while are removed
first. Other heuristics can be implemented too, but especially for caches for
loading files from disk this works well already.
The new memory cache is internally used by `volume_grid_file_cache.cc` for
loading individual volume grids and their simplified variants. It could
potentially also be used to cache which grids are stored in a file.
Additionally, it can potentially also be used as caching layer in more places
like loading bakes or in import geometry nodes. It's not clear yet whether this
will need an extension to the API which currently is fairly minimal.
To allow different systems to use the same memory cache, it has to support
arbitrary identifiers for the cached data. Therefore, this patch also introduces
`GenericKey`, which is an abstract base class for any kind of key that is
comparable, hashable and copyable.
The implementation of the cache currently relies on a new `ConcurrentMap`
data-structure which is a thin wrapper around `tbb::concurrent_hash_map` with a
fallback implementation for when `tbb` is not available. This data structure
allows concurrent reads and writes to the cache. Note that adding data to the
cache is still serialized because of the memory counting.
The size of the cache depends on the `memory_cache_limit` property that's
already shown in the user preferences. While it has a generic name, it's
currently only used by the VSE which is currently using the `MEM_CacheLimiter`
API which has a similar purpose but seems to be less automatic, thread-safe and
also has no idea of implicit-sharing. It also seems to be designed in a way
where one is expected to create multiple "cache limiters" each of which has its
own limit. Longer term, we should probably strive towards unifying these
systems, which seems feasible but a bit out of scope right now. While it's not
ideal that these cache systems don't use a shared memory limit, it's essentially
what we already have for all cache systems in Blender, so it's nothing new.
Some tests for lazy-loading had to be removed because this behavior is more
implicit now and is not as easily observable from the outside.
Pull Request: https://projects.blender.org/blender/blender/pulls/126411
This introduces `MemoryCount` which can be used across multiple
`MemoryCounter`. Generally, `MemoryCount` is expected to live
longer (e.g. over the entire life-time of a cache), while `MemoryCounter`
is expected to only exists when actually counting the memory.
We often have the situation where it would be good if we could easily estimate
the memory usage of some value (e.g. a mesh, or volume). Examples of where we
ran into this in the past:
* Undo step size.
* Caching of volume grids.
* Caching of loaded geometries for import geometry nodes.
Generally, most caching systems would benefit from the ability to know how much
memory they currently use to make better decisions about which data to free and
when. The goal of this patch is to introduce a simple general API to count the
memory usage that is independent of any specific caching system. I'm doing this
to "fix" the chicken and egg problem that caches need to know the memory usage,
but we don't really need to count the memory usage without using it for caches.
Implementing caching and memory counting at the same time make both harder than
implementing them one after another.
The main difficulty with counting memory usage is that some memory may be shared
using implicit sharing. We want to avoid double counting such memory. How
exactly shared memory is treated depends a bit on the use case, so no specific
assumptions are made about that in the API. The gathered memory usage is not
expected to be exact. It's expected to be a decent approximation. It's neither a
lower nor an upper bound unless specified by some specific type. Cache systems
generally build on top of heuristics to decide when to free what anyway.
There are two sides to this API:
1. Get the amount of memory used by one or more values. This side is used by
caching systems and/or systems that want to present the used memory to the
user.
2. Tell the caller how much memory is used. This side is used by all kinds of
types that can report their memory usage such as meshes.
```cpp
/* Get how much memory is used by two meshes together. */
MemoryCounter memory;
mesh_a->count_memory(memory);
mesh_b->count_memory(memory);
int64_t bytes_used = memory.counted_bytes();
/* Tell the caller how much memory is used. */
void Mesh::count_memory(blender::MemoryCounter &memory) const
{
memory.add_shared(this->runtime->face_offsets_sharing_info,
this->face_offsets().size_in_bytes());
/* Forward memory counting to lower level types. This should be fairly common. */
CustomData_count_memory(this->vert_data, this->verts_num, memory);
}
void CustomData_count_memory(const CustomData &data,
const int totelem,
blender::MemoryCounter &memory)
{
for (const CustomDataLayer &layer : Span{data.layers, data.totlayer}) {
memory.add_shared(layer.sharing_info, [&](blender::MemoryCounter &shared_memory) {
/* Not quite correct for all types, but this is only a rough approximation anyway. */
const int64_t elem_size = CustomData_get_elem_size(&layer);
shared_memory.add(totelem * elem_size);
});
}
}
```
Pull Request: https://projects.blender.org/blender/blender/pulls/126295
This patch improves the isotropic Gabor noise UI controls such that
variations happen in both directions of the base orientation, as opposed
to being biased in the positive direction only.
Thanks to Charlie Jolly for suggesting this improvement.
This patch optimizes the Gabor noise standard deviation estimation by
computing the upper limit of the integral as the frequency approaches
infinity, since the integral is mostly constant for the relevant
frequency range. The limits are 0.25 for the 2D case and 1 / 4 * sqrt2
for the 3D case.
This also improves normalization for low frequencies, possibly due to
the effect of windowing.
Thanks to Charlie Jolly for spotting the optimization.
Optimize the Gabor noise texture code with an early exit for points that
are further away from the kernel center. This was already done for the
kernel, but is now being done earlier before computing the weight, so
its computation is now skipped.
Thanks to Charlie Jolly for the suggestion.
Part of #118145.
These days we aren't really benefiting from making PBVH an opaque type.
As we remove its responsibilities to focus it on being a BVH tree and look
to improve performance with data-oriented design, that will only become
more true.
There are some other future developments the current header structure
makes difficult:
- Storing selections of nodes with `IndexMask` for simpler iteration, etc.
- Specialization of node type for each PBVH type
- Reducing overhead of access to node data as nodes get smaller
- General C++ cleanliness and consistency
This PR moves `PBVH` to `blender::bke::pbvh::Tree` and moves `PBVHNode`
to `blender::bke::pbvh::Node`. Both are classes visible to elsewhere in Blender
but with private data fields.
The difficult part about the change is that we're in the middle of a transition
removing data from PBVH. Rather than making some data truly private I
chose to just give it the `_` suffix, since it will ideally be removed later.
Other things should be class methods or implemented as part of friend
classes. But the "fake" private status is much simpler for now and avoids
increasing the scope of this PR too much. Though that's a bit ugly, there's a
straightforward way to resolve these issues-- it just looks like the sort of
inconsistency you'd expect in the middle of a large refactor.
Pull Request: https://projects.blender.org/blender/blender/pulls/124919
Interpolation tool for strokes ported from GPv2.
Adds a new operator that inserts a new frame with interpolated curves.
The source curves are taken from the previous/next keyframe.
Co-authored-by: Hans Goudey <hans@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/122155
This implements a von-Kries-style chromatic adaption using the Bradford matrix.
The adaption is performed in scene linear space in the OCIO GLSL shader, with
the matrix being computed on the host.
The parameters specify the white point of the input, which is to be mapped to
the white point of the scene linear space. The main parameter is temperature,
specified in Kelvin, which defines the blackbody spectrum that is used as the
input white point. Additionally, a tint parameter can be used to shift the
white point away from pure blackbody spectra (e.g. to match a D illuminant).
The defaults are set to match D65 so there is no immediate color shift when
enabling the option. Tint = 10 is needed since the D-series illuminants aren't
perfect blackbody emitters.
As an alternative to manually specifying the values, there's also a color
picker. When a color is selected, temperature and tint are set such that this
color ends up being balanced to white.
This only works if the color is close enough to a blackbody emitter -
specifically, for tint values within +-150. Beyond this, there can be ambiguity
in the representation.
Currently, in this case, the input is just ignored and temperature/tint aren't
changed. Ideally, we'd eventually give UI feedback for this.
Presets are supported, and all the CIE standard illuminants are included.
One part that I'm not quite happy with is that the tint parameter starts to
give weird results at moderate values when the temperature is low.
The reason for this can be seen here:
https://commons.wikimedia.org/wiki/File:Planckian-locus.png
Tint is moving along the isotherm lines (with the plot corresponding to +-150),
but below 4000K some of that range is outside of the gamut. Not much can
be done there, other than possibly clipping those values...
Adding support for this to the compositor should be quite easy and is planned
as a next step.
Pull Request: https://projects.blender.org/blender/blender/pulls/123278
Reference identifiers instead of "above" in code comments as these
tends to become outdated. Even when declarations are removed it's at
least clear that the reference no longer exists instead of referring to
whatever is currently above the declaration.
It's also straightforward to search history for a removed identifier.
Corrected 4 cases of references to things that were no longer above
the doc-strings. Noticed other references which look to be incorrect
but need further investigation.
Currently, when color-picking from the viewport, the code will read the final
displayed pixel color and then somewhat attempt to undo the display transform.
However, this has several limitations - for example, precision is limited to 8
bit, and it does not account for e.g. View Transform or exposure/gamma.
Since we have the pre-display-transform color in a GPU texture anyways, this
code therefore adds a View3D-specific eyedropper handler (similar to e.g.
the image space) that reads from the viewport texture.
Pull Request: https://projects.blender.org/blender/blender/pulls/123408