This patch adds a new `BLI_mutex.hh` header which adds `blender::Mutex` as alias
for either `tbb::mutex` or `std::mutex` depending on whether TBB is enabled.
Description copied from the patch:
```
/**
* blender::Mutex should be used as the default mutex in Blender. It implements a subset of the API
* of std::mutex but has overall better guaranteed properties. It can be used with RAII helpers
* like std::lock_guard. However, it is not compatible with e.g. std::condition_variable. So one
* still has to use std::mutex for that case.
*
* The mutex provided by TBB has these properties:
* - It's as fast as a spin-lock in the non-contended case, i.e. when no other thread is trying to
* lock the mutex at the same time.
* - In the contended case, it spins a couple of times but then blocks to avoid draining system
* resources by spinning for a long time.
* - It's only 1 byte large, compared to e.g. 40 bytes when using the std::mutex of GCC. This makes
* it more feasible to have many smaller mutexes which can improve scalability of algorithms
* compared to using fewer larger mutexes. Also it just reduces "memory slop" across Blender.
* - It is *not* a fair mutex, i.e. it's not guaranteed that a thread will ever be able to lock the
* mutex when there are always more than one threads that try to lock it. In the majority of
* cases, using a fair mutex just causes extra overhead without any benefit. std::mutex is not
* guaranteed to be fair either.
*/
```
The performance benchmark suggests that the impact is negilible in almost
all cases. The only benchmarks that show interesting behavior are the once
testing foreach zones in Geometry Nodes. These tests are explicitly testing
overhead, which I still have to reduce over time. So it's not unexpected that
changing the mutex has an impact there. What's interesting is that on macos the
performance improves a lot while on linux it gets worse. Since that overhead
should eventually be removed almost entirely, I don't really consider that
blocking.
Links:
* Documentation of different mutex flavors in TBB:
https://www.intel.com/content/www/us/en/docs/onetbb/developer-guide-api-reference/2021-12/mutex-flavors.html
* Older implementation of a similar mutex by me:
https://archive.blender.org/developer/differential/0016/0016711/index.html
* Interesting read regarding how a mutex can be this small:
https://webkit.org/blog/6161/locking-in-webkit/
Pull Request: https://projects.blender.org/blender/blender/pulls/138370
This changes how the lazy-loading and unloading of volume grids works. With that
it should also fix#124164.
The cache is now moved to a deeper and more global level. This allows reloadable
volume grids to be unloaded automatically when a memory limit is reached. The
previous system for automatically unloading grids only worked in fairly specific
cases and also did not work all that well with caching (parts of) volume
sequences.
At its core, this patch adds a general cache system in `BLI_memory_cache.hh`. It
has a simple interface of the form `get(key, compute_if_not_cached_fn) ->
value`. To avoid growing the cache indefinitly, it uses the new
`BLI_memory_counter.hh` API to detect when the cache size limit is reached. In
this case it can automatically free some cached values. Currently, this uses an
LRU system, where the items that have not been used in a while are removed
first. Other heuristics can be implemented too, but especially for caches for
loading files from disk this works well already.
The new memory cache is internally used by `volume_grid_file_cache.cc` for
loading individual volume grids and their simplified variants. It could
potentially also be used to cache which grids are stored in a file.
Additionally, it can potentially also be used as caching layer in more places
like loading bakes or in import geometry nodes. It's not clear yet whether this
will need an extension to the API which currently is fairly minimal.
To allow different systems to use the same memory cache, it has to support
arbitrary identifiers for the cached data. Therefore, this patch also introduces
`GenericKey`, which is an abstract base class for any kind of key that is
comparable, hashable and copyable.
The implementation of the cache currently relies on a new `ConcurrentMap`
data-structure which is a thin wrapper around `tbb::concurrent_hash_map` with a
fallback implementation for when `tbb` is not available. This data structure
allows concurrent reads and writes to the cache. Note that adding data to the
cache is still serialized because of the memory counting.
The size of the cache depends on the `memory_cache_limit` property that's
already shown in the user preferences. While it has a generic name, it's
currently only used by the VSE which is currently using the `MEM_CacheLimiter`
API which has a similar purpose but seems to be less automatic, thread-safe and
also has no idea of implicit-sharing. It also seems to be designed in a way
where one is expected to create multiple "cache limiters" each of which has its
own limit. Longer term, we should probably strive towards unifying these
systems, which seems feasible but a bit out of scope right now. While it's not
ideal that these cache systems don't use a shared memory limit, it's essentially
what we already have for all cache systems in Blender, so it's nothing new.
Some tests for lazy-loading had to be removed because this behavior is more
implicit now and is not as easily observable from the outside.
Pull Request: https://projects.blender.org/blender/blender/pulls/126411
There was one functional issue with the previous API which was its
use in `VolumeGrid<T>::grid_for_write(tree_token)`. The issue was
that the tree token had to be received before the grid is accessed.
However, this `grid_for_write` method might create a copy of the
`VolumeGridData` internally and if it does, the passed in `tree_token`
corresponds to the wrong tree.
The solution is to output the token as part of the method. This has two
additional benefits:
* The API is more safe, because one can't pass an r-value into the methods
anymore. This generally shouldn't be done, because the token should
live at least as long as the OpenVDB tree is used and shouldn't be freed
immediatly.
* The API is a bit simpler, because it's not necessary to call the
`VolumeGrid.tree_access_token()` method anymore.
This refactors how volume grids are stored with the following new goals in mind:
* Get a **stand-alone volume grid** data structure that can be used by geometry nodes.
Previously, the `VolumeGrid` data structure was tightly coupled with the `Volume` data block.
* Support **implicit sharing of grids and trees**. Previously, it was possible to share data
when multiple `Volume` data blocks loaded grids from the same `.vdb` files but this was
not flexible enough.
* Get a safe API for **lazy-loading and unloading** of grids without requiring explicit calls
to some "load" function all the time.
* Get a safe API for **caching grids from files** that is not coupled to the `Volume` data block.
* Get a **tiered API** for different levels of `openvdb` involvement:
* No `OpenVDB`: Since `WITH_OPENVDB` is optional, it's helpful to have parts of the API that
still work in this case. This makes it possible to write high level code for volumes that does
not require `#ifdef WITH_OPENVDB` checks everywhere. This is in `BKE_volume_grid_fwd.hh`.
* Shallow `OpenVDB`: Code using this API requires `WITH_OPENVDB` checks. However, care
is taken to not include the expensive parts of `OpenVDB` and to use forward declarations as
much as possible. This is in `BKE_volume_grid.hh` and uses `openvdb_fwd.hh`.
* "Full" `OpenVDB`: This API requires more heavy `OpenVDB` includes. Fortunately, it turned
out to be not necessary for the common API. So this is only used for task specific APIs.
At the core of the new API is the `VolumeGridData` type. It's a wrapper around an
`openvdb::Grid` and adds some features on top like implicit sharing, lazy-loading and unloading.
Then there are `GVolumeGrid` and `VolumeGrid` which are containers for a volume grid.
Semantically, each `VolumeGrid` has its own independent grid, but this is cheap due to implicit
sharing. At highest level we currently have the `Volume` data-block which contains a list of
`VolumeGrid`.
```mermaid
flowchart LR
Volume --> VolumeGrid --> VolumeGridData --> openvdb::Grid
```
The loading of `.vdb` files is abstracted away behind the volume file cache API. This API makes
it easy to load and reuse entire files and individual grids from disk. It also supports caching
simplify levels for grids on disk.
An important new concept are the "tree access tokens". Whenever some code wants to work
with an openvdb tree, it has to retrieve an access token from the corresponding `VolumeGridData`.
This access token has to be kept alive for as long as the code works with the grid data. The same
token is valid for read and write access. The purpose of these access tokens is to make it possible
to detect when some code is currently working with the openvdb tree. This allows freeing it if it's
possible to reload it later on (e.g. from disk). It's possible to free a tree that is referenced by
multiple owners, but only no one is actively working with. In some sense, this is similar to the
existing `ImageUser` concept.
The most important new files to read are `BKE_volume_grid.hh` and `BKE_volume_grid_file_cache.hh`.
Most other changes are updates to existing code to use the new API.
Pull Request: https://projects.blender.org/blender/blender/pulls/116315