2023-08-16 00:20:26 +10:00
|
|
|
/* SPDX-FileCopyrightText: 2023 Blender Authors
|
2023-05-31 16:19:06 +02:00
|
|
|
*
|
|
|
|
|
* SPDX-License-Identifier: GPL-2.0-or-later */
|
Geometry Nodes: refactor multi-threading in field evaluation
Previously, there was a fixed grain size for all multi-functions. That was
not sufficient because some functions could benefit a lot from smaller
grain sizes.
This refactors adds a new `MultiFunction::call_auto` method which has the
same effect as just calling `MultiFunction::call` but additionally figures
out how to execute the specific multi-function efficiently. It determines
a good grain size and decides whether the mask indices should be shifted
or not.
Most multi-function evaluations benefit from this, but medium sized work
loads (1000 - 50000 elements) benefit from it the most. Especially when
expensive multi-functions (e.g. noise) is involved. This is because for
smaller work loads, threading is rarely used and for larger work loads
threading worked fine before already.
With this patch, multi-functions can specify execution hints, that allow
the caller to execute it most efficiently. These execution hints still
have to be added to more functions.
Some performance measurements of a field evaluation involving noise and
math nodes, ordered by the number of elements being evaluated:
```
1,000,000: 133 ms -> 120 ms
100,000: 30 ms -> 18 ms
10,000: 20 ms -> 2.7 ms
1,000: 4 ms -> 0.5 ms
100: 0.5 ms -> 0.4 ms
```
2021-11-26 11:05:47 +01:00
|
|
|
|
|
|
|
|
#include "FN_multi_function_params.hh"
|
|
|
|
|
|
2023-01-07 17:32:28 +01:00
|
|
|
namespace blender::fn::multi_function {
|
Geometry Nodes: refactor multi-threading in field evaluation
Previously, there was a fixed grain size for all multi-functions. That was
not sufficient because some functions could benefit a lot from smaller
grain sizes.
This refactors adds a new `MultiFunction::call_auto` method which has the
same effect as just calling `MultiFunction::call` but additionally figures
out how to execute the specific multi-function efficiently. It determines
a good grain size and decides whether the mask indices should be shifted
or not.
Most multi-function evaluations benefit from this, but medium sized work
loads (1000 - 50000 elements) benefit from it the most. Especially when
expensive multi-functions (e.g. noise) is involved. This is because for
smaller work loads, threading is rarely used and for larger work loads
threading worked fine before already.
With this patch, multi-functions can specify execution hints, that allow
the caller to execute it most efficiently. These execution hints still
have to be added to more functions.
Some performance measurements of a field evaluation involving noise and
math nodes, ordered by the number of elements being evaluated:
```
1,000,000: 133 ms -> 120 ms
100,000: 30 ms -> 18 ms
10,000: 20 ms -> 2.7 ms
1,000: 4 ms -> 0.5 ms
100: 0.5 ms -> 0.4 ms
```
2021-11-26 11:05:47 +01:00
|
|
|
|
2023-01-14 15:35:44 +01:00
|
|
|
void ParamsBuilder::add_unused_output_for_unsupporting_function(const CPPType &type)
|
Geometry Nodes: refactor multi-threading in field evaluation
Previously, there was a fixed grain size for all multi-functions. That was
not sufficient because some functions could benefit a lot from smaller
grain sizes.
This refactors adds a new `MultiFunction::call_auto` method which has the
same effect as just calling `MultiFunction::call` but additionally figures
out how to execute the specific multi-function efficiently. It determines
a good grain size and decides whether the mask indices should be shifted
or not.
Most multi-function evaluations benefit from this, but medium sized work
loads (1000 - 50000 elements) benefit from it the most. Especially when
expensive multi-functions (e.g. noise) is involved. This is because for
smaller work loads, threading is rarely used and for larger work loads
threading worked fine before already.
With this patch, multi-functions can specify execution hints, that allow
the caller to execute it most efficiently. These execution hints still
have to be added to more functions.
Some performance measurements of a field evaluation involving noise and
math nodes, ordered by the number of elements being evaluated:
```
1,000,000: 133 ms -> 120 ms
100,000: 30 ms -> 18 ms
10,000: 20 ms -> 2.7 ms
1,000: 4 ms -> 0.5 ms
100: 0.5 ms -> 0.4 ms
```
2021-11-26 11:05:47 +01:00
|
|
|
{
|
2023-01-14 15:56:43 +01:00
|
|
|
ResourceScope &scope = this->resource_scope();
|
2025-04-17 22:01:07 +02:00
|
|
|
void *buffer = scope.allocator().allocate_array(type, min_array_size_);
|
2023-01-14 15:35:44 +01:00
|
|
|
const GMutableSpan span{type, buffer, min_array_size_};
|
|
|
|
|
actual_params_.append_unchecked_as(std::in_place_type<GMutableSpan>, span);
|
2025-04-14 17:48:17 +02:00
|
|
|
if (!type.is_trivially_destructible) {
|
2023-01-14 15:56:43 +01:00
|
|
|
scope.add_destruct_call(
|
2023-01-14 15:35:44 +01:00
|
|
|
[&type, buffer, mask = mask_]() { type.destruct_indices(buffer, mask); });
|
Geometry Nodes: refactor multi-threading in field evaluation
Previously, there was a fixed grain size for all multi-functions. That was
not sufficient because some functions could benefit a lot from smaller
grain sizes.
This refactors adds a new `MultiFunction::call_auto` method which has the
same effect as just calling `MultiFunction::call` but additionally figures
out how to execute the specific multi-function efficiently. It determines
a good grain size and decides whether the mask indices should be shifted
or not.
Most multi-function evaluations benefit from this, but medium sized work
loads (1000 - 50000 elements) benefit from it the most. Especially when
expensive multi-functions (e.g. noise) is involved. This is because for
smaller work loads, threading is rarely used and for larger work loads
threading worked fine before already.
With this patch, multi-functions can specify execution hints, that allow
the caller to execute it most efficiently. These execution hints still
have to be added to more functions.
Some performance measurements of a field evaluation involving noise and
math nodes, ordered by the number of elements being evaluated:
```
1,000,000: 133 ms -> 120 ms
100,000: 30 ms -> 18 ms
10,000: 20 ms -> 2.7 ms
1,000: 4 ms -> 0.5 ms
100: 0.5 ms -> 0.4 ms
```
2021-11-26 11:05:47 +01:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2023-01-07 17:32:28 +01:00
|
|
|
} // namespace blender::fn::multi_function
|